
AWS Serverless Lambda Scheduled Events to Store Tweets in Couchbase


This blog has explained a few Serverless concepts with code samples:

Serverless FaaS with AWS Lambda and Java
AWS IoT Button, Lambda and Couchbase
Microservice using AWS API Gateway, AWS Lambda and Couchbase
Microservice using AWS Serverless Application Model and Couchbase

This particular blog entry will show how to use AWS Lambda to store tweets of a tweeter in Couchbase. Here are the high level components:


[Figure: AWS Serverless Lambda Scheduled Events to Store Tweets in Couchbase]

The key concepts are:

Lambda Function deployed using Serverless Application Model
Triggered every 3 hours using Scheduled Events
Uses Twitter4J API to query new tweets since the last fetch
Uses Couchbase Java SDK API to store JSON documents in the Couchbase Server

Complete sample code for this blog is available at github.com/arun-gupta/twitter-n1ql.

Serverless Application Model

Serverless Application Model, or SAM, defines simplified syntax for expressing serverless resources. SAM extends AWS CloudFormation to add support for API Gateway, AWS Lambda and Amazon DynamoDB. Read more details in Microservice using AWS Serverless Application Model and Couchbase. For our application, the SAM template is available at github.com/arun-gupta/twitter-n1ql/blob/master/template-example.yml and is shown below:

    AWSTemplateFormatVersion: '2010-09-09'
    Transform: AWS::Serverless-2016-10-31
    Description: Twitter Feed Analysis using Couchbase/N1QL
    Resources:
      TrumpFeed:
        Type: AWS::Serverless::Function
        Properties:
          Handler: org.sample.twitter.TwitterRequestHandler
          Runtime: java8
          CodeUri: s3://arungupta.me/twitter-feed-1.0-SNAPSHOT.jar
          Timeout: 30
          MemorySize: 1024
          Environment:
            Variables:
              COUCHBASE_HOST: <value>
              COUCHBASE_BUCKET_PASSWORD: <value>
          Role: arn:aws:iam::598307997273:role/microserviceRole
          Events:
            Timer:
              Type: Schedule
              Properties:
                Schedule: rate(3 hours)
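The Schedule value in the template is a rate expression. To make its meaning concrete, here is a minimal sketch that converts such an expression into a java.time.Duration. RateExpression is a hypothetical helper written only for illustration. It is not part of any AWS SDK, and it only handles the rate form, not cron expressions. It assumes the documented rule that the unit is singular for a value of 1 and plural otherwise.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an AWS API): turns "rate(3 hours)" into a Duration.
class RateExpression {
    private static final Pattern RATE =
            Pattern.compile("rate\\((\\d+) (minute|minutes|hour|hours|day|days)\\)");

    static Duration parse(String expression) {
        Matcher m = RATE.matcher(expression.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a rate expression: " + expression);
        }
        long value = Long.parseLong(m.group(1));
        String unit = m.group(2);
        if (value < 1) {
            throw new IllegalArgumentException("Rate value must be positive: " + expression);
        }
        // The unit must be singular for 1 and plural for anything larger.
        boolean plural = unit.endsWith("s");
        if ((value == 1) == plural) {
            throw new IllegalArgumentException("Unit does not agree with value: " + expression);
        }
        switch (unit.charAt(0)) {
            case 'm': return Duration.ofMinutes(value);
            case 'h': return Duration.ofHours(value);
            default:  return Duration.ofDays(value);
        }
    }
}
```

With this sketch, RateExpression.parse("rate(3 hours)") gives a Duration of three hours, which is the interval between two invocations of the function.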

What do we see in this template?

Function is packaged and available in an S3 bucket.

Handler class is org.sample.twitter.TwitterRequestHandler and is at github.com/arun-gupta/twitter-n1ql/blob/master/twitter-feed/src/main/java/org/sample/twitter/TwitterRequestHandler.java. It looks like:

    public class TwitterRequestHandler implements RequestHandler<Request, String> {
        @Override
        public String handleRequest(Request request, Context context) {
            if (request.getName() == null)
                request.setName("realDonaldTrump");
            int tweets = new TwitterFeed().readFeed(request.getName());
            return "Updated " + tweets + " tweets for " + request.getName() + "!";
        }
    }

By default, this class reads the Twitter handle of Donald Trump. More fun on that coming in a subsequent blog.

COUCHBASE_HOST and COUCHBASE_BUCKET_PASSWORD are environment variables that provide the EC2 host where the Couchbase database is running and the password of the bucket.

Function can be triggered by different events. In our case, this is triggered every three hours. More details about the expression used here are at Schedule Expressions Using Rate or Cron.

Fetching Tweets using Twitter4J

Tweets are read using the Twitter4J API. It is an unofficial Twitter API that provides a Java abstraction over the Twitter REST API. Here is a simple example:

    Twitter twitter = getTwitter();
    Paging paging = new Paging(page, count, sinceId);
    List<Status> list = twitter.getUserTimeline(user, paging);
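The snippet above reads a single page. To show how page, count and sinceId work together, here is a rough sketch of a loop that walks the pages until an empty one comes back. It does not call Twitter4J. TimelinePage is a hypothetical stand-in for a call like twitter.getUserTimeline(user, paging), inMemory is a fake timeline used only to exercise the loop, and the maxPages cap is an assumption added so the loop always ends.

```java
import java.util.ArrayList;
import java.util.List;

class PagingSketch {
    // Hypothetical stand-in for twitter.getUserTimeline(user, new Paging(page, count, sinceId)).
    // Returns tweet ids for the given 1-based page, or an empty list past the end.
    interface TimelinePage {
        List<Long> fetch(int page, int count, long sinceId);
    }

    // Collects every page of ids newer than sinceId, stopping at the first empty page.
    static List<Long> fetchAll(TimelinePage source, int count, long sinceId, int maxPages) {
        List<Long> all = new ArrayList<>();
        for (int page = 1; page <= maxPages; page++) {
            List<Long> batch = source.fetch(page, count, sinceId);
            if (batch.isEmpty()) {
                break;
            }
            all.addAll(batch);
        }
        return all;
    }

    // Fake timeline for illustration: ids are given newest first.
    static TimelinePage inMemory(List<Long> newestFirst) {
        return (page, count, sinceId) -> {
            List<Long> newer = new ArrayList<>();
            for (long id : newestFirst) {
                if (id > sinceId) {
                    newer.add(id);
                }
            }
            int from = Math.min((page - 1) * count, newer.size());
            int to = Math.min(from + count, newer.size());
            return new ArrayList<>(newer.subList(from, to));
        };
    }
}
```

In a real function the lambda returned by inMemory would be replaced by a call into Twitter4J that maps each Status to its id or JSON.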

Twitter4J Docs and Javadocs are pretty comprehensive.

The Twitter API allows reading only the last 200 tweets per request. The Lambda function is invoked every 3 hours. The tweet frequency of @realDonaldTrump is not 200 every 3 hours, at least not yet. If it does reach that dangerous level, then we can adjust the rate to trigger the Lambda function more frequently.

The JSON representation of each tweet is stored in Couchbase Server using the Couchbase Java SDK. AWS Lambda also supports Node, Python and C#, so you can use the Couchbase Node SDK, Couchbase Python SDK or Couchbase .NET SDK to write these functions as well.

The Twitter4J API allows fetching tweets since the id of a particular tweet. This ensures that duplicate tweets are not fetched. It requires us to sort all tweets in a particular order and then pick the id of the most recent tweet. This was solved using the simple N1QL query:

SELECT id FROM twitter ORDER BY id DESC LIMIT 1

The syntax is very SQL-like. More on this in a subsequent blog.
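To make the duplicate-avoidance idea concrete, here is a small sketch in plain Java that works on lists of tweet ids instead of a real bucket. SinceIdSketch and its methods are hypothetical names. latestStoredId plays the role of the N1QL query above, and newerThan mimics what fetching "since" an id gives back. On a first run with an empty bucket there is no latest id, so the sketch returns an empty OptionalLong and the caller would fetch without a since id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

class SinceIdSketch {
    // Same idea as: SELECT id FROM twitter ORDER BY id DESC LIMIT 1
    // Returns empty when nothing has been stored yet.
    static OptionalLong latestStoredId(List<Long> storedIds) {
        return storedIds.stream().mapToLong(Long::longValue).max();
    }

    // Keeps only the ids strictly greater than sinceId, so stored tweets are skipped.
    static List<Long> newerThan(List<Long> timelineIds, long sinceId) {
        List<Long> result = new ArrayList<>();
        for (long id : timelineIds) {
            if (id > sinceId) {
                result.add(id);
            }
        }
        return result;
    }
}
```

If the bucket holds ids 5, 9 and 7, the latest stored id is 9, and a timeline of 8, 9, 10, 11 is reduced to 10 and 11.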

Store Tweets in Couchbase

The final item is to store the retrieved tweets in Couchbase. The value of the COUCHBASE_HOST environment variable is used to connect to the Couchbase instance. The value of the COUCHBASE_BUCKET_PASSWORD environment variable is used to connect to the secure bucket where all JSON documents are stored. It's very critical that the bucket be password protected and that the password not be directly specified in the source code. More on this in a subsequent blog. The JSON document is upserted (insert or update) in Couchbase using the Couchbase Java API:

    bucket.upsert(jsonDocument);

This Lambda Function has been running for a few days now and
