Wednesday, January 18, 2012

Demo - How to Get Started with Amazon DynamoDB for CloudSpokes Challenges

This morning, Amazon announced Amazon DynamoDB, their Internet-scale NoSQL database service. Back in November we were fortunate enough to be invited to participate in a private beta for Amazon DynamoDB. Since then we’ve had some time to “work it over” a bit and have put together a demo application to show off some functionality. The code is posted at our GitHub repo for your forking convenience.

We are very excited about the possibility of using Amazon DynamoDB for a few services at CloudSpokes, so you can plan on seeing some really cool CloudSpokes challenges in the near future.

The API is very straightforward and easy to work with. If you’ve used other NoSQL databases, you should have no problem wrapping your head around Amazon DynamoDB. It has simple storage and query methods, allowing you to store and access data items with a flexible number of attributes using simple “Put” or “Get” web services APIs. Amazon DynamoDB provides a native API for HTTP and SDKs for Java, PHP and .NET. More are reportedly in the works.

What is Amazon DynamoDB and why would I want to use it?

Amazon DynamoDB is a fast, highly scalable, highly available, cost-effective non-relational database service that scales automatically without limits or administration. This service is tightly coupled with Amazon S3 and Amazon EC2, collectively providing the ability to store, process and query data sets in the cloud.

If you have massive amounts of highly transactional data then Amazon DynamoDB might be for you:

  • Storing social graph data for processing
  • Storing GPS data for devices
  • Storing data for Hadoop processes
  • Recording user activity logs
  • Supporting NFC processes
  • Recording clicks for A/B testing

Blazing Fast - Amazon DynamoDB runs on a new solid state disk (SSD) architecture for low-latency response times. Read latencies average less than 5 milliseconds, and write latencies average less than 10 milliseconds. We found our applications to be extremely responsive.

Hands Off Administration - Amazon DynamoDB is a fully managed service – no need to worry about hardware or software provisioning, setup and configuration, software patching, or partitioning data over multiple instances as you scale. For instance, when you create a table, you need to specify the request throughput you want for your table. In the background, Amazon DynamoDB handles the provisioning of resources to meet the requested throughput rate.

Auto Scaling - To continue with the “no administration” theme, Amazon DynamoDB can automatically scale machine resources in response to increases in database traffic without the need for client-side partitioning. Alternatively, you can proactively manage performance with a few simple commands.

Security Baked In - Amazon DynamoDB is integrated with AWS Identity and Access Management (access keys and tokens), allowing you to provide access to defined users and groups, assign granular security credentials and user access, and much more.

Centralized Monitoring - As with most everything in AWS-land, you can easily view metrics for your Amazon DynamoDB table in the AWS Management Console. You can also view your request throughput and latency for each API as well as resource consumption through Amazon CloudWatch.

API Overview

At a high level, the Amazon DynamoDB API provides the following functionality:

  • Create a table
  • Delete a table
  • Request the current state of a table
  • Get a list of all of the tables for the current account
  • Put an item
  • Get one or more items by primary key
  • Update the attributes in a single item
  • Delete an item
  • Scan a table and optionally filter the items returned using comparison operators
  • Query for items using a range index and comparison operators
  • Increment or decrement a numeric value
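
To make the item-level operations in this list a bit more concrete, here is a minimal in-memory sketch in Java. It is not the AWS SDK and makes no network calls; the class ToyTable and all of its method names are hypothetical and exist only to show what put, get, update, delete and increment mean for a table with a simple hash key.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory stand-in for a table with a simple hash key.
// Hypothetical names throughout; this is NOT the AWS SDK.
class ToyTable {
    private final Map<String, Map<String, Object>> items = new HashMap<>();

    // Put: store (or fully replace) the item under its key.
    synchronized void putItem(String key, Map<String, Object> attributes) {
        items.put(key, new HashMap<>(attributes));
    }

    // Get: return a copy of the item, or null if the key is absent.
    synchronized Map<String, Object> getItem(String key) {
        Map<String, Object> item = items.get(key);
        return item == null ? null : new HashMap<>(item);
    }

    // Update: change one attribute and leave the rest of the item alone.
    // In this toy, updating a missing key creates the item.
    synchronized void updateAttribute(String key, String name, Object value) {
        items.computeIfAbsent(key, k -> new HashMap<>()).put(name, value);
    }

    // Increment (or decrement, with a negative delta) a numeric attribute
    // in one step, so callers never do a separate read-then-write.
    // Assumes the attribute, if present, was stored as a Long.
    synchronized long increment(String key, String name, long delta) {
        Map<String, Object> item = items.computeIfAbsent(key, k -> new HashMap<>());
        Object current = item.get(name);
        long next = (current == null ? 0L : (Long) current) + delta;
        item.put(name, next);
        return next;
    }

    // Delete: remove the item; returns true if something was removed.
    synchronized boolean deleteItem(String key) {
        return items.remove(key) != null;
    }
}
```

The point of the increment method is that the read and the write happen as one operation on the table side, which is what makes a counter safe when several clients bump it at once.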

Data Model

Amazon DynamoDB stores data in tables. Each table contains items, and each item is a collection of name-value pairs (attributes). Items (analogous to records) are identified by a primary key value. Unlike in traditional databases, the table is schemaless and relies only on the primary key. Items can contain any combination of attributes. For example:

"Name" = "Member Search with Redis"
"ChallengeId" = 1219
"Categories" = "aws", "ruby", "mobile"
"Ratings" = 17, 36

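One way to picture "schemaless, relying only on the primary key" is a table that checks for the key attribute and accepts whatever else the item carries. The sketch below is a plain-Java illustration under that assumption; SchemalessTable and its methods are hypothetical names, not part of any Amazon SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a table that validates only the primary key attribute.
class SchemalessTable {
    private final String keyName;
    private final Map<Object, Map<String, Object>> items = new LinkedHashMap<>();

    SchemalessTable(String keyName) {
        this.keyName = keyName;
    }

    // Accepts any combination of attributes as long as the key is present.
    void put(Map<String, Object> item) {
        Object key = item.get(keyName);
        if (key == null) {
            throw new IllegalArgumentException("missing primary key: " + keyName);
        }
        items.put(key, new LinkedHashMap<>(item));
    }

    // Returns the stored item for the key, or null if there is none.
    Map<String, Object> get(Object key) {
        return items.get(key);
    }

    int size() {
        return items.size();
    }
}
```

With a table keyed on "ChallengeId", one item could carry "Name", "Categories" and "Ratings" while the next carries a completely different set of attributes; only an item with no "ChallengeId" would be rejected.
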
Primary Keys & Indexes

When creating a new table, you define the primary key and the type of key to be used. Amazon DynamoDB supports either a primary key made of a single name/value pair (a hash primary key; string or number) or one made of two name/value pairs (a hash-and-range primary key) for index values.

Hash key example: "ChallengeId" = 1219
Hash-and-range key example: "MemberId" = "romin", "MemberNumber" = "976"

Note: the Query API is only available for hash-and-range primary key tables. If you are using a simple hash key, then you need to use the Scan API.
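
The difference between Query and Scan is easier to see in a toy model. The sketch below is an in-memory illustration only, not the DynamoDB API: HashRangeTable, its methods, and the use of an integer range key are all assumptions made for the example. Query goes straight to one hash key and reads a slice of its sorted range keys, while Scan walks every item and filters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical in-memory model of a hash-and-range table; NOT the AWS SDK.
class HashRangeTable {
    // hash key -> (range key, kept sorted -> item payload)
    private final Map<String, TreeMap<Integer, String>> partitions = new HashMap<>();

    void put(String hashKey, int rangeKey, String payload) {
        partitions.computeIfAbsent(hashKey, k -> new TreeMap<>()).put(rangeKey, payload);
    }

    // Query: jump to one hash key, then read a contiguous slice of range keys.
    List<String> query(String hashKey, int fromInclusive, int toInclusive) {
        TreeMap<Integer, String> partition = partitions.get(hashKey);
        if (partition == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(
                partition.subMap(fromInclusive, true, toInclusive, true).values());
    }

    // Scan: visit every item in the table and keep those passing the filter.
    List<String> scan(Predicate<String> filter) {
        List<String> matches = new ArrayList<>();
        for (TreeMap<Integer, String> partition : partitions.values()) {
            for (String payload : partition.values()) {
                if (filter.test(payload)) {
                    matches.add(payload);
                }
            }
        }
        return matches;
    }
}
```

In this model the cost of a query depends on the size of one partition, while the cost of a scan grows with the whole table, which is the practical reason to pick a hash-and-range key when you know you will be asking range questions.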

Data Types

Amazon DynamoDB supports two scalar data types (Number and String) and multi-valued types (String Set and Number Set). Everything is stored in Amazon DynamoDB as a UTF-8 string value. You designate the data as a Number, String, String Set, or Number Set in the request but there is no distinction between an int, long, float, etc. For example:

item.put("member", new AttributeValue().withS("romin"));
item.put("challenge", new AttributeValue().withN("1219"));

In a request, a String Set and a Number Set look like this (each set member is its own quoted string):

"members": {"SS": ["kenji776", "romin", "akkishore"]},
"wins": {"NS": ["14", "10", "8"]}

Amazon DynamoDB uses JSON as the transport format. However, the JSON data is parsed and stored natively on disk.
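
As a rough illustration of the typed JSON values described above, here is a small plain-Java helper that renders a value with its "S", "N", "SS" or "NS" type tag. AttributeJson and its methods are hypothetical names for this sketch, the escaping is deliberately minimal, and in real code the SDK builds this for you.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper that renders typed attribute values as JSON text.
class AttributeJson {
    static String string(String value) {
        return "{\"S\":" + quote(value) + "}";
    }

    // Numbers travel as quoted strings, so int, long and float all look alike.
    static String number(Number value) {
        return "{\"N\":" + quote(value.toString()) + "}";
    }

    // Each set member becomes its own quoted string in the array.
    static String stringSet(Collection<String> values) {
        List<String> parts = new ArrayList<>();
        for (String v : values) {
            parts.add(quote(v));
        }
        return "{\"SS\":[" + String.join(",", parts) + "]}";
    }

    static String numberSet(Collection<? extends Number> values) {
        List<String> parts = new ArrayList<>();
        for (Number v : values) {
            parts.add(quote(v.toString()));
        }
        return "{\"NS\":[" + String.join(",", parts) + "]}";
    }

    // Minimal escaping: backslash and double quote only (no control characters).
    private static String quote(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Calling AttributeJson.number(1219) and AttributeJson.number(1219L) produces the same text, which is the "no distinction between an int, long, float" point in practice.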

That’s a quick overview, so make sure to check out the DynamoDB documentation for more details. The documentation is very well done and has clear instructions and code samples for Java, PHP and .NET. If you are into database performance, check out the details on provisioned throughput, data consistency, conditional operations, performance factors and more.

How to get started?

Sign up for a new AWS account (if you don’t already have one) and get your AWS Access Key ID and Secret Access Key from your account’s security section. Walk through their Getting Started Guide for samples. In your code, just add your credentials to the AmazonDynamoDBClient and you are ready to start making requests. All of the API calls are pretty straightforward and work as you would expect them to.

Pricing is pay as you go, but Amazon DynamoDB is part of the AWS Free Usage Tier, so check it out for more info.
