Asynchronous API with DynamoDB Streams

Overview

Because we are using Actions within Auth0, they need to be as performant as possible. Actions have strict limits on size and response time, so we propose storing tokens asynchronously.

A request from the action to the server should return immediately, without waiting for processing to complete. The data flow should be designed so that it does not depend on an immediate response from the server.

There are several architecture patterns for this, but a major problem with asynchronous processing is error handling. How would the action know if the request failed? We must not lose data in the process: a fire-and-forget service cannot afford to fail. Even if processing fails for some reason, the data has to reach the database.

DynamoDB Streams provides a solution to this problem.

Proposal

Most traditional databases have a concept of triggers: events generated when data changes in the database. DynamoDB Streams, however, generates a stream of change events that flows into a target. The targets currently available are Lambda and Kinesis.

The incoming API call can write the data directly into DynamoDB using an API Gateway service integration, which keeps the API's response time very low. We can then configure DynamoDB Streams to invoke a Lambda function that processes the data written by that call.
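As a rough illustration of the stream-triggered Lambda, the following is a minimal sketch in Python. It assumes the stream is configured with a view type that includes the new image (`NEW_IMAGE` or `NEW_AND_OLD_IMAGES`). The `user_id` attribute and the `process_token` function are hypothetical placeholders, not part of this proposal.

```python
from typing import Any


def from_dynamodb_json(attr: dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed attribute value into a plain Python value."""
    (type_tag, value), = attr.items()
    if type_tag in ("S", "BOOL"):
        return value
    if type_tag == "N":
        # Numbers arrive as strings in stream records.
        return int(value) if value.lstrip("-").isdigit() else float(value)
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {k: from_dynamodb_json(v) for k, v in value.items()}
    if type_tag == "L":
        return [from_dynamodb_json(v) for v in value]
    raise ValueError(f"unsupported attribute type: {type_tag}")


def process_token(item: dict[str, Any]) -> None:
    """Hypothetical downstream step; replace with the real processing."""
    print(f"processing token record for {item.get('user_id')}")


def handler(event: dict[str, Any], context: Any) -> dict[str, int]:
    """Entry point invoked by the DynamoDB Streams event source mapping."""
    processed = 0
    for record in event.get("Records", []):
        # Only react to writes; REMOVE events (for example TTL expiry) are skipped.
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        new_image = record["dynamodb"]["NewImage"]
        item = {k: from_dynamodb_json(v) for k, v in new_image.items()}
        process_token(item)
        processed += 1
    return {"processed": processed}
```

The handler never writes the original record itself. By the time it runs, the item is already in the table, so this function only carries the slower follow-up work.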

The advantage here is that the data is already stored in DynamoDB before we process it, so it remains available even if downstream processing fails. In the unlikely event of an insert or update failure, the API itself will return an error and the action will be able to handle it.
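Because the record is already persisted, a failure in downstream processing can be retried from the stream rather than reported back to the action. One way to sketch this is with Lambda's partial batch response, which assumes `ReportBatchItemFailures` is enabled on the event source mapping. The `handle_batch` name and the `process` callback are illustrative assumptions.

```python
from typing import Any, Callable


def handle_batch(
    event: dict[str, Any],
    process: Callable[[dict[str, Any]], None],
) -> dict[str, list[dict[str, str]]]:
    """Process stream records in order and report the first failure.

    Lambda retries from the lowest reported sequence number, so we stop at
    the first failing record to preserve ordering within the shard.
    """
    failures: list[dict[str, str]] = []
    for record in event.get("Records", []):
        try:
            process(record)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            sequence_number = record["dynamodb"]["SequenceNumber"]
            print(f"record {sequence_number} failed: {exc}")
            failures.append({"itemIdentifier": sequence_number})
            break
    return {"batchItemFailures": failures}
```

An empty `batchItemFailures` list tells Lambda that the whole batch succeeded. Records before the failed one are not reprocessed, but the failed record and everything after it in the batch are delivered again, so `process` should be idempotent.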

Thus, with DynamoDB Streams, we have the best of both worlds: low response time and error resilience.

Why not call the Lambda directly from API Gateway?

There are two aspects to this. The first is response time: writing data directly from API Gateway to DynamoDB is faster than invoking a Lambda, so the action gets a faster response.

Second, resilience and error handling become complex when we invoke Lambda directly from API Gateway. What do we do if the Lambda fails halfway or cannot scale? We cannot retry from the action because of its execution limits. Do we invoke another API that will heal the damage? That is too much intelligence to carry in the action. It is best that the service manages this for itself.

Considerations

  • How do we control access to our API Gateway?
    • Resource policies: allow or deny access to your APIs and methods from specified source IP addresses or VPC endpoints.
    • Lambda authorizer: Lambda functions that control access to REST API methods using bearer token authentication.
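To make the Lambda authorizer option more concrete, here is a minimal sketch of a token-based authorizer for a REST API. The event fields (`authorizationToken`, `methodArn`) and the returned policy shape follow the API Gateway token authorizer contract. The shared-secret check, the `API_SHARED_TOKEN` environment variable and the `auth0-action` principal name are assumptions for illustration only. A real deployment would more likely validate a signed token.

```python
import hmac
import os
from typing import Any


def build_policy(principal_id: str, effect: str, resource: str) -> dict[str, Any]:
    """Build the IAM policy document API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def authorize(event: dict[str, Any], context: Any) -> dict[str, Any]:
    """Allow the call only if the bearer token matches the configured secret."""
    # Hypothetical configuration: the expected token is read from the environment.
    expected = os.environ.get("API_SHARED_TOKEN", "")
    scheme, _, token = event.get("authorizationToken", "").partition(" ")
    allowed = (
        scheme.lower() == "bearer"
        and expected != ""
        # Constant-time comparison avoids leaking the secret through timing.
        and hmac.compare_digest(token.encode(), expected.encode())
    )
    effect = "Allow" if allowed else "Deny"
    return build_policy("auth0-action", effect, event["methodArn"])
```

Resource policies need no code at all, so they may be the lighter option if the calling IP ranges are stable. The authorizer adds a Lambda invocation to the request path, although API Gateway can cache its result.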

Advantages

  • Tokens are stored centrally rather than tied to an auth provider.
  • We have a common way to cipher and decipher between domains and by environment.
  • We can clean up by expiring items using the TTL timestamp. That means we can delete items without consuming any write throughput. TTL is provided at no extra cost as a means to reduce stored data volumes by retaining only the items that remain current for your workload’s needs.
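As a sketch of how the TTL timestamp could be set, the snippet below builds an item in DynamoDB's typed JSON form with an expiry attribute. DynamoDB TTL expects a Number attribute holding epoch seconds. The attribute names (`user_id`, `token`, `expires_at`) are hypothetical, and `expires_at` would have to match the TTL attribute configured on the table.

```python
import time
from typing import Any, Optional


def build_token_item(
    user_id: str,
    encrypted_token: str,
    ttl_seconds: int,
    now: Optional[int] = None,
) -> dict[str, dict[str, str]]:
    """Build a DynamoDB-typed item whose TTL attribute is epoch seconds."""
    issued_at = int(time.time()) if now is None else now
    return {
        "user_id": {"S": user_id},
        "token": {"S": encrypted_token},
        # TTL must be a Number; DynamoDB JSON carries numbers as strings.
        "expires_at": {"N": str(issued_at + ttl_seconds)},
    }


def is_expired(item: dict[str, Any], now: Optional[int] = None) -> bool:
    """Read-side check, since TTL deletion is not immediate."""
    current = int(time.time()) if now is None else now
    return int(item["expires_at"]["N"]) <= current
```

TTL deletion happens in the background and can lag behind the expiry time, so readers should still filter out expired items with a check like `is_expired`. TTL deletions also surface in the stream as `REMOVE` events, which the processing Lambda may want to ignore.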

Disadvantages

  • More infrastructure and work are required to achieve the same result as Storing Hybris Tokens in Auth0 App Metadata.
  • Adding infrastructure, no matter how fast, always introduces a dependency.
  • Records have to be requested in an asynchronous manner.

Resources