AWS Step Functions
Step Functions is a serverless orchestrator that helps implement complex workflows made of decoupled components.
Overview
Step Functions is built on two main concepts: state machines and tasks.
A state machine is defined using the JSON-based Amazon States Language. When a state machine is created, Step Functions stitches its components together and shows developers a view of their system and how it is configured.
A task performs work by using an activity or a Lambda function, or by passing parameters to the API actions of other services.
How a Step Function Works
The state machine is the core component of a Step Function. It defines how states communicate and how data is passed from one state to the next.
State
Performs work in the state machine (Task State)
Makes a choice between branches of execution (Choice State)
Stops execution (Fail or Succeed State)
Passes its input to its output, optionally adding some fixed data (Pass State)
Provides a delay (Wait State)
Runs parallel branches of execution (Parallel State)
Example of a state definition
"States": {
  "FirstState": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:x:x:function:helloWorld",
    "Next": "ChoiceState"
  },
  "ChoiceState": {
    ...
  }
  ...
}
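The fragment above leaves out the top-level fields of a definition. As a minimal sketch, the Python below builds a complete two-state definition and checks that every transition points at a declared state. The `Done` state and the `check_transitions` helper are hypothetical and exist only for illustration; `StartAt`, `States`, `Type`, `Resource`, and `Next` are Amazon States Language fields, and the ARN placeholder is copied from the example.

```python
import json

# Hypothetical two-state definition; the ARN placeholder is copied from the text.
definition = {
    "Comment": "Illustrative sketch only",
    "StartAt": "FirstState",
    "States": {
        "FirstState": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:x:x:function:helloWorld",
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}


def check_transitions(definition):
    """Return transition targets (StartAt and Next) that are not declared states."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    for state in states.values():
        if "Next" in state:
            targets.append(state["Next"])
    return [name for name in targets if name not in states]


missing = check_transitions(definition)
asl_json = json.dumps(definition, indent=2)  # text that could be saved as a definition
print(missing)
```

A check like this only catches dangling state names; it does not validate the definition against the full Amazon States Language rules.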
Input & Output
A Step Functions execution receives its input as JSON and hands it to the first state. InputPath, Parameters, ResultSelector, ResultPath, and OutputPath each manipulate the JSON as it moves through a state in your workflow.
InputPath selects which part of the JSON input is passed on to the state's task.
For example, suppose the input to your state includes the following.
{
"comment": "Example for InputPath.",
"dataset1": {
"val1": 1,
"val2": 2,
"val3": 3
},
"dataset2": {
"val1": "a",
"val2": "b",
"val3": "c"
}
}
You could apply the following InputPath so that only the dataset2 object is passed on.
"InputPath": "$.dataset2",
Parameters creates a collection of key-value pairs that is passed as input. Each value can be either a static value that you include in your state machine definition, or a value selected with a path from the input or from the context object.
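The InputPath and Parameters steps can be imitated locally. The sketch below is not the Step Functions implementation: `select_path` and `apply_parameters` are hypothetical helpers that handle only simple dotted paths, and the `mode`, `first`, and `executionId` keys and the context value are made up for illustration. It relies on two Amazon States Language conventions: a key ending in `.$` takes its value from a path, and a path starting with `$$` reads from the context object.

```python
def select_path(document, path):
    """Resolve "$" or a simple dotted path such as "$.dataset2" against a dict.

    This is not a full JSONPath engine.
    """
    if path == "$":
        return document
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    current = document
    for key in path[2:].split("."):
        current = current[key]
    return current


def apply_parameters(parameters, effective_input, context):
    """Build the task payload from static values and path-selected values."""
    payload = {}
    for key, value in parameters.items():
        if key.endswith(".$"):
            if value.startswith("$$"):
                # "$$" addresses the context object; drop one "$" to reuse select_path.
                payload[key[:-2]] = select_path(context, value[1:])
            else:
                payload[key[:-2]] = select_path(effective_input, value)
        elif isinstance(value, dict):
            payload[key] = apply_parameters(value, effective_input, context)
        else:
            payload[key] = value  # static value copied as written
    return payload


state_input = {
    "comment": "Example for InputPath.",
    "dataset1": {"val1": 1, "val2": 2, "val3": 3},
    "dataset2": {"val1": "a", "val2": "b", "val3": "c"},
}

# InputPath runs first, so Parameters paths see only the selected part.
task_input = select_path(state_input, "$.dataset2")

parameters = {"mode": "batch", "first.$": "$.val1", "executionId.$": "$$.Execution.Id"}
context = {"Execution": {"Id": "example-execution-id"}}  # hypothetical context value
payload = apply_parameters(parameters, task_input, context)
print(task_input)
print(payload)
```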
ResultPath then selects what combination of the state input and the task result to pass to the output.
OutputPath can filter the JSON output to further limit the information that’s passed to the output.
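ResultPath and OutputPath can be imitated in the same spirit. This is a rough sketch, assuming only simple dotted paths: `apply_result_path` and `select_path` are hypothetical helpers, and the `taskresult` key and the `{"sum": 6}` result are invented for illustration. It covers three ResultPath cases: `$` replaces the input with the result, a dotted path inserts the result into a copy of the input, and a null ResultPath (`None` here) discards the result.

```python
import copy


def select_path(document, path):
    """Resolve "$" or a simple dotted path against a dict (not full JSONPath)."""
    if path == "$":
        return document
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    current = document
    for key in path[2:].split("."):
        current = current[key]
    return current


def apply_result_path(state_input, task_result, result_path):
    """Combine the state input and the task result the way ResultPath describes."""
    if result_path is None:
        return state_input  # discard the result, keep the input
    if result_path == "$":
        return task_result  # the result replaces the input entirely
    combined = copy.deepcopy(state_input)  # leave the caller's input untouched
    keys = result_path[2:].split(".")
    node = combined
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = task_result
    return combined


state_input = {"dataset1": {"val1": 1, "val2": 2, "val3": 3}}
task_result = {"sum": 6}  # hypothetical task output

combined = apply_result_path(state_input, task_result, "$.taskresult")
state_output = select_path(combined, "$.taskresult")  # OutputPath filter
print(combined)
print(state_output)
```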
Pros
- Orchestrating decoupled resources together
- Retry: A state can declare a retry policy for timeouts, runtime errors, and other failures, instead of relying on hand-written retry code. The policy also supports exponential backoff.
- Error Handling: Errors can be handled at each state. Predefined error names include:
  - States.Timeout: A Task state did not finish within TimeoutSeconds, or did not send a heartbeat with SendTaskHeartbeat within HeartbeatSeconds.
  - States.TaskFailed: A Task state failed for any reason.
  - States.Permissions: A Task state did not have sufficient privileges to call the Lambda/Activity code.
  - States.ALL: A wildcard that matches any known error name.
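The retry and error handling points above can be sketched as a Task state with `Retry` and `Catch` fields, plus a small calculation of the wait before each retry. `retry_delays` is a hypothetical helper and `HandleFailure` is a made-up state name. The calculation assumes the plain rule that each wait is the previous one multiplied by `BackoffRate`, starting from `IntervalSeconds`, and ignores any cap or jitter on the delay.

```python
def retry_delays(interval_seconds=1, max_attempts=3, backoff_rate=2.0):
    """Seconds waited before each retry under plain exponential backoff."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# Hypothetical Task state; the ARN placeholder is copied from the earlier example.
task_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:x:x:function:helloWorld",
    "TimeoutSeconds": 30,
    "Retry": [
        {
            "ErrorEquals": ["States.Timeout"],
            "IntervalSeconds": 2,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ],
    # Anything not handled by a retrier above falls through to this catcher.
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
    "End": True,
}

retrier = task_state["Retry"][0]
delays = retry_delays(
    retrier["IntervalSeconds"], retrier["MaxAttempts"], retrier["BackoffRate"]
)
print(delays)  # [2.0, 4.0, 8.0]
```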
Cons
- Missing Triggers: Some event sources and triggers are still missing, such as DynamoDB and Kinesis, so they cannot start a state machine directly.
- State machine execution name: Each execution name for a state machine has to be unique, meaning it has not been used in the last 90 days. Callers therefore have to generate names that never repeat, which is easy to get wrong.
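One way to work around the execution name constraint is to generate the name rather than choose it by hand. The helper below is a hypothetical sketch: it combines a sanitized prefix, a UTC timestamp, and a random UUID, and trims the result to 80 characters on the assumption that this is the maximum name length. The prefix `order sync` is only an example.

```python
import re
import uuid
from datetime import datetime, timezone


def unique_execution_name(prefix):
    """Return a name that is very unlikely to repeat within 90 days."""
    # Keep letters, digits, hyphens, and underscores; replace everything else.
    safe_prefix = re.sub(r"[^A-Za-z0-9_-]", "-", prefix)[:30]
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    suffix = uuid.uuid4().hex  # 32 random hex characters
    return f"{safe_prefix}-{timestamp}-{suffix}"[:80]


name = unique_execution_name("order sync")
print(name)
```

With boto3, a name like this would be passed as the `name` argument of the Step Functions client's `start_execution` call.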
Resources
https://docs.aws.amazon.com/step-functions/latest/dg/connect-api-gateway.html