Amazon EMR: One or more steps that specify all of the functions to be performed on the data.
job ID: A five-character, alphanumeric string that uniquely identifies an AWS Import/Export storage device in your shipment.
To create a Lambda state machine, we define two states: a task that runs our Lambda function, and a success state. Invoking the function requires a Step Functions Task state, which serves as the first step in our state machine. The success state is added to the state machine using that task's next() method.
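As a rough illustration, the two-state machine described above can be written out as an Amazon States Language definition. The sketch below builds it from plain Python dictionaries instead of the next() chaining mentioned in the text; the state names and the function ARN are placeholders chosen for this example, not values from the original.

```python
import json

# Placeholder ARN (an assumption for illustration only).
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-function"


def build_definition(function_arn: str) -> dict:
    """Return a two-state definition: a Lambda task followed by a success state."""
    return {
        "StartAt": "InvokeFunction",
        "States": {
            # First step: a Task state that invokes the Lambda function.
            "InvokeFunction": {
                "Type": "Task",
                "Resource": function_arn,
                "Next": "Done",
            },
            # Second step: a terminal success state.
            "Done": {"Type": "Succeed"},
        },
    }


definition = build_definition(FUNCTION_ARN)
print(json.dumps(definition, indent=2))
```

The printed JSON is what would be supplied as the state machine definition.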
Unfortunately, you can't dynamically specify the timeout for a state, but you can dynamically tell a Wait state how long it should wait. With that said, I would recommend that you use a Parallel state with two branches and a Catch block. The first branch contains a Wait state and a Fail state (your timeout). The other branch contains your normal state machine logic followed by a Fail state, so that finishing the work also raises an error and stops the waiting branch.
Whenever a branch fails inside a Parallel state, it aborts all running states in the other branches. Luckily, you can catch these errors on the Parallel state and redirect execution to another state depending on which branch failed. Here's an example of what I mean (change the values in the HardCodedInputs state to control which branch fails).
{
  "StartAt": "HardCodedInputs",
  "States": {
    "HardCodedInputs": {
      "Type": "Pass",
      "Parameters": {
        "WaitBranchInput": {
          "timeout": 5,
          "Comment": "Change the value of timeout"
        },
        "WorkerBranchInput": {
          "SecondsPath": 3,
          "Comment": "SecondsPath is used for testing purposes to simulate how long the worker will run"
        }
      },
      "Next": "Parallel"
    },
    "Parallel": {
      "Type": "Parallel",
      "End": true,
      "Catch": [
        {
          "ErrorEquals": ["TimeoutExpired"],
          "ResultPath": "$.ParallelStateOutput",
          "Next": "ExecuteIfTimedOut"
        },
        {
          "ErrorEquals": ["WorkerSuccess"],
          "ResultPath": "$.ParallelStateOutput",
          "Next": "ExecuteIfWorkerSuccessful"
        }
      ],
      "Branches": [
        {
          "StartAt": "DynamicTimeout",
          "States": {
            "DynamicTimeout": {
              "Type": "Wait",
              "InputPath": "$.WaitBranchInput",
              "SecondsPath": "$.timeout",
              "Next": "TimeoutExpired"
            },
            "TimeoutExpired": {
              "Type": "Fail",
              "Cause": "TimeoutExceeded.",
              "Error": "TimeoutExpired"
            }
          }
        },
        {
          "StartAt": "WorkerState",
          "States": {
            "WorkerState": {
              "Type": "Wait",
              "InputPath": "$.WorkerBranchInput",
              "SecondsPath": "$.SecondsPath",
              "Next": "WorkerSuccessful"
            },
            "WorkerSuccessful": {
              "Type": "Fail",
              "Cause": "Throw Worker Success Exception",
              "Error": "WorkerSuccess"
            }
          }
        }
      ]
    },
    "ExecuteIfTimedOut": {
      "Type": "Pass",
      "End": true
    },
    "ExecuteIfWorkerSuccessful": {
      "Type": "Pass",
      "End": true
    }
  }
}
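To make the race between the two branches concrete, here is a small Python sketch of the decision the Parallel state ends up making. It is only a model of the example above, not Step Functions code: the function name is made up for illustration, and the handling of an exact tie is an assumption, since the answer does not say what happens in that case.

```python
def winning_error(timeout_seconds: float, worker_seconds: float) -> str:
    """Return the error name raised by whichever branch reaches its Fail state first.

    The branch that fails first aborts the other one, and the Catch block on
    the Parallel state routes on the error name returned here.
    """
    if worker_seconds < timeout_seconds:
        # The worker branch finished first and raised its "success" error.
        return "WorkerSuccess"
    # The wait branch finished first. Treating a tie as a timeout is an
    # assumption made for this sketch only.
    return "TimeoutExpired"


# Values taken from the HardCodedInputs state: timeout 5, worker runs for 3.
default_outcome = winning_error(timeout_seconds=5, worker_seconds=3)
slow_worker_outcome = winning_error(timeout_seconds=5, worker_seconds=8)
print(default_outcome)      # WorkerSuccess
print(slow_worker_outcome)  # TimeoutExpired
```

With the hard-coded inputs shown, the worker wins; raising the worker's SecondsPath above the timeout flips the outcome.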
Since I couldn't find a solution for a dynamic timeout, I made a workaround using a Choice state.
I needed to wait for an answer from a microservice, and the time depended on the number of objects I sent to it. Processing each object took about 3 minutes on average, so the timeout could be 3 minutes or more.
My microservice writes all of its results to a DB, so I created a Lambda that checks the DB in a loop.
The exit condition is
I work with the Serverless Framework; here is my final solution:
VerifyLambda:
  Type: Task
  Resource: arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:verify-step
  Next: IsFinished
IsFinished:
  Type: Choice
  Choices:
    - Variable: $.isFinish
      BooleanEquals: false
      Next: Wait 3m
  Default: NextLambdaStep
Wait 3m:
  Type: Wait
  Seconds: 180
  Next: VerifyLambda
NextLambdaStep: ...
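The answer does not show the verify Lambda itself, so the following is only a sketch of what such a handler might look like in Python. The count_pending lookup and the batchId input field are hypothetical stand-ins for the real DB query; the one thing taken from the state machine is the isFinish output field that the Choice state reads.

```python
from typing import Callable


def make_handler(count_pending: Callable[[str], int]):
    """Build a Lambda-style handler around a DB lookup.

    count_pending is a hypothetical function that returns how many objects
    of a batch the microservice has not yet written results for.
    """

    def handler(event, context=None):
        # "batchId" is an assumed input field, not from the original answer.
        pending = count_pending(event["batchId"])
        # Pass the input through and add the flag the Choice state checks.
        return {**event, "isFinish": pending == 0}

    return handler


# In-memory stand-in for the DB, for demonstration only.
pending_by_batch = {"batch-1": 2, "batch-2": 0}
handler = make_handler(lambda batch_id: pending_by_batch[batch_id])

result_running = handler({"batchId": "batch-1"})
result_done = handler({"batchId": "batch-2"})
print(result_running)
print(result_done)
```

While isFinish is false, the Choice state sends the execution to the Wait state and back to the handler; once it is true, the Default branch is taken.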
One way to handle it would be to catch the timeout error and issue a command to kill the Fargate task,
as in this example from the docs (https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html):
{
  "Comment": "A Hello World example of the Amazon States Language using an AWS Lambda function",
  "StartAt": "HelloWorld",
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:sleep10",
      "TimeoutSeconds": 2,
      "Catch": [
        {
          "ErrorEquals": ["States.Timeout"],
          "Next": "fallback"
        }
      ],
      "End": true
    },
    "fallback": {
      "Type": "Pass",
      "Result": "Hello, AWS Step Functions!",
      "End": true
    }
  }
}
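In the docs example the fallback is just a Pass state. To actually kill the Fargate task, the fallback could instead be a Lambda task along the lines of the Python sketch below. It assumes the state input carries the cluster name and task ARN under the hypothetical keys cluster and taskArn, and it takes the ECS client as a parameter; in a real Lambda that would be boto3.client("ecs"), whose stop_task call accepts the cluster, task, and reason arguments used here.

```python
def stop_fargate_task(event, ecs_client):
    """Stop the ECS task named in the event; intended as the Catch target."""
    response = ecs_client.stop_task(
        cluster=event["cluster"],   # assumed input field
        task=event["taskArn"],      # assumed input field
        reason="Step Functions state timed out",
    )
    return {"stoppedTask": response["task"]["taskArn"]}


class FakeEcsClient:
    """Stand-in for the real ECS client so the sketch runs without AWS access."""

    def __init__(self):
        self.calls = []

    def stop_task(self, **kwargs):
        self.calls.append(kwargs)
        return {"task": {"taskArn": kwargs["task"]}}


fake_client = FakeEcsClient()
result = stop_fargate_task(
    {"cluster": "demo-cluster", "taskArn": "arn:aws:ecs:us-east-1:123456789012:task/demo"},
    fake_client,
)
print(result)
```

Passing the client in keeps the function easy to exercise locally; how the task ARN reaches the fallback state depends on how the surrounding state machine is wired.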
With task tokens, I believe you're supposed to use the heartbeat timeout rather than a general timeout.
The docs call out that "The "HeartbeatSeconds": 600 field sets the heartbeat timeout interval to 10 minutes." and that "If the waiting task doesn't receive a valid task token within that 10-minute period, the task fails with a States.Timeout error name."
I think the heartbeat works here because it's a different service integration.
https://docs.aws.amazon.com/step-functions/latest/dg/connect-to-resource.html#connect-wait-token
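For reference, a task state that waits for a task token with a heartbeat timeout might be assembled like the Python sketch below. The SQS integration is used only as an example of a .waitForTaskToken resource; the queue URL, the message body, and the fallback state name are placeholders assumed for illustration.

```python
import json


def build_wait_for_token_state(queue_url: str, heartbeat_seconds: int = 600) -> dict:
    """Return a Task state that pauses until the task token is returned."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
        # Heartbeat timeout interval in seconds (600 = 10 minutes).
        "HeartbeatSeconds": heartbeat_seconds,
        "Parameters": {
            "QueueUrl": queue_url,
            "MessageBody": {
                # The context object supplies the token the worker must send back.
                "TaskToken.$": "$$.Task.Token",
            },
        },
        # "fallback" is a hypothetical state name for the timeout handler.
        "Catch": [{"ErrorEquals": ["States.Timeout"], "Next": "fallback"}],
        "End": True,
    }


# Placeholder queue URL (an assumption for illustration only).
state = build_wait_for_token_state(
    "https://sqs.us-east-1.amazonaws.com/123456789012/myQueue"
)
print(json.dumps(state, indent=2))
```

If the heartbeat interval passes without the token coming back, the States.Timeout error is caught and routed to the handler state.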
If you are using a Jenkins pipeline in the newer declarative style (it has a top-level pipeline { element), then there is a timeout option that can be used for the overall job or on individual stages:
pipeline {
    agent any
    options {
        timeout(time: 1, unit: 'HOURS') // timeout on whole pipeline job
    }
    stages {
        stage('Example') {
            options {
                timeout(time: 1, unit: 'HOURS') // timeout on this stage
            }
            steps {
                echo 'Hello World'
            }
        }
    }
}
Docs: https://jenkins.io/doc/book/pipeline/syntax/#options