Troubleshooting¶
Kinesis Data Streams¶
As detailed in other sections, StreamAlert utilizes Amazon Kinesis Data Streams.
Review Kinesis Streams key concepts
ThroughputExceeded¶
Pertains to WriteProvisionedThroughputExceeded or ProvisionedThroughputExceededException.
The documentation above states:
“Each shard can support up to 1,000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys)”
“Each PutRecords request can support up to 500 records. Each record in the request can be as large as 1 MB, up to a limit of 5 MB for the entire request, including partition keys”
If you’re experiencing either error, at least one of the following is true:
1. You are exceeding 1,000 records/s of writes on at least one shard
2. You are exceeding 1MB/s of writes on at least one shard
3. You sent > 500 records in a single PutRecords request
4. You sent a record > 1MB
5. You are sending > 5MB in a PutRecords request to at least one shard
Conditions (1), (2), and (5) can be addressed by allocating more shards or by using partition keys that spread records across shards.
Conditions (3) and (4) can be addressed by using code or agents that enforce these limits before sending.
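As a rough illustration of the limit checks mentioned above, the sketch below splits records into batches that each fit in one PutRecords request. The function and constant names are hypothetical, 1MB is assumed to mean 1 MiB, and the partition key is counted toward both the per-record and per-request sizes as a conservative choice.

```python
MAX_RECORDS_PER_REQUEST = 500
MAX_RECORD_BYTES = 1024 * 1024       # assumption: 1 MB treated as 1 MiB
MAX_REQUEST_BYTES = 5 * 1024 * 1024  # assumption: 5 MB treated as 5 MiB


def record_size(record):
    """Bytes counted against the limits: data blob plus partition key."""
    return len(record["Data"]) + len(record["PartitionKey"].encode("utf-8"))


def chunk_records(records):
    """Yield lists of records that each fit within one PutRecords request."""
    batch, batch_bytes = [], 0
    for record in records:
        size = record_size(record)
        if size > MAX_RECORD_BYTES:
            # Condition (4): a single record over the limit can never be sent.
            raise ValueError("record of %d bytes exceeds the per-record limit" % size)
        # Conditions (3) and (5): start a new batch before either limit is hit.
        if batch and (len(batch) >= MAX_RECORDS_PER_REQUEST
                      or batch_bytes + size > MAX_REQUEST_BYTES):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch could then be passed as the Records argument of a PutRecords call. Oversized records raise an error here; an agent might instead drop or split them, depending on the data.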
How you set up your partition keys depends on your use cases, scale, and how you’re sending your data.
In our experience, there are three common use-cases:
No partition key (small scale)
Per-batch partition key (medium scale)
Per-record partition key (larger scale)
Explanation: A PutRecords request can have up to 500 records amounting to a total of 5MB. If you’re using a per-batch partition key, every record in the request goes to the same shard, which means you’re attempting to send up to 5MB to a single shard that has a limit of 1MB/s. Keep in mind: if your code/agent uses splay or has reasonable retry logic, an error or exception does not imply data loss, so a per-batch partition key may still be a viable strategy.
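To make the difference between per-batch and per-record keys concrete, the sketch below mimics how Kinesis maps a partition key to a shard: the key is hashed with MD5 into a 128-bit integer, which falls into one shard's hash key range. The helper names are hypothetical, and the sketch assumes the shards split the hash range evenly, which may not hold after resharding.

```python
import hashlib
import uuid


def shard_index(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    hash_key = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return hash_key * shard_count // 2 ** 128


def shards_hit(partition_keys, shard_count):
    """Return the set of shard indexes that a list of keys maps to."""
    return {shard_index(key, shard_count) for key in partition_keys}


batch_size, shard_count = 500, 4

# Per-batch key: one key shared by every record in the request.
batch_key = str(uuid.uuid4())
per_batch = shards_hit([batch_key] * batch_size, shard_count)

# Per-record key: a fresh random key for each record.
per_record = shards_hit([str(uuid.uuid4()) for _ in range(batch_size)], shard_count)

print(len(per_batch))   # 1: the whole batch lands on a single shard
print(len(per_record))  # number of distinct shards the random keys reached
```

With a shared key, the full batch counts against one shard's 1MB/s limit; with random per-record keys, the same batch is spread across the shards.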
StreamAlert enables AWS Enhanced Monitoring to help you diagnose these types of issues via shard-level metrics. Simply go to CloudWatch -> Metrics -> Kinesis. This also allows you to measure IncomingBytes and IncomingRecords.
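Once IncomingBytes and IncomingRecords have been measured, a back-of-the-envelope shard estimate follows from the per-shard write limits quoted earlier. The sketch below is only that: the function name, the 60-second period, and the 25% headroom are assumptions, 1MB/s is treated as 1 MiB/s, and it presumes records are spread evenly across shards.

```python
import math

SHARD_BYTES_PER_SECOND = 1024 * 1024  # assumption: 1 MB/s treated as 1 MiB/s
SHARD_RECORDS_PER_SECOND = 1000


def shards_needed(incoming_bytes, incoming_records, period_seconds=60, headroom=1.25):
    """Estimate shard count from metric sums collected over one period."""
    bytes_per_second = incoming_bytes / period_seconds
    records_per_second = incoming_records / period_seconds
    by_bytes = bytes_per_second * headroom / SHARD_BYTES_PER_SECOND
    by_records = records_per_second * headroom / SHARD_RECORDS_PER_SECOND
    # Whichever limit is tighter decides the count; a stream has at least 1 shard.
    return max(1, math.ceil(max(by_bytes, by_records)))


# 180 MiB and 90,000 records in one minute: 3 MiB/s and 1,500 records/s.
print(shards_needed(180 * 1024 * 1024, 90000))  # 4
```

Sums over a full period hide short bursts, so the shard-level metrics remain the better signal when throttling persists despite an adequate estimate.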
DescribeStream: Rate exceeded¶
Or DescribeDeliveryStream: Rate exceeded.
This API call is limited to 10 requests/s. Your agent/code should not be using this API call to determine if the Kinesis Stream is available to receive data. The agent/code should simply attempt to send the data and gracefully handle any exceptions.
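A minimal sketch of the "just send and handle failures" approach follows. It takes the send function as a parameter (for example, the put_records method of a boto3 Kinesis client) and relies on the PutRecords response listing one result per record, with an ErrorCode on the ones that failed. The function name, the retry count, the delays, and the choice to treat every exception as retryable are assumptions for illustration.

```python
import random
import time


def send_with_retry(put_records, stream_name, records,
                    max_attempts=5, base_delay=0.2, sleep=time.sleep):
    """Send records, retrying only the ones that were rejected.

    Returns the records still unsent after max_attempts (empty on success).
    """
    pending = list(records)
    for attempt in range(max_attempts):
        try:
            response = put_records(StreamName=stream_name, Records=pending)
        except Exception:
            # Assumption: any client error is retryable; the whole batch stays pending.
            results = None
        else:
            results = response["Records"]
        if results is not None:
            # Keep only the records whose result entry carries an error.
            pending = [rec for rec, res in zip(pending, results) if "ErrorCode" in res]
        if not pending:
            return []
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter (splay) before the next attempt.
            sleep(base_delay * 2 ** attempt * random.uniform(0.5, 1.5))
    return pending
```

No DescribeStream call is needed: the stream's availability is learned from the outcome of the send itself, and whatever the function returns can be buffered or logged by the caller.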
certificate verify failed¶
Run the following command on the impacted host, choosing the correct region: openssl s_client -showcerts -connect kinesis.us-west-2.amazonaws.com:443
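If this returns Verify return code: 0 (ok), the host itself can validate the endpoint’s certificate chain, so the problem lies with the certificates your agent/code trusts: it needs to use Amazon’s root and/or intermediate certificates (PEM) for TLS to function properly.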
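The same check can be reproduced from Python to confirm which CA bundle makes verification succeed. This is a sketch using only the standard library; the function names and the ca_bundle parameter are hypothetical, and the bundle path is whatever PEM file holds the Amazon certificates on your host.

```python
import socket
import ssl


def build_context(ca_bundle=None):
    """TLS context that verifies the server against ca_bundle, or the system CAs if None."""
    return ssl.create_default_context(cafile=ca_bundle)


def verify_endpoint(host, port=443, ca_bundle=None, timeout=5.0):
    """Return the peer certificate subject if verification succeeds.

    Raises ssl.SSLCertVerificationError when the chain cannot be validated.
    """
    context = build_context(ca_bundle)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["subject"]
```

For example, calling verify_endpoint("kinesis.us-west-2.amazonaws.com") with and without a ca_bundle path shows whether the bundle is the missing piece. If the agent is built on boto3, the same PEM file can typically be supplied through the client's verify parameter or the AWS_CA_BUNDLE environment variable.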