Getting Started

It only takes a few minutes to get StreamAlert up and running! These instructions have been tested on macOS, but should also work on most Linux systems.

Install Dependencies

  1. Install Python 2.7 and pip
  2. Install Terraform v0.11.X:
brew install terraform  # MacOS Homebrew
terraform --version     # Must be v0.11.X
  3. Install virtualenv:
pip install --user virtualenv
virtualenv --version
  4. If you’re on a Linux system, you may need to install the Python development libraries:
sudo apt install python-dev    # Debian
sudo yum install python-devel  # CentOS/RHEL

Download StreamAlert

  1. Clone the latest stable release of StreamAlert:
git clone --branch stable https://github.com/airbnb/streamalert.git
  2. Create and activate a virtual environment:
cd streamalert
virtualenv -p python2.7 venv
source venv/bin/activate
  3. Install the StreamAlert requirements:
pip install -r requirements.txt
  4. Run unit tests to make sure everything is installed correctly:
tests/scripts/unit_tests.sh

Configure AWS Credentials

1. Create an AWS account and an IAM user with permissions for at least the following services:

  • Athena
  • CloudTrail
  • CloudWatch Events and Logs
  • DynamoDB
  • IAM
  • Kinesis Firehose and Streams
  • KMS
  • Lambda
  • S3
  • SNS
  • SQS
  2. Configure your AWS credentials:
pip install --user awscli
aws configure
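Behind the scenes, aws configure stores these values in ~/.aws/credentials, a small INI file with one section per profile. As a quick illustration of that format (Python 3 here for brevity; the key values below are fake placeholders, not real credentials), it can be parsed with the standard library's configparser:

```python
# Illustration only: the INI format `aws configure` writes to ~/.aws/credentials.
# The key values are fake placeholders, not real credentials.
from configparser import ConfigParser

sample = """\
[default]
aws_access_key_id = AKIAEXAMPLEEXAMPLE
aws_secret_access_key = abcdefEXAMPLEKEYabcdef
"""

config = ConfigParser()
config.read_string(sample)

# Both options must be present for the AWS CLI and SDKs to authenticate
assert config.has_option("default", "aws_access_key_id")
assert config.has_option("default", "aws_secret_access_key")
print("default profile looks complete")
```

If you manage multiple AWS accounts, additional named profiles can live in the same file alongside [default].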

Deploy

  1. Set basic StreamAlert configuration options:
./manage.py configure aws_account_id 111111111111  # Replace with your 12-digit AWS account ID
./manage.py configure prefix NAME                  # Choose a unique name prefix (alphanumeric characters only)
  2. Build the StreamAlert infrastructure for the first time:
./manage.py terraform init

There will be multiple Terraform prompts; type “yes” at each one to continue.

Note

You only need to run ./manage.py terraform init once for any given StreamAlert deployment, although it is safe to run again if necessary.

3. At this point, StreamAlert is up and running! You can, for example, see the S3 buckets that were automatically created:

aws s3 ls | grep streamalert

You can also login to the AWS web console and see StreamAlert’s CloudWatch logs, Lambda functions, etc.

Live Test

Now let’s upload some data and trigger an alert to see StreamAlert in action! This example uses SNS for both sending the log data and receiving the alert, but StreamAlert also supports many other data sources and alert outputs.

  1. Create 2 SNS topics:
aws sns create-topic --name streamalert-test-data
aws sns create-topic --name streamalert-test-alerts
  2. Export some environment variables for easy re-use later:
export SA_REGION=us-east-1        # StreamAlert deployment region
export SA_ACCOUNT=111111111111    # AWS account ID
export SA_EMAIL=email@domain.com  # Email to receive an SNS notification
  3. Subscribe your email to the alerts SNS topic:
aws sns subscribe --topic-arn arn:aws:sns:$SA_REGION:$SA_ACCOUNT:streamalert-test-alerts \
    --protocol email --notification-endpoint $SA_EMAIL

Note

You will need to click the verification link in your email to activate the subscription.

4. Add the streamalert-test-data SNS topic as an input to the (default) prod cluster. Open conf/clusters/prod.json and change the stream_alert module to look like this:

{
  "stream_alert": {
    "rule_processor": {
      "enable_metrics": true,
      "inputs": {
        "aws-sns": [
          "arn:aws:sns:REGION:ACCOUNTID:streamalert-test-data"
        ]
      },
      "log_level": "info",
      "memory": 128,
      "timeout": 10
    }
  }
}

5. Tell StreamAlert which log schemas will be sent to this input. Open conf/sources.json and change the sns section to look like this:

{
  "sns": {
    "streamalert-test-data": {
      "logs": [
        "cloudwatch"
      ]
    }
  }
}
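It's worth double-checking that these two files agree: the topic name at the end of each aws-sns input ARN in conf/clusters/prod.json must appear as a key under "sns" in conf/sources.json, otherwise StreamAlert won't know which log schemas to apply to records from that topic. A small illustrative snippet (not part of StreamAlert) with both fragments inlined:

```python
import json

# The two fragments from steps 4 and 5, inlined here for illustration
cluster = json.loads("""
{
  "stream_alert": {
    "rule_processor": {
      "inputs": {
        "aws-sns": ["arn:aws:sns:REGION:ACCOUNTID:streamalert-test-data"]
      }
    }
  }
}
""")

sources = json.loads("""
{
  "sns": {
    "streamalert-test-data": {"logs": ["cloudwatch"]}
  }
}
""")

# The topic name is the last component of each input ARN
for arn in cluster["stream_alert"]["rule_processor"]["inputs"]["aws-sns"]:
    topic = arn.rsplit(":", 1)[-1]
    assert topic in sources["sns"], "no schema mapping for %s" % topic
    print("%s -> %s" % (topic, sources["sns"][topic]["logs"]))
```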
  6. Add the alert topic as a StreamAlert output:
$ ./manage.py output new --service aws-sns

Please supply a short and unique descriptor for this SNS topic: test-email

Please supply SNS topic name: streamalert-test-alerts

If you look at conf/outputs.json, you’ll notice that the SNS topic was automatically added.

7. Configure a rule to send to the alerts topic. We will use rules/community/cloudtrail/cloudtrail_root_account_usage.py as an example, which alerts on any usage of the root AWS account. Change the rule decorator to:

@rule(
    logs=['cloudwatch:events'],
    req_subkeys={'detail': ['userIdentity', 'eventType']},
    outputs=['aws-sns:test-email']  # Add this line
)
def cloudtrail_root_account_usage(rec):
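Conceptually, the @rule decorator registers the function together with its metadata so the rule processor can look it up at runtime. The following is a drastically simplified, hypothetical sketch of that pattern (not StreamAlert's actual implementation, which also supports matchers and other options), with a predicate approximating the community rule's root-usage check:

```python
# Simplified, hypothetical sketch of the rule-decorator pattern;
# StreamAlert's real @rule helper supports more options than shown here.
RULES = {}

def rule(logs=None, req_subkeys=None, outputs=None):
    def decorator(func):
        # Register the rule function along with its metadata
        RULES[func.__name__] = {
            'func': func,
            'logs': logs or [],
            'req_subkeys': req_subkeys or {},
            'outputs': outputs or [],
        }
        return func
    return decorator

@rule(
    logs=['cloudwatch:events'],
    req_subkeys={'detail': ['userIdentity', 'eventType']},
    outputs=['aws-sns:test-email']
)
def cloudtrail_root_account_usage(rec):
    # Approximation of the community rule: alert on direct root account usage
    return (rec['detail']['userIdentity']['type'] == 'Root'
            and rec['detail']['userIdentity'].get('invokedBy') is None
            and rec['detail']['eventType'] != 'AwsServiceEvent')

# A record shaped like the test event used later in this guide fires the rule
record = {'detail': {'eventType': 'AwsConsoleSignIn',
                     'userIdentity': {'type': 'Root'}}}
assert cloudtrail_root_account_usage(record)
print(RULES['cloudtrail_root_account_usage']['outputs'])
```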
  8. Now we need to update StreamAlert with these changes:
# Hook the streamalert-test-data SNS topic up to the StreamAlert rule processor
./manage.py terraform build

# Deploy a new version of all of the Lambda functions with the updated rule and config files
./manage.py lambda deploy -p all

Note

Use terraform build and lambda deploy to apply any changes to StreamAlert’s configuration or Lambda functions, respectively. Some changes (like this example) require both.

  9. Time to test! Create a file named cloudtrail-root.json with the following contents:
{
  "account": "1234",
  "detail": {
    "eventType": "AwsConsoleSignIn",
    "userIdentity": {
      "type": "Root"
    }
  },
  "detail-type": "CloudTrail Test",
  "id": "1234",
  "region": "us-east-1",
  "resources": [],
  "source": "1.1.1.2",
  "time": "now",
  "version": "2018"
}

This is only a rough approximation of what the real log might look like, but it's good enough for our purposes. Then send it off to the data SNS topic:

aws sns publish --topic-arn arn:aws:sns:$SA_REGION:$SA_ACCOUNT:streamalert-test-data \
    --message "$(cat cloudtrail-root.json)"

If all goes well, an alert should arrive in your inbox within a few minutes! If not, look for any errors in the CloudWatch Logs for the StreamAlert Lambda functions.
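One quick thing to rule out is the fixture itself: if it doesn't satisfy the rule's req_subkeys from step 7, the record is skipped before the rule ever runs. An illustrative check (not part of StreamAlert), with the event trimmed to the relevant fields:

```python
# Illustrative check, not part of StreamAlert: verify the test event has the
# subkeys required by the rule in step 7. Event trimmed to the relevant fields.
event = {
    "detail": {
        "eventType": "AwsConsoleSignIn",
        "userIdentity": {"type": "Root"},
    }
}

# req_subkeys from the rule decorator: each parent key must contain
# all of the listed child keys, or the rule is never evaluated
req_subkeys = {"detail": ["userIdentity", "eventType"]}
for parent, children in req_subkeys.items():
    missing = [c for c in children if c not in event.get(parent, {})]
    assert not missing, "missing subkeys under %r: %s" % (parent, missing)
print("fixture satisfies req_subkeys")
```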

10. After 10 minutes (the default refresh interval), the alert will also be searchable in AWS Athena. Select your StreamAlert database in the dropdown on the left and preview the alerts table:

Query Alerts Table in Athena

(Here, my name prefix is testv2.) If no records are returned, look for errors in the athena_partition_refresh function or try invoking it directly.

And there you have it! Ingested log data is parsed, classified, and scanned by the StreamAlert rules engine, and any resulting alerts are delivered to your configured output(s) within a matter of minutes.