
The Snowplow Collector on AWS

On an AWS pipeline, the Snowplow Stream Collector receives events sent over HTTP(S) and writes them to a raw Kinesis stream. From there, the data is picked up and processed by the Snowplow validation and enrichment job.

On an AWS pipeline, the basic steps are:

  1. In the AWS console, create two Kinesis streams to which the collector will write good payloads and bad events.
  2. (Optional) Set up an SQS buffer to handle spikes in traffic.
  3. Configure and run the collector, following the main collector documentation, which describes the core concepts of how the collector works and the available configuration options.
  4. (Optional) Configure and run Snowbridge with an SQS source and a Kinesis target to move data from your SQS buffer back to the primary Kinesis queue.
  5. (Optional) Sink the raw data to S3 using the Snowplow S3 loader. This is recommended in production so that you keep a copy of the raw data before any processing.
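
To illustrate step 3, a minimal collector configuration fragment might look like the following. This is a hedged sketch: the stream names, port, and region are assumptions, and the authoritative schema and full option list live in the main collector documentation referenced above:

```hocon
collector {
  # Interface and port the collector listens on for incoming HTTP(S) events
  interface = "0.0.0.0"
  port = 8080

  streams {
    # Names of the raw Kinesis streams created in step 1
    # (illustrative names; substitute your own)
    good = "raw-good"
    bad = "raw-bad"

    sink {
      # Write received payloads to Kinesis
      enabled = kinesis
      region = "eu-central-1"
    }
  }
}
```

The `good` and `bad` stream names must match the streams created in the AWS console (or via the CLI) exactly, and the `region` must be the region those streams live in.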
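
As a sketch of step 1, the two raw streams can be created with the AWS CLI. The stream names `raw-good` and `raw-bad` and the single-shard sizing here are illustrative assumptions; pick names, region, and shard counts to suit your own deployment:

```shell
# Create the stream for good (successfully received) payloads
aws kinesis create-stream --stream-name raw-good --shard-count 1

# Create the stream for bad (malformed or oversized) events
aws kinesis create-stream --stream-name raw-bad --shard-count 1

# Block until both streams are ACTIVE before starting the collector
aws kinesis wait stream-exists --stream-name raw-good
aws kinesis wait stream-exists --stream-name raw-bad
```

The `wait stream-exists` calls matter because a newly created stream briefly sits in `CREATING` state, during which writes from the collector would fail.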