RDB shredder configuration reference
caution
You are reading documentation for an outdated version. Here’s the latest one!
Shredder and loader use different configurations starting from 2.0.0. An example config for shredder can be found here.
This is a complete list of the options that can be configured:
Option | Description |
---|---|
input | Required. S3 URL of the enriched archive. It must be populated separately with run=YYYY-MM-DD-hh-mm-ss directories. |
output.path | Required. S3 URL of the shredded output. |
output.compression | Optional. One of "NONE" or "GZIP". Default value GZIP. |
output.region | Optional if it can be resolved with the AWS region provider chain. AWS region of the S3 bucket. |
queue.type | Required. Type of the queue. It can be either sqs or sns. |
queue.queueName | Required if queue type is sqs. Name of the SQS queue. |
queue.topicArn | Required if queue type is sns. ARN of the SNS topic. |
queue.region | Optional if it can be resolved with the AWS region provider chain. AWS region of the SQS queue or SNS topic. |
formats.default | Required. Either TSV or JSON. Data format that the shredder produces by default. TSV is recommended because it enables table autocreation, but it requires Iglu Server to be available with known schemas (including Snowplow schemas). JSON does not require Iglu Server, but it requires Redshift JSONPaths to be configured and does not support table autocreation. |
formats.tsv | Required. List of Iglu URIs; can be set to an empty list []. Schemas in this list are shredded into TSV even if default is set to JSON. |
formats.json | Required. List of Iglu URIs; can be set to an empty list []. Schemas in this list are shredded into JSON even if default is set to TSV. |
formats.skip | Required. List of Iglu URIs; can be set to an empty list []. Schemas for which loading can be skipped. |
monitoring.sentry.dsn | Optional. Sentry DSN used for tracking runtime exceptions. |
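
To illustrate how the options above might fit together, here is a minimal sketch of a shredder config. It assumes that each dotted option name maps to a nested HOCON object (for example, output.path becomes the path key inside output). The bucket names, queue name, region, and Iglu URI are hypothetical placeholders, not values from this documentation.

```hocon
{
  # Enriched archive; must already contain run=YYYY-MM-DD-hh-mm-ss directories.
  # "example-bucket" is a hypothetical bucket name.
  "input": "s3://example-bucket/enriched/archive/",

  "output": {
    "path": "s3://example-bucket/shredded/",
    "compression": "GZIP",       # optional; "NONE" or "GZIP"
    "region": "eu-central-1"     # optional if the AWS region provider chain can resolve it
  },

  "queue": {
    "type": "sqs",                             # "sqs" or "sns"
    "queueName": "example-shredder-queue",     # hypothetical; required because type is sqs
    "region": "eu-central-1"                   # optional if resolvable by the provider chain
  },

  "formats": {
    "default": "TSV",
    "tsv": [],
    # Hypothetical schema that is still shredded into JSON although default is TSV
    "json": ["iglu:com.acme/legacy_event/jsonschema/1-0-0"],
    "skip": []
  }
}
```

With queue.type set to sns, queue.topicArn would take the place of queue.queueName. The optional monitoring.sentry.dsn setting is omitted here; under the same nesting assumption it would sit in a monitoring object containing a sentry object.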