Enrich Kafka (cloud agnostic)

enrich-kafka is a standalone JVM application that reads from and writes to Kafka. It can be run from anywhere, as long as it can communicate with your Kafka cluster.

It is published on Docker Hub and can be run with the following command:

docker run \
  -it --rm \
  -v "$PWD":/snowplow \
  snowplow/snowplow-enrich-kafka:3.7.1 \
  --enrichments /snowplow/enrichments \
  --iglu-config /snowplow/resolver.json \
  --config /snowplow/config.hocon

The command above assumes that your current directory has the following structure:

enrichments: a directory (possibly empty) containing all enrichment configuration JSONs
resolver.json: the Iglu Resolver configuration JSON
config.hocon: the configuration HOCON
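
As a quick sanity check before starting the container, the layout can be verified with a few lines of shell. This is a minimal sketch, not part of enrich-kafka: the function name check_layout is hypothetical, and the file names are simply the ones used in the docker command above.

```shell
# Verify that a directory has the layout the docker command expects.
# check_layout is a hypothetical helper; it returns 0 if everything is
# present and 1 otherwise, listing what is missing on stderr.
check_layout() {
  dir="${1:-$PWD}"
  status=0
  if [ ! -d "$dir/enrichments" ]; then
    echo "missing directory: $dir/enrichments" >&2
    status=1
  fi
  for file in resolver.json config.hocon; do
    if [ ! -f "$dir/$file" ]; then
      echo "missing file: $dir/$file" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage: check_layout "$PWD" && echo "layout looks complete"
```

The check only confirms that the paths exist. It does not validate the contents of the JSON or HOCON files.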

Environment variables can be used in all of the above (in the Iglu resolver and enrichment JSONs only from version 3.7.0 onwards).
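
One way this might look in practice: pass the variable into the container with docker's -e flag and reference it from the HOCON file with an unquoted ${...} substitution. The sketch below is illustrative only. The variable name KAFKA_BOOTSTRAP_SERVERS, the bootstrapServers field shown in the comment, and the run_enrich wrapper are assumptions, so check the configuration guide for the actual field names.

```shell
# In config.hocon, a value can be taken from the environment with a HOCON
# substitution (unquoted, so it is not treated as a literal string), e.g.
#   "bootstrapServers": ${KAFKA_BOOTSTRAP_SERVERS}
# The field and variable names here are illustrative assumptions.

# Hypothetical wrapper: starts the container with the variable set.
# Nothing runs until the function is called.
run_enrich() {
  docker run \
    -it --rm \
    -v "$PWD":/snowplow \
    -e KAFKA_BOOTSTRAP_SERVERS="$1" \
    snowplow/snowplow-enrich-kafka:3.7.1 \
    --enrichments /snowplow/enrichments \
    --iglu-config /snowplow/resolver.json \
    --config /snowplow/config.hocon
}

# Usage: run_enrich "broker-1:9092,broker-2:9092"
```

Keeping the broker address out of the file lets the same config.hocon be reused across environments, with only the variable changing.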

Alternatively, you can download the jar file from the GitHub release and run it:

java -jar snowplow-enrich-kafka-3.7.1.jar \
  --enrichments /snowplow/enrichments \
  --iglu-config /snowplow/resolver.json \
  --config /snowplow/config.hocon

The configuration guide can be found on this page, and information about monitoring on this one.

Telemetry notice

By default, Snowplow collects telemetry data for Enrich Kafka (since version 3.0.0). Telemetry allows us to understand how our applications are used and helps us build a better product for our users (including you!).

This data is anonymous and minimal, and since our code is open source, you can inspect what’s collected.

If you wish to help us further, you can optionally provide your email (or just a UUID) in the telemetry.userProvidedId configuration setting.

If you wish to disable telemetry, you can do so by setting telemetry.disable to true.

See our telemetry principles for more information.