Enrich Pubsub (GCP)
enrich-pubsub is a standalone JVM application that reads from and writes to PubSub topics.
It can be run from anywhere, as long as it has permissions to access the topics.
It is published on Docker Hub and can be run with the following command:
docker run \
-it --rm \
-v $PWD:/snowplow \
-e GOOGLE_APPLICATION_CREDENTIALS=/snowplow/snowplow-gcp-account-11aa55ff6b1b.json \
snowplow/snowplow-enrich-pubsub:3.7.1 \
--enrichments /snowplow/enrichments \
--iglu-config /snowplow/resolver.json \
--config /snowplow/config.hocon
The command above assumes that the current directory contains the following:
- GCP credentials JSON file
- enrichments directory (possibly empty) with all enrichment configuration JSONs
- Iglu Resolver configuration JSON
- enrich-pubsub configuration HOCON
Environment variables can be used in all of the above (for the Iglu resolver and enrichments, only from version 3.7.0 onwards).
Alternatively, you can download the jar file from the GitHub release and run it:
java -jar snowplow-enrich-pubsub-3.7.1.jar \
--enrichments /snowplow/enrichments \
--iglu-config /snowplow/resolver.json \
--config /snowplow/config.hocon
The configuration guide can be found on this page, and information about monitoring on this one.
Telemetry notice
By default, Snowplow collects telemetry data for Enrich PubSub (since version 3.0.0). Telemetry allows us to understand how our applications are used and helps us build a better product for our users (including you!).
This data is anonymous and minimal, and since our code is open source, you can inspect what’s collected.
If you wish to help us further, you can optionally provide your email (or just a UUID) in the telemetry.userProvidedId configuration setting.
If you wish to disable telemetry, you can do so by setting telemetry.disable to true.
See our telemetry principles for more information.
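As a rough illustration, the two telemetry settings mentioned above might look like this in the HOCON file passed via --config. This is a minimal sketch: it assumes the telemetry block sits at the top level of config.hocon, and the userProvidedId value is a placeholder, not a real identifier.

```hocon
# Sketch of a telemetry block in config.hocon (top-level placement is an assumption)
"telemetry": {
  # Set to true to turn telemetry off entirely
  "disable": false

  # Optional: an email or UUID that identifies you (placeholder value)
  "userProvidedId": "my_company"
}
```

Only the two keys named on this page are shown, and either one can be omitted.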