
Quick Start FAQ

Why is there a limit on throughput?

Each Snowplow application is deployed as a Docker image on a single EC2 (AWS) or Compute Engine (GCP) instance. This, along with the streams themselves (AWS only), is the limiting factor when it comes to throughput. We made this decision for the following reasons:

  • We wanted to keep the costs of this experience low, and using ECS Fargate or Kubernetes would be more expensive
  • A single instance per application provides more than enough resource for a proof of concept or a first production use case, and is enough to get you started with our open-source software

How do I shut down the pipeline?

If you would like to shut down your pipeline, you can easily do so by running terraform destroy.
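As a minimal sketch, assuming you run it from the directory that contains your Terraform configuration:

```bash
# Terraform lists the resources it will delete and asks for confirmation
terraform destroy
```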

Note that if you want to delete your S3 bucket and Postgres databases, you will need to do that from within the AWS or Google Cloud console. If you want to keep them, you can; just be aware that the next time you spin up your pipeline you might see errors when the script that creates the S3 bucket and Postgres databases runs.

How do I make the pipeline production ready?

If you are at the point where you would like to deliver higher-volume production use cases, here are some general guidelines on delivering a highly available, auto-scaling pipeline:

On AWS:

How do I upgrade the version of the application that I am using?

We release new versions of our pipeline components very frequently; however, the versions used within the Terraform modules are updated in line with our platform releases, since these are the most stable and recommended versions of our components. Sign up to get the latest updates on platform releases and new features.

When a new version of a module is released, follow these instructions to upgrade:

  • Update the module version in your Terraform (see the example below)
  • Run terraform plan to check what changes will be made
  • Run terraform apply to roll out the new version
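As an illustrative sketch of the first step, bumping a module version looks like the following; the module name and version number here are examples only, so pin to the release you are actually upgrading to:

```hcl
module "collector" {
  # Example module source; use whichever Snowplow module you deployed
  source  = "snowplow-devops/collector-kinesis-ec2/aws"
  version = "0.6.0" # bump this to the newly released module version

  # ...the rest of your existing configuration stays unchanged...
}
```

After updating the version, terraform plan shows which resources will be replaced, and terraform apply rolls the change out.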

With the standard deployment you will only have a single collector instance, so you will experience brief downtime during the upgrade, typically less than a minute. To prevent this, you will need to move to a multi-collector setup, so that there are multiple collector instances behind the load balancer (see the sketch below).
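As a hypothetical sketch only: if the collector module you are using exposes autoscaling-group sizing inputs (check the module documentation for the exact variable names; min_size and max_size here are assumptions), keeping a minimum of two instances behind the load balancer avoids the single-instance downtime window:

```hcl
module "collector" {
  source = "snowplow-devops/collector-kinesis-ec2/aws"
  # ...existing configuration...

  min_size = 2 # assumed input name: always keep two collectors running
  max_size = 4 # assumed input name: allow scaling out under load
}
```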

Which enrichments are enabled by default?

The following enrichments are enabled by default within the Enrich module:

Other available enrichments and their configurations can be found here.

To enable a different enrichment, you will need to add the appropriate Terraform inputs to the snowplow-devops/enrich-kinesis-ec2/aws module.
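As a hypothetical sketch (the input name enrichment_ip_lookups and the file layout are assumptions; consult the module documentation for the exact variable each enrichment uses), passing an enrichment configuration might look like this:

```hcl
module "enrich" {
  source = "snowplow-devops/enrich-kinesis-ec2/aws"
  # ...existing configuration...

  # Assumed input name: supply the enrichment's JSON configuration
  enrichment_ip_lookups = file("${path.module}/enrichments/ip_lookups.json")
}
```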

Troubleshooting Terraform Errors

The following are some common errors that you might encounter when running terraform plan or terraform apply.

AWS:

Error: Invalid provider configuration

Provider "registry.terraform.io/hashicorp/aws" requires explicit configuration. Add a provider block to the root module and configure the provider's required arguments as described in the provider documentation.

Solution: Double check that your AWS Access Key ID and AWS Secret Access Key are set up correctly (with no typos) using the aws configure command.
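If the provider block itself is missing, a minimal one looks like the following; this sketch assumes your credentials come from the standard AWS credential chain (for example, the files that aws configure writes):

```hcl
provider "aws" {
  region = "eu-west-1" # example region; use the region you deploy into
}
```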

Error: "x.x.x.x" is not a valid CIDR block

"x.x.x.x" is not a valid CIDR block: invalid CIDR address: x.x.x.x with module.iglu_server.aws_security_group_rule.ingress_tcp_22, on .terraform/modules/iglu_server/main.tf line 143, in resource "aws_security_group_rule" "ingress_tcp_22": 143: cidr_blocks = var.ssh_ip_allowlist

Solution: Add a mask to the IP in your terraform.tfvars file, e.g. x.x.x.x/32.
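For example, in terraform.tfvars (keeping the placeholder IP from the error message):

```hcl
# A /32 mask limits SSH access to that single address
ssh_ip_allowlist = ["x.x.x.x/32"]
```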

Error creating Application Load Balancer

ValidationError: At least two subnets in two different Availability Zones must be specified status code: 400

Solution: Add subnets to cover at least 2 availability zones. See this AWS guide on how to set this up.

After this step, add the two freshly created subnets to your terraform.tfvars file like this: public_subnet_ids = ["subnet-00000000", "subnet-00000001"] and run terraform apply again.

Error creating DB Subnet Group

DBSubnetGroupDoesNotCoverEnoughAZs: DB Subnet Group doesn't meet availability zone coverage requirement. Please add subnets to cover at least 2 availability zones. Current coverage: 1 status code: 400

Solution: Add subnets to cover at least 2 availability zones. See this AWS guide on how to set this up.

After this step, add the two freshly created subnets to your terraform.tfvars file like this: public_subnet_ids = ["subnet-00000000", "subnet-00000001"] and run terraform apply again.

Error creating DB Instance: InvalidParameterValue

The parameter MasterUserPassword is not a valid password. Only printable ASCII characters besides '/', '@', '"', ' ' may be used.

Solution: Modify the iglu_db_password in your terraform.tfvars file so that it does not contain any of the forbidden characters ('/', '@', '"', ' '). Make sure the password is not longer than 13 characters.
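For example, in terraform.tfvars (the value shown is an illustrative placeholder, not a recommended password):

```hcl
# 12 characters, none of which are '/', '@', '"' or a space
iglu_db_password = "Str0ngPass12"
```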
