How schemas translate to database types
- Redshift and Postgres
- Databricks
- BigQuery
- Snowflake
- Elastic
The row order in this table is important. Type lookup stops after the first match is found scanning from top to bottom (with the two exceptions of "null" and "required" — the first two rows in the table).
Json Schema | Redshift/Postgres Type |
--- | --- |
… | … |

Fields that are not listed as "required" in the schema are nullable.

- If the content is longer than 4096 characters, it is truncated when inserted into Redshift.
- Content is stringified and quoted. If the content is longer than 65535 characters, it is truncated when inserted into Redshift.
- If nothing above matches, a catch-all applies: values are quoted as in JSON, and content longer than 4096 characters is truncated when inserted into Redshift.
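The first-match lookup described above can be sketched in Python. The rules and the catch-all type below are illustrative placeholders, not the actual Redshift mapping:

```python
# Illustrative sketch of first-match type lookup: rules are scanned top to
# bottom and the first predicate that matches the schema wins. These two
# rules and the catch-all are placeholders, not the real Redshift mapping.
RULES = [
    (lambda s: s.get("type") == "boolean", lambda s: "BOOLEAN"),
    (lambda s: s.get("type") == "string" and "maxLength" in s,
     lambda s: f"VARCHAR({s['maxLength']})"),
]

def lookup(schema: dict) -> str:
    # Scan top to bottom; the first matching rule wins.
    for predicate, to_type in RULES:
        if predicate(schema):
            return to_type(schema)
    # If nothing matches above, this is a catch-all.
    return "VARCHAR(4096)"
```

For instance, `lookup({"type": "string", "maxLength": 64})` yields `VARCHAR(64)`, while a schema matching no rule falls through to the catch-all.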
All fields in Databricks are nullable. Having "null" in the "type" or "enum" does not affect the warehouse type, and is ignored for the purposes of type casting as per the table below.
The row order in this table is important. Type lookup stops after the first match is found scanning from top to bottom (with the single exception of "null" — the first row in the table).
Json Schema | Databricks Type |
--- | --- |
… | … |

- Values will be quoted as in JSON.
- If nothing above matches, a catch-all applies: values will be quoted as in JSON.
The row order in this table is important. Type lookup stops after the first match is found scanning from top to bottom (with the single exception of "null", the first row in the table).
Json Schema | BigQuery Type |
--- | --- |
… | … |

- Objects can be nullable. Nested fields can also be nullable (same rules as for everything else).
- Arrays can be nullable. Nested fields can also be nullable (same rules as for everything else).
- Values will be quoted as in JSON.
- If nothing above matches, a catch-all applies: values will be quoted as in JSON.
All types are JSON.
When loading enriched events, the resulting JSONs are like the Snowplow Canonical Event model with the following changes.
Boolean fields reformatted
All boolean fields like br_features_java are either "0" or "1" in the canonical event model. The JSON converts these values to false and true.
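This conversion can be sketched as follows; `convert_booleans` and the field list are hypothetical names for illustration:

```python
def convert_booleans(event: dict, boolean_fields: list) -> dict:
    """Convert canonical "0"/"1" strings to JSON false/true for the given
    boolean fields (e.g. br_features_java)."""
    out = dict(event)
    for field in boolean_fields:
        if out.get(field) in ("0", "1"):
            out[field] = out[field] == "1"
    return out
```

For example, `convert_booleans({"br_features_java": "1"}, ["br_features_java"])` produces `{"br_features_java": True}`.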
New geo_location field
The geo_latitude and geo_longitude fields are combined into a single geo_location field of Elasticsearch's "geo_point" type.
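A sketch of the combination, assuming the "lat,lon" string form that Elasticsearch accepts for geo_point values; `add_geo_location` is a hypothetical helper:

```python
def add_geo_location(event: dict) -> dict:
    """Combine geo_latitude and geo_longitude into a single geo_location
    string ("lat,lon"), one of the formats Elasticsearch accepts for the
    geo_point type."""
    out = dict(event)
    lat, lon = out.get("geo_latitude"), out.get("geo_longitude")
    if lat is not None and lon is not None:
        out["geo_location"] = f"{lat},{lon}"
    return out
```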
Unstructured events
Unstructured events are expanded into full JSONs. For example, the event
{
  "schema": "iglu:com.snowplowanalytics.snowplow/link_click/jsonschema/1-0-1",
  "data": {
    "targetUrl": "http://snowplowanalytics.com/analytics/index.html",
    "elementId": "action",
    "elementClasses": [],
    "elementTarget": ""
  }
}
would be converted to the field
{
  "unstruct_com_snowplowanalytics_snowplow_link_click_1": {
    "targetUrl": "http://snowplowanalytics.com/analytics/index.html",
    "elementId": "action",
    "elementClasses": [],
    "elementTarget": ""
  }
}
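The field name can be derived from the Iglu schema URI: vendor dots become underscores and only the major schema version is kept. `unstruct_field_name` is a hypothetical helper, and the snake_casing of camelCase names is inferred from the examples in this document:

```python
import re

def unstruct_field_name(schema_uri: str) -> str:
    """Derive the field name for an unstructured event from its Iglu schema
    URI: "unstruct_" prefix, vendor dots to underscores, camelCase name to
    snake_case, major schema version only."""
    vendor, name, _fmt, version = schema_uri.split(":", 1)[1].split("/")
    snake = re.sub(r"([A-Z])", r"_\1", name).lower()
    major = version.split("-")[0]
    return f"unstruct_{vendor.replace('.', '_')}_{snake}_{major}"
```

Applied to the link_click schema above, this yields `unstruct_com_snowplowanalytics_snowplow_link_click_1`.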
Custom contexts
Each custom context in an array is similarly expanded to a JSON with its own field. For example, the array
[
  {
    "schema": "iglu:com.acme/contextOne/jsonschema/1-0-0",
    "data": {
      "key": "value"
    }
  },
  {
    "schema": "iglu:com.acme/contextTwo/jsonschema/3-0-0",
    "data": {
      "name": "second"
    }
  }
]
would be converted to
{
  "contexts_com_acme_context_one_1": {
    "key": "value"
  },
  "contexts_com_acme_context_two_3": {
    "name": "second"
  }
}
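The expansion above can be sketched as follows; `expand_contexts` is a hypothetical helper, and the camelCase-to-snake_case rule is inferred from the contextOne example:

```python
import re

def expand_contexts(contexts: list) -> dict:
    """Expand an array of custom contexts into one field per context:
    "contexts_" prefix, vendor dots to underscores, camelCase name to
    snake_case, major schema version only."""
    out = {}
    for ctx in contexts:
        vendor, name, _fmt, version = ctx["schema"].split(":", 1)[1].split("/")
        snake = re.sub(r"([A-Z])", r"_\1", name).lower()
        field = f"contexts_{vendor.replace('.', '_')}_{snake}_{version.split('-')[0]}"
        out[field] = ctx["data"]
    return out
```

Feeding it the two-element array above produces the contexts_com_acme_context_one_1 and contexts_com_acme_context_two_3 fields shown.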