Skip to main content

How schemas translate to database types

note

The row order in this table is important. Type lookup stops after the first match is found scanning from top to bottom (with the two exceptions of "null" and "required" — the first two rows in the table).

Json SchemaRedshift/Postgres Type
{
"type": ["null", T1, ...]
}

OR

{
"enum": ["null", E1, ...]
}

"null" is not considered for type casting logic, only for the nullability constraint. Type lookup will continue down the table.

"null"’s position in a list (type or enum) does not matter.

Note that each major schema version (1-0-0, 2-0-0, etc) results in a new column (name ending with _1, _2, etc). Once the loader creates a column for a given schema version as NULLABLE or NOT NULL, it will never alter the nullability constraint for that column. For example, if a field is nullable in schema version 1-0-0 and not nullable in version 1-0-1, the column will remain nullable. (In this example, the Enrich application will still validate data according to the schema, accepting null values for 1-0-0 and rejecting them for 1-0-1.)

NULLABLE

{
"properties": {
"f1": {"type": "T"}
},
"required": [ "f1" ]
}

"required" is not considered for type casting logic, only for the nullability constraint. Type lookup will continue down the table.

Fields that are not listed as "required" are nullable. Fields are also nullable when they are listed as "required" but have "null" in their type or enum definition. (In the latter case, the Enrich application will still validate that the field is present, even if it’s null.)

Note that each major schema version (1-0-0, 2-0-0, etc) results in a new column (name ending with _1, _2, etc). Once the loader creates a column for a given schema version as NULLABLE or NOT NULL, it will never alter the nullability constraint for that column. For example, if a field is not required in schema version 1-0-0 and required in version 1-0-1, the column will remain nullable. (In this example, the Enrich application will still validate data according to the schema, accepting null values for 1-0-0 and rejecting them for 1-0-1.)

NOT NULL

{
"enum": [E1, E2, ...]
}

The enum can contain more than one JavaScript type: string, number|integer, boolean. For the purposes of this number and integer are the same.

array, object, NaN and other types in the enum will be cast as fallback VARCHAR(4096).

If content size is longer than 4096 it would be truncated when inserted into the Redshift.

VARCHAR(M)

M is the maximum size of json.stringify(E*)

{
"type": ["boolean", "integer"]
}

OR

{
"type": ["integer", "boolean"]
}

VARCHAR(10)

{
"type": [T1, T2, ...]
}

VARCHAR(4096)

If content size is longer than 4096 it would be truncated when inserted into the Redshift.

{
"type": "string",
"format": "date-time"
}

TIMESTAMP

{
"type": "string",
"format": "date"
}

DATE

{
"type": "array"
}

VARCHAR(65535)

Content is stringified and quoted.

If content size is longer than 65535 it would be truncated when inserted into the Redshift.

{
"type": "integer",
"maximum": M
}
  • M 32767

SMALLINT

{
"type": "integer",
"maximum": M
}
  • 32767 < M 2147483647

INT

{
"type": "integer",
"maximum": M
}
  • M >2147483647

BIGINT

{
"type": "integer",
"enum": [E1, E2, ...]
}
  • Maximum E* 32767

SMALLINT

{
"type": "integer",
"enum": [E1, E2, ...]
}
  • 32767 < maximum E* 2147483647

INT

{
"type": "integer",
"enum": [E1, E2, ...]
}
  • Maximum E* > 2147483647

BIGINT

{
"type": "integer"
}

BIGINT

{
"multipleOf": B
}

INT

{
"type": "number",
"multipleOf": B
}
  • Only works for B=2

DECIMAL(36,2)

{
"type": "number"
}

DOUBLE

{
"type": "boolean"
}

BOOLEAN

{
"type": "string",
"minLength": M,
"maxLength": M
}
  • M is the same in minLength and maxLength

CHAR(M)

{
"type": "string",
"format": "uuid"
}

CHAR(36)

{
"type": "string",
"format": "ipv6"
}

VARCHAR(39)

{
"type": "string",
"format": "ipv4"
}

VARCHAR(15)

{
"type": "string",
"format": "email"
}

VARCHAR(255)

{
"type": "string",
"maxLength": M
}
  • enum is not defined

VARCHAR(M)

{
"enum": ["E1"]
}
  • E1 is the only element

CHAR(M)

M is the size of json.stringify("E1")

If nothing matches above, this is a catch-all.

VARCHAR(4096)

Values will be quoted as in JSON.

If content size is longer than 4096 it would be truncated when inserted into the Redshift.

Was this page helpful?