In the documentation there is a section on semi-structured considerations that warns against certain situations in which values for a given path in a VARIANT column will not be materialized; for example, if the values do not all have the same data type. That is, if both {"foo":1}
and {"foo":"1"}
are present, the value for "foo" obviously cannot be extracted as its column.
But in the case where all values have the same type, it's still not entirely clear how so-called non-native types are handled.
The documentation describes for example dates and timestamps as non-native in the context of the VARIANT data type (meaning that such values are "stored as strings"), my question is whether this extends to number types such as FLOAT8
. The documentation suggests that native types might be understood in the context of JSON (which has a native number type that is rather hybrid in nature).
- Is
FLOAT8
a native type in VARIANT data or is it stored as a string? - Would such a value (stored as a string) be extracted into its own column or appear as a "parsed semi-structured structure" along with those remaining values that weren't extracted into other columns?
The documentation suggests that one makes performance tests "to see which structure provides the best performance" but this would be a lot easier with an accurate understanding of the extraction logic.