Wanted to ask / discuss with you about null values behaviour in BigQuery.
I have noticed that filtering out real values in a NULLABLE column, will results filtering out both the value requested and NULL values.
Take this query for ex:
select * from
(select NULL as some_nullable_col, "name1" as name),
(select 4 as some_nullable_col, "name2" as name),
(select 1 as some_nullable_col, "name3" as name),
(select 7 as some_nullable_col, "name4" as name),
(select 3 as some_nullable_col, "name5" as name)
--WHERE some_nullable_col != 3
All results return as expected,
And then:
select * from
(select NULL as some_nullable_col, "name1" as name),
(select 4 as some_nullable_col, "name2" as name),
(select 1 as some_nullable_col, "name3" as name),
(select 7 as some_nullable_col, "name4" as name),
(select 3 as some_nullable_col, "name5" as name)
WHERE some_nullable_col != 3
will omit 2 columns. the value 3 and null.
I guess this happens because BigQuery won't index null values / won't scan null values on where clause for efficiency, but it also brings troubles:
Each time I filter on a nullable column, the filter will look like
WHERE some_nullable_col != 3 OR some_nullable_col IS NULL
This is obviously less comfortable.
Just wanted to get an explanation / does BigQuery's roadmap offers a fix for this issue?