4
votes

I'm trying to create the following PostgreSQL query using SQLAlchemy Core:

SELECT DISTINCT ON (carrier) carrier,
       LAST_VALUE(ground) OVER wnd AS ground,
       LAST_VALUE(destinationzipstart) OVER wnd AS destinationzipstart,
       LAST_VALUE(destinationzipend) OVER wnd AS destionationzipend
FROM   tblshippingzone
WHERE  sourcezipstart <= 43234
AND    sourcezipend >= 43234
AND    destinationzipstart NOT BETWEEN 99500 AND 99950
AND    destinationzipstart NOT BETWEEN 96700 AND 96899
AND    destinationzipstart >= 1000
AND    (contiguous IS NULL OR contiguous = True)
AND    ground IS NOT NULL
WINDOW wnd AS (
   PARTITION BY carrier ORDER BY ground DESC, destinationzipstart);

This is what I have so far:

# Short-hand for accessing cols
all_cols = ShippingZoneDAL._table.c

# Window params
window_p = {'partition_by': all_cols.carrier,
            'order_by': [desc(all_cols.ground), all_cols.destination_zip_start]}

# Select columns
select_cols = [distinct(all_cols.carrier).label('carrier'),
           over(func.last_value(all_cols.ground), **window_p).label('ground'),
           over(func.last_value(all_cols.destination_zip_start), **window_p).label('destination_zip_start'),
           over(func.last_value(all_cols.destination_zip_end), **window_p).label('destination_zip_end')]

# Filter exprs
exprs = [all_cols.source_zip_start <= 43234,
     all_cols.source_zip_end >= 43234,
     ~all_cols.destination_zip_start.between(99500, 99950),  # Alaska zip codes
     ~all_cols.destination_zip_start.between(96700, 96899),  # Hawaii zip codes
     all_cols.destination_zip_start >= 1000,                 # Eliminates unusual territories
     or_(all_cols.contiguous == True, all_cols.contiguous == None),
     all_cols.ground != None]

# Build query
query = select(*select_cols).where(and_(*exprs))

But I get an error when building the query:

ArgumentError: FROM expression expected

Any ideas what I'm missing here?

BONUS POINTS:

I originally wanted the window function to be this instead:

WINDOW wnd AS (
  PARTITION BY carrier ORDER BY ground
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

But it seemed like sqlalchemy didn't support the 'ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING', based on this support request: https://bitbucket.org/zzzeek/sqlalchemy/issue/3049/support-range-specificaiton-in-window

Is there a way to use that clause, or no?

1

1 Answers

6
votes

It was mostly just a matter of re-arranging various methods into a working order. Here's the answer, if anyone runs into something similar:

# Short-hand for accessing cols
all_cols = ShippingZoneDAL._table.c

# Window params
window_p = {'partition_by': all_cols.carrier,
            'order_by': [desc(desc(all_cols.ground)), all_cols.destination_zip_start]}

# Select columns
select_cols = select(
                [all_cols.carrier,
                 func.last_value(all_cols.ground).over(**window_p).label(shipment_method),
                 func.last_value(all_cols.destination_zip_start).over(**window_p).label('destination_zip_start'),
                 func.last_value(all_cols.destination_zip_end).over(**window_p).label('destination_zip_end')])

# Filter exprs
exprs = [all_cols.source_zip_start <= 43234,
         all_cols.source_zip_end >= 43234,
         ~all_cols.destination_zip_start.between(99500, 99950),
         ~all_cols.destination_zip_start.between(96700, 96899),
         all_cols.destination_zip_start >= 1000,
         or_(all_cols.contiguous == True, all_cols.contiguous == None),
         all_cols.ground != None]

# Build query
query = select_cols.where(and_(*exprs)).distinct(all_cols.carrier)

Key notes to keep in mind with the solution above:

  • SQLAlchemy Core won't see select(*select_cols) as equivalent to select([all_cols.ground, etc]) in this scenario. Probably because the over method needs to be computed in the context of a select, or you lose reference to the FROM table.

  • To use DISTINCT ON from PostgreSQL, make sure the distinct comes after the primary select. If just used in the SELECT itself, it will just become a standard DISTINCT clause for that column.

  • Be careful with the labels themselves - the columns returned will only have key defined, and not name like a normal table column from the object.

If anyone still wants to tackle my bonus question, feel free to :) Still not sure if there's a way to use that yet in SQLAlchemy.