2
votes

I was doing some reading on google and the sqlalchmey documentation but could not find any kind of built in functionlity that could take a standard sequel formated table and transform it into a cross tab query like Microsoft Access.

I have in the past when using excel and microsoft access created "cross tab" queries. Below is the sequel code from an example:

TRANSFORM Min([Fixed Day-19_Month-8_142040].VoltageAPhase) AS MinOfVoltageAPhase
SELECT [Fixed Day-19_Month-8_142040].Substation, [Fixed Day-19_Month-8_142040].Feeder, [Fixed Day-19_Month-8_142040].MeterID
FROM [Fixed Day-19_Month-8_142040]
GROUP BY [Fixed Day-19_Month-8_142040].Substation, [Fixed Day-19_Month-8_142040].Feeder, [Fixed Day-19_Month-8_142040].MeterID
PIVOT [Fixed Day-19_Month-8_142040].Date;

I am very unskilled when it comes to sequel and the only way I was able to write this was by generating it in access.

My question is: "Since SQL alchemy python code is really just a nice way of calling or generating sequel code using python functions/methods, is there a way I could use SQL alchemy to call a custom query that generates the sequel code (in above block) to make a cross tab query? Obviously, I would have to change some of the sequel code to shoehorn it in with the correct fields and names but the keywords should be the same right?

The other problem is...in addition to returning the objects for each entry in the table, I would need the field names...I think this is called "meta-data"? The end goal being once I had that information, I would want to output to excel or csv using another package.

UPDATED

Okay, so Van's suggestion to use pandas I think is the way to go, I'm currently in the process of figuring out how to create the cross tab:

def OnCSVfile(self,event):
        
        query = session.query(Exception).filter_by(company = self.company)
        data_frame = pandas.read_sql(query.statement,query.session.bind)        ## Get data frame in pandas
        pivot = data_frame.crosstab()

So I have been reading the pandas link you provided and have a question about the parameters.

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True)

Since, I'm calling "crosstab" off the dataframe object, I assume there must be some kind of built-in way the dataframe recognizes column and row names. For index, I would pass in a list of strings that specify which fields I want tabulated in rows? Columns I would pass in a list of strings that specifiy which field I want along the column? From what I know about cross tab queries, there should only be one specification field for column right? For values, I want minimum function, so I would have to pass some parameter to return the minimum value. Currently searching for an answer.

So if I have the following fields in my flat data frame (my original Sequel Query).

Name, Date and Rank

And I want to pivot the data as follows:

Name = Row of Crosstab

Date = Column of Crosstab

Rank = Min Value of Crosstab

Would the function call be something like:

data_frame.crosstab(['Name'], ['Date'], values=['Rank'],aggfunc = min)

I tried this code below:

    query = session.query(Exception)
    data_frame = pandas.read_sql(query.statement,query.session.bind)
    row_list =  pandas.Series(['meter_form'])
    col_list = pandas.Series(['company'])
    print row_list
    pivot = data_frame.crosstab(row_list,col_list)

But I get this error about data_frame not having the attribute cross tab:

enter image description here

1

1 Answers

3
votes

I guess this might be too much new information for you at once. Nonetheless, I would approach it completely differently. I would basically use pandas python library to do all the tasks:

  1. Retrive the data: since you are using sqlalchemy already, you can simply query the database for only the data you need (flat, without any CROSSTAB/PIVOT)

  2. Transform: put it intoa pandas.DataFrame. For example, like this:

    import pandas as pd
    
    query = session.query(FixedDay...)
    df = pd.read_sql(query.statement, query.session.bind)
    
  3. Pivot: Call pivot = df.crosstab(...) to create a pivot in memory. See pd.crosstab for more information.

  4. Export: Save it to Excel/csv using DataFrame.to_excel