1
votes

The default behaviour when importing data from a database table (such as SQL Server) is to bring in all columns and then select which columns you would like to remove.

Is there a way to do the reverse? ie Select which columns you want from a table? Preferably without using a Native SQL solution.

M:

let
    db = Sql.Databases("sqlserver.database.url"){[Name="DatabaseName"]}[Data],
    Sales_vDimCustomer = db{[Schema="Sales",Item="vDimCustomer"]}[Data],
    remove_columns = Table.RemoveColumns(Sales_vDimCustomer,{"Key", "Code","Column1","Column2","Column3","Column4","Column5","Column6","Column7","Column8","Column9","Column10"})
in
    remove_columns 

The snippet above shows the connection and subsequent removal.

Compared to the native SQL way way:

= Sql.Database("sqlserver.database.url", "DatabaseName", [Query="
    SELECT Name,
           Representative,
           Status,
           DateLastModified, 
           UserLastModified,  
           ExtractionDate
    FROM Sales.vDimCustomer
"])

I can't see much documentation on the }[Data], value in the step so was hoping maybe that I could hijack that field to specify which fields from that data.

Any ideas would be great! :)

My first concern is that when this gets compiled down to SQL, it gets sent as two queries (as watched in ExpressProfiler).

The first query removes the selected columns and the second selects all columns.

My second concern is that if a column is added to or removed from the database then it could crash my report (additional columns in Excel Tables jump your structured table language formulas to the wrong column). This is not a problem using Native SQL as it just won't select the new column and would actually crash if the column was removed which is something I would want to know about.

1

1 Answers

4
votes

Ouch that was actually easy after I had another think and a look at the docs.

let db = Sql.Databases("sqlserver.database.url"){[Name="DatabaseName"]}[Data], Sales_vDimCustomer = Table.SelectColumns( (db{[Schema="Sales",Item="vDimCustomer"]}[Data], { "Name", "Representative", "Status", "DateLastModified", "UserLastModified", "ExtractionDate" } ) in Sales_vDimCustomer

This also loaded much faster than the other way and only generated one SQL requested instead of two.