0
votes

I'm trying to use Excel Power Query (Stuck with Excel 2010 in this one) to check if a folder called "SWPPP", which is inside a list of directories, has files in it.

The folder structure is as follows.

|O:\Planning Projects
|---2012\
|--------00-000 A Custom Folder Name\
|------------------------------------\SWPPP
|---2013\
|--------00-000 A Custom Folder Name\
|------------------------------------\SWPPP
|---2014\
|--------00-000 A Custom Folder Name\
|------------------------------------\SWPPP
|---2015\
|--------PB-391-000 A Custom Folder Name\
|------------------------------------\SWPPP
|--------10-000 Another Custom Folder Name\
|------------------------------------\SWPPP
|---2016\
|--------00-000 A Custom Folder Name\
|------------------------------------\SWPPP
|--------10-300 Another Custom Folder Name\
|------------------------------------\SWPPP
|---2017\
|--------00-000 A Custom Folder Name\
|------------------------------------\SWPPP
|--------10-000 Another Custom Folder Name\
|------------------------------------\SWPPP
|---Fill Permits\
|--------These folders I don't care about\
|--------Another Folder I don't care about\
|---Pat's Inspections\
|--------These folders I don't care about\
|--------Another Folder I don't care about\

From root (O:\Planning Projects), I'm only interested in the folders that are inside the "year" folders (e.g.: 2012, 2013, 2014...). There are more than 600 sub-directories (close to 700) inside them, and they will keep growing as time goes. I'm trying to automate and future-proof this as much as possible.

What I need is to go in each of those "year" folders, go to the sub-folders, check if there is a "SWPPP" folder, and then check if there are any files in it.

I'm close to get it done, but there's a part which I'm stuck. I'm getting the infamous:

Formula.Firewall: Query 'Query1' (step 'loadedFiles') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

Before going any further, I've done my research and accessed a couple of sites:

https://www.excelguru.ca/blog/2015/03/11/power-query-errors-please-rebuild-this-data-combination/

https://community.powerbi.com/t5/Desktop/Formula-Firewall-Query-references-other-queries-so-it-may-not/td-p/18619

Power Query - Please rebuild this data combination

This is the procedure I've followed:

  1. 1.I load the root directory. (O:\Planning Projects)
  2. Do some logic to get the "years" folder into a list: {2012, 2013, 2014,etc}\
  3. From the list, I iterate and get the directories that are in each of those lists. (So far, so good)
    1. Now, I try to append "SWPPP" to the folder structure, and try to get the Contents from the folder. Then that's where I get struck with the error.

I tried breaking the algorithm in different parts, as suggested by excelguru.ca, but that didn't do anything.

My second approach:

  1. Repeat steps 1 -3 from previous procedure.
  2. Load the directories into the Excel (See Step 3), and then, concatenated "SWPPP" in a custom table I created.
  3. Load the custom table into a new query using Power Query's import from Excel file (By importing the same file).
  4. I tried reading the files again, but it gave me the same error.

I'm using m programming language to put everything together (Need some help, only know it for a day), and I'm stuck.

Here's the second approach:

This first query loads all the sub-directories from the "year" folders.

Query Name: "LoadInspections"
let
    Url = "O:\Planning Projects",
    Source = Folder.Contents(Url),
    Cols = Table.SelectColumns(Source, "Name"),
    ColsList = Table.ToList(Cols),
    Cond = (val) => try not Number.IsNaN(Number.FromText(val)) otherwise false,
    Years = List.Select(ColsList, (val) => val = "pre 2012" or Cond(val)),
    load = List.Transform(Years, (val) => Folder.Contents(Url & "\" & val)),
    combinedTables = Table.Combine(load)

in
  combinedTables

This is the query that loads the custom table I created in Excel, which concatenates the directories with "SWPPP":

Query Name: ApplicationNumber_Files_Folders

let
    Source = Excel.Workbook(File.Contents("O:\MS4 Program\MCM4 Construction Site Stormwater Runoff Control\SWPPP Tracker\SWPPP Inspection Processor.xlsm"), null, true),
    ApplicationNumber_Files_Folders_Table = Source{[Item="ApplicationNumber_Files_Folders",Kind="Table"]}[Data],
    #"Changed Type" = Table.TransformColumnTypes(ApplicationNumber_Files_Folders_Table,{{"Application Number", type text}, {"Planning Project", type text}, {"Hidden Folder Path", type text}, {"Folder Path", type text}})
in
    #"Changed Type"

Then, as suggested by excelguru.ca, I tried breaking it up in a separate query:

Query Name: Query1

let
    Source = ApplicationNumber_Files_Folders,
    Cols = Table.SelectColumns(Source, "Hidden Folder Path"),
    ColsList = Table.ToList(Cols),
    SWPPP = List.Transform(ColsList, (url) => url&"\SWPPP"),
    // This line below is the problematic one
    loadedFiles = List.Transform(SWPPP , (url) => Folder.Contents(url))


in
    loadedFiles

I've visited these sites:

1
Is it not possible to just use the built in filters methods to filter down to only year folders then filter only to directory paths that end in "SWPPP"? The process you are going through sounds potentially over-developed for what you say you need.Wedge
@Wedge Actually, that's not a bad idea. Do you know if I could look to all the files under a certain sub-directory using Power Query?Jose A
Absolutely agree with @Wedge.Mike Honey
Also note the formula firewall can be disabled by setting Privacy to Ignore.Mike Honey
@Wedge: I've been wondering about something. Is it possible for me to get directories that match certain criteria? For example, subdirectories that match only years.Jose A

1 Answers

2
votes

If you are looking for your SWPP directories that have files in them, I would probably just change the base level query to be "Folder.Files" instead of "Folder.Contents", as this will return every file in every subdirectory of the specified base directory.

= Folder.Files("C:\folders")

Base Query

If these SWPP directories only exist in your year folders, all you really need to do is filter the Folder Path column to anything that ends in "SWPP\"

= Table.SelectRows(#"Filtered Rows", each Text.EndsWith([Folder Path], "SWPP\"))

enter image description here

If you wanted to filter only for the year directories, it depends how complex and varied the directories you are looking for are. If they are all just year folders in a single directory as you described, you can just do something simple like check the text in that location of the filepath to see if it is "20" (assuming your years are only going back to 2000 anyways...), which isn't the most modular check, but may be adequate for your needs.

= Table.SelectRows(Source, each (Text.Range([Folder Path], 11, 2) = "20"))

enter image description here

From there you can just filter down to whatever you need (removing all the columns except the folder path and remove duplicates on it will give you a list of every SWPP directory that has files in it)