I've searched the web and asked people about my "simple" problem, but I haven't found a satisfactory answer.
My problem is the following:
I'm comparing data from Exchange (Excel files saved in some folders) with data in my system (an SQL query against a database). I'm designing a tool that compares data between a specific FROM date and TO date. All my exchange data filenames consist of a date in a fixed format followed by some string, and the Excel extension varies (.xls, .xlsx or .xlsm).
Obviously, what I need is a loop that searches for the needed files from the "FROM" date to the "TO" date. Let's take the period from 7th July 2020 to 13th July 2020, and say that the files for 11th July 2020 are missing. Bear in mind that my files are stored in a location with multiple subfolders named by MONTH etc.
Example:
C:\Users\VB\Desktop\VB\python\05
C:\Users\VB\Desktop\VB\python\06
C:\Users\VB\Desktop\VB\python\07
Here are some examples of my filenames:
07.07.2020 - BestScore.xls
07.07.2020 - WorstScore.xlsx
08.07.2020 - BestScore.xls
08.07.2020 - WorstScore.xlsx
09.07.2020 - BestScore.xls
09.07.2020 - WorstScore.xls
10.07.2020 - BestScore.xls
10.07.2020 - WorstScore.xlsm
12.07.2020 - BestScore.xls
12.07.2020 - WorstScore.xlsx
My basic code looks like this:
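import os
from datetime import date, timedelta

startD = date(2020, 7, 10)
day = timedelta(days=1)
EndD = date(2020, 7, 13)
# raw string, so the backslashes are not treated as escape sequences
folder = r'C:\Users\VB\Desktop\VB\python'
while startD <= EndD:
    # named date_str so it does not shadow the imported date class
    date_str = startD.strftime("%d.%m.%Y")
    file = date_str + ' - BestScore'
    file2 = date_str + ' - WorstScore'
    # pseudocode: IF file or file2 is found ---> do something
    # pseudocode: ELSE IF file or file2 is not found ---> print(file or file2 not found)
    startD += day  # advance to the next day so the loop terminates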
The problem occurs because I must use wildcards and need to search through multiple folders (sometimes I need to compare data from a few months back, so searching in different folders is a must).
I have tried different functions for looping through multiple folders:
- os.walk()
- glob.glob()
- glob2.iglob()
but none of them works the way I want. While looping, these functions check each file against the wildcard name and obviously hit the "else if" branch above for EACH file whose name does not match:
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
no file for 20200713
I don't need to check each file to see whether it is the right one; I just want to receive results like this:
found 07.07.2020 - BestScore.xls
found 07.07.2020 - WorstScore.xlsx
found 08.07.2020 - BestScore.xls
found 08.07.2020 - WorstScore.xlsx
found 09.07.2020 - BestScore.xls
found 09.07.2020 - WorstScore.xls
found 10.07.2020 - BestScore.xls
found 10.07.2020 - WorstScore.xlsm
NOT found 11.07.2020 - Bestscore
NOT found 11.07.2020 - Worstscore
found 12.07.2020 - BestScore.xls
found 12.07.2020 - WorstScore.xlsx
To sum up, I need a solution that searches multiple subfolders with a wildcard (*) and does NOT check every file with an IF statement.
I've been learning Python for a few months and I think this should not be a hard problem to solve, but I'm kinda confused about it. Solving this problem would complete my project, since everything else is already working :)
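A minimal sketch of the behaviour I'm after, in case it helps to make the question concrete. It is only an illustration under assumptions, not working project code: the function names check_files and report are made up, the filename pattern "<date> - <name>.xls*" is taken from the examples above, and it uses pathlib's Path.rglob to run one wildcard search per expected file across all subfolders, so the "found / NOT found" decision is made once per file name rather than once per file on disk.

```python
from datetime import date, timedelta
from pathlib import Path


def check_files(root, start, end, names=("BestScore", "WorstScore")):
    """For each day from start to end (inclusive) and each name,
    return a (label, matches) pair, where matches is a list of paths."""
    root = Path(root)
    results = []
    current = start
    while current <= end:
        stamp = current.strftime("%d.%m.%Y")
        for name in names:
            label = f"{stamp} - {name}"
            # rglob searches root and every subfolder below it;
            # ".xls*" is meant to cover .xls, .xlsx and .xlsm
            matches = sorted(root.rglob(f"{label}.xls*"))
            results.append((label, matches))
        current += timedelta(days=1)
    return results


def report(results):
    """Turn the (label, matches) pairs into one message per expected file."""
    lines = []
    for label, matches in results:
        if matches:
            lines.extend(f"found {path.name}" for path in matches)
        else:
            lines.append(f"NOT found {label}")
    return lines
```

Calling report(check_files(folder, date(2020, 7, 7), date(2020, 7, 13))) and printing each returned line should give one message per expected file. Note that this walks the folder tree once per expected file, which may be slow for large trees, and that whether the match is case-sensitive depends on the operating system.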
I would be very glad for any help.
Thanks.