0
votes

Given the project structure

.
├── source/
│   └── 1.xlsx
└── test.py

on a locally synchronized SharePoint directory with 1.xlsx containing nothing more than the string one in cell A1 and test.py containing:

import glob
import pandas as pd
import os

dir_path = os.path.dirname(os.path.realpath(__file__))

def main():
    file_list = glob.glob(rf"{dir_path}\source\*.xlsx")

    excel_list = []
    for file in file_list:
        excel_list.append(pd.read_excel(file))

    excl_merged = pd.concat(excel_list)

    writer = pd.ExcelWriter(rf"{dir_path}\merged.xlsx", engine='xlsxwriter')
    excl_merged.to_excel(writer, index=False)
    writer.save()

if __name__ == "__main__":
    main()

The script is run and ./merged.xlsx created. Now, the file is opened within the browser. 2.xlsx containing the string two (cell A1) is created within the source directory. The script is ran a second time. The browser view will not update. OneDrive will show a synchronization error:

enter image description here

If merged.xlsx is opened locally, Excel will give the following info:

enter image description here

I guess that synchronization trouble is caused by the total rewrite of merged.xlsx by the Python script. Is there an appropriate solution to update merged.xlsx instead of overwriting it and keep synchronization alive?

I would guess that the issue isn't caused by overwriting the file but by the fact that you are losing metadata from the file that Sharepoint is using to track users and modifications. XlsxWriter can add some basic document properties but not the review and user information that Excel adds to the file. I don't know if OpenPyXL can do it either. You would probably need to use a library that interacts with Excel directly like xlwings.jmcnamara