I have used openpyxl with Workbook(write_only=True) to create large Excel xlsx files. In this mode I cannot format the Excel headers. I then save the xlsx and open it again with openpyxl load_workbook(my_book) and have the ability to format the cells. I then save the file. If the file isn't too large it saves but otherwise gives a memory error. So Openfile allows me to create and save the worksheet but not necessarily re-open and save the same worksheet. In this example I just load_workbook then save without changing the xlsx to show the error
from openpyxl import Workbook
from openpyxl import load_workbook
from openpyxl.styles import Font
from openpyxl.styles import PatternFill
wb = Workbook(write_only=True)
# then append a lot of rows
logging.info('Save unformatted xlsx')
wb.save(my_book)
workbook = load_workbook(my_book)
# the worksheet always loads ok at this point, even when 700,000 rows
workbook.save(my_book)
## Immediately after load_workbook I do workbook.save(my_book).
When the rowcount is around 8600 there is no problem. When 350,000 there is a memory error
File "src\lxml\serializer.pxi", line 1268, in lxml.etree._IncrementalFileWriter._handle_error
File "src\lxml\etree.pyx", line 316, in lxml.etree._ExceptionContext._raise_if_stored
File "src\lxml\serializer.pxi", line 650, in lxml.etree._FilelikeWriter.write
MemoryError
logging.info('Saved unformatted xlsx immediately after opening again')
workbook = load_workbook(my_book)
# If no error I do some formatting and all is well and can save ok
Python 3.4.3 openpyxl (2.5.1) lxml (4.2.1)
There are a lot of solutions for older versions but I cannot see any for openpyxl (2.5.1). Does anyone have an answer for openpyxl, or can recommend what to use to open an existing large xlsx and format cells?