We have multiple applications developed in Visual Foxpro 8.0 running in a data center on Windows 2008 R2 on VMware. We also have a Citrix farm on the same network where users run yet another VFP 8.0 application in Citrix sessions. All applications share the same set of data tables located on a file server (also Windows 2008 R2 VM). Virtual hosts are connected by 10Gb LAN (managed switch).
Since mid-July we started seeing random 1104 "Error reading file..." errors on multiple different applications on multiple servers. All of them reference different files on the file server.
The problem started mid-July and it frequency gradually increased. Earlier it was most frequent in the afternoons by 3 pm, now it happens from early morning till late afternoon. It affects EDI servers (these run batch jobs in unattended mode) and Citrix servers and a variety of applications. It occurs when a VFP application (any of them) tries to open a database container file or individual tables most often with USE command but some times executing a SQL Select statement, or when loading a VFP form that opens tables in DataEnvironment
We caught a moment when the same exact error happened on two different servers running different applications at the same exact moment (up to a second). We also saw two different applications running on the same computer erroring out at the same moment.
We replaced the file server with a new virtual machine with no relief (we since changed it back to the old file server ).
We disabled the antivirus.
We updated VMware on all hosts to the latest version.
Sysinternals Process Monitor displays "INVALID_NETWORK_RESPONSE" event when the error occurs.
We captured traffic on both the server side and client side when the error occurred and had it analyzed by a network analysis specialist. He observed a peculiar pattern, where client OS starts retrieving the file in question from the file server AFTER VFP application had thrown an error. It seems that VFP application requests a file from OS, then it either gets an abnormal response or just times out and only after that the OS sends packets requesting the file. Again, this happens sporadically.
OpLocks and SMB2 have been disabled on all computers both on the server and client side of the equation for many years and everything was running smoothly until now...
Any advice would be greatly appreciated.