I have an Azure website where a user can upload many XML files. These files need to be processed and their data written to the database.
For this processing I use a continuous WebJob.
For reasons that aren't relevant here, all uploaded files must be processed per user. So I have a table with all the files and the userId, and a table with the running jobs. I have multiple WebJobs doing the same process: each one looks in the files table for files that need processing and, before starting, checks the running-jobs table to make sure no other job is already processing that user's files.
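The claim check works roughly like this minimal sketch (the RunningJobs table, its columns, and TryClaimUser are simplified, hypothetical names, and I'm assuming a SQL Server backend):

using System;
using System.Data.SqlClient;

static bool TryClaimUser(SqlConnection conn, Guid userId, string jobId)
{
    // Insert a claim row only if no other job already holds this user;
    // doing the check and the insert in one statement keeps the claim atomic.
    const string sql = @"
INSERT INTO RunningJobs (UserId, JobId, StartedUtc)
SELECT @UserId, @JobId, SYSUTCDATETIME()
WHERE NOT EXISTS (SELECT 1 FROM RunningJobs WHERE UserId = @UserId);";

    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@UserId", userId);
        cmd.Parameters.AddWithValue("@JobId", jobId);
        return cmd.ExecuteNonQuery() == 1; // one row inserted = claim won
    }
}

With a unique constraint on RunningJobs.UserId, the claim stays safe even if two jobs race for the same user.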
This works fine and can run for months without any problem. But sometimes the continuous WebJobs restart, mostly at night (my time), costing me valuable processing time. I'm the only one accessing Azure, and I had not deployed anything new prior to the restarts. The job is usually in the middle of processing when it restarts, so memory could be an issue, but I'm running an S3 and CPU and memory never exceed 40%. The logging isn't very helpful either:
[01/25/2018 05:03:20 > 5657e1: INFO] Starting job: 28158.
[01/25/2018 09:49:24 > 5657e1: SYS INFO] WebJob is still running
[01/25/2018 20:23:06 > 5657e1: SYS INFO] Status changed to Starting
[01/25/2018 20:23:06 > 5657e1: SYS INFO] WebJob singleton setting is False
Because the WebJob doesn't finish cleanly, the running-jobs table isn't updated. After the restart, each job still thinks another WebJob is processing that user's files, so all the jobs wait for each other and nothing happens.
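Until I know the cause, I could at least break the deadlock by treating old claims as stale. A minimal sketch, assuming the running-jobs table records when a job started and that no real run ever takes more than a few hours (both assumptions on my part):

using System.Data.SqlClient;

static void ReleaseStaleClaims(SqlConnection conn)
{
    // Remove claims whose job has apparently died (e.g. after a restart);
    // the 4-hour cutoff is arbitrary and must exceed the longest real run.
    const string sql = @"
DELETE FROM RunningJobs
WHERE StartedUtc < DATEADD(HOUR, -4, SYSUTCDATETIME());";

    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.ExecuteNonQuery();
    }
}

Each job could call this before looking for work, so a crashed job's claim only blocks a user temporarily.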
How can I see why the job is restarting? Once I know the reason I may be able to fix it. Any help is much appreciated.
Update: I changed my entry point and added the following lines at the top of my Main method:
// Get the shutdown file path from the environment
_shutdownFile = Environment.GetEnvironmentVariable("WEBJOBS_SHUTDOWN_FILE");
_log.Info("Watching " + _shutdownFile);

// Set up a file system watcher on that file's directory to know when the file is created
var filename = Path.GetFileName(_shutdownFile);
if (filename != null)
{
    var fileSystemWatcher = new FileSystemWatcher(filename);
    fileSystemWatcher.Created += OnAzureRestart;
    fileSystemWatcher.Changed += OnAzureRestart;
    fileSystemWatcher.NotifyFilter = NotifyFilters.CreationTime | NotifyFilters.FileName | NotifyFilters.LastWrite;
    fileSystemWatcher.IncludeSubdirectories = false;
    fileSystemWatcher.EnableRaisingEvents = true;
    _log.Info("FileSystemWatcher is set up");
}
But after publishing it to Azure, the WebJob won't start; it throws an error:
[02/08/2018 15:23:56 > a93630: ERR ] Unhandled Exception: System.ArgumentException: The directory name gugfn3vx.0gk is invalid.
[02/08/2018 15:23:56 > a93630: ERR ] at System.IO.FileSystemWatcher..ctor(String path, String filter)
[02/08/2018 15:23:56 > a93630: ERR ] at System.IO.FileSystemWatcher..ctor(String path)
[02/08/2018 15:23:56 > a93630: ERR ] at TaskRunner.Program.Main(String[] args)
I think the problem is in the line Path.GetFileName(_shutdownFile): it returns the bare file name instead of a directory, so FileSystemWatcher gets an invalid directory name, and the shutdown file doesn't exist yet while the WebJob is still running.
Any more advice?
Update 2: Somehow I had made a wrong code change. This is the working code:
// Get the shutdown file path from the environment
_shutdownFile = Environment.GetEnvironmentVariable("WEBJOBS_SHUTDOWN_FILE");
_log.Info("Watching " + _shutdownFile);

// Set up a file system watcher on that file's directory to know when the file is created
var folder = Path.GetDirectoryName(_shutdownFile);
if (folder != null)
{
    var fileSystemWatcher = new FileSystemWatcher(folder);
    fileSystemWatcher.Created += OnAzureRestart;
    fileSystemWatcher.Changed += OnAzureRestart;
    fileSystemWatcher.NotifyFilter = NotifyFilters.CreationTime | NotifyFilters.FileName | NotifyFilters.LastWrite;
    fileSystemWatcher.IncludeSubdirectories = false;
    fileSystemWatcher.EnableRaisingEvents = true;
    _log.Info("FileSystemWatcher is set up");
}
The change is in the line var folder = Path.GetDirectoryName(_shutdownFile); which watches the shutdown file's parent directory instead of treating the file name as a directory.
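OnAzureRestart itself isn't shown above; here is a sketch of what it could do, namely release the claim in the running-jobs table before Azure kills the process (the _cts cancellation source and the ReleaseRunningJobClaim helper are hypothetical names, not part of my real code):

private static void OnAzureRestart(object sender, FileSystemEventArgs e)
{
    // The watcher raises events for every file in the folder,
    // so react only to the shutdown file itself.
    if (!string.Equals(e.FullPath, _shutdownFile, StringComparison.OrdinalIgnoreCase))
        return;

    _log.Info("Shutdown file detected; releasing running-jobs claim");
    _cts.Cancel();              // hypothetical CancellationTokenSource observed by the processing loop
    ReleaseRunningJobClaim();   // hypothetical helper that deletes this job's claim row
}

That way a restart at least doesn't leave a stale claim behind.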