So I have a game where the main thread does the usual update / render logic you'd expect, and a second thread that does some very intense processing. The problem I'm having is that occasionally the main thread will be interrupted and the game will drop below 60FPS. I'm fairly sure that it's being blocked by the other thread, but since there's no explicit locking, I have no way of proving it.
There are a few scenarios I can think up of as to why the the main thread would be blocked by the secondary thread:
- The second thread allocates a lot of small objects; the memory allocation forces one thread to wait while the other allocates memory. This seems unlikely as you'd expect that after allocating one small object, then the main thread could get on with allocating what it needs to.
- Some form of JIT optimization that prevents the secondary thread from being interrupted when it's taking too long. This makes no sense at all.
- Some sort of cross-thread reference that's being locked. Unlikely as the code is intentionally separated by a queue where the secondary thread picks up items off the queue, but does not lock and prevent items being placed on the queue.
- Bad thread prioritization by the OS, also very unlikely since this issue occurs on both Linux and Windows.
I've tried putting in stopwatches and measuring what areas of code take time, but this doesn't really tell me much beyond "the main thread was stopped for 500ms at random"; it doesn't actually tell me if there's locks blocking the main thread for an extended period of time.
Are there any techniques I can use to narrow down the cause of this issue?
----- EDIT -----
These are the results of running the Mono profiler and reporting on lock contentions:
Monitor lock summary
Lock object 0x7f05190c9fe0: 1 contentions
0.002126 secs total wait time, 0.002126 max, 0.002126 average
1 contentions from:
(wrapper runtime-invoke) object:runtime_invoke_void__this__ (object,intptr,intptr,intptr)
System.Threading.Thread:StartInternal ()
System.Threading.Timer/Scheduler:SchedulerThread ()
(wrapper unknown) System.Threading.Monitor:FastMonitorEnterV4 (object,bool&)
System.Threading.Monitor:Enter (object,bool&)
System.Threading.Monitor:TryEnter (object,int,bool&)
(wrapper managed-to-native) System.Threading.Monitor:try_enter_with_atomic_var (object,int,bool&)
Lock object 0x7f051910b100: 1 contentions
0.000628 secs total wait time, 0.000628 max, 0.000628 average
1 contentions from:
Ninject.Components.ComponentContainer:Get (System.Type)
Ninject.Components.ComponentContainer:ResolveInstance (System.Type,System.Type)
Ninject.Components.ComponentContainer:CreateNewInstance (System.Type,System.Type)
System.Reflection.ConstructorInfo:Invoke (object[])
System.Reflection.MonoCMethod:Invoke (System.Reflection.BindingFlags,System.Reflection.Binder,object[],System.Globalization.CultureInfo)
System.Reflection.MonoCMethod:DoInvoke (object,System.Reflection.BindingFlags,System.Reflection.Binder,object[],System.Globalization.CultureInfo)
System.Reflection.MonoCMethod:InternalInvoke (object,object[])
(wrapper managed-to-native) System.Reflection.MonoCMethod:InternalInvoke (System.Reflection.MonoCMethod,object,object[],System.Exception&)
(wrapper runtime-invoke) <Module>:runtime_invoke_void__this___object (object,intptr,intptr,intptr)
Ninject.Activation.Caching.ActivationCache:.ctor (Ninject.Activation.Caching.ICachePruner)
Ninject.Activation.Caching.GarbageCollectionCachePruner:Start (Ninject.Activation.Caching.IPruneable)
(wrapper remoting-invoke-with-check) System.Threading.Timer:.ctor (System.Threading.TimerCallback,object,int,int)
System.Threading.Timer:.ctor (System.Threading.TimerCallback,object,int,int)
System.Threading.Timer:Init (System.Threading.TimerCallback,object,long,long)
System.Threading.Timer:Change (long,long,bool)
System.Threading.Timer/Scheduler:Change (System.Threading.Timer,long)
(wrapper unknown) System.Threading.Monitor:FastMonitorEnterV4 (object,bool&)
System.Threading.Monitor:Enter (object,bool&)
System.Threading.Monitor:TryEnter (object,int,bool&)
(wrapper managed-to-native) System.Threading.Monitor:try_enter_with_atomic_var (object,int,bool&)
Lock object 0x7f05190ca000: 1 contentions
0.000347 secs total wait time, 0.000347 max, 0.000347 average
1 contentions from:
(wrapper runtime-invoke) object:runtime_invoke_void__this__ (object,intptr,intptr,intptr)
System.Threading.Thread:StartInternal ()
System.Threading.Timer/Scheduler:SchedulerThread ()
(wrapper remoting-invoke-with-check) System.Threading.EventWaitHandle:Reset ()
System.Threading.EventWaitHandle:Reset ()
(wrapper unknown) System.Threading.Monitor:FastMonitorEnterV4 (object,bool&)
System.Threading.Monitor:Enter (object,bool&)
System.Threading.Monitor:TryEnter (object,int,bool&)
(wrapper managed-to-native) System.Threading.Monitor:try_enter_with_atomic_var (object,int,bool&)
Lock contentions: 3
Lock acquired: 3
Lock failures: 0
This is from running the game for around 20-30 seconds, under which I observed at least 10 lag spikes. There's only 3 lock contentions in that time, and all of them take less than 16ms to resolve.