Background
I have a unit test framework that creates entities for my unit tests, preforms the test, then automagically deletes the entities. It had been working fine except that some entities take 15 - 30 seconds to delete in our dev environment.
I recently received a VM setup in the Amazon Cloud to perform some long term changes requiring a couple release cycles to complete. When I run a unit test on VM, I'm continually getting SQL Timeout Errors attempting to delete the entity.
Steps
I've gone down this set of discovery / action steps:
- Turned on tracing, saw that timeout was occurring on
fn_CollectForCascadeWrapperwhich is used to handle cascading deletes. My unit test only has 6 entities in it, and they are deleted in such a way that no cascading deletes are needed. Ran Estimated Execution Plan on it and added some of the indexes it requested. This still didn't fix the timeout issue. - Turned on the Resource Manager on the VM to look at Disk Access / Memory / CPU. When I attempt a delete, the CPU hits 20% for 2 seconds, then drops down to near 0. Memory is unchanged, but Disk Read Access on the Resource Manager Goes crazy high, and stays that way for 7-10 minutes.
- Hard Coded the
fn_CollectForCascadeWrapperto return a result meaning nothing is required to be cascaded for the 6 entities in my unit test. Ran the unit test and again got the SQL Timeout Error. According to the Tracing, the actual delete statement was timing out:
delete from [New_inquiryExtensionBase] where ([New_inquiryId] = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd');
delete from [New_inquiryBase]
OUTPUT DELETED.[New_inquiryId], 10012
into SubscriptionTrackingDeletedObject (ObjectId, ObjectTypeCode)
where ([New_inquiryId] = '7e250a5f-890e-40ae-9d2d-c55bbd7250cd')
- Ran the query manually in SQL Management Studio. Took around 3 minutes to complete. No Triggers on the tables, so I thought the time must be due to the insert. Looked at the
SubscriptionTrackingDeletedObjecttable, and noticed it had 2100 records in it. Deleted all records in the table, and reran my unit test. It actually worked in the normal 15-30 second time frame for deletes. - Researched and discovered what the
SubscriptionTrackingDeletedObjectis used for, and that the Async Service cleans it up. Noticed that the Async Service was not running on the server. Turned the service on, waited 10 minutes and queried the table again. My 6 entities were still listed there. Looked in trace log and saw timeout errors:Error cleaning up Principal Object Access Table - Researched POA and performed a
SELECT COUNT(*)on the table and 7 minutes later it returned 261 million records! Researched how to cleanup the table and the only thing I found was for Role Up 6 (we're currently on 11).
What Next?
Could the POA be affecting the Delete? Or is it just the POA that is affecting the Async Service that is affecting the delete? Could inserting into the SubscriptionTrackingDeletedObject really be causing my problem?

