397
votes

I was having a discussion with a teammate about locking in .NET. He's a really bright guy with an extensive background in both lower-level and higher-level programming, but his experience with lower level programming far exceeds mine. Anyway, He argued that .NET locking should be avoided on critical systems expected to be under heavy-load if at all possible in order to avoid the admittedly small possibility of a "zombie thread" crashing a system. I routinely use locking and I didn't know what a "zombie thread" was, so I asked. The impression I got from his explanation is that a zombie thread is a thread that has terminated but somehow still holds onto some resources. An example he gave of how a zombie thread could break a system was a thread begins some procedure after locking on some object, and then is at some point terminated before the lock can be released. This situation has the potential to crash the system, because eventually, attempts to execute that method will result in the threads all waiting for access to an object that will never be returned, because the thread that is using the locked object is dead.

I think I got the gist of this, but if I'm off base, please let me know. The concept made sense to me. I wasn't completely convinced that this was a real scenario that could happen in .NET. I've never previously heard of "zombies", but I do recognize that programmers who have worked in depth at lower levels tend to have a deeper understanding of computing fundamentals (like threading). I definitely do see the value in locking, however, and I have seen many world class programmers leverage locking. I also have limited ability to evaluate this for myself because I know that the lock(obj) statement is really just syntactic sugar for:

bool lockWasTaken = false;
var temp = obj;
try { Monitor.Enter(temp, ref lockWasTaken); { body } }
finally { if (lockWasTaken) Monitor.Exit(temp); }

and because Monitor.Enter and Monitor.Exit are marked extern. It seems conceivable that .NET does some kind of processing that protects threads from exposure to system components that could have this kind of impact, but that is purely speculative and probably just based on the fact that I've never heard of "zombie threads" before. So, I'm hoping I can get some feedback on this here:

  1. Is there a clearer definition of a "zombie thread" than what I've explained here?
  2. Can zombie threads occur on .NET? (Why/Why not?)
  3. If applicable, How could I force the creation of a zombie thread in .NET?
  4. If applicable, How can I leverage locking without risking a zombie thread scenario in .NET?

Update

I asked this question a little over two years ago. Today this happened:

Object is in a zombie state.

7
are you sure your co-mate does not talk about deadlocking??user57508
@AndreasNiedermair - I know what deadlocking is and it was clearly not a matter of misuse of that terminology. Deadlocking was mentioned in the conversation and was clearly distinct from a "zombie thread". To me, the main distinction is that a dead lock has a two-way un-resolvable dependency whereas zombie thread is one-way, and requires a terminated process. If you disagree and think there's a better way to look at these things, please explainsmartcaveman
I think the term "zombie" actually comes from a UNIX backgound, as in "zombie process", right??? There is a clear definition of a "zombie process" in UNIX: it describes a child process that has terminated but where the parent of the child process still needs to "release" the child process (and it's resources) by calling wait or waitpid. The child process is then called a "zombie process". See also howtogeek.com/119815hogliux
If part of your program crashes, leaving the program in an undefined state, then of course that can cause problems with the rest of your program. The same thing can happen if you improperly handle exceptions in a single-threaded program. The problem isn't with threads, the problem is that you have global mutable state and that you're not properly handling unexpected thread termination. Your "really bright" coworker is totally off base on this one.Jim Mischel
"Ever since the first computers, there have always been ghosts in the machine. Random segments of code that have grouped together to form unexpected protocols..."Chris Laplante

7 Answers

245
votes
  • Is there a clearer definition of a "zombie thread" than what I've explained here?

Seems like a pretty good explanation to me - a thread that has terminated (and can therefore no longer release any resources), but whose resources (e.g. handles) are still around and (potentially) causing problems.

  • Can zombie threads occur on .NET? (Why/Why not?)
  • If applicable, How could I force the creation of a zombie thread in .NET?

They sure do, look, I made one!

[DllImport("kernel32.dll")]
private static extern void ExitThread(uint dwExitCode);

static void Main(string[] args)
{
    new Thread(Target).Start();
    Console.ReadLine();
}

private static void Target()
{
    using (var file = File.Open("test.txt", FileMode.OpenOrCreate))
    {
        ExitThread(0);
    }
}

This program starts a thread Target which opens a file and then immediately kills itself using ExitThread. The resulting zombie thread will never release the handle to the "test.txt" file and so the file will remain open until the program terminates (you can check with process explorer or similar). The handle to "test.txt" won't be released until GC.Collect is called - it turns out it is even more difficult than I thought to create a zombie thread that leaks handles)

  • If applicable, How can I leverage locking without risking a zombie thread scenario in .NET?

Don't do what I just did!

As long as your code cleans up after itself correctly (use Safe Handles or equivalent classes if working with unmanaged resources), and as long as you don't go out of your way to kill threads in weird and wonderful ways (safest way is just to never kill threads - let them terminate themselves normally, or through exceptions if necessary), the only way that you are going to have something resembling a zombie thread is if something has gone very wrong (e.g. something goes wrong in the CLR).

In fact its actually surprisingly difficult to create a zombie thread (I had to P/Invoke into a function that esentially tells you in the documentation not to call it outside of C). For example the following (awful) code actually doesn't create a zombie thread.

static void Main(string[] args)
{
    var thread = new Thread(Target);
    thread.Start();
    // Ugh, never call Abort...
    thread.Abort();
    Console.ReadLine();
}

private static void Target()
{
    // Ouch, open file which isn't closed...
    var file = File.Open("test.txt", FileMode.OpenOrCreate);
    while (true)
    {
        Thread.Sleep(1);
    }
    GC.KeepAlive(file);
}

Despite making some pretty awful mistakes, the handle to "test.txt" is still closed as soon as Abort is called (as part of the finalizer for file which under the covers uses SafeFileHandle to wrap its file handle)

The locking example in C.Evenhuis answer is probably the easiest way to fail to release a resource (a lock in this case) when a thread is terminated in a non-weird way, but thats easily fixed by either using a lock statement instead, or putting the release in a finally block.

See also

47
votes

I've cleaned up my answer a bit, but left the original one below for reference

It’s the first time I've heard of the term zombies so I'll assume its definition is:

A thread that has terminated without releasing all of its resources

So given that definition, then yes, you can do that in .NET, as with other languages (C/C++, java).

However, I do not think this as a good reason not to write threaded, mission critical code in .NET. There may be other reasons to decide against .NET but writing off .NET just because you can have zombie threads somehow doesn't make sense to me. Zombie threads are possible in C/C++ (I'd even argue that it’s a lot easier to mess up in C) and a lot of critical, threaded apps are in C/C++ (high volume trading, databases etc).

Conclusion If you are in the process of deciding on a language to use, then I suggest you take the big picture into consideration: performance, team skills, schedule, integration with existing apps etc. Sure, zombie threads are something that you should think about, but since it’s so difficult to actually make this mistake in .NET compared to other languages like C, I think this concern will be overshadowed by other things like the ones mentioned above. Good luck!

Original Answer Zombies can exist if you don't write proper threading code. The same is true for other languages like C/C++ and Java. But this is not a reason not to write threaded code in .NET.

And just like with any other language, know the price before using something. It also helps to know what is happening under the hood so you can foresee any potential problems.

Reliable code for mission critical systems is not easy to write, whatever language you're in. But I'm positive it’s not impossible to do correctly in .NET. Also AFAIK, .NET threading is not that different from threading in C/C++, it uses (or is built from) the same system calls except for some .net specific constructs (like the light weight versions of RWL and event classes).

first time I've heard of the term zombies but based on your description, your colleague probably meant a thread that terminated without release all resources. This could potentially cause a deadlock, memory leak or some other bad side effect. This is obviously not desirable but singling out .NET because of this possibility is probably not a good idea since it’s possible in other languages too. I'd even argue that it’s easier to mess up in C/C++ than in .NET (especially so in C where you don't have RAII) but a lot of critical apps are written in C/C++ right? So it really depends on your individual circumstances. If you want to extract every ounce of speed from your application and want to get as close to bare metal as possible, then .NET might not be the best solution. If you are on a tight budget and do a lot of interfacing with web services/existing .net libraries/etc then .NET may be a good choice.

27
votes

Right now most of my answer has been corrected by the comments below. I won't delete the answer because I need the reputation points because the information in the comments may be valuable to readers.

Immortal Blue pointed out that in .NET 2.0 and up finally blocks are immune to thread aborts. And as commented by Andreas Niedermair, this may not be an actual zombie thread, but the following example shows how aborting a thread can cause problems:

class Program
{
    static readonly object _lock = new object();

    static void Main(string[] args)
    {
        Thread thread = new Thread(new ThreadStart(Zombie));
        thread.Start();
        Thread.Sleep(500);
        thread.Abort();

        Monitor.Enter(_lock);
        Console.WriteLine("Main entered");
        Console.ReadKey();
    }

    static void Zombie()
    {
        Monitor.Enter(_lock);
        Console.WriteLine("Zombie entered");
        Thread.Sleep(1000);
        Monitor.Exit(_lock);
        Console.WriteLine("Zombie exited");
    }
}

However when using a lock() { } block, the finally would still be executed when a ThreadAbortException is fired that way.

The following information, as it turns out, is only valid for .NET 1 and .NET 1.1:

If inside the lock() { } block an other exception occurs, and the ThreadAbortException arrives exactly when the finally block is about to be ran, the lock is not released. As you mentioned, the lock() { } block is compiled as:

finally 
{
    if (lockWasTaken) 
        Monitor.Exit(temp); 
}

If another thread calls Thread.Abort() inside the generated finally block, the lock may not be released.

24
votes

This isn't about Zombie threads, but the book Effective C# has a section on implementing IDisposable, (item 17), which talks about Zombie objects which I thought you may find interesting.

I recommend reading the book itself, but the gist of it is that if you have a class either implementing IDisposable, or containing a Desctructor, the only thing you should be doing in either is releasing resources. If you do other things here, then there is a chance that the object will not be garbage collected, but will also not be accessible in any way.

It gives an example similar to below:

internal class Zombie
{
    private static readonly List<Zombie> _undead = new List<Zombie>();

    ~Zombie()
    {
        _undead.Add(this);
    }
}

When the destructor on this object is called, a reference to itself is placed on the global list, meaning it stays alive and in memory for the life of the program, but isn't accessible. This may mean that resources (particularly unmanaged resources) may not be fully released, which can cause all sorts of potential issues.

A more complete example is below. By the time the foreach loop is reached, you have 150 objects in the Undead list each containing an image, but the image has been GC'd and you get an exception if you try to use it. In this example, I am getting an ArgumentException (Parameter is not valid) when I try and do anything with the image, whether I try to save it, or even view dimensions such as height and width:

class Program
{
    static void Main(string[] args)
    {
        for (var i = 0; i < 150; i++)
        {
            CreateImage();
        }

        GC.Collect();

        //Something to do while the GC runs
        FindPrimeNumber(1000000);

        foreach (var zombie in Zombie.Undead)
        {
            //object is still accessable, image isn't
            zombie.Image.Save(@"C:\temp\x.png");
        }

        Console.ReadLine();
    }

    //Borrowed from here
    //http://stackoverflow.com/a/13001749/969613
    public static long FindPrimeNumber(int n)
    {
        int count = 0;
        long a = 2;
        while (count < n)
        {
            long b = 2;
            int prime = 1;// to check if found a prime
            while (b * b <= a)
            {
                if (a % b == 0)
                {
                    prime = 0;
                    break;
                }
                b++;
            }
            if (prime > 0)
                count++;
            a++;
        }
        return (--a);
    }

    private static void CreateImage()
    {
        var zombie = new Zombie(new Bitmap(@"C:\temp\a.png"));
        zombie.Image.Save(@"C:\temp\b.png");
    }
}

internal class Zombie
{
    public static readonly List<Zombie> Undead = new List<Zombie>();

    public Zombie(Image image)
    {
        Image = image;
    }

    public Image Image { get; private set; }

    ~Zombie()
    {
        Undead.Add(this);
    }
}

Again, I am aware you were asking about zombie threads in particular, but the question title is about zombies in .net, and I was reminded of this and thought others may find it interesting!

21
votes

On critical systems under heavy load, writing lock-free code is better primarily because of the performance improvments. Look at stuff like LMAX and how it leverages "mechanical sympathy" for great discussions of this. Worry about zombie threads though? I think that's an edge case that's just a bug to be ironed out, and not a good enough reason not to use lock.

Sounds more like your friend is just being fancy and flaunting his knowledege of obscure exotic terminology to me! In all the time I was running the performance labs at Microsoft UK, I never came across an instance of this issue in .NET.

3
votes

1.Is there a clearer definition of a "zombie thread" than what I've explained here?

I do agree that "Zombie Threads" exist, it's a term to refer to what happens with Threads that are left with resources that they don't let go of and yet don't completely die, hence the name "zombie," so your explanation of this referral is pretty right on the money!

2.Can zombie threads occur on .NET? (Why/Why not?)

Yes they can occur. It's a reference, and actually referred to by Windows as "zombie": MSDN uses the Word "Zombie" for Dead processes/threads

Happening frequently it's another story, and depends on your coding techniques and practices, as for you that like Thread Locking and have done it for a while I wouldn't even worry about that scenario happening to you.

And Yes, as @KevinPanko correctly mentioned in the comments, "Zombie Threads" do come from Unix which is why they are used in XCode-ObjectiveC and referred to as "NSZombie" and used for debugging. It behaves pretty much the same way... the only difference is an object that should've died becomes a "ZombieObject" for debugging instead of the "Zombie Thread" which might be a potential problem in your code.

0
votes

I can make zombie threads easily enough.

var zombies = new List<Thread>();
while(true)
{
    var th = new Thread(()=>{});
    th.Start();
    zombies.Add(th);
}

This leaks the thread handles (for Join()). It's just another memory leak as far as we are concerned in the managed world.

Now then, killing a thread in a way that it actually holds locks is a pain in the rear but possible. The other guy's ExitThread() does the job. As he found, the file handle got cleaned up by the gc but a lock around an object wouldn't. But why would you do that?