Dabeaz

Dave Beazley's mondo computer blog. [ homepage ]

Friday, November 20, 2009

Python Thread Deadlock Avoidance

One danger of writing programs based on threads is the potential for deadlock--a problem that almost invariably shows up if you happen to write thread code that tries to acquire more than one mutex lock at once. For example:

a_lock = threading.Lock()
b_lock = threading.Lock()

def foo():
    with a_lock:
         ...
         with b_lock:
              # Do something
              ...

t1 = threading.Thread(target=foo)
t1.start()

Code like that looks innocent enough until you realize that some other thread in the system also has a similar idea about locking--but acquires the locks in a slightly different order:

def bar():
    with b_lock:
         ...
         with a_lock:
              # Do something (maybe)
              ...

Sure, the code might be lucky enough work "most" of the time. However, you will suffer a thousand sorrows if both threads try to acquire those locks at about the same time and you have to figure out why your program is mysteriously nonresponsive.

Computer scientists love to spend time thinking about such problems--especially if it means they can make up some diabolical problem about philosophers that they can put on an operating systems exam. However, I'll spare you the details of that.

The problem of deadlock is not something that I would normally spend much time thinking about, but I recently saw some material talking about improved thread support in C++0x. For example, this article has some details. In particular, it seems that C++0x offers a new locking operation std::lock() that can acquire multiple mutex locks all at once while avoiding deadlock. For example:

std::unique_lock<std::mutex> lock_a(a.m,std::defer_lock);
std::unique_lock<std::mutex> lock_b(b.m,std::defer_lock);
std::lock(lock_a,lock_b);      // Lock both locks
...
... do something involving data protected by both locks
...

I don't actually know how C++0x implements its lock() operation, but I do know that one way to avoid deadlock is to put some kind of ordering on all of the locks in a program. If you then strictly enforce a policy that all locks have to be acquired in increasing order, you can avoid deadlock. Just as an example, if you had two locks A and B, you could assign a unique number to each lock such as A=1 and B=2. Then, in any part of the program that wanted to acquire both lock A and B, you just make a rule that A always has to be acquired first (because its number is lower). In such a scheme, the thread bar() shown earlier would simply be illegal. That lock() operation in C++ is almost certainly doing something similar to this--that is, it knows enough about the locks so that they can acquired without deadlock.

All of this got me thinking--I wonder how hard it would be to implement the lock() operation in Python? Not hard as it turns out. First step is to change the name--given that acquire() is the typical method used to acquire a lock, let's just call the operation acquire() to make it more clear. You can define acquire() as a context-manager and simply order locks according to their id() value like this:

class acquire(object):
    def __init__(self,*locks):
        self.locks = sorted(locks, key=lambda x: id(x))
    def __enter__(self):
        for lock in self.locks:
            lock.acquire()
    def __exit__(self,ty,val,tb):
        for lock in reversed(self.locks):
            lock.release()
        return False

Okay, that was easy enough to do, but does it work? Let's try it on the classic dining philosophers problem (look it up if you need a refresher):

import threading

# The philosopher thread
def philosopher(left, right):
    while True:
        with acquire(left,right):
             print threading.currentThread(), "eating"

# The chopsticks
NSTICKS = 5
chopsticks = [threading.Lock() 
              for n in xrange(NSTICKS)]

# Create all of the philosophers
phils = [threading.Thread(target=philosopher,
                          args=(chopsticks[n],chopsticks[(n+1) % NSTICKS]))
         for n in xrange(NSTICKS)]

# Run all of the philosophers
for p in phils:
    p.start()

If you try this code, you'll find that the philosophers run all day with no deadlock. Just as an experiment, you can try changing the philosopher() implementation to one that acquires the locks separately:

def philosopher(left, right):
    while True:
        with left:
             with right:
                 print threading.currentThread(), "eating"

Yep, almost instantaneously deadlock. So, as you can see, our acquire() operation seems to be working.

There's still one last aspect of this experiment that needs to be addressed. One potential problem with our acquire() operation is that it doesn't prevent a user from using it in a nested manner as before. For example, someone might write code like this:

with acquire(a_lock,b_lock):
     ...
     with acquire(c_lock, d_lock):
          ...

Catching such cases at the time of definition would be difficult (if not impossible). However, we could make the acquire() context manager keep a record of all previously acquired locks using a list placed in thread local storage. Here's a new implementation--and just for kicks, I'm going to switch it over to a context manager defined by a generator (mainly because I can and generators are cool):

import threading
from contextlib import contextmanager

local = threading.local()
@contextmanager
def acquire(*locks):
    locks = sorted(locks, key=lambda x: id(x))   
    acquired = getattr(local,"acquired",[])
    # Check to make sure we're not violating the order of locks already acquired   
    if acquired:
        if max(id(lock) for lock in acquired) >= id(locks[0]):
            raise RuntimeError("Lock Order Violation")
    acquired.extend(locks)
    local.acquired = acquired
    try:
        for lock in locks:
            lock.acquire()
        yield
    finally:
        for lock in reversed(locks):
            lock.release()
        del acquired[-len(locks):]

If you use this version, you'll find that the philosophers work just fine as before. However, now consider this slightly modified version with the nested acquires:

# The philosopher thread                                                                                             
def philosopher(left, right):
    while True:
        with acquire(left):
            with acquire(right):
                print threading.currentThread(), "eating"

Unlike the previous version that had nested with statements and deadlocked, this one runs. However, one of the philosophers crashes with a nasty traceback:

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "hier4.py", line 53, in philosopher
    with acquire(right):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/contextlib.py", line 16, in __enter__
    return self.gen.next()
  File "hier4.py", line 35, in acquire
    raise RuntimeError("Lock Order Violation")
RuntimeError: Lock Order Violation

Very good. That's exactly what we wanted.

So, what's the moral of this story. First of all, I don't think you should use this as a license to go off and write a bunch of multithreaded code that relies on nested lock acquisitions. Sure, the context manager might catch some potential problems, but it won't change the fact that you'll still want to blow your head off after debugging some other horrible problem that comes up with your overly clever and/or complicated design.

I think the main take-away is an appreciation for Python's context-manager feature. There's so much more you can do with a context manager than simply closing a file or releasing an individual lock.

Disclaimer: I didn't do a hugely exhaustive internet search to see if anyone else had implemented anything similar to this in Python. If you know of some links to related work, tell me. I'll add them here.

posted by Dave # 2:50 PM

Comments:

Cool! I think to make this practical the lock ordering must be under control of the application. It might be a partial order too!

# posted by

guido : 3:00 PM

I learned about contextmanager! Nice

# posted by

Andriy Drozdyuk : 11:33 AM

++ Dig it.

# posted by

benjaminws : 9:18 AM

Dabeaz

Friday, November 20, 2009

Python Thread Deadlock Avoidance

Archives