Python has accumulated a lot of… character over the years. We’ve got no fewer than three profiling libraries for single-threaded execution, plus a multi-threaded profiler with an incompatible interface (Yappi). Since many applications use more than one thread, this can be a bit annoying.
Yappi works most of the time, except that it can sometimes cause your application to hang for unknown reasons (I blame signals, personally). The other issue is that Yappi doesn’t have a way of collecting call-stack information. (I don’t necessarily care that memcpy takes all of the time; I want to know who called memcpy.) In particular, the lovely gprof2dot can take in pstats dumps, which do carry that caller information, and output a very nice profile graph.
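To make the "who called it" point concrete, here is a minimal sketch using only cProfile and pstats in a single thread. The functions copy_bytes, load_config and handle_request are made-up stand-ins (copy_bytes plays the role of memcpy), and the dump file name is arbitrary.

```python
import cProfile
import os
import pstats
import tempfile


def copy_bytes(data):
    # Hypothetical hot spot, standing in for memcpy.
    return bytes(data)


def load_config():
    # Hypothetical caller number one.
    return copy_bytes(b'x' * 1000)


def handle_request():
    # Hypothetical caller number two.
    return copy_bytes(b'y' * 1000)


prof = cProfile.Profile()
prof.enable()
load_config()
handle_request()
prof.disable()

stats = pstats.Stats(prof)
# List every function that called copy_bytes, with per-caller counts.
stats.print_callers('copy_bytes')

# Write a pstats dump that gprof2dot can read, for example:
#   gprof2dot -f pstats profile.pstats | dot -Tpng -o profile.png
dump_path = os.path.join(tempfile.gettempdir(), 'profile.pstats')
stats.dump_stats(dump_path)
```

The dump is the interesting part: it keeps the caller table, so anything that reads pstats files can rebuild the call graph.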
To address this for my uses, I glom together cProfile runs from multiple threads. In case it might be useful for other people, I wrote a quick gist illustrating how to do it. To make it easy to drop in, I monkey-patch the Thread.run method, but you can use a more maintainable approach if you like (I create a subclass, ProfileThread, in my applications).
from threading import Thread, Lock
import cProfile
import pstats


def enable_thread_profiling():
    '''Monkey-patch Thread.run to enable global profiling.

    Each thread creates a local profiler; statistics are pooled
    to the global stats object on run completion.'''
    Thread.stats = None
    stats_lock = Lock()  # threads may finish at the same time
    thread_run = Thread.run

    def profile_run(self):
        self._prof = cProfile.Profile()
        self._prof.enable()
        try:
            thread_run(self)
        finally:
            # Pool the results even if the thread's target raised.
            self._prof.disable()
            with stats_lock:
                if Thread.stats is None:
                    Thread.stats = pstats.Stats(self._prof)
                else:
                    Thread.stats.add(self._prof)

    Thread.run = profile_run


def get_thread_stats():
    '''Return the pooled stats, or raise if there is nothing to report.'''
    stats = getattr(Thread, 'stats', None)
    if stats is None:
        raise ValueError('Thread profiling was not enabled, '
                         'or no threads finished running.')
    return stats


if __name__ == '__main__':
    enable_thread_profiling()
    import time
    t = Thread(target=time.sleep, args=(1,))
    t.start()
    t.join()
    get_thread_stats().print_stats()
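For the subclass route mentioned earlier, something along these lines should work. This is a sketch of the idea rather than the ProfileThread from my applications: the class attributes stats and _stats_lock and the busy_work function are names invented for the example.

```python
import cProfile
import pstats
import threading


class ProfileThread(threading.Thread):
    '''Thread that profiles its own run() and pools the results.'''

    stats = None                    # shared pstats.Stats, created lazily
    _stats_lock = threading.Lock()  # guards updates to `stats`

    def run(self):
        prof = cProfile.Profile()
        prof.enable()
        try:
            super().run()
        finally:
            # Pool the results even if the target raised.
            prof.disable()
            cls = ProfileThread
            with cls._stats_lock:
                if cls.stats is None:
                    cls.stats = pstats.Stats(prof)
                else:
                    cls.stats.add(prof)


def busy_work(n):
    # Hypothetical workload so the profile has something to show.
    return sum(i * i for i in range(n))


if __name__ == '__main__':
    for _ in range(2):
        t = ProfileThread(target=busy_work, args=(10000,))
        t.start()
        t.join()
    ProfileThread.stats.sort_stats('cumulative').print_stats(5)
```

Nothing global gets patched this way, so only the threads you explicitly create as ProfileThread are profiled, and threads started by third-party libraries are left alone.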