lsprof meets KCachegrind
You can now visualize the output of the lsprof Python profiler using KCachegrind, an excellent visualization tool for profiling data.
One useful feature of the hotshot Python profiler is the hotshot2calltree conversion filter, which produces output suitable for KCachegrind. So after the s/hotshot/lsprof thread occurred on the python-devel mailing list, and Bazaar-NG added support for lsprof, I was itching to use KCachegrind with lsprof data.
In good libre software style, I eventually scratched my itch and wrote a patch to add calltree support to lsprof. The lsprof maintainers are welcome to apply it.
With lsprof and KCachegrind I was able to quickly identify a performance bug in Bazaar-NG that should be easy to fix. The data generated by hotshot completely misses that issue, at least when displayed by KCachegrind. I am not sure why.
One thing I know for sure is that hotshot2calltree has a serious bug. It causes KCachegrind to confuse multiple functions that have the same name and come from the same Python file. That completely messes up the call graph and greatly reduces the usefulness of the data.
Specifically, the Bazaar-NG code uses two decorators, needs_read_lock and needs_write_lock, defined in the same module, which return closures of local functions called decorated. Each decorator uses a different decorated function, but hotshot2calltree produces data that causes KCachegrind to see only one decorated function. In the example I'm looking at, it's the one in the needs_write_lock decorator.
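To make the clash concrete, here is a simplified sketch of the pattern, not the actual Bazaar-NG source. The decorator names come from the code discussed above; the Branch class, its methods and the log attribute are made up for illustration, and the real decorators take and release locks rather than appending to a list.

```python
# Simplified sketch: two decorators in one module, each returning a
# closure over a local function that happens to be called "decorated".

def needs_read_lock(unbound):
    def decorated(self, *args, **kwargs):
        self.log.append("read")   # stand-in for taking a read lock
        return unbound(self, *args, **kwargs)
    return decorated


def needs_write_lock(unbound):
    def decorated(self, *args, **kwargs):
        self.log.append("write")  # stand-in for taking a write lock
        return unbound(self, *args, **kwargs)
    return decorated


class Branch:  # hypothetical class, for illustration only
    def __init__(self):
        self.log = []

    @needs_read_lock
    def revision_history(self):
        return []

    @needs_write_lock
    def append_revision(self):
        return None


# The two wrappers are distinct code objects...
read_code = Branch.revision_history.__code__
write_code = Branch.append_revision.__code__

# ...but a tool that identifies functions by (file, name) cannot tell them apart.
same_name = read_code.co_name == write_code.co_name
same_file = read_code.co_filename == write_code.co_filename
different_lines = read_code.co_firstlineno != write_code.co_firstlineno
```

Only the first line number separates the two code objects, which is exactly the piece of information that a name-and-file key throws away.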
KCachegrind seems to make C/C++ assumptions, and seems not to expect two different functions with the same name in the same file. My lsprof patch works around the issue by including the Python module name and line number in each function name in the calltree output. Incidentally, that also makes for more informative names.
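The workaround can be sketched roughly as follows. This is not the patch itself: the calltree_label helper and the exact label layout are my own illustration, and I am assuming the profiler hands over either a code object or, for built-ins, a plain string.

```python
import os


def calltree_label(code):
    """Build a function label that stays unique within a file.

    Hypothetical helper: the label layout is an assumption, not the
    format produced by the actual patch.
    """
    if isinstance(code, str):
        # Assumed: built-in functions are reported as plain strings.
        return code
    # Derive a short module name from the file the code was compiled from.
    module = os.path.splitext(os.path.basename(code.co_filename))[0]
    return "%s %s:%d" % (code.co_name, module, code.co_firstlineno)


# Two different local functions sharing the name "decorated".
def make_a():
    def decorated():
        return "a"
    return decorated


def make_b():
    def decorated():
        return "b"
    return decorated


label_a = calltree_label(make_a().__code__)
label_b = calltree_label(make_b().__code__)
```

Since the two labels differ in their line numbers, a viewer that keys on the function name alone still sees two separate functions.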
27 Jan. 2006
