Tracing Memory Leaks In Python
Recently, I ran into a hard-to-pin-down issue: a memory leak. Memory usage would increase gradually until the load on the server became so high that it stopped responding. Restarting the server got us out of the pinch, but it was not really a solution; it just kicked the can down the road. Until the issue is found and fixed, it’ll keep coming back again and again.
An added challenge for me was that I had to do this for a process that was already running.
Luckily for me, I found a post on Stack Overflow with an answer that looked exactly like what I was looking for so I gave it a shot. It worked out very well. Now I’m armed with a tool ready for the next memory leak that comes my way.
There are 2 modules that we will utilize for this analysis:
tracemalloc
- Available as a built-in module starting in Python 3.4
- A debug tool to trace memory blocks allocated by Python
pyrasite
- A library and a set of tools for injecting code into running Python programs
- Specifically, we will run the Top 10 snippet in a pyrasite-shell
I did the following on an Ubuntu 18.04 LTS server running Python 3.6.
pyrasite
Installing pyrasite and getting it to run turned out to be a bit tricky, so I’m documenting those steps here.
Install pyrasite by running the following:
pip install pyrasite
On Ubuntu, since 10.10, the scope of ptrace is restricted for security reasons. So if you want to trace the memory allocation of any Python process running on your system, you will need to perform the following additional step as root:
echo 0 > /proc/sys/kernel/yama/ptrace_scope
To make this change permanent, set kernel.yama.ptrace_scope to 0 in /etc/sysctl.d/10-ptrace.conf. Here is what the values 0 and 1 actually do:
- 0 - Standard scope: PTRACE can be used by any process on any other process that shares the same uid and is dumpable (i.e. hasn’t been setuid, or has prctl(PR_SET_DUMPABLE) called on it). CAP_SYS_PTRACE overrides any limitations.
- 1 - Child-only: PTRACE can be used by any process on any child process that shares the same uid and is dumpable. CAP_SYS_PTRACE overrides any limitations.
In my case, I was only looking to troubleshoot child processes, so I was OK with leaving this setting unchanged.
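Before changing anything, it can help to check what the setting currently is. The sketch below is only an illustration and not part of the original steps: read_ptrace_scope and describe_scope are hypothetical helper names, and only the two values described above (0 and 1) get a label.

```python
from pathlib import Path

# Labels for the two ptrace_scope values described above.
SCOPE_MEANINGS = {
    0: "standard scope (any same-uid, dumpable process can be traced)",
    1: "child-only (only child processes can be traced)",
}


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return the current ptrace_scope as an int, or None if the file is absent."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def describe_scope(value):
    """Turn a ptrace_scope value into a short human-readable description."""
    if value is None:
        return "ptrace_scope is not available on this system"
    meaning = SCOPE_MEANINGS.get(value, "another restriction level (not covered here)")
    return f"{value}: {meaning}"


if __name__ == "__main__":
    print(describe_scope(read_ptrace_scope()))
```

Reading the file does not require root; only writing to it does.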
Now that pyrasite is installed, I attempt to run pyrasite-shell. You need to provide the pid of the process you want to connect to. For the sake of this post, let’s assume that pid to be 476. I issue the following command:
pyrasite-shell 476
I wait for a good while and nothing happens. It just hangs. So I press Ctrl + C and I get this stack trace:
  File "/usr/local/bin/pyrasite-shell", line 11, in <module>
    sys.exit(shell())
  File "/usr/local/lib/python2.7/dist-packages/pyrasite/tools/shell.py", line 30, in shell
    ipc.connect()
  File "/usr/local/lib/python2.7/dist-packages/pyrasite/ipc.py", line 95, in connect
    self.wait()
  File "/usr/local/lib/python2.7/dist-packages/pyrasite/ipc.py", line 151, in wait
    (clientsocket, address) = self.server_sock.accept()
  File "/usr/lib/python2.7/socket.py", line 202, in accept
    sock, addr = self._sock.accept()
KeyboardInterrupt
A quick search brings me to this GitHub issue, where one answer says to specify verbose=True as the third parameter in this method call.
Running pyrasite-shell again, this time it showed me what the problem was:
pyrasite-shell 476
/bin/sh: 1: gdb: not found
I install gdb on Ubuntu by running the following command:
apt update && apt install gdb
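Since pyrasite-shell only reports the missing gdb when verbose output is turned on, a quick check up front can save a round of debugging. This is a minimal sketch using the standard library’s shutil.which; missing_tools is a hypothetical helper name, and the assumption that gdb is the only external tool needed comes solely from the error shown above.

```python
import shutil


def missing_tools(tools=("gdb",)):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install before running pyrasite-shell:", ", ".join(missing))
    else:
        print("gdb found on PATH")
```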
I ran pyrasite-shell for the third time and the program hung again. I force quit and saw the following error (I still had verbose=True from the earlier change):
'PyGILState_Ensure' has unknown return type; cast the call to its declared return type
'PyRun_SimpleString' has unknown return type; cast the call to its declared return type
History has not yet reached $1.
^CTraceback (most recent call last):
  File "/usr/bin/pyrasite-shell", line 11, in <module>
    load_entry_point('pyrasite==2.0', 'console_scripts', 'pyrasite-shell')()
  File "/usr/lib/python2.7/site-packages/pyrasite/tools/shell.py", line 30, in shell
    ipc.connect()
  File "/usr/lib/python2.7/site-packages/pyrasite/ipc.py", line 95, in connect
    self.wait()
  File "/usr/lib/python2.7/site-packages/pyrasite/ipc.py", line 151, in wait
    (clientsocket, address) = self.server_sock.accept()
  File "/usr/lib/python2.7/socket.py", line 206, in accept
    sock, addr = self._sock.accept()
KeyboardInterrupt
Yet another GitHub issue describes this exact problem. Luckily, another user has solved it. Unfortunately, the change has not yet been merged into pyrasite. So I install the patched version of pyrasite provided by the user who fixed the issue by running the following command:
pip install git+https://github.com/diamond-lizard/pyrasite.git
Now when I run pyrasite-shell again, I get the shell.
pyrasite-shell 476
Pyrasite Shell 2.0
Connected to '/your/python/process'
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(DistantInteractiveConsole)
>>>
Huzzah!
tracemalloc
Top 10
Now run the following snippet in the shell to trace the memory allocation.
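import tracemalloc

# Start tracing; only allocations made after this call are recorded.
tracemalloc.start()

# ... run your application ...

# Capture the traced allocations and group them by source line.
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

print("[ Top 10 ]")
for stat in top_stats[:10]:
    print(stat)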
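A single snapshot shows where memory is held right now, but a leak shows up more clearly as growth between two snapshots. The following is a minimal sketch of that idea using Snapshot.compare_to from the tracemalloc module. The top_changes helper and the leaked list are made up for illustration; in a real session you would take the second snapshot after letting the process run for a while.

```python
import tracemalloc


def top_changes(before, after, limit=10):
    """Return up to `limit` source lines whose allocated size changed the most."""
    # compare_to returns StatisticDiff objects, largest absolute change first.
    return after.compare_to(before, "lineno")[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for "let the application run": allocate about 1 MB and keep it alive.
leaked = [bytes(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()

print("[ Top 10 differences ]")
for diff in top_changes(before, after):
    print(diff)
```

Lines with a large positive size difference that keeps growing across repeated comparisons are the first places to look.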
That’s it.
Happy tracing!