Debugging memory

This chapter is a quick introduction to memory debugging for PHP source code. It is not a full course: memory debugging is not hard, but it takes experience, which you only get by practicing a lot, something you will probably have to do anyway when writing any C code. We introduce here a very well-known memory debugger, valgrind, and show how to use it with PHP to debug memory issues.

A quick note about valgrind

Valgrind is a well-known tool used on many Unix environments to debug a lot of common memory problems in any software written in C or C++. Valgrind is a front end to several tools. The most used one is called “memcheck”. It works by replacing every libc heap allocation function with its own, and by tracking what you do with the allocated blocks. You may also be interested in “massif”: it is a heap profiler that helps to understand the overall heap memory usage of a program.
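To make the “replace and track” idea concrete, here is a toy sketch in plain C. It is not how memcheck is implemented (memcheck instruments the running binary and does far more than this); it only wraps malloc() and free() in hypothetical tracked_malloc() and tracked_free() functions that keep a count of live bytes, which is the kind of bookkeeping a leak report is built on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of each block; the union keeps user data aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} track_header;

static size_t live_bytes  = 0;  /* bytes allocated and not yet freed */
static size_t live_blocks = 0;  /* blocks allocated and not yet freed */

/* Hypothetical wrapper: allocate and remember how much was requested. */
void *tracked_malloc(size_t size)
{
    track_header *header = malloc(sizeof(track_header) + size);

    if (header == NULL) {
        return NULL;
    }
    header->size = size;
    live_bytes += size;
    live_blocks++;
    return header + 1;          /* user data starts right after the header */
}

/* Hypothetical wrapper: release a block and forget about it. */
void tracked_free(void *ptr)
{
    track_header *header;

    if (ptr == NULL) {
        return;
    }
    assert(live_blocks > 0);
    header = (track_header *)ptr - 1;
    live_bytes -= header->size;
    live_blocks--;
    free(header);
}

/* Anything still counted here at exit would be reported as a leak. */
size_t tracked_live_bytes(void)
{
    return live_bytes;
}
```

If tracked_live_bytes() is not zero when the program ends, some block was never released. A real tool additionally records a stack trace per block, which is what lets it tell you where the lost block came from.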

Note

You should read the Valgrind documentation to go further. It is well written, with tiny representative examples.

For the memory allocation replacement to take place, you need to run the program you want to analyze (PHP here) through valgrind; that is, the binary you launch is valgrind, and it runs PHP for you.

As valgrind replaces and tracks all libc heap allocations, it tends to slow down debugged programs a lot. You will notice it with PHP as well. Although the slowdown is not that dramatic with PHP, it can still be clearly felt; don’t worry if you notice it, this is normal.

Valgrind is not the only tool you may use, but it is the most common one. Dr. Memory, LeakSanitizer, Electric Fence and AddressSanitizer are other common tools.

Before starting

Here are the steps to follow for a good memory debugging experience. They raise your chances of finding flaws and reduce debugging time:

  • You should always use a debug build of PHP. It is pointless to try to debug memory on a production build.

  • You should always start the debugger with USE_ZEND_ALLOC=0 in the environment. You may have learnt in the Zend Memory Manager chapter that this environment variable disables ZendMM for the current process. It is highly recommended to do so when launching a memory debugger. Fully bypassing ZendMM helps a lot in understanding the traces generated by valgrind.

  • It is also highly recommended to start the memory debugger with ZEND_DONT_UNLOAD_MODULES=1 in the environment. That prevents PHP from unloading the extensions’ .so files at the end of the process. This gives better valgrind report traces: if PHP had unloaded the extensions by the time valgrind displays its errors, the traces would be incomplete, as the file from which to grab the information would not be part of the process memory image anymore.

  • You may need some suppressions. As you tell PHP not to unload its extensions at the end of the process, you may get false positives in valgrind’s output. PHP extensions are checked for leaks; if you get false positives on your platform, you can silence them using a suppression like this one. Feel free to write your own file based on such an example.

  • Valgrind is clearly a better tool than Zend Memory Manager to find leaks and other memory-related issues. You should always run valgrind on your code; it is really a must-do step for every C programmer. You run it either because you get a crash and want to find and debug it, or as a quality tool: even when nothing bad shows on the surface, valgrind points at hidden flaws that are ready to blow up in your face sooner or later. Use it, even if you think everything is all right with your code: you could be surprised.
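To illustrate what an environment switch such as USE_ZEND_ALLOC=0 boils down to, here is a minimal sketch in plain C. It is only loosely inspired by the idea of reading the variable once at startup; the function names are hypothetical and this is not PHP’s actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: decide whether the custom allocator stays enabled.
 * `value` is what getenv() returned for the switch, or NULL if it is unset. */
int custom_allocator_enabled(const char *value)
{
    if (value == NULL) {
        return 1;                        /* unset: keep the custom allocator */
    }
    return strtol(value, NULL, 10) != 0; /* "0" turns it off */
}

/* Hypothetical startup hook: read the switch from the environment. */
int custom_allocator_enabled_from_env(const char *name)
{
    assert(name != NULL);
    return custom_allocator_enabled(getenv(name));
}
```

A program would call custom_allocator_enabled_from_env("USE_ZEND_ALLOC") once at startup and route every allocation either to its own pool or straight to malloc(). With the pool out of the way, each allocation becomes a libc call that valgrind can see individually.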

Warning

You must use valgrind (or any memory debugger) on your program. It is impossible to feel 100% confident in any serious C program without debugging its memory. Memory bugs lead to harmful security issues and program crashes, often random ones, depending on many parameters.

Memory leak detection example

Starter

Valgrind is a full heap memory debugger. It can also debug process memory maps and function stacks. Please, get more information in its documentation.

Let’s detect a dynamic memory leak. We start with an easy one, the most common kind you’ll meet:

PHP_RINIT_FUNCTION(pib)
{
    void *foo = emalloc(128); /* never released with efree(): this is the leak */

    return SUCCESS;
}

The code above leaks 128 bytes at each request, because there is no matching efree() call for that buffer. As it is a call to emalloc(), it goes through Zend Memory Manager, and the latter will warn us about this leak like we saw in the ZendMM chapter. Let’s see if valgrind notices the leak as well:

> ZEND_DONT_UNLOAD_MODULES=1 USE_ZEND_ALLOC=0 valgrind --leak-check=full --suppressions=/path/to/suppression \
    --show-reachable=yes --track-origins=yes ~/myphp/bin/php -dextension=pib.so /tmp/foo.php

We launch a PHP-CLI process using valgrind. We assume an extension named “pib” here. Here is the output:

==28104== 128 bytes in 1 blocks are definitely lost in loss record 1 of 1
==28104==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28104==    by 0xA3701E: __zend_malloc (zend_alloc.c:2820)
==28104==    by 0xA362E7: _emalloc (zend_alloc.c:2413)
==28104==    by 0xE896F99: zm_activate_pib (pib.c:1880)
==28104==    by 0xA79F1B: zend_activate_modules (zend_API.c:2537)
==28104==    by 0x9D31D3: php_request_startup (main.c:1673)
==28104==    by 0xB5909A: do_cli (php_cli.c:964)
==28104==    by 0xB5A423: main (php_cli.c:1381)

==28104== LEAK SUMMARY:
==28104==    definitely lost: 128 bytes in 1 blocks
==28104==    indirectly lost: 0 bytes in 0 blocks
==28104==    possibly lost: 0 bytes in 0 blocks
==28104==    still reachable: 0 bytes in 0 blocks
==28104==    suppressed: 7,883 bytes in 40 blocks

At our level, “definitely lost” is what we must look at.

Note

For details about the different fields output by memcheck, please have a look at its documentation.

Note

We used USE_ZEND_ALLOC=0 to disable and fully bypass Zend Memory Manager. Every call to its API (e.g. emalloc()) leads directly to a libc call, as we can see in the stack frames of the valgrind output.

Valgrind caught our leak.

Easy enough. Now let’s generate a leak using a persistent allocation, that is, a dynamic memory allocation that bypasses ZendMM and uses the traditional libc. Go:

PHP_RINIT_FUNCTION(pib)
{
    void *foo = malloc(128); /* never released with free(): this is the leak */

    return SUCCESS;
}

Here is the report:

==28758== 128 bytes in 1 blocks are definitely lost in loss record 1 of 1
==28758==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28758==    by 0xE896F82: zm_activate_pib (pib.c:1880)
==28758==    by 0xA79F1B: zend_activate_modules (zend_API.c:2537)
==28758==    by 0x9D31D3: php_request_startup (main.c:1673)
==28758==    by 0xB5909A: do_cli (php_cli.c:964)
==28758==    by 0xB5A423: main (php_cli.c:1381)

Caught as well.

Note

Valgrind catches everything, really. Every little forgotten byte somewhere in the HUGE process memory map gets reported, as long as the code path that leaks it is actually executed under valgrind. You can’t slip through.
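The fix for both leaks is the same: pair every allocation with a release. Here is a minimal sketch of that pairing in plain C, using hypothetical request_startup() and request_shutdown() functions and plain malloc()/free(). In a real extension the pair would typically live in RINIT and RSHUTDOWN, with emalloc() and efree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-request scratch buffer. */
static void *request_buffer = NULL;

/* Called when a request starts: allocate the buffer. Returns 0 on success. */
int request_startup(void)
{
    assert(request_buffer == NULL);  /* the previous request must have cleaned up */
    request_buffer = malloc(128);
    return request_buffer != NULL ? 0 : -1;
}

/* Called when a request ends: release what request_startup() allocated. */
void request_shutdown(void)
{
    free(request_buffer);
    request_buffer = NULL;           /* nothing left dangling for the next request */
}

/* Tells whether a buffer is currently allocated. */
int request_buffer_is_live(void)
{
    return request_buffer != NULL;
}
```

With this pairing, each request gives back exactly what it took, and a leak checker has nothing to report for that buffer.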

More complex use-case

Here is a more complex setup. Can you spot the leaks in the code below?

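static zend_array ar;

PHP_MINIT_FUNCTION(pib)
{
    zend_string *str;
    zval string;

    /* persistent (libc-backed) string: the last argument is 1 */
    str = zend_string_init("yo", strlen("yo"), 1);
    ZVAL_STR(&string, str);

    /* persistent array; ZVAL_PTR_DTOR is run on each item at destroy time */
    zend_hash_init(&ar, 8, NULL, ZVAL_PTR_DTOR, 1);
    zend_hash_next_index_insert(&ar, &string);

    return SUCCESS;
}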

There are two leaks here. First, we allocate a zend_string but we don’t free it. Second, we initialize a zend_array, whose storage gets allocated on the first insert, and we don’t free it either. Let’s launch that with valgrind, and see the result:

==31316== 296 (264 direct, 32 indirect) bytes in 1 blocks are definitely lost in loss record 1 of 2
==32006==    by 0xA3701E: __zend_malloc (zend_alloc.c:2820)
==32006==    by 0xA814B2: zend_hash_real_init_ex (zend_hash.c:133)
==32006==    by 0xA816D2: zend_hash_check_init (zend_hash.c:161)
==32006==    by 0xA83552: _zend_hash_index_add_or_update_i (zend_hash.c:714)
==32006==    by 0xA83D58: _zend_hash_next_index_insert (zend_hash.c:841)
==32006==    by 0xE896AF4: zm_startup_pib (pib.c:1781)
==32006==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==32006==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==32006==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==32006==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)

==31316== 32 bytes in 1 blocks are indirectly lost in loss record 2 of 2
==31316==    by 0xA3701E: __zend_malloc (zend_alloc.c:2820)
==31316==    by 0xE880B0D: zend_string_alloc (zend_string.h:122)
==31316==    by 0xE880B76: zend_string_init (zend_string.h:158)
==31316==    by 0xE896F9D: zm_activate_pib (pib.c:1781)
==31316==    by 0xA79F1B: zend_activate_modules (zend_API.c:2537)
==31316==    by 0x9D31D3: php_request_startup (main.c:1673)
==31316==    by 0xB5909A: do_cli (php_cli.c:964)
==31316==    by 0xB5A423: main (php_cli.c:1381)

==31316== LEAK SUMMARY:
==31316== definitely lost: 328 bytes in 2 blocks

As expected, both leaks are reported. As you can see, valgrind is accurate: it puts your eyes where they need to be.

Let’s fix them now:

PHP_MSHUTDOWN_FUNCTION(pib)
{
    zend_hash_destroy(&ar); /* runs ZVAL_PTR_DTOR on each item, then frees the storage */

    return SUCCESS;
}

We destroy the persistent array at the end of the PHP process, in MSHUTDOWN. As we passed ZVAL_PTR_DTOR as the destructor when we created it, zend_hash_destroy() runs that callback on every item we inserted. This is the zval destructor, which destroys a zval according to its content. For IS_STRING types, the destructor releases the zend_string and frees it if necessary. Done.

Note

As you can see, PHP, like any serious C program, is full of nested pointers. The zend_string is encapsulated into a zval, itself part of a zend_array. Leaking the array obviously leaks both the zval and the zend_string, but the zval was not heap allocated on its own (we declared it on the stack, and the array keeps its own copy inside its storage), so there is no separate leak reported for it. You should get used to the fact that forgetting to release/free a compound structure such as a zend_array leads to tons of leaks, as structures often embed structures that embed other structures, etc.
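The “container plus destructor callback” pattern is not specific to PHP. Here is a small sketch of it in plain C, with a hypothetical fixed-capacity ptr_array type standing in for the zend_array and a counting_free() callback standing in for ZVAL_PTR_DTOR. It only illustrates the ownership idea; it says nothing about how zend_hash is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*item_dtor)(void *item);

/* Hypothetical container that owns the pointers stored in it. */
typedef struct {
    void    **items;
    size_t    count;
    size_t    capacity;
    item_dtor dtor;      /* called on every item when the array is destroyed */
} ptr_array;

int ptr_array_init(ptr_array *arr, size_t capacity, item_dtor dtor)
{
    assert(arr != NULL && capacity > 0);
    arr->items = malloc(capacity * sizeof(void *));
    if (arr->items == NULL) {
        return -1;
    }
    arr->count = 0;
    arr->capacity = capacity;
    arr->dtor = dtor;
    return 0;
}

int ptr_array_push(ptr_array *arr, void *item)
{
    if (arr->count == arr->capacity) {
        return -1;       /* fixed capacity keeps the sketch short */
    }
    arr->items[arr->count++] = item;
    return 0;
}

/* Destroy the items first, then the storage that held them. */
void ptr_array_destroy(ptr_array *arr)
{
    size_t i;

    for (i = 0; i < arr->count; i++) {
        if (arr->dtor != NULL) {
            arr->dtor(arr->items[i]);
        }
    }
    free(arr->items);
    arr->items = NULL;
    arr->count = arr->capacity = 0;
}

/* Example destructor: frees the item and counts how many it released. */
static size_t destroyed_items = 0;

void counting_free(void *item)
{
    destroyed_items++;
    free(item);
}

size_t destroyed_item_count(void)
{
    return destroyed_items;
}
```

Forgetting the single ptr_array_destroy() call leaks the storage and every item reachable through it, which is the reason one missing line can produce several loss records.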

Buffer overflow/underflow detection

Leaking memory is bad. It leads your program to trigger OOM sooner or later, and it slows down the host machine dramatically, as the latter has less and less memory available as time passes. These are the symptoms of memory leaks.

But there is worse: buffer out-of-bounds access. Accessing memory outside the limits of an allocation is the root of so many evil operations (like getting a root shell on the machine) that you should absolutely prevent it. On a lighter note, out-of-bounds accesses also frequently lead to a program crash through memory corruption. However, this all depends on the target hardware, the compiler used and its options, the OS memory layout, the libc used, etc. Many factors.

Thus, out-of-bounds accesses are very nasty: they are bombs that may or may not blow up, now, or in a minute, or, if you get excessively lucky, never.

Valgrind is a memory debugger, and hence is able to detect out-of-bounds accesses to heap blocks. Be aware that memcheck does not check the bounds of arrays living on the stack or in static storage; other tools such as AddressSanitizer cover those. Finding out-of-bounds accesses uses the same memcheck tool as finding leaks.

Let’s see an easy example:

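PHP_MINIT_FUNCTION(pib)
{
    char *foo = malloc(16);
    foo[16] = 'a';  /* one byte past the end of the block */
    foo[-1] = 'a';  /* one byte before the start of the block */

    return SUCCESS;
}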

This code allocates a buffer, and on purpose writes one byte past its end and one byte before its start. If you run such code, it may crash immediately, or only later and seemingly at random. You may also have created a security hole in PHP, though it may not be remotely exploitable (that stays uncommon).

Warning

Out-of-bounds accesses lead to undefined behavior. What is going to happen is not predictable, but be sure that it is either bad (immediate crash) or terrifying (security issue). Remember.

Let’s ask valgrind. The command line is exactly the same as before; only the output changes:

==12802== Invalid write of size 1
==12802==    at 0xE896A98: zm_startup_pib (pib.c:1772)
==12802==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==12802==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==12802==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==12802==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==12802==    by 0x9D4541: php_module_startup (main.c:2260)
==12802==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==12802==    by 0xB5A367: main (php_cli.c:1348)
==12802==  Address 0xeb488f0 is 0 bytes after a block of size 16 alloc'd
==12802==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12802==    by 0xE896A85: zm_startup_pib (pib.c:1771)
==12802==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==12802==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==12802==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==12802==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==12802==    by 0x9D4541: php_module_startup (main.c:2260)
==12802==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==12802==    by 0xB5A367: main (php_cli.c:1348)
==12802==
==12802== Invalid write of size 1
==12802==    at 0xE896AA6: zm_startup_pib (pib.c:1773)
==12802==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==12802==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==12802==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==12802==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==12802==    by 0x9D4541: php_module_startup (main.c:2260)
==12802==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==12802==    by 0xB5A367: main (php_cli.c:1348)
==12802==  Address 0xeb488df is 1 bytes before a block of size 16 alloc'd
==12802==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12802==    by 0xE896A85: zm_startup_pib (pib.c:1771)
==12802==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==12802==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==12802==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==12802==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==12802==    by 0x9D4541: php_module_startup (main.c:2260)
==12802==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==12802==    by 0xB5A367: main (php_cli.c:1348)

Both invalid writes have been detected, and now your goal is to track them and fix them.

Here, we used an example where we write memory out of bounds. This is the worst scenario, as your write operation, if it succeeds (it could lead immediately to a SIGSEGV), overwrites whatever sits next to that buffer. As we allocated using libc’s malloc(), we may overwrite the bookkeeping data libc keeps around its blocks to manage and track its allocations. Depending on many things (platform, libc used, how it got compiled, etc.), that can lead to a crash.

Valgrind can also report invalid reads. That means you perform a memory read operation out of the bounds of an allocated block. This is a better scenario than a block overwrite, but you still access a memory area you should not, and here again that could lead to an immediate crash, or a later one, or never. Don’t do that.
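One classic way to notice such overwrites is to surround each block with guard bytes and check them later. The sketch below shows that idea in plain C with hypothetical guarded_malloc(), guarded_is_intact() and guarded_free() functions. It is a toy canary scheme written for this chapter, not memcheck’s actual mechanism and not libc’s layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE  0xAB
#define FRONT_GUARD sizeof(max_align_t)  /* keeps the user pointer aligned */
#define BACK_GUARD  ((size_t)8)

/* Header stored in front of every block; the union pads it to max alignment. */
typedef union {
    size_t      size;
    max_align_t align;
} guard_header;

/* Layout: [header][front guard][user bytes][back guard] */
void *guarded_malloc(size_t size)
{
    unsigned char *raw, *user;

    raw = malloc(sizeof(guard_header) + FRONT_GUARD + size + BACK_GUARD);
    if (raw == NULL) {
        return NULL;
    }
    ((guard_header *)raw)->size = size;
    user = raw + sizeof(guard_header) + FRONT_GUARD;
    memset(user - FRONT_GUARD, GUARD_BYTE, FRONT_GUARD);
    memset(user + size, GUARD_BYTE, BACK_GUARD);
    return user;
}

/* Returns 1 if both guards are untouched, 0 if something wrote into them. */
int guarded_is_intact(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *front, *back;
    const guard_header *header;
    size_t i;

    assert(ptr != NULL);
    front = user - FRONT_GUARD;
    header = (const guard_header *)(front - sizeof(guard_header));
    back = user + header->size;

    for (i = 0; i < FRONT_GUARD; i++) {
        if (front[i] != GUARD_BYTE) {
            return 0;
        }
    }
    for (i = 0; i < BACK_GUARD; i++) {
        if (back[i] != GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}

void guarded_free(void *ptr)
{
    if (ptr != NULL) {
        free((unsigned char *)ptr - FRONT_GUARD - sizeof(guard_header));
    }
}
```

A write to index 16 of a 16-byte guarded block lands in the back guard, so guarded_is_intact() returns 0 afterwards. The limit of this approach is that the damage is only noticed at check time, whereas valgrind reports the faulty write at the instruction that performs it.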

Note

As soon as you read “Invalid” in the output of valgrind, that smells really bad for you. Whether invalid read or write, you have a problem in your code, and you should consider this problem as high risk: fix it now, really.
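On the prevention side, the usual habit is to carry the size of a buffer along with its pointer and to check the index before touching memory. Here is a minimal sketch with a hypothetical checked_store() helper; real code would rather get its index computations right than wrap every access, but the check shows what “in bounds” means.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store `value` at buf[index] only if the index is valid.
 * Returns 0 on success, -1 if the access would be out of bounds. */
int checked_store(char *buf, size_t size, ptrdiff_t index, char value)
{
    assert(buf != NULL);
    if (index < 0 || (size_t)index >= size) {
        return -1;   /* would write before the start or past the end */
    }
    buf[index] = value;
    return 0;
}
```

With a 16-byte buffer, both index 16 and index -1 are rejected, which are exactly the two writes valgrind flagged in the example above.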

Here is a second example about string concatenations:

char *foo = strdup("foo");
char *bar = strdup("bar");

char *foobar = malloc(strlen("foo") + strlen("bar"));

memcpy(foobar, foo, strlen(foo));
memcpy(foobar + strlen("foo"), bar, strlen(bar));

fprintf(stderr, "%s", foobar);

free(foo);
free(bar);
free(foobar);

Can you spot the problem?

Let’s ask valgrind:

==13935== Invalid read of size 1
==13935==    at 0x4C30F74: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13935==    by 0x768203E: fputs (iofputs.c:33)
==13935==    by 0xE896B91: zm_startup_pib (pib.c:1779)
==13935==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==13935==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==13935==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==13935==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==13935==    by 0x9D4541: php_module_startup (main.c:2260)
==13935==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==13935==    by 0xB5A367: main (php_cli.c:1348)
==13935==  Address 0xeb48986 is 0 bytes after a block of size 6 alloc'd
==13935==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13935==    by 0xE896B14: zm_startup_pib (pib.c:1774)
==13935==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==13935==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==13935==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==13935==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==13935==    by 0x9D4541: php_module_startup (main.c:2260)
==13935==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==13935==    by 0xB5A367: main (php_cli.c:1348)

Line 1779 points to the fprintf() call. That call invoked fputs(), which itself called strlen() (both from libc), and here strlen() performs an invalid read of 1 byte.

We simply forgot the \0 to terminate our string. We pass fprintf() a string that is not valid. It first tries to compute the length of that string by calling strlen(). strlen() then scans the buffer until it finds a \0, and it scans past the end of the buffer as we forgot to zero-terminate it. We are lucky here: strlen() only reads one byte past the end. It could have been way more, and it could have crashed, because we don’t know where the next \0 is in memory; that is random.

Solution:

size_t len   = strlen("foo") + strlen("bar") + 1;   /* note the +1 for \0 */
char *foobar = malloc(len);

/* ... ... same code ... ... */

foobar[len - 1] = '\0'; /* terminate the string properly */

Note

The error described above is one of the most common ones in C. It is called an off-by-one mistake: you forget to allocate just one byte, and you create tons of problems in the code just because of that.
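For reference, here is the whole concatenation written as one self-contained function, with the extra byte and the terminator in place. The name concat_strings() is ours, chosen for this sketch; the caller owns the returned buffer and must free() it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, zero-terminated copy of a followed by b,
 * or NULL if the allocation fails. The caller must free() the result. */
char *concat_strings(const char *a, const char *b)
{
    size_t len_a, len_b;
    char *out;

    assert(a != NULL && b != NULL);
    len_a = strlen(a);
    len_b = strlen(b);

    out = malloc(len_a + len_b + 1);   /* +1 for the terminating \0 */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, a, len_a);
    memcpy(out + len_a, b, len_b);
    out[len_a + len_b] = '\0';         /* terminate the string properly */
    return out;
}
```

Calling concat_strings("foo", "bar") allocates 7 bytes, and strlen() on the result stops at the terminator inside the block instead of running past it.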

Finally, here is a last example to show a use-after-free scenario. This is also a very common mistake in C programming, and it is as bad as an out-of-bounds access: it creates security flaws that can lead to very nasty behaviors. Obviously, valgrind can detect use-after-free. Here is one:

char *foo = strdup("foo");
free(foo);

memcpy(foo, "foo", sizeof("foo"));

Here again, this is a scenario that has nothing specific to PHP, but still. We free a pointer, and use it afterwards. This is a big mistake. Let’s ask valgrind:

==14594== Invalid write of size 1
==14594==    at 0x4C3245C: memcpy@GLIBC_2.2.5 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14594==    by 0xE896AA1: zm_startup_pib (pib.c:1774)
==14594==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==14594==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==14594==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==14594==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==14594==    by 0x9D4541: php_module_startup (main.c:2260)
==14594==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==14594==    by 0xB5A367: main (php_cli.c:1348)
==14594==  Address 0xeb488e0 is 0 bytes inside a block of size 4 free'd
==14594==    at 0x4C2EDEB: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14594==    by 0xE896A86: zm_startup_pib (pib.c:1772)
==14594==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==14594==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==14594==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==14594==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==14594==    by 0x9D4541: php_module_startup (main.c:2260)
==14594==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==14594==    by 0xB5A367: main (php_cli.c:1348)
==14594==  Block was alloc'd at
==14594==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14594==    by 0x769E8D9: strdup (strdup.c:42)
==14594==    by 0xE896A70: zm_startup_pib (pib.c:1771)
==14594==    by 0xA774F7: zend_startup_module_ex (zend_API.c:1843)
==14594==    by 0xA77559: zend_startup_module_zval (zend_API.c:1858)
==14594==    by 0xA85AF5: zend_hash_apply (zend_hash.c:1508)
==14594==    by 0xA77B25: zend_startup_modules (zend_API.c:1969)
==14594==    by 0x9D4541: php_module_startup (main.c:2260)
==14594==    by 0xB5802F: php_cli_startup (php_cli.c:427)
==14594==    by 0xB5A367: main (php_cli.c:1348)

Everything is clear here again.
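A common defensive habit against this kind of bug is to reset a pointer right after freeing it, so that a later use can be detected with a simple NULL check. Here is a minimal sketch with hypothetical free_and_reset() and copy_if_live() helpers. Note that it only protects the pointer you reset: any other copy of the same address still dangles, which is why a memory debugger stays necessary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free *ptr and reset it, so the stale address is gone. */
void free_and_reset(char **ptr)
{
    assert(ptr != NULL);
    free(*ptr);
    *ptr = NULL;
}

/* Hypothetical helper: copy src (terminator included) into dst only if dst
 * is still a live buffer that is large enough. Returns 0 on success, -1 otherwise. */
int copy_if_live(char *dst, size_t dst_size, const char *src)
{
    size_t needed;

    if (dst == NULL || src == NULL) {
        return -1;          /* buffer already released, or nothing to copy */
    }
    needed = strlen(src) + 1;
    if (needed > dst_size) {
        return -1;          /* would write past the end of dst */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

Applied to the example above, the memcpy() after free() would be refused instead of writing into a released block.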

Conclusions

Use a memory debugger before pushing to production. As you have learnt in this chapter, the one tiny byte you forget in your computations can lead to an exploitable security hole. It also leads, very often, to a simple crash. That means that your cool-and-nice extension could take down an entire server (or set of servers) and every one of its clients.

C is a very demanding programming language. You are given billions of bytes of memory to program, and you must arrange those to perform some computation. But don’t mess with that huge power: in the best case (rare), nothing happens; in a worse case (very common), you randomly crash here and there; and in the worst scenario, you create a breach in the program that happens to be remotely exploitable…

You now have the tools and you are clever: take care of the machine’s memory, really.