Thursday, May 21, 2009

Debugging a segmentation fault using gdb

I am not a big proponent of gdb. If you *really* know what you are doing, gdb shouldn't be required. But every now and then you come across code that uses function pointers extensively, and hand-debugging becomes painful. gdb to the rescue.

You'll need the following prerequisites to use gdb to debug a segmentation fault:
1) Make sure you have compiled the executable WITH debugging symbols, i.e. with the "-g" flag, e.g.
gcc -g -o hello hello.c
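Without debugging symbols, gdb can show you little more than raw addresses and function names; it won't be able to map the crash to source files and line numbers.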

2) Your shell must allow core dumps on a segmentation fault. The core file size limit is often 0 by default, so raise it for the current shell session:
ulimit -c unlimited
(man ulimit for more info)
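
If you want something small to try both prerequisites on, here is a minimal sketch. The file name crash.c, the function dispatch and the helper core_limit_ok are all made up for illustration; the program calls through a NULL function pointer, which will typically segfault on Linux.

```shell
# Hypothetical test program: calls through a NULL function pointer.
cat > crash.c <<'EOF'
#include <stdio.h>

typedef void (*handler_fn)(int);

static void dispatch(handler_fn fn, int value)
{
    fn(value);              /* crashes here when fn is NULL */
}

int main(void)
{
    handler_fn fn = NULL;   /* deliberately left unset */
    dispatch(fn, 42);
    return 0;
}
EOF

# Prerequisite 1: build with debugging symbols.
# -O0 keeps the generated code close to the source.
if command -v gcc >/dev/null 2>&1; then
    gcc -g -O0 -o crash crash.c
fi

# Prerequisite 2: allow core files in this shell session.
core_limit_ok() {
    # $1 = output of `ulimit -c`; 0 means core files are disabled
    [ "$1" != "0" ]
}
core_limit_ok "$(ulimit -c)" || ulimit -c unlimited || echo "could not raise the core limit"
```

Running ./crash after this should normally end with "Segmentation fault (core dumped)", assuming your system is set up to write core files to disk.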


Now just run the executable that is segfaulting. As soon as it segfaults, you should get output something like "Segmentation fault (core dumped)". ls in your working directory and you will find that a new core file has been created (typically named core or core.{pid}, depending on your system's configuration).
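
If the directory already holds several old core files, a small helper can pick out the newest one. This is only a sketch: latest_core is a made-up name, and it assumes the core files are written to the directory you pass in and that their names start with "core".

```shell
# Print the most recently modified core file in a directory.
latest_core() {
    # $1 = directory to search (defaults to the current one)
    # ls -t sorts newest first; head keeps only the first entry.
    ls -t "${1:-.}"/core* 2>/dev/null | head -n 1
}
```

Calling latest_core with no argument searches the working directory and prints nothing if no core file is found.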

Now we just have to tell gdb to analyze this core. Here's how:
gdb {executable} {dump file}

e.g. gdb hello core.1324

Check the output gdb prints at startup and make sure that the debugging symbols have been loaded (a "no debugging symbols found" message means the executable was built without -g).
Now, on the gdb prompt:

(gdb) bt
(bt = backtrace; it prints the stack trace)
With this backtrace you'll know *exactly* where the program segfaulted: the source file, the line number and the call that was the culprit.
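
If you only want the backtrace and not an interactive session, gdb can run the command and exit. A minimal sketch, using gdb's -batch and -ex options; the wrapper name core_backtrace is made up for illustration.

```shell
# Print a backtrace from a core file without entering the gdb prompt.
core_backtrace() {
    # $1 = executable, $2 = core file
    # -batch exits when the commands finish; -ex runs one gdb command.
    gdb -batch -ex "bt" "$1" "$2"
}
```

For the example above, that would be: core_backtrace hello core.1324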

You can even analyze variable values on any frame. Just change to that frame, using the frame number shown in the backtrace (#0, #1, ...):
(gdb) frame {num}
e.g. (gdb) frame 2

and use:
(gdb) info locals
(gdb) info args
to query the values of local variables and passed arguments, respectively.
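
The same frame inspection can be scripted, which is handy when you keep coming back to the same core file. Again just a sketch: frame_vars is a made-up wrapper, and it simply passes the three commands above to gdb in batch mode.

```shell
# Dump local variables and arguments of one stack frame from a core file.
frame_vars() {
    # $1 = executable, $2 = core file, $3 = frame number from the backtrace
    gdb -batch -ex "frame $3" -ex "info locals" -ex "info args" "$1" "$2"
}
```

For example, frame_vars hello core.1324 2 selects frame 2 and prints its locals and arguments.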

With all this info, you can pin down the exact reason for the segfault pretty easily. Saves a lot of time!