Linux/Android Kernel Debugging with vmlinux-gdb

28 May 2015

Introduction

There are several ways to debug the kernel. The simplest, and often the most effective, has always been printk. KDB is another option: a debugger embedded within the kernel itself.

With the introduction of virtualisation technologies, other approaches are now possible. For instance, QEMU implements a GDB stub. When the virtual machine is spawned, a gdbserver-like stub listens on a specific port. This option also exists on the default Android emulator.

Even with symbols, using GDB to access some internal kernel structures may be obscure for a beginner. Earlier versions of GDB could only be scripted with GDB's own command language (Guile support only arrived in GDB 7.8). Some such scripts exist within the kernel tree but have limited functionality.

Since GDB 7.0, Python is supported as a scripting language. In the security community, the best examples of this usage are the PEDA and GEF plugins for exploit writing and reverse engineering.

Linux 4.0 ships a GDB plugin written by Jan Kiszka that allows easy access to kernel structures. Hopefully, this plugin will make kernel exploration easier for debugging and potentially for kernel exploit development.

This post presents the use of that plugin on the Android emulator, including some upcoming features. There should be enough information to adapt the procedure to a standard Linux kernel.

GDB version matching

When GDB is used in conjunction with gdbserver, the client version needs to match the server version. Although this is not mentioned in the documentation, if they don't match, expect segfaults and random behaviour. In our case, the default AVDs have gdbserver installed. On Lollipop:

$ adb shell
root@generic:/ # gdbserver --version                                           
GNU gdbserver (GDB) 7.6
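
If you want to script that comparison, here is a minimal sketch. It assumes both tools print a banner of the form "GNU ... (GDB) X.Y" (vendor builds may use a different tag), and the helper names gdb_version and versions_match are mine, not part of any tool:

```python
import re

def gdb_version(banner):
    """Extract the version tuple from a 'GNU gdb (GDB) 7.6'-style banner."""
    match = re.search(r"\(GDB\)\s+(\d+(?:\.\d+)*)", banner)
    if match is None:
        raise ValueError("no GDB version found in: %r" % banner)
    return tuple(int(part) for part in match.group(1).split("."))

def versions_match(server_banner, client_banner):
    """True when gdbserver and the GDB client report the same version."""
    return gdb_version(server_banner) == gdb_version(client_banner)

server = "GNU gdbserver (GDB) 7.6"  # banner from the adb shell session above
client = "GNU gdb (GDB) 7.6"        # assumed banner of a matching client
print(versions_match(server, client))  # True
```

In practice you would feed it the first line of each tool's --version output.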

Android Kernel

A kernel image for QEMU with symbols is available in aosp/prebuilts/qemu-kernel/arm. This could be used for basic symbol resolution when debugging. However, it does not contain debug information such as types and structures. To get this information, we need to recompile the kernel.

This step is quite easy and multiple tutorials exist online. Make sure you use the android-goldfish-3.4 branch and that you enable CONFIG_DEBUG_INFO. I also recommend using a prebuilt toolchain for this stage.
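
Before spending time on a build, it is worth checking that the option really ended up in the generated configuration. A minimal sketch, assuming the usual .config format ("CONFIG_X=y" for enabled options, "# CONFIG_X is not set" for disabled ones); the function name is hypothetical:

```python
def debug_info_enabled(config_text):
    """Return True if CONFIG_DEBUG_INFO=y appears as an active option."""
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_DEBUG_INFO=y":
            return True
    return False

# Made-up excerpt standing in for the kernel's .config file.
sample = """\
CONFIG_ARM=y
# CONFIG_DEBUG_INFO is not set
"""
print(debug_info_enabled(sample))                            # False
print(debug_info_enabled(sample + "CONFIG_DEBUG_INFO=y\n"))  # True
```

Against a real tree, pass it the contents of the .config file at the top of the kernel directory.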

Recompile GDB

The default Android toolchain for userland (arm-linux-androideabi-*) is compiled with Python enabled. Unfortunately, the bare-metal toolchain (arm-eabi-*) which is used to build the kernel is not. To recompile, grab the binutils-gdb git tree, check out the required version and follow the standard procedure. Your configure line should be similar to:

$ ./configure --with-python --target=arm-eabi --disable-werror --program-prefix=arm-eabi- --prefix=/home/user/src/gdb-build

vmlinux-gdb

Although a first version of the plugin shipped with Linux 4.0, we will use the latest version directly from Jan's repository. You need to copy scripts/gdb/* to the Android kernel directory and create a symbolic link:

$ pwd
<...>/aosp/kernel/goldfish
$ cp -R /home/user/src/linux/scripts/gdb ./scripts/
$ ln -s scripts/gdb/vmlinux-gdb.py .

When loading a file, GDB automatically looks for a <file>-gdb.py file, which is loaded as a plugin. You can now run your recompiled GDB with vmlinux:

$ /home/user/src/gdb-build/bin/arm-eabi-gdb vmlinux

On the first run, you should come across a warning from GDB that the plugin has been disabled for security reasons. Follow the instructions to enable auto-loading, typically by adding an add-auto-load-safe-path line for the kernel directory to your ~/.gdbinit.

Usage

Start the emulator with our new kernel and the GDB stub from QEMU:

$ emulator -verbose -show-kernel -debug init -kernel <...>/aosp/kernel/goldfish/arch/arm/boot/zImage -avd kitkat -qemu -s

In the GDB console, you can connect to the GDB stub:

Reading symbols from <...>/aosp/kernel/goldfish/vmlinux...done.
(gdb) target remote :1234
Remote debugging using :1234
__alloc_pages_nodemask (gfp_mask=gfp_mask@entry=512, order=order@entry=0, 
    zonelist=0xc04ac1e8 <contig_page_data+1464>, nodemask=nodemask@entry=0x0)
    at mm/page_alloc.c:2471
2471            if (!preferred_zone)

If you haven't got the symbols at this stage, make sure your GDB versions match and that you compiled the kernel with the correct options. Now for the fun part: you can start by displaying the current kernel processes:

(gdb) lx-ps
0xc0482c90 <init_task> 0 swapper
0xde81cc00 1 init
0xde81c800 2 kthreadd
0xde81c400 3 ksoftirqd/0
0xde824c00 5 kworker/u:0
0xde824800 6 khelper
0xde824400 7 sync_supers
0xde824000 8 bdi-default
[...]
0xcbd2c800 1103 ReferenceQueueD
0xcbd2c400 1104 FinalizerDaemon
0xcbd2c000 1105 FinalizerWatchd
0xcbd9ac00 1106 HeapTrimmerDaem
0xcbd9a800 1107 GCDaemon
0xcbd9a400 1108 Binder_1
0xcbd9a000 1109 Binder_2
0xcbdab400 1114 pool-1-thread-1
0xcbdab000 1115 Thread-122
0xcbdd8c00 1116 CameraHolder

The first column is the address of the task_struct of the process:

(gdb) p *(struct task_struct *)0xde81cc00
$1 = {state = 1, stack = 0xde828000, usage = {counter = 2}, flags = 1077936384, ptrace = 0, 
  on_rq = 0, prio = 120, static_prio = 120, normal_prio = 120, rt_priority = 0, 
  sched_class = 0xc036596c <fair_sched_class>, se = {load = {weight = 1024, [...]
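
If you save the lx-ps listing to a file, the three columns (task_struct address, PID, command name) are easy to post-process. A minimal sketch, assuming the output format shown above; parse_lx_ps is a hypothetical helper, not something the plugin provides:

```python
def parse_lx_ps(output):
    """Map PID -> (task_struct address, command name) from lx-ps text."""
    tasks = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].startswith("0x"):
            continue  # skip blank lines and elisions such as "[...]"
        address = int(fields[0], 16)
        rest = fields[1:]
        # GDB may annotate an address with a symbol, e.g. "<init_task>".
        if rest[0].startswith("<"):
            rest = rest[1:]
        tasks[int(rest[0])] = (address, " ".join(rest[1:]))
    return tasks

sample = """\
0xc0482c90 <init_task> 0 swapper
0xde81cc00 1 init
0xde81c800 2 kthreadd
[...]
0xcbdd8c00 1116 CameraHolder
"""
tasks = parse_lx_ps(sample)
print(hex(tasks[1][0]), tasks[1][1])  # 0xde81cc00 init
```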

Each task contains pointers to the previous and next task. These pointers actually point to the list_head embedded in the neighbouring task_struct, not to the task_struct itself. To get the address of the enclosing object, kernel developers generally use container_of. The same macro is implemented in the plugin to quickly browse data structures. For example, to find the next task_struct:

(gdb) p $container_of(init_task.tasks.next, "struct task_struct", "tasks")
$2 = (struct task_struct *) 0xde81cc00
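
The arithmetic behind container_of is just a subtraction. The sketch below reproduces it in plain Python, outside GDB. The member address 0xde81cdc8 is the value of init_task.tasks.next as printed by the lx-list-check example further down; the offset 0x1c8 is not an official constant, I simply derived it from the two addresses in this post, and it will differ on other builds:

```python
def container_of(member_address, member_offset):
    """Address of the enclosing object, given a pointer to one of its members."""
    return member_address - member_offset

# init_task.tasks.next: points at the list_head embedded in init's task_struct.
tasks_next = 0xde81cdc8
# Offset of 'tasks' inside task_struct for this particular build (assumption
# derived from this post: 0xde81cdc8 - 0xde81cc00).
TASKS_OFFSET = 0x1c8

print(hex(container_of(tasks_next, TASKS_OFFSET)))  # 0xde81cc00
```

Inside GDB, the offset for your own build can be read with p &((struct task_struct *)0)->tasks.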

It is also possible to look up a task by its PID and examine its attributes:

(gdb) p $lx_task_by_pid(52).comm
$3 = "surfaceflinger\000"
(gdb) p *$lx_task_by_pid(52).cred
$4 = {usage = {counter = 122}, uid = 1000, gid = 1003, suid = 1000, sgid = 1003, euid = 1000, egid = 1003, 
  fsuid = 1000, fsgid = 1003, securebits = 0, cap_inheritable = {cap = {0, 0}}, cap_permitted = {cap = {0, 0}}, 
  cap_effective = {cap = {0, 0}}, cap_bset = {cap = {4294967295, 4294967295}}, security = 0xde100bc0, 
  user = 0xde13a500, user_ns = 0xc0486338 <init_user_ns>, group_info = 0xde072c80, rcu = {next = 0x0, func = 0x0}}

Finally, another command exists to check the consistency of a list:

(gdb) lx-list-check init_task.tasks
Starting with: {next = 0xde81cdc8, prev = 0xd7a6d1c8}
list is consistent: 54 node(s)
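
To make "consistent" concrete: in a healthy circular doubly linked list, following next and then prev from any node must bring you back to that node. The sketch below models this in plain Python. It is not the plugin's code; ListHead, list_add_tail and list_check are stand-ins I wrote for illustration, and my count includes the head, which may not match how the plugin counts:

```python
class ListHead:
    """Stand-in for the kernel's struct list_head (next/prev pointers)."""
    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self

def list_add_tail(new, head):
    """Insert 'new' just before 'head', like the kernel's list_add_tail()."""
    last = head.prev
    last.next = new
    new.prev = last
    new.next = head
    head.prev = new

def list_check(head):
    """Walk forward from head; return the node count or raise on a bad link."""
    count = 1
    node = head
    while True:
        nxt = node.next
        if nxt.prev is not node:
            raise ValueError("next.prev of %s does not point back" % node.name)
        if nxt is head:
            return count
        node = nxt
        count += 1

head = ListHead("init_task.tasks")
for name in ("init", "kthreadd"):
    list_add_tail(ListHead(name), head)
print("list is consistent: %d node(s)" % list_check(head))
```

Overwriting a single prev pointer is enough to make list_check raise.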

Limitations

Unfortunately, there are some limitations. The plugin only targets the kernel version it ships with. For instance, in Linux 3.5, printk was changed to use a variable-length record buffer. In the 3.4 Android example, lx-dmesg will not work since it expects log_first_idx and other symbols which are not present. Ideally, an out-of-tree version compatible with previous releases should be created.
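
A quick way to predict whether a command will work on an older kernel is to look for the symbols it needs in the build's System.map. A minimal sketch, assuming the usual "address type name" layout of that file; missing_symbols is a hypothetical helper, and the excerpt reuses addresses from this post with type letters I made up:

```python
def missing_symbols(system_map_text, required):
    """Return the names in 'required' that System.map does not define."""
    defined = set()
    for line in system_map_text.splitlines():
        fields = line.split()
        if len(fields) == 3:  # address, type letter, symbol name
            defined.add(fields[2])
    return [name for name in required if name not in defined]

# Illustrative excerpt only; the type letters are assumed, not from a real build.
sample_map = """\
c036596c R fair_sched_class
c0482c90 D init_task
c0486338 D init_user_ns
"""
print(missing_symbols(sample_map, ["init_task", "log_first_idx"]))
# ['log_first_idx']
```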

Conclusion

We've seen how to install and use the vmlinux-gdb plugin. Although it is still at an early stage, I hope this simple plugin gets used for internal kernel exploration.

Extra

When getting your hands on a server, always check whether KDB is activated. If it is, gaining root access is trivial.