20:09:24 #startmeeting Fedora ARM Tech Talk - Debugging ARM vexpress kernels with gdb
20:09:24 Meeting started Fri Feb 15 20:09:24 2013 UTC. The chair is jonmasters. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:09:24 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:09:44 #chair bconoboy clark|w dmarlin j_dulaney pwhalen
20:09:44 Current chairs: bconoboy clark|w dmarlin j_dulaney jonmasters pwhalen
20:09:50 jonmasters: A few minutes
20:09:53 But, go ahead
20:09:58 I'll follow along
20:10:50 ok, thank you everyone (and those reading the archives) for joining the second Fedora ARM technical talk
20:11:00 to follow along, you will need the following:
20:11:18 1). A copy of the vexpress (Versatile Express) release of F18, which you can download from the wiki:
20:11:24 http://fedoraproject.org/wiki/Architectures/ARM/F18/Versatile_Express
20:11:41 2). A copy of the "debuginfo" package that matches the kernel from the F18 release:
20:11:47 http://armpkgs.fedoraproject.org/packages/kernel/3.6.10/8.fc18/armv7hl/kernel-debuginfo-3.6.10-8.fc18.armv7hl.rpm
20:12:16 3). A suitable gdb to use to debug the kernel. I generally build my own cross compilers using crosstool-ng, however, you can get one from Linaro:
20:12:24 https://launchpad.net/linaro-toolchain-binaries/trunk/2013.01/+download/gcc-linaro-arm-linux-gnueabihf-4.7-2013.01-20130125_linux.tar.bz2
20:12:39 (Al Stone will include gdb in the future RPM releases available from Linaro)
20:13:06 Make sure you have followed the instructions to get a vexpress system set up
20:13:37 vexpress, or Versatile Express, is a development platform sold by ARM as part of their Keil tooling
20:13:57 it is a physical system that includes an ATX-like motherboard with slots for two daughterboards
20:14:18 on the motherboard, there are many devices, such as audio interfaces, MMC controllers, flash memory, and so on
20:14:34 there is room provided for two additional daughterboards:
20:14:44 1). A CoreTile
20:14:47 2). A LogicTile
20:15:08 The CoreTile provides whatever latest and greatest CPU ARM are working on in a removable form factor
20:15:39 The LogicTile contains an FPGA that allows various SoC (System-on-Chip) specific hardware to be tested before it is fabbed in a silicon foundry
20:16:49 ARM CPU designers and developers use the Versatile Express platform to test new designs, combining the CoreTile (e.g. a Cortex-A9 containing several Cortex-A9MP cores) with a LogicTile containing the magic SoC hardware bits they are making; for example, Calxeda would use a process such as this to design their highbank chip
20:17:19 The latest and greatest stuff is A15 (Eagle), for which various TC (Test Chips) were initially made available as CoreTiles for Versatile
20:17:39 Now, not everyone has a Versatile Express hardware platform, but it has another use
20:18:07 because Versatile is the reference platform made by ARM, it is natural that various emulation software would target (pretend to look like) a Versatile Express
20:18:15 this is the case for qemu
20:18:28 qemu provides a "vexpress" model of the physical Versatile Express hardware platform
20:18:59 it does not implement all of the Versatile hardware features, but it does include the main ones, such as the flash, audio, LCD controller, and so on
20:19:21 it is this model (emulation of physical hardware platform) that we target in Fedora with the vexpress images
20:19:35 Let me clear up another confusion too
20:19:58 People often get confused between qemu used in virtualization like on x86, and qemu used to emulate physical hardware
20:20:37 When you are doing "real" KVM-style virtualization, like on A15 or x86_64 systems, you are really only using bits of qemu to provide IO for the virtualized system
20:21:00 When we are using qemu here, however, we are talking about providing a full emulation of a physical hardware platform
20:21:15 It is unfortunate that there are two very disjoint uses of qemu because this confuses many people
20:22:19 It also means, for example, that when we get to virtualization later on in Fedora, when we are doing the A15 (kernel-lpae), we may use qemu to provide some IO that looks like versatile just because the virt folks in the community are lazily going with vexpress as the environment
20:22:43 but that does not have any direct bearing on our support for "vexpress" as a hardware target
20:22:46 does that make sense?
20:23:12 j_dulaney: ?
20:23:30 it does
20:23:33 ok
20:23:41 It does
20:23:43 Sorry
20:23:44 so with that cleared up, let's talk about the kernel
20:24:03 so, there are actually 3 different "vexpress" systems
20:24:20 * j_dulaney is burning some serious cycles
20:24:27 there is the original one, which is very legacy, and had IO devices located at different memory addresses, etc.
20:24:35 that is known as "versatile" within the kernel
20:24:41 (we don't support that)
20:24:57 then, there is the vexpress we are targeting, which provides a Cortex-A9 based environment
20:25:12 then, there is a newer A15 (Eagle) model that we are not yet supporting
20:25:28 so if you ever look at the code in the kernel you will see various different possibilities :)
20:25:39 When I say vexpress, I mean the A9 one we are using today
20:25:55 now, first I will suggest we all boot a vexpress system
20:26:15 but without using bconoboy 's boot script, because I need to make some changes to it
20:26:34 so, let's boot up a vexpress system in non-graphic mode. Here is what we will do:
20:26:56 extract the tarball containing the vexpress F18 release
20:26:59 you should see this:
20:27:08 [jcm@independence Fedora-18-vexpress-xfce-armhfp]$ ls
20:27:08 boot Fedora-18-vexpress-xfce-armhfp.img
20:27:23 in the boot directory, you will have some of the following files (but not all yet):
20:27:33 [jcm@independence boot]$ ls
20:27:33 boot-vexpress vmlinux-3.6.10-8.fc18.armv7hl
20:27:33 initramfs-3.6.10-8.fc18.armv7hl.img vmlinuz-3.6.10-8.fc18.armv7hl
20:27:43 take a look at that boot-vexpress file
20:27:49 in there, you will see the following snippet:
20:28:08 if [ $GUI = 1 ]; then
20:28:09 qemu-system-arm -machine vexpress-a9 -m 1024 -net nic -net user \
20:28:09 -append "rw root=/dev/mmcblk0p3 rootwait physmap.enabled=0" \
20:28:09 -kernel "$KERN" \
20:28:09 -initrd "$RAMFS" \
20:28:09 -sd "$IMAGE" $FDTARG
20:28:11 else
20:28:13 qemu-system-arm -machine vexpress-a9 -m 1024 -nographic -net nic -net user \
20:28:15 -append "console=ttyAMA0,115200n8 rw root=/dev/mmcblk0p3 rootwait physmap.enabled=0" \
20:28:17 -kernel "$KERN" \
20:28:19 -initrd "$RAMFS" \
20:28:21 -sd "$IMAGE" $FDTARG
20:28:25 fi
20:28:27 we are going to run qemu-system-arm manually
20:28:29 make sure you have qemu-system-arm installed
20:28:46 rostedt: as an aside, thanks for joining us, logs will be available for you to catch up
20:28:55 thanks
20:29:07 clark|w: perhaps you want to send a private copy to Steven now
20:29:26 ok, so with the qemu system model installed, we're going to run qemu manually
20:29:30 like this:
20:29:42 qemu-system-arm -machine vexpress-a9 -m 1024 -nographic -net nic -net user -append "console=ttyAMA0,115200n8 rw root=/dev/mmcblk0p3 rootwait physmap.enabled=0" -kernel vmlinuz-3.6.10-8.fc18.armv7hl -initrd initramfs-3.6.10-8.fc18.armv7hl.img -sd ../Fedora-18-vexpress-xfce-armhfp.img
20:30:07 You see there that I have simply copy-pasted the invocation from the boot script, substituting in the correct kernel, initrd, and disk image
20:30:23 If you were to hit enter, that would boot now
20:30:31 You can do this if you like, but shut it down afterwar
20:30:34 * afterward
20:31:05 the qemu model supports a few things directly, such as loading a kernel and initramfs
20:31:20 it also supports loading a dtb if it is told to do so (we have not here)
20:31:30 qemu also contains an internal gdb "stub"
20:31:59 A gdb stub provides a mechanism for gdb (the GNU Debugger) to remotely control a "target" (in this case a system emulation model of a vexpress)
20:32:12 gdb requires a kernel with full symbolic information
20:32:47 that means an unstripped kernel, or a much bigger kernel vmlinux (not vmlinuz) that still contains all of the link-time information about which source files various code came from
20:33:18 fortunately, we keep these around in Fedora. You can get one by finding the kernel package in Koji corresponding to an installed kernel. I'll save you the bother here. You want this:
20:33:28 http://armpkgs.fedoraproject.org/packages/kernel/3.6.10/8.fc18/armv7hl/kernel-debuginfo-3.6.10-8.fc18.armv7hl.rpm
20:33:56 that "kernel-debuginfo" package contains all of the debugging information in DWARF format that is usually stripped from our "production" Fedora binaries
20:34:10 I usually take a kernel-debuginfo package and manually extract it, as follows:
20:34:31 [jcm@independence vexpress]$ rpm2cpio kernel-debuginfo-3.6.10-8.fc18.armv7hl.rpm | cpio -idv
20:34:57 Doing this will result in a lot of files being extracted under a "usr" directory within the current working directory
20:35:01 e.g.:
20:35:10 [jcm@independence vexpress]$ ls -l usr/
20:35:10 total 4
20:35:10 drwxrwxr-x. 3 jcm jcm 4096 Feb 15 14:59 lib
20:35:30 on a normal system, if you installed this directly it would mean stuff would go in /usr/lib
20:35:52 but we are not installing directly because we want to play with armv7hl kernel binaries on an x86_64 host system
20:36:07 you will find some fun bits in
20:36:21 [jcm@independence vexpress]$ ls -l usr/lib/debug/lib/modules/3.6.10-8.fc18.armv7hl/
20:36:21 total 77876
20:36:21 drwxr-xr-x. 5 jcm jcm 4096 Feb 15 14:59 extra
20:36:21 drwxr-xr-x. 11 jcm jcm 4096 Feb 15 14:59 kernel
20:36:21 -rwxr-xr-x. 1 jcm jcm 79734393 Feb 15 14:59 vmlinux
20:36:54 because the kernel Kbuild system is wired up so that /lib/modules/kernel_version/ contains links to various parts of the kernel build, Fedora also keeps the vmlinux file from a kernel compile in there
20:37:19 this means that when the debuginfo scripts (within the redhat-rpm-config) run, this all gets stashed in that location
20:37:29 anyway, that huge vmlinux file is what you want
20:37:53 I copy it over to where I already have the "boot" subdirectory from my vexpress image. This means, my vexpress "boot" directory looks like:
20:38:06 [jcm@independence boot]$ ls -l
20:38:06 total 93440
20:38:06 -rwxr-xr-x. 1 jcm jcm 1970 Feb 1 21:35 boot-vexpress
20:38:06 -rw-r--r--. 1 jcm jcm 12542803 Feb 1 21:35 initramfs-3.6.10-8.fc18.armv7hl.img
20:38:06 -rwxr-xr-x. 1 jcm jcm 79734393 Feb 15 15:01 vmlinux-3.6.10-8.fc18.armv7hl
20:38:07 -rw-r--r--. 1 jcm jcm 3392832 Feb 1 21:35 vmlinuz-3.6.10-8.fc18.armv7hl
20:38:15 Notice that I renamed it to vmlinux-kernel_version
20:38:36 so now I have something I can use with gdb, it's time to get the corresponding source code
20:38:57 you could take the SRPM from Koji when you get the debuginfo and prep that
20:39:02 but I am using Fedora GIT
20:39:05 so:
20:39:35 in my /data/work/Fedora_GIT directory I have previously done a "fedpkg clone kernel"
20:39:47 followed by a "fedpkg switch-branch f18"
20:40:06 then, I have made a working git branch with "git checkout -b f18-lesson"
20:40:35 and I have reset the contents of that branch to the version of the source closest to the RPM I am playing with:
20:40:50 $ git reset --hard 0689821d3cb11e672f32a9909a0396d3c0da4314
20:41:17 if you were on a more recent kernel and were debugging the latest and greatest, you of course would just have it handy
20:41:34 now, with a copy of the Fedora kernel git, you can prep a source tree:
20:41:50 $ fedpkg --dist f18 prep --arch armv7hl
20:42:02 You want to do it that way for two reasons:
20:42:11 1). You're on a non-f18 branch so it doesn't know the "dist" tag to use
20:42:31 2). You want to tell it to prep for "armv7hl" for completeness, or it will prep for x86_64, which is technically pedantically incorrect
20:42:43 after doing this, you will see a copy of the kernel source in:
20:43:07 [jcm@independence kernel]$ ls -ld kernel-3.6.fc18/linux-3.6.10-6.fc18.armv7hl
20:43:07 drwxr-xr-x. 24 jcm jcm 4096 Feb 15 15:07 kernel-3.6.fc18/linux-3.6.10-6.fc18.armv7hl
20:43:33 change into that directory
20:43:44 so mine looks like:
20:43:56 [jcm@independence linux-3.6.10-6.fc18.armv7hl]$ ls
20:43:56 arch ipc
20:43:56 block Kbuild
20:43:57 config-arm-generic Kconfig
20:43:57 config-arm-highbank kernel
20:43:57 config-arm-imx kernel-3.6.10-armv5tel-kirkwood.config
20:43:59 config-arm-kirkwood kernel-3.6.10-armv7l.config
20:44:01 config-arm-omap kernel-3.6.10-armv7l-highbank.config
20:44:03 config-arm-tegra kernel-3.6.10-armv7l-imx.config
20:44:05 config-arm-versatile kernel-3.6.10-armv7l-omap.config
20:44:07 config-debug kernel-3.6.10-armv7l-tegra.config
20:44:09 config-generic lib
20:44:11 config-i686-PAE MAINTAINERS
20:44:13 config-local Makefile
20:44:15 config-nodebug merge.pl
20:44:17 config-powerpc32-generic mm
20:44:19 config-powerpc32-smp net
20:44:21 config-powerpc64 README
20:44:23 config-powerpc64p7 REPORTING-BUGS
20:44:27 config-powerpc-generic samples
20:44:29 configs scripts
20:44:31 config-s390x security
20:44:33 config-sparc64-generic sound
20:44:35 config-x86-32-generic temp-armv5tel-kirkwood
20:44:37 config-x86_64-generic temp-armv7l-highbank
20:44:39 config-x86-generic temp-armv7l-imx
20:44:41 COPYING temp-armv7l-omap
20:44:43 CREDITS temp-armv7l-tegra
20:44:45 crypto temp-armv7l-versatile
20:44:47 Documentation temp-x86-32
20:44:49 drivers temp-x86-64
20:44:51 firmware tools
20:44:53 fs usr
20:44:57 include virt
20:44:59 init
20:45:01 --- end flood ---
20:45:03 wcohen: as an aside, welcome, this is logged so you can catch up later, or someone will send you a backlog privately
20:45:06 ok, continuing...
20:45:11 so with a kernel tree prepped, and a vexpress system image set up, and a debuginfo kernel extracted, all we need now is a debugger!
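(Recap of the preparation steps so far, as a small shell sketch. It only prints the commands unless RUN=1 is set. The package name, the debuginfo path, and the fedpkg invocation are the ones from the log; the WORK and KGIT defaults mirror the speaker's directories and are assumptions for any other machine, and the run/prep_steps helper names are made up for illustration.)

```shell
#!/bin/sh
# Sketch only: collects the preparation steps from the talk in one place.
# Prints each command by default; set RUN=1 to actually execute them.
# WORK, KGIT and the helper names are illustrative assumptions.
KVER=3.6.10-8.fc18.armv7hl
WORK=${WORK:-$HOME/vexpress}                # where the debuginfo RPM was downloaded
KGIT=${KGIT:-/data/work/Fedora_GIT/kernel}  # result of "fedpkg clone kernel"
BOOT=$WORK/Fedora-18-vexpress-xfce-armhfp/boot

run() {
    if [ "${RUN:-0}" = 1 ]; then
        sh -c "$1"
    else
        printf '%s\n' "$1"
    fi
}

prep_steps() {
    # 1. unpack the debuginfo RPM below ./usr instead of installing it
    run "cd $WORK && rpm2cpio kernel-debuginfo-$KVER.rpm | cpio -idv"
    # 2. keep the unstripped vmlinux next to vmlinuz and the initramfs
    run "cp $WORK/usr/lib/debug/lib/modules/$KVER/vmlinux $BOOT/vmlinux-$KVER"
    # 3. prep a matching source tree from the Fedora kernel git checkout
    run "cd $KGIT && fedpkg --dist f18 prep --arch armv7hl"
}

prep_steps
```

(Printing first makes it easy to check the paths before anything is unpacked or copied.)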
20:45:38 There is not (yet) a generic cross-gdb in Fedora for ARM that can run on x86_64 and target arm
20:45:51 however, there are great cross compilers originally packaged by dhowells
20:45:59 and there will be a gdb
20:46:09 in the interim, please use the Linaro gdb from this location:
20:46:17 https://launchpad.net/linaro-toolchain-binaries/trunk/2013.01/+download/gcc-linaro-arm-linux-gnueabihf-4.7-2013.01-20130125_linux.tar.bz2
20:46:28 I keep all of my toolchains in /data/toolchains
20:46:37 including many I have built myself with crosstool, and so on
20:46:39 so I have:
20:46:57 [jcm@independence linux-3.6.10-6.fc18.armv7hl]$ ls /data/toolchains/linaro/gcc-linaro-arm-linux-gnueabihf-4.7-2013.01-20130125_linux/
20:47:06 arm-linux-gnueabihf bin lib libexec share
20:47:22 now, going back to the kernel source directory we extracted before
20:47:35 if you're in the right directory you will see MAINTAINERS and Documentation and so on
20:47:53 from there, you want to fire up gdb to debug the kernel vmlinux as follows:
20:48:07 /data/toolchains/linaro/gcc-linaro-arm-linux-gnueabihf-4.7-2013.01-20130125_linux/bin/arm-linux-gnueabihf-gdb ~/vexpress/Fedora-18-vexpress-xfce-armhfp/boot/vmlinux-3.6.10-8.fc18.armv7hl
20:48:34 so I am inside the kernel source directory, and I am specifically calling the Linaro gdb and passing it the path to my local copy of the vmlinux kernel
20:48:49 you will see output ending in:
20:48:59 Reading symbols from /home/jcm/vexpress/Fedora-18-vexpress-xfce-armhfp/boot/vmlinux-3.6.10-8.fc18.armv7hl...done.
20:48:59 (gdb)
20:49:24 now, on another terminal, you want to start up gdb as before, but adding two additional parameters
20:49:58 oops, I mean not gdb but qemu on another terminal
20:50:03 so that should have been:
20:50:09 now, on another terminal, you want to start up qemu as before, but adding two additional parameters
20:50:33 one parameter will tell the qemu model to start up its internal gdb stub and listen for gdb to connect to it on local port 1234
20:50:46 the other parameter will tell qemu not to start executing until it is told to via gdb
20:50:57 those are the -s and -S parameters in this line:
20:51:10 qemu-system-arm -machine vexpress-a9 -m 1024 -nographic -net nic -net user -append "console=ttyAMA0,115200n8 rw root=/dev/mmcblk0p3 rootwait physmap.enabled=0" -kernel vmlinuz-3.6.10-8.fc18.armv7hl -initrd initramfs-3.6.10-8.fc18.armv7hl.img -sd ../Fedora-18-vexpress-xfce-armhfp.img -s -S
20:51:20 if I run that now:
20:51:32 [jcm@independence boot]$ qemu-system-arm -machine vexpress-a9 -m 1024 -nographic -net nic -net user -append "console=ttyAMA0,115200n8 rw root=/dev/mmcblk0p3 rootwait physmap.enabled=0" -kernel vmlinuz-3.6.10-8.fc18.armv7hl -initrd initramfs-3.6.10-8.fc18.armv7hl.img -sd ../Fedora-18-vexpress-xfce-armhfp.img -s -S
20:51:32 pulseaudio: set_sink_input_volume() failed
20:51:32 pulseaudio: Reason: Invalid argument
20:51:32 pulseaudio: set_sink_input_mute() failed
20:51:33 pulseaudio: Reason: Invalid argument
20:51:55 ok, now in the gdb window, I connect to that qemu using a special gdb command:
20:52:09 (gdb) target remote localhost:1234
20:52:10 Remote debugging using localhost:1234
20:52:10 0x60000000 in ?? ()
20:52:17 (gdb)
20:52:34 now I have connected gdb to the qemu gdb stub and gdb is controlling qemu
20:52:54 the qemu is not running until we tell gdb to start it
20:53:05 but first, let's set a breakpoint to stop execution later
20:53:13 the classical example is "start_kernel"
20:53:36 start_kernel is always the first generic high-level C code executed when a Linux architecture gets out of early init
20:53:39 so let's do that:
20:53:50 (gdb) break start_kernel
20:53:50 Breakpoint 1 at 0xc060a50c: file init/main.c, line 467.
20:53:57 and now, let's start qemu:
20:54:04 (gdb) c
20:54:04 Continuing.
20:54:05 Breakpoint 1, start_kernel () at init/main.c:467
20:54:05 warning: Source file is more recent than executable.
20:54:05 467 {
20:54:37 no output occurred because this is before the kernel outputs anything, but qemu did run the early kernel code
20:54:51 now, we can use regular gdb commands like "n" (next) to step over the kernel setup:
20:55:02 (gdb) n
20:55:02 476 smp_setup_processor_id();
20:55:02 (gdb)
20:55:02 482 boot_init_stack_canary();
20:55:03 (gdb)
20:55:03 484 cgroup_init_early();
20:55:05 (gdb)
20:55:07 486 local_irq_disable();
20:55:09 (gdb)
20:55:11 487 early_boot_irqs_disabled = true;
20:55:13 (gdb)
20:55:15 493 tick_init();
20:55:17 (gdb)
20:55:19 494 boot_cpu_init();
20:55:21 if I keep going, soon I will see output in my qemu windo
20:55:23 * window
20:56:09 once you get beyond rest_init gdb will seem like it is not stopping and the kernel keeps going
20:56:36 this is because at that point the kernel has begun running the scheduler and is in an entirely different code path (initramfs init)
20:56:49 but we can still ctrl-c in gdb and stop the kernel
20:56:57 ^C
20:56:58 Program received signal SIGINT, Interrupt.
20:56:58 0xc00ea2c8 in __mod_zone_page_state (zone=0xc06a2940,
20:56:58 item=NR_ACTIVE_ANON, delta=delta@entry=-1) at mm/vmstat.c:223
20:56:58 223 if (unlikely(x > t || x < -t)) {
20:56:58 (gdb)
20:57:08 in my case, it was well into boot when I did that
20:57:24 I can type "c" to continue booting as normal
20:57:36 Fedora release 18 (Spherical Cow)
20:57:36 Kernel 3.6.10-8.fc18.armv7hl on an armv7l (ttyAMA0)
20:57:37 localhost login:
20:57:43 now, how is this useful?
20:57:49 well, it is useful in several ways
20:58:13 1. You can now use gdb to stop the entire qemu model at any time to see how the kernel is working
20:58:24 2. You can step through kernel bootup to understand how Linux works
20:58:38 3. You can confirm what the kernel is doing at a given moment
20:59:02 When I told Paul recently that I knew the updated kernels for 3.7 were booting, this is because I ran them in this fashion
20:59:25 I can see that they boot, they just are unable (with a dtb) to see the serial interface to tell us
20:59:46 I will be using gdb and some kernel hackery experience to debug what is wrong with that, using this approach
21:00:05 You can also use gdb with openocd in place of qemu's gdb stub
21:00:21 if you have a hardware debugger like a Flyswatter from tincantools attached to e.g. a PandaBoard
21:00:36 then you can run openocd after attaching it, and it will listen on e.g. localhost:1234
21:00:54 and then you can connect from gdb in the same way to debug physical hardware just like you debug a qemu model
21:01:03 I can show that another time
21:01:15 but for now, I think I have covered the introduction I wanted to cover here
21:01:24 are there questions, comments, suggestions? Was this helpful?
21:01:53 excellent Jon, thanks for doing this, it was very helpful
21:02:25 * j_dulaney thought so
21:02:33 well, good. If there are questions I will answer them here or in #fedora-arm
21:02:36 * j_dulaney will poke at it some more
21:02:42 Roger
21:02:44 but if there are no specific questions at this moment, I will end this
21:02:50 ack
21:02:52 going once...
21:03:01 going twice...
21:03:07 #endmeeting
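(Recap of the two-terminal debug session, as a shell sketch that only prints the two command lines. The qemu options and the gdb commands are the ones used in the log; the GDB and VMLINUX defaults and the qemu_cmd/gdb_cmd helper names are assumptions for illustration. Passing the gdb commands with -ex is a convenience not shown in the talk, where they were typed at the (gdb) prompt.)

```shell
#!/bin/sh
# Sketch only: prints the qemu and gdb command lines from the talk.
# GDB, VMLINUX and the helper names are illustrative assumptions.
KVER=3.6.10-8.fc18.armv7hl
GDB=${GDB:-arm-linux-gnueabihf-gdb}   # e.g. the Linaro binary under /data/toolchains
VMLINUX=${VMLINUX:-$HOME/vexpress/Fedora-18-vexpress-xfce-armhfp/boot/vmlinux-$KVER}
IMAGE=../Fedora-18-vexpress-xfce-armhfp.img
APPEND='console=ttyAMA0,115200n8 rw root=/dev/mmcblk0p3 rootwait physmap.enabled=0'

# Terminal 1, run from the image's boot directory:
#   -s  starts qemu's gdb stub listening on TCP port 1234
#   -S  keeps the emulated CPU halted until gdb tells it to continue
qemu_cmd() {
    printf '%s\n' "qemu-system-arm -machine vexpress-a9 -m 1024 -nographic -net nic -net user -append \"$APPEND\" -kernel vmlinuz-$KVER -initrd initramfs-$KVER.img -sd $IMAGE -s -S"
}

# Terminal 2, run from the prepped kernel source tree so gdb can find the sources:
#   each -ex runs one gdb command once the symbols are loaded
gdb_cmd() {
    printf '%s\n' "$GDB $VMLINUX -ex 'target remote localhost:1234' -ex 'break start_kernel' -ex 'continue'"
}

qemu_cmd
gdb_cmd
```

(Start the qemu line first; the gdb line then attaches, sets the start_kernel breakpoint, and lets the model run until it is hit.)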