How to use the ObjDump tool with x86
Having access to an analysis tool when dealing with compiled executables is always useful. In Linux, ObjDump is one such tool, which can be used to extract information from object files.
This article provides an overview of various ObjDump command-line options and their use. We will take a simple Hello World program written in x86 assembly as our target program and run ObjDump against it.
See the previous article in the series, Debugging your first x86 program.
What is ObjDump?
ObjDump is a utility for extracting information from object files. It is part of GNU Binutils and comes pre-installed on most Linux distributions. Running objdump without any arguments prints the available options.
$ objdump
Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
      --disassemble=<sym>  Display assembler contents from <sym>
  -S, --source             Intermix source code with disassembly
      --source-comment[=<txt>] Prefix lines of source code with <txt>
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRtUuTgAckK] or
  --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
          =frames-interp,=str,=loc,=Ranges,=pubtypes,
          =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
          =addr,=cu_index,=links,=follow-links]
                           Display DWARF info in the file
  --ctf=SECTION            Display CTF info from SECTION
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information
How to extract assembly code
ObjDump can be used to extract assembly code from an already-built binary. Let us begin by going through the following assembly program, which serves as the target for the rest of this article.
helloworld.nasm
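section .text
    global _start

_start:
    mov edx,len     ; number of bytes to write
    mov ecx,msg     ; address of the message
    mov ebx,1       ; file descriptor 1 (stdout)
    mov eax,4       ; system call number 4 (sys_write)
    int 0x80        ; invoke the kernel

    mov eax, 1      ; system call number 1 (sys_exit)
    mov ebx, 0      ; exit status 0
    int 0x80        ; invoke the kernel

section .rodata
    msg db 'Hello, world!',0xa
    len equ $ - msg ; length of msg in bytes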
As we can notice in the preceding program, the assembly code is written in the .text section. So, we can use ObjDump to extract the .text section from the object file produced by the assembler. It is also possible to extract other sections such as .rodata. In the next few sections, we will discuss how this can be done.
Displaying header contents using ObjDump
To display the header contents from a binary, we can use the -x flag as shown below.
$ objdump -x helloworld
helloworld:     file format elf32-i386
helloworld
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x08049000

Program Header:
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x00000094 memsz 0x00000094 flags r--
    LOAD off    0x00001000 vaddr 0x08049000 paddr 0x08049000 align 2**12
         filesz 0x00000022 memsz 0x00000022 flags r-x
    LOAD off    0x00002000 vaddr 0x0804a000 paddr 0x0804a000 align 2**12
         filesz 0x0000000e memsz 0x0000000e flags r--

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000022  08049000  08049000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000000e  0804a000  0804a000  00002000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
08049000 l    d  .text    00000000 .text
0804a000 l    d  .rodata  00000000 .rodata
00000000 l    df *ABS*    00000000 helloworld.nasm
0804a000 l       .rodata  00000000 msg
0000000e l       *ABS*    00000000 len
08049000 g       .text    00000000 _start
0804b00e g       .rodata  00000000 __bss_start
0804b00e g       .rodata  00000000 _edata
0804b010 g       .rodata  00000000 _end
As mentioned earlier, we used the Hello World program as our target. The preceding output shows the extracted header information: the metadata of the ELF binary (file format, architecture, flags and start address), the program headers, the sections available in the binary (.text and .rodata) and the symbol table.
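When many binaries need to be inspected, the Sections table is often easier to work with once it has been turned into structured data. The following is a minimal Python sketch of that idea, not a definitive parser. Python is used only to keep the example self-contained, the sample text is copied from the output above, and parse_sections is a hypothetical helper that is not part of objdump.

```python
# Sketch: parse the "Sections:" table printed by `objdump -x` (or `objdump -h`).
# SAMPLE is copied from the helloworld output; parse_sections is a
# hypothetical helper written for this illustration.

SAMPLE = """\
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000022  08049000  08049000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       0000000e  0804a000  0804a000  00002000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
"""


def parse_sections(text):
    sections = []
    for line in text.splitlines():
        fields = line.split()
        # A section row has seven fields and starts with a numeric index.
        if len(fields) == 7 and fields[0].isdigit():
            _, name, size, vma, _lma, offset, _align = fields
            sections.append({
                "name": name,
                "size": int(size, 16),
                "vma": int(vma, 16),
                "file_offset": int(offset, 16),
                "flags": [],
            })
        elif sections and fields:
            # The row after a section row lists that section's flags.
            sections[-1]["flags"] = [f.strip() for f in line.split(",")]
    return sections


if __name__ == "__main__":
    for s in parse_sections(SAMPLE):
        print(f'{s["name"]:8} size={s["size"]:3} bytes  '
              f'vma={s["vma"]:#010x}  {", ".join(s["flags"])}')
```

Note that the .rodata size of 0xe bytes matches the value of the len symbol in the symbol table, since len is defined as the length of msg.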
Displaying assembler contents of executable sections using ObjDump
As discussed earlier, the assembly code of a pre-built binary lives in its .text section. The -d flag disassembles the executable sections of the binary, as shown in the following excerpt.
Assembler contents from text section:
$ objdump -d helloworld
helloworld:     file format elf32-i386

Disassembly of section .text:

08049000 <_start>:
 8049000:  ba 0e 00 00 00    mov    $0xe,%edx
 8049005:  b9 00 a0 04 08    mov    $0x804a000,%ecx
 804900a:  bb 01 00 00 00    mov    $0x1,%ebx
 804900f:  b8 04 00 00 00    mov    $0x4,%eax
 8049014:  cd 80             int    $0x80
 8049016:  b8 01 00 00 00    mov    $0x1,%eax
 804901b:  bb 00 00 00 00    mov    $0x0,%ebx
 8049020:  cd 80             int    $0x80
As we can see in the preceding excerpt, the assembly code is shown in AT&T syntax, which is ObjDump's default for x86. The output syntax can be changed with the -M flag, as shown below.
Assembler contents from text section in Intel assembly syntax:
$ objdump -d helloworld -M intel
helloworld:     file format elf32-i386

Disassembly of section .text:

08049000 <_start>:
 8049000:  ba 0e 00 00 00    mov    edx,0xe
 8049005:  b9 00 a0 04 08    mov    ecx,0x804a000
 804900a:  bb 01 00 00 00    mov    ebx,0x1
 804900f:  b8 04 00 00 00    mov    eax,0x4
 8049014:  cd 80             int    0x80
 8049016:  b8 01 00 00 00    mov    eax,0x1
 804901b:  bb 00 00 00 00    mov    ebx,0x0
 8049020:  cd 80             int    0x80
As we can observe, the output is now in Intel syntax. Similarly, if we want to explicitly request AT&T syntax, it can be done as follows.
Assembler contents from text section in AT&T assembly syntax:
$ objdump -d helloworld -M att
helloworld:     file format elf32-i386

Disassembly of section .text:

08049000 <_start>:
 8049000:  ba 0e 00 00 00    mov    $0xe,%edx
 8049005:  b9 00 a0 04 08    mov    $0x804a000,%ecx
 804900a:  bb 01 00 00 00    mov    $0x1,%ebx
 804900f:  b8 04 00 00 00    mov    $0x4,%eax
 8049014:  cd 80             int    $0x80
 8049016:  b8 01 00 00 00    mov    $0x1,%eax
 804901b:  bb 00 00 00 00    mov    $0x0,%ebx
 8049020:  cd 80             int    $0x80
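The two syntaxes are only different renderings of the same machine code, which is why the byte column is identical in both listings. The following Python sketch illustrates this for the two instruction forms that helloworld uses. It is a toy under stated assumptions, not how ObjDump is implemented: it handles only mov r32, imm32 (opcodes 0xb8 to 0xbf) and int imm8 (opcode 0xcd), and decode and disassemble are hypothetical helper names.

```python
# Sketch: decode the .text bytes of helloworld and render each instruction
# in both Intel and AT&T syntax. Only two instruction forms are supported.

# Register encoded in the low three bits of the 0xb8-0xbf opcodes.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode(code):
    """Return (length, intel_text, att_text) for the instruction at code[0]."""
    op = code[0]
    if 0xB8 <= op <= 0xBF:
        reg = REGS[op - 0xB8]
        # The 32-bit immediate is stored little-endian after the opcode.
        imm = int.from_bytes(code[1:5], "little")
        return 5, f"mov {reg},{imm:#x}", f"mov ${imm:#x},%{reg}"
    if op == 0xCD:
        imm = code[1]
        return 2, f"int {imm:#x}", f"int ${imm:#x}"
    raise ValueError(f"unsupported opcode {op:#x}")


# The 0x22 bytes of .text, copied from the byte column of the listing.
TEXT = bytes.fromhex(
    "ba0e000000" "b900a00408" "bb01000000" "b804000000" "cd80"
    "b801000000" "bb00000000" "cd80"
)


def disassemble(code, base=0x08049000):
    """Walk the byte string and return (address, intel, att) tuples."""
    rows, pos = [], 0
    while pos < len(code):
        length, intel, att = decode(code[pos:])
        rows.append((base + pos, intel, att))
        pos += length
    return rows


if __name__ == "__main__":
    for addr, intel, att in disassemble(TEXT):
        print(f"{addr:x}: {intel:18} | {att}")
```

Intel syntax puts the destination first and writes immediates bare, while AT&T syntax puts the source first, prefixes immediates with $ and registers with %.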
If we want to display assembler code from all sections, we can use the -D flag.
Assembler contents from all sections in Intel assembly syntax:
$ objdump -D helloworld -M intel
helloworld:     file format elf32-i386

Disassembly of section .text:

08049000 <_start>:
 8049000:  ba 0e 00 00 00    mov    edx,0xe
 8049005:  b9 00 a0 04 08    mov    ecx,0x804a000
 804900a:  bb 01 00 00 00    mov    ebx,0x1
 804900f:  b8 04 00 00 00    mov    eax,0x4
 8049014:  cd 80             int    0x80
 8049016:  b8 01 00 00 00    mov    eax,0x1
 804901b:  bb 00 00 00 00    mov    ebx,0x0
 8049020:  cd 80             int    0x80

Disassembly of section .rodata:

0804a000 <msg>:
 804a000:  48                dec    eax
 804a001:  65 6c             gs ins BYTE PTR es:[edi],dx
 804a003:  6c                ins    BYTE PTR es:[edi],dx
 804a004:  6f                outs   dx,DWORD PTR ds:[esi]
 804a005:  2c 20             sub    al,0x20
 804a007:  77 6f             ja     804a078 <msg+0x78>
 804a009:  72 6c             jb     804a077 <msg+0x77>
 804a00b:  64 21 0a          and    DWORD PTR fs:[edx],ecx
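Note that the instructions listed under .rodata are not real code. The -D flag decodes every section as if it contained instructions, so the bytes of the message string are displayed as opcodes. Similarly, we can produce the same output in AT&T syntax by passing -M att or by omitting the -M flag.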
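One way to sanity-check the .rodata listing is to read its byte column back as data rather than as opcodes. The following Python sketch does that. Python is used only to keep the example self-contained, the bytes are copied from the output above, and the variable names are illustrative.

```python
# Sketch: interpret the byte column of the .rodata listing as ASCII text.
RODATA = bytes.fromhex("48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 0a")

message = RODATA.decode("ascii")

print(repr(message))     # 'Hello, world!\n'
print(hex(len(RODATA)))  # 0xe
```

The decoded text is the msg string from helloworld.nasm, and its length of 0xe bytes is the value loaded into edx by the first instruction of _start.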
Displaying debug information using ObjDump
We can use the -g flag of ObjDump to display the debug information from a binary. The following excerpt shows the output for a compiled C program named jump.
$ objdump -g jump
jump:     file format elf64-x86-64

Contents of the .eh_frame section (loaded from jump):

00000000 0000000000000014 00000000 CIE
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data:     1b
  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8
  DW_CFA_nop
  DW_CFA_nop

[REDACTED FOR BREVITY]
Displaying contents of the symbol table using ObjDump
According to Oracle docs, “The symbol table contains information to locate and relocate symbolic definitions and references. The assembler creates the symbol table section for the object file. It makes an entry in the symbol table for each symbol that is defined or referenced in the input file and is needed during linking. The symbol table is then used by the link editor during relocation”.
ObjDump’s -t flag can be used to display the symbol table of an executable.
$ objdump -t helloworld -M intel
helloworld:     file format elf32-i386

SYMBOL TABLE:
08049000 l    d  .text    00000000 .text
0804a000 l    d  .rodata  00000000 .rodata
00000000 l    df *ABS*    00000000 helloworld.nasm
0804a000 l       .rodata  00000000 msg
0000000e l       *ABS*    00000000 len
08049000 g       .text    00000000 _start
0804b00e g       .rodata  00000000 __bss_start
0804b00e g       .rodata  00000000 _edata
0804b010 g       .rodata  00000000 _end
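As we can observe in the preceding excerpt, each symbol defined in the program (msg, len and _start) appears in the symbol table, alongside section, file and linker-generated symbols such as __bss_start, _edata and _end. Each row shows the symbol's value, its flags (l for local, g for global, d for debugging, f for file), the section it belongs to, its size and its name.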
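The symbol table output can also be split into fields programmatically, for example to list only the global symbols. The following Python sketch shows one possible way to do this. It assumes that symbol names contain no whitespace, the sample text is copied from the output above, and parse_symbols is a hypothetical helper that is not part of objdump.

```python
# Sketch: split `objdump -t` rows into value, flags, section, size and name.
# Assumption: symbol names contain no whitespace.

SAMPLE = """\
SYMBOL TABLE:
08049000 l    d  .text    00000000 .text
0804a000 l    d  .rodata  00000000 .rodata
00000000 l    df *ABS*    00000000 helloworld.nasm
0804a000 l       .rodata  00000000 msg
0000000e l       *ABS*    00000000 len
08049000 g       .text    00000000 _start
0804b00e g       .rodata  00000000 __bss_start
0804b00e g       .rodata  00000000 _edata
0804b010 g       .rodata  00000000 _end
"""


def parse_symbols(text):
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank lines and the "SYMBOL TABLE:" header
        symbols.append({
            "value": int(parts[0], 16),
            # Everything between the value and the section is flag characters.
            "flags": "".join(parts[1:-3]),
            "section": parts[-3],
            "size": int(parts[-2], 16),
            "name": parts[-1],
        })
    return symbols


if __name__ == "__main__":
    for sym in parse_symbols(SAMPLE):
        if sym["flags"].startswith("g"):
            print(f'{sym["value"]:08x} {sym["section"]:8} {sym["name"]}')
```

Filtering on the first flag character separates global symbols such as _start from local ones such as msg and len.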
See the next article in the series, How to diagnose and locate segmentation faults in x86 assembly.
Sources
- Symbol tables, Oracle
- Assembly Language for x86 Processors, Kip Irvine
- Modern X86 Assembly Language Programming, Daniel Kusswurm
- Linux Assembly Language Programming, Bob Neveln