Complete Tour of PE and ELF: Structure
Now that we have covered the PE structure, it is time to look at the ELF structure, which is somewhat easier to understand than PE. For ELF, we will look at both the linking view and the execution view of a binary.
Sections are similar to what we saw in the PE structure, such as .text and .rodata. These sections are merged into unnamed segments, which the OS loader picks up and maps into memory.
So let's go through the complete ELF structure.
Before we begin, it is important to note that in ELF, fields ending in addr (for example p_vaddr) refer to a virtual address, whereas fields ending in off or offset (for example e_phoff, p_offset) refer to a file offset.
Important fields for us here are:
- e_ident: Identifies the file as an ELF file. The field begins with the byte 0x7f, followed by the characters E, L, F.
- e_type: This is similar to the Characteristics field in PE. It indicates whether the binary is a relocatable file (ET_REL), an executable (ET_EXEC), a shared object (ET_DYN) or a core dump (ET_CORE).
- e_entry: This is the entry point of execution for the binary. Unlike PE, in ELF we do not have to worry about TLS callbacks (as there are none). This address is a virtual address, not an RVA.
- e_phoff: This field tells us about the program header offset
- e_shoff: This field tells us about the section header offset
- e_phnum: Number of program headers
- e_shnum: Number of section headers
- e_shstrndx: Index, within the section header table located at e_shoff, of the section that holds the string table with the names of all the sections.
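To make the field list concrete, here is a minimal sketch that pulls these fields out of the first 64 bytes of a file. It assumes a 64-bit little-endian ELF (the Elf64_Ehdr layout); the function name parse_elf64_header and the returned dict are illustrative choices, not part of any tool used in this article.

```python
import struct

# Elf64_Ehdr for a little-endian file: e_ident[16], e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
# e_phnum, e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
EHDR_FORMAT = "<16sHHIQQQIHHHHHH"
ET_NAMES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def parse_elf64_header(data):
    """Return the ELF header fields discussed above as a dict (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the encoding (1 = little-endian).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    (_ident, e_type, _machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from(EHDR_FORMAT, data, 0)
    return {
        "e_type": ET_NAMES.get(e_type, hex(e_type)),
        "e_entry": e_entry,
        "e_phoff": e_phoff,
        "e_shoff": e_shoff,
        "e_phnum": e_phnum,
        "e_shnum": e_shnum,
        "e_shstrndx": e_shstrndx,
    }
```

To try it on the sample file, read the first 64 bytes of hello in binary mode and pass them to parse_elf64_header; the values should agree with what readelf reports for the same file.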
Below is the output from a hex editor for a sample file:
- hexedit hello
Using readelf we can view the complete ELF header:
- readelf -h hello
The program header table defines the segments (chunks of the file).
Important fields that we care about are:
- p_type: The type of the segment. The values that matter here are:
  - PT_LOAD: Data from the file which will be mapped to memory
  - PT_DYNAMIC: It is similar to the IAT in PE. Holds information about bindings for dynamic linkage.
  - PT_INTERP: Names the interpreter (dynamic linker) that is loaded into memory.
- p_flags: The permissions of the segment (Read/Write/Execute), represented by the flags PF_R, PF_W and PF_X
- p_offset: The file offset of the data which will be mapped to memory (similar to PointerToRawData in PE)
- p_vaddr: The virtual address where the segment is mapped into memory (VirtualAddress in PE)
- p_filesz: Total size of the data from the file that will be mapped to memory (SizeOfRawData in PE)
- p_memsz: Size of the segment once mapped in memory (VirtualSize in PE)
An important point to note here: in PE, the size in the file can be greater than the size in memory, but in ELF this is not the case because no such padding is done. So in ELF, memory size >= file size. Recall from PE why the memory size can be greater than the file size (hint: .bss).
- p_align: How the segment is aligned in memory (similar to SectionAlignment in PE)
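The program header fields can be read the same way. The sketch below assumes the 64-bit little-endian Elf64_Phdr layout (56 bytes per entry) and takes e_phoff and e_phnum from the ELF header as arguments; parse_program_headers and flags_to_str are hypothetical helper names used only for illustration.

```python
import struct

# Elf64_Phdr for a little-endian file: p_type, p_flags, p_offset, p_vaddr,
# p_paddr, p_filesz, p_memsz, p_align (56 bytes in total).
PHDR_FORMAT = "<IIQQQQQQ"
PT_NAMES = {0: "PT_NULL", 1: "PT_LOAD", 2: "PT_DYNAMIC", 3: "PT_INTERP"}
PF_X, PF_W, PF_R = 1, 2, 4


def flags_to_str(p_flags):
    """Render p_flags as a string such as 'R-X' or 'RW-'."""
    pairs = (("R", PF_R), ("W", PF_W), ("X", PF_X))
    return "".join(ch if p_flags & bit else "-" for ch, bit in pairs)


def parse_program_headers(data, e_phoff, e_phnum, e_phentsize=56):
    """Return one dict per program header (hypothetical helper)."""
    segments = []
    for i in range(e_phnum):
        fields = struct.unpack_from(PHDR_FORMAT, data, e_phoff + i * e_phentsize)
        (p_type, p_flags, p_offset, p_vaddr,
         _p_paddr, p_filesz, p_memsz, p_align) = fields
        segments.append({
            "type": PT_NAMES.get(p_type, hex(p_type)),
            "flags": flags_to_str(p_flags),
            "p_offset": p_offset,
            "p_vaddr": p_vaddr,
            "p_filesz": p_filesz,
            "p_memsz": p_memsz,
            "p_align": p_align,
        })
    return segments
```

For a PT_LOAD segment, the difference p_memsz - p_filesz is the part that exists only in memory and is zero-filled, which is where .bss lives.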
The output below shows the program headers:
- readelf -l hello
Note the gap between the two LOAD segments (the first ends at 0x6fc and the second begins at 0xe10). We will discuss this gap later in this article.
The output below shows which sections are mapped to which segment. The number of segments is equal to the number of program headers (e_phnum) in the ELF header. For example, the output says that segment 00 contains no section, segment 01 maps to the .interp section, and so on.
As stated above, whatever is in a PT_LOAD segment gets mapped into memory. It turns out that the OS loader maps the file in page-sized units of 0x1000 bytes. This means that some or all of the gap between the two LOAD segments will also be mapped, as long as it falls inside one of those 0x1000-byte pages (with boundaries at 0x1000, 0x2000, 0x3000, etc.).
If we look at the program header output above, the first LOAD segment ends at 0x6fc and the second LOAD segment begins at 0xe10. The gap between the two LOAD segments ends up in memory as part of those 0x1000-byte chunks.
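The page rounding can be worked out with a few lines of arithmetic. This is a minimal sketch that assumes a 0x1000-byte page and assumes the first LOAD segment starts at file offset 0 with a file size of 0x6fc and a base virtual address of 0x400000; the helper names are illustrative.

```python
PAGE_SIZE = 0x1000  # assumed page size


def mapped_file_range(p_offset, p_filesz, page_size=PAGE_SIZE):
    """File offsets [start, end) covered once the segment is widened to whole pages."""
    start = p_offset & ~(page_size - 1)                            # round down
    end = (p_offset + p_filesz + page_size - 1) & ~(page_size - 1)  # round up
    return start, end


def file_offset_to_vaddr(file_off, p_offset, p_vaddr):
    """Virtual address at which a file offset appears for a given segment."""
    return p_vaddr + (file_off - p_offset)


# First LOAD segment (assumed values): offset 0x0, file size 0x6fc, vaddr 0x400000.
start, end = mapped_file_range(0x0, 0x6fc)
print(hex(start), hex(end))
print(hex(file_offset_to_vaddr(0x700, 0x0, 0x400000)))
```

Under these assumptions the segment covers file offsets 0x0 up to 0x1000, so an offset such as 0x700 in the gap lands inside the mapped page even though the segment data itself stops at 0x6fc.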
To illustrate this, I will pick an offset, say 0x700, which lies between these two segments, and change some of the bytes there.
- hexedit hello
- Navigate to 0x700
- Enter '42', '43', '44', '45', '46', '47', '48', '49' to replace the bytes with the characters B, C, D, E, F, G, H, I
The screenshot below shows the same. Save the file to overwrite the existing bytes.
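The same edit can also be scripted instead of being typed into hexedit. The following is a minimal sketch; patch_bytes is a hypothetical helper, and it overwrites the file in place, so it is best run on a copy of the binary.

```python
def patch_bytes(path, offset, new_bytes):
    """Overwrite len(new_bytes) bytes at `offset` in place and return the old bytes."""
    with open(path, "r+b") as f:
        f.seek(offset)
        old = f.read(len(new_bytes))
        # Refuse to grow the file: the patch must stay inside existing data.
        if len(old) != len(new_bytes):
            raise ValueError("patch would extend past the end of the file")
        f.seek(offset)
        f.write(new_bytes)
    return old
```

For the example in this article, the call would be patch_bytes("hello", 0x700, b"BCDEFGHI"), which writes the bytes 0x42 through 0x49 at file offset 0x700.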
After that, open the file in gdb to see whether these bytes were mapped into memory.
- gdb hello
- b main (set a breakpoint at main)
- r (run hello)
- x/8c 0x400700 (because 0x400000 is the virtual address of this PT_LOAD segment)
Below we can see B, C, D, E, F, G, H, I mapped into memory at 0x400700.
This shows that even though the LOAD segment ends at 0x6fc, the OS loader maps whole 0x1000-byte chunks starting from the beginning of the segment.
So far we have covered the following things from ELF structure.
In the next article, we will wrap up ELF structure and will also see some interesting concepts with packers.