Checking if PAE is Enabled

This was discussed in the first part of this tutorial; please review it before proceeding.

Getting the Virtual Address

Next, we compile the program that we’ll debug and run it on Windows. Because we coded the “int 3” software interrupt into the C++ code, execution stops as soon as that instruction is reached: the interrupt is raised and WinDbg pauses the entire Windows operating system. WinDbg then reports a “break instruction exception,” as can be seen in the picture below:

Let’s now list the entire assembly function that corresponds to the C++ code. The output is shown below. The first instruction is the “int 3” instruction that stopped Windows and invoked the debugger. From here we can use the WinDbg t command to step into instructions, or the p command to step over them, one at a time.

kd> u 0x004113ae L40
virtualphysical+0x113ae:
004113ae cc              int     3
004113af 6a04            push    4
004113b1 e8c1fdffff      call    virtualphysical+0x11177 (00411177)
004113b6 83c404          add     esp,4
004113b9 898520ffffff    mov     dword ptr [ebp-0E0h],eax
004113bf 83bd20ffffff00  cmp     dword ptr [ebp-0E0h],0
004113c6 741a            je      virtualphysical+0x113e2 (004113e2)
004113c8 8b8520ffffff    mov     eax,dword ptr [ebp-0E0h]
004113ce c70000000000    mov     dword ptr [eax],0
004113d4 8b8d20ffffff    mov     ecx,dword ptr [ebp-0E0h]
004113da 898d18ffffff    mov     dword ptr [ebp-0E8h],ecx
004113e0 eb0a            jmp     virtualphysical+0x113ec (004113ec)
004113e2 c78518ffffff00000000 mov dword ptr [ebp-0E8h],0
004113ec 8b9518ffffff    mov     edx,dword ptr [ebp-0E8h]
004113f2 8955ec          mov     dword ptr [ebp-14h],edx
004113f5 c745f80a000000  mov     dword ptr [ebp-8],0Ah
004113fc 8b45ec          mov     eax,dword ptr [ebp-14h]
004113ff c70014000000    mov     dword ptr [eax],14h
00411405 8bf4            mov     esi,esp
00411407 8b45f8          mov     eax,dword ptr [ebp-8]
0041140a 50              push    eax
0041140b 68b0574100      push    offset virtualphysical+0x157b0 (004157b0)
00411410 ff15b8824100    call    dword ptr [virtualphysical+0x182b8 (004182b8)]
00411416 83c408          add     esp,8
00411419 3bf4            cmp     esi,esp
0041141b e816fdffff      call    virtualphysical+0x11136 (00411136)
00411420 8bf4            mov     esi,esp
00411422 8b45ec          mov     eax,dword ptr [ebp-14h]
00411425 8b08            mov     ecx,dword ptr [eax]
00411427 51              push    ecx
00411428 683c574100      push    offset virtualphysical+0x1573c (0041573c)
0041142d ff15b8824100    call    dword ptr [virtualphysical+0x182b8 (004182b8)]
00411433 83c408          add     esp,8
00411436 3bf4            cmp     esi,esp
00411438 e8f9fcffff      call    virtualphysical+0x11136 (00411136)
0041143d 8bf4            mov     esi,esp
0041143f ff15bc824100    call    dword ptr [virtualphysical+0x182bc (004182bc)]
00411445 3bf4            cmp     esi,esp
00411447 e8eafcffff      call    virtualphysical+0x11136 (00411136)
0041144c 33c0            xor     eax,eax
0041144e 5f              pop     edi
0041144f 5e              pop     esi
00411450 5b              pop     ebx
00411451 81c4e8000000    add     esp,0E8h
00411457 3bec            cmp     ebp,esp
00411459 e8d8fcffff      call    virtualphysical+0x11136 (00411136)
0041145e 8be5            mov     esp,ebp
00411460 5d              pop     ebp
00411461 c3              ret

We’re particularly interested in the virtual addresses used for the variables x and y in the C++ code. The instruction at 0x004113f5 stores the constant 0xA into x at [ebp-8], and the instruction at 0x004113ff stores the constant 0x14 to y through the pointer held in [ebp-14h]. We could step through the program one instruction at a time with the p or t command, but it is quicker to set two breakpoints on the interesting addresses with the bp command:

kd> bp 004113f5
kd> bp 004113ff
kd> bl
 0 e 004113f5     0001 (0001) virtualphysical+0x113f5
 1 e 004113ff     0001 (0001) virtualphysical+0x113ff

The first two commands set breakpoints at 0x004113f5 and 0x004113ff, and the third command, bl, lists all breakpoints that are currently set, which are the two we’ve just created. We can then use the g command to resume the program, and breakpoint 0 is hit, as can be seen in the picture below:

Along with the breakpoint ID and its address, WinDbg displays the instruction that is about to be executed. That instruction writes to [ebp-8], so we compute the target address by reading the EBP register, subtracting 8 from it, and dumping the memory at the result. Let’s first display the value of EBP and dump the memory:

The variable x is located at address 0x0012ff60, and its content is 0xcccccccc, the pattern that debug builds use to fill uninitialized stack memory. After executing the instruction that stores 0xA at that address, the content should be 0x0000000a, as can be seen in the picture below:

We now have our first virtual address, 0x0012ff60. The same steps can be repeated for the y variable. All of the commands can be seen in the picture below:

The value 0x14 was written to address 0x00345988, which is the second address we’re interested in. We now have both of the virtual addresses that we wanted, summarized below:

  • variable x: 0x0012ff60
  • variable y: 0x00345988
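
The address arithmetic behind the first breakpoint can be sketched in C++. This is only an illustration: the helper name frame_slot is made up for this sketch, and the EBP value 0x0012ff68 is an assumption inferred from the reported address of x (0x0012ff60 = EBP - 8), not a value read from the debugger output.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the tutorial's program): resolves an
// operand such as [ebp-8] to the virtual address it refers to.
constexpr std::uint32_t frame_slot(std::uint32_t ebp, std::int32_t displacement) {
    // Unsigned addition wraps modulo 2^32, which is what the CPU does as well.
    return ebp + static_cast<std::uint32_t>(displacement);
}

// Assumed value, inferred from x living at 0x0012ff60 = EBP - 8.
constexpr std::uint32_t kAssumedEbp = 0x0012ff68;

// [ebp-8] is the slot of x.
static_assert(frame_slot(kAssumedEbp, -8) == 0x0012ff60, "x is at [ebp-8]");
// [ebp-14h] is the slot that holds the pointer used to write 0x14 to y.
static_assert(frame_slot(kAssumedEbp, -0x14) == 0x0012ff54, "pointer is at [ebp-14h]");
```

In WinDbg itself the same calculation can be done with the expression evaluator, for example "? ebp-8", followed by "dd ebp-8 L1" to dump the value stored there.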

Getting the Linear Address

The first thing we must do is figure out which segment the address belongs to. The variable x, whose address is 0x0012ff60, is clearly in the stack segment, since it is a local variable allocated on the stack.

Let’s print the values of all the segment registers with the r command. The results can be seen in the picture below:

The segment registers are 16 bits wide, so their values are the following:

  • SS (Stack Segment) : 0x0023
  • CS (Code Segment) : 0x001b
  • DS (Data Segment) : 0x0023
  • ES (Extra Segment) : 0x0023
  • GS (Data Segment) : 0x0000
  • FS (Data Segment) : 0x003b

Right now you may be confused: if segmentation is in use, why do the segment registers hold the same values no matter which process is being debugged? Shouldn’t every process have its own segment register values? After all, code and data are not shared between processes. (Shared libraries are an exception, but right now we’re talking about executables, not DLL files.)

The reason is actually very simple. Take the stack segment register as an example: 0x0023 in binary form is 0000 0000 0010 0011. The two least significant bits hold the requested privilege level, which is used for protection, but we won’t go into that right now. The third bit, the table indicator, declares whether the descriptor should be looked up in the global or the local descriptor table. In this case the third bit is 0, which means that the descriptor is located in the global descriptor table (GDT). The remaining 13 bits form an index into the GDT that selects the descriptor to use: those bits are 0000000000100, which is 0x4 in hexadecimal form.

Applying the same decoding to every segment register gives the following GDT indexes:

  • SS: 0x4
  • CS: 0x3
  • DS: 0x4
  • ES: 0x4
  • GS: 0x0
  • FS: 0x7
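
The decoding above can be written down as a small C++ sketch. The struct and function names are invented for this illustration; only the bit layout (two privilege bits, one table-indicator bit, 13 index bits) and the selector values come from the text.

```cpp
#include <cstdint>

// Fields of a 16-bit x86 segment selector.
struct Selector {
    std::uint16_t index;  // bits 3..15: index into the descriptor table
    bool          local;  // bit 2: false = GDT, true = LDT
    std::uint8_t  rpl;    // bits 0..1: requested privilege level
};

// Hypothetical helper: splits a raw selector value into its three fields.
constexpr Selector decode_selector(std::uint16_t raw) {
    return Selector{
        static_cast<std::uint16_t>(raw >> 3),
        (raw & 0x4) != 0,
        static_cast<std::uint8_t>(raw & 0x3)
    };
}

// Values of the segment registers listed earlier.
static_assert(decode_selector(0x0023).index == 0x4, "SS, DS and ES");
static_assert(decode_selector(0x001b).index == 0x3, "CS");
static_assert(decode_selector(0x003b).index == 0x7, "FS");
static_assert(decode_selector(0x0000).index == 0x0, "GS");
static_assert(!decode_selector(0x0023).local, "SS refers to the GDT");
```

Note that the index is simply the selector shifted right by three bits, which is why every process can share the same selector values: the selector only names a slot in the table, not an address.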

Let’s also use the “dg 0 40” command to print the first part of the GDT, which can be seen in the picture below:

Let’s take the virtual addresses that we obtained in the previous step and convert them to linear addresses. A virtual address is converted to a linear address by taking the base address from the GDT descriptor selected by the relevant segment register and adding the virtual address to it. The first five segment descriptors span the entire linear address space and have a base address of 0x00000000. The virtual addresses are 0x0012ff60 and 0x00345988, so the translation is:

  • variable x : 0x00000000 + 0x0012ff60 = 0x0012ff60
  • variable y : 0x00000000 + 0x00345988 = 0x00345988
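
As a rough C++ sketch of this translation: the base address is taken from the descriptor and the virtual address is added to it. The function names are invented for this illustration, and the raw descriptor value kFlatData is an assumed example of a flat segment (base 0, limit 0xfffff), not a value copied from the dg output.

```cpp
#include <cstdint>

// Extracts the 32-bit base address from a raw 8-byte segment descriptor.
// The base is split across bits 16..39 (low 24 bits) and bits 56..63 (high 8 bits).
constexpr std::uint32_t descriptor_base(std::uint64_t descriptor) {
    return static_cast<std::uint32_t>(((descriptor >> 16) & 0xFFFFFFu) |
                                      (((descriptor >> 56) & 0xFFu) << 24));
}

// Linear address = segment base + virtual address (the offset in the segment).
constexpr std::uint32_t to_linear(std::uint32_t base, std::uint32_t offset) {
    return base + offset;
}

// Assumed raw value of a flat data descriptor, used only for illustration.
constexpr std::uint64_t kFlatData = 0x00cff3000000ffffULL;

static_assert(descriptor_base(kFlatData) == 0x00000000, "flat segment starts at 0");
static_assert(to_linear(descriptor_base(kFlatData), 0x0012ff60) == 0x0012ff60, "x");
static_assert(to_linear(descriptor_base(kFlatData), 0x00345988) == 0x00345988, "y");
```

With a base of zero the addition is a no-op, which is exactly the result shown in the list above.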

This shows that segmentation is in use, because in protected mode it cannot be turned off, but with a base address of zero it has no effect on the address. We can conclude that the virtual addresses are the same as the linear addresses and that no real translation is needed to get from one to the other.

Conclusion

In this tutorial, we’ve looked at how to figure out whether PAE is enabled, and we’ve started working through an example in which we resolved virtual addresses to linear addresses. We saw that Windows effectively bypasses segmentation for these addresses: the segments have a base address of zero, so the virtual addresses are the same as the linear addresses.
