Introduction

We all know that IA-32 processors have two main modes of operation: real mode and protected mode. But why would we want to talk about real mode? The first reason is that real mode is still used while an IA-32 computer is booting, which is also why we can still boot into DOS. We can do that by downloading the zip file attached to this post http://digiex.net/downloads/download-center-2-0/applications/10097-windows-98-boot-disk-download-virtualbox-virtualpc-vmware.html and then unzipping the archive and adding a new Floppy Controller in VirtualBox, as seen in the picture below:

When we boot the Windows XP virtual machine after doing that, the booting process can be seen in the picture below:

We now have the A:> command prompt, which shows we’ve booted into MSDOS instead of booting Windows normally into GUI mode. If we enter the dir command, we’ll list every file on the A: drive, which is our floppy. The contents of the floppy drive can be seen below:

Now, let’s execute the mem command to display the memory details in the MSDOS environment:

We can see that MSDOS has a total of 640KB of conventional memory, where 92KB is used and 548KB is still free. Real mode, in which MSDOS runs, uses 20-bit physical addresses, which means it can address at most 2^20 bytes = 1MB of physical memory. MSDOS uses the segmented memory model, and the registers its programs work with are 16 bits wide. This means that its programs use 16-bit addresses and the segment registers are also 16 bits wide. A segment register selects the segment in memory, and each segment is 2^16 bytes = 64KB in size. The logical address, which is the offset within the segment, is also 16 bits in size.

To obtain the actual physical address, we need to add the segment base and the logical address together. But if the segment register is 16 bits wide and the logical address is also 16 bits wide, simply adding the two could only cover a 16-bit address space (let’s forget about what happens if an overflow occurs), so this can’t be right. Earlier, we said that the MSDOS environment has a 20-bit address space, but how can we address it if we’re only using 16-bit values?

This is possible because of a little trick: the processor appends a hexadecimal zero to the end of the segment register value (which is the same as shifting it left by 4 bits, or multiplying it by 16) and prepends a zero to the logical address. The second step really doesn’t change anything, because adding zeros to the beginning of a hexadecimal number is just another representation of the same number.

By appending a zero to the segment value, all of a sudden the segment base address isn’t 16 bits wide any more, but 20 bits wide. One hexadecimal digit represents 4 bits: if 0x00 represents one byte, which is 8 bits, then the single digit 0x0 adds 4 bits to the segment base address. Another interesting fact is that since the logical address can only be 16 bits wide, a segment can only be 64KB in size, because the logical address is used as an offset to access a certain value within the segment.
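The address calculation described above can be written down in a few lines. This is only a minimal sketch: the function names are made up for illustration, and the second function assumes the behaviour of the original 8086, where the result wraps around at 1MB because only 20 address lines exist.

```cpp
#include <cassert>
#include <cstdint>

// Translate a real-mode segment:offset pair into a linear address.
// The segment value is shifted left by 4 bits (one hexadecimal digit,
// i.e. multiplied by 16) and the 16-bit offset is added to it.
std::uint32_t physical_address(std::uint16_t segment, std::uint16_t offset) {
    return (static_cast<std::uint32_t>(segment) << 4) + offset;
}

// Same calculation, truncated to 20 bits.
// Assumption: 8086-style wrap-around at 1MB.
std::uint32_t physical_address_8086(std::uint16_t segment, std::uint16_t offset) {
    std::uint32_t linear = physical_address(segment, offset);
    assert(linear <= 0x10FFEFu);  // largest value FFFF:FFFF can produce
    return linear & 0xFFFFFu;
}
```

For example, physical_address(0x1234, 0x0010) gives 0x12340 + 0x0010 = 0x12350. Note that many different segment:offset pairs point to the same physical address, because segments overlap every 16 bytes.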

Also, real mode doesn’t offer any memory protections that are present in protected mode. This makes MSDOS extremely fault prone, because if one process does something stupid with the memory, it can crash the whole MSDOS system. In protected mode, only that program will crash, but in real mode, the whole system is at risk.

The MSDOS Memory

Let’s take a look at the detailed MSDOS memory, which we can see by passing some flags to the mem.exe command. We need to execute the “mem.exe /d /p” command, where the /d flag is used for debugging and can display all modules in memory, internal drivers and other information. The /p flag is used to pause the printing of information after each screen, which is useful if we want to view all the data being outputted to the screen.

We didn’t present the rest of the output, because it’s the same as the one already presented earlier in the article (the picture about the memory summary). In the pictures above, we can see that the interrupt vector table (IVT) is accessible at segment 0x00000 and our program MEM is loaded at segments 0x00EF6 and 0x016E6. We can also see a bunch of other loaded device drivers, system data, etc.

The MSDOS environment can use six segment registers: CS, DS, SS, ES, FS and GS, which are all 16 bits in size (FS and GS exist only on the 80386 and later processors). The CS register is the code segment register, which holds the base address of the program instructions being executed. The DS register is the data segment register, which holds the base address of the program’s data. The SS stack segment register holds the base address of the program’s stack, and so on. This means that the program can have at most 6 segments active at any time during execution.

Let’s now use the debug.exe 16-bit debugger to debug the mem.exe program. To do so, we must execute the “debug.exe mem.exe” command as can be seen below:

After starting the debugging of the program, we’ve entered the r command (shown as -r after the debugger’s - prompt), which prints the state of all registers. The r command also prints the next instruction that will be executed when running the program (the instruction that the IP register points to). Notice that the IP register holds the value 0x0010, the CS (code segment) register holds the value 0x2439, and the instruction printed is from segment 2439 at offset 0010, which corresponds to the physical address 0x2439 * 16 + 0x0010 = 0x243A0.

Introduction to Interrupts

An interrupt is an event that triggers some action, which is called an Interrupt Service Routine (ISR) or Interrupt Handler. An interrupt is usually triggered with a specific number that directly corresponds to an ISR, because when a specific event is triggered, the processor must know in advance which ISR to call to handle it.

Various processor architectures use an Interrupt Vector Table (IVT) or Interrupt Descriptor Table (IDT), which is a table of interrupt vectors that is used to call the right interrupt service routine (ISR) based on the interrupt event. When the CPU is interrupted by an event, it looks up the interrupt service routine handler in the IVT, and transfers control to it [1]. Basically, the interrupt table stores entries that tell us where the appropriate interrupt service routines are located in memory, so that program control can be transferred there when appropriate.

The interrupts must be invisible to the interrupted program. When an interrupt happens, the CPU state of the running program is saved into memory, then the interrupt service routine is called to handle the event. After that, the state of the CPU must be restored and the program’s execution can continue as if there was no interrupt at all. The interrupted program doesn’t even notice it’s been interrupted.

Types of Interrupts

There are three types of interrupts:

  • hardware interrupts or external interrupts
    • maskable interrupts (can be ignored or masked)
    • non-maskable interrupts (must be handled immediately)
  • software interrupts or programmed exceptions
  • exceptions (errors that happened while the processor is trying to execute an instruction):
    • faults (program is restarted before the instruction that generated the fault, such as divide by zero)
    • traps (program is restarted after the instruction that generated the trap, such as int 3)
    • aborts (program cannot be restarted)
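The difference between faults, traps and aborts comes down to where execution resumes after the handler returns. The sketch below models only that rule; the type and function names are hypothetical, and the instruction length is passed in by the caller instead of being decoded.

```cpp
#include <cassert>
#include <cstdint>

enum class ExceptionKind { Fault, Trap, Abort };

// Where (and whether) the interrupted program continues.
struct Resume {
    bool possible;     // false for aborts
    std::uint16_t ip;  // instruction pointer to continue at
};

Resume resume_point(ExceptionKind kind,
                    std::uint16_t instruction_ip,
                    std::uint16_t instruction_length) {
    assert(instruction_length > 0);
    switch (kind) {
        case ExceptionKind::Fault:
            // Re-execute the instruction that caused the fault.
            return Resume{true, instruction_ip};
        case ExceptionKind::Trap:
            // Continue with the instruction after the trapping one.
            return Resume{true,
                          static_cast<std::uint16_t>(instruction_ip + instruction_length)};
        case ExceptionKind::Abort:
            break;
    }
    // Aborts cannot be resumed.
    return Resume{false, 0};
}
```

For instance, a one-byte int 3 instruction at offset 0x0100 resumes at 0x0101, while a faulting instruction at 0x0100 resumes at 0x0100 again.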

The non-maskable interrupts must be handled as soon as they happen, because they usually signal critical events such as a hardware failure. (Division by zero or access to a bad address are not non-maskable interrupts; they are exceptions raised by the processor itself.) Maskable interrupts can be temporarily ignored and handled later; IRQs (Interrupt Requests) can be categorized under the maskable interrupts.

An interrupt request is a hardware signal sent to the processor that temporarily stops a running program and allows a special program, an interrupt handler, to run instead. Interrupts are used to handle such events as data receipt from a modem or network, or a key press or mouse movement. The interrupt request level is the priority of an interrupt request [2].

On a Linux system, we can display all the IRQ numbers currently in use by listing the contents of the /proc/interrupts file, which can be seen below:

 # cat /proc/interrupts
           CPU0       CPU1
  0:    6944846    7492717   IO-APIC-edge      timer
  1:       2857       2252   IO-APIC-edge      i8042
  8:         13         14   IO-APIC-edge      rtc0
  9:        433        430   IO-APIC-fasteoi   acpi
 12:     169265     155602   IO-APIC-edge      i8042
 16:         50         47   IO-APIC-fasteoi   uhci_hcd:usb3
 17:         16         14   IO-APIC-fasteoi   uhci_hcd:usb4
 18:      34761      29673   IO-APIC-fasteoi   uhci_hcd:usb5, uhci_hcd:usb8
 19:         12         11   IO-APIC-fasteoi   ehci_hcd:usb1
 20:          3          1   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb6
 22:          4          3   IO-APIC-fasteoi   yenta, uhci_hcd:usb7
 45:     224863     217010   PCI-MSI-edge      ahci
 46:    1329799     943403   PCI-MSI-edge      snd_hda_intel
 47:      11953      12972   PCI-MSI-edge      eth0
 48:    1158569    1009461   PCI-MSI-edge      iwlwifi
NMI:          0          0   Non-maskable interrupts
LOC:    9125118    9940552   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     319540     595551   Rescheduling interrupts
CAL:         56         68   Function call interrupts
TLB:      67339      65343   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        167        167   Machine check polls
ERR:          0
MIS:          0

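In the first column, we can see the IRQ number (or, in the lower rows, a short label such as NMI or LOC). The next columns, one per CPU, show how many times the interrupt was handled on that CPU since the last boot of the system. The remaining columns name the interrupt controller type and the device or driver that uses the interrupt.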
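To make the column layout concrete, here is a small sketch that splits one line of /proc/interrupts into its parts. The struct and function names are hypothetical, and the parser assumes the layout shown in the listing above: a label followed by a colon, one counter per CPU, then free text.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

struct InterruptLine {
    std::string label;                       // "0", "12", "NMI", ...
    std::vector<unsigned long long> counts;  // one counter per CPU column
    std::string description;                 // controller type and device names
};

// Returns false for lines without a "label:" prefix, such as the CPU header.
bool parse_interrupt_line(const std::string& line, InterruptLine& out) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) return false;

    out = InterruptLine{};
    std::istringstream label_stream(line.substr(0, colon));
    if (!(label_stream >> out.label)) return false;

    std::istringstream rest(line.substr(colon + 1));
    std::string token;
    bool in_counts = true;  // counters come first, description after them
    while (rest >> token) {
        const bool numeric = std::all_of(token.begin(), token.end(),
            [](unsigned char c) { return std::isdigit(c) != 0; });
        if (in_counts && numeric) {
            out.counts.push_back(std::stoull(token));
        } else {
            in_counts = false;
            if (!out.description.empty()) out.description += ' ';
            out.description += token;
        }
    }
    return true;
}
```

Feeding it the timer line from the listing yields the label 0, the two per-CPU counters 6944846 and 7492717, and the description IO-APIC-edge timer.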

A programmed exception can be triggered with an assembler instruction such as int 3.

Interrupt Vector

Each interrupt or exception is identified by a number from 0 to 255, which is called an interrupt vector. The interrupt vector numbers are classified as follows:

  • 0 – 31 : exceptions and non-maskable interrupts (in real mode, the BIOS handles these interrupts)
  • 32 – 63 : maskable interrupts
  • 64 – 255 : software interrupts

On 32-bit x86, the Linux system uses software interrupt 0x80 for calling system functions (system calls).

Interrupt Vector Table (IVT)

In the previous part, we’ve seen that the Interrupt Vector Table (IVT) is located at address 0x00000 and is 1024 bytes in size. This means that the first 1KB of memory is occupied by the IVT, which holds the addresses of the interrupt service routines (ISRs), indexed by interrupt number. Each interrupt number is associated with one ISR. When an interrupt that has a certain number is triggered, its corresponding interrupt service routine is called.

Because the IVT uses 1024 bytes in total and each interrupt vector takes 4 bytes, the table can hold 1024/4=256 vectors.
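The arithmetic above also tells us where each vector lives: vector n starts at address n * 4. In real mode, each 4-byte entry is a far pointer stored as a 16-bit offset followed by a 16-bit segment, both little-endian. The sketch below reads an entry from a byte buffer that stands in for the first 1KB of memory; the names are hypothetical and the buffer is supplied by the caller, since a normal program cannot read physical address 0 directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A real-mode far pointer: the handler lives at segment:offset.
struct FarPointer {
    std::uint16_t segment;
    std::uint16_t offset;
};

constexpr std::size_t kIvtEntrySize = 4;   // bytes per vector
constexpr std::size_t kIvtEntries = 256;   // 1024 / 4

// Address of the IVT entry for a given vector, relative to the table start.
std::size_t ivt_entry_address(std::uint8_t vector) {
    return static_cast<std::size_t>(vector) * kIvtEntrySize;
}

// Decode one entry from a copy of the table (offset word first, then segment).
FarPointer read_ivt_entry(const std::uint8_t* table, std::size_t size,
                          std::uint8_t vector) {
    const std::size_t base = ivt_entry_address(vector);
    assert(base + kIvtEntrySize <= size);
    FarPointer handler;
    handler.offset = static_cast<std::uint16_t>(table[base] | (table[base + 1] << 8));
    handler.segment = static_cast<std::uint16_t>(table[base + 2] | (table[base + 3] << 8));
    return handler;
}
```

With this layout, the entry for int 3 sits at bytes 0x0C to 0x0F and the last entry, vector 255, starts at byte 1020.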

At the end of the day, it doesn’t really matter whether we’re dealing with maskable or non-maskable interrupts, or software or hardware interrupts. We can summarize it with the following sentence: when an event occurs that needs attention, whether it is an error in a program or a signal from the hardware, the CPU immediately executes the right interrupt handler routine, regardless of whether a software or hardware interrupt was triggered. All interrupts and exceptions have an interrupt vector that associates them with the appropriate functions to be called when an event occurs. In protected mode, the functions that get called are defined in the interrupt descriptor table (IDT), which is a linear table of 256 entries and plays the role that the IVT plays in real mode. The IDT associates an interrupt handler with an interrupt vector.
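The idea of a 256-entry table that maps a vector number to a handler can be imitated in ordinary user-space code. The class below is purely an illustration with made-up names: it does not touch the real IVT or IDT, and it leaves out the saving and restoring of CPU state that the processor performs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// A toy model of an interrupt table: 256 slots, one handler per vector.
class InterruptTable {
public:
    using Handler = std::function<void()>;

    // Associate a handler with a vector number (0 to 255).
    void install(std::uint8_t vector, Handler handler) {
        handlers_[vector] = std::move(handler);
    }

    // Look up the handler for the vector and call it.
    // Returns false if no handler is installed for that vector.
    bool raise(std::uint8_t vector) const {
        const Handler& handler = handlers_[vector];
        if (!handler) return false;
        handler();
        return true;
    }

private:
    std::array<Handler, 256> handlers_{};
};
```

A caller would install a handler once, for example for vector 0x80, and every later raise(0x80) runs that handler no matter who triggered it, which is the point made in the paragraph above.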

An Interrupt Example

Let’s take a look at the following example where we intentionally divide a number by zero. The source code of the program can be seen below:

#include <iostream>
using namespace std;

int main() {
  cout << "Result: " << 10/0 << endl;
  return 0;
}

We can see that the program is written in C++ and is really simple; it contains just one cout statement. When the program executes, it should print “Result: ” followed by the result of 10/0 on the screen. But what happens when we compile and run the program? Below we can see the command that compiles and runs the program and displays the warnings and errors:

# g++ main.cpp -o main && ./main
main.cpp: In function 'int main()':
main.cpp:6:28: warning: division by zero
Floating point exception

# echo $?
136

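So when compiling the program, we get a warning about dividing a number by zero, which is really what we should get. Since this is a warning, we can ignore it (most of the time we ignore warnings), but this time it’s not actually safe to ignore it. When we run the program, it is killed with the message “Floating point exception” and the shell reports exit status 136. The message is the description of the SIGFPE signal, which Linux also sends for integer division by zero, even though no floating point arithmetic is involved. The shell reports a status of 128 plus the signal number when a program is killed by a signal, and SIGFPE is signal 8, so 128 + 8 = 136. This is clearly not a successful return code, so an error must have happened.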
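The exit status 136 can be decoded mechanically. The sketch below assumes the common shell convention that a program killed by signal n is reported with status 128 + n; the function names are hypothetical.

```cpp
#include <cassert>
#include <csignal>

// Returns the signal number encoded in a shell exit status,
// or 0 if the status does not indicate death by signal.
// Assumption: the shell reports 128 + signal number.
int signal_from_shell_status(int status) {
    assert(status >= 0 && status <= 255);
    return status > 128 ? status - 128 : 0;
}

// True if the status corresponds to SIGFPE, the signal raised
// for arithmetic errors such as integer division by zero.
bool is_sigfpe_status(int status) {
    return signal_from_shell_status(status) == SIGFPE;
}
```

Calling signal_from_shell_status(136) gives 8, which is the value of SIGFPE on Linux.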

Conclusion

Real mode is important on IA-32 systems because it’s still used right after the computer starts: the BIOS itself operates in real mode, and it is needed in pretty much every such computer system. The job of the BIOS is to initialize the hardware, find the device or partition where the boot loader is located and load it, and take care of a bunch of other tasks that are closely related to hardware components.

References:

[1] Wikipedia, Interrupt vector table, accessible at http://en.wikipedia.org/wiki/Interrupt_vector_table.

[2] Wikipedia, Interrupt request, accessible at http://en.wikipedia.org/wiki/Interrupt_request.