The Import Directory: Part 1

April 24, 2013 by Dejan Lukan

We know that when the operating system loads an executable, it scans the executable’s import information to locate the DLLs and functions the executable is using. This is done because the OS must map the required DLLs into the executable’s address space.

To be more precise, the IAT is the table that maps the names of imported functions to their virtual addresses; the functions themselves are exported by other loaded modules. Each executable or DLL contains a PE header, which holds all the information the operating system needs to load it successfully, including the location of the IAT. To understand where the IAT is located, we must first talk about the PE header.

Now we’re ready to explore the actual IAT of the process. Let’s first present the program we’ll be using to do that:

[cpp]
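#include "stdafx.h"
#include <Windows.h>
#include <stdio.h>
#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[]) {
    // CREATE_NEW makes the call fail if the file already exists.
    HANDLE hFile = CreateFile(L"C:\\temp.txt", GENERIC_WRITE, 0, NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        printf("Unable to open file.");
    }
    else {
        printf("File successfully opened/created.");
        // Close the handle only when it is valid.
        CloseHandle(hFile);
    }

    // Keep the process alive so that a debugger can be attached.
    getchar();
    return 0;
}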
[/cpp]

When we compile the program, a createfilee.exe executable is created. We can start createfilee.exe and let it run. It stops at the getchar() function call, which waits for keyboard input. After that, we start the WinDbg debugger and attach it to the running process.

Now we’ll use the !dh command to print the PE header elements that we need. The command accepts several options: if we pass the -a parameter, it prints everything to the console window; with -f it prints only the file headers, and with -s it prints only the section headers.

In our case, we’ll use the -f parameter because we need to dump the file headers. The output below was generated by the “!dh 00400000 -f” command:

[plain]
0:002> !dh 00400000 -f

File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
14C machine (i386)
7 number of sections
515EBA3E time date stamp Fri Apr 05 13:49:18 2013

0 file pointer to symbol table
0 number of symbols
E0 size of optional header
102 characteristics
Executable
32 bit word machine

OPTIONAL HEADER VALUES
10B magic #
10.00 linker version
3800 size of code
3A00 size of initialized data
0 size of uninitialized data
11078 address of entry point
1000 base of code
----- new -----
00400000 image base
1000 section alignment
200 file alignment
3 subsystem (Windows CUI)
5.01 operating system version
0.00 image version
5.01 subsystem version
1B000 size of image
400 size of headers
0 checksum
00100000 size of stack reserve
00001000 size of stack commit
00100000 size of heap reserve
00001000 size of heap commit
8140 DLL characteristics
Dynamic base
NX compatible
Terminal server aware
0 [ 0] address [size] of Export Directory
18000 [ 3C] address [size] of Import Directory
19000 [ 459] address [size] of Resource Directory
0 [ 0] address [size] of Exception Directory
0 [ 0] address [size] of Security Directory
1A000 [ 2EC] address [size] of Base Relocation Directory
15720 [ 1C] address [size] of Debug Directory
0 [ 0] address [size] of Description Directory
0 [ 0] address [size] of Special Directory
0 [ 0] address [size] of Thread Storage Directory
0 [ 0] address [size] of Load Configuration Directory
0 [ 0] address [size] of Bound Import Directory

181BC [ 180] address [size] of Import Address Table Directory
0 [ 0] address [size] of Delay Import Directory
0 [ 0] address [size] of COR20 Header Directory
0 [ 0] address [size] of Reserved Directory
[/plain]

At the bottom of the output we can see the data directories that we’re after. The data directory that we want to read is the IAT directory, which is located at RVA 0x181BC and is 0x180 bytes in size. Now we know the exact address of the IAT in memory: if the base address of the executable is 0x00400000 and the RVA of the IAT is 0x181BC, then the virtual address of the IAT is 0x00400000+0x181BC = 0x004181BC. Each entry is 4 bytes in size, so the table holds 0x180/4 = 0x60 entries. The dps command displays pointer-sized values together with their symbols, and its L parameter takes the number of entries to display. This is why the whole command should be as follows:

[plain]
> dps 004181bc L180/4
[/plain]

The output of the dps command can be seen below; the whole table is presented, even though it is rather long, so that we can observe all its entries:

[plain]
0:002> dps 004181bc L180/4
004181bc 7c809be7 kernel32!CloseHandle
004181c0 7c864042 kernel32!UnhandledExceptionFilter
004181c4 7c80de95 kernel32!GetCurrentProcess
004181c8 7c801e1a kernel32!TerminateProcess
004181cc 7c80ac7e kernel32!FreeLibrary
004181d0 7c80e4dd kernel32!GetModuleHandleW
004181d4 7c80ba71 kernel32!VirtualQuery
004181d8 7c80b475 kernel32!GetModuleFileNameW
004181dc 7c80ac61 kernel32!GetProcessHeap
004181e0 7c9100c4 ntdll!RtlAllocateHeap
004181e4 7c90ff2d ntdll!RtlFreeHeap
004181e8 7c8017e9 kernel32!GetSystemTimeAsFileTime
004181ec 7c8099c0 kernel32!GetCurrentProcessId
004181f0 7c8097d0 kernel32!GetCurrentThreadId
004181f4 7c80934a kernel32!GetTickCount
004181f8 7c80a4c7 kernel32!QueryPerformanceCounter
004181fc 7c9132ff ntdll!RtlDecodePointer
00418200 7c8449cd kernel32!SetUnhandledExceptionFilter
00418204 7c80aeeb kernel32!LoadLibraryW
00418208 7c80ae40 kernel32!GetProcAddress
0041820c 7c80be56 kernel32!lstrlenA
00418210 7c812f81 kernel32!RaiseException
00418214 7c809c98 kernel32!MultiByteToWideChar
00418218 7c81f424 kernel32!IsDebuggerPresent
0041821c 7c80a174 kernel32!WideCharToMultiByte
00418220 7c839471 kernel32!HeapSetInformation
00418224 7c809842 kernel32!InterlockedCompareExchange
00418228 7c802446 kernel32!Sleep
0041822c 7c80982e kernel32!InterlockedExchange
00418230 7c9132d9 ntdll!RtlEncodePointer
00418234 7c810cd9 kernel32!CreateFileW
00418238 00000000
0041823c 00000000
00418240 00000000
00418244 00000000
00418248 00000000
0041824c 00000000
00418250 00000000
00418254 00000000
00418258 00000000
0041825c 00000000
00418260 00000000
00418264 00000000
00418268 00000000
0041826c 00000000
00418270 00000000
00418274 00000000
00418278 00000000
0041827c 10322e30 MSVCR100D!_crt_debugger_hook
00418280 10327ce0 MSVCR100D!_wsplitpath_s
00418284 10274390 MSVCR100D!wcscpy_s
00418288 10326190 MSVCR100D!_wmakepath_s
0041828c 10323040 MSVCR100D!_except_handler4_common
00418290 10319d40 MSVCR100D!_onexit
00418294 102496d0 MSVCR100D!_lock
00418298 10319fa0 MSVCR100D!__dllonexit
0041829c 10249720 MSVCR100D!_unlock
004182a0 10316310 MSVCR100D!_invoke_watson
004182a4 103329b0 MSVCR100D!_controlfp_s
004182a8 102fb0c0 MSVCR100D!terminate
004182ac 10248680 MSVCR100D!_initterm_e
004182b0 10248650 MSVCR100D!_initterm
004182b4 103151e0 MSVCR100D!_CrtDbgReportW
004182b8 10319ac0 MSVCR100D!_CrtSetCheckCount
004182bc 10362730 MSVCR100D!__winitenv
004182c0 10248080 MSVCR100D!exit
004182c4 102480c0 MSVCR100D!_cexit
004182c8 1031d090 MSVCR100D!_XcptFilter
004182cc 102480a0 MSVCR100D!_exit
004182d0 10248ce0 MSVCR100D!__wgetmainargs
004182d4 10248100 MSVCR100D!_amsg_exit
004182d8 10245130 MSVCR100D!__set_app_type
004182dc 103635f8 MSVCR100D!_fmode
004182e0 103632fc MSVCR100D!_commode
004182e4 10247580 MSVCR100D!__setusermatherr
004182e8 1031ecd0 MSVCR100D!_configthreadlocale
004182ec 10321270 MSVCR100D!_CRT_RTC_INITW
004182f0 10267ee0 MSVCR100D!printf
004182f4 1025f660 MSVCR100D!getchar
004182f8 00000000
004182fc 00000000
00418300 00000000
00418304 00000000
00418308 00000000
0041830c 00000000
00418310 00000000
00418314 00000000
00418318 00000000
0041831c 00000000
00418320 00000000
00418324 00000000
00418328 00000000
0041832c 00000000
00418330 00000000
00418334 00000000
00418338 00000000
[/plain]

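First, we can see a number of entries from the kernel32.dll library and later there are entries from the msvcr100d.dll library. The entries that we’re directly using in our C++ code are kernel32!CloseHandle, kernel32!CreateFileW (CreateFile expands to CreateFileW in a Unicode build), MSVCR100D!printf and MSVCR100D!getchar. A few slots in the kernel32 block resolve to ntdll symbols such as ntdll!RtlAllocateHeap; this is most likely because kernel32.dll forwards those exports to ntdll.dll.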

We’ve just figured out the library names used by the executable, and all the function names plus their virtual addresses in memory. Let’s also print all the loaded modules with the lmi command. The output of that command can be seen below:

[plain]
0:002> lmi
start end module name
00400000 0041b000 createfilee C (no symbols)
00940000 00949000 Normaliz (export symbols) C:\WINDOWS\system32\Normaliz.dll
10200000 10373000 MSVCR100D (pdb symbols) C:\WINDOWS\system32\MSVCR100D.dll
3d930000 3da16000 WININET (pdb symbols) C:\WINDOWS\system32\WININET.dll
3dfd0000 3e1bc000 iertutil (pdb symbols) C:\WINDOWS\system32\iertutil.dll
5b860000 5b8b5000 NETAPI32 (pdb symbols) C:\WINDOWS\system32\NETAPI32.dll
5d090000 5d12a000 comctl32_5d090000 (pdb symbols) C:\WINDOWS\system32\comctl32.dll
71aa0000 71aa8000 WS2HELP (pdb symbols) C:\WINDOWS\system32\WS2HELP.dll
71ab0000 71ac7000 WS2_32 (pdb symbols) C:\WINDOWS\system32\WS2_32.dll
76390000 763ad000 IMM32 (pdb symbols) C:\WINDOWS\system32\IMM32.DLL
76b40000 76b6d000 WINMM (pdb symbols) C:\WINDOWS\system32\WINMM.dll
77120000 771ab000 OLEAUT32 (pdb symbols) C:\WINDOWS\system32\OLEAUT32.dll
773d0000 774d3000 comctl32 (pdb symbols) C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_6595b64144ccf1df_6.0.2600.6028_x-ww_61e65202\comctl32.dll
774e0000 7761e000 ole32 (pdb symbols) C:\WINDOWS\system32\ole32.dll
77a80000 77b15000 CRYPT32 (pdb symbols) C:\WINDOWS\system32\CRYPT32.dll
77b20000 77b32000 MSASN1 (pdb symbols) C:\WINDOWS\system32\MSASN1.dll
77c00000 77c08000 VERSION (pdb symbols) C:\WINDOWS\system32\VERSION.dll
77c10000 77c68000 msvcrt (pdb symbols) C:\WINDOWS\system32\msvcrt.dll
77dd0000 77e6b000 ADVAPI32 (pdb symbols) C:\WINDOWS\system32\ADVAPI32.dll
77e70000 77f03000 RPCRT4 (pdb symbols) C:\WINDOWS\system32\RPCRT4.dll
77f10000 77f59000 GDI32 (pdb symbols) C:\WINDOWS\system32\GDI32.dll
77f60000 77fd6000 SHLWAPI (pdb symbols) C:\WINDOWS\system32\SHLWAPI.dll
77fe0000 77ff1000 Secur32 (pdb symbols) C:\WINDOWS\system32\Secur32.dll
78130000 78263000 urlmon (private pdb symbols) C:\WINDOWS\system32\urlmon.dll
7c800000 7c8f6000 kernel32 (pdb symbols) C:\WINDOWS\system32\kernel32.dll
7c900000 7c9b2000 ntdll (pdb symbols) C:\WINDOWS\system32\ntdll.dll
7c9c0000 7d1d7000 SHELL32 (pdb symbols) C:\WINDOWS\system32\SHELL32.dll
7e410000 7e4a1000 USER32 (pdb symbols) C:\WINDOWS\system32\USER32.dll
[/plain]

The two libraries we are interested in are MSVCR100D and kernel32. Notice that their base addresses are 0x10200000 and 0x7c800000 respectively, which agrees with the function pointers in the IAT: every pointer falls inside the address range of one of the loaded modules. All those function pointers are correct, because the OS filled the IAT with the actual function addresses when the executable was loaded.

The Import Directory

An imported function is a function that is not located in the current module but is imported from some other module; an executable usually imports from several modules. The information about each imported function must be kept in the import directory of the current module, because when the operating system loads the executable into memory and starts it, it must also load all the dependent libraries into the process’s address space, so that the program can call those functions.

The import table is an array of IMAGE_IMPORT_DESCRIPTOR structures, each of which has the following members: a union of Characteristics and OriginalFirstThunk, followed by TimeDateStamp, ForwarderChain, Name and FirstThunk.

Each IMAGE_IMPORT_DESCRIPTOR structure in the import directory contains information about one DLL that the current module needs in order to reference its symbols and call its functions. The array is always terminated by an additional structure whose members are all initialized to zero.

At the beginning of the IMAGE_IMPORT_DESCRIPTOR, we can see a union being used. The members of a union occupy the same memory, and a union is normally used when a value can have one of several types. In our case, both members, Characteristics and OriginalFirstThunk, have the same type DWORD, so the union declaration only provides two names for the same field.

Remember that the union declaration occupies only 4 bytes in our case (which is the size of the DWORD type) and not 8 bytes; this is how the union declarations work. Because of this, the size of IMAGE_IMPORT_DESCRIPTOR data structure is 20 bytes: 4 bytes for the union declaration and 16 bytes for the TimeDateStamp, ForwarderChain, Name and FirstThunk elements.

We haven’t yet mentioned what the elements of the structure actually mean. This is why we’re describing them below:

  • OriginalFirstThunk: this element contains the RVA of an array of IMAGE_THUNK_DATA structures. Each IMAGE_THUNK_DATA is a union of four DWORD members: ForwarderString, Function, Ordinal and AddressOfData.

We can see that the IMAGE_THUNK_DATA structure is a union, which is 4 bytes in size in a 32-bit executable. When we come to this structure, we must remember that a function can be imported by name or by ordinal. In the latter case, the Ordinal field of the union has its most significant bit set to 1, and the ordinal number can be extracted from the least significant 16 bits.

For imports by name, OriginalFirstThunk thus points to an array of RVAs, each of which points to an IMAGE_IMPORT_BY_NAME structure; the array is terminated by a zero entry.

There are two elements inside the IMAGE_IMPORT_BY_NAME structure:

  • Hint: a WORD index into the export name table of the exporting DLL, which the loader tries first when it looks up the name. The import still resolves if the hint is wrong, so this field is not of particular importance here.
  • Name: contains the name of the imported function; the field is not a pointer but a variable-length, null-terminated ASCII string stored directly in the structure.

Keep in mind that the array referenced by OriginalFirstThunk contains as many elements as there are functions imported from a particular library, plus the zero terminator. Each imported function is represented by one element in the array.

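  • TimeDateStamp: set to zero until the image is bound; after binding it holds the time/date stamp of the DLL.
  • ForwarderChain: the index of the first forwarder reference.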
  • Name: contains the RVA of the null-terminated string that holds the name of the library.
  • FirstThunk: contains the RVA of an array of IMAGE_THUNK_DATA structures, like the OriginalFirstThunk above. Both arrays contain the same number of elements. OriginalFirstThunk points to the array that identifies the imported functions by name or ordinal, also called the ILT. FirstThunk points to the array of addresses of the imported functions, also called the IAT. The ILT uses the AddressOfData element of the IMAGE_THUNK_DATA structure, which points to an IMAGE_IMPORT_BY_NAME structure that contains the name of the imported function. The IAT uses the Function element of the IMAGE_THUNK_DATA structure, which holds the address of the imported function. When the executable is loaded, the loader must traverse the ILT to find all the imported functions the executable is using. Then it must look up the addresses of those functions and populate the IAT, so that the functions can be called whenever needed.

Conclusion

To conclude, the import table contains one entry for each DLL we’re importing from. Each entry contains pointers to the Import Lookup Table (ILT) and the Import Address Table (IAT) [7]. For an overview of the whole PE file structure, there’s a great picture available at https://www.openrce.org/reference_library/files/reference/PE%20Format.pdf, which was provided by the OpenRCE team.

References:

[1] Import Address Table, accessible at http://en.wikipedia.org/wiki/Import_Address_Table#Import_Table.

[2] Dynamic-link library, accessible at http://en.wikipedia.org/wiki/Dynamic-link_library#Symbol_resolution_and_binding.

[3] CreateFile function, accessible at http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx.

[4] Linker Options, accessible at http://msdn.microsoft.com/en-us/library/y0zzbyt4.aspx.

[5] PE File Structure, accessible at http://www.thehackademy.net/madchat/vxdevl/papers/winsys/pefile/pefile.htm.

[6] Tutorial 6: Import Table, accessible at http://win32assembly.programminghorizon.com/pe-tut6.html.

[7] What’s the difference between “Import Table address” and “Import Address Table address” in Date Directories of PE?, accessible at http://stackoverflow.com/questions/3801571/whats-the-difference-between-import-table-address-and-import-address-table-a.
