The Import Directory: Part 2

April 29, 2013 by Dejan Lukan

You may want to read the previous article before this one. If you already understand the basics of the IAT, you can skip it; otherwise, you should read it before continuing below.

Presenting the Example Import Directory

Let's use the !dh command to dump the PE header located at the virtual address 0x00400000. Note that we present only the Import Directory entry of the output, because that's all we're interested in right now.

[plain]

0:002> !dh 00400000 -f

….

18000 [ 3C] address [size] of Import Directory


[/plain]

We can see that the RVA of the Import Directory is 0x18000 and that it is 0x3C bytes in size. The Import Directory is an array of IMAGE_IMPORT_DESCRIPTOR structures, each of which is 0x14 (20) bytes in size. Since the size of the Import Directory is 0x3C (60 bytes), it holds 3 such structures.
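The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, using the base address, RVA and sizes quoted in this article; the function name `descriptor_addresses` is made up for the example.

```python
IMAGE_BASE = 0x00400000      # base address used throughout this article
IMPORT_DIR_RVA = 0x18000     # RVA reported by !dh
IMPORT_DIR_SIZE = 0x3C       # size reported by !dh
DESCRIPTOR_SIZE = 0x14       # size of one IMAGE_IMPORT_DESCRIPTOR


def descriptor_addresses(image_base, dir_rva, dir_size, entry_size=DESCRIPTOR_SIZE):
    """Return the virtual address of every descriptor slot in the directory."""
    count = dir_size // entry_size
    return [image_base + dir_rva + i * entry_size for i in range(count)]


addresses = descriptor_addresses(IMAGE_BASE, IMPORT_DIR_RVA, IMPORT_DIR_SIZE)
print([hex(a) for a in addresses])  # ['0x418000', '0x418014', '0x418028']
```

Each address is simply the previous one plus 0x14, which is the step we use when dumping the structures one after another in the debugger.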

Let's dump the three structures contained in the Import Directory. In the picture below we've calculated the address of the Import Directory, which is 0x00400000+0x18000, and then added the relative offsets of the individual IMAGE_IMPORT_DESCRIPTOR structures. Since each structure in the array is 0x14 bytes in size, we must add 0x14 bytes to the address to access the next element:

We can see that the array actually contains two populated elements and that the third, last one is set to zero to denote the end of the Import Directory. The library behind each element is identified by the Name element of the IMAGE_IMPORT_DESCRIPTOR structure, which holds the RVA of the library's name. In the next picture, we've dumped the memory at those RVAs as hex and ASCII representations:

Notice that we've printed the names of the imported libraries, msvcr100d.dll and kernel32.dll. This shows that the executable imports functions from those two libraries.
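As a rough sketch of the layout we just walked through, the following Python reads IMAGE_IMPORT_DESCRIPTOR entries (five little-endian dwords each) until it reaches the zeroed terminator. The OriginalFirstThunk and FirstThunk values are the ones used in this article; the two Name RVAs are placeholder values chosen for illustration and are not taken from the dump. The names `ImportDescriptor` and `parse_descriptors` are made up for this example.

```python
import struct
from collections import namedtuple

# Field order of IMAGE_IMPORT_DESCRIPTOR: five 32-bit little-endian values.
ImportDescriptor = namedtuple(
    "ImportDescriptor",
    "original_first_thunk time_date_stamp forwarder_chain name first_thunk",
)
DESCRIPTOR = struct.Struct("<5I")  # 0x14 bytes


def parse_descriptors(data):
    """Parse descriptors from raw bytes, stopping at the all-zero entry."""
    result = []
    for offset in range(0, len(data) - DESCRIPTOR.size + 1, DESCRIPTOR.size):
        entry = ImportDescriptor(*DESCRIPTOR.unpack_from(data, offset))
        if not any(entry):
            break  # zeroed descriptor terminates the array
        result.append(entry)
    return result


# Synthetic 0x3C-byte directory. The Name RVAs (0x18352, 0x1875A) are placeholders.
sample = (
    DESCRIPTOR.pack(0x180FC, 0, 0, 0x18352, 0x1827C)    # msvcr100d.dll
    + DESCRIPTOR.pack(0x1803C, 0, 0, 0x1875A, 0x181BC)  # kernel32.dll
    + bytes(DESCRIPTOR.size)                            # terminator
)

for entry in parse_descriptors(sample):
    print(hex(entry.original_first_thunk), hex(entry.first_thunk))
```

The loop yields two descriptors from the three slots, which matches what the debugger showed: two libraries followed by a zeroed entry.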

Let's now dump the OriginalFirstThunk array. We've seen that the RVA of the OriginalFirstThunk array of the msvcr100d.dll library is 0x180FC, so we can use the command in the picture below to dump the four bytes of its first IMAGE_THUNK_DATA structure.

The IMAGE_THUNK_DATA structure contains the element AddressOfData, which points to the IMAGE_IMPORT_BY_NAME structure. Let's present that structure again for clarity:

If we now dump the IMAGE_IMPORT_BY_NAME structure at the address 0x00400000+0x000184fe, we'll see the following:

Notice that the Name element shows only one character, '_'. This is because the structure declares Name as a one-element array, whereas we've already established that it actually holds a null-terminated ASCII name, so it's best to dump the contents of memory at that address to see the full name. We've dumped the contents of memory with the db command, which can be seen below:

The first two bytes are the Hint, while the rest of the bytes, up to the first NULL byte, are part of the Name element. Because of this, the actual name of this function is _crt_debugger_hook. We can also use the da command to dump only the ASCII characters, but we have to add 0x2 bytes to the address to skip over the Hint element. We can see the same string dumped in the picture below:
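The same split between Hint and Name can be sketched in Python. This is a minimal illustration that assumes the bytes of the structure are already available as a `bytes` object; the Hint value 0x0123 in the sample is hypothetical, and `parse_import_by_name` is a name made up for this example.

```python
import struct


def parse_import_by_name(data, offset=0):
    """Return (hint, name) from an IMAGE_IMPORT_BY_NAME structure."""
    (hint,) = struct.unpack_from("<H", data, offset)  # 2-byte Hint
    end = data.index(b"\x00", offset + 2)             # Name ends at the first NULL
    return hint, data[offset + 2:end].decode("ascii")


# Hypothetical Hint followed by the name seen in the dump.
sample = struct.pack("<H", 0x0123) + b"_crt_debugger_hook\x00"
hint, name = parse_import_by_name(sample)
print(hex(hint), name)  # 0x123 _crt_debugger_hook
```

Skipping the first two bytes here is the same step as adding 0x2 to the address before running the da command.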

We've seen that OriginalFirstThunk is the RVA of an array of IMAGE_THUNK_DATA structures, each of which in turn holds the RVA of an IMAGE_IMPORT_BY_NAME structure that contains the name of a function imported from the library. All of the IMAGE_THUNK_DATA entries can be seen in the picture below:

Notice that the last element is the NULL element 0x00000000, which terminates the array. The rest of the dwords are RVAs of IMAGE_IMPORT_BY_NAME structures. It would take a lot of work to traverse the entries manually, so we'll just write a simple script that will do it for us. Let's first check if the current expression evaluator is set to MASM. We can do this with the .expr command:

In order to write a script, we must first take a look at the basic WinDbg scripting instructions. Variables are held in the pseudo-registers $t0 through $t19 and are assigned with the r command, for example r $t0 = 0. To read the value of a variable, we prefix it with "@", like this: @$t0. Arguments passed to the script are available inside it as ${$arg1} through ${$argN}.

Whenever we want to execute a script, we use the following command (the $$>a< variant, which we use later in this article, additionally passes arguments to the script):

[plain]

kd> $$><"script_path"

[/plain]

I've coded a script that traverses the OriginalFirstThunk array and prints all the names from that array. The script can be seen below:

[plain]

$$
$$ This script reads the whole OriginalFirstThunk array and prints the
$$ function names stored in that array. The RVA of the OriginalFirstThunk
$$ array must be passed to the script as the first parameter.
$$
.block
{
    $$ The address of the OriginalFirstThunk array, image base plus RVA.
    r $t0 = ${$arg1}+0x00400000

    .for (r @$t1 = 0; @$t1 < 1000; r @$t1 = @$t1 + 4) {
        $$ Calculate the address of the current element in OriginalFirstThunk.
        r $t2 = @$t0 + @$t1

        $$ Read the element, which is the RVA of an IMAGE_IMPORT_BY_NAME structure.
        r $t4 = poi(@$t2)

        .if(@$t4 = 0) {
            .break
        }
        .else {
            $$ Calculate the address of the Name element in IMAGE_IMPORT_BY_NAME.
            r $t3 = poi(@$t2)+0x00400000+0x2
            da @$t3
        }
    }
}

[/plain]

Let's explain the script a little further. First, we calculate the actual address of the OriginalFirstThunk array by adding 0x00400000 (the base address) to the first input argument. In our case, the input argument should be 0x180fc, so the whole address will be 0x004180fc, which is the address of the OriginalFirstThunk array of the msvcr100d.dll library.

Needless to say, the script only works if the image is loaded at the base address 0x00400000; if we wanted a more versatile script, we would only need to make small changes to find the base address dynamically. We didn't do this in our case, since it's not important for this exercise.

After that, we have a for loop that counts from 0 up to 1000 (read in the debugger's current radix, which is hexadecimal by default) in steps of 4, executing the loop body each time. In the loop body, we calculate the address of each element in the OriginalFirstThunk array and read the value stored at that address. If the value is 0, we've reached the end of the array and we terminate the loop. Otherwise, we take that value and add 0x400002 (the base address plus 2 bytes to skip the Hint), which gives the address of the null-terminated ASCII name. Then we print that name and repeat the loop.
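For readers who prefer to see the same traversal outside the debugger, here is a rough Python equivalent that works on a byte buffer indexed by RVA instead of on live memory. It is a sketch under assumptions, not a port of the script: the buffer is a tiny synthetic image, the function names `read_cstring` and `import_names` are made up, and the check for imports by ordinal (high bit set in a 32-bit thunk) is an addition that the WinDbg script above doesn't perform.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000  # high bit set means "import by ordinal"


def read_cstring(image, rva):
    """Read a null-terminated ASCII string starting at the given RVA."""
    end = image.index(b"\x00", rva)
    return image[rva:end].decode("ascii")


def import_names(image, thunk_rva):
    """Walk a 32-bit thunk array and collect the imported function names."""
    names = []
    while True:
        (value,) = struct.unpack_from("<I", image, thunk_rva)
        if value == 0:
            break  # NULL entry terminates the array
        if value & IMAGE_ORDINAL_FLAG32:
            names.append("ordinal #%d" % (value & 0xFFFF))
        else:
            names.append(read_cstring(image, value + 2))  # skip the 2-byte Hint
        thunk_rva += 4
    return names


# Synthetic image: a thunk array at RVA 0 pointing at two Hint/Name entries.
image = bytearray(0x40)
struct.pack_into("<3I", image, 0x00, 0x10, 0x20, 0)
image[0x10:0x19] = b"\x00\x00printf\x00"
image[0x20:0x2A] = b"\x00\x00getchar\x00"

print(import_names(bytes(image), 0))  # ['printf', 'getchar']
```

Because the buffer is indexed by RVA, there's no base address to add here; in the debugger we work with virtual addresses, which is why the script adds 0x00400000.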

Let's see what happens when we run the script. We saved the script in the C:\scripts directory as importnames.wds, but the extension can be anything we like, even .txt. We're passing one argument to the script, 0x180fc, which is the RVA of the OriginalFirstThunk array.

[plain]

0:002> $$>a<C:\scripts\importnames.wds 0x180fc

00418500 "_crt_debugger_hook"

004184f0 "_wsplitpath_s"

004184e4 "wcscpy_s"

004184d4 "_wmakepath_s"

004184ba "_except_handler4_common"

004184b0 "_onexit"

004184a8 "_lock"

0041849a "__dllonexit"

00418490 "_unlock"

0041847e "_invoke_watson"

0041846e "_controlfp_s"

0041845a "?terminate@@YAXXZ"

0041844c "_initterm_e"

00418440 "_initterm"

0041842e "_CrtDbgReportW"

0041841a "_CrtSetCheckCount"

0041840c "__winitenv"

00418404 "exit"

004183fa "_cexit"

004183ec "_XcptFilter"

004183e4 "_exit"

004183d2 "__wgetmainargs"

004183c4 "_amsg_exit"

004183b2 "__set_app_type"

004183a8 "_fmode"

0041839c "_commode"

00418388 "__setusermatherr"

00418372 "_configthreadlocale"

00418360 "_CRT_RTC_INITW"

00418348 "printf"

0041833e "getchar"

[/plain]

Let's also dump the contents of the FirstThunk array of the msvcr100d.dll library, which has an RVA of 0x1827c. In order to do that, we have to change the script a little bit, because the two arrays don't hold the same data: once the loader has resolved the imports, the FirstThunk entries hold the function addresses rather than RVAs of IMAGE_IMPORT_BY_NAME structures. The new script is very similar to the previous one and can be seen below:

[plain]

$$
$$ This script reads the whole FirstThunk array and prints the function
$$ addresses stored in that array. The RVA of the FirstThunk array
$$ must be passed to the script as the first parameter.
$$
.block
{
    $$ The address of the FirstThunk array, image base plus RVA.
    r $t0 = ${$arg1}+0x00400000

    .for (r @$t1 = 0; @$t1 < 1000; r @$t1 = @$t1 + 4) {
        $$ Calculate the address of the current element in FirstThunk.
        r $t2 = @$t0 + @$t1

        $$ Read the element, which is the resolved address of the imported function.
        r $t4 = poi(@$t2)

        .if(@$t4 = 0) {
            .break
        }
        .else {
            $$ Print the function address.
            .printf "Addr: %x\n", @$t4
        }
    }
}

[/plain]

We won't explain the script in detail, since it's very similar to the previous one. The only difference is the body of the else branch: here we print the value we read directly, whereas in the previous script that value was one more pointer in the hierarchy and we printed the string it pointed to.

When we run the script, the following will be printed to the screen:

[plain]

0:002> $$>a<C:\scripts\importnames2.wds 0x1827c

Addr: 10322e30

Addr: 10327ce0

Addr: 10274390

Addr: 10326190

Addr: 10323040

Addr: 10319d40

Addr: 102496d0

Addr: 10319fa0

Addr: 10249720

Addr: 10316310

Addr: 103329b0

Addr: 102fb0c0

Addr: 10248680

Addr: 10248650

Addr: 103151e0

Addr: 10319ac0

Addr: 10362730

Addr: 10248080

Addr: 102480c0

Addr: 1031d090

Addr: 102480a0

Addr: 10248ce0

Addr: 10248100

Addr: 10245130

Addr: 103635f8

Addr: 103632fc

Addr: 10247580

Addr: 1031ecd0

Addr: 10321270

Addr: 10267ee0

Addr: 1025f660

[/plain]

Notice that we passed the RVA of the FirstThunk array, 0x1827c, to the new script. The script printed the same number of elements as before, but this time it printed the function addresses instead of the function names.
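Since the two arrays are parallel, the i-th name from OriginalFirstThunk belongs to the i-th address from FirstThunk. A small Python sketch of that pairing is shown below, using the last three entries of each listing above; `pair_imports` is a name made up for this example.

```python
def pair_imports(names, addresses):
    """Match each imported name with the address at the same array index."""
    if len(names) != len(addresses):
        raise ValueError("the two thunk arrays must have the same length")
    return dict(zip(names, addresses))


# Last three entries of the two msvcr100d.dll listings above.
names = ["_CRT_RTC_INITW", "printf", "getchar"]
addresses = [0x10321270, 0x10267EE0, 0x1025F660]

table = pair_imports(names, addresses)
print(hex(table["getchar"]))  # 0x1025f660
```

Pairing by index only makes sense when both arrays were read for the same library, which is why the length check is there.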

Let's verify that the printed address actually belongs to the function we've identified. The last elements printed by the two scripts are "getchar" and "1025f660", which means that the getchar() function must be located at the address 0x1025f660. We can check whether this is true by simply using the u command. The picture below shows us that our script works and that we've correctly identified the address of the getchar() function:

At the beginning of the article we identified that the executable uses two libraries, msvcr100d.dll and kernel32.dll. So far, we've dumped the names and addresses of the functions imported from msvcr100d.dll. Now let's dump the names of all the functions imported from kernel32.dll. We can see all the names below:

[plain]

0:002> $$>a<C:\scripts\importnames.wds 0x1803c

00418516 "CloseHandle"

0041874c "UnhandledExceptionFilter"

00418738 "GetCurrentProcess"

00418724 "TerminateProcess"

00418716 "FreeLibrary"

00418702 "GetModuleHandleW"

004186f2 "VirtualQuery"

004186dc "GetModuleFileNameW"

004186ca "GetProcessHeap"

004186be "HeapAlloc"

004186b2 "HeapFree"

00418698 "GetSystemTimeAsFileTime"

00418682 "GetCurrentProcessId"

0041866c "GetCurrentThreadId"

0041865c "GetTickCount"

00418642 "QueryPerformanceCounter"

00418632 "DecodePointer"

00418614 "SetUnhandledExceptionFilter"

00418604 "LoadLibraryW"

004185f2 "GetProcAddress"

004185e6 "lstrlenA"

004185d4 "RaiseException"

004185be "MultiByteToWideChar"

004185aa "IsDebuggerPresent"

00418594 "WideCharToMultiByte"

0041857e "HeapSetInformation"

00418560 "InterlockedCompareExchange"

00418558 "Sleep"

00418542 "InterlockedExchange"

00418532 "EncodePointer"

00418524 "CreateFileW"

[/plain]

Notice that this time we had to use the RVA of the kernel32.dll's OriginalFirstThunk, which is 0x1803c. To print the appropriate addresses, we must use the RVA of kernel32.dll's FirstThunk, which is 0x181bc. We can see all of the functions' addresses printed below:

[plain]

0:002> $$>a<C:\scripts\importnames2.wds 0x181bc

Addr: 7c809be7

Addr: 7c864042

Addr: 7c80de95

Addr: 7c801e1a

Addr: 7c80ac7e

Addr: 7c80e4dd

Addr: 7c80ba71

Addr: 7c80b475

Addr: 7c80ac61

Addr: 7c9100c4

Addr: 7c90ff2d

Addr: 7c8017e9

Addr: 7c8099c0

Addr: 7c8097d0

Addr: 7c80934a

Addr: 7c80a4c7

Addr: 7c9132ff

Addr: 7c8449cd

Addr: 7c80aeeb

Addr: 7c80ae40

Addr: 7c80be56

Addr: 7c812f81

Addr: 7c809c98

Addr: 7c81f424

Addr: 7c80a174

Addr: 7c839471

Addr: 7c809842

Addr: 7c802446

Addr: 7c80982e

Addr: 7c9132d9

Addr: 7c810cd9

[/plain]

Let's also verify that the function addresses are correct by checking whether the last element matches.

The address 0x7c810cd9 matches the CreateFileW function, which means that our scripts work as intended.

If we now dump the PE header with the !dh command, we'll see that the RVA to the Import Address Table Directory is 0x181BC, which is exactly the RVA of the kernel32.dll's FirstThunk.

[plain]
0:002> !dh 00400000 -f

181BC [ 180] address [size] of Import Address Table Directory


[/plain]

This shows that the IAT is reached through the Import Directory structures, as we saw in this tutorial. If we dumped the contents of the memory at the IAT (RVA 0x181BC), we would see the same function addresses that we've already identified.
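The two values that !dh reported, the Import Directory and the Import Address Table Directory, are both entries of the data directory array in the PE optional header. The following Python is a minimal sketch of how they could be read from raw header bytes. It assumes a 32-bit (PE32) image, where the data directories start 96 bytes into the optional header; the header below is synthetic and only the two directory values come from this article. The function name `data_directory` is made up for the example.

```python
import struct

IMPORT_DIRECTORY = 1   # index of the Import Directory entry
IAT_DIRECTORY = 12     # index of the Import Address Table entry


def data_directory(image, index):
    """Return (rva, size) of one data directory entry of a PE32 image."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    optional_header = e_lfanew + 4 + 20  # skip signature and file header
    (magic,) = struct.unpack_from("<H", image, optional_header)
    if magic != 0x10B:
        raise ValueError("this sketch handles PE32 images only")
    return struct.unpack_from("<II", image, optional_header + 96 + index * 8)


# Synthetic header carrying only the fields the function reads.
header = bytearray(0x200)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x80)        # e_lfanew
header[0x80:0x84] = b"PE\x00\x00"
opt = 0x80 + 4 + 20
struct.pack_into("<H", header, opt, 0x10B)        # PE32 magic
struct.pack_into("<II", header, opt + 96 + IMPORT_DIRECTORY * 8, 0x18000, 0x3C)
struct.pack_into("<II", header, opt + 96 + IAT_DIRECTORY * 8, 0x181BC, 0x180)

print([hex(v) for v in data_directory(bytes(header), IMPORT_DIRECTORY)])
print([hex(v) for v in data_directory(bytes(header), IAT_DIRECTORY)])
```

On a real file the same function could be applied to the first bytes read from disk, though a 64-bit (PE32+) image would need a different offset and is rejected here.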

Conclusion

We've seen the distinction between load-time and run-time dynamic linking. With load-time dynamic linking, we must specify the required libraries when the program is built, and the functions we use are written to the program's IAT. With run-time dynamic linking, the IAT is not used, because the functions we reference become known only at run-time and not at compile-time. We can bring a library into the current process's address space by calling the LoadLibrary() function and then look up the exported functions we need, for example with GetProcAddress().

The IAT is used to support dynamic linking, which is performed when the application is run. Since the application uses functions from standard libraries, they must be listed in its import tables, so that the operating system knows which libraries to load into the process's address space when the process is started. Alternatively, we could use run-time linking, in which case the IAT isn't necessary, because we load the library and look up its functions ourselves at run-time.
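Run-time dynamic linking can be illustrated with Python's ctypes module, which loads a library and looks up a function by name while the program is running, without any entry in an import table of our own code. This is a hedged sketch rather than a Windows-only example: on Windows it loads kernel32 and resolves GetCurrentProcessId (one of the functions listed above), while on other systems it falls back to the C library's getpid so the example stays runnable. The function name `load_runtime_function` is made up.

```python
import ctypes
import os


def load_runtime_function():
    """Load a library at run-time and return a function resolved by name."""
    if os.name == "nt":
        library = ctypes.WinDLL("kernel32")  # library loaded at run-time
        return library.GetCurrentProcessId   # function looked up by name
    library = ctypes.CDLL(None)              # symbols of the running process
    return library.getpid


get_pid = load_runtime_function()
print(get_pid() == os.getpid())
```

The function to call is chosen by a string at run-time, which is exactly why the linker can't record it in the executable's import tables.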

References:

[1] Import Address Table, accessible at http://en.wikipedia.org/wiki/Import_Address_Table#Import_Table.

[2] Dynamic-link library, accessible at http://en.wikipedia.org/wiki/Dynamic-link_library#Symbol_resolution_and_binding.

[3] CreateFile function, accessible at http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx.

[4] Linker Options, accessible at http://msdn.microsoft.com/en-us/library/y0zzbyt4.aspx.

[5] PE File Structure, accessible at http://www.thehackademy.net/madchat/vxdevl/papers/winsys/pefile/pefile.htm.

[6] Tutorial 6: Import Table, accessible at http://win32assembly.programminghorizon.com/pe-tut6.html.

[7] What's the difference between "Import Table address" and "Import Address Table address" in Date Directories of PE?, accessible at http://stackoverflow.com/questions/3801571/whats-the-difference-between-import-table-address-and-import-address-table-a.

Dejan Lukan

Dejan Lukan is a security researcher for InfoSec Institute and penetration tester from Slovenia. He is very interested in finding new bugs in real world software products with source code analysis, fuzzing and reverse engineering. He also has a great passion for developing his own simple scripts for security related problems and learning about new hacking techniques. He knows a great deal about programming languages, as he can write in couple of dozen of them. His passion is also Antivirus bypassing techniques, malware research and operating systems, mainly Linux, Windows and BSD. He also has his own blog available here: http://www.proteansec.com/.