Steganography is the art of hiding information so that the very existence of the message goes undetected. It has been used throughout history in many forms: the ancient Greeks shaved messengers' heads and tattooed the secret message on the scalp; once the hair grew back, the message stayed hidden until the head was shaved again. Ancient civilizations used many other ingenious techniques. Before and during World War II, invisible inks were a common form of undetectable writing: an innocent-looking letter could carry a very different message written between its lines.
It’s a security-through-obscurity approach: in theory, nobody apart from the sender and the recipient should even suspect that a hidden message exists. Digital technology has given us new ways to apply steganography, some of the most intriguing being techniques that hide information in digital images.
Steganography and cryptography belong to the same “big family”. Cryptography scrambles a message so that it cannot be read, whereas steganography hides it so that it attracts no attention, and this is the advantage steganography has over cryptography.
Throughout this article I’ll demonstrate how a hidden “key” is stored in an “innocent-looking” picture. The file we will study comes from a CTF (Capture the Flag, a computer security competition) challenge.
Analyzing the target
The file we know has a hidden message within it is a JPG file that simply looks like this (check references for download link):
The original file name was “spamcarver”, which is itself a hint. According to Wikipedia, file carving is the process of reassembling computer files from fragments in the absence of filesystem metadata. The carving process makes use of knowledge of common file structures, information contained in files, and heuristics regarding how filesystems fragment data. Fusing these three sources of information, a file carving system infers which fragments belong together. This is enough to push us to explore the JPEG file structure.
An in-depth look at the JPEG file format
Every image file that uses JPEG compression is commonly called a JPEG file and is considered a variant of the JIF image format. Most images captured by recent devices such as digital cameras are stored in Exif (Exchangeable image file format), a format standardized for metadata interchange. Since the Exif standard does not allow color profiles, most image editing software stores JPEG in JFIF format and also includes the APP1 segment from the Exif file to carry the metadata in an almost-compliant way; the JFIF standard is interpreted somewhat flexibly.
Technically, every JPEG file, like any other object, has a beginning (a header) called “Start of Image” and a trailer called “End of Image”: every JPEG file starts with the two bytes 0xFFD8 and ends with the two bytes 0xFFD9.
A JPEG file is made of segments, each introduced by a marker that starts with the byte 0xFF. The basic format is: 0xFF + marker number (1 byte) + data size (2 bytes, big endian, counting the two size bytes themselves) + data (n bytes). A few markers, such as Start of Image and End of Image, stand alone and carry no size or data. Some markers describe the data; the compressed image stream itself begins after the Start of Scan marker (0xFFDA) and runs until the End of Image marker.
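The marker layout described above can be walked programmatically. The following is only a minimal sketch, not a full JPEG parser: the function name `list_segments` is made up for this illustration, and it deliberately stops at the Start of Scan marker because the compressed data after it is not stored as length-prefixed segments.

```python
import struct

# Markers that stand alone and carry no length field: SOI, EOI, TEM, RST0-RST7.
STANDALONE = {0xD8, 0xD9, 0x01} | set(range(0xD0, 0xD8))


def list_segments(data: bytes):
    """Return (offset, marker, length) tuples for the header segments of a JPEG.

    Hypothetical helper for illustration. Walking stops at Start of Scan
    (0xFFDA) or End of Image (0xFFD9).
    """
    segments = []
    pos = 0
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xFF at offset {pos:#x}")
        marker = data[pos + 1]
        if marker == 0xFF:
            # 0xFF fill bytes may precede a marker; skip one and retry.
            pos += 1
            continue
        if marker in STANDALONE:
            segments.append((pos, marker, 0))
            pos += 2
            if marker == 0xD9:
                break
            continue
        if pos + 4 > len(data):
            raise ValueError("truncated segment header")
        # The 2-byte size is big endian and includes the size bytes themselves.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((pos, marker, length))
        if marker == 0xDA:
            break
        pos += 2 + length
    return segments
```

Run on the first bytes of a JFIF file, this would list the Start of Image marker at offset 0 followed by the APP0 segment at offset 2.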
Here is a basic JPEG file format structure:
This seems to be enough to start studying our given image.
Before firing up our hexadecimal editor, let’s do some “routine” checks on the picture with standard tools to get some additional information about this file.
On a Windows 7 machine, I installed GnuWin32 to get some Unix commands such as “file”, which determines a file’s type, and ran it on our target.
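For readers who do not want to install GnuWin32, the idea behind this kind of check can be approximated in a few lines. This is a rough sketch only: the real “file” command relies on a much larger signature database, and the function name `sniff_type` and the handful of signatures it knows are choices made for this illustration.

```python
def sniff_type(data: bytes) -> str:
    """Guess a file type from its first bytes (a tiny subset of what `file` knows)."""
    if data.startswith(b"\xff\xd8\xff"):
        # Bytes 6..10 hold the identifier of the first application segment.
        identifier = data[6:11]
        if identifier == b"JFIF\x00":
            return "JPEG (JFIF)"
        if identifier == b"Exif\x00":
            return "JPEG (Exif)"
        return "JPEG"
    if data.startswith(b"PK\x03\x04"):
        return "ZIP"
    return "unknown"
```

Calling it on the first dozen or so bytes read from the target image in binary mode is enough, since only the header is inspected.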
Start by downloading the automated GnuWin32 download tool (GetGnuWin32-0.6.3.exe, link in the references section) and extract it to the folder of your choice. The installation is simple even though it is command-line based: in the Windows command prompt (CMD), navigate to the extracted location and run “download.bat”. This automatically downloads all the available GnuWin32 packages to the same directory; if you are prompted to do something, just accept the defaults.
Once the download has finished, stay in the command prompt and install all the downloaded packages into the c:\gnuwin32 directory by typing:
C:\PathWhereYouDownloaded\GetGnuWin32\> install c:\gnuwin32
The last step is to add this new directory to the environment variables: right-click “My Computer”, choose “Properties”, then click “Advanced system settings” (or something similar).
In the System Properties window, click the “Environment Variables” button. In the Environment Variables window (as shown below), select the “Path” variable in the System variables section and click the Edit button. Modify the path line by appending “;c:\gnuwin32” (without quotes, as shown below):
Let’s now use the “file” command:
As expected, this is a valid JPEG file stored in JFIF format. We can get even more information using another tool called “ExifTool” (download link in the references section), which helps with reading and manipulating image metadata:
So far everything seems legit and the file appears to be a valid JPEG, which leaves us with the ultimate and most effective method: doing it by hand, the old-school way!
Let’s open our image file in a hexadecimal editor and focus on its structure; we now know that every JPEG file starts with 0xFFD8 and ends with 0xFFD9:
FFD8 is the Start of Image marker. FFE0 is an application marker (APP0), which carries the JFIF header and an optional thumbnail image; it doesn’t interest us here.
Let’s try to find the trailer of our file, the End of Image marker 0xFFD9, by searching for the value “FFD9” in the hexadecimal editor. In WinHex, click “Find Hex Values”, type the hexadecimal value you want to find in the window that appears, then click “OK”.
And guess what: two hits were found, which is not “very” normal. Click on the first hit to jump to its offset:
Things get more intriguing: basically, this means that something is appended to the JPEG file.
The JPEG file should end with FFD9, but right after the supposed end of the image an interesting 504B0304……. appears, followed by a lot of other binary data.
If you are used to reverse engineering, you can easily see that this is in fact the header of a normal PKZip file; even if you are not, a quick Google search will reveal it. Let’s now study the binary data that appears after the End of Image marker.
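The same search can be scripted instead of done in WinHex. The sketch below is an illustration under simple assumptions: the helper names `find_all` and `appended_zip_offset` are invented here, and it only reports a ZIP header that sits directly after an End of Image marker, which is the situation observed in this file. An FFD9 hit alone is not proof of anything, since embedded thumbnails can also end with that marker.

```python
def find_all(data: bytes, pattern: bytes):
    """Return every offset at which `pattern` occurs in `data`."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def appended_zip_offset(data: bytes):
    """Return the offset of a ZIP local file header that directly follows
    an End of Image marker (0xFFD9), or None if there is none."""
    for eoi in find_all(data, b"\xff\xd9"):
        if data[eoi + 2:eoi + 6] == b"PK\x03\x04":
            return eoi + 2
    return None
```

On the challenge file, `find_all(data, b"\xff\xd9")` would be expected to return two offsets, matching the two hits reported by the hex editor.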
A look at the PKZip file format
Each PKZip file (or simply ZIP file) mainly has this structure:
It may contain many local file headers, many file data blocks and many data descriptors. There are, of course, lots of other technical details that I won’t explain in this paper.
Each local file header is structured as follows:
|Signature||The signature of the local file header is always 0x504b0304|
|Version||The PKZip version needed for archive extraction|
|Flags||Bit 00: encrypted file; Bit 01: compression option; Bit 02: compression option; Bit 03: data descriptor; Bit 04: enhanced deflation; Bit 05: compressed patched data; Bit 06: strong encryption; Bit 07-10: unused; Bit 11: language encoding; Bit 12: reserved; Bit 13: mask header values; Bit 14-15: reserved|
|Compression method||00: no compression; 01: shrunk; 02: reduced with compression factor 1; 03: reduced with compression factor 2; 04: reduced with compression factor 3; 05: reduced with compression factor 4; 06: imploded; 08: deflated; 09: enhanced deflated; 10: PKWare DCL imploded; 12: compressed using BZIP2; 18: compressed using IBM TERSE; 19: IBM LZ77 z; 98: PPMd version I, Rev 1|
|File modification time||Bits 00-04: seconds divided by 2; Bits 05-10: minute; Bits 11-15: hour|
|File modification date||Bits 00-04: day; Bits 05-08: month; Bits 09-15: years from 1980|
|Crc-32 checksum||CRC-32 algorithm with ‘magic number’ 0xdebb20e3 (little endian)|
|Compressed size||If archive is in ZIP64 format, this field is 0xffffffff and the length is stored in the extra field|
|Uncompressed size||If archive is in ZIP64 format, this field is 0xffffffff and the length is stored in the extra field|
|File name length||The length of the file name field below|
|Extra field length||The length of the extra field below|
|File name||The name of the file including an optional relative path. All slashes in the path should be forward slashes ‘/’.|
|Extra field||Used to store additional information. The field consists of a sequence of header and data pairs, where the header has a 2 byte identifier and a 2 byte data size field.|
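The table above maps directly onto a fixed 30-byte structure followed by the two variable-length fields. Here is a minimal sketch of how it could be unpacked; the names `LocalHeader`, `LOCAL_HEADER` and `parse_local_header` are choices made for this example, and all multi-byte fields are read as little endian.

```python
import struct
from collections import namedtuple

LocalHeader = namedtuple(
    "LocalHeader",
    "signature version flags method mod_time mod_date crc32 "
    "compressed_size uncompressed_size name_len extra_len",
)

# 4-byte signature, five 16-bit fields, three 32-bit fields, two 16-bit fields.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30 bytes in total


def parse_local_header(data: bytes, offset: int = 0):
    """Unpack a ZIP local file header found at `offset`.

    Returns (header, file_name_bytes, extra_field_bytes).
    """
    header = LocalHeader._make(LOCAL_HEADER.unpack_from(data, offset))
    if header.signature != b"PK\x03\x04":
        raise ValueError("not a local file header")
    name_start = offset + LOCAL_HEADER.size
    extra_start = name_start + header.name_len
    name = data[name_start:extra_start]
    extra = data[extra_start:extra_start + header.extra_len]
    return header, name, extra
```

Pointing `offset` at the position of the first 504B0304 inside the image would return the fields without any manual byte counting.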
In addition to this, every PKZip file has a signature marking the End of Central Directory record, which is “0x504B0506”. In other words, every ZIP file starts with “0x504B0304”, and its last record, the End of Central Directory, starts with “0x504B0506”.
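Note that the End of Central Directory signature marks the start of a short record rather than the last bytes of the archive: the record is 22 bytes long, plus an optional comment. The sketch below (the name `find_eocd` and the dictionary layout are assumptions of this example) locates the record and computes where the archive really ends.

```python
import struct

# signature, disk number, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset, comment length
EOCD = struct.Struct("<4sHHHHIIH")  # 22 bytes


def find_eocd(data: bytes):
    """Locate the last End of Central Directory record and unpack it."""
    offset = data.rfind(b"PK\x05\x06")
    if offset == -1 or offset + EOCD.size > len(data):
        raise ValueError("no End of Central Directory record found")
    (_sig, _disk, _cd_disk, _entries_here, entries,
     cd_size, cd_offset, comment_len) = EOCD.unpack_from(data, offset)
    return {
        "offset": offset,
        "entries": entries,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        # First byte after the archive: record plus optional comment.
        "end": offset + EOCD.size + comment_len,
    }
```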
Let’s get back to our JPEG file where we left it:
Here I marked with different colors bytes that need explanation based on the table above:
|Version||0x14 = 20d means version 2.0|
|Flags||Bit 02: compression option|
|Compression method||08: deflated|
|File modification time||0x02F4 (little endian)|
|File modification date||0x419F (little endian)|
|Crc-32 checksum||0x9CD950D4 (little endian)|
|Compressed size||0x2DE3 = 11747 bytes|
|Uncompressed size||0xE299 = 58009 bytes|
|File name length||0x8 bytes|
|Extra field length||0x1C = 28 bytes|
|File name||0x2020202020202020 = eight space characters|
|Extra field||0x5455 extended timestamp, size: 5 bytes|
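The modification time and date in this table are packed bit fields, so they are worth decoding as a worked example. The sketch assumes that 0x02F4 and 0x419F are the 16-bit values obtained after the little-endian byte swap, which is the reading that yields a plausible timestamp; the function names are made up for this illustration.

```python
def decode_dos_time(value: int):
    """Split a 16-bit DOS time into (hour, minute, second)."""
    hour = (value >> 11) & 0x1F
    minute = (value >> 5) & 0x3F
    second = (value & 0x1F) * 2  # stored as seconds divided by 2
    return hour, minute, second


def decode_dos_date(value: int):
    """Split a 16-bit DOS date into (year, month, day)."""
    year = 1980 + ((value >> 9) & 0x7F)
    month = (value >> 5) & 0x0F
    day = value & 0x1F
    return year, month, day


print(decode_dos_time(0x02F4))  # (0, 23, 40)
print(decode_dos_date(0x419F))  # (2012, 12, 31)
```

Under that assumption, the embedded file was last modified on 31 December 2012 at 00:23:40.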
We now know enough to think about extracting this ZIP file from the given JPEG: we know where its header is, how the file is structured, and that it contains a file named “        ” (eight spaces) with no extension!
The easiest way to “dump” the ZIP embedded within the JPEG file is to copy all the bytes from the ZIP header to its trailer, that is, from the first “504B0304” to the end of the End of Central Directory record that starts with “504B0506”, which generally sits at the very end of the file:
Using your hexadecimal editor, go to offset 0xCB8E to find the beginning of the ZIP file, then select all bytes up to offset 0xFA04, copy the data into a new file and save it as a ZIP file:
If you are using WinHex, right-click on the selected block, then select “Edit -> Copy Block -> Into New File”.
A “Save File As” window appears; give your file a name ending in .zip.
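The manual copy in the hex editor can also be done with a short script. This is a sketch under the assumption that the image contains a single, complete ZIP archive: `carve_zip` is a name invented for this example, and it simply takes everything from the first local file header to the end of the last End of Central Directory record.

```python
import struct


def carve_zip(data: bytes) -> bytes:
    """Return the bytes of a ZIP archive embedded in `data`.

    Copies from the first local file header (PK\\x03\\x04) to the end of
    the last End of Central Directory record (PK\\x05\\x06), which is
    22 bytes long plus an optional comment.
    """
    start = data.find(b"PK\x03\x04")
    eocd = data.rfind(b"PK\x05\x06")
    if start == -1 or eocd < start or eocd + 22 > len(data):
        raise ValueError("no complete ZIP archive found")
    # The comment length is the last 2-byte field of the record.
    (comment_len,) = struct.unpack_from("<H", data, eocd + 20)
    return data[start:eocd + 22 + comment_len]


# Example use (both file names are placeholders):
# with open("spamcarver.jpg", "rb") as src:
#     carved = carve_zip(src.read())
# with open("dumpedPK.zip", "wb") as dst:
#     dst.write(carved)
```

Searching for the signatures rather than hard-coding offsets keeps the script usable on other files built the same way.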
Checking the dumped Zip file
The “file” command confirms that this is indeed a valid ZIP file:
C:\Users\Soufiane\Desktop\stegano\dumpedPK.zip; Zip archive data, at least v2.0 to extract
Now let’s try to extract our compressed archive. You can use whatever software you want; I’ll keep using the commands provided by GnuWin32:
error: cannot create
Remember the name of the file inside the ZIP? It is made of eight spaces, and this kind of file name can cause unzipping problems. So let’s get back to the dumped ZIP file and, using the hexadecimal editor, change the name to something more usual: run a hexadecimal search (like the one we did before) for “2020202020202020” and replace it with whatever you like, as long as the new name is also eight bytes long so that the file name length field stays valid.
According to the PKZip file structure, you should find two hits: one at the beginning of the ZIP file (in the local file header) and one near its end (in the central directory):
Replace both occurrences with the same new name:
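The same patch can be applied without a hex editor. The sketch below is deliberately naive and rests on two assumptions: the new name has exactly the same length as the old one (so no length field needs updating), and the old name does not also occur by chance inside the compressed data. The function name `rename_entry` is hypothetical.

```python
def rename_entry(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace every occurrence of a stored file name in raw ZIP bytes.

    The name is normally stored twice: once in the local file header and
    once in the central directory. Both copies must be changed.
    """
    if len(old) != len(new):
        raise ValueError("new name must have the same length as the old one")
    if data.count(old) == 0:
        raise ValueError("old name not found")
    return data.replace(old, new)


# Example use on the dumped archive (file name is a placeholder):
# with open("dumpedPK.zip", "rb") as f:
#     patched = rename_entry(f.read(), b" " * 8, b"NoSpaces")
```

Checking `data.count(old)` before patching is a cheap way to confirm that there are exactly the two expected hits.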
Save and try to extract again:
Yes! A file called “NoSpaces” has now been created; let’s see what kind of file this is:
In reality, the ZIP file contained another JPEG file. Rename the extracted file to “NoSpaces.jpeg” and let’s see how it looks:
It’s a working JPEG file containing the key we were asked for!
In this paper we learned that steganography is not only an enigmatic art strictly based on mathematics and complex algorithms: we saw how a simple image file can hide any other file, and how it can be found just by understanding file structures. I tried to introduce two common file structures, JPEG and PKZip, and we saw how a hexadecimal editor can help us investigate files.
Hope you learned something new!
- Target used : http://www.mediafire.com/download/g0xl9c6tmarb5gd/Steganography-UnZipMe.zip