Steganography: What your eyes don’t see

August 14, 2013 by Soufiane Tahiri

Steganography is the art of hiding information so that the very existence of a message goes undetected. It has been used throughout history in many forms and variations. The ancient Greeks shaved the heads of messengers and tattooed the secret message on their scalps; once the hair grew back, the message remained undetectable until the head was shaved again. Ancient civilizations used many other ingenious techniques. Before and around World War II, invisible inks were a common form of undetectable writing: an innocent letter could contain a very different message written between its lines.

It is a security through obscurity approach: in theory, no one apart from the sender and the recipient is supposed to suspect that the hidden message exists. Digital technology has given us new ways to apply steganography, among the most intriguing of which is hiding information in digital images.

Steganography and cryptography belong to the same "big family". Cryptography scrambles a message so that it cannot be read, whereas steganography hides it so that it does not attract attention, and this is the advantage steganography has over cryptography.

In this article I'll demonstrate how a hidden "key" is stored in an "innocent looking" picture. The file we will study is a problem taken from a CTF (Capture the Flag, a computer security competition).

Analyzing the target

The file that we know contains a hidden message is a JPG file that simply looks like this (see the references for a download link):

The original file name was "spamcarver", which is itself a hint. According to Wikipedia, file carving is the process of reassembling computer files from fragments in the absence of filesystem metadata. The carving process makes use of knowledge of common file structures, information contained in files, and heuristics regarding how filesystems fragment data. Fusing these three sources of information, a file carving system infers which fragments belong together. This is enough to push us to explore the JPEG file structure.

An in-depth look at the JPEG file format

Every image file that uses JPEG compression is commonly called a JPEG file and is stored in a variant of the JIF image format. Most images captured by recent devices such as digital cameras are stored in Exif format (Exchangeable image file format), a format standardized for metadata interchange. On the other hand, since the Exif standard does not allow color profiles, most image editing software stores JPEG in JFIF format and also includes the APP1 segment from the Exif file to carry the metadata in an almost-compliant way; the JFIF standard is interpreted somewhat flexibly.

Technically, every JPEG file, just like any other object, has a beginning or header, called "Start of Image", and a trailer called "End of Image": every JPEG file starts with the two bytes 0xFFD8 and ends with the two bytes 0xFFD9.

A JPEG file is made of segments that each start with the byte FF followed by a marker number, in this basic format: 0xFF + marker number (1 byte) + data size (2 bytes) + data (n bytes). Some markers describe the data that follows; after the Start of Scan (SOS) marker the compressed image stream begins, and it ends at the End of Image marker.
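The segment layout just described can be walked programmatically. The following is a minimal sketch, not a full JPEG parser: the function name `list_jpeg_segments` is made up for illustration, and it assumes that the 2-byte size is big endian and counts itself (as the JPEG specification defines it) and that no padding bytes sit between segments.

```python
import struct

def list_jpeg_segments(data: bytes):
    """Return (offset, marker, size) for each segment up to Start of Scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing Start of Image marker (FFD8)")
    segments = []
    pos = 2  # first segment starts right after FFD8
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos:#x}")
        marker = data[pos + 1]
        # Big-endian size; it includes its own 2 bytes but not the FF xx pair.
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((pos, marker, size))
        if marker == 0xDA:  # Start of Scan: compressed image data follows
            break
        pos += 2 + size
    return segments
```

Calling it on the bytes of a JPEG file returns one tuple per segment, which makes it easy to see where the headers stop and the image stream begins.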

Here is a basic JPEG file format structure:

[Figure: JPEG file format structure]

This seems to be enough to start studying our given image.

Problem analysis

Before firing up our hexadecimal editor, let's do some "routine" tasks, such as checking the picture with some standard tools to get additional information about the file.

On a Windows 7 machine, I installed GnuWin32 to get some Unix commands such as "file", which determines a file's type, and ran it against our target.

Installing GnuWin32

Start by downloading the automated GnuWin32 download tool (GetGnuWin32-0.6.3.exe; the link is in the references section), then extract it to the desired folder. The installation process is quite simple even though it is command-line based: use the Windows command prompt (CMD) to navigate to the extracted location and run "download.bat". This automatically downloads all the available GnuWin32 packages to the same directory; if you are prompted for anything, just accept the defaults.

Once the download has finished, stay in the command prompt and install all the downloaded packages by typing: C:\PathWhereYouDownloadedGetGnuWin32> install c:\gnuwin32. This installs all downloaded packages to the c:\gnuwin32 directory.

The last step is to add this new directory to the environment variables. Right-click "My Computer", choose "Properties", and click "Advanced system settings" (or something similar).

In the System Properties window, click the "Environment Variables" button. In the Environment Variables window (as shown below), select the "Path" variable in the System variables section and click the Edit button. Modify the path line by adding ";c:\gnuwin32" (without quotes, as shown below):

Let's now use the "file" command:

As expected, this is a valid JPEG file stored in JFIF format. We can get even more information using another tool called "ExifTool" (the download link is in the references section), which helps with handling and manipulating image metadata:

So far everything seems legitimate and the file appears to be a valid JPEG, which leaves us with the ultimate and most efficient method: doing it by hand, the old school way!

Let's open our image file in a hexadecimal editor and focus on its structure. We know that every JPEG file starts with 0xFFD8 and ends with 0xFFD9:

FFD8 is the Start of Image marker. FFE0 is an application marker used to insert digital camera configuration and a thumbnail image; it doesn't interest us here.

Let's try to find the trailer of our file (the End of Image marker), which is 0xFFD9, by searching for the value "FFD9" in the hexadecimal editor. In WinHex, click "Find Hex Values", type the hexadecimal value you want to find in the window that appears, then click "OK".

And guess what: two hits were found, which is not "very" normal. Click the first hit to go to its offset.

Things get more intriguing: this basically means that something has been appended to the JPEG file.

The JPEG file should end at FFD9, but immediately after the supposed end of the image an interesting 504B0304……. appears, followed by a lot of other binary data.

If you are used to reverse engineering you will easily recognize this as the header of a normal PKZip file; if not, a quick Google search will reveal it. Let's now study the binary data that follows the End of Image marker.
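The same search can be done without a hexadecimal editor. This is a small sketch under simple assumptions: the helper names `find_all` and `trailing_signature` are invented for this example, and the file is small enough to be read into memory as one bytes object.

```python
def find_all(data: bytes, pattern: bytes):
    """Return every offset at which `pattern` occurs in `data`."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets

def trailing_signature(data: bytes, width: int = 4):
    """Return the first `width` bytes that follow the first FFD9, or None."""
    eoi = data.find(b"\xff\xd9")
    if eoi == -1:
        return None
    return data[eoi + 2:eoi + 2 + width]
```

More than one offset from `find_all(data, b"\xff\xd9")` is only a hint, since embedded thumbnails can also carry an End of Image marker, but a trailing signature of `b"PK\x03\x04"` is a strong sign that a ZIP archive has been appended.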

A look at the PKZip file format

Each PKZip file (or simply ZIP file) mainly has this structure:

[Figure: PKZip file structure]

A ZIP file may contain many local file headers, many blocks of local file data and many data descriptors. There are, of course, lots of other technical details that I won't explain in this paper.

Each local file header is structured as follows:

Signature: the signature of the local file header is always 0x504b0304

Version: the PKZip version needed for archive extraction

Flags:

Bit 00: encrypted file

Bit 01: compression option

Bit 02: compression option

Bit 03: data descriptor

Bit 04: enhanced deflation

Bit 05: compressed patched data

Bit 06: strong encryption

Bit 07-10: unused

Bit 11: language encoding

Bit 12: reserved

Bit 13: mask header values

Bit 14-15: reserved

Compression method:

00: no compression

01: shrunk

02: reduced with compression factor 1

03: reduced with compression factor 2

04: reduced with compression factor 3

05: reduced with compression factor 4

06: imploded

07: reserved

08: deflated

09: enhanced deflated

10: PKWare DCL imploded

11: reserved

12: compressed using BZIP2

13: reserved

14: LZMA

15-17: reserved

18: compressed using IBM TERSE

19: IBM LZ77 z

98: PPMd version I, Rev 1

File modification time:

Bits 00-04: seconds divided by 2

Bits 05-10: minute

Bits 11-15: hour

File modification date:

Bits 00-04: day

Bits 05-08: month

Bits 09-15: years from 1980

CRC-32 checksum: CRC-32 algorithm with 'magic number' 0xdebb20e3 (little endian)

Compressed size: if the archive is in ZIP64 format, this field is 0xffffffff and the length is stored in the extra field

Uncompressed size: if the archive is in ZIP64 format, this field is 0xffffffff and the length is stored in the extra field

File name length: the length of the file name field below

Extra field length: the length of the extra field below

File name: the name of the file including an optional relative path. All slashes in the path should be forward slashes '/'.

Extra field: used to store additional information. The field consists of a sequence of header and data pairs, where the header has a 2 byte identifier and a 2 byte data size field.
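The header layout above maps naturally onto a fixed binary structure. The sketch below decodes it with Python's struct module; the field widths (a 30-byte fixed part, all values little endian) are taken from the PKWARE ZIP specification rather than from this article, and the name `parse_local_header` and the dictionary keys are chosen only for illustration.

```python
import struct

# Fixed 30-byte part of a local file header, little endian:
# signature, version, flags, method, time, date, CRC-32,
# compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")

def parse_local_header(data: bytes, offset: int = 0) -> dict:
    """Decode the local file header found at `offset` in `data`."""
    (signature, version, flags, method, mtime, mdate, crc32,
     compressed, uncompressed, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, offset)
    if signature != b"PK\x03\x04":
        raise ValueError("no local file header signature at this offset")
    name_start = offset + LOCAL_HEADER.size  # file name follows the fixed part
    return {
        "version": version,
        "flags": flags,
        "method": method,
        "mtime": mtime,
        "mdate": mdate,
        "crc32": crc32,
        "compressed": compressed,
        "uncompressed": uncompressed,
        "name": data[name_start:name_start + name_len],
        "extra_len": extra_len,
    }
```

The file data itself starts after the file name and the extra field, that is, at `offset + 30 + name length + extra field length`.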

In addition to this, every PKZip file has a signature that marks the End of Central Directory record, which is "0x504B0506". In other words, every ZIP file starts with "0x504B0304" and its closing record begins with "0x504B0506".

Let's get back to our JPEG file where we left it:

Here I have marked in different colors the bytes that need explanation, based on the table above:

Signature: 0x504B0304

Version: 0x14 = 20 decimal, meaning version 2.0

Flags: Bit 02 (compression option)

Compression method: 08 (deflated)

File modification time: 0x02F4 (little endian)

File modification date: 0x419F (little endian)

CRC-32 checksum: 0x9CD950D4 (little endian)

Compressed size: 0x2DE3 = 11747 bytes

Uncompressed size: 0xE299 = 58009 bytes

File name length: 0x8 bytes

Extra field length: 0x1C

File name: 0x2020202020202020 = eight space characters

Extra field: 0x5455 (extended timestamp), size: 5 bytes
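The packed time and date fields can be unpacked with a few shifts and masks, following the bit layout given in the header description. This is a sketch that assumes the two values quoted above (0x02F4 and 0x419F) are the already-decoded 16-bit integers; the function name `decode_dos_datetime` is made up for this example.

```python
def decode_dos_datetime(mdate: int, mtime: int):
    """Unpack ZIP date and time fields into (year, month, day, hour, minute, second)."""
    day = mdate & 0x1F             # bits 00-04
    month = (mdate >> 5) & 0x0F    # bits 05-08
    year = 1980 + (mdate >> 9)     # bits 09-15: years from 1980
    second = (mtime & 0x1F) * 2    # bits 00-04: seconds divided by 2
    minute = (mtime >> 5) & 0x3F   # bits 05-10
    hour = mtime >> 11             # bits 11-15
    return (year, month, day, hour, minute, second)

print(decode_dos_datetime(0x419F, 0x02F4))  # (2012, 12, 31, 0, 23, 40)
```

Under that assumption the embedded file was last modified on December 31, 2012 at 00:23:40.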

We now know enough to extract the ZIP file from the given JPEG file: we know its header, how it is structured, and that it contains a file named "        " (eight spaces) with no extension!

The easiest way to "dump" the ZIP embedded in the JPEG file is to copy all bytes from the ZIP header to its trailer, that is, from the first "504B0304" through the End of Central Directory record that begins with "504B0506", which is generally located at the very end of the file:

Using your hexadecimal editor, go to offset 0xCB8E to find the beginning of the ZIP file, then select all bytes up to offset 0xFA04, copy the data into a new file and save it as a ZIP file:

If you are using WinHex, right-click on the exact offset, then select "Edit -> Copy Block -> Into New File".

A "Save File as" window appears; give your file a name ending in .zip.
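The manual copy can also be scripted. The sketch below is one possible approach rather than the method used in this article: `carve_zip` is a hypothetical helper that assumes a single appended archive, takes the first local file header as the start, and uses the last End of Central Directory record (22 fixed bytes plus an optional comment, per the PKWARE specification) as the end.

```python
import struct

def carve_zip(data: bytes) -> bytes:
    """Return the bytes of a ZIP archive embedded somewhere in `data`."""
    start = data.find(b"PK\x03\x04")   # first local file header
    eocd = data.rfind(b"PK\x05\x06")   # last End of Central Directory record
    if start == -1 or eocd == -1 or eocd < start:
        raise ValueError("no embedded ZIP archive found")
    # The record is 22 bytes long; its last 2 bytes give the comment length.
    (comment_len,) = struct.unpack_from("<H", data, eocd + 20)
    return data[start:eocd + 22 + comment_len]
```

Writing the returned bytes to a file with a .zip extension gives the same result as the "Copy Block -> Into New File" step.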

Checking the dumped Zip file

The "file" command confirms that this is indeed a valid ZIP file:

C:\Users\Soufiane>file C:\Users\Soufiane\Desktop\stegano\dumpedPK.zip

C:\Users\Soufiane\Desktop\stegano\dumpedPK.zip; Zip archive data, at least v2.0 to extract
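At its simplest, this kind of identification is a lookup of the first bytes of the file against known signatures. The sketch below only illustrates the idea with the two signatures met in this article; the real "file" command relies on a far larger database, and the names `MAGIC` and `sniff` are invented here.

```python
# Signatures seen in this article, mapped to a short description.
MAGIC = {
    b"\xff\xd8\xff": "JPEG image data",
    b"PK\x03\x04": "Zip archive data",
}

def sniff(data: bytes) -> str:
    """Guess a file type from its leading bytes."""
    for magic, label in MAGIC.items():
        if data.startswith(magic):
            return label
    return "data"
```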

Now let's try to extract our compressed archive. You can use whatever software you want; I'll keep using the commands provided by GnuWin32:

C:\Users\Soufiane>unzip C:\Users\Soufiane\Desktop\stegano\dumpedPK.zip

Archive: C:/Users/Soufiane/Desktop/stegano/dumpedPK.zip

error: cannot create

Remember the name of the file inside the ZIP file? It is an eight-space name, and this kind of file name can in fact cause unzipping problems. So let's get back to the dumped ZIP file and use the hexadecimal editor to change the name to something more usual. Run a hexadecimal search (like the one we did before) for "2020202020202020", then replace it with whatever you like, as long as the new name is also eight bytes long so that the file name length stored in the headers stays valid.

According to the PKZip file structure, you should find two hits: one in the local file header at the beginning of the ZIP file and one in the central directory near its end:

Change both occurrences to the same new name, for example "NoSpaces".
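The same patch can be applied in code. This is a minimal sketch assuming a single-entry archive like the one here: `rename_entry` is a hypothetical helper that refuses to run unless the old name appears exactly twice, because a blind byte replacement could otherwise corrupt compressed data that happens to contain the same bytes.

```python
def rename_entry(archive: bytes, old: bytes, new: bytes) -> bytes:
    """Replace a stored file name in both ZIP headers, keeping its length."""
    if len(old) != len(new):
        raise ValueError("the new name must be as long as the old one")
    hits = archive.count(old)
    if hits != 2:  # local file header + central directory entry
        raise ValueError(f"expected 2 occurrences, found {hits}")
    return archive.replace(old, new)
```

For the dumped archive this would be called as `rename_entry(data, b" " * 8, b"NoSpaces")`, where `data` holds the bytes of dumpedPK.zip.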

Save and try to extract again:

C:\Users\Soufiane\Desktop\stegano>unzip C:\Users\Soufiane\Desktop\stegano\dumpedPK.zip

Archive: C:/Users/Soufiane/Desktop/stegano/dumpedPK.zip

inflating: NoSpaces

Yes! A file called "NoSpaces" has been created; let's see what kind of file it is:

In reality, the ZIP file contained another JPEG file. Rename the extracted file to "NoSpaces.jpeg" and let's see how it looks:

It's a working JPEG file containing the key we were asked for!

Conclusion

In this paper we learned that steganography is not only an enigmatic art strictly based on mathematics and complex algorithms: we saw how a simple image file can hide any other file just by handling and understanding file structures. I tried to introduce you to two common file structures, JPEG and PKZip, and we saw how a hexadecimal editor can help us investigate files.

Hope you learned something new!

Sources

Soufiane Tahiri

Soufiane Tahiri is an InfoSec Institute contributor and computer security researcher, specializing in reverse code engineering and software security. He is also the founder of www.itsecurity.ma and has practiced reversing for more than 8 years. Dynamic and very involved, Soufiane is ready to catch any serious opportunity to be part of a workgroup.

Contact Soufiane in whatever way works for you:

Email: soufianetahiri@gmail.com

Twitter: https://twitter.com/i7s3curi7y

LinkedIn: http://ma.linkedin.com/in/soufianetahiri

Website: http://www.itsecurity.ma