Malware analysis

Simple malware obfuscation techniques

What is malware?

Malware is short for malicious software. Software, in simple terms, is a program written in any programming language. A program that is intentionally written to damage a computer or server, or to gain unauthorized access to a system, is therefore called malware.

Malware is a generic term that covers a variety of malicious programs, which can take many forms. These forms include viruses, Trojan horses, worms, adware, spyware, ransomware and so on.

What is obfuscation?

Obfuscation is a technique used to conceal the original code written by the programmer, making the executable code difficult to read and understand while preserving its functionality. Nearly every piece of malware uses obfuscation in one way or another.

Why obfuscation?

Usually, every computer or server has some software installed to detect malicious programs and prevent them from being executed on the local machine. This software comes in various forms, such as antivirus products, Windows Defender, ad blockers and so on, and it aims to detect malware and stop it before it runs.

Malware analysts can also analyze a sample, extract important information such as strings and the URLs the malware communicates with, and implement measures to prevent it from being executed. To hinder both automated detection and manual analysis, most malware is obfuscated by default.

Malware obfuscation techniques

Malware writers use many obfuscation techniques. Some of them are basic, while others are advanced.

Let’s have a look at some of the basic obfuscation techniques in wide use.

Exclusive OR (XOR)

XOR is one of the most commonly used techniques for obfuscating malware. It is very simple to implement and can easily hide a malicious payload from detection.

XOR is a bitwise operation on two operands, commonly written as ^ in programming languages. Its truth table looks like this:

A   B   A^B
0   0   0
0   1   1
1   0   1
1   1   0

The following steps are used to obfuscate and de-obfuscate code with the XOR technique:

Obfuscation

1) The attacker picks a one-byte value at random. This value acts as the key.

2) Because the key is a single byte, it can take any value from 0 to 255 (decimal).

3) The attacker obfuscates the original code by iterating through the data and XORing every byte with the key selected in step 1.

De-obfuscation

4) De-obfuscation requires the same key that was used for obfuscation.

5) Step 3 is repeated on the obfuscated data: XORing every byte with the key selected in step 1 a second time cancels the first XOR and restores the original data. At run time the malware does this itself, and an analyst who recovers the key can do the same.
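The steps above can be sketched in a few lines of Python. This is only an illustration: the URL, the key value 0x5A and the helper names `xor_bytes` and `brute_force_key` are assumptions made for this example, not taken from any real sample. The second helper shows why a single-byte key offers little protection, since an analyst can simply try all 256 values and look for a known marker.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key (0-255)."""
    if not 0 <= key <= 255:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in data)


def brute_force_key(blob: bytes, marker: bytes) -> list:
    """Return every single-byte key whose decoding of blob contains marker."""
    return [k for k in range(256) if marker in xor_bytes(blob, k)]


# Hypothetical string an analyst might be looking for.
plain = b"http://example.test/payload"
key = 0x5A  # arbitrary key chosen for the example

hidden = xor_bytes(plain, key)       # step 3: obfuscate
restored = xor_bytes(hidden, key)    # step 5: the same operation de-obfuscates
candidates = brute_force_key(hidden, b"http")

print(hidden)
print(restored)
print([hex(k) for k in candidates])
```

The same function serves for both directions because XORing twice with the same key returns the original byte.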

Base64 encoding

Base64 is another simple malware obfuscation technique. The base64 alphabet contains only 64 characters, hence the name. They are:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

An additional character, “=”, is not part of the 64-character alphabet and is used only for padding.

In base64 encoding, the encoding function takes three bytes of input and concatenates them into a 24-bit string. This string is broken into four chunks of six bits each, and each chunk is translated into one of the 64 base64 characters. When the input length is not a multiple of three, the final group is padded. Base64 is trivial to decode once it is recognized.
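To make the 24-bit grouping concrete, here is a minimal sketch that encodes one full three-byte group by hand and compares the result with the `base64` module from the Python standard library. The helper name `encode_group` and the sample strings are assumptions for illustration, and the sketch deliberately ignores padding.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes into four base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles only full 3-byte groups")
    # Join the three bytes into one 24-bit integer.
    bits = int.from_bytes(three_bytes, "big")
    # Slice it into four 6-bit indexes, most significant chunk first.
    indexes = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indexes)


manual = encode_group(b"Man")
library = base64.b64encode(b"Man").decode("ascii")
print(manual, library)  # both print TWFu

# Once a string is recognized as base64, decoding is a single library call.
encoded = base64.b64encode(b"http://example.test")  # hypothetical URL
decoded = base64.b64decode(encoded)
print(encoded, decoded)
```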

ROT13

ROT13 is another malware obfuscation technique. The name stands for “rotate by 13 places”.

It is a simple letter substitution cipher that replaces each letter with the letter 13 positions after it in the alphabet, wrapping around after “Z”. For example, “A” is replaced by “N”, “B” by “O”, “H” by “U” and so on, continuing the sequence. Thus, only letters are encoded, while digits, punctuation and other symbols are not affected.

To de-obfuscate, apply ROT13 to the obfuscated code once more. Because the alphabet has 26 letters, two rotations of 13 return every letter to where it started.
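The substitution can be sketched as follows. The function name `rot13` and the sample text are assumptions chosen for this example; the result is checked against the `rot_13` codec that ships with the Python standard library.

```python
import codecs


def rot13(text: str) -> str:
    """Rotate ASCII letters by 13 places and leave everything else untouched."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        else:
            out.append(ch)  # digits, spaces and symbols pass through
    return "".join(out)


sample = "Hello, World 123!"
hidden = rot13(sample)
restored = rot13(hidden)  # a second application undoes the first

print(hidden)    # Uryyb, Jbeyq 123!
print(restored)  # Hello, World 123!
print(codecs.encode(sample, "rot_13") == hidden)
```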

Dead code insertion

In this technique, ineffective and useless code is added to the original source code of the program. The result is a program that looks different from the original.

Dead code makes the control flow of the program more complex and difficult to understand, while the behavior of the original program does not change. Dead code is usually inserted into expressions and statements rather than into loops, to avoid performance issues.
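As a rough illustration, the two functions below compute the same value, but the second carries statements that never influence the result. Both function names and the checksum routine itself are hypothetical, chosen only to show the idea at source level. Real samples usually apply it to machine instructions.

```python
def checksum(data: bytes) -> int:
    """Original routine: sum of all bytes modulo 256."""
    total = 0
    for b in data:
        total = (total + b) % 256
    return total


def checksum_with_dead_code(data: bytes) -> int:
    """Same result, padded with statements that never affect the output."""
    junk = len(data) * 7 + 3        # computed but never used in the result
    total = 0
    if junk < 0:                    # never true: len() is non-negative
        total = 0xFF                # unreachable assignment
    for b in data:                  # the loop itself is left untouched
        total = (total + b) % 256
    total = (total ^ 0x55) ^ 0x55   # XORing twice with the same value is a no-op
    junk = junk + total - total     # another statement with no effect
    return total


for sample in (b"", b"abc", bytes(range(256))):
    print(checksum(sample), checksum_with_dead_code(sample))
```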

Instruction changes

In the instruction change technique, malware writers replace instructions in the original code with equivalent ones, so the appearance of the code changes while its behavior remains the same. This makes it difficult for reverse engineers to follow the instructions and understand the logic implemented in the code.
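The idea can be sketched at source level, although in practice it is applied to machine instructions (for example, zeroing a register with an XOR instead of a move). The function names `scale` and `scale_substituted` are made up for this example, and the substitutions shown are just three of many possible equivalences.

```python
def scale(x: int) -> int:
    """Straightforward version: (x + 1) * 2."""
    y = x + 1
    y = y * 2
    z = 0
    return y + z


def scale_substituted(x: int) -> int:
    """Same behavior expressed with equivalent but less obvious operations."""
    y = -(~x)      # ~x == -x - 1, so -(~x) == x + 1
    y = y << 1     # shifting left by one bit doubles the value
    z = x ^ x      # any value XORed with itself is zero
    return y - (-z)


for value in (-3, 0, 1, 41):
    print(value, scale(value), scale_substituted(value))
```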

Packers

Packers usually compress the original executable rather than its source code, reducing its size and hiding its contents from static inspection. Unlike standard zip files, packed executables automatically unpack themselves when executed.
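The compress-then-unpack cycle can be approximated with `zlib` from the Python standard library. This is a simplified sketch, not how any particular packer works: the names `pack` and `unpack_stub` and the sample payload are assumptions, and real packers operate on executable sections and attach the unpacking routine to the file itself.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Compress the payload, as a packer does at build time."""
    return zlib.compress(payload, 9)


def unpack_stub(packed: bytes) -> bytes:
    """Stand-in for the unpacking routine that runs first at execution time."""
    return zlib.decompress(packed)


# Hypothetical payload with a readable, repeated string.
payload = b"connect to http://example.test/update\n" * 20

packed = pack(payload)
restored = unpack_stub(packed)

print(len(payload), len(packed))   # the packed form is smaller
print(restored == payload)         # the stub recovers the original bytes
```

Readable strings in the payload are typically no longer visible in the packed bytes, which is why string-based inspection of a packed file tends to reveal little until it has been unpacked.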

Crypters

Crypters encrypt the original program with a cryptographic algorithm so that its code cannot be read directly during static reverse engineering. This technique also helps in evading antivirus products and network defenses. However, the payload has to be decrypted before it can run, so most crypters can be decoded and are not difficult to reverse.

Conclusion

As long as malware exists, so will malware obfuscation techniques. Attackers will keep coming up with new techniques to evade detection by anti-malware engines, so organizations are responsible for taking proactive measures to detect and prevent malware-based attacks by familiarizing themselves with the latest techniques that attackers use.