# Entropy Calculations

Entropy is a measure of the randomness in a system. The more random the system, the less predictable it is and the higher its entropy.

## Entropy and Cryptanalysis

Entropy is useful in a variety of fields, including cryptography, where measuring the randomness of data helps to differentiate strong encryption from weak or non-existent encryption.

One test of an encryption algorithm's effectiveness is whether the ciphertexts that it produces can be distinguished from a random binary string. A fully random binary string has maximal entropy, meaning that no information is exposed.

This is desirable in an encryption algorithm because it means that the ciphertext leaks no information about the corresponding plaintext. Therefore, calculating the entropy of data can help to differentiate between ciphertext created by a strong encryption algorithm and ciphertext created by potentially weak or broken encryption.
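As a rough illustration of that difference, the sketch below compares byte-level Shannon entropy (covered in the next section) for three blobs. Everything in it is an assumption made for the example: `byte_entropy` is a hypothetical helper, single-byte XOR stands in for "weak encryption", and `os.urandom` output stands in for the ciphertext of a strong cipher.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    # p * log2(1/p) is the same quantity as -p * log2(p).
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


# Illustrative plaintext: repetitive English text.
plaintext = b"The quick brown fox jumps over the lazy dog. " * 500

# Toy weak cipher (assumption): XOR every byte with one fixed key byte.
# This only relabels byte values, so the frequency distribution is unchanged.
weak_ciphertext = bytes(b ^ 0x5A for b in plaintext)

# Stand-in for strong ciphertext (assumption): bytes from the OS random source.
strong_like = os.urandom(len(plaintext))

for label, blob in [
    ("plaintext", plaintext),
    ("single-byte XOR", weak_ciphertext),
    ("random bytes", strong_like),
]:
    print(f"{label:<16} {byte_entropy(blob):.3f} bits/byte")
```

The XOR output should score exactly the same as the plaintext, while the random bytes should land close to the 8 bits per byte maximum.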

## Calculating Entropy

Entropy can be calculated in a number of different ways. In cryptography, the most commonly used type of entropy is Shannon entropy, which was introduced by Claude Shannon, the father of information theory.

Shannon entropy is calculated from the observed probability of each possible symbol. For a ciphertext, this means the relative frequency of zeros and ones (or, more commonly in practice, of each byte value). The more skewed that distribution, the lower the entropy and the more information that can be derived about the corresponding plaintext.
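Written out, Shannon entropy is H = -Σ p(x) · log2 p(x), summed over every symbol x that occurs, where p(x) is the observed frequency of x. A minimal sketch of that calculation follows; the function name `shannon_entropy` and the sample bit strings are illustrative choices, not taken from any particular tool.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_entropy(symbols: Iterable[Hashable]) -> float:
    """Shannon entropy in bits per symbol, from observed symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) is the same quantity as -p * log2(p).
    return sum((n / total) * math.log2(total / n) for n in counts.values())


balanced = "0000000011111111"   # zeros and ones equally frequent
skewed = "0000000000000001"     # almost entirely zeros
constant = "0" * 16             # a single symbol only

print(shannon_entropy(balanced))          # 1.0 (the maximum for two symbols)
print(round(shannon_entropy(skewed), 3))  # 0.337
print(shannon_entropy(constant))          # 0.0
```

The same function accepts a `bytes` object, in which case the symbols are byte values and the maximum is 8 bits per byte.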

Looking for high-entropy data makes it possible to identify data encrypted by a strong encryption algorithm, while ciphertext with unexpectedly low entropy suggests a weak or broken one. While it is possible to do this by hand, some tools, including radare2 and binwalk, offer built-in entropy calculators, which can help with identifying encrypted data within a particular file.
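The idea of scanning a file for high-entropy regions can be sketched in a few lines. This is not how radare2 or binwalk implement their entropy features; the helper names, the 1024-byte window, and the 7.5 bits per byte threshold are all assumptions chosen for illustration.

```python
import math
import os
from collections import Counter
from typing import Iterator, List, Tuple


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def scan_entropy(data: bytes, window: int = 1024) -> Iterator[Tuple[int, float]]:
    """Yield (offset, entropy) for consecutive, non-overlapping windows.

    The final window may be shorter than `window` bytes.
    """
    for offset in range(0, len(data), window):
        yield offset, byte_entropy(data[offset:offset + window])


def high_entropy_offsets(
    data: bytes, window: int = 1024, threshold: float = 7.5
) -> List[int]:
    """Offsets of windows whose entropy meets the (illustrative) threshold."""
    return [off for off, h in scan_entropy(data, window) if h >= threshold]


# Synthetic "file": text, then random bytes standing in for encrypted data, then text.
filler = (b"plain, low-entropy filler text. " * 200)[:4096]
sample = filler + os.urandom(4096) + filler

for offset in high_entropy_offsets(sample):
    print(f"high-entropy window at offset {offset}")
```

With this sample, only the windows covering the random region between offsets 4096 and 8191 should be flagged. For a real file, `sample` could be replaced with the result of `open(path, "rb").read()`.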

## Conclusion

Entropy calculations provide an easy shortcut for identifying encrypted data within a file. Encrypted data is high-entropy, making it easily distinguishable from more ordered data, such as text or code. On the other hand, poorly encrypted data has lower entropy, providing a hint that a particular ciphertext may be breakable.

### Sources

A Gentle Introduction to Information Entropy – https://machinelearningmastery.com/what-is-information-entropy/

Radare2 – https://rada.re/

Binwalk – https://tools.kali.org/forensics/binwalk