Digital steganography: hiding data in images

Digital steganography is the practice of hiding the very existence of a communication. A secret message is concealed inside an ordinary digital document, without using a separate secret channel, so that only someone who knows that a message is present, and who also knows the decoding method, can retrieve it.

# OutGuess

Starting in 2012, the mysterious Cicada 3301 group published several steganography and cryptography puzzles. Some of them relied on a tool called "OutGuess", which can hide (and recover) arbitrary data in a JPEG image without visibly changing the image, and which is designed so that statistical tests do not reveal the presence of the hidden content. The tool is fairly complex and relies on a good amount of math (source code).

# Steganography in lossless images

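In this post, we present a much simpler idea for hiding arbitrary data in the pixels of a PNG image (or any other lossless image format). A lossless format is required because lossy compression would alter the low-order bits that carry the data. The resulting image is visually indistinguishable from the original.

The idea, in a nutshell, is to hide 3 bits in each RGB pixel of the image, one in the least significant bit of each color channel (the alpha channel, if present, is left unchanged).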

Example: let's say we want to hide the text "abc" in a completely white image, where every pixel is (255,255,255). We start by converting the text into bits (8 bits per character, most significant bit first):

    "abc" =  [0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1]

Then we iterate over the pixels and, for each channel value v in {R,G,B}, we replace its least significant bit with the next bit of our data, so that each pixel absorbs 3 bits:

    v = ((v>>1)<<1) | bit[i++]

We repeat this process until all the bits have been encoded. The 24 bits of "abc" fit in 8 pixels, which end up looking like this:

    [254, 255, 255], [254, 254, 254],
    [254, 255, 254], [255, 255, 254],
    [254, 254, 255], [254, 254, 255],
    [255, 254, 254], [254, 255, 255]
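
The encoding step can be sketched in a few lines of Python. This is only an illustration: the function names `bytes_to_bits` and `embed_bits` are made up for this example, and the image is assumed to be a plain list of (R, G, B) or (R, G, B, A) tuples rather than a real PNG file.

```python
def bytes_to_bits(data):
    """Return the bits of `data`, most significant bit of each byte first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def embed_bits(pixels, bits):
    """Return a copy of `pixels` with `bits` written into the LSBs of R, G, B."""
    if len(bits) > 3 * len(pixels):
        raise ValueError("not enough pixels for the payload")
    encoded = []
    i = 0
    for pixel in pixels:
        channels = list(pixel)
        for c in range(3):  # R, G, B only; an alpha value at index 3 is left alone
            if i < len(bits):
                # Same effect as ((v >> 1) << 1) | bit
                channels[c] = (channels[c] & ~1) | bits[i]
                i += 1
        encoded.append(tuple(channels))
    return encoded


bits = bytes_to_bits(b"abc")
white = [(255, 255, 255)] * 8  # eight completely white pixels
encoded = embed_bits(white, bits)
for pixel in encoded:
    print(pixel)
```

Running it on eight white pixels reproduces the eight pixel values listed above.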

Decoding reads the least significant bit of each channel value v in {R,G,B} of every pixel, in the same order, and reassembles the original message:

    bit[i++] = v & 0x01
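
As a rough sketch of the decoding side, again in Python and again on a plain list of pixel tuples, the bits can be collected and packed back into bytes as follows. The helper names `extract_bits` and `bits_to_bytes` are hypothetical, and the number of bits to read is passed in by hand here.

```python
def extract_bits(pixels, count):
    """Read `count` bits from the LSBs of the R, G, B values, in pixel order."""
    bits = []
    for pixel in pixels:
        for value in pixel[:3]:  # skip the alpha channel, if there is one
            if len(bits) == count:
                return bits
            bits.append(value & 1)
    if len(bits) < count:
        raise ValueError("image holds fewer bits than requested")
    return bits


def bits_to_bytes(bits):
    """Pack groups of 8 bits (most significant bit first) back into bytes."""
    out = bytearray()
    for start in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


encoded = [
    (254, 255, 255), (254, 254, 254),
    (254, 255, 254), (255, 255, 254),
    (254, 254, 255), (254, 254, 255),
    (255, 254, 254), (254, 255, 255),
]
message = bits_to_bytes(extract_bits(encoded, 24))
print(message)
```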

In practice, the decoder also needs to know the length of the data, so that it knows how many pixels to read:

    L = length("abc")

We can encode this length as a 32-bit value and store it in front of the actual data.
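
One possible way to build such a length-prefixed stream is shown below. It assumes that the length is counted in bytes and stored as an unsigned big-endian integer; the text above fixes neither choice, and `frame` and `unframe` are names invented for this sketch.

```python
import struct


def frame(payload):
    """Prefix `payload` with its length as a 32-bit unsigned big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def unframe(stream):
    """Read the 32-bit length, then return exactly that many payload bytes."""
    (length,) = struct.unpack(">I", stream[:4])
    return stream[4:4 + length]


framed = frame(b"abc")
print(framed.hex())
# Trailing bytes (for example, untouched pixel data) are ignored when decoding.
print(unframe(framed + b"\xff\xff"))
```

A decoder would first read 32 bits to recover the length, and only then read the `8 * length` bits that follow.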

# Optimizations and ideas

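An image of size (width, height) offers 3 * width * height least significant bits. After reserving 32 of them for the length header, this many bits remain for the data itself:

    max_bits_length = 3 * width * height - 32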

However, we can make better use of this space by compressing the data before encoding it.
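
To give a feel for the gain, here is a small sketch using Python's standard `zlib` module, which implements DEFLATE inside a zlib wrapper. The sample payload is made up and highly repetitive; short or already compressed data may not shrink at all.

```python
import zlib

payload = b"the quick brown fox jumps over the lazy dog\n" * 100

compressed = zlib.compress(payload, 9)  # 9 = best compression, slowest
restored = zlib.decompress(compressed)

print("bits needed before compression:", 8 * len(payload))
print("bits needed after compression: ", 8 * len(compressed))
```

The compressed bytes are what would be embedded into the pixels, and the decoder decompresses them after extraction.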

Additionally, if security is a concern, we can encrypt the hidden data with a password or with public-key cryptography.

Another idea is to store the data at a secret offset within the image, which makes the hidden message harder to detect.
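
As one possible interpretation of the secret-offset idea, the starting bit position could be derived from a shared password. The function `secret_offset` below is purely hypothetical, and hashing the password once with SHA-256 is an assumption made for brevity; a real tool would more likely use a proper key-derivation function.

```python
import hashlib


def secret_offset(password, total_bits, needed_bits):
    """Map a password to a starting bit position that leaves room for the payload."""
    if needed_bits > total_bits:
        raise ValueError("payload does not fit in the image")
    slots = total_bits - needed_bits + 1  # number of valid starting positions
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % slots


# A 100x100 image offers 3 bits per pixel; "abc" needs 32 + 24 bits with its header.
total_bits = 3 * 100 * 100
needed_bits = 32 + 24
offset = secret_offset("correct horse", total_bits, needed_bits)
print(offset)
```

Both sides compute the same offset from the password, so the decoder knows where to start reading without the offset being stored in the image.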

# OutGuess-PNG

outguess-png is a tool written in Perl, inspired by outguess, that hides arbitrary data in a PNG image and transparently compresses and decompresses the hidden data with DEFLATE: