# Audio Steganography

> Found a suspicious audio file. Seems like... something is hidden inside... Identify the character strings.

To begin the analysis, we opened the provided `.wav` file in [Sonic Visualiser](https://www.sonicvisualiser.org/), a popular audio visualization and analysis tool.

We focused on the Spectrogram view, which shows how the audio's frequency content changes over time. However, it revealed no visual clues or hidden character strings.

<figure><img src="https://1326575018-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWOW1TQuFBNIp3aDtWk3v%2Fuploads%2FyLB1iyGJrGnZjDBF2H4E%2Fimage.png?alt=media&#x26;token=b7f6f5b2-222b-46b1-955f-50fae970e1a8" alt=""><figcaption><p>Nothing interesting here!</p></figcaption></figure>

Suspecting that audio steganography had been used to hide information in the file, we researched the topic.

During our investigation, we came across a [blog post](https://sumit-arora.medium.com/audio-steganography-the-art-of-hiding-secrets-within-earshot-part-2-of-2-c76b1be719b3) discussing the LSB (Least Significant Bit) algorithm. This algorithm involves replacing the least significant bits of the audio samples with hidden data, making it a common technique for concealing information within digital audio.
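To make the idea concrete, here is a minimal sketch of the embedding side, written for illustration only. The helper name `embed_bits` and the sample carrier values are our own assumptions and do not come from the blog post or the challenge; the sketch simply overwrites the lowest bit of each carrier byte with one message bit, most significant bit first.

```python
def embed_bits(samples: bytes, message: bytes) -> bytearray:
    """Hypothetical helper: hide `message` in the LSBs of `samples`."""
    # Expand the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("carrier is too short for the message")
    out = bytearray(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the carrier byte, then set it to the message bit.
        out[i] = (out[i] & 0xFE) | bit
    return out


# Illustrative carrier bytes (assumed values, not real audio).
carrier = bytes([200, 201, 202, 203, 204, 205, 206, 207])
stego = embed_bits(carrier, b"A")  # 'A' is 0b01000001
print(list(stego))
```

Each carrier byte changes by at most 1, which is why the modification is hard to hear and did not stand out in the spectrogram.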

<figure><img src="https://1326575018-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWOW1TQuFBNIp3aDtWk3v%2Fuploads%2FanlzNGggs32eKC5B1J7z%2Fimage.png?alt=media&#x26;token=957957d9-4196-4b7f-b56c-650e65e600cb" alt=""><figcaption></figcaption></figure>

```python
# Use wave package (native to Python) for reading the received audio file
import wave
song = wave.open("song_embedded.wav", mode='rb')
# Read every frame and convert the raw audio data to a byte array
frame_bytes = bytearray(song.readframes(song.getnframes()))

# Extract the LSB of each byte
extracted = [frame_bytes[i] & 1 for i in range(len(frame_bytes))]
# Group the bits into chunks of 8 and convert each chunk to a character
string = "".join(chr(int("".join(map(str,extracted[i:i+8])),2)) for i in range(0,len(extracted),8))
# Cut off at the filler characters
decoded = string.split("###")[0]

# Print the extracted text
print("Successfully decoded: "+decoded)
song.close()
```

{% hint style="info" %}
An in-depth explanation of how the code works follows.
{% endhint %}

1. Importing the Required Library: The code starts by importing the `wave` module from the Python standard library, which reads and writes audio files in the WAV format.
2. Opening the Audio File: The `wave.open()` function is used to open the audio file named "song\_embedded.wav" in read mode ('rb'). The resulting object, `song`, represents the opened audio file.
3. Converting Audio to a Byte Array: The audio file is read and converted into a byte array using the `readframes()` method of the `song` object. The `getnframes()` method is used to determine the total number of frames in the audio file. The resulting byte array is stored in `frame_bytes`.
4. Extracting the LSB of Each Byte: The LSB (Least Significant Bit) of each byte in the `frame_bytes` array is extracted. This is done by performing a bitwise AND operation with 1 (`frame_bytes[i] & 1`) for each byte in the array. The extracted LSBs are stored as a list in `extracted`.
5. Converting the Bits to a String: The extracted LSBs are converted back into a string. The code iterates over the `extracted` list in groups of 8 bits (`extracted[i:i+8]`), joins each group into a binary string (`"".join(map(str,extracted[i:i+8]))`), parses that string as a base-2 integer with `int(..., 2)`, and converts the integer into a character with `chr()`. The characters are joined together to form the complete string, which is stored in `string`.
6. Removing Filler Characters: The extracted string may contain filler characters that were added during the encoding process. These filler characters are represented by the "###" sequence. The code splits the `string` at the first occurrence of "###" and keeps only the portion before it. The resulting decoded message is stored in `decoded`.
7. Printing the Extracted Text: Finally, the decoded message is printed as a success message using the `print()` function.
8. Closing the Audio File: The `song.close()` statement is used to close the audio file after decoding is complete.
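The steps above can be checked end to end without the challenge file. The following is a minimal sketch under stated assumptions: `make_stego_wav` and `decode_lsb` are hypothetical helper names of our own, the carrier is a silent 8-bit mono WAV built in memory, and `flag{demo}` is a placeholder message rather than the real flag. The decoder follows the same logic as the script above.

```python
import io
import wave


def make_stego_wav(message: str, n_frames: int = 4000) -> bytes:
    """Hypothetical helper: build a silent WAV and hide message + '###' in its LSBs."""
    payload = (message + "###").encode("ascii")
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    frames = bytearray([128] * n_frames)  # 128 is silence for unsigned 8-bit PCM
    if len(bits) > len(frames):
        raise ValueError("carrier is too short for the message")
    for i, bit in enumerate(bits):
        frames[i] = (frames[i] & 0xFE) | bit
    buf = io.BytesIO()
    with wave.open(buf, "wb") as out:
        out.setnchannels(1)    # mono
        out.setsampwidth(1)    # 1 byte per sample
        out.setframerate(8000)
        out.writeframes(bytes(frames))
    return buf.getvalue()


def decode_lsb(wav_bytes: bytes) -> str:
    """Same decoding steps as the script above, applied to in-memory WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as song:
        frame_bytes = song.readframes(song.getnframes())
    bits = [b & 1 for b in frame_bytes]
    chars = (chr(int("".join(map(str, bits[i:i + 8])), 2)) for i in range(0, len(bits), 8))
    return "".join(chars).split("###")[0]


print(decode_lsb(make_stego_wav("flag{demo}")))
```

Round-tripping a known message this way is a quick sanity check that the bit order and the `###` cut-off behave as described before trusting the output on an unknown file.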

Following the instructions provided with the code, we ran it against the suspicious `.wav` file. The script read the audio samples and extracted the character string hidden in their least significant bits.

As a result, the secret message was revealed, and we obtained the flag required for the challenge.

<figure><img src="https://1326575018-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWOW1TQuFBNIp3aDtWk3v%2Fuploads%2F54czzzJTYht2X0W7tDak%2Fimage.png?alt=media&#x26;token=17af3303-6086-49c6-a0c1-6234773341fa" alt=""><figcaption><p>Flag!!</p></figcaption></figure>
