Audio Steganography
Found a suspicious audio file. Seems like... something is hidden inside... Identify the character strings.
To begin the analysis, we opened the provided .wav file using Sonic Visualiser, a popular audio visualization and analysis tool.
We focused on the Spectrogram feature to inspect the audio's frequency and time domain representation. However, we did not find any immediate visual clues or hidden character strings in this step.

Since the spectrogram showed nothing, we suspected that audio steganography had been used to hide information in the file, and we researched the topic.
During our investigation, we came across a blog post discussing the LSB (Least Significant Bit) algorithm. This algorithm involves replacing the least significant bits of the audio samples with hidden data, making it a common technique for concealing information within digital audio.
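To make the idea concrete, here is a minimal sketch of how LSB embedding might look on the encoding side. It is an illustration only, not the code used to create the challenge file: the function name embed_lsb, the "###" end marker default, and the stand-in carrier bytes are all assumptions made for this example.

```python
def embed_lsb(carrier: bytes, message: str, marker: str = "###") -> bytearray:
    """Hide message (plus an end marker) in the lowest bit of each carrier byte."""
    payload = (message + marker).encode("ascii")
    # Expand the payload into individual bits, most significant bit first
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit (mask 254), then set it to the payload bit
        stego[i] = (stego[i] & 254) | bit
    return stego


# Stand-in for real audio data: 80 arbitrary bytes (hypothetical carrier)
carrier = bytes(range(100, 180))
stego = embed_lsb(carrier, "flag")
```

Each carrier byte changes by at most 1, which is why the modification is inaudible and does not show up in a spectrogram.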

# Use wave package (native to Python) for reading the received audio file
import wave
song = wave.open("song_embedded.wav", mode='rb')
# Convert audio to byte array
frame_bytes = bytearray(song.readframes(song.getnframes()))
# Extract the LSB of each byte
extracted = [frame_bytes[i] & 1 for i in range(len(frame_bytes))]
# Convert the extracted bits back to a string, 8 bits per character
string = "".join(chr(int("".join(map(str,extracted[i:i+8])),2)) for i in range(0,len(extracted),8))
# Cut off at the filler characters
decoded = string.split("###")[0]
# Print the extracted text
print("Successfully decoded: "+decoded)
song.close()
Importing Required Libraries: The code starts by importing the necessary libraries. In this case, the wave module is imported to work with audio files in the WAV format.
Opening the Audio File: The wave.open() function is used to open the audio file named "song_embedded.wav" in read mode ('rb'). The resulting object, song, represents the opened audio file.
Converting Audio to a Byte Array: The audio file is read and converted into a byte array using the readframes() method of the song object. The getnframes() method is used to determine the total number of frames in the audio file. The resulting byte array is stored in frame_bytes.
Extracting the LSB of Each Byte: The LSB (Least Significant Bit) of each byte in the frame_bytes array is extracted. This is done by performing a bitwise AND operation with 1 (frame_bytes[i] & 1) for each byte in the array. The extracted LSBs are stored as a list in extracted.
Converting Byte Array to String: The extracted LSBs are converted back into a string representation. The code iterates over the extracted list in groups of 8 bits (extracted[i:i+8]), converts each group of 8 bits into a string representation ("".join(map(str,extracted[i:i+8]))), and then converts the resulting binary string into a Unicode character (chr(int(binary_string, 2))). The characters are joined together to form the complete string, which is stored in string.
Removing Filler Characters: The extracted string may contain filler characters that were added during the encoding process. These filler characters are represented by the "###" sequence. The code splits the string at the first occurrence of "###" and keeps only the portion before it. The resulting decoded message is stored in decoded.
Printing the Extracted Text: Finally, the decoded message is printed as a success message using the print() function.
Closing the Audio File: The song.close() statement is used to close the audio file after decoding is complete.
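The steps above can also be condensed into a small reusable function and checked against a synthetic file. The following is a sketch under stated assumptions: the function name extract_lsb_message is hypothetical, and the in-memory WAV (mono, 8-bit, 8000 Hz) carrying the text "flag" is a stand-in built for the demonstration, not the challenge file.

```python
import io
import wave


def extract_lsb_message(frame_bytes: bytes, marker: str = "###") -> str:
    """Rebuild text from the LSB of each byte, stopping at the end marker."""
    chars = []
    for i in range(0, len(frame_bytes) - 7, 8):
        value = 0
        for byte in frame_bytes[i:i + 8]:
            # Shift in one LSB at a time, most significant bit first
            value = (value << 1) | (byte & 1)
        chars.append(chr(value))
        if "".join(chars[-len(marker):]) == marker:
            return "".join(chars[:-len(marker)])
    return "".join(chars)  # no marker found: return everything decoded


# Build a tiny stand-in WAV in memory whose sample LSBs spell "flag###"
payload = "flag###"
bits = [(ord(c) >> s) & 1 for c in payload for s in range(7, -1, -1)]
frames = bytes(0x80 | b for b in bits) + bytes(16)

buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)   # mono
    out.setsampwidth(1)   # 8-bit samples, so one byte per frame
    out.setframerate(8000)
    out.writeframes(frames)

buffer.seek(0)
with wave.open(buffer, "rb") as song:
    recovered = extract_lsb_message(song.readframes(song.getnframes()))

print(recovered)
```

Stopping as soon as the marker appears avoids decoding the rest of the audio, and the with blocks close the file even if decoding raises an error.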
Following the instructions provided with the code, we executed it on the suspicious .wav file. The code analyzed the audio samples and extracted the hidden character strings embedded using the LSB algorithm.
As a result, the secret message was revealed, and we obtained the flag required for the challenge.
