I frequently see posts and headlines like “You won’t believe what Jupiter sounds like!” and “The sounds of a black hole are TERRIFYING!” While many of these are some interpretation of radio data and are not actual sound, you can sonify just about anything in many different creative ways. I decided to give it a go by writing a Python script to sonify my astronomy images.
The script is pretty simple. It scans the image from left to right, one pixel column at a time. Each pixel in a column is assigned a sinusoid whose frequency depends on the pixel's vertical coordinate, in this case using a Gaussian (bell-shaped) curve so that pixels near the center of the image map to about 1000Hz and pixels near the top and bottom map to around 100Hz. The loudness of each frequency is weighted by the pixel's luminance, passed through an exponential function to mimic the non-linearity of the human ear. All of the frequencies of a column are then summed together. Each column is dwelled upon for 1/10th of a second. To mitigate discontinuities when a pixel's value changes from one column to the next, each sinusoid's amplitude is ramped linearly from the previous column's weight to the current one. The images were downsampled from the originals to have a height of 1000 pixels, and the sound waves were sampled at 44,100Hz, but are presented in MP3 format.
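The mapping described above can be sketched in a few lines. This is a minimal illustration, not the full script: the helper names row_frequency, loudness_weight and column_tone are made up for this example, and the ramp between columns is left out.

```python
import math

import numpy as np

F_MIN, F_MAX = 100.0, 1000.0  # frequency range in Hz, as described above


def row_frequency(y, height):
    """Gaussian-shaped bump: F_MAX at the center row, close to F_MIN at the edges."""
    bump = math.exp(-(((y - height / 2) / (height / 4)) ** 2))
    return F_MIN + bump * (F_MAX - F_MIN)


def loudness_weight(luminance):
    """Exponential weight for a luminance in [0, 1]: 1 for black, 1000 for white."""
    return 1000.0 ** luminance


def column_tone(column, fs=44100, dt=0.1):
    """Sum one weighted sinusoid per pixel of a single image column."""
    t = np.arange(int(round(fs * dt))) / fs  # time axis for one column
    tone = np.zeros_like(t)
    height = len(column)
    for y, luminance in enumerate(column):
        f = row_frequency(y, height)
        tone += loudness_weight(luminance) * np.sin(2 * np.pi * f * t)
    return tone
```

For a 1000-pixel-tall image, row_frequency(500, 1000) gives exactly 1000Hz, while the top row comes out at roughly 116Hz rather than a flat 100Hz, because the bell curve never quite reaches zero at the edges.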
Great Nebula in Orion
Starless Western Veil
The script for Python 3.8 is given below. You will need PyAudio installed, along with NumPy, SciPy and scikit-image, which the script also imports. Run the script like so,
% python3.8 sonify.py image.jpg
and it will churn through the file. This will probably take a good while, depending on the size of the image. When it completes, it will play the audio and then save it to image.jpg.wav. There are a few variables at the top of the file that you can play around with to affect the output, such as the frequency range and time step. The c variable is a formula that maps the pixel value to a frequency weight; this could be toyed around with. And the freq_map function is open to lots of creativity.
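As one example of that creativity, here are two hypothetical replacements for freq_map. Both names are invented for this sketch, and both take the image height h as an explicit argument, whereas the script's freq_map reads it from a global. Like the original, they return a value between 0 and 1 that the script then scales into the f_min to f_max range.

```python
f_min, f_max = 100, 1000  # same names and values as in the script


def freq_map_linear(y, h):
    """Top row -> 1.0 (highest pitch), bottom row -> 0.0 (lowest pitch)."""
    return 1.0 - y / (h - 1)


def freq_map_log(y, h):
    """Same direction as freq_map_linear, but equal row steps give equal pitch ratios."""
    u = 1.0 - y / (h - 1)
    ratio = f_max / f_min
    # chosen so that f_min + result * (f_max - f_min) == f_min * ratio ** u
    return (ratio ** u - 1.0) / (ratio - 1.0)
```

With the linear version the image reads a bit like a spectrogram, with height as pitch. The logarithmic version spreads the rows evenly across musical intervals, so the middle row lands on the geometric mean of the two limits (about 316Hz here) instead of the arithmetic mean.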
import sys
import math

import numpy as np
import pyaudio
from skimage import io
from scipy.io.wavfile import write
from skimage.color import rgb2gray

# the image path is the first command-line argument
filename = sys.argv[1]
print("Opening image file", filename)
image = rgb2gray(io.imread(filename))

p = pyaudio.PyAudio()

# final scalar
volume = 0.5
# sampling rate
fs = 44100
# amount of time to linger on a pixel
dt = 0.1
# min and max frequencies
f_min = 100
f_max = 1000
# image height
h = len(image)
# image width
w = len(image[0])
# number of samples per pixel column
n = math.ceil(fs * dt)

print("image height", h)
print("image width", w)

stream = p.open(format=pyaudio.paFloat32, channels=1, rate=fs, output=True)

# output samples
samples = np.arange(0)
ys = range(h)
# stores previous weight values to use for discontinuity matching
last_samples = np.zeros(h)

# function to map y coordinate to a value in (0, 1], peaking at the center row
def freq_map(y):
    return math.exp(0 - ((y - (h / 2)) / (h / 4)) ** 2)

for x in range(w):
    print("%.2f" % (100 * x / w), "% complete, ", x, " of ", w)
    pix_samples = np.zeros(n)
    for y in ys:
        f = f_min + freq_map(y) * (f_max - f_min)
        # frequency weight
        c = 1000 ** image[y][x]
        # offset the sample index by x * n so the phase is continuous across columns
        f_samples = c * np.sin(2 * np.pi * (x * n + np.arange(n)) * f / fs)
        # try to scale samples using a line to avoid discontinuities
        if x != 0:
            first = last_samples[y] / c
            # linear scaling from the previous weight up (or down) to the current one
            m = (1 - first) / n
            b = first
            f_samples = f_samples * (m * np.arange(n) + b)
        last_samples[y] = c
        # sum all samples
        pix_samples = np.add(pix_samples, f_samples)
    if x == 0:
        samples = pix_samples
    else:
        samples = np.concatenate((samples, pix_samples))

# normalize output to a peak amplitude of 1
peak = np.max(np.abs(samples))
samples = (samples / peak).astype(np.float32)

# play the result, then save it next to the image
output_bytes = (volume * samples).tobytes()
stream.write(output_bytes)
stream.stop_stream()
stream.close()
p.terminate()

write(filename + '.wav', fs, samples)
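Most of the run time goes into the Python loop over every pixel. As a rough sketch of one way to speed that up, the per-row work can be handed to NumPy broadcasting so that only the loop over columns stays in Python. The function name synthesize and its signature are assumptions made for this example, and I have not benchmarked it against the script above; it is meant to compute the same ramped, weighted sum of sinusoids.

```python
import numpy as np


def synthesize(image, fs=44100, dt=0.1, f_min=100.0, f_max=1000.0):
    """Vectorized column scan; image is a 2-D array of luminance values in [0, 1]."""
    h, w = image.shape
    n = int(round(fs * dt))                       # samples per pixel column
    rows = np.arange(h)
    # one frequency per row, using the same bell-shaped mapping as freq_map
    freqs = f_min + np.exp(-(((rows - h / 2) / (h / 4)) ** 2)) * (f_max - f_min)
    weights = 1000.0 ** image                     # loudness weights, shape (h, w)
    ramp = np.arange(n) / n                       # runs from 0 to just under 1
    out = np.empty(w * n)
    prev = weights[:, 0]                          # first column: no ramp needed
    for x in range(w):
        cur = weights[:, x]
        # each row's amplitude glides linearly from the previous weight to the current one
        amp = prev[:, None] + (cur - prev)[:, None] * ramp[None, :]      # (h, n)
        # running sample index keeps every sinusoid phase-continuous across columns
        phase = 2 * np.pi * freqs[:, None] * (x * n + np.arange(n))[None, :] / fs
        out[x * n:(x + 1) * n] = (amp * np.sin(phase)).sum(axis=0)
        prev = cur
    return out / np.max(np.abs(out))              # normalize to a peak of 1
```

For a 1000-pixel-tall image at the default settings, the intermediate arrays hold 1000 by 4410 floats each (about 35MB), so memory use is modest. The returned array can be cast to float32 and passed to PyAudio or scipy.io.wavfile.write just like samples in the script.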