I frequently see posts and headlines like “You won’t believe what Jupiter sounds like!” and “The sounds of a black hole are TERRIFYING!” Most of these are interpretations of radio data rather than actual sound, but you can sonify just about anything in many different creative ways. I decided to give it a go by writing a Python script to sonify my astronomy images.
The script is pretty simple. It scans the image from left to right, one pixel column at a time. Each pixel in a column is mapped by its vertical coordinate to a sinusoid of a certain frequency, in this case using a Gaussian-shaped curve so that pixels near the center of the image land at about 1000Hz and pixels near the top and bottom at around 100Hz. The loudness of each frequency is weighted by the luminance of the pixel, passed through an exponential function to mimic the non-linearity of the human ear. All of the sinusoids of a column are then summed together. The script dwells on each column for 1/10th of a second, and I used a little trickery to try to mitigate discontinuities at column changes: the weight of each sinusoid is ramped linearly from its previous value to its new one. The images were downsampled from the originals to a height of 1000 pixels, and the sound waves were sampled at 44,100Hz, though they are presented here in MP3 format.
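To make the two mappings concrete, here is a minimal sketch that evaluates them for a few rows and luminance values. It assumes the defaults described above (100Hz to 1000Hz, a 1000-pixel-tall image); the function names `row_to_frequency` and `luminance_to_weight` are illustrative only and do not appear in the script.

```python
import math

# Values taken from the description above; names are illustrative.
F_MIN, F_MAX = 100.0, 1000.0   # frequency range in Hz
HEIGHT = 1000                  # image height in pixels


def row_to_frequency(y, h=HEIGHT):
    """Gaussian-shaped map: center row -> F_MAX, edge rows -> near F_MIN."""
    bump = math.exp(-(((y - h / 2) / (h / 4)) ** 2))
    return F_MIN + bump * (F_MAX - F_MIN)


def luminance_to_weight(lum):
    """Exponential loudness weight for a luminance in the range [0, 1]."""
    return 1000.0 ** lum


if __name__ == "__main__":
    for y in (0, 250, 500):
        print(f"row {y}: {row_to_frequency(y):.1f} Hz")
    for lum in (0.0, 0.5, 1.0):
        print(f"luminance {lum}: weight {luminance_to_weight(lum):.1f}")
```

With these numbers the top row comes out at roughly 116Hz and the center row at exactly 1000Hz, while a pure white pixel is weighted 1000 times more heavily than a pure black one.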
Double Cluster
Cygnus Wall
Great Nebula in Orion
Triangulum Galaxy
Starless Western Veil
The script for Python 3.8 is given below. You will need numpy, scikit-image, scipy, and PyAudio installed. Run the script like so:
% python3.8 sonify.py image.jpg
and it will churn through the file. This will probably take a good while, depending on the size of the image. When it completes, it will play the result and then save it to image.jpg.wav. There are a few variables at the top of the file that you can play around with to affect the output, such as the frequency range and time step. The c variable is a formula that maps the pixel value to a frequency weight, and it could be toyed with as well. And the freq_map function is open to lots of creativity.
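As one illustration of that flexibility, here is a hypothetical alternative mapping that is not part of the script below. It puts high pitches at the top of the image and low pitches at the bottom, spaced evenly on a logarithmic (pitch-like) scale. Like freq_map, it returns a fraction between 0 and 1 that the script would then scale into the f_min to f_max range. Unlike the script's version, it takes the height and frequency limits as parameters so that the sketch is self-contained.

```python
def freq_map_log(y, h, f_min=100.0, f_max=1000.0):
    """Hypothetical alternative to freq_map (assumes h > 1).

    Returns a fraction in [0, 1] such that
    f_min + fraction * (f_max - f_min) is spaced logarithmically,
    with the top row (y = 0) at f_max and the bottom row at f_min.
    """
    t = 1.0 - y / (h - 1)               # 1.0 at the top row, 0.0 at the bottom
    f = f_min * (f_max / f_min) ** t    # log-spaced frequency in Hz
    return (f - f_min) / (f_max - f_min)
```

To try it in the script, the call freq_map(y) would become something like freq_map_log(y, h, f_min, f_max).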
import sys
import math
import time
import numpy as np
import pyaudio
from skimage import io
from scipy.io.wavfile import write
from skimage.color import rgb2gray
print("Opening image file", sys.argv[1])
filename = sys.argv[1]
image = rgb2gray(io.imread(filename))
p = pyaudio.PyAudio()
# final scalar
volume = 0.5
# sampling rate
fs = 44100
# amount of time to linger on a pixel
dt = 0.1
# min and max frequencies
f_min = 100
f_max = 1000
# image height
h = len(image)
# image width
w = len(image[0])
print("image height", h)
print("image width", w)
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=True)
# output samples
samples = np.arange(0)
# row indices to sonify (every row, including the last one)
ys = range(h)
# stores previous weight values to use for discontinuity matching
last_samples = np.zeros(h)
# function to map y coordinate to frequency
def freq_map(y):
    # Gaussian-shaped bump: 1 at the center row, close to 0 at the top and bottom
    return math.exp(-((y - h / 2) / (h / 4)) ** 2)
for x in range(w):
    print("%.2f" % (100 * x / w), "% complete, ", x, " of ", w)
    # number of audio samples generated for each pixel column
    n = math.ceil(fs * dt)
    pix_samples = np.zeros(n)
    for y in ys:
        f = f_min + freq_map(y) * (f_max - f_min)
        # frequency weight
        c = 1000 ** image[y][x]
        # the time index continues from the previous column to keep the phase continuous
        f_samples = c * np.sin(2 * np.pi * (x * fs * dt + np.arange(n)) * f / fs)
        # try to scale samples using a line to avoid discontinuities
        if x != 0:
            first = last_samples[y] / c
            # linear scaling from the previous column's weight to this one's
            m = (1 - first) / n
            b = first
            f_samples = f_samples * (m * np.arange(n) + b)
        last_samples[y] = c
        # sum all samples
        pix_samples = np.add(pix_samples, f_samples)
    if x == 0:
        samples = pix_samples
    else:
        samples = np.concatenate((samples, pix_samples))
samples = samples.astype(np.float32)
# normalize output to the range [-1, 1] using the largest absolute sample
m = np.max(np.abs(samples))
samples = samples / m
output_bytes = (volume * samples).tobytes()
start_time = time.time()
stream.write(output_bytes)
stream.stop_stream()
stream.close()
p.terminate()
write(filename + '.wav', fs, samples)
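Much of the run time comes from looping over every row in pure Python. The following is a minimal sketch of how a single column could instead be synthesized in one shot with NumPy broadcasting, including the same linear ramp between the previous and current weights. The function synth_column and its parameters are hypothetical and are not part of the script above; it also builds an array of h by n values per column, so it trades memory for speed.

```python
import numpy as np


def synth_column(weights, prev_weights, freqs, col_index, fs=44100, dt=0.1):
    """Sum one sinusoid per row for a single pixel column (illustrative sketch).

    weights:      shape (h,), amplitude per row for this column
    prev_weights: shape (h,), amplitudes of the previous column
                  (pass `weights` itself for the first column: no ramp)
    freqs:        shape (h,), frequency in Hz per row
    col_index:    index of the column, used to keep the phase continuous
    """
    n = int(np.ceil(fs * dt))                  # samples per column
    i = np.arange(n)                           # local sample index
    t = col_index * n + i                      # global sample index
    phase = 2 * np.pi * np.outer(freqs, t) / fs        # shape (h, n)
    # amplitude ramps linearly from the previous weight towards the new one
    amp = prev_weights[:, None] + (weights - prev_weights)[:, None] * (i / n)
    return np.sum(amp * np.sin(phase), axis=0)         # shape (n,)
```

In the script, weights would correspond to 1000 ** image[:, x] and freqs to the per-row frequencies, which only need to be computed once because they do not depend on the column.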