Digital audio experiments

by Don Cross - dcross@intersrv.com

I am fascinated by digital audio. I like to write computer programs (mostly C++) which synthesize new sounds or process existing sounds in interesting ways. I have done some DSP stuff using Fourier transforms, but no wavelets yet! I want to start taking piano lessons soon and see if I can merge musical composition with computer programming to create satisfying music. Speech recognition is another area which intrigues me.


Fourier Transforms

Take a look at this if you are interested in analyzing the frequency spectrum of a digital audio recording. Includes source code and an online tutorial.

Time-domain filtering techniques

This will help you if you are interested in doing clean-sounding filtering of digital audio. Includes theoretical discussion, a C++ class for implementing linear filters, and some JavaScript applets for helping you design filter coefficients.

Touch Tone WAV file generator

Here's an amusing little program for generating WAV files that contain American telephone touch tone noises.
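
For the curious, here is a minimal sketch of how such tones can be synthesized. It is not the source of the program linked above. It relies only on the standard touch tone (DTMF) scheme, in which each key is the sum of one low-group and one high-group sine wave. The function names dtmfFrequencies and dtmfTone, the 16-bit mono output, and the amplitude scaling are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Looks up the standard DTMF frequency pair (in Hz) for a key.
// Returns false if `key` is not one of the 16 DTMF keys.
bool dtmfFrequencies(char key, double& lowHz, double& highHz)
{
    static const char keys[4][4] = {
        {'1', '2', '3', 'A'},
        {'4', '5', '6', 'B'},
        {'7', '8', '9', 'C'},
        {'*', '0', '#', 'D'}
    };
    static const double rows[4] = { 697.0, 770.0, 852.0, 941.0 };
    static const double cols[4] = { 1209.0, 1336.0, 1477.0, 1633.0 };

    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            if (keys[r][c] == key) {
                lowHz = rows[r];
                highHz = cols[c];
                return true;
            }
        }
    }
    return false;
}

// Synthesizes one key press as 16-bit mono PCM samples.
// Returns an empty vector for an unknown key or a non-positive duration.
std::vector<std::int16_t> dtmfTone(char key, double seconds, int sampleRate)
{
    std::vector<std::int16_t> pcm;
    double lowHz = 0.0, highHz = 0.0;
    if (!dtmfFrequencies(key, lowHz, highHz)) return pcm;

    const int count = static_cast<int>(seconds * sampleRate);
    if (count <= 0) return pcm;

    const double twoPi = 6.283185307179586;
    pcm.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        const double t = static_cast<double>(i) / sampleRate;
        // Each tone gets half amplitude so the sum stays within [-1, +1].
        const double s = 0.5 * std::sin(twoPi * lowHz * t)
                       + 0.5 * std::sin(twoPi * highHz * t);
        pcm.push_back(static_cast<std::int16_t>(s * 32000.0));
    }
    return pcm;
}
```

For example, dtmfTone('5', 0.5, 8000) would produce half a second of the "5" key at 8000 samples per second, ready to be written out as the data of a WAV file.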

Reading and writing WAV files

Contains a description of the format of a WAV file and source code (Turbo Pascal and C++) for reading/writing WAV files.
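
As a rough illustration of what such code has to deal with, here is a sketch that builds the common 44-byte header of an uncompressed PCM WAV file in memory. It is not taken from the linked source code, and the function name makeWavHeader is hypothetical. It assumes the simplest layout (a "fmt " chunk followed directly by a "data" chunk); real files may contain additional chunks that a reader has to skip.

```cpp
#include <cstdint>
#include <vector>

// WAV files store all multi-byte integers in little-endian order.
static void putU16(std::vector<std::uint8_t>& out, std::uint16_t v)
{
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
}

static void putU32(std::vector<std::uint8_t>& out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Appends a four-character chunk tag such as "RIFF" or "data".
static void putTag(std::vector<std::uint8_t>& out, const char* tag)
{
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(tag[i]));
}

// Builds the header for `numFrames` sample frames of integer PCM audio.
// The raw sample data would follow these 44 bytes in the file.
std::vector<std::uint8_t> makeWavHeader(std::uint32_t sampleRate,
                                        std::uint16_t channels,
                                        std::uint16_t bitsPerSample,
                                        std::uint32_t numFrames)
{
    const std::uint16_t blockAlign =
        static_cast<std::uint16_t>(channels * (bitsPerSample / 8));
    const std::uint32_t byteRate = sampleRate * blockAlign;
    const std::uint32_t dataBytes = numFrames * blockAlign;

    std::vector<std::uint8_t> h;
    putTag(h, "RIFF");
    putU32(h, 36 + dataBytes);   // bytes remaining after this field
    putTag(h, "WAVE");

    putTag(h, "fmt ");
    putU32(h, 16);               // size of the fmt chunk body
    putU16(h, 1);                // format tag 1 = uncompressed PCM
    putU16(h, channels);
    putU32(h, sampleRate);
    putU32(h, byteRate);
    putU16(h, blockAlign);       // bytes per sample frame
    putU16(h, bitsPerSample);

    putTag(h, "data");
    putU32(h, dataBytes);
    return h;
}
```

For CD-style audio the call would be makeWavHeader(44100, 2, 16, numFrames), which gives 4 bytes per sample frame and 176400 bytes per second.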

Optimized calculation of successive sines and cosines

If you need to calculate billions and billions of sines and cosines in a computer program, check this out.
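
One well-known way to do this, which may or may not match what the linked page describes, is the angle-addition recurrence: compute sin and cos of the step angle once, then advance each pair with four multiplications instead of calling the library trig functions every time. The sketch below assumes equally spaced angles; the names SinCos and successiveSinCos are made up for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct SinCos {
    double s;   // sine of the angle
    double c;   // cosine of the angle
};

// Returns sin and cos of start, start+step, start+2*step, ...
// using sin(a+d) = sin(a)cos(d) + cos(a)sin(d)
// and   cos(a+d) = cos(a)cos(d) - sin(a)sin(d).
std::vector<SinCos> successiveSinCos(double start, double step, int count)
{
    std::vector<SinCos> out;
    if (count <= 0) return out;
    out.reserve(static_cast<std::size_t>(count));

    // The only library trig calls happen here, outside the loop.
    const double ds = std::sin(step);
    const double dc = std::cos(step);
    double s = std::sin(start);
    double c = std::cos(start);

    for (int i = 0; i < count; ++i) {
        out.push_back(SinCos{s, c});
        const double sNext = s * dc + c * ds;
        c = c * dc - s * ds;
        s = sNext;
    }
    return out;
}
```

Rounding error accumulates slowly as the recurrence runs, so for very long runs it can help to recompute s and c from the library functions every so often.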

Digital Audio Journal

This is an online journal of my digital audio experiments which I will update every time I do something with digital audio/DSP that I think is noteworthy or interesting. This page also contains some more sounds for you to listen to! Here's a quick index if that's all you're interested in:

Sounds I have created on my computer

comp3.wav (4 MB) - This is a little CD-quality piece of synthetic music I made, using instrument voices I created myself. The voices were made using the Mesh program described below. It is almost 24 seconds long, which is why the file is so large.

bwang.wav (84K) - This is a 22 kHz stereo sound which was synthesized by a computer program I wrote called "Mesh". This program simulates a grid of point masses connected by a mesh of springs surrounded by a rigid frame. The audio samples come from displacements of a particular point mass from its equilibrium position in the mesh; the left channel is the horizontal displacement and the right channel is the vertical displacement. In this particular recording, the grid was 32 by 32, for a total of 1024 point masses. This recording is only 0.975 seconds long, but it took about 40 minutes of CPU time on a 100 MHz Pentium machine to calculate. After Mesh generated this sound, I cleaned it up a little bit by doing a 5 millisecond fade in and a 30 millisecond fade out.
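
The general idea of such a simulation can be sketched as follows. This is not the Mesh source code. The class name SpringMesh, the unit grid spacing, the four-neighbor Hooke's-law springs, the rest length parameter, and the semi-implicit Euler time step are all assumptions chosen to keep the example short.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { double x; double y; };

// n-by-n grid of point masses at unit spacing, each joined to its four
// neighbors by a spring. A border of immovable nodes around the grid
// plays the role of the rigid frame.
class SpringMesh {
public:
    SpringMesh(int n, double springK, double mass, double restLength)
        : n_(n), k_(springK), m_(mass), rest_(restLength),
          disp_(static_cast<std::size_t>((n + 2) * (n + 2)), Vec2{0.0, 0.0}),
          vel_(disp_.size(), Vec2{0.0, 0.0}) {}

    // Applies an initial impulse (a velocity change) to one mass.
    // row and col are zero-based indices into the movable grid.
    void kick(int row, int col, double vx, double vy) {
        Vec2& v = vel_[at(row + 1, col + 1)];
        v.x += vx;
        v.y += vy;
    }

    // Displacement of one mass from its equilibrium position.
    Vec2 displacement(int row, int col) const {
        return disp_[at(row + 1, col + 1)];
    }

    // Advances the simulation by dt: update velocities from the spring
    // forces first, then update positions from the new velocities.
    void step(double dt) {
        static const int dr[4] = { -1, 1, 0, 0 };
        static const int dc[4] = { 0, 0, -1, 1 };
        for (int r = 1; r <= n_; ++r) {
            for (int c = 1; c <= n_; ++c) {
                const Vec2 d = disp_[at(r, c)];
                double fx = 0.0, fy = 0.0;
                for (int i = 0; i < 4; ++i) {
                    const Vec2 nd = disp_[at(r + dr[i], c + dc[i])];
                    // Vector from this mass to its neighbor.
                    const double ex = dc[i] + nd.x - d.x;
                    const double ey = dr[i] + nd.y - d.y;
                    const double len = std::sqrt(ex * ex + ey * ey);
                    if (len > 0.0) {
                        const double f = k_ * (len - rest_) / len;
                        fx += f * ex;
                        fy += f * ey;
                    }
                }
                vel_[at(r, c)].x += fx / m_ * dt;
                vel_[at(r, c)].y += fy / m_ * dt;
            }
        }
        for (int r = 1; r <= n_; ++r) {
            for (int c = 1; c <= n_; ++c) {
                disp_[at(r, c)].x += vel_[at(r, c)].x * dt;
                disp_[at(r, c)].y += vel_[at(r, c)].y * dt;
            }
        }
    }

private:
    // Index into the padded (n+2)-by-(n+2) arrays, frame included.
    std::size_t at(int row, int col) const {
        return static_cast<std::size_t>(row * (n_ + 2) + col);
    }

    int n_;
    double k_, m_, rest_;
    std::vector<Vec2> disp_, vel_;
};
```

To produce audio from a sketch like this, one would kick a mass once, call step with dt equal to one sample period (or a fraction of it, for stability), and record displacement(row, col).x as the left channel and .y as the right channel. Every step visits every mass, so a 32 by 32 grid does a lot of arithmetic for each output sample.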

gong.wav (179K) - Another sound created by the Mesh program. It sounds quite a bit different from bwang.wav because I used fewer point masses, a higher spring constant, and applied the initial impulse to a point mass closer to the center of the mesh.


qwibnar5.wav (29K) - This is a recording of my voice fed through a program I wrote called modulate. This program is one of the mathematically simplest audio processing programs I have written, and surprisingly satisfying in its results. It just multiplies successive samples in the input file by a cosine wave of a frequency specified by the user in Hz. See my Fast Trig Page to learn how to efficiently calculate sinusoidal waves.

I got the idea for the modulate program from my electrical engineering studies back in college, where I learned that multiplying a function of time by a sinusoidal function has the effect of shifting its frequency spectrum. More accurately, the frequency spectrum of the output is the average of the input's frequency spectrum shifted down and the same spectrum shifted up. Both shifts are equal to the frequency of the cosine wave you multiply the input by. So my voice is made to sound like two people speaking in unison, one with a lower voice and one with a higher voice. The effect of the modulate program is much like certain undesired effects in mistuned shortwave radio signals.
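
The core of the operation described above fits in a few lines. This is a sketch rather than the actual modulate source: it works on samples already loaded into memory as floating-point values, and the function signature is an assumption. It also calls the library cosine for every sample for clarity, which is the simple approach rather than the fast one.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Multiplies each input sample by a cosine wave of frequency carrierHz.
// By the identity cos(a)cos(b) = (cos(a-b) + cos(a+b)) / 2, every
// frequency f in the input comes out as two half-amplitude components,
// one at f - carrierHz and one at f + carrierHz.
std::vector<double> modulate(const std::vector<double>& input,
                             double carrierHz,
                             double sampleRate)
{
    const double twoPi = 6.283185307179586;
    std::vector<double> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        const double t = static_cast<double>(i) / sampleRate;
        output[i] = input[i] * std::cos(twoPi * carrierHz * t);
    }
    return output;
}
```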


disturb.wav (126K) - A mix of several recordings I made at home, with a little processing (sections were slowed down). It includes peculiar vocals, a slinky, a cat purring, and a wooden ruler.

disturb1.wav (506K) - This is the result of feeding the above sound (disturb.wav) through my program called reverb. Note that this program creates an interesting stereo effect out of the monaural input. The reverb program works by feeding the input samples through the complex-valued feedback function y(t) = x(t) + z*y(t-dt), where y(t) is the output at time t, x(t) is the input at time t, z is a complex number, and dt is an arbitrary positive real number. (Actually, it's not completely arbitrary; it must correspond to an integer number of PCM samples.)
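
The feedback formula can be sketched in discrete form as y[n] = x[n] + z*y[n-D], where D is the delay in samples. The code below is an illustration, not the reverb source. In particular, sending the real part of y to the left channel and the imaginary part to the right is only a guess at how a stereo signal could come out of a mono input, and the names StereoSample and complexFeedback are hypothetical. The magnitude of z should be less than 1, or the echoes grow instead of dying away.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

struct StereoSample {
    double left;
    double right;
};

// Runs mono input through y[n] = x[n] + z * y[n - delaySamples].
// A delay of zero is treated as "no feedback" to avoid a degenerate loop.
std::vector<StereoSample> complexFeedback(const std::vector<double>& mono,
                                          std::complex<double> z,
                                          std::size_t delaySamples)
{
    std::vector<std::complex<double> > y(mono.size());
    std::vector<StereoSample> out(mono.size());

    for (std::size_t n = 0; n < mono.size(); ++n) {
        y[n] = std::complex<double>(mono[n], 0.0);
        if (delaySamples > 0 && n >= delaySamples)
            y[n] += z * y[n - delaySamples];

        // Assumed channel mapping: real part left, imaginary part right.
        out[n] = StereoSample{ y[n].real(), y[n].imag() };
    }
    return out;
}
```

Each pass through the feedback loop multiplies the echo by z, which shrinks it by the magnitude of z and rotates it in the complex plane by the angle of z. Under the mapping assumed here, that rotation is what moves successive echoes between the two channels.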

disturb2.wav (506K) - This is the sound disturb.wav fed through both reverb and another program I wrote called resonate. The resonate program is interesting because it is a mathematical simulation of many strings vibrating in response to the input sound. In this sound file, I told resonate to simulate 128 strings ranging geometrically in frequency from 110 Hz to 5500 Hz.
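
"Ranging geometrically" means each string's frequency is a constant multiple of the previous one, like the notes of an evenly tempered scale. Here is a small sketch of how such a set of frequencies can be computed. It assumes both endpoints are included, which the description above does not actually say, and the function name geometricFrequencies is made up for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns `count` frequencies from lowHz to highHz (both included),
// spaced so that the ratio between neighbors is constant.
std::vector<double> geometricFrequencies(int count, double lowHz, double highHz)
{
    std::vector<double> freqs;
    if (count <= 0) return freqs;
    if (count == 1) {
        freqs.push_back(lowHz);
        return freqs;
    }

    freqs.reserve(static_cast<std::size_t>(count));
    const double span = highHz / lowHz;
    for (int k = 0; k < count; ++k) {
        const double fraction = static_cast<double>(k) / (count - 1);
        freqs.push_back(lowHz * std::pow(span, fraction));
    }
    return freqs;
}
```

With 128 strings from 110 Hz to 5500 Hz, the ratio between neighboring strings under this assumption is 50^(1/127), or a little over 3 percent per string.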

qwiburp.wav (5.0 MB) - This is CD-equivalent audio (44.1 kHz, 16-bit stereo). It is a work of silliness using various samples I obtained from the Internet, mixed and processed together into what my wife Maria calls an "interesting collage".




[Don Cross home page]
