import wave, audioop

w = wave.open('out.wav', 'r')
contents = w.readframes(w.getnframes())
# contents now holds all the samples in one long byte string.
samples_list = []
ctr = 0
while 1:
    try:
        samples_list.append(audioop.getsample(contents, 2, ctr))
        # "2" is the no. of bytes/sample; ctr is the sample no. to take.
        ctr += 1
    except audioop.error:
        # We jump to here when we run out of samples
        break

o = open('outfile.dat', 'w')
for x in range(len(samples_list)):
    o.write('%d %e\n' % (x, samples_list[x]))
o.close()
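Incidentally, the sample-by-sample loop above can be replaced with a single conversion using the standard library's array module. The sketch below builds a fake three-sample frame string by hand (rather than reading a real wav file) just to show the unpacking; the byte string is exactly what readframes() would return for those samples:

```python
import array
import struct

# Hypothetical frame data: the byte string readframes() would return
# for three 16-bit signed samples (WAV data is little-endian).
contents = struct.pack('<3h', 0, 1000, -1000)

# One signed short ('h') per sample, converted in a single call.
samples_list = array.array('h', contents)
# array.array assumes native byte order; on a big-endian machine you
# would call samples_list.byteswap() first.
print(list(samples_list))  # [0, 1000, -1000]
```

This avoids one Python-level function call per sample, which matters once you have a few hundred thousand samples to get through.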
Another, only slightly different, approach was to use sox first
(remember, I didn't know what sox could do when I tried this!) to turn
our wav file into a raw file, with signed samples and two bytes (1
word) per sample:
offt = open('outfile_fft.dat', 'w')
for x in range(len(out_fft) // 2):
    offt.write('%f %f\n' % (1.0 * x / wtime, abs(out_fft[x].real)))
Now, depending on your knowledge of FFTs (Fast Fourier Transforms), the
above might need some explanation. First of all, the FFT data produced
will be complex. However, we are looking at a purely real signal, so we
will discard the imaginary part. Also, only half the data is useful, as
the other half is just a mirror image (if you looked at the imaginary
part, it would be a mirror image multiplied by -1). You can think of
the index of an FFT component as the number of cycles occurring in the
time duration of the total sound extract. Thus the zeroth component
is the DC component, the index-1 component completes one cycle in the
time duration, the next one two cycles, and so on (the lowest
frequencies are on the left!).
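To make the index-to-frequency mapping concrete, here is a small sketch (using NumPy purely for convenience, with an assumed 44100 Hz sample rate and a made-up 1500 Hz test tone): the tone completes 1500 cycles in a one-second window, so the FFT peaks at index 1500, and dividing the index by the total duration (wtime in the code above) recovers the frequency in Hz:

```python
import numpy as np

rate = 44100                          # assumed sample rate in Hz
t = np.arange(rate) / float(rate)     # one second of time points
sig = np.sin(2 * np.pi * 1500.0 * t)  # a pure 1500 Hz tone
out_fft = np.fft.rfft(sig)            # rfft keeps only the useful half
wtime = len(sig) / float(rate)        # total duration, in seconds

peak = int(np.argmax(np.abs(out_fft)))
print(peak)          # 1500: the tone completes 1500 cycles in wtime
print(peak / wtime)  # 1500.0: index divided by duration gives Hz
```

For a signal that is not exactly one second long the same division still applies; the FFT bins are simply spaced 1/wtime Hz apart.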
[[viewing-the-results]]
Viewing the Results
^^^^^^^^^^^^^^^^^^^
To view the output, you need some graphing software;
http://www.gnuplot.info[Gnuplot] is one good option. Included below are
thumbnails (links to full size) of the link:out.mp3[original time-domain signal],
followed by the entire FFT data (as output to file by the
Python code above), and a smaller detail of the region where most of the
interest lies, between 1000 and 2000 Hz.
.Time domain plot of input signal
image:/timeplot.png["Time Plot", width=128, link="/timeplot.png"]
.Plot of overall FFT data
image:/fft_plot_a.png["Overall FFT data", width=128, link="/fft_plot_a.png"]
.Detailed plot of FFT data, range 1-2kHz
image:/fft_plot_b.png["Detailed FFT", width=128, link="/fft_plot_b.png"]
The Gnuplot code used to produce these figures is included below:
set style data lines
# Time domain signal first:
set xlabel 'Time (s)'; set ylabel 'Amplitude'
set title 'Signal Profile'
set terminal png
set output 'timeplot.png'
plot 'outfile.dat'
# Now look at FFT Data
set output 'fft_plot_a.png'
set xlabel 'Frequency (Hz)'
set ylabel 'Magnitude of Real FFT'
set title 'Absolute value of FFT'
plot 'outfile_fft.dat'
# Now get a closer look at where the action is...
set xrange [1000:2000]
set title 'Absolute value of FFT (Detail)'
set output 'fft_plot_b.png'
replot
For what it's worth, the signal analysed here is a 5-second (220500
samples, i.e. a 44100 Hz sample rate) recording of two notes from a
harmonica.
[[conclusion]]
Conclusion
^^^^^^^^^^
These have just been some quick and sketchy notes on how to analyse a
sound input and find out what frequency content it has. As you can see,
there is nothing very clever or difficult about the implementation,
but it is still (in my opinion!) interesting to do. If you have more
interesting sound inputs to work on (e.g. vibrations from a machine),
you could take the analysis further and use it as a tool for
locating the source of the vibration and noise.
There may be typos in the code snippets in this article. However, you
can download the link:/sigproc.tar.gz[archive of code] [Not up yet],
which I have run and used to produce the figures included above. Even
though that does not mean there are no mistakes, it does at least mean
you can reproduce my results. (Note that the sound sample is included
in mp3 format, to save my webspace and bandwidth, so you will have to
convert it to a wav file first. http://www.xmms.org[xmms] can do this
for you using its cdwriter plugin, which plays an mp3 into a wav file.)
[[future-work-links]]
Future Work, Links
~~~~~~~~~~~~~~~~~~
An obvious further development would be to do a wavelet transform on the
sound signal. Hopefully when I have a bit more spare time I'll put
something on this topic up here.
A useful link I
http://www.linux.ie/pipermail/ilug/2002-October/051187.html[saw on ILUG]
is http://www.cf.ac.uk/psych/CullingJ/pipewave.html[Pipewave], which is
a suite of command-line tools for this sort of work. AFAICS the
emphasis is on speech applications, but it should be of general
interest too.