Audio Basics

The Wolfram Language provides built-in support for both programmatic and interactive audio processing, integrated with the rest of its mathematical and algorithmic capabilities. You can create and import audio, manipulate it with built-in functions, apply linear and nonlinear filters, and visualize it as a waveform, power spectrum, or spectrogram.
Audio Creation and Representation
An audio object can be created from numerical arrays, files, and URLs.
Audio[data]              in-core audio with samples given by data
Audio[file]              out-of-core audio from a file
Audio[url]               out-of-core audio from a URL
Import[file]             import audio from a file
AudioGenerator[model]    generate various oscillators and noises
Audio creation functions.
The simplest way to create an audio object is to wrap the Audio constructor around a vector of real values ranging from -1 to 1.
Here is a one-channel in-core audio object created from a vector of numbers:
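As a minimal sketch of this step, the following builds roughly one second of a sine tone and wraps it in Audio. The frequency, amplitude, and variable names are illustrative assumptions, and the sampling step assumes the default sample rate of 44100 Hz.

```wolfram
(* About one second of a 440 Hz tone; the step 1/44100 assumes the default sample rate *)
data = Table[0.5 Sin[2. Pi 440 t], {t, 0, 1, 1/44100}];

(* Wrapping the real-valued vector in Audio gives a one-channel in-core audio object *)
a = Audio[data]
```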
Another way is to obtain an out-of-core audio object from a file on the local file system or any accessible remote location. Out-of-core audio objects do not store samples in memory.
This creates an out-of-core audio object from the Wolfram Language documentation directory ExampleData and displays the number of bytes used internally by the object:
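A sketch of such a call might look as follows. The file name "ExampleData/rule30.wav" is an assumption about what ships in the documentation directory; substitute any audio file available on your system.

```wolfram
(* Assumed example file; replace the path if it is not present in your installation *)
outOfCore = Audio["ExampleData/rule30.wav"];

(* The object references the file instead of holding the samples, so it stays small *)
ByteCount[outOfCore]
```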
You can use Import to create an in-core audio object from a file or URL.
This creates an in-core audio object and displays the number of bytes used internally by the object:
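For comparison, a sketch of the in-core route, again assuming the example file "ExampleData/rule30.wav" is available. The "Audio" element is requested explicitly so that the result is an Audio object regardless of the default import element.

```wolfram
(* Import reads the samples into memory, producing an in-core audio object *)
inCore = Import["ExampleData/rule30.wav", "Audio"];

(* The byte count now includes the stored samples *)
ByteCount[inCore]
```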
The "Audio" collection from ExampleData contains sample audio clips.
This lists available audio clips:
This lists the properties of a given audio object:
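The two listing steps above can be sketched as follows. The clip is chosen programmatically as the first entry of the collection, so no particular clip name is assumed; the variable names are illustrative.

```wolfram
(* All entries of the "Audio" collection; show at most the first five *)
clips = ExampleData["Audio"];
Take[clips, UpTo[5]]

(* Properties available for the first clip in the collection *)
name = First[clips];
ExampleData[name, "Properties"]
```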
All types of Sound objects can be converted to Audio.
Here is an example that converts MIDI notes to audio:
See the waveform of the generated audio:
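A small sketch of this conversion, using a three-note sequence chosen purely for illustration:

```wolfram
(* A Sound object made of three MIDI notes, each half a second long *)
notes = Sound[{SoundNote["C", 0.5], SoundNote["E", 0.5], SoundNote["G", 0.5]}];

(* Convert the note-based sound to sampled audio and plot its waveform *)
fromNotes = Audio[notes];
AudioPlot[fromNotes]
```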
This converts sound represented by SampledSoundFunction to audio:
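One way to obtain a SampledSoundFunction-based sound is Play, so a sketch of this conversion could be written as follows (the tone itself is an arbitrary choice):

```wolfram
(* Play returns a Sound object whose content is a SampledSoundFunction *)
played = Play[Sin[2 Pi 440 t], {t, 0, 1}];

(* Convert it to an audio object *)
Audio[played]
```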
Various oscillators and noises can be created using AudioGenerator.
This creates sinusoidal audio:
This creates one second of pink noise:
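Both generators can be sketched in a couple of lines. The 440 Hz frequency and the durations are illustrative choices.

```wolfram
(* One second of a 440 Hz sinusoid *)
sine = AudioGenerator[{"Sin", 440}, 1]

(* One second of pink noise *)
pink = AudioGenerator["Pink", 1]
```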
Audio Properties
Useful properties of an audio object can be obtained by calling the following functions.
AudioLength[audio]        give the number of samples of an audio object
Duration[audio]           give the duration of audio in seconds
AudioChannels[audio]      give the number of channels present in the data for audio
AudioSampleRate[audio]    give the sample rate associated with audio
AudioType[audio]          give the type of values used for each sample element in audio
Audio properties.
This returns audio length, duration, number of channels, sample rate, and data type:
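A sketch of such a query on a generated two-second tone (the test signal is an assumption made to keep the example self-contained; the exact form of each returned value may differ between versions):

```wolfram
a = AudioGenerator[{"Sin", 440}, 2];

(* Number of samples, duration, channel count, sample rate, and sample type *)
{AudioLength[a], Duration[a], AudioChannels[a], AudioSampleRate[a], AudioType[a]}
```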
The array of sample values can be extracted using the function AudioData. By default, the function returns real values, but you can ask for a specific type using the optional "type" argument.
This returns a fragment of the in-core audio as a vector of real values scaled to the range -1 to 1:
Here is the same fragment extracted from the out-of-core audio as a vector of integers in the range -128 to 127:
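The two extractions can be sketched on a generated tone, used here instead of the example file so the code does not depend on a particular path. Flatten is applied so the result is a flat vector whether or not the samples come back grouped by channel.

```wolfram
a = AudioGenerator[{"Sin", 440}, 1];

(* First eight samples as real values *)
Take[Flatten[AudioData[a]], 8]

(* The same samples converted to signed 8-bit integers *)
Take[Flatten[AudioData[a, "SignedInteger8"]], 8]
```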
In the case of multichannel audio, the raw sample data is represented by a list of channel values (2D array).
This creates an out-of-core multichannel audio object and extracts a fragment of that audio object as a list of channel values:
A multichannel audio object can be split into a list of single-channel audio objects. Conversely, a multichannel audio object can be created from any number of single-channel audio objects.
This splits the example stereo audio object into two mono audio objects:
This creates the stereo audio object from two mono audio objects:
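A sketch of the multichannel operations above, assuming AudioChannelCombine and AudioChannelSeparate as the combining and splitting functions. The stereo signal is built from two generated tones and is in-core, not the out-of-core example file.

```wolfram
left = AudioGenerator[{"Sin", 440}, 1];
right = AudioGenerator[{"Sin", 660}, 1];

(* Combine two mono objects into one stereo object *)
stereo = AudioChannelCombine[{left, right}];
AudioChannels[stereo]

(* The sample data is a 2D array: one row of samples per channel *)
Dimensions[AudioData[stereo]]

(* Split the stereo object back into two mono objects *)
{l, r} = AudioChannelSeparate[stereo];
```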
Audio Visualization
Audio can be visualized in a variety of ways.
AudioPlot[audio]      plot the waveform of audio
Periodogram[audio]    plot the squared magnitude of the discrete Fourier transform (power spectrum) of audio
Spectrogram[audio]    plot the spectrogram of audio
Audio visualization functions.
This plots the waveform of an audio signal:
This highlights silent parts of an audio signal:
This plots the power spectrum of an audio signal:
This plots the spectrogram of an audio signal using partitions of length 512 and the offset 64:
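The three main plots can be sketched on a short noise signal, which is an arbitrary test input chosen for this example:

```wolfram
a = AudioGenerator["Pink", 2];

(* Waveform *)
AudioPlot[a]

(* Power spectrum *)
Periodogram[a]

(* Spectrogram with partitions of length 512 and offset 64 *)
Spectrogram[a, 512, 64]
```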
Basic Operations
Many useful audio processing tasks require nothing more than simple arithmetic operations between two audio objects or an audio object and a constant. For example, you can change volume by multiplying an audio object by a constant factor or by adding (subtracting) a constant to (from) an audio object. For this purpose, all Wolfram Language operators and functions with attributes NumericFunction or Listable are overloaded to work with audio objects.
audio1+audio2       add two audio objects
n*audio             multiply a scalar by an audio object
Mean[{a1,a2,…}]     compute the mean of a list of audio objects
Some arithmetical and statistical operations on audio objects.
This creates a linear combination of two audio objects:
This computes the mean of a list of audio objects:
This computes statistics of an audio object:
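A sketch of these operations on two generated tones. The weights 0.6 and 0.4 are arbitrary, and AudioMeasurements is used here as one assumed way to obtain summary statistics; the original example may have used a different function.

```wolfram
a1 = AudioGenerator[{"Sin", 440}, 1];
a2 = AudioGenerator[{"Sin", 660}, 1];

(* Linear combination of two audio objects *)
mix = 0.6 a1 + 0.4 a2

(* Mean of a list of audio objects *)
avg = Mean[{a1, a2}]

(* Summary statistics of the mixed signal *)
AudioMeasurements[mix, {"Min", "Max", "Mean"}]
```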
A system option called "IndeterminateValue" specifies the replacement for values that can result from an arithmetical operation but cannot be stored in an audio object. These include ComplexInfinity and Indeterminate.
This displays the current value of the "IndeterminateValue" option and changes it to 2:
This shows how "IndeterminateValue" is used to replace unacceptable values in the result of an arithmetical operation:
Complex numbers in the result of an arithmetical operation are replaced by the real parts of these values:
This changes "IndeterminateValue" back to its default value of zero:
Consider the audio manipulation operations that change the audio duration by trimming, deleting, or padding. These operations serve a variety of useful purposes. Trimming and deleting allow you to create a new audio object from a selected portion of a larger one, while padding is typically used to extend an audio object at the ends to ensure uniform treatment of the end samples in many audio processing tasks.
AudioTrim[audio,{t1,t2}]            give an audio object consisting of only samples between t1 and t2
AudioPad[audio,m]                   pad audio on both sides with m seconds of zeros
AudioDelete[audio,{t1,t2}]          delete from time t1 to time t2
AudioSplit[audio,{t1,t2,…}]         split audio at times ti
AudioPartition[audio,dur,offset]    partition audio into overlapping segments
AudioIntervals[audio,crit]          return intervals of audio for which the criterion crit is satisfied
Some basic audio operations.
This selects the first second of an audio object:
This shows three different padding methods applied to the end of an audio object:
This trims the silence from the beginning and the end of an audio object:
This deletes samples between the first and the third seconds of an audio object:
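Trimming, padding, and deleting can be sketched on a four-second generated tone. The explicit two-element padding specification is used so that the amount of silence added at each end is unambiguous; all durations are illustrative.

```wolfram
a = AudioGenerator[{"Sin", 440}, 4];

(* Keep only the first second *)
firstSecond = AudioTrim[a, {0, 1}]

(* Add half a second of silence at the beginning and at the end *)
padded = AudioPad[a, {0.5, 0.5}]

(* With no time specification, AudioTrim removes leading and trailing silence *)
trimmed = AudioTrim[padded]

(* Remove everything between second 1 and second 3 *)
shorter = AudioDelete[a, {1, 3}]
```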
This splits an audio object in half:
This partitions an audio object into non-overlapping fragments:
This partitions an audio object into overlapping fragments:
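Splitting and partitioning can be sketched the same way. The four-second test tone, the one-second partition length, and the half-second offset are assumptions made for the example.

```wolfram
a = AudioGenerator[{"Sin", 440}, 4];

(* Split at the 2-second mark, giving two halves *)
halves = AudioSplit[a, 2]

(* Non-overlapping one-second fragments *)
blocks = AudioPartition[a, 1]

(* One-second fragments that start every half second, so neighbors overlap *)
overlapping = AudioPartition[a, 1, 0.5]

(* Number of pieces produced by each operation *)
Length /@ {halves, blocks, overlapping}
```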
Intervals that satisfy various criteria can be computed and used to extract corresponding blocks of an audio object.
This computes the intervals where the maximal value is greater than 0.1 and the value of the spectral centroid is smaller than 800:
This returns audio objects corresponding to the intervals computed above:
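A sketch of the interval computation and extraction described above. The test signal (a low tone followed by a high tone) is an assumption, and the criterion is written as a pure function over the named measurements "Max" and "SpectralCentroid", which is assumed to be the accepted form.

```wolfram
(* A 440 Hz tone followed by a 2000 Hz tone, one second each *)
low = AudioGenerator[{"Sin", 440}, 1];
high = AudioGenerator[{"Sin", 2000}, 1];
a = AudioJoin[{low, high}];

(* Intervals where the maximum exceeds 0.1 and the spectral centroid is below 800 *)
intervals = AudioIntervals[a, #Max > 0.1 && #SpectralCentroid < 800 &]

(* Extract the audio corresponding to each interval *)
AudioTrim[a, #] & /@ intervals
```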
It is frequently necessary to change the sample rate of an audio object by resampling, or to normalize audio samples in some manner. Functions that perform these basic tasks are readily available.
AudioResample[audio,sr]    give a resampled audio object that has sample rate sr
AudioNormalize[audio]      normalize audio so that the maximum absolute value of its samples is 1
AudioAmplify[audio,s]      multiply all samples of the audio by a factor s
Audio resampling, normalizing, and amplifying.
This changes the sample rate of the example audio to 8000:
This normalizes audio samples:
This increases the volume of an audio object:
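These three operations can be sketched on a deliberately quiet tone. The test signal and the factors are illustrative assumptions.

```wolfram
(* A quiet 440 Hz tone: the generated signal scaled down to a quarter of its amplitude *)
a = AudioAmplify[AudioGenerator[{"Sin", 440}, 1], 0.25];

(* Resample to 8000 Hz and confirm the new sample rate *)
resampled = AudioResample[a, 8000];
AudioSampleRate[resampled]

(* Rescale so the maximum absolute sample value is 1 *)
normalized = AudioNormalize[a];

(* Double the amplitude *)
louder = AudioAmplify[a, 2];
```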
The Wolfram Language also provides functions that operate on lists of audio objects.
AudioJoin[list]       concatenate a list of audio objects
AudioOverlay[list]    overlay a list of audio objects
ConformAudio[list]    conform properties of each audio object from the list
Joining, overlaying, and conforming lists of audio objects.
This joins two audio signals:
This overlays two audio signals:
This conforms two audio signals:
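A sketch of the three list operations, using two generated signals that differ in duration and sample rate so that conforming has a visible effect. The signals and the 22050 Hz rate are assumptions for illustration.

```wolfram
a1 = AudioGenerator[{"Sin", 440}, 1];
a2 = AudioResample[AudioGenerator["Pink", 2], 22050];

(* Play one after the other *)
joined = AudioJoin[{a1, a2}]

(* Play both at the same time *)
overlaid = AudioOverlay[{a1, a2}]

(* Bring both objects to common properties, then compare their sample rates *)
conformed = ConformAudio[{a1, a2}];
AudioSampleRate /@ conformed
```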