Concatenating audio files/recordings in pure JavaScript
3 Feb 2021
I was recently working on a project where the requirement was to take several audio recordings and concatenate them into one recording, so they play one after another.
There are server-side methods for achieving this - notably via something like FFmpeg. But can we do it in pure, client-side JS? Yes we can!
Via device-recorded audio
Here's a simple function which, when called, records audio from the user's microphone. Each recording is then converted to a blob, and a URL to that blob (i.e. to the derived audio file) is pushed into an array, blobUrls.
let recorder, blobUrls = [];
const record = () => {
    navigator.mediaDevices.getUserMedia({audio: true}).then(stream => {
        let recData = [];
        recorder = new MediaRecorder(stream);
        recorder.ondataavailable = evt => recData.push(evt.data);
        recorder.onstop = evt => {
            let blob = new Blob(recData);
            blobUrls.push(URL.createObjectURL(blob));
        };
        recorder.start();
    });
}
Another function, when called, stops the recording.
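That stop function isn't shown in the post, but a minimal sketch might look like the following. The name stopRecording and the track-cleanup step are illustrative additions; it assumes the global recorder set up by record() above.

```javascript
// Illustrative sketch: stop the in-progress recording started by record().
const stopRecording = () => {
    if (!recorder || recorder.state !== 'recording') return;
    recorder.stop(); // fires ondataavailable one last time, then onstop
    // Release the microphone so the browser's recording indicator clears
    recorder.stream.getTracks().forEach(track => track.stop());
};
```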
This is fine. But how do we concatenate the recordings? There are a couple of changes we need to make:
- Move the declaration of recData outside the function, so that the audio data of all recordings goes into the same array, rather than a separate array per recording, as above
- Remove the blob-creation code from the above. We no longer want to make a blob per recording; we want to make one blob, later, once we've saved multiple recordings.
let recorder, blobUrls = [], recData = [];
const record = () => {
    navigator.mediaDevices.getUserMedia({audio: true}).then(stream => {
        recorder = new MediaRecorder(stream);
        recorder.ondataavailable = evt => recData.push(evt.data);
        recorder.start();
    });
}
Now we need a function that can run once we've made two or more recordings, to concatenate them - and this is where we'll do our blob stuff.
const concatRecordings = () => {
    let blob = new Blob(recData),
        url = URL.createObjectURL(blob),
        audio = new Audio(url);
    audio.play(); //<-- or serve it as a download
}
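To flesh out that "serve it as a download" comment: one common approach is to point a temporary anchor element at the blob URL. This is an illustrative sketch, not from the original post - the function name and filename are my own, and MediaRecorder's output container varies by browser (WebM is typical in Chrome and Firefox).

```javascript
// Hypothetical alternative to audio.play(): trigger a download of the
// concatenated recording via a temporary anchor element.
const downloadRecordings = () => {
    let blob = new Blob(recData), // `recData` as in the snippets above
        link = document.createElement('a');
    link.href = URL.createObjectURL(blob);
    link.download = 'combined-audio.webm'; // illustrative filename
    link.click();
};
```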
One minor issue here is that we're saving our audio data into a flat array. This means it's not possible for the user to delete and re-record a recording later, as its data has been subsumed into the array. In other words, in the flat array we don't know where one track's data chunks end and another's begin. We can solve this by maintaining a multi-dimensional array instead - one sub-array per recording.
In our record() function, we'd set up that recording's sub-array for its data, and then, when audio data arrives, push it into the sub-array, not the outer array. Finally, when we come to make our blob in concatRecordings(), we'd need to flatten the array. But until that point, we could easily remove one of the sub-arrays (e.g. to re-record it).
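Putting those changes together, a sketch of the sub-array approach might look like this. The removeRecording() helper is an illustrative addition, and the Blob/Audio parts still need a browser:

```javascript
let recorder, recData = []; // recData holds one sub-array per recording

const record = () => {
    let trackData = [];      // this recording's own sub-array...
    recData.push(trackData); // ...registered in the outer array
    navigator.mediaDevices.getUserMedia({audio: true}).then(stream => {
        recorder = new MediaRecorder(stream);
        recorder.ondataavailable = evt => trackData.push(evt.data);
        recorder.start();
    });
};

// Delete recording i, e.g. so the user can re-record it
const removeRecording = i => recData.splice(i, 1);

const concatRecordings = () => {
    let blob = new Blob(recData.flat()), // flatten sub-arrays into one chunk list
        audio = new Audio(URL.createObjectURL(blob));
    audio.play();
};
```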
Via fetch()
So far we've looked only at mic-recorded audio. But we can achieve the same thing with pre-existing files, loaded via the promise-based Fetch API - largely thanks to the fact that the Fetch API gives us a structured means of specifying how we want the loaded data to be parsed. We specify this via a method of the Body mixin, e.g.
fetch('data.json').then(resp => resp.json()).then(json => { /* use the JSON... */ });
OK so that's for JSON. But we want blobs - and as luck would have it there's a blob() method on the Body mixin, too! And so:
let uris = ['file.mp3', 'file2.mp3'],
    proms = uris.map(uri => fetch(uri).then(r => r.blob()));
Promise.all(proms).then(blobs => {
    let blob = new Blob(blobs),
        blobUrl = URL.createObjectURL(blob),
        audio = new Audio(blobUrl);
    audio.play();
});
Once we've loaded all files (as blobs) - which we can wait for via Promise.all()
- we then feed them to our master blob, which results in concatenated audio!
Via file inputs
That just leaves input[type=file] elements. Things are a little different here, but it's still pretty straightforward. Suppose the following HTML:
<input type='file' id='file1'>
<input type='file' id='file2'>
<button onclick='concatAudio()'>Concat audio!</button>
Suppose our user has chosen audio files for both fields. Our function to concatenate them then looks like this:
const concatAudio = () => {
    let proms = [1, 2].map(inputNum => new Promise(resolve => {
        let file = document.querySelector('#file'+inputNum).files[0],
            fr = new FileReader();
        fr.onloadend = evt => resolve(fr.result);
        fr.readAsArrayBuffer(file);
    }));
    Promise.all(proms).then(buffers => {
        let blob = new Blob(buffers),
            blobUrl = URL.createObjectURL(blob),
            audio = new Audio(blobUrl);
        audio.play();
    });
}
The interesting part here is the use of the FileReader API. Earlier, we had blobs that we could feed straight to our master blob. Here, we first need to read the files and derive array buffers from them. (Again we use Promise.all() to wait for all files to be read.)
Array buffers are used to store binary data, and the readAsArrayBuffer() method of the FileReader API gets us the binary data behind the selected files.
We can then feed those buffers straight to our master blob to do the concatenation. Pretty neat!
Summary
Concatenating audio is easy in pure JavaScript, however you derive the audio. And we've not even looked at the Web Audio API which, with its considerable complexity, can also do this sort of thing (and much more).
If you're recording audio over mic, it's just a case of pushing all recorded data, from all recordings, into the same array, then feeding that to the master blob.
For pre-existing files loaded over fetch(), we can get them as blobs by calling the blob()
method of the Body mixin, then feed the derived blobs to the master blob.
And finally for input fields where the user selects audio files, we first need to read them into array buffers then pass those buffers to the master blob.
Did I help you? Feel free to be amazing and buy me a coffee on Ko-fi!