Web Audio API – things I learned the hard way

Firefox recently dropped support for the Mozilla Audio Data API, so I started porting my two audio-enabled Web apps (Plasticity and Online Tone Generator) to the Web Audio API, which is the W3C-blessed standard way to work with audio in the browser.

In the process, I ran into a few problems, which I thought I’d share here for the benefit of other developers taking their first steps with the Web Audio API.

AudioBufferSourceNodes and OscillatorNodes are single-use entities

Suppose we want to generate two 440 Hz tones, each with a duration of 1 second, separated by a 1-second pause. This is not the way to do it:

oscillator = context.createOscillator();
oscillator.frequency.value = 440;
oscillator.connect(context.destination);
currentTime = context.currentTime;
oscillator.start(currentTime);
oscillator.stop(currentTime + 1); //stop after 1 second
oscillator.start(currentTime + 2); //resume after 2 seconds
oscillator.stop(currentTime + 3); //stop again after 3 seconds

What’s wrong? We cannot call .start() on an OscillatorNode or AudioBufferSourceNode more than once. The second call in the above code will throw an InvalidStateError. Both OscillatorNodes (which are used to generate simple tones) and AudioBufferSourceNodes (which are used to play back short samples like sound effects) are meant to be thrown away after each use.

Instead, we should create a new node each time we want to play a sound, and connect each new node to the audio graph:

oscillator = context.createOscillator();
oscillator.frequency.value = 440;
oscillator.connect(context.destination);
currentTime = context.currentTime;
oscillator.start(currentTime);
oscillator.stop(currentTime + 1);

oscillator2 = context.createOscillator(); //create 2nd oscillator
oscillator2.frequency.value = 440;
oscillator2.connect(context.destination);
oscillator2.start(currentTime + 2);
oscillator2.stop(currentTime + 3);
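
To avoid repeating the same setup for every tone, the create, connect, start, stop sequence can be wrapped in a small function. The following is only a sketch: playTone and playToneSequence are names made up for illustration, and context is assumed to be an existing AudioContext.

```javascript
// Plays a single tone using a brand-new OscillatorNode.
// The node is never reused; it is discarded once it has stopped.
function playTone(context, frequency, startTime, duration) {
  var oscillator = context.createOscillator();
  oscillator.frequency.value = frequency;
  oscillator.connect(context.destination);
  oscillator.start(startTime);
  oscillator.stop(startTime + duration);
  return oscillator;
}

// Schedules `count` tones of `duration` seconds each,
// separated by `pause` seconds, creating one oscillator per tone.
function playToneSequence(context, frequency, duration, pause, count) {
  var now = context.currentTime;
  var oscillators = [];
  for (var i = 0; i < count; i++) {
    var startTime = now + i * (duration + pause);
    oscillators.push(playTone(context, frequency, startTime, duration));
  }
  return oscillators;
}
```

Calling playToneSequence(context, 440, 1, 1, 2) should schedule the same two tones as the code above.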

ChannelMergerNode inputs don’t map to channels

What do you do when you have two mono sources – like OscillatorNodes, which are always mono, or AudioBufferSourceNodes connected to a mono buffer – and you want to mix them into a stereo signal, for example, play one sample in the left channel, and the other in the right? You use a ChannelMergerNode.

A ChannelMergerNode has a number of inputs, but only one output. It takes the input audio signals and mixes them into a single multichannel signal. Sounds pretty simple, but it’s easy to fall into the trap of assuming that inputs correspond to channels in the output signal. For example, take a look at the following code, which tries to play a tone on the right channel only:

oscillatorR = context.createOscillator();
oscillatorR.frequency.value = 440;
mergerNode = context.createChannelMerger(2);
//create mergerNode with 2 inputs
mergerNode.connect(context.destination);

oscillatorR.connect(mergerNode, 0, 1);
//connect output #0 of the oscillator to input #1 of the mergerNode
//we're leaving input #0 of the mergerNode empty
currentTime = context.currentTime;
oscillatorR.start(currentTime);
oscillatorR.stop(currentTime + 2);

The result of running this code is a tone playing in both channels at the same time. Why? Because inputs of a ChannelMergerNode do not map to channels in the output signal. If an input is not connected, the ChannelMergerNode will ignore it. In this case, the first input (numbered 0) is not connected. The only connected input is #1, and it has a mono signal. The ChannelMergerNode merges all the channels on all the connected inputs into a single output. Here, it receives only a single mono signal, so it will output a mono signal, which you will hear coming from both speakers, as you always do with mono sounds.

The right way to have a sound playing only in one channel is to create a “dummy” source node and connect it to the ChannelMergerNode:

context = makeAudioContext();
oscillatorR = context.createOscillator();
oscillatorR.frequency.value = 440;
mergerNode = context.createChannelMerger(2); //create mergerNode with 2 inputs
mergerNode.connect(context.destination);

silence = context.createBufferSource();
silence.connect(mergerNode, 0, 0);
//connect dummy source to input #0 of the mergerNode
oscillatorR.connect(mergerNode, 0, 1);
//connect output #0 of the oscillator to input #1 of the mergerNode
currentTime = context.currentTime;
oscillatorR.start(currentTime);
oscillatorR.stop(currentTime + 2);

You create a silence node by creating an AudioBufferSourceNode, just like you would for any sample, and then not initializing the buffer property. The W3C spec guarantees that this produces a single channel of silence. (As of April 2014, this works in Chrome, but does not work in Firefox 28. In Firefox, the input is ignored and the result is the tone playing on both channels.)

Unused nodes get removed automatically

You might think that if you want to have two sounds playing in different channels – one sound in left, another in right – you don’t need to create dummy nodes. After all, the ChannelMergerNode will have two input channels.

In the code below, we want to play a 440 Hz tone in the left channel for 2 seconds, and a 2400 Hz tone in the right channel for 4 seconds. Both tones start at the same time.

oscillatorL = context.createOscillator();
oscillatorL.frequency.value = 440;
oscillatorR = context.createOscillator();
oscillatorR.frequency.value = 2400;
mergerNode = context.createChannelMerger(2); //create mergerNode with 2 inputs
mergerNode.connect(context.destination);

oscillatorL.connect(mergerNode, 0, 0);
//connect output #0 of the oscillator to input #0 of the mergerNode
oscillatorR.connect(mergerNode, 0, 1);
//connect output #0 of the oscillator to input #1 of the mergerNode
currentTime = context.currentTime;
oscillatorL.start(currentTime);
oscillatorL.stop(currentTime + 2); //stop "left" tone after 2 s
oscillatorR.start(currentTime);
oscillatorR.stop(currentTime + 4); //stop "right" tone after 4 s

This code works as expected for the first 2 seconds – each tone is audible only on one channel. But then the left tone stops playing, and the right tone starts playing on both channels. What’s going on?

  1. When oscillatorL stops playing, it gets disconnected from mergerNode and deleted. The browser is allowed to do this because – as you recall – an OscillatorNode or AudioBufferSourceNode can only be used once, so after we call oscillatorL.stop(), oscillatorL becomes unusable.
  2. The ChannelMergerNode notices that it is left with only one channel of input, and starts outputting a mono signal.

As you can see, the most stable solution, if you want to access individual audio channels, is to always have a dummy node (or several, if you’re dealing with 5.1 or 7.1 audio) connected to your ChannelMergerNode. What’s more, it’s probably best to make sure the dummy nodes remain referenceable for as long as you need them. If you assign them to local variables in a function, and that function returns, the browser may remove those nodes from the audio graph:

function playRight()
{
    var oscillatorR = context.createOscillator();
    oscillatorR.frequency.value = 440;
    var mergerNode = context.createChannelMerger(2);
    mergerNode.connect(context.destination);
    var silence = context.createBufferSource();
    silence.connect(mergerNode, 0, 0);
    oscillatorR.connect(mergerNode, 0, 1);
    var currentTime = context.currentTime;
    oscillatorR.start(currentTime);
    oscillatorR.stop(currentTime + 2);
}

playRight();

Consider what happens at the time playRight() finishes. oscillatorR won’t get removed because it’s playing (scheduled to stop in 2 seconds). But the silence node is not doing anything and when the function exits, it won’t be referenceable, so the browser might decide to get rid of it. This would of course switch the output of the ChannelMergerNode into mono mode.

It’s worth noting that the above code currently works in Chrome, but it might not in the future. The W3C spec gives browsers a lot of leeway when it comes to removing AudioNodes:

An implementation may choose any method to avoid unnecessary resource usage and unbounded memory growth of unused/finished nodes. (source)
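
One way to follow this advice is to build the merger and its dummy nodes in one place and hand back an object that holds on to all of them. This is a minimal sketch for a stereo output: createStereoOutput and its properties are hypothetical names, not part of the Web Audio API, and context is assumed to be an existing AudioContext. Whether holding references is enough to keep the nodes in the graph is still up to the browser.

```javascript
// Builds a 2-input ChannelMergerNode with one silent dummy source
// attached to each input, and returns references to every node so
// the caller can keep them reachable for as long as they are needed.
function createStereoOutput(context) {
  var merger = context.createChannelMerger(2);
  merger.connect(context.destination);

  var silenceNodes = [];
  for (var input = 0; input < 2; input++) {
    // A buffer source whose buffer is never set: the "silence" node
    // described in the article.
    var silence = context.createBufferSource();
    silence.connect(merger, 0, input);
    silenceNodes.push(silence);
  }

  return {
    merger: merger,
    silenceNodes: silenceNodes,
    // Route output #0 of a source to merger input #0 (left).
    connectLeft: function (source) { source.connect(merger, 0, 0); },
    // Route output #0 of a source to merger input #1 (right).
    connectRight: function (source) { source.connect(merger, 0, 1); }
  };
}
```

Store the returned object somewhere long-lived (for example, next to the AudioContext itself) and call output.connectLeft(oscillatorL) or output.connectRight(oscillatorR) for each new source. Because both inputs always have a dummy node attached, stopping one oscillator should not leave the merger with a single input.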

In Firefox, you cannot modify an AudioBuffer after you’ve assigned it to AudioBufferSourceNode.buffer

The following code attempts to generate and play a 440 Hz tone over a single channel:

SAMPLE_RATE = 44100;
buffer = context.createBuffer(1, 44100*2, SAMPLE_RATE);
//create a mono 44.1 kHz buffer, 2 seconds length
bufferSource = context.createBufferSource();
bufferSource.buffer = buffer;

soundData = buffer.getChannelData(0);
for (var i = 0; i < soundData.length; i++)
  soundData[i] = Math.sin(2*Math.PI*i*440/SAMPLE_RATE);
bufferSource.connect(context.destination);
bufferSource.start(0);

It works in Chrome, but fails in Firefox (28) without any error message. Why? The moment you assign buffer to bufferSource.buffer, Firefox makes the buffer immutable. Further attempts to write to the buffer are ignored.

This behavior is not covered by the W3C spec (at least I couldn’t find any relevant passage), but here’s a Mozilla dev explaining it on StackOverflow.
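
A workaround that should behave the same in both browsers is to finish writing the samples before assigning the buffer to the source node. A sketch under that assumption follows; fillSine and createToneSource are illustrative names, and context is assumed to be an existing AudioContext.

```javascript
// Fills an array of samples with a sine wave of the given frequency.
function fillSine(samples, frequency, sampleRate) {
  for (var i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(2 * Math.PI * i * frequency / sampleRate);
  }
  return samples;
}

// Creates a mono AudioBufferSourceNode that plays a sine tone.
// The buffer is filled first and assigned to the node last.
function createToneSource(context, frequency, seconds, sampleRate) {
  var length = Math.round(sampleRate * seconds);
  var buffer = context.createBuffer(1, length, sampleRate);
  fillSine(buffer.getChannelData(0), frequency, sampleRate);

  var bufferSource = context.createBufferSource();
  bufferSource.buffer = buffer; // assign only after the data is complete
  bufferSource.connect(context.destination);
  return bufferSource;
}
```

With these helpers, createToneSource(context, 440, 2, 44100).start(0) should play the 2-second tone that the earlier code was aiming for.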

3 responses to “Web Audio API – things I learned the hard way”

  1. Comment on “AUDIOBUFFERSOURCENODES AND OSCILLATORNODES ARE SINGLE-USE ENTITIES” section. I have found a more elegant solution – use a GainNode: OscillatorNode -> GainNode -> DestinationNode.

    Start the OscillatorNode once, then change the gain value from 0.0 to 1.0 and vice versa.

  2. Only being able to call start() once was driving me nuts. The stop() method here is misleading – perhaps it should be renamed remove() or destroy(). Thanks for this post – very helpful. Here is a start/stop solution that I came up with: http://codepen.io/deadlygeek/pen/mydevQ/
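
The GainNode approach from the first comment might look roughly like this. It is a sketch, not the commenter's actual code: createGatedTone, on and off are made-up names, and context is assumed to be an existing AudioContext. The oscillator is started once and never stopped; only the gain is switched between 0 and 1.

```javascript
// Routes an oscillator through a GainNode that acts as an on/off gate.
// OscillatorNode -> GainNode -> destination
function createGatedTone(context, frequency) {
  var oscillator = context.createOscillator();
  var gate = context.createGain();

  oscillator.frequency.value = frequency;
  gate.gain.value = 0; // start muted
  oscillator.connect(gate);
  gate.connect(context.destination);
  oscillator.start(context.currentTime); // called exactly once

  return {
    // Unmute at the given time (in AudioContext seconds).
    on: function (time) { gate.gain.setValueAtTime(1, time); },
    // Mute at the given time (in AudioContext seconds).
    off: function (time) { gate.gain.setValueAtTime(0, time); }
  };
}
```

For the two-tone example from the article, that would be tone.on(t), tone.off(t + 1), tone.on(t + 2), tone.off(t + 3), with t taken from context.currentTime. Note that the oscillator keeps running while it is muted.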
