It is generally assumed that all major MP3 playback software produces the same output. The reasoning is that the MPEG standard defines the decoder strictly, allowing only small deviations due to rounding.
A few years ago, I was disabused of that idea when I did an informal test to compare several well-known music players (iTunes 7, Winamp, Foobar2000, Windows Media Player). The test revealed iTunes 7 to be the outlier producing different output from the rest of the pack.
Today, I will present the results of a more rigorous test using the latest version of iTunes (10.2.1).
Test setup
- Windows 7 Professional SP1 (32-bit) with all the latest updates
- Auzentech X-Meridian 7.1 sound card
- Cool Edit Pro 2.0 audio editing software
Tested players
- Windows Media Player 12.0.7601.17514
- Winamp 5.56
- Foobar2000 1.1.5
- iTunes 10.2.1
Methodology
I played two 10-second MP3 clips in each player, recording the output digitally with Cool Edit Pro 2.0 using the S/PDIF loopback mechanism provided by the sound card driver.
All postprocessing options (crossfade, sound check, etc.) were turned off. Both application and system volume were at 100%.
I used the following MP3 files:
- a 10-second clip from Wszystko Ch. by Elektryczne Gitary encoded with LAME 3.97 at 256 kbps ABR with high encoding quality (download file)
- a 10-second clip from Time by Pink Floyd encoded with LAME 3.98 at 256 kbps ABR with high encoding quality (download file)
After recording in Cool Edit Pro (as a 16-bit, 44.1 kHz file which matched the source material), I saved each stream as a text file, which looked like this:
-354 -172 1 203 -447 -443 -2490 -3088 -3504 -3676 -3233 -2944 -3206 -3867 -2829 -4348 -2391 -4461 -2196 -4165...
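A text export like this can be read back into a list of integers for automated comparison. The following is a minimal sketch, assuming the samples are whitespace-separated integers as in the excerpt above; the function name `parse_samples` is hypothetical, and skipping non-numeric lines is an assumption about possible header lines, not a description of Cool Edit Pro's actual export format.

```python
def parse_samples(text):
    """Parse whitespace-separated integer samples, skipping non-numeric lines."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        try:
            values = [int(field) for field in fields]
        except ValueError:
            continue  # assumed header or other non-sample line
        samples.extend(values)
    return samples


# The first few values from the excerpt above.
excerpt = "-354 -172 1 203 -447 -443 -2490 -3088"
print(parse_samples(excerpt))
```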
I also opened each of the two MP3 files directly in Cool Edit Pro 2.0 and then saved it as a text file. This file was used as a reference: the output of each player was compared against it. Cool Edit Pro 2.0 uses a Fraunhofer MP3 decoder (Fraunhofer IIS is the institute where MP3 was developed).
I opened the text files in Notepad++ and synchronized them by discarding the initial silence in each file. The goal was to make sure that the first sample in each file corresponded to the start of the clip to enable direct sample-by-sample comparison.
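I did this step by hand, but the same idea can be sketched in code. The function below is a hypothetical helper, not the procedure I actually used: it drops samples from the start of a list until the first one whose magnitude exceeds a threshold. The `threshold` parameter is an assumption; with the default of 0, only exact digital silence is discarded.

```python
def trim_leading_silence(samples, threshold=0):
    """Drop leading samples whose magnitude does not exceed threshold."""
    for index, sample in enumerate(samples):
        if abs(sample) > threshold:
            return samples[index:]
    return []  # the whole recording was silent


recording = [0, 0, 0, -354, -172, 1, 203]
print(trim_leading_silence(recording))
```

One caveat: if a player's rounding turns a sample that is 0 in the reference into 1 or -1 (or the reverse) right at the start of the clip, automatic trimming could shift that file by a sample or more, so the alignment is worth checking by eye.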
After synchronization, each text file was opened in Cool Edit Pro 2.0 again. Each waveform was subtracted from the reference waveform to reveal the differences.
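The subtraction itself is a plain sample-by-sample operation. Here is a minimal sketch, assuming both streams are already aligned and held as lists of integers; the function name `subtract_from_reference` and the example values are illustrative only. If the two lists differ in length, the sketch compares only the overlapping part.

```python
def subtract_from_reference(reference, recorded):
    """Return reference minus recorded, sample by sample, over the common length."""
    length = min(len(reference), len(recorded))
    return [reference[i] - recorded[i] for i in range(length)]


# Illustrative values only, not taken from the actual recordings.
reference = [-354, -172, 1, 203]
recorded = [-354, -171, 1, 202]
print(subtract_from_reference(reference, recorded))
```

A decoder that matches the reference exactly would give a list of zeros, which is a flat line in the waveform view.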
Results
Each waveform below shows the difference between the reference output stream (Cool Edit Pro 2.0 with Fraunhofer decoder) and the output stream produced by an MP3 player.
Wszystko Ch. – Windows Media Player 12
Wszystko Ch. – Winamp 5.56
Wszystko Ch. – Foobar2000 1.1.5
Wszystko Ch. – iTunes 10.2.1
As you can see, Windows Media Player, Winamp and Foobar2000 all produced output that matched the reference stream very closely. A review of the text files showed that all three players produced virtually identical sample streams: the difference between any individual sample and the corresponding reference sample did not exceed 1, or in rare cases, 2. These differences were not large enough to register on the waveform view, even with magnification.
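A check like "the differences did not exceed 1, or in rare cases, 2" can also be made mechanically. The sketch below is one possible way to do it, assuming the per-sample differences are available as a list of integers; `summarize_differences` is a hypothetical name, and the example data is made up for illustration.

```python
from collections import Counter


def summarize_differences(differences):
    """Return the largest absolute difference and a count of each magnitude."""
    magnitudes = Counter(abs(d) for d in differences)
    largest = max(magnitudes, default=0)
    return largest, dict(sorted(magnitudes.items()))


# Made-up difference values, for illustration only.
largest, counts = summarize_differences([0, 1, -1, 0, 2, 0])
print(largest, counts)
```

For a player that tracks the reference, the largest magnitude stays at 1 or 2 and almost all of the counts fall under 0 and 1.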
iTunes 10.2.1, however, added significant distortion that can be seen in the waveform above. In some cases, the samples deviated from the reference values by as much as 5 percent (e.g. 1811 instead of 1719). You can also download the above waveform as a WAV file to hear the “enhancement” added by iTunes. It basically sounds like a very high-pitched sound (> 15000 Hz) of an uneven volume. The ability to hear it will depend on your age: younger listeners will find it more prominent. (Of course, during normal music listening this sound would be very hard to hear.)
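The "5 percent" figure comes from comparing a recorded sample with the corresponding reference sample. A small sketch of that arithmetic, using the pair of values quoted above; the function name `relative_deviation_percent` is hypothetical.

```python
def relative_deviation_percent(observed, reference):
    """Return the deviation of observed from reference as a percentage of reference."""
    if reference == 0:
        raise ValueError("relative deviation is undefined for a zero reference sample")
    return abs(observed - reference) / abs(reference) * 100.0


# The example pair from the text: 1811 recorded instead of 1719.
print(round(relative_deviation_percent(1811, 1719), 2))  # prints 5.35
```

Relative deviation only makes sense for samples that are well away from zero; for quiet passages the absolute difference is the more useful number.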
The output generated by iTunes 10.2.1 did not depend on the output setting in QuickTime (which iTunes uses to play audio). DirectSound, WaveOut and Windows Audio Session all produced the same output.
Time – Windows Media Player 12
Time – Winamp 5.56
Time – Foobar2000 1.1.5
Time – iTunes 10.2.1
Again, Windows Media Player, Winamp and Foobar2000 match the reference stream, while iTunes engages in creative decoding. In this sample, the distortion is smaller: personally, I cannot hear anything when I play the above waveform.
Conclusions
Cool Edit Pro 2.0, Windows Media Player 12, Winamp 5.56 and Foobar2000 1.1.5 all decoded the MP3 clips in virtually the same way. iTunes 10.2.1, on the other hand, produced a distorted output stream. While the distortion is probably inaudible in normal listening situations, it seems to mean that the latest version of iTunes fails to conform to the MP3 standard and is probably best avoided by users who care about audio fidelity.
Notes
In further tests using the same samples, I found that iTunes 9.2.1 matched the reference stream as well as WMP, Winamp and Foobar2000 did – it would therefore seem that it decodes MP3 files properly. I also evaluated MediaMonkey 3 and detected very significant distortion (much larger than with iTunes 10.2.1), even after disabling as many postprocessing options as I could find (volume leveling, clipping protection, crossfade, smooth stop/seek/pause, remove silence – did I miss anything?).
Check out the thread at Hydrogenaudio for an interesting discussion and independent measurements which confirm my findings.
Added November 2012: I made some quick measurements with the latest version of iTunes (10.7.0.21 for Windows) and it seems that it decodes MP3 files properly.
Added November 2014: I tested iTunes 11.4 for Windows (with DirectSound playback enabled) on one of the files and it is close enough to the reference waveform.
I opened the WAV file in Audacity and Foobar, and I couldn’t hear anything, with the exception of a small “click” at the beginning.
Should I blame my sound card/earphones (which are Sennheiser HD 590) or my sense of hearing? I’m afraid it’s the latter. 🙁
I have to turn up the volume significantly in order to hear it.
Could this be an error specific to iTunes 10.2.1 for Windows? I’m curious about 10.2.1 for Mac and 10.2.0 for Windows.
Dude, awesome testing. Great idea.
Thanks for the trouble you took to give me some good reading and information.
Perhaps it could be some sort of deliberate watermarking, for copyright or other reasons.