How Does MP3 Shrink Files 10×? Psychoacoustic Masking Explained

The Core Idea: Throw Away What You Can’t Hear

Uncompressed CD audio (44.1 kHz, 16-bit, stereo) runs about 10 MB per minute; an MP3 at a typical 128 kbps cuts that to roughly a tenth while sounding nearly the same. The trick is not cleverer lossless math but psychoacoustics: the human ear does not treat all sounds equally, and much of the signal never reaches your perception at all. The encoder discards what you were never going to hear. The format is specified in the international standard ISO/IEC 11172-3 (MPEG-1 Audio; Layer III is MP3).
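The size figures above can be checked with a few lines of arithmetic. The sketch below assumes standard CD parameters (44.1 kHz, 16-bit, stereo) and a 128 kbps MP3 as the comparison point; the function names are illustrative only.

```python
# Back-of-the-envelope data rates for CD audio versus a 128 kbps MP3.
SAMPLE_RATE_HZ = 44_100   # CD sampling rate
BITS_PER_SAMPLE = 16      # CD bit depth
CHANNELS = 2              # stereo


def pcm_bitrate_bps(sample_rate=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE, channels=CHANNELS):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate * bits * channels


def megabytes_per_minute(bitrate_bps):
    """Convert a bitrate in bits/s to megabytes (10**6 bytes) per minute."""
    return bitrate_bps / 8 * 60 / 1_000_000


cd_bps = pcm_bitrate_bps()   # 1,411,200 bits/s
mp3_bps = 128_000            # assumed MP3 bitrate for this comparison

print(f"CD:    {megabytes_per_minute(cd_bps):.2f} MB/min")   # 10.58
print(f"MP3:   {megabytes_per_minute(mp3_bps):.2f} MB/min")  # 0.96
print(f"Ratio: {cd_bps / mp3_bps:.1f}x")                     # 11.0x
```

At higher MP3 bitrates the ratio shrinks accordingly: 192 kbps works out to roughly 7x.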

Masking: Loud Sounds Bury Quiet Ones

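Psychoacoustic research, systematized in classic references such as Zwicker and Fastl’s Psychoacoustics: Facts and Models, identifies two key masking phenomena: simultaneous (frequency) masking, in which a loud sound hides quieter sounds at nearby frequencies, and temporal masking, in which a loud sound hides quieter sounds that occur just before or after it.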

The MP3 encoder runs a built-in psychoacoustic model that continuously computes which components are masked right now, and allocates bits to what you can actually hear.
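To make "loud sounds bury quiet ones" concrete, here is a toy sketch of a masking decision for a handful of pure tones. It is not the psychoacoustic model from ISO/IEC 11172-3: the offset and slope constants are illustrative assumptions, every function name is hypothetical, and a real encoder works on full spectra rather than three tones. The only borrowed piece is a commonly used approximation of the Bark (critical-band) scale.

```python
import math


def hz_to_bark(freq_hz):
    """Commonly used approximation of the Bark (critical-band) scale."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


# Illustrative constants, NOT taken from ISO/IEC 11172-3.
MASKER_OFFSET_DB = 10.0  # threshold sits this far below the masker at its own frequency
SLOPE_BELOW_DB = 27.0    # threshold falls this fast (dB per Bark) toward lower frequencies
SLOPE_ABOVE_DB = 10.0    # and more slowly toward higher frequencies


def masking_threshold_db(masker_hz, masker_db, probe_hz):
    """Level below which a probe tone is treated as hidden by a single masker."""
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = SLOPE_ABOVE_DB if dz >= 0 else SLOPE_BELOW_DB
    return masker_db - MASKER_OFFSET_DB - slope * abs(dz)


def audible_components(components):
    """Keep the components that are not below any other component's threshold.

    components: list of (frequency_hz, level_db) tuples.
    """
    kept = []
    for i, (freq, level) in enumerate(components):
        masked = any(
            level < masking_threshold_db(m_freq, m_level, freq)
            for j, (m_freq, m_level) in enumerate(components)
            if j != i
        )
        if not masked:
            kept.append((freq, level))
    return kept


# A loud 1 kHz tone, a quiet neighbour at 1.1 kHz, and a quiet tone far away at 5 kHz.
spectrum = [(1000, 80.0), (1100, 40.0), (5000, 40.0)]
print(audible_components(spectrum))  # [(1000, 80.0), (5000, 40.0)]
```

The quiet 1.1 kHz tone is dropped because it sits well under the threshold raised by its loud neighbour, while the equally quiet 5 kHz tone survives because it is many critical bands away. The asymmetric slopes reflect the general observation that masking spreads further toward higher frequencies than toward lower ones.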

The Bitrate Trade-off

Bitrate sets the data budget per second. At 128 kbps the model must discard more marginal information, and complex material (cymbals, applause) may show audible artifacts; at 192 kbps and above, most listeners cannot tell the MP3 from the uncompressed source on most material. See How Audio Compression Works for a deeper look at choosing bitrates.
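One way to picture the budget is per frame rather than per second. An MPEG-1 Layer III frame covers 1152 samples, so at 44.1 kHz each frame lasts about 26 ms, and the bitrate fixes how many bytes that slice of audio gets. The sketch below is a simplification: it assumes constant bitrate at 44.1 kHz, uses hypothetical helper names, and ignores the fact that real encoders can shift bits between neighbouring frames.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame length, in samples


def frame_duration_ms(sample_rate_hz=44_100):
    """How much audio one frame covers, in milliseconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz * 1000


def frame_bytes(bitrate_bps, sample_rate_hz=44_100, padding=0):
    """Bytes per frame (header included); padding is 0 or 1 for a given frame."""
    return (SAMPLES_PER_FRAME // 8) * bitrate_bps // sample_rate_hz + padding


print(f"Frame duration: {frame_duration_ms():.2f} ms")  # 26.12 ms
for kbps in (128, 192, 320):
    print(f"{kbps} kbps -> {frame_bytes(kbps * 1000)} bytes per frame")
# 128 kbps -> 417 bytes per frame
# 192 kbps -> 626 bytes per frame
# 320 kbps -> 1044 bytes per frame
```

Moving from 128 to 192 kbps gives every 26 ms slice about 50% more bytes, which is why dense passages such as cymbals and applause are the first to benefit.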

Editing tip: MP3 is lossy — every re-encode throws information away again. Cut once with the MP3 Cutter and avoid repeated edit→export→edit cycles.

References

  1. ISO/IEC 11172-3:1993, “Coding of moving pictures and associated audio for digital storage media — Part 3: Audio,” International Organization for Standardization.
    https://www.iso.org/standard/22412.html
  2. E. Zwicker & H. Fastl, “Psychoacoustics: Facts and Models,” Springer Series in Information Sciences (the systematic reference on masking).