Never heard these myself, but I remember that Nautilus speakers were used in an MP3 listening test by the German c't magazine in 2000[1]. Conclusion:
> In plain language, this means that our musically trained test listeners could reliably distinguish the poorer quality MP3s at 128kbps quite accurately from either of the other higher-quality samples. But when deciding between 256 kbps encoded MP3s and the original CD, no difference could be determined, on average, for all the pieces. The testers took the 256 kbps samples for the CD just as often as they took the original CD samples themselves.
This article made me (1) never worry about "lossy audio encoding" again and (2) ignore everyone going on about "better equipment" wrt compressed audio.
Granted, they used the cheaper Nautilus 803 rather than the 801 in the test. But they also had Sennheiser Orpheus available in the listening test.
My honest and unscientific opinion is that the difference _is_ discernible, but the listener needs to know what to listen for. Also, reproduction quality is affected by several factors beyond speaker quality: the room, the equipment, and the recording itself.
[Anecdotal] One example of the difference between MP3 and lossless: the "image" [1] on 256kbps MP3s is worse compared to the original uncompressed, lossless versions (but the listening room must be appropriately prepared to reproduce a good image).
This is a highly subjective topic; IMO we'll never reach full agreement. Personally, I listen to MP3s on the go and lossless music at home.
Important to keep in mind the "size" of the experiment. Two interesting quotes from the article in c't magazine:
> twelve participants would be asked to come to Hanover.
> It's true that the data we collected does not support watertight conclusions, but they do provide interesting insights.
> Important to keep in mind the "size" of the experiment. Two interesting quotes from the article in c't magazine:
>> twelve participants would be asked to come to Hanover.
It's a mistake to apply vanilla statistical thinking here. The 12 participants were not randomly drawn from the German population; they were extremely skewed towards enthusiasts/professionals: audio engineers, an owner of an actual Nautilus 801, someone who worked on MP3/AAC at Fraunhofer IIS, someone who prepares masters for Deutsche Grammophon. If these are the people who have enormous difficulty distinguishing 256kbps MP3 from the CD original, I'm certainly not going to worry that I'll miss out on anything with 256kbps MP3.
If 12 Grand Slam participants tell me they can't tell the difference between a standard $100 tennis racket and a $1000 high-end one, I'm not going to delude myself into thinking it's going to make any difference for me.
> It's a mistake to apply vanilla statistical thinking here. The 12 participants [...] were extremely skewed towards enthusiasts/professionals
It is still undetermined whether having 12 highly skilled professionals in the experiment is enough for a conclusive result.
Also, this subject is so difficult to get right that the authors of the article themselves hedged, saying the experiment "does not support watertight conclusions".
The only one who could reliably tell whether something was MP3-encoded was a guy with hearing damage who loved punk music. MP3 was, after all, developed for people with normal hearing, so it's quite possible he could tell differences where other people could not.
Maybe the punk music had more to do with it. Sounds like the guy was keying off of subtleties of sonority and emotive quality which are a lot more fragile to digital processing.
It's quite easy to overprocess a digital audio file and wind up with something that is pristine as far as frequency response, but flat and "pod people"-like as far as emotive cues and intensity. Aliasing and cumulative losses from word-length issues have a lot to do with it.
It's VERY easy to make digital stuff accurately represent frequencies like 2 Hz or 35kHz that our ears don't hear. It's a lot harder to make the digital stuff perform in the midrange when our perception can go, inconsistently and irregularly, waaaay beyond what we're used to thinking of as the limits.
I did some personal experiments back in the day when hard disks were expensive and found that the compression artefacts show up first in distorted guitars and cymbals, then brass instruments and everything else survives much lower bit rates. So that could explain why the punk rock fan hears the compression problems first.
By the way, the lossy compression algorithms don't try to produce exact frequency response but to leave out stuff that humans wouldn't hear anyway and compress the rest.
A good way to determine the point of transparency of lossy encoding for yourself is to ABX test on your own equipment, with files you've converted yourself. Foobar2000's ABX plugin works well for this: it lets you compare back and forth, on whole tracks or on short snippets if you want.
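When interpreting an ABX run, it helps to know how likely your score is by pure chance: under the null hypothesis each trial is a fair coin flip, so the significance is a one-sided binomial tail. A small sketch (the 13/16 and 10/16 scores below are just illustrative numbers, not output from the plugin):

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` out of `trials` ABX
    rounds by pure guessing (one-sided binomial tail, p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Foobar2000's ABX log reports a similar figure after each session.
print(f"13/16 correct: p = {abx_p_value(13, 16):.4f}")   # p = 0.0106
print(f"10/16 correct: p = {abx_p_value(10, 16):.4f}")   # p = 0.2272
```

A 13/16 run is unlikely to be luck; 10/16 happens by guessing more than a fifth of the time, which is why short ABX sessions prove very little either way.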
In my experience, headphones always yield the best results, and surprisingly it doesn't matter if I use the stock earbuds from my phone or a nice set of AKG over-ear headphones. It's not a matter of absolute sound quality, just the fact that you cut out room interactions and get the sound straight to your ears makes a big difference.
MP3 has some built-in flaws that no encoder can completely cover up: short, sharp sounds like castanets really expose the pre-echo, and harpsichords show similar issues. It also has a tendency to make cymbals sound "washy" or "underwater", which all lossy codecs do to some degree, but MP3 is especially bad.
Still, at 192kbps I have to really focus to hear it in normal listening, but it's more or less always there even at 320kbps in problem tracks, if I really focus in on short sections. It just sounds subtly "off". But I hope no one actually listens to music like that, in short repeated sub-1 second sections to narrow in on a specific castanet snap ;-)
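Pre-echo falls out of how block-transform coding works: quantization error in the spectrum spreads over the whole block in the time domain, including the samples *before* a transient. A toy sketch with a plain DFT standing in for MP3's filter bank/MDCT (so it only illustrates the mechanism, not the real codec):

```python
import cmath
import math

N = 512
block = [0.0] * N
block[400] = 1.0   # a sharp, castanet-like click late in the block

def dft(x):
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(v * cmath.exp(2j * math.pi * k * i / n) for k, v in enumerate(X)).real / n
            for i in range(n)]

# Coarsely quantize the spectral coefficients, as a transform codec
# does when bits are scarce.
step = 0.05
Xq = [complex(round(c.real / step) * step, round(c.imag / step) * step)
      for c in dft(block)]
decoded = idft(Xq)

# The quantization error smears across the whole block, so some of it
# lands BEFORE the click: that's the pre-echo.
pre_echo = sum(v * v for v in decoded[:400])
print(f"error energy before the transient: {pre_echo:.6f}")
```

Real encoders fight this with shorter block sizes around transients, but MP3's window switching is cruder than in later codecs, which is why castanets remain a classic killer sample.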
As for more modern codecs like Opus and AAC, it's generally completely transparent for me at 128kbps, and that's with a bit of playing it safe, I'm pretty sure I could drop Opus down to 96kbps. Modern codecs are really impressive.
I keep my music library in FLAC, both because I know it's CD quality and because it's an archive. I want to be able to convert the tracks to any new codec that may come along, if I need to.
My library is 280GB currently, and storage is cheap :-)
> I keep my music library in FLAC, both because I know it's CD quality and because it's an archive. I want to be able to convert the tracks to any new codec that may come along, if I need to.
I understand the sentiment. But realistically, if the re-encoding isn't likely to happen within the next 10 years, your hearing will probably have deteriorated so much that you won't hear the difference anymore anyway (assuming you can hear a difference today, which is a big assumption).
I got the start of my music collection from my parents (as .wav's, or rather, I helped rip the CDs). I intend to do the same. So it's not just one but several decades we're talking about.
> It also has a tendency to make cymbals sound "washy" or "underwater", which all lossy codecs do to some degree, but MP3 is especially bad.
Thank you for confirming this! I record my analog synths, which I play through headphones off an old mixing board. However, when the signal comes through my ADC into the iPad, something seems to get lost, and I spend time adjusting the mix and ADSR for recording. I've been seriously mulling a reel-to-reel, but many others have had the same idea and the market prices are astronomical.
It's not inherent to straight uncompressed PCM audio, it's strictly an artifact of lossy compression. A reel-to-reel tape deck will be noisier and extremely cumbersome compared to proper digital recording.
Recording should be done at 96kHz/24-bit or higher, so you don't have to meticulously optimize recording levels, and to leave room for mixing and effects without raising the noise floor to noticeable levels.
Convert to normal CD quality as the last step before distribution.
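The headroom argument is easy to quantify: each extra bit buys roughly 6 dB of theoretical dynamic range, so 24-bit leaves about 48 dB of slack over 16-bit for conservative levels and processing. A quick back-of-the-envelope check:

```python
def dynamic_range_db(bits):
    # Theoretical SNR of an ideal b-bit quantizer driven by a
    # full-scale sine: 6.02 * b + 1.76 dB.
    return 6.02 * bits + 1.76

print(f"16-bit: {dynamic_range_db(16):.2f} dB")   # 98.08 dB
print(f"24-bit: {dynamic_range_db(24):.2f} dB")   # 146.24 dB
```

So even recording 40 dB below full scale at 24-bit, the quantization noise floor still sits below what a 16-bit master can represent, which is why sloppy gain staging is forgivable at the tracking stage.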
I don't collect music just to collect it, I only keep artists and albums around that I really like, or if it's something special and hard to find. Everything else is on YouTube or whatever for the rare occasion I need to listen to Metallica or AC/DC or something.
Agree with everything you say, but I would add that the interaction between compression and the other lossy signal processing that is frequently applied is not well studied. For example, with Bluetooth headphones, the music will likely be equalised/normalised, resampled to 48 kHz (for mixing), and then re-encoded to a Bluetooth codec, e.g. LDAC. It is much safer to start from FLAC if you cannot avoid such a signal chain.
I believe Fraunhofer did a pretty rigorous scientific test that established the CD-transparency point to be around 256kbps MP3. I don't dispute or doubt that.
However, obvious encoding artifacts abound on Spotify. Do I have superhuman hearing?
Probably not. My hypothesis is that not everyone authors lossy files as meticulously as Fraunhofer. Also, the performance of MP3 depends on highly linear and faithful reproduction after decoding. MP3 is painfully obvious on crappy, processed-to-hell speaker systems like the iMac's.
I think the real question is, why bother with lossy codecs? FLAC streams are lightweight by today's standards, and it's just so much simpler.
Spotify doesn't use MP3, though. So if you're hearing MP3-specific artifacts (pre-echo, washy "underwater" cymbals), those are probably the result of bad mastering or perhaps using MP3-encoded samples in some tracks. I hear this on some lossless tracks I have, unfortunately if the source material is flawed, there's nothing you can do.
Spotify uses Ogg Vorbis, except at the lowest bitrate on mobile, which is HE-AACv2, and on Chromecast/similar devices, which get AAC (because they can't natively decode Ogg Vorbis).
It is a significantly better codec than MP3, and doesn't suffer from the pre-echo, washy cymbals and badly-encoded high frequencies. At least not until you severely decrease the bitrate.
That is true of course, although the more modern codecs outperform MP3 most significantly at lower bitrates. At 256kbps MP3 should be enough, and IIRC Spotify offers 320kbps Ogg in hifi mode (not sure, I mostly don't use it unless someone links something), which should also be enough.
The most common problem is when the master is hot and encodes clipped waveforms. It sounds even stranger on lossy. In fact I'm not sure if the codecs' filter banks still have perfect reconstruction in this case or if it's just a borked master. In any case, it always tends to be quite clear when I switch between lossy and lossless on Tidal.
Anyway, it's complicated and brittle, and lossless 44/16 is only something like 2x bitrate and bit-perfect (so you can write proper regression tests for codecs very easily) shrug, I think I'll go that route.
disclaimer: occasionally get paid for mastering music, so I'm wired to prefer simplicity, transparency and don't mind some redundancy in ensuring signal integrity.
I did some tests with Ogg Vorbis a while back. I think q4 vs q5 was my limit.
What struck me though was that I couldn’t tell which was “better”.
In the track there was a section where, at one setting, the background drumming sounded like aluminum drumsticks, while at the other it sounded like wooden ones. A very subtle effect, and it took some intense concentration to pick out.
> Mp3 is painfully obvious on crappy, processed-to-hell speaker systems like the iMac.
I didn't know that was the case. Do you know why that somehow accentuates MP3 artifacts? It's not obvious to me why it would, since all the processing iMac speakers might perform is high-quality (no recompression involved). I mean, obviously the iMac isn't going to improve the MP3, it just tries to improve the speakers. But why would that make artifacts more noticeable?
If anything, don't "crappy" speakers hide MP3 artifacts, because they're not good enough to expose them?
Mp3 relies on the fact that loud sounds mask spectrally and temporally adjacent softer sounds (psychoacoustic masking) and allows the noise floor to rise for some bands in such conditions to reduce the required number of bits to encode the band signal.
Now if the masking sound disappears or changes, the crap is no longer masked.
In other words the psychoacoustic principles work only when we don't alter the signal too much after the codec.
Many diminutive speakers make use of techniques like multiband compression (dynamics, not entropy) to produce "larger" sound. That wreaks havoc on the psychoacoustic model of lossy codecs.
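That tension can be sketched with a toy bit-allocation model. The numbers here (20 dB self-masking, 10 dB roll-off per band of distance, ~6 dB of SNR per bit) are made up for illustration; real psychoacoustic models are far more elaborate. The point is that bands sitting under the masking curve get few or no bits, which is exactly the assumption that later dynamics processing can break:

```python
import math

# Per-band signal levels in dB; bands 0 and 3 carry loud components.
band_levels_db = [60, 20, 15, 55, 18, 10]

def masking_threshold(levels):
    # Crude model: a band masks itself at (level - 20) dB and its
    # neighbours 10 dB less per band of distance.
    thr = [-100.0] * len(levels)
    for i, lv in enumerate(levels):
        for j in range(len(levels)):
            thr[j] = max(thr[j], lv - 20 - 10 * abs(i - j))
    return thr

def bits_needed(level, threshold):
    # ~6 dB of SNR per bit; only the audible margin above the
    # masking threshold has to be coded.
    return max(0, math.ceil((level - threshold) / 6))

thr = masking_threshold(band_levels_db)
bits = [bits_needed(lv, t) for lv, t in zip(band_levels_db, thr)]
print(bits)   # -> [4, 0, 0, 4, 0, 0]: fully masked bands get 0 bits
```

If a multiband compressor later pulls the loud bands down, the real masking threshold drops, but the noise the encoder parked in the quiet bands stays where it is and can become audible.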
Note that the quality of MP3 encoders has changed significantly since 2000, and differed significantly between encoders at the time. (Does anyone not use LAME these days?)
After a certain point, lossy compression doesn't create any perceptible loss in audio quality as long as the rip is done well (with a good encoder, etc.), and you don't know every part of the piece in question.
e.g. in classical music, you can tell subtle differences if you've listened to the piece live or performed it in an orchestra. However, that's a pretty edge case. There are always differences if you know where to look; otherwise it's pretty insignificant.
> e.g. in classical music, you can tell subtle differences if you've listened to the piece live or performed it in an orchestra
Can you explain this a bit more? How is having heard a piece live, which by definition means a unique performance, going to affect whether or not you can pick out whether the recording of (likely) a different performance has been put through lossy compression? Or do you mean a recording of that same live performance?
I mean, I'm used to subtle differences coming and going depending on room/speakers/crappy compression, but only because I use the same source as reference (say, the same CD). Using a live performance as a reference sounds strange, because there I might hear one of the musicians doing something different, but that difference isn't there on another recording, so it's not usable as a cue for hearing differences in sound reproduction.
In classical music, the performance is of course unique, but the piece is not. What I mean is, even if the arrangement has changed for a particular piece, the underlying score, the foundation, is the same.
As you know, classical music is layered. It can be scaled for different-sized orchestras, which is akin to tessellation in graphics. You can add more nuanced parts or details if your orchestra has enough members. Of course this has a limit, which is the full score written by the original composer. Similarly, you can remove some layers or simplify the piece if you're a smaller orchestra, without compromising the piece.
What I tried to say is, if you've listened to the piece from, or performed it with, a relatively big orchestra, you'll know which instruments should be there, where the small optional triplets are, how the piece should sound, where that little oboe should come in, where the little cymbal adds that little crash, or how the harmonics affect each other and create that atmosphere.
So you'll notice when something is missing, off, or not as it should be, especially in the high end. Classical music has a lot of perceptual tricks up its sleeve to create a specific ambiance and sense of space, and most of this lies in the higher end of the spectrum, which gets shaved off first with lossy compression.
Hope this helps, because it's something more felt than said with words: how you can't really hear the double bass but feel that it's there. It's that kind of perception.
Edit: Just wanted to add that one musician's or orchestra's specific style will of course be different, but a good orchestra is very faithful to the original score. Even if an orchestra plays a little fast, more aggressively, or a simplified version, the base instrumentation and atmosphere are the same (as long as the orchestra is not doing a Metallica S&M-style "right piece, different instruments" kind of deliberate arrangement).
Another extreme example would be the band Pink Martini. They have a kind of on-stage audio magic which allows them to sound live exactly like their studio recordings. It's surreal to experience.
I sort of get what you're hinting at, but I still think it might be inaccurate; to me your reasoning comes across as "played live there's detail X and Y; when listening those details might be vague or not come out properly, so that might be lossy compression at work" (please correct me if I'm wrong). Thing is: poor microphone placement, poor recording equipment, or poor mastering can have those effects as well, no?
In my comments, I assume that the recording and mastering is done indeed properly. If you can't carry the orchestra's sound to the playback medium, everything is already moot to begin with.
The thing I'm listening for is musical dynamics rather than the details themselves, but those are equally lost with poor recording and mastering, as you say, since they're also captured by the microphones. What I'm trying to explain is that they are not "finer details" like "oh! I hear the bow of that player", but a bigger feeling that the orchestra creates by playing together, and that effect is independent of the individual instruments, most of the time.
It's a somewhat difficult concept to put into words and explain. It's more about feeling the music and how the brain decodes it, and I think it needs some experience. Being unable to translate this into words makes me sad, because it carries music to another dimension IMHO.
> It's a somewhat difficult concept to put into words and explain
Don't worry I understand what you mean wrt dynamics etc, it's just that I'd never thought of linking it to lossy compression, because there are so many other things which make it hard to reproduce that live sound.
But that still had nothing to do with comparing lossy vs lossless, or am I misunderstanding you?
How does a live performance that you hear with your ears at a specific place in a room help you pick out missing parts in a different recording, played by different people in a different place, recorded with multiple mics and then mixed and mastered?
It seems plausible to me. I assume that when you are doing a comparison, you are comparing a single source to a memory (does anyone do comparisons by playing two synchronized sources together, possibly into different ears?) In that case, I can well imagine that listening to multiple live performances primes one's mind to remember clearly how a given presentation sounded, and to pick out small differences, precisely because live performances are all slightly different. I would further imagine that performing a piece, and particularly practicing with the rest of the orchestra or conducting a practice, further enhances one's ability to notice and characterize small differences.
Of course, this might be utter nonsense, and I will bow to bayindirh's judgement on that!
Of course you can train your ears to hear more detail and learn to differentiate and identify frequencies.
That'll definitely help with hearing differences between two tracks. But I don't think you can compare live music to recorded tracks. Especially with acoustic instruments, the room is such a big factor.
I am not actually suggesting that one should compare recorded music to live performances for the purpose of comparing audio encoding technologies. Oddly, your reply to bayindirh is in complete agreement with what I wrote here.
I think he explained it a little better. I totally understand that you'll learn to identify which frequencies and sounds belong to which instrument and in turn learn to identify when those are missing.
I guess that also teaches your ears to identify differences in other situations more clearly.
That's what mixing and mastering engineers practice their whole career and get really good at.
It has, but in a different way than comparing different sound systems with the same recording. Let me try to explain. You might know some of the following; sorry if it's a re-explanation.
In a proper concert hall, the sound is expected to be homogeneous, so you should be able to listen to the orchestra equally well, with the same sound balance (or mix), regardless of where you sit. Similarly, recordings are made with suspended or positioned (and ideally tuned) mics, so you capture the orchestra as someone sitting in the audience would hear it. At least this is how our performances were recorded.
The mastering is then done to match the recorded sound to the hall's sound, balance any imperfections, and clean up the orchestra's chatter between pieces (yes, we communicate a lot :D ).
When you listen to an orchestra live, you will have a lossless blueprint of the piece in your mind (track by track, if you can separate the instruments). If you can get a recording of the same performance, you can compare it with the live performance. That's absolutely correct.
But if you listen to a recording of a different orchestra playing the same piece, the arrangement and instrumentation will be the same (you may have 8 violins instead of 12, but violins won't be replaced by violas most of the time). So the atmosphere of the piece will be the same. Assuming the recording is done by competent folks, the spectrum would be the same (~20Hz -> ~20kHz, roughly).
After some point, even if you're listening to a different orchestra, you can start to point to the things that should be there. It's very hard to describe, but every instrument has a base sound with details on top of it (you can tell they're all trumpets, but different brands or models; similarly, you can tell they're double basses, but they differ in some ways). That base sound starts to erode under lossy compression, and in turn it affects the sound of the piece, regardless of the finer details (which are mostly affected by rosins, bows, styles, etc.).
It's a "these two instruments shouldn't interact like this in this piece; something is missing!" kind of feeling. This missing part is either at the high or the low end, almost a harmonic. It's not noticeable unless you're looking for it, but it's there.
That difference can be clearly heard by re-encoding a FLAC as a high-bitrate MP3 and taking the difference between the two. It's a hiss-like sound, but it contains a lot of the aforementioned harmonics, and you can almost follow the piece just by listening to it. Someone did that and published the differences, but it was some years ago; I'm not sure I can replicate it or find the article. That article took the difference of the exact same recording, but your brain can apply the same idea to different recordings after some time.
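The difference experiment is easy to reproduce for a single recording. A minimal self-contained sketch, using crude requantization as a stand-in for the MP3 round trip (in practice you would re-encode the FLAC as MP3, decode it back to WAV, align the two streams, and subtract):

```python
import math
import struct
import wave

FS = 8000
# A synthetic "original"; substitute decoded samples from your FLAC.
orig = [0.8 * math.sin(2 * math.pi * 440 * n / FS) for n in range(FS)]
# Stand-in for the lossy round trip: coarse requantization. A real MP3
# cycle discards spectral detail instead, but the subtraction works
# the same way.
lossy = [round(v * 32) / 32 for v in orig]

def write_wav(path, samples):
    """Write mono 16-bit PCM."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(FS)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767))
                               for s in samples))

# The difference signal is exactly what the "codec" threw away,
# amplified 10x so the hiss-like residue is easy to hear.
diff = [(a - b) * 10 for a, b in zip(orig, lossy)]
write_wav("diff.wav", diff)
```

Playing `diff.wav` back gives the "what was removed" signal the parent comment describes; with a real MP3 round trip it is dominated by high-frequency content and transient residue.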
Hope I've succeeded to clarify it somewhat. It's something very hard to describe by words. Please ask more questions if you want to. :) I'd be happy to try more.
Comparing two orchestras can be similar to comparing a recording in MP3 to FLAC.
I think I get the point, in that learning to listen to those details and recognize frequencies can enhance your ability to spot differences in encoded audio.
I don't disagree with the fact that you perceive more in a live performance -- after all there's a wealth of spatial information that you don't get in stereo.
But that has absolutely nothing to do with compression. All that would matter is whether you're missing the "nuance" or "layers" that are there on an uncompressed CD, but that you would perceive to be gone in MP3.
I've performed and listened to a ton of classical music in my life, and I've never heard a difference in what you're talking about between CDs and MP3s. It doesn't really make much sense in terms of how MP3 compression works, either: the artifacts it introduces at 128+ kbps are pretty orthogonal to nuance in classical music.
It sounds to me like you're describing the difference between a live concert and an uncompressed stereo recording, no matter how well it was mastered.
I got to hear the Sennheiser Orpheus a few years ago. Honestly, it was kind of underwhelming.
It is a very physically beautiful headphone but in terms of sound, it's kind of warm with a slight haze and indistinctness in the treble. That might be pleasant for some people, but I think any modern electrostat like the L700 or SR009 would outperform it significantly if you put them side by side. I assume its value is due mostly to its rarity.
I'm quite certain I can't tell the difference between flac and mp3 v0, but I keep all the music I care about in flac. I don't know what lossy formats will have mainstream support in 10-20 years, but I know I will be able to transcode flac to them.
MP3 is often better: it removes close-together frequencies that your ears can't hear anyway, barring tricks like very slow phase shifts. Speakers and amps are asked to do less work, so they often sound better, particularly at high volume.
The same applies to the air the sound travels through, and to your eardrum and biological pickups. You can often tell the difference, but it's arguable whether the signal before or after MP3 processing is an improvement or a degradation in psychoacoustic terms.
I often see DJs who swear by WAVs without knowing how the pitch- and tempo-adjustment algorithms in their equipment work.
E.g., top-of-the-range Pioneer CDJ decks run BusyBox Linux and FFmpeg.
[1] https://hydrogenaud.io/index.php?topic=27324.0