Apart from the selection of codecs, HTML5 audio seems to have pretty much all the bases covered.
As for codecs, depending on your application, you can either pre-transcode all your audio to an appropriate codec, transcode on the fly, or even use a JavaScript transcoder and the Audio Data API.
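For the pre-transcode route, picking a file at runtime could look roughly like the sketch below. It relies on the standard `canPlayType` method of media elements; the `pickSource` helper and the file names are made up for illustration.

```javascript
// Hypothetical helper: return the first source the browser claims it can play.
// `audio` is anything with a canPlayType(mime) method, e.g. `new Audio()`.
function pickSource(audio, candidates) {
  for (const { src, type } of candidates) {
    // canPlayType returns "", "maybe" or "probably"; "" means "cannot play".
    if (audio.canPlayType(type) !== "") {
      return src;
    }
  }
  return null; // nothing playable: caller decides on a fallback
}

// Assumed file names, one per pre-transcoded encoding.
const candidates = [
  { src: "sounds.ogg", type: 'audio/ogg; codecs="vorbis"' },
  { src: "sounds.mp3", type: "audio/mpeg" },
  { src: "sounds.wav", type: "audio/wav" },
];
```

In a browser you would call `pickSource(new Audio(), candidates)` and assign the result to the element's `src`. Note that "maybe" is only a hint, so this is a best guess rather than a guarantee.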
I just came from a project that required heavy use of HTML5 audio, and let me tell you that the 15+ ffmpeg calls during the audio build step were the least painful part of the process.
HTML5 audio in iOS is a bit of a black box, with variable, unpredictable and undocumented behaviors from iOS 5 through 7. The restrictions are intended to shield users from obnoxious behavior, but in the process they make the feature useless for everyone else.
You get one HTML audio element whitelisted in response to a user-initiated event, so you need a "click anywhere to do something useful!" screen. Also, you've got that one element to work with, so something like sound effects with a background track is out of the question.
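A rough sketch of that "click anywhere" step, under the assumption that calling `play()` synchronously inside the gesture handler is what gets the element whitelisted (the behavior is undocumented, so treat this as a guess that happened to work rather than a contract). `unlockOnFirstGesture` is a hypothetical helper name.

```javascript
// Sketch: "unlock" a single audio element from the first user gesture.
// `target` needs addEventListener/removeEventListener (e.g. `document`),
// `audio` needs play() and pause() (e.g. an <audio> element).
function unlockOnFirstGesture(target, audio, onUnlocked) {
  function handler() {
    // Only the first gesture matters, so detach both listeners.
    target.removeEventListener("touchend", handler);
    target.removeEventListener("click", handler);

    // Assumption: play() must be called inside the gesture handler itself.
    const result = audio.play();
    // Newer browsers return a promise that rejects when we pause right away.
    if (result && typeof result.catch === "function") {
      result.catch(() => {});
    }
    audio.pause(); // stay silent until a real sound is requested

    onUnlocked(audio);
  }
  target.addEventListener("touchend", handler);
  target.addEventListener("click", handler);
}
```

Usage would be something like `unlockOnFirstGesture(document, new Audio("sounds.mp3"), startGame)`, where `startGame` and the file name are placeholders.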
Once you've got a whitelisted element, you can then re-use that element to play individual sounds by swapping out the src attribute of that single whitelisted audio tag, which queues a new web request and plays back the sound whenever the web request finishes... which is usually several seconds after you wanted it to play.
This happens regardless of whether or not the sound has been played before; the result isn't cached locally.
So, if you want to produce something remotely workable, you do audio spriting and never switch to a different audio file. You create a build step that pulls in all of your sound files, concatenates them into a single WAV file, generates an audio atlas, and re-encodes the result to mp3, wav and ogg to hit all the major players (rinse and repeat for other channels of audio). That works on most devices, but iPads decide to ignore currentTime assignments on that audio track, so the whole thing is kind of botched.
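To make the spriting idea concrete, here is a minimal sketch of the atlas plus the playback side. The function names, the atlas shape and the silence gap between clips are all assumptions for illustration, not the actual build output from that project, and it leans on exactly the `currentTime` assignment that iPads reportedly ignore.

```javascript
// Build an atlas { name: { start, end } } (seconds) for clips laid end to end
// in one file, with `gap` seconds of assumed silence between them.
function buildAtlas(clips, gap) {
  const atlas = {};
  let cursor = 0;
  for (const { name, duration } of clips) {
    atlas[name] = { start: cursor, end: cursor + duration };
    cursor += duration + gap;
  }
  return atlas;
}

// Returns a play(name) function bound to one audio element and one atlas.
// `audio` needs currentTime, play(), pause() and addEventListener().
function createSpritePlayer(audio, atlas) {
  let stopAt = null; // end time of the clip currently playing, or null

  // One shared listener, so overlapping requests cannot cut each other off.
  audio.addEventListener("timeupdate", () => {
    if (stopAt !== null && audio.currentTime >= stopAt) {
      audio.pause();
      stopAt = null;
    }
  });

  return function play(name) {
    const { start, end } = atlas[name];
    stopAt = end;
    audio.currentTime = start; // seek to the clip within the sprite file
    const result = audio.play();
    if (result && typeof result.catch === "function") {
      result.catch(() => {}); // ignore rejected play() promises in this sketch
    }
  };
}
```

Since `timeupdate` only fires a few times per second, clips will overrun their end time slightly, which is the reason for padding the sprite with silence.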
That's before we try to get stuff predictably working in IE, which can go anywhere from screwing the whole thing up to unpredictably mutating volume levels. That's before we support Firefox, which doesn't loop audio like everything else does and randomly stops playing audio. That's before we support Android, which exhibits different behaviors across the stock browser and Chrome, and so on, and so on.
If you want to play a single audio track in one or two desktop browsers, then sure, the technology is somewhat ready for you. Codecs are the least painful part of the problem. Anything beyond that is a travesty.
So true. We're trying to build a web-based online anatomy trainer[1], which plays success/failure sounds when you click right/wrong, and also adds a voiceover that speaks the anatomy term.
We experience very sketchy behaviour across platforms and browsers (especially mobile). We're relying on SoundManager[2] which seems solid, but doesn't solve all problems. I wonder if anybody else has any suggestions / experience to make things easier somehow?
Did IT in a medical library at a university. The douchey doctors/professors spent a lot of money on a homegrown AV system connected to computers to accomplish part of that task, so I hope you can turn a buck off that. Seems cool.