Ever wanted to get a room full of web developers to agree on something? Tell them their client wants sound effects for their new corporate website. Chances are good that you'll hear a unanimous groan:
That's because sounds from the browser have always been a touchy subject, and for good reason. They're problematic, controversial, and completely foreign territory for most devs.
We've all seen sound effects deployed on pet projects, fun & artsy sites, or 'crazy-stuff-the-browser-can-do-that-you-didn't-know-about' demos at conferences. But what about bread-and-butter professional sites? Why is such an important layer of user experience absent from nearly all enterprise, commercial, retail, and startup pages nowadays?
Designing aural web experiences has never been commonplace or elegant. There have only been four defining web audio technologies for the browser. As Krilnon points out, we had the object/embed tags that could play embedded MIDI files. The second was essentially a Flash plugin...which everyone is quite eager to forget. The third breakthrough was when HTML5 introduced the audio element, complete with DOM methods, properties, and events for controlling audio and video elements in the browser. But this format was essentially a hacky way to strip a video player down to its audio layer.
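Hacky origins aside, the audio element's DOM API is simple to drive. Here's a minimal sketch; the `playClip` helper name is our own invention, not a standard API, but `currentTime` and `play()` are the element's real properties:

```javascript
// Hypothetical helper: restart and play an HTML5 <audio> element.
// Works on any object exposing the element's currentTime / play() API.
function playClip(audioEl) {
  audioEl.currentTime = 0; // rewind so repeated triggers start from the top
  return audioEl.play();   // play() returns a Promise in modern browsers
}
```

In a real page you'd grab the element with `document.querySelector('audio')` and handle the rejected Promise that browsers return when autoplay is blocked.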
Currently, we can use the powerful Web Audio API to meet the demands of dynamic sound effects, navigation sounds, triggers for real-time events, or even to synthesize our own sounds in the browser itself! Even so, this API comes with its own share of hefty criticism, and it doesn't address the core problem: what's going on around your user when they hear a web sound?
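Synthesizing a UI sound with the Web Audio API needs no audio file at all. A sketch using an OscillatorNode routed through a GainNode; the `beep` helper and its default values are our own choices, not part of the API:

```javascript
// Sketch: synthesize a short sine-wave beep with the Web Audio API.
// `ctx` is an AudioContext created elsewhere (ideally inside a user gesture,
// since browsers block audio contexts that start without one).
function beep(ctx, { freq = 880, duration = 0.15, volume = 0.2 } = {}) {
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  osc.type = 'sine';
  osc.frequency.value = freq;    // pitch in Hz
  gain.gain.value = volume;      // keep UI sounds quiet by default
  osc.connect(gain);
  gain.connect(ctx.destination); // route to the speakers
  osc.start(ctx.currentTime);
  osc.stop(ctx.currentTime + duration);
}
```

The whole audio graph lives and dies in one function call: the oscillator is created, scheduled, and discarded, which is the idiomatic pattern for one-shot UI sounds.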
Designers have little to no control over the context in which a user experiences sound on their web page or app. Are they in public? In a shared working space? At home on a quiet night in? There is also a slew of variables that make it impossible to know how a user will experience sound. Different browsers deliver audio in different ways, while varying internet speeds can affect page load times and performance. Some users may be hearing impaired, while others have their volume at max or, even more common today, permanently muted.
It's very straightforward to design and add sounds to a web page or app. It's very rare to make a bulletproof case for using them. Adding sound to a web page, web app, or native app always means adding risk. When a client asks for sound or sound effects, we're eager to listen to their reasons, but more often than not, the end product winds up cleaner without any sound elements at all—aside from embedded video (arguably more of a visual medium). It's also nearly impossible to run automated A/B tests on sound effects, due to the high variance in real-world situations.
So... should we really use sound more? There are some who argue sounds are underused on the web. And sound UX/UI designers still have jobs in the web industry. Just where are all these sounds going?
We've come to expect that a typical browser experience should immerse us in a sort of audio-deprivation tank, shielding us from the beeps and boops of our myriad devices. From time to time, we'll click on a play button to deliberately break silence with a video or podcast. Other than that, our browsing experience is an exercise in quietude. Much like reading a book, we want browsing to lack auditory feedback. But we don't feel the same about other digital experiences. That's because good audio UX relies on pinpointing user context. Sound UX on the web thrives when you can predict what's happening around the user. Think of the following scenarios where sound design and sound UX play a large role:
Video games like The Witcher 3 or Osu! can get away with brilliant sound effects when navigating through item menus, equipping gear, and changing gameplay settings. It's nearly the same experience as navigating in the browser. The difference is that gamers play in a predictable, specific context in which audio is inherently permitted to dominate their experience:
Sound architecture in games is so advanced that guides often run over 100 pages of C++ tutorials that only scratch the surface of game sound design. But the end result is remarkably unified: whether a user is playing on OSX, Windows, offline, online, or on their console of choice, the aural experience remains consistent throughout, as does their context.
In the world of IoT, interfaces can be minimalist, and sometimes there's no visual interface at all. We can predict that sounds are key drivers of the experience with web-connected devices. They are expected and demanded by the user, especially if there is no other way to interact.
New car-browsing experiences are on the horizon, which will require web developers to think about how their content is consumed in the car. Car apps already employ tactile responses, sound alerts, and voice controls for simple car-mode apps. The future browser experience will also need to adapt to a very specific context: users in self-driving cars.
Popular chat and conferencing apps also employ web sounds: Slack, Skype, and Discord all use notification sounds in the browser.
What's interesting is that their browser versions mimic the same notification sounds as their native apps. These alerts aren't considered intrusive to the same degree, because users opting for conferencing apps expect the same experience they receive outside of the browser. Designers can predict the context of their use.
Then again, do a Google search of "sound" and your favorite messenger or site that makes sound:
The majority of results will be "How do I turn this darn thing off?!"—and we're back to square one! No matter how much a designer or developer thinks their brand's sounds work, there will always be more people who want to turn them off.
Novelty sites, personal art projects, and game sites aside, where should a professional site use professional sounds? Where does a web sound fit in for the 'non-fun' pages which are the workhorse for an enterprise client, digital agency, or ecommerce owner?
If users log into a site to get work done (not just to browse), there's a good opportunity for sound. Take this example from a furniture ecommerce store, where embedding sounds in an internal web app helped workers complete inventory tracking without looking at the screen. Sounds attached to navigation, alerts triggered when fields were missing, and other auditory feedback can all play an important role when you know the context of the work means the user isn't always looking at a screen.
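One way to structure such cues is a declarative map from app events to sound files, so the audio layer stays in one place and is easy to audit or mute. This is a hypothetical sketch; the event names and file names below are invented for illustration:

```javascript
// Hypothetical event-to-sound map for a heads-down workflow app.
// File names are placeholders; the lookup stays pure so it's easy to test.
const SOUND_CUES = {
  'item:saved':    'chime.mp3',
  'field:missing': 'alert.mp3',
  'nav:next':      'tick.mp3',
};

// Resolve an app event to a sound cue; unknown events stay silent.
function cueFor(eventName, cues = SOUND_CUES) {
  return cues[eventName] || null;
}
```

Keeping the mapping separate from playback also means a single mute flag can short-circuit the whole table at once.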
For sites where the user is expected to do a lot of reading, like a blog, news site, or long description page, Text-to-Speech APIs can offer users a way to 'look away' or assist those with sight limitations. This is getting easier to implement, with services from Amazon, Google, and a slew of other third-party companies that use deep learning to read out text in a more human way.
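Before reaching for a cloud service, the browser's built-in Web Speech API is the zero-dependency starting point. A sketch; the `readAloud` wrapper is our own, but `SpeechSynthesisUtterance` and `speechSynthesis` are the real browser API:

```javascript
// Sketch: read page text aloud with the browser's SpeechSynthesis API.
// No network calls or third-party services involved.
function readAloud(text, rate = 1.0) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = rate;             // 0.1 to 10; 1.0 is normal speed
  window.speechSynthesis.cancel();   // stop anything already speaking
  window.speechSynthesis.speak(utterance);
  return utterance;
}
```

Voice quality varies by browser and OS, which is exactly why the deep-learning services above exist; but for an accessibility fallback, this costs nothing to ship.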
This tactic works for ecommerce and retail sites, too. Check out this brand (admittedly a sort of mock retail site), which has personal narrations for the text on the page. You can imagine how this might be useful for an ecommerce owner looking to add a layer of narration, storytelling, and personality through sound—all while the user browses their product page.
Duolingo, one of the most popular language-learning sites, is a great example where auditory feedback is "allowed" by web users:
With only four sound files, users get familiar with the repetitive feedback loops:
Developing key sounds around repetitive actions can be useful if your users interact with your site regularly, rather than passively browsing. Keep in mind that one of the main criticisms of Duolingo is that these sounds aren't easily muted—more obvious controls would help users.
Much like Text-to-Speech AI for blogs, if your web app or site has a car-mode function, the context of 'looking away' is all too relevant for drivers. Take the BBC's Sounds app, which delivers a simplified touch interface, voice commands, and sound alerts so that drivers never have to look away to navigate. Tesla's tablet dashboard has been notorious for not loading web pages correctly, but that is changing. The future car-browser experience will allow drivers to access the browser, search, and visit sites just as they might at home or the office.
Limited commercial appeal shouldn't limit our ability to think with every tool in our kit. There are viable cases for web sounds, and the frontiers remain largely unexplored because of our biases. Examining those biases is also prudent for making the web more accessible to all users, and for preparing for near-future technologies.
But a few things are certain. Sound should never play automatically. Every noise on a webpage must fit a highly specific, predictable context. And finally, users should have an on-demand, ultra-obvious mute control, not one hidden in settings.
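Those rules fit in a few lines of state. A sketch of a hypothetical `SoundGate` guard (our own invention, not a standard API) that keeps everything muted until the user explicitly opts in:

```javascript
// Hypothetical guard implementing the rules above: no autoplay, explicit
// opt-in, and a single obvious toggle to mute everything again.
class SoundGate {
  constructor() {
    this.enabled = false;            // sound stays off until the user opts in
  }
  toggle() {                         // wire this to an always-visible button
    this.enabled = !this.enabled;
    return this.enabled;
  }
  play(playFn) {                     // playFn is whatever makes the noise
    if (!this.enabled) return false; // silently drop sounds while muted
    playFn();
    return true;
  }
}
```

Routing every sound through one gate means the mute button genuinely mutes everything, rather than whichever features remembered to check a setting.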
This article was written by Brenden Arakaki who writes to simplify labyrinthine web and tech concepts for forward-thinking digital brands.
Just a final word before we wrap up. If you have a question and/or want to be part of a friendly, collaborative community of over 220k other developers like yourself, post on the forums for a quick response!