HTML has a built-in native audio player interface that we get simply by using the <audio> element. Point it to a sound file and that’s all there is to it. We even get to specify multiple files for better browser support, as well as a little CSS flexibility to style things up, like giving the audio player a border, some rounded corners, and maybe a little padding and margin.

But even with all that… the rendered audio player itself can look a little, you know, plain. Did you know it’s possible to create a custom audio player? Of course we can! While the default <audio> player is great in many cases, having a custom player might suit you better, like if you run a podcast and an audio player is the key element on a website for the podcast. Check out the sweet custom player Chris and Dave set up over at the ShopTalk Show website. The audio player fits in seamlessly with other elements on the page, sporting controls that complement the overall design.

We’re going to take a stab at making our own player in this post. So, put on your headphones, crank up some music, and let’s get to work!

The elements of an audio player

First, let’s examine the default HTML audio players that some of the popular browsers provide: Blink, Firefox, and Internet Explorer. If our goal is to match the functionality of these examples, then we need to make sure our player has a play/pause button, a seek slider, the current time and duration, a volume control slider with a volume percentage, and a mute button.
For the design, we’re not going for anything too fancy here: just a proof of concept sorta thing that we can use to demonstrate how to make something different than what default HTML provides.

Basic markup, styling and scripts for each element

We should first go through the semantic HTML elements of the player before we start building features and styling things. We have plenty of elements to work with here based on the elements we just listed above.

Play/pause button

I think the HTML element appropriate for this button is the <button> element. It will contain the play icon, but the pause icon should also be in this button. That way, we’re toggling between the two rather than taking up space by displaying both at the same time. In the markup, that is a single <button> element that serves as the container for the icon.
So, the question becomes: how do we swap between the two buttons, both visually and functionally? The pause icon will replace the play icon when the play action is triggered. The play button should display when the audio is paused and the pause button should display when the audio is playing.

Of course, a little animation could take place as the icon transitions from play to pause. What would help us accomplish that is Lottie, a library that renders Adobe After Effects animations natively. We don’t have to create the animation in After Effects though. The animated icon we’re going to use is provided for free by Icons8.

New to Lottie? I wrote up a thorough overview that covers how it works.

In the meantime, permit me to describe the Pen for this step. The HTML section contains the following:
The CSS section includes some light styling. The JavaScript is what we need to break down a bit because it’s doing several things:
Here’s what the script is doing, minus the code:
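A minimal sketch of a toggle along those lines, assuming an animation object that exposes playSegments() the way lottie-web animations do. The frame ranges, the state variable, and the function name are placeholders for illustration, not values taken from the Icons8 animation.

```javascript
// Hypothetical frame ranges for the two halves of the icon animation.
const PLAY_TO_PAUSE = [14, 27];
const PAUSE_TO_PLAY = [0, 14];

// `animation` is assumed to expose playSegments(segments, forceFlag),
// as lottie-web animation objects do.
function createPlayPauseToggle(animation) {
  let state = 'play'; // the action the button currently offers

  return {
    get state() {
      return state;
    },
    toggle() {
      if (state === 'play') {
        animation.playSegments(PLAY_TO_PAUSE, true);
        state = 'pause';
      } else {
        animation.playSegments(PAUSE_TO_PLAY, true);
        state = 'play';
      }
      return state;
    },
  };
}

// Browser wiring (hypothetical element and variable names):
// const toggle = createPlayPauseToggle(animation);
// playButton.addEventListener('click', () => toggle.toggle());
```

Keeping the state in one place means the same click handler can later decide whether to play or pause the audio.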
Current time and duration

The current time is like a progress indicator that shows you how much time has elapsed from the start of the audio file. The duration? That’s just how long the sound file is. A <span> element is okay to display these. The <span> element for the current time, which is to be updated every second, has a default text content of 0:00. On the other side, the one for the duration shows the duration of the audio in mm:ss format.
Seek slider and volume control slider

We need a way to move to any point in time in the sound file. So, if I want to skip ahead to the halfway point of the file, I can simply click and drag a slider to that spot in the timeline. We also need a way to control the sound volume. That, too, can be some sort of click-and-drag slider thingy. I would say <input type="range"> is the right HTML element for both of these features.
Styling range inputs with CSS is totally possible, but I’ll tell you what: it is difficult for me to wrap my head around. This article will help. Mad respect to you, Ana. Handling browser support with all of those vendor prefixes is a CSS trick in and of itself. It takes a lot of code on input[type="range"] to get a consistent experience.
Whoa! What does that even mean, right? Styling the progress section of range inputs is a tricky endeavor. Firefox provides the ::-moz-range-progress pseudo-element while Internet Explorer provides ::-ms-fill-lower. As WebKit browsers do not provide any similar pseudo-element, we have to use the ::before pseudo-element to improvise the progress. That explains why, if you noticed, I added event listeners in the JavaScript section to set custom CSS properties that update when the input event is fired on each of the sliders.

One of the native HTML <audio> examples we looked at earlier shows the buffered amount of the audio. A custom property for the buffered width specifies the amount of the audio, as a percentage, that the user can play through without having to wait for the browser to download it. I imitated this feature with the linear-gradient() function on the track of the seek slider. I used the rgba() function in the linear-gradient() function for the color stops to show transparency. The buffered width has a much deeper color compared to the rest of the track. However, we will get to the actual implementation of this feature much later.

Volume percentage

This is to display the volume as a percentage. The text content of this element is updated as the user changes the volume through the slider. Since it is based on user input, I think this element should be the <output> element.
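As a rough sketch of the slider bookkeeping described above, the input handler can turn a slider position into a percentage, store it in a custom property, and mirror the raw value into the volume readout. The function names, the element variables, and the --volume-before-width property name are assumptions made for illustration.

```javascript
// Convert a slider position to a percentage string such as "40%".
// Range inputs report value and max as strings, hence Number().
function toPercent(value, max) {
  return `${(Number(value) * 100) / Number(max)}%`;
}

// Store the slider's progress in a CSS custom property on a container,
// where the stylesheet can use it to size a ::before pseudo-element.
function showRangeProgress(rangeInput, container, propertyName) {
  container.style.setProperty(
    propertyName,
    toPercent(rangeInput.value, rangeInput.max)
  );
}

// Browser wiring (hypothetical element variables and property name):
// volumeSlider.addEventListener('input', (event) => {
//   showRangeProgress(event.target, playerContainer, '--volume-before-width');
//   volumeOutput.textContent = event.target.value;
// });
```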
Mute button

Like for the play and pause actions, this should be in a <button> element. Luckily for us, Icons8 also has an animated mute icon. So we will use the Lottie library here just as we did for the play/pause button.
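That’s all of the basic markup, styling and scripting we need at the moment!

Working on the functionality

The HTML <audio> element has a preload attribute. This attribute gives the browser instructions for how to load the audio file. It accepts one of three values: none (the browser should not load the audio until it is needed), metadata (the browser should load only the metadata, such as the length of the audio), and auto (the browser may load the entire audio file).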
An empty string is equivalent to the auto value. Note, however, that these values are merely hints to the browser. The browser does not have to honor them. For example, if a user is on a cellular network on iOS, Safari does not load any part of the audio, regardless of the preload attribute, unless the user triggers the play action. For this player, we will use the metadata value since it doesn’t require much overhead and we want to display the length of the audio.

What will help us accomplish the features our audio player should have is the JavaScript HTMLMediaElement interface, which the HTMLAudioElement interface inherits. For our audio player code to be as self-explanatory as possible, I’d divide the JavaScript into two sections: presentation and functionality.

First off, we should create an <audio> element in the audio player that has the basic features we want, namely a source file and the preload attribute set to metadata.
Display the audio duration

The first thing we want to display on the browser is the duration of the audio, when it is available. The HTMLAudioElement interface has a duration property, which returns the duration of the audio in seconds. If it is unavailable, it returns NaN.

We’ve set preload to metadata in this example, so the browser should provide us that information up front on load… assuming it respects preload. Since we can be certain that the duration will be available when the browser has downloaded the metadata of the audio, we display it in our handler for the loadedmetadata event, which the interface also provides.
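Something like the following would do it. This is a sketch that assumes audio behaves like an <audio> element and durationEl is any element with a textContent property; the function and parameter names are made up for the example.

```javascript
// Write the raw duration (in seconds) into an element once the
// browser has downloaded the audio's metadata.
function displayDurationOnMetadata(audio, durationEl) {
  audio.addEventListener('loadedmetadata', () => {
    durationEl.textContent = String(audio.duration);
  });
}

// Browser usage (hypothetical selectors):
// displayDurationOnMetadata(
//   document.querySelector('audio'),
//   document.getElementById('duration')
// );
```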
That’s great but, again, we get the duration in seconds. We probably should convert that to a mm:ss format.
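One way to write that conversion is sketched below. The function and variable names are illustrative rather than fixed by anything above.

```javascript
// Convert a time in seconds to an mm:ss string.
function calculateTime(secs) {
  const minutes = Math.floor(secs / 60);
  const seconds = Math.floor(secs % 60);
  // Pad single-digit seconds with a leading zero.
  const returnedSeconds = seconds < 10 ? `0${seconds}` : `${seconds}`;
  return `${minutes}:${returnedSeconds}`;
}

console.log(calculateTime(248)); // 4:08
console.log(calculateTime(50));  // 0:50
```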
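We’re using Math.floor() because the seconds returned by the duration property typically come in decimals.

The third variable, which holds the zero-padded seconds, is necessary for situations where the duration is something like 4 minutes and 8 seconds. We would want to return 4:08, not 4:8.

Quite often, the browser loads the audio metadata so quickly that the loadedmetadata event fires before its listener can be added to the <audio> element. When that happens, the audio duration is not displayed on the browser. Nevertheless, there’s a hack. The HTMLMediaElement has a property called readyState. It returns a number that, according to MDN Web Docs, indicates the readiness state of the media. The values are 0 (HAVE_NOTHING: no information is available), 1 (HAVE_METADATA: the metadata has loaded), 2 (HAVE_CURRENT_DATA: data for the current playback position is available), 3 (HAVE_FUTURE_DATA: enough data to play at least a little ahead is available), and 4 (HAVE_ENOUGH_DATA: enough data is available to play through to the end without interruption).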
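Since only the metadata matters for the duration, one way to use readyState is to run the display code right away when the value is already 1 or higher, and fall back to the event listener otherwise. A sketch, with a hypothetical helper name:

```javascript
const HAVE_METADATA = 1; // same value as HTMLMediaElement.HAVE_METADATA

// Run `callback` as soon as the audio's metadata is available,
// whether it has already loaded or is still on its way.
function whenMetadataReady(audio, callback) {
  if (audio.readyState >= HAVE_METADATA) {
    callback();
  } else {
    audio.addEventListener('loadedmetadata', callback, { once: true });
  }
}

// Browser usage (hypothetical element variables):
// whenMetadataReady(audio, () => {
//   durationEl.textContent = String(audio.duration);
// });
```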
We want to focus on the metadata. So our approach is to display the duration right away if the metadata of the audio is already available. If it is not available, we add the event listener. That way, the duration is always displayed.

Seek slider

The default value of the range slider’s max property is 100. The general idea is that when the audio is playing, the thumb is supposed to be “sliding.” Also, it is supposed to move every second, such that it gets to the end of the slider when the audio ends.

However, if the audio duration is 150 seconds and the value of the slider’s max property is 100, the thumb will get to the end of the slider before the audio ends. This is why it is necessary to set the value of the slider’s max property to the audio duration in seconds. This way, the thumb gets to the end of the slider when the audio ends. Recall that this should happen when the audio duration is available, that is, when the browser has downloaded the audio metadata.

Buffered amount

As the browser downloads the audio, it would be nice for the user to know how much of it they can seek to without delay. The HTMLMediaElement interface provides the buffered and seekable properties. The buffered property returns a TimeRanges object, which indicates the chunks of media that the browser has downloaded. According to MDN Web Docs, a TimeRanges object is a series of non-overlapping ranges of time, with start and stop times. The chunks are usually contiguous, unless the user seeks to another part in the media. The seekable property returns a TimeRanges object, which indicates “seekable” parts of the media, irrespective of whether they’ve been downloaded or not.

Recall that preload is set to metadata on our <audio> element. If, for example, the audio duration is 100 seconds, the buffered property returns a TimeRanges object with a single range, something like 0 to 20. When the audio has started playing, the seekable property would return a TimeRanges object with several ranges, the last of which is something like 90 to 100. It returns multiple chunks of media because, more often than not, byte-range requests are enabled on the server.
What this means is that multiple parts of the media can be downloaded simultaneously. However, we want to display the buffered amount closest to the current playback position. That would be the first chunk (time range 0 to 20), which is also the last chunk when only one range has been downloaded. As the audio starts playing, the browser begins to download more chunks. We would want to display the one closest to the current playback position, which would be the current last chunk returned by the buffered property.

For that, we store the time for the end of the last range in the TimeRanges object returned by the buffered property. This would be 20 if the buffered ranges so far consist of a single 0 to 20 range. We could do the same with the TimeRanges object returned by the seekable property. However, this would be 100 if the last seekable range is 90 to 100, which is the entire audio duration. Note that there are some holes in the TimeRanges object as the browser only downloads some parts of the audio. What this means is that the entire duration would be displayed to the user as the buffered amount, while some parts of the audio are not available yet. Because this won’t provide the best user experience, the buffered approach is what we should use.

As the browser downloads the audio, the user should expect that the buffered amount on the slider increases in width. The HTMLMediaElement provides an event, the progress event, which fires as the browser loads the media. Of course, I’m thinking what you’re thinking! The buffered amount should be updated in the handler for the audio’s progress event.

Finally, we should actually display the buffered amount on the seek slider. We do that by setting the custom property for the buffered width we talked about earlier as a percentage of the value of the slider’s max property. Yes, in the handler for the progress event too.
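Here is a sketch of that calculation. It only relies on the length property and end() method that TimeRanges objects provide; the helper names and the --buffered-width property in the usage comment are assumptions for illustration.

```javascript
// End time (in seconds) of the last buffered range, or 0 when nothing
// has been buffered yet (calling end() on an empty TimeRanges throws).
function lastBufferedEnd(buffered) {
  return buffered.length > 0 ? buffered.end(buffered.length - 1) : 0;
}

// Buffered amount as a percentage of the slider's max value.
function bufferedPercent(buffered, sliderMax) {
  return `${(lastBufferedEnd(buffered) * 100) / Number(sliderMax)}%`;
}

// Browser wiring (hypothetical element variables and property name):
// audio.addEventListener('progress', () => {
//   playerContainer.style.setProperty(
//     '--buffered-width',
//     bufferedPercent(audio.buffered, seekSlider.max)
//   );
// });
```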
Also, because the browser may load the audio before our listeners are attached, we should update the property in the loadedmetadata handler and in the preceding conditional block that checks for the readiness state of the audio. The accompanying Pen combines all that we’ve covered so far.

Current time

As the user slides the thumb along the range input, the range value should be reflected in the <span> element containing the current time of the audio. This tells the user the current playback position of the audio. We do this in the handler of the slider’s input event listener.

If you think the correct event to listen to should be the change event, I beg to differ. Say the user moved the thumb from value 0 to 20. The input event fires at values 1 through to 20. However, the change event will fire only at value 20. If we use the change event, it will not reflect the playback position from values 1 to 19. So, I think the input event is appropriate. Then, in the handler for the event, we pass the slider’s value to the time-formatting function we defined earlier.

We created the function to take time in seconds and return it in a mm:ss format. If you’re thinking, Oh, but the slider’s value is not time in seconds, let me explain. Actually, it is. Recall that we set the value of the slider’s max property to the audio duration, when it is available. Let’s say the audio duration is 100 seconds. If the user slides the thumb to the middle of the slider, the slider’s value will be 50. We wouldn’t want 50 to appear in the current time box because it is not in accordance with the mm:ss format. When we pass 50 to the function, the function returns 0:50 and that would be a better representation of the playback position.

I added that handler to our JavaScript. To see it in action, you can move the seek slider’s thumb back and forth in the accompanying Pen.

Play/pause

Now we’re going to set the audio to play or pause according to the respective action triggered by the user.
If you recall, we created a variable to store the state of the button. That variable is what will help us know when to play or pause the audio. If its value indicates the play state and the button is clicked, our script is expected to play the audio, change the icon from play to pause, and update the state variable.
We already implemented the second and third actions in the handler for the button’s click event. What we need to do is to add the statements to play and pause the audio in the event handler.

It is possible that the user will want to seek to a specific part in the audio. In that case, we set the value of the audio’s currentTime property to the seek slider’s value. The slider’s change event will come in handy here. If we use the input event, various parts of the audio will play in a very short amount of time.

Recall our scenario of 1 to 20 values. Now imagine the user slides the thumb from 1 to 20 in, say, two seconds. That’s 20 seconds of audio playing in two seconds. It’s like listening to Busta Rhymes on 3× speed. I’d suggest we use the change event. The audio will only play after the user is done seeking.

With that out of the way, something needs to be done while the audio is playing. That is to set the slider’s value to the current time of the audio, or in other words, move the slider’s thumb by one tick every second. Since the audio duration and the slider’s max value are the same, the thumb gets to the end of the slider when the audio ends. Now the timeupdate event of the HTMLMediaElement interface should be the appropriate event for this. This event is fired as the value of the media’s currentTime property is updated, which is approximately four times in one second. So in the handler for this event, we could set the slider’s value to the audio’s current time. This should work just fine.

However, there are some things to take note of here. First, while the audio is playing, the handler keeps resetting the slider’s value, so the user cannot comfortably drag the thumb. Second, setting the slider’s value from a script does not fire the slider’s input event, so the current time text and the slider’s progress styling are not updated.
Let’s consider the first issue. To be able to interact with the slider while the audio is playing, we would have to pause the process of updating its value when it receives input. Then, when the slider loses focus, we resume the process. But, we don’t have access to this process. My hack would be to use the requestAnimationFrame() global method for the process. But this time, we won’t be using the timeupdate event for this because it still won’t work. The animation would play forever until the audio is paused, and that’s not what we want. Therefore, we use the play/pause button’s click event.

To use the requestAnimationFrame() method for this feature, we have to put the statement that updates the slider’s value in a function that re-schedules itself with requestAnimationFrame(), keep the request ID that the method returns, and start or cancel the process in the respective blocks of the play/pause button’s click handler.
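Those steps could be sketched roughly as follows. The factory function and its start()/stop() methods are illustrative names, and the frame-scheduling functions are passed in as parameters (defaulting to the browser globals) only so the sketch can run outside a browser.

```javascript
// Keep a seek slider in sync with the audio's current time using
// requestAnimationFrame, with the ability to pause and resume the loop.
function createSeekSliderUpdater(
  audio,
  seekSlider,
  raf = globalThis.requestAnimationFrame,
  caf = globalThis.cancelAnimationFrame
) {
  let requestId = null; // ID of the pending frame request, if any

  function whilePlaying() {
    seekSlider.value = Math.floor(audio.currentTime);
    requestId = raf(whilePlaying); // schedule the next update
  }

  return {
    start() {
      if (requestId === null) requestId = raf(whilePlaying);
    },
    stop() {
      if (requestId !== null) {
        caf(requestId);
        requestId = null;
      }
    },
  };
}

// Browser wiring (hypothetical variables): call updater.start() in the
// block that plays the audio and updater.stop() in the block that pauses it.
```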
But this doesn’t exactly solve our problem. The process is only paused when the audio is paused. We also need to pause the process, if it is in execution (i.e. if the audio is playing), when the user wants to interact with the slider. Then, after the slider loses focus, if the process was ongoing before (i.e. if the audio was playing), we start the process again. For this, we would use the slider’s input event handler to pause the process. To start the process again, we would use the change event because it is fired after the user is done sliding the thumb.

I was able to come up with something for the second issue. I added the statements in the seek slider’s input event handlers to the function that runs while the audio is playing. Recall that there are two event listeners for the slider’s input event: one for the presentation, and the other for the functionality. With the two statements from those handlers added, the function now updates the slider’s value, the current time text, and the slider’s progress styling on every frame. Note that the statement for the progress styling is the seek slider’s statement from the function we created earlier in the presentation section.

Now we’re left with the volume-control functionality. Whew! But before we begin working on that, the accompanying Pen covers all we’ve done so far.

Volume-control

For volume-control, we’re utilizing the second slider, the volume slider. When the user interacts with the slider, the slider’s value is reflected in the volume of the audio and the <output> element we created earlier.

The slider’s max property has a default value of 100. This makes it easy to display its value in the <output> element when it is updated. We could implement this in the input event handler of the slider. However, to implement this in the volume of the audio, we’re going to have to do some math.

The HTMLMediaElement interface provides a volume property, which returns a value between 0 and 1, where 1 is the loudest value.
What this means is if the user sets the slider’s value to 50, we would have to set the volume property to 0.5. Since 0.5 is a hundredth of 50, we could set the volume to a hundredth of the slider’s value. Not bad, right?

Muting audio

Next up is the speaker icon, which is clicked to mute and unmute the audio. To mute the audio, we would use its muted property, which is also available via HTMLMediaElement as a boolean type. Its default value is false, which is unmuted. To mute the audio, we set the property to true. If you recall, we added a click event listener to the speaker icon for the presentation (the Lottie animation). To mute and unmute the audio, we should add the statements to the respective conditional blocks in that handler.

Full demo

The full demo of our custom audio player, in all its glory, lives in the accompanying Pen. But before we call it quits, I’d like to introduce something that will give our user access to the media playback outside of the browser tab where our custom audio player lives. Permit me to introduce to you, drumroll, please…

The Media Session API

Basically, this API lets the user pause, play, and/or perform other media playback actions, but not with our audio player. Depending on the device or the browser, the user initiates these actions through the notification area, media hubs, or any other interface provided by their browser or OS. I have another article just on that if you want more context. The accompanying Pen contains the implementation of the Media Session API.

If you view this Pen on your mobile, take a sneak peek at the notification area. If you’re on Chrome on your computer, check the media hub. If your smartwatch is paired, I’d suggest you look at it. You could also tell your voice assistant to perform some of the actions on the audio. Ten bucks says it’ll make you smile.
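For a feel of what hooking into the Media Session API involves, here is a minimal sketch. It uses the API’s metadata property and setActionHandler() method; the wrapper function, its parameters, and the sample title and artist are hypothetical, and the session and metadata constructor are passed in so the sketch is not tied to a browser.

```javascript
// Register basic media session metadata and action handlers for an audio element.
function registerMediaSession(audio, mediaSession, MetadataCtor, info) {
  mediaSession.metadata = new MetadataCtor(info);

  mediaSession.setActionHandler('play', () => audio.play());
  mediaSession.setActionHandler('pause', () => audio.pause());
  mediaSession.setActionHandler('seekto', (details) => {
    audio.currentTime = details.seekTime;
  });
}

// Browser usage, guarded because not every browser exposes the API:
// if ('mediaSession' in navigator) {
//   registerMediaSession(audio, navigator.mediaSession, MediaMetadata, {
//     title: 'Episode title', // placeholder values
//     artist: 'Show name',
//   });
// }
```

In a real player, the play and pause handlers would also need to update the button icon and state so the custom controls stay in sync.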
🤓

One more thing…

If you need an audio player on a webpage, there’s a high chance that the page contains other stuff. That’s why I think it’s smart to group the audio player and all the code needed for it into a web component. This way, the webpage possesses a form of separation of concerns. I transferred everything we’ve done into a web component, which you can find in the accompanying Pen.

Wrapping up, I’d say the possibilities of creating a media player are endless with the HTMLMediaElement interface. There are so many properties and methods for various functions. Then there’s the Media Session API for an enhanced experience.

What’s the saying? With great power comes great responsibility, right? Think of all the various controls, elements, and edge cases we had to consider for what ultimately amounts to a modest custom audio player. Just goes to show that audio players are more than hitting play and pause. Properly speccing out functional requirements will definitely help plan your code in advance and save you lots of time.