Portfolio Project: JavaScript Text-to-Speech App: Using the Web Speech API
The Web Speech API is a powerful tool that enables web developers to incorporate speech synthesis (text-to-speech) and speech recognition features into their applications.
In this tutorial, we will focus on the speech synthesis aspect, creating a simple JavaScript Text-to-Speech application using the Web Speech API.
Prerequisites
To follow this tutorial, you will need:
- A basic understanding of HTML, CSS, and JavaScript
- A modern web browser supporting the Web Speech API (e.g., Google Chrome, Microsoft Edge, or Mozilla Firefox)
Setting up the HTML structure
First, let's create a simple HTML structure for our text-to-speech application. Create a new HTML file called index.html and add the following code:
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>JavaScript Text-to-Speech</title>
    <link rel="stylesheet" href="styles.css" />
  </head>
  <body>
    <div class="container">
      <h1>JavaScript Text-to-Speech</h1>
      <textarea id="text-input" placeholder="Type your text here"></textarea>
      <button id="speak-btn">Speak!</button>
    </div>
    <script src="app.js"></script>
  </body>
</html>
Add some basic CSS styles
Create a new CSS file named styles.css and add the following styles to make the UI more appealing:
* {
  box-sizing: border-box;
}
body {
  font-family: Arial, sans-serif;
  display: flex;
  justify-content: center;
  align-items: center;
  height: 100vh;
  margin: 0;
  background-color: #f3f3f3;
}
.container {
  width: 80%;
  max-width: 600px;
  background-color: #fff;
  padding: 30px;
  border-radius: 10px;
  border: 1px solid #eee;
  box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
textarea {
  width: 100%;
  height: 150px;
  padding: 10px;
  font-size: 16px;
  border-radius: 5px;
  border: 1px solid #ccc;
  resize: none;
}
button {
  display: block;
  width: 100%;
  padding: 10px;
  font-size: 18px;
  background-color: #007bff;
  color: #fff;
  border: none;
  border-radius: 5px;
  cursor: pointer;
  margin-top: 20px;
}
button:hover {
  background-color: #0056b3;
}
Now you should have a nice-looking application with no functionality.
Next, let's dive into the heart of the application and add the magic with JavaScript. 👇
Implementing the Text-to-Speech functionality
Now, let's create the JavaScript functionality. Create a new JavaScript file named app.js and add the following code:
// Initialize when the page loads
window.addEventListener("DOMContentLoaded", () => {
  if ("speechSynthesis" in window) {
    document.getElementById("speak-btn").addEventListener("click", () => {
      // Get text from input
      const textInput = document.getElementById("text-input");
      const text = textInput.value.trim();
      // Show alert if no text is written
      if (!text) {
        alert("Please enter some text");
        return;
      }
      // This creates our speech request and the content we want to be spoken
      const utterance = new SpeechSynthesisUtterance(text);
      // Optional: Customize voice properties
      utterance.voice = speechSynthesis.getVoices()[0]; // Choose a voice
      utterance.rate = 1; // Set the speech rate (0.1 to 10)
      utterance.pitch = 1; // Set the speech pitch (0 to 2)
      utterance.volume = 1; // Set the speech volume (0 to 1)
      // Speak the text
      speechSynthesis.speak(utterance);
    });
  } else {
    alert("Your browser does not support the Web Speech API");
  }
});
Let's explain a couple of pieces here in case the comments in the code don't reveal all.
The line const utterance = new SpeechSynthesisUtterance(text); creates our speech request and the content we want to be spoken. It can be customized, as the code shows, by setting properties like pitch and rate.
We can then "speak" by calling speechSynthesis.speak() and passing it our utterance.
Everything else is either optional configuration or there to make our app interactive.
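To make the configurable properties a little more concrete, here is a minimal sketch that clamps rate, pitch, and volume into the ranges the API accepts before applying them to an utterance. The helper names clamp, clampSpeechOptions, and speakText are made up for this example; only SpeechSynthesisUtterance and speechSynthesis.speak() come from the Web Speech API.

```javascript
// Hypothetical helper: keep a number inside [min, max].
// Falls back to a default when the value is missing or not numeric.
function clamp(value, min, max, fallback) {
  const n = parseFloat(value);
  if (!Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}

// Hypothetical helper: normalise user-supplied settings.
function clampSpeechOptions(options = {}) {
  return {
    rate: clamp(options.rate, 0.1, 10, 1), // speech rate: 0.1 to 10
    pitch: clamp(options.pitch, 0, 2, 1), // pitch: 0 to 2
    volume: clamp(options.volume, 0, 1, 1), // volume: 0 to 1
  };
}

// Browser-only: build an utterance, apply the settings, and queue it.
function speakText(text, options) {
  const utterance = new SpeechSynthesisUtterance(text);
  Object.assign(utterance, clampSpeechOptions(options));
  speechSynthesis.speak(utterance);
  return utterance;
}
```

In the browser, a call such as speakText("Hello there", { rate: 1.2 }) would speak slightly faster than normal, with pitch and volume left at their defaults of 1.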
Play around with the index value in utterance.voice = speechSynthesis.getVoices()[0]; and listen to the variety of voices. Run console.log(speechSynthesis.getVoices()) to dive in and see which option is perfect for you. The first value is localized based on the user's location settings.
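One caveat when exploring voices: in some browsers getVoices() returns an empty array until the voice list has finished loading, so logging it right at page load may show nothing. The sketch below prints a readable summary and logs again when the list changes. describeVoices is a hypothetical helper name, not part of the API.

```javascript
// Hypothetical helper: turn voice objects into "index: name (lang)" lines.
// It only reads the name, lang, and default properties of each voice.
function describeVoices(voices) {
  return voices.map((voice, index) => {
    const marker = voice.default ? " [default]" : "";
    return `${index}: ${voice.name} (${voice.lang})${marker}`;
  });
}

// Browser-only: log the summary now, and again whenever the list changes.
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  const logVoices = () =>
    console.log(describeVoices(speechSynthesis.getVoices()).join("\n"));
  logVoices();
  if (typeof speechSynthesis.addEventListener === "function") {
    speechSynthesis.addEventListener("voiceschanged", logVoices);
  }
}
```

The index printed at the start of each line is the same one you would use in getVoices()[index].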
We will add the ability to choose a voice in the next optional section.
The best way to learn about the API is by playing with the configuration, and you can find all the settings in the Mozilla docs (MDN).
Optional: Add a Voice Selection
We will add a select input element to the HTML structure to allow users to choose a voice for the speech synthesis. Update the HTML file to include the new select element:
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>JavaScript Text-to-Speech</title>
    <link rel="stylesheet" href="styles.css">
  </head>
  <body>
    <div class="container">
      <h1>JavaScript Text-to-Speech</h1>
      <!-- New Select options -->
      <select id="voice-select">
        <!-- Voice options will be populated by JavaScript -->
      </select>
      <textarea id="text-input" placeholder="Type your text here"></textarea>
      <button id="speak-btn">Speak!</button>
    </div>
    <script src="app.js"></script>
  </body>
</html>
And to keep it looking tasty, add the styling for this input to your styles.css:
select {
  width: 100%;
  padding: 5px;
  font-size: 16px;
  border-radius: 5px;
  border: 1px solid #ccc;
  resize: none;
  margin-bottom: 10px;
}
Now let's update our JavaScript to include our voice selection. First, let's update the code so we can see the different voice options.
Add this to the top of our app (inside the event listener).
// Get the voice-select element
const voiceSelect = document.getElementById("voice-select");
// Function to populate the voice options
function populateVoiceList() {
  const voices = speechSynthesis.getVoices();
  // Clear existing options so repeated calls don't add duplicates
  voiceSelect.innerHTML = "";
  voices.forEach((voice) => {
    const option = document.createElement("option");
    option.textContent = `${voice.name} (${voice.lang})`;
    option.setAttribute("data-lang", voice.lang);
    option.setAttribute("data-name", voice.name);
    voiceSelect.appendChild(option);
  });
}
// Populate right away for browsers that already have the voices loaded
populateVoiceList();
// Populate again when the voices become available or change
if (speechSynthesis.onvoiceschanged !== undefined) {
  speechSynthesis.onvoiceschanged = populateVoiceList;
}
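The voice list can be long, so it may help to preselect a sensible entry. The sketch below is one possible approach and is not part of the tutorial's code: a hypothetical pickDefaultVoice helper prefers an exact language match (for example against navigator.language), then a voice sharing the base language, then the browser's default voice, and finally the first voice in the list.

```javascript
// Hypothetical helper: choose a reasonable voice for a language tag.
// Returns null when the list is empty.
function pickDefaultVoice(voices, preferredLang) {
  if (!voices.length) return null;
  // Normalise tags defensively, in case a voice reports "en_US" instead of "en-US".
  const normalise = (tag) => (tag || "").toLowerCase().replace("_", "-");
  const wanted = normalise(preferredLang);
  const base = wanted.split("-")[0];
  return (
    voices.find((v) => wanted && normalise(v.lang) === wanted) ||
    voices.find((v) => base && normalise(v.lang).split("-")[0] === base) ||
    voices.find((v) => v.default) ||
    voices[0]
  );
}
```

In the browser, something like voiceSelect.selectedIndex = voices.indexOf(pickDefaultVoice(voices, navigator.language)) at the end of populateVoiceList would preselect the match, assuming the options were appended in the same order as the voices array.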
Now we need to use the selected option when setting utterance.voice. Replace the earlier utterance.voice line with:
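// Get the selected voice (the list may still be empty if voices have not loaded)
const selectedVoiceName = voiceSelect.selectedOptions[0]?.getAttribute("data-name");
const selectedVoice = speechSynthesis
  .getVoices()
  .find((voice) => voice.name === selectedVoiceName);
// Fall back to the browser's default voice when nothing matches
utterance.voice = selectedVoice || null;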
Here's a demo of the full code:
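Since the code in this section arrives in fragments, the following is a rough sketch of how they might fit together in app.js. It assumes the element ids from index.html (voice-select, text-input, speak-btn). The findVoiceByName helper is a made-up name introduced here to keep the lookup in one place.

```javascript
// Hypothetical helper: look up a voice by name, or null if it is not in the list.
function findVoiceByName(voices, name) {
  return voices.find((voice) => voice.name === name) || null;
}

// Wire up the page. Guarded so the file can also be loaded outside a browser.
if (typeof window !== "undefined") {
  window.addEventListener("DOMContentLoaded", () => {
    if (!("speechSynthesis" in window)) {
      alert("Your browser does not support the Web Speech API");
      return;
    }

    const voiceSelect = document.getElementById("voice-select");
    const textInput = document.getElementById("text-input");
    const speakBtn = document.getElementById("speak-btn");

    // Rebuild the <select> from the voices currently available.
    function populateVoiceList() {
      voiceSelect.innerHTML = "";
      speechSynthesis.getVoices().forEach((voice) => {
        const option = document.createElement("option");
        option.textContent = `${voice.name} (${voice.lang})`;
        option.setAttribute("data-name", voice.name);
        voiceSelect.appendChild(option);
      });
    }

    populateVoiceList(); // some browsers have the list ready immediately
    if (speechSynthesis.onvoiceschanged !== undefined) {
      speechSynthesis.onvoiceschanged = populateVoiceList; // others load it later
    }

    speakBtn.addEventListener("click", () => {
      const text = textInput.value.trim();
      if (!text) {
        alert("Please enter some text");
        return;
      }

      const utterance = new SpeechSynthesisUtterance(text);
      const selectedOption = voiceSelect.selectedOptions[0];
      if (selectedOption) {
        utterance.voice = findVoiceByName(
          speechSynthesis.getVoices(),
          selectedOption.getAttribute("data-name")
        );
      }
      utterance.rate = 1; // 0.1 to 10
      utterance.pitch = 1; // 0 to 2
      utterance.volume = 1; // 0 to 1
      speechSynthesis.speak(utterance);
    });
  });
}
```

One design note: rebuilding the list on every voiceschanged event resets the user's selection, which is usually acceptable because the event tends to fire only once or twice shortly after page load.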
What next?
Tutorials are fun, but don't stop after following along.
Change some styling and make it your own!
Try adding some <input type="range"> elements to configure the pitch and rate dynamically, to make sure you understand how to work with the API.
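As a starting point for that exercise, here is one possible sketch. It assumes two sliders with the hypothetical ids rate-input and pitch-input have been added to index.html (for example <input type="range" id="rate-input" min="0.1" max="10" step="0.1" value="1">); neither exists in the markup above. readRangeValue and applySliderSettings are likewise made-up names.

```javascript
// Hypothetical helper: read a slider's value as a number, using a fallback
// when the element is missing or its value is not numeric.
function readRangeValue(input, fallback) {
  const value = input ? parseFloat(input.value) : NaN;
  return Number.isFinite(value) ? value : fallback;
}

// Hypothetical helper: copy the current slider positions onto an utterance.
function applySliderSettings(utterance, rateInput, pitchInput) {
  utterance.rate = readRangeValue(rateInput, 1);
  utterance.pitch = readRangeValue(pitchInput, 1);
  return utterance;
}
```

Inside the click handler, calling applySliderSettings(utterance, document.getElementById("rate-input"), document.getElementById("pitch-input")) just before speechSynthesis.speak(utterance) would pick up whatever the sliders show at that moment.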
If you customize or make your version of this app, leave your CodePen links below so I can see them. 🙌