Is Speech to Text Software Effective for Writing Emails?
Speech to text, also known as dictation, means speaking your message aloud and letting software turn it into text, instead of typing it out with your own hands and keyboard. You can use dictation apps for writing emails, drafting documents, writing blog posts, and even just normal browsing of a computer. It can be tricky to configure, and it’s not perfect, but it can be acceptable.
Does Speech to Text Work?
A lot of you might be coming here just wondering whether or not speech to text software actually works. After all, a computer listening to your voice and interpreting your commands is something out of science fiction. It’s only in the last few years that home-based devices like Alexa or Google Home have become anything more than a novelty. Phones have had it a bit longer, with Siri and the other personal assistants. Still, though, if you’ve ever played around with them, you know they can be a little tricky.
There’s a difference between a virtual personal assistant app and a dictation app, though. A dictation app just needs to recognize your words and put them on the screen. A personal assistant app needs to recognize the words, parse their order, figure out what you’re trying to get it to do, and then take action. It doesn’t sound like too much of a difference, but it’s actually huge. Just think about how much difficulty you might have talking to someone with an accent you don’t recognize, and then extend that to a computer program much less powerful than the human brain.
One thing a lot of people tend to forget, though, is that dictation, standard speech to text, has existed for decades. It’s an accessibility feature. People who have trouble using their hands, or have no hands at all, are still able to use a computer through a careful setup of accessibility features. Much like text to speech for the visually impaired, speech to text bypasses the need to use a keyboard to use a computer.
As such, dictation is something of a solved problem. It’s not entirely solved, but it works well enough to save you a lot of time.
Dictation for Various Purposes
Dictation software can be used for a wide variety of purposes. Anything you would type, you can generally command with your voice, if you’re using the right programs. You can’t just hook up a mic to your PC and start yelling at it to open Reddit, though; you need to set up special programs for that.
The number one app available for a hands-free workflow is Dragon. Dragon is actually a suite of different voice recognition apps for various purposes. Dragon Anywhere is a document workflow and professional dictation app. Dragon Legal is essentially a transcription service for law workers, with improved accuracy and recognition of legal terminology most users won’t have cause to say in normal conversation. Dragon Law Enforcement is nowhere near as cool as it sounds, and is basically a way to voice-automate filling out case reports.
If all you want to do is write emails, however, Dragon is going to be overkill. It’s a professional-grade tool and it has a price to match. Their cheapest option is $150 a year, or $300 for a stand-alone version of the app, and with the stand-alone version you then need to pay again to upgrade to each new major version.
There are a bunch of other options for speech to text software, and I’ll cover them below. For now, just remember that when all you need is writing, you don’t need something as complex as Dragon.
The reason is pretty simple: Dragon includes features that allow it to navigate forms and programs, perform various workflow operations, and format your text. You don’t need any of that for writing emails, taking notes, or drafting a blog post.
What Matters in Dictation Software
When you’re looking to evaluate a piece of software for its speech to text capabilities, you want to make sure you’re actually considering factors that matter. Obviously, if you know you want an accessibility program with features that help you navigate a computer or program, you’ll need an advanced piece of software. If you’re working on something simpler, like email, you can get away with pretty much any app. Here’s what I look for when I’m evaluating one.
Price. Obviously, the price is always going to be a factor. I don’t really mind paying for a program even when free alternatives exist, so long as I’m able to get something valuable out of it. If there are perfectly serviceable free options that do everything a paid option can do, there’s little reason to go with the paid app.
Training. One of the quirks of using a speech to text app is the need to train it to recognize your voice. Everyone has an accent, even if you don’t tend to think of yourself as having one. Even the Midwestern US, considered as close to neutral as possible, has its own quirks.
A good speech to text app will require training, which generally involves reading various engineered sentences. This allows the app to learn how you pronounce certain sounds and words, which makes it more accurate. An app with no training will have more errors, while an app that demands too much training puts up a barrier to entry too high for most people to bother with.
Accuracy. Related to training, the accuracy of the program is of crucial importance. The less accurate the dictation is, the more work you need to do to fix the writing after the fact, and thus the less valuable the app becomes. You’re using it to save time or to work on the go, so if you need to take to the keyboard to fix everything later, it doesn’t serve its purpose.
You can, thankfully, test the accuracy of a dictation app very easily. All you need to do is come up with a sample, a couple hundred words, and read the same thing to every app you want to test. Then compare the original to the dictation. Simply count the number of errors and you’ll have an idea of the program’s accuracy.
For an additional bit of knowledge, try using an app both before and after you train it. You can see the amount of improvement training the app will provide. For those apps that don’t allow training, the best you can do is just try it out.
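If you’d rather not count errors by hand, the comparison can be scripted. What follows is only a rough sketch in Python, not a feature of any of the apps mentioned here. It counts the fewest word-level fixes (swapped, missing, or extra words) needed to turn the dictated text back into your original, then divides by the length of the original. The function names and the sample sentences are made up for illustration, and punctuation and capitalization are ignored on the assumption that you only care about the words themselves.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    # Treat curly and straight apostrophes the same so "don’t" matches "don't".
    normalized = text.lower().replace("’", "'")
    return re.findall(r"[a-z0-9']+", normalized)


def word_errors(original, dictated):
    """Count the fewest swapped, missing, or extra words between two texts."""
    ref = tokenize(original)
    hyp = tokenize(dictated)
    # previous[j] holds the error count between the original words seen so far
    # and the first j dictated words (a standard word-level edit distance).
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            swap_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,              # word missing from the dictation
                current[j - 1] + 1,           # extra word in the dictation
                previous[j - 1] + swap_cost,  # same word, or a wrong word
            ))
        previous = current
    return previous[-1]


def error_rate(original, dictated):
    """Errors divided by the number of words in the original sample."""
    word_count = len(tokenize(original))
    if word_count == 0:
        raise ValueError("The original sample has no words in it.")
    return word_errors(original, dictated) / word_count


# Hypothetical sample: what you read, and what an app produced
# before and after training.
original = "Please send the quarterly report to the whole team by Friday."
before_training = "Please send the quarterly report to the hole team buy Friday"
after_training = "Please send the quarterly report to the whole team by Friday"

print(f"Before training: {error_rate(original, before_training):.1%} of words wrong")
print(f"After training:  {error_rate(original, after_training):.1%} of words wrong")
```

Paste in your own sample and each app’s output in place of the made-up sentences. Running it once before training and once after gives you the improvement as a number rather than a gut feeling.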
Languages. In some cases, you might be using foreign words or even an entire language other than the de facto default of English to compose your messages. French Canadians might be using their variety of French. Folks in the American Southwest might be using various dialects of Spanish. If you have any need to use alternative languages, make sure the app you choose supports those languages.
Accessories and Considerations
If you’re going to use a speech to text program, you need to consider a few other things as well. There’s more to speech to text than just the app, after all.
Microphones. You’ve no doubt watched a YouTube video, a livestream, or a commercial where the audio source is poor. A bad microphone with no processing and no filtering will be full of noise. Imagine how much havoc that noise wreaks on a program that is designed to recognize waveforms in audio! Heck, just think about the difference in quality between a radio DJ and a caller who calls into their show. That’s the difference between a serviceable phone mic and a professional studio mic.
Now, I’m not saying you need to get a studio setup in your office in order to do any dictation. However, having a microphone with some decent quality will be better than nothing. Phone mics are often fine, some headsets work fine, and most desktop mics like the Yeti line will work as well.
Surroundings. Unfortunately, using a dictation app isn’t always the best approach to working on the go, which is when you would most like to be hands-free. If your car is reasonably insulated from road noise you might be able to do some work, but the noise of a gym is unlikely to help matters, and you can eliminate the idea of working on a train pretty much entirely.
Of course, you wouldn’t do that in the first place, right? Dictating to your phone in a crowded public space is just rude. It doesn’t matter if you’re composing emails or talking to your sister on the phone.
Pronunciation. Well, pronunciation and enunciation. How well do you say your words, and how accurately do you say them according to phonetics? I know a lot of people who have blind spots with some words, mispronouncing them without even realizing they do it. If you commonly mispronounce words, you might have trouble with those words when you’re dictating them.
To a certain extent, speech to text programs can compensate for some speech issues, but if you have a more prevalent speech impediment, such as a lisp, you might encounter further issues with the apps. In these cases, there’s not really a good solution besides simply trying them out and finding one that works for you.
With all of this in mind, here are some options you have for voice to text apps.
Speech to Text App Options
Dragon – I already covered this one up above. Dragon is the premier option and is going to have the best accuracy, the best training, and the best extensions for workflow management. It’s a lot more than a simple dictation app, and it’s definitely overkill if you just want to streamline your blog posts or email writing. However, if you want to give your hands a rest – on doctor’s orders or not – it can be a godsend. Just be aware of the price before you dig in.
Default Programs – Both Windows and Mac platforms have their own built-in speech to text accessibility options. Windows has Speech Recognition, which allows you to control your computer, dictate text, and generally perform most basic computing actions. You’re not exactly going to be playing a new PC game with it, but it can get you through business tasks without touching a keyboard.
Apple has Dictation, with the same story. It’s an internet-connected app with some additional accuracy because of it, but it’s slightly less useful for navigating your computer on its own. It also has issues if you lose internet without first enabling enhanced dictation.
Both apps are free and are already included and installed on your machine, whichever platform you’re using. You just need to enable them and get started using them.
Google – Google actually has a dictation mode that comes default with Google Docs. If you tend to use the GSuite for email, writing, or note taking, this can be a great option to look into. It’s pretty easy to use, but it’s only available while you’re using Chrome. Sorry, those of you who are Firefox loyalists or Opera die-hards.
Mobile Apps – Don’t forget that your smartphone probably has a dictation app built in. I mentioned them already; remember those personal assistants? Google Now, Cortana, and Siri can all recognize your voice and dictate an email for you, and they’re all free.
Other mobile apps exist as well, like Speech Recognizer (for iOS) and ListNote (for Android). Those are two of the most popular and most useful mobile apps for dictation. There are others, but these are the most refined that you’ll find for cheap or free.
Dictation – If you don’t want to use Chrome, this is basically a web-based shell for Google’s speech recognition app. It has all of the benefits and drawbacks of Google’s speech recognition engine, just with another website acting as an intermediary.
Tazti – This one is sort of a mid-range solution. It has more features and better recognition than most of the free options, including the ones built in to your operating system. On the other hand, it’s less fully featured than Dragon. The mid-range option is reflected in the price, $40 on a perpetual sale.
Regardless of which app you choose, remember that it only works as well as you’ve trained it to work. You need to put in the time to work on enunciation and have decent enough hardware to support it. Otherwise, you’re going to have mixed results at the best of times.