Naturally Speaking . . .

by rundy on August 17, 2003

The idea of talking to your computer has been around for decades in science fiction and around five or so years in real life. Ever since the idea of talking to a computer was conceived it was heralded by many as the eventual demise of typing. When voice recognition software came into actual existence in the mid- to late 1990s some people were ready to sign the death warrant of the keyboard.

As a writer, I’ve a natural interest in this subject. In the past, I thought talking to a computer and having it understand me would be cool and very science-fiction-like. On a more realistic level, I’ve had a deep and abiding suspicion of those people who said the keyboard was passe. I felt they were taking the cool and science-fiction feeling and pretending the game was, in fact, real.

For years I could do nothing more than read about voice recognition software and be alternately curious and derisive. Only recently did I have the chance to try the technology out for myself. What I discovered both surprised me and confirmed my initial suspicions.

Perhaps a bit ironically, my first two forays into this new technology involved setting up Dragon Naturally Speaking software for two women in their early 70s. I first installed voice recognition software for a neighbor because, though she could type, she expressed herself most naturally when speaking. Since she was working on a book, she complained that it was difficult to write things out by hand–she wanted to explain verbally. So I installed software so she could “write” her book without writing.

My Grandma P was the second person I helped with voice recognition software. She wanted to use this software because she didn’t know how to type, but still wanted to use her computer to communicate. I didn’t actually install Dragon Naturally Speaking for her, but I helped her work through problems using it.

In helping my Grandma and my neighbor I saw that ease of use varied quite a bit between individuals. My neighbor is a very verbal and outgoing person, and she speaks quite clearly. She had a relatively easy time becoming accustomed to using Dragon Naturally Speaking. Beyond getting over the initial learning curve of remembering to turn off the microphone when she didn’t want the program recording her, there was only the problem of learning how to deal with the software’s small errors. (One blooper that sent her into hysterical laughter was when she said the phrase “. . . It takes time to be that” and Dragon translated “. . . It takes time to be fat.”) Dragon Naturally Speaking wasn’t perfect, but I thought it did surprisingly well with her voice. Better than I had expected. The program could be taught to compensate for its errors, and my neighbor quickly adapted to talking to her computer.

On the other end of the scale was my Grandma P, who does not speak as clearly, and is not as talkative a person. If computers or typewriters had been available when she was young and she had been taught how to type, I’m sure that would be her preferred method of using a computer. But neither typewriters nor computers were available in her heyday, and today she feels too old to learn how to type. So she is making a go at trying to talk to her computer.

Success is not coming so easily for Grandma. A good deal of her problem comes from the simple fact that voice recognition technology is not up to the level of the science fiction books. A person can’t walk up to a computer and simply begin babbling at it and expect good word recognition. The software must be trained to recognize each person’s voice. This means my Grandmother has traded the job of learning how to type for the job of teaching a stupid computer program how to listen. This is a lot of work for her–reading off selected passages of text and correcting the program whenever it misunderstands what she is saying. So she skimps on the teaching lessons for the program (and herself) and this makes using Dragon a more frustrating process.

Helping other people use voice recognition software gave me an idea of how the program worked, but walking people through it was not the same thing as using it myself, and I am a curious fellow. I can type at speeds of 60+ words per minute–which I call at least as fast as I can think–so there was no pressing need for me to use voice recognition software. Nonetheless, I wanted to know how well I could use the software and how I would react to it.

Recently when I was over at Grandma P’s I decided to try out her copy of Dragon Naturally Speaking. Going into the test I was aware I had particular problems that I had to overcome. I am a poor speaker. I have a tendency to stutter, and when I’m not stuttering I am speaking too fast. My brain and my mouth simply don’t work together very well–especially when I’m trying to communicate with other people. Short of having a lisp or other physical mouth problem, I figured I was a good difficult test case for today’s voice recognition technology.

The results of my attempt were mixed. To my pleasant surprise, if I spoke slowly and with clear enunciation, Dragon Naturally Speaking recognized my words with a high degree of accuracy. The high mark of its success, I thought, was when the program correctly interpreted the possessive “Sorceress’s,” which I find difficult enough to say, never mind expecting a program to understand that it is possessive. Pretty good, I thought.

Good, but the program could also frustrate me. It wasn’t perfect, and homonyms gave Dragon particular problems. Also, it went no faster, and possibly quite a bit slower, than my own dear typing.

In the end, I was less than enamored. Certainly there will be a place for voice recognition software in the future–it even has a place in the present day. But at this point I still have to laugh when people say it will replace the keyboard. Wonder what I mean? Try to edit a long document using voice commands. How much talking will it take you to shuffle around paragraphs, tighten up sentences, fix tenses, and clarify ideas? Talking your way through a document takes too much time and energy. There are areas where voice recognition software has advantages, but in the area of my writing–essays and novels–I find speaking a distinct disadvantage.

True, part of this may be due to the fact that I am not a very verbal person. But I don’t think that is the entire issue. Writing novels and essays is an inherently different medium than speaking, or casual letter writing. To speak is not the same thing as to write. To try to speak writing will always be awkward. Speech and writing don’t follow the same rhythm. There is a nuanced development and careful crafting to writing which is not present in the same form in speech. Dictating a personal letter can work, because in one sense it is an informal conversation in which only one person is speaking. Dictating an essay or story, which requires critical attention to flow and development, is awkward and frustrating. I, at least, feel a need to look at what I’m writing, to judge it, and to correct it. Writing lives in a different sphere than speaking. I need to see writing, and to manipulate it with my hands. Speech is something that leaves our lips and is gone. Writing is something to be molded until it is perfect, or as near to perfection as it can reach.

A second, less abstract problem with speaking writing is that some writing requires far too much speaking. I can write for hours at a time. Talking for hours on end? I would go hoarse. Resaying a paragraph a half dozen times until I thought it sounded just right? I would go nuts. My hands and my head can work independently–my mouth and my brain cannot. While my hands are putting in corrections that my brain has decided upon, my brain is already working on the next problem.

By the time I finished all my testing, I concluded that voice recognition software was not for me. More than just that, it is folly to call it a replacement for expression through pen or keyboard. Voice recognition will have its place, but I could think of only one place where it might be of use to me. Sometimes I get a whole bunch of ideas crammed into my head, and I just want to let them out. Either they are a bunch of ideas for many different projects, or else they are various distantly related ideas for one story or essay. With everything crammed together in my head I can’t put it down coherently on paper. I want to let the thoughts tumble out, to simply get it down so I can organize it later. This is when turning away from the monitor and simply speaking might work.

Just might. But that is a small use for still expensive software.

Wordsmiths will be using keyboards for a long time to come. That’s my prediction.
