A voice-mail system that would instantly and effortlessly guide an after-hours caller to the voice-mail box of the appropriate staff member would be a boon to your schools’ image in the community. Until now, that level of sophistication has been out of reach for most schools. But that soon could change.
Recent advances promise to make speech-recognition technology a near-term solution for schools, especially for specialized applications such as voice-driven directory assistance and information retrieval systems.
“There has been a striking improvement in the accuracy and ease of use of voice-recognition systems just in the last few years,” said Peter Grunwald, president of the market research firm Grunwald Associates. “It clearly has some interesting potential as we go forward.”
Up to now, speech-recognition technology hasn’t caught on widely. Factors such as background noise, variations in human speech, and prohibitive cost have limited its use in schools mainly to helping students with disabilities, who otherwise couldn’t input data, or to helping students develop language and reading skills (“LISTEN: Voice-recognition system helps student readers,” Oct. 1998). But the technology has been improving dramatically in recent years, to the point where it soon could be used much more widely.
Background noise, for example, can largely be cancelled through the use of noise-cancelling microphones attached to headsets. And improved chip technology has reduced the system requirements for voice-recognition software to what is fast becoming standard in new classroom PCs: a Pentium II or equivalent processor with 48 to 64 MB of RAM and a sound card or built-in sound system.
One traditional stumbling block to the technology’s widespread adoption: To understand human speech, voice-recognition systems have required users to “enroll,” or train, on the systems first. Enrolling lets the software “learn” a user’s dialect and speech patterns so it can accurately understand what he or she is saying. For most systems, this has meant as much as an hour of straight reading just to set up the software for a particular user.
But improved technology has reduced the time it takes to enroll users. Lernout & Hauspie, a leader in speech technology, introduced a new version of its Voice Xpress speech-recognition software in June that the company claims is its most user-friendly yet. Version 4 of Voice Xpress reduces enrollment time to just 10 minutes, according to Anatoly Tikhman, president of the company’s applications division.
Speech-recognition technology has begun to infiltrate even conventional applications. General Magic, of Sunnyvale, Calif., has introduced a free service that uses voice-recognition technology to let you check your eMail over the phone, for example.
Called myTalk, the service gives members a toll-free number to call to hear their eMail messages read automatically by the system’s computers. At any point during the process, you can tell the system to skip to the next message, save, delete, or even reply. The system records your response as a voice file that is then attached to an outgoing message.
Another example is Corel’s WordPerfect Office 2000 suite, which is available in an optional Voice-Powered Edition. The software integrates Dragon’s NaturallySpeaking voice-recognition technology with Corel’s WordPerfect word processing package to let users create documents by speaking into a headset microphone.
National competency center
Widespread use by students is probably still far off in schools. “A lab or classroom with 30 students using computers would be bedlam with everyone talking to their computers at once,” points out eSchool News columnist Trevor Shaw, who serves as director of technology for St. Benedict’s Preparatory School in New Jersey.
Still, the infiltration of speech recognition technology into mainstream applications is an indication that the technology is beginning to arrive. And for some specialized applications, such as voice-activated information retrieval or directory assistance systems, the technology is ready for use in schools right now.
Applications such as these were on display at an open house hosted by IBM and Western Connecticut State University on June 22. Through its partnership with IBM, the university has developed a national “competency center” for speech technology, where researchers develop and test voice-recognition applications for the education market.
One of those applications, which will be piloted in K-12 schools this fall, is called Educator’s Toolkit. Its purpose is to create a classroom environment in which teachers can walk freely around the room wearing headsets and talk to the technology around them to control it without using pointers or remote control devices.
“When you try to use technology in the classroom, it’s not seamless; it takes time away from teaching content to launch applications manually,” said Marla Fischer, director of the university’s Center for Technology Research and Productivity, of which the Center of Competence for Speech Technology is a part. “This is a way for teachers to use voice commands to launch video or computer applications, so teachers can focus specifically on classroom instruction.”
Another application to be piloted this fall is Homework Assistant. Developed jointly by IBM and Wizzard Software, Homework Assistant uses an animated icon, or “avatar,” that talks to students and asks them a series of questions. Based on a student’s verbal responses, the software is able to assess his or her individual learning style so homework assignments can be adjusted accordingly.
For example, if the software discovers that a student learns best by hearing the spoken word, it creates an assignment based on speaking. If the student likes to read and learn, the homework will be delivered in written form.
The center plans to pilot Homework Assistant this fall in youth centers and public schools in Connecticut. At the Harambee Youth Center in Danbury, for example, about 30 students will be involved in a study of the system’s effectiveness. Fifteen students will use the speech technology to complete their homework assignments, and 15 will be given traditional homework assistance. Grade-point averages for the two groups will be compared at the end of the school year to assess the results.
At its open house, the Center of Competence for Speech Technology also demonstrated an application used last year by the university. Called Directory Dialer, it’s a voice-activated call-routing system that lets callers leave a message or find a telephone number by asking for the person they wish to speak to by name.
It also finds departments and locations, and this fall it will be expanded to give callers a fax number, pager number, or eMail address as well.
“It’s a real productivity tool,” Fischer said. “It’s got the potential to be very powerful.”
For one thing, the technology behind Directory Dialer is speaker-independent. Because it simply matches the name indicated by the caller to the names in its database, the system has a high degree of accuracy and doesn’t require users to be enrolled, Fischer said.
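To see why matching against a fixed list of names can be accurate without enrollment, consider the rough sketch below. It is not IBM's method: it assumes a recognizer has already turned the caller's speech into text, and it uses Python's standard `difflib` module to pick the closest directory entry. The directory contents, the `route_call` function, and the 0.75 similarity cutoff are all hypothetical, chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical directory: these names and extensions are invented.
DIRECTORY = {
    "alice moreno": "x1001",
    "brian cho": "x1002",
    "dana whitfield": "x1003",
}


def route_call(recognized_name, directory=DIRECTORY, cutoff=0.75):
    """Return (name, extension) for the closest directory entry, or None.

    recognized_name is the text a speech recognizer produced for the
    caller's request; it may contain small recognition errors.
    """
    # Normalize case and whitespace so "Alice  Moreno" matches "alice moreno".
    key = " ".join(recognized_name.lower().split())
    # Compare only against names in the directory, so the answer is
    # always a real entry or nothing at all.
    matches = get_close_matches(key, list(directory), n=1, cutoff=cutoff)
    if not matches:
        return None
    best = matches[0]
    return best, directory[best]
```

Because the system only has to choose among the names it already knows, a slightly misheard request such as "Alice Morino" can still land on the right entry, while a request that resembles nothing in the list returns no match and could be handed to a fallback prompt.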
Directory Dialer also eliminates the maze of touch-tone driven menus. For K-12 schools, Fischer added, it could be a particularly useful application: If a parent calls after 3 p.m., he or she can still get information without having to suffer through a lengthy menu of touch-tone options, thereby improving communications.
Directory Dialer is available commercially through IBM. The software runs on IBM’s Netfinity server with a 400 MHz or faster processor, is capable of handling up to 250,000 names, can be monitored and updated through the internet, and is priced according to the size of the directory, hardware, and services needed.
Which witch is which
IBM’s ViaVoice speech-recognition technology, like that of other vendors, still isn’t perfect. In a demonstration at the June 22 open house, a computer running ViaVoice correctly spelled, “Last summer Barry rowed the boat while Susan rode her bicycle down the road” and “Yesterday I read the book with the red cover.” But it was tripped up by the sentence, “On Halloween could you tell which witch was which?”
Grunwald cautioned that schools thinking of investing in speech-recognition software should understand what they hope to accomplish with it, and what they’re prepared to tolerate in terms of its accuracy, before they plunk down any money.
“Ninety-six percent accuracy sounds very high, but in practice, a four percent error rate could defeat the purpose of what you’re trying to accomplish with the technology,” he said.
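Some back-of-the-envelope arithmetic shows what Grunwald means. The sketch below assumes the accuracy figure applies per word and that errors occur independently of one another. Both are simplifications made here for illustration; neither is a claim from any vendor. The function names are likewise hypothetical.

```python
def expected_errors(word_count, accuracy):
    """Average number of misrecognized words in a passage.

    Assumes `accuracy` is a per-word rate between 0 and 1.
    """
    return word_count * (1.0 - accuracy)


def error_free_probability(word_count, accuracy):
    """Chance that every word in a passage is recognized correctly.

    Assumes each word is recognized independently of the others.
    """
    return accuracy ** word_count


if __name__ == "__main__":
    # A 250-word page and a 10-word sentence at 96 percent accuracy.
    print(round(expected_errors(250, 0.96), 2))
    print(round(error_free_probability(10, 0.96), 3))
```

Under those assumptions, a 250-word page dictated at 96 percent accuracy would contain about 10 mistakes to find and fix, and only about two out of three 10-word sentences would come through with no errors at all.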
Lernout & Hauspie
General Magic Inc.
Western Connecticut State University