INTUITY CONVERSANT System Version 6.0
Application Design Guidelines, 585-310-670, Issue 1.0, December 1996
Introduction to Voice Response Application Design

Important Terminology

It is a good idea to become familiar with some of the terms and concepts presented in the rest of this book. The list below is provided to help you do so.

announcement
Speech played by the system to the caller that informs, but does not instruct the caller to act. Compare to prompt.

application
The computer program that defines and controls the voice response transaction between the system and the caller.

barge-in
The capability that allows callers to respond while a prompt is being played. This is similar to dial through for touch tones.

caller
The person who calls for a service, gets connected to the system, and interacts with an application.

connected-digit
A sequence of digits spoken by a caller without intentional or regular pauses in between digits.

dial ahead
The touch-tone recognition capability that allows the system to collect touch tones as they are entered by callers, even before they are asked for. The touch-tone input is then used in the order in which it was received. This allows callers to respond to more than one prompt at a time, without having to listen to the intermediate prompts.

dial through
The touch-tone capability that allows callers to respond while a prompt is being played. The playback of speech ceases and the application responds to the key that was pressed. This is similar to barge-in for WholeWord speech recognition. Also known as talk off.

fax
The capability that allows an application to send stored or dynamically created fax messages at the caller's request, or receive incoming faxes from the caller.

FlexWord speech recognition
The optional system capability that recognizes callers' spoken input based on matching caller speech to word models fashioned from representations of sub-words (phonemes), the smallest unit of speech.

FlexWord Toolkit
An optional software package that provides a point-and-click, graphical mechanism for you to define a custom list of words for use with FlexWord Speech Recognition.

grammar
The set of rules by which speech is recognized by WholeWord speech recognition. For example, a Prompt & Collect statement using the US_1_5 grammar will recognize the words “one,” “two,” “three,” “four,” and “five” in US English.

Graphical Speech Editor
An optional software package that provides a point-and-click, graphical mechanism for you to record and edit prompts and announcements for use in your applications.

key word
One of the words in a list of words that the system is instructed to recognize at a particular point in the transaction.

menu
A prompt that gives callers a choice of two or more options. For example, “For sales, press or say 1. For service, press or say 2. For an attendant, press or say 0.”

phrase screening
The speech recognition capability that decides whether or not a candidate key word is a close enough match to be declared a valid key word.

Prompt & Collect action
The action used to play a prompt to the caller, accept the caller's response, and go to the appropriate place in the application to process the caller's request.

prompt
Speech played by the system that instructs the caller to enter information that is part of a Prompt & Collect action. Compare to announcement.

recognition
The process within the system that compares caller speech to internal models and returns a match to the application.

recognition type
The choices that are associated with the Recognition Type field in the Prompt & Collect action form found in the Script Builder program. The values for recognition type, minimum number of digits, and maximum number of digits work in conjunction to allow the system to select a grammar to be used for that recognition event. See also grammar.

Script Builder
An optional software package that allows you to define and generate voice response applications to run on the INTUITY CONVERSANT system.

speech, custom
The part of the system speech database that includes application-dependent, prerecorded speech phrases.

speech, enhanced basic
The part of the system speech database that includes prerecorded speech phrases corresponding to numbers, ordinal numbers, days of the week, and months of the year. Applications use these phrases to speak numbers, amounts, dates, and times in a natural-sounding way.

speech, prerecorded
A prompt or announcement that has been recorded by a person.

substitution error
An error made by the FlexWord speech recognition software where it mistakes one word for another.

talk off
See dial through.

Text-to-Speech
An optional software package that converts ASCII text into spoken, computer-generated prompts and announcements. This package is supported for the US English language only.

touch-tone
The signal sent when a caller presses any of the 12 keys on a push-button telephone that sends dual tones rather than rotary pulses.

transaction
The exchange of information between the caller and an application. In a typical transaction, the caller dials in to or gets transferred to the system, then the system answers and plays a greeting. The caller enters information in response to spoken prompts and the system speaks information back until the interaction is complete. See also application.

usability
The system quality that reflects whether or not callers can learn and use the features successfully. A system with high usability is called easy to use, or usable.

user interface
The aspect of the application with which callers interact; includes prompts and announcements from the system to which callers respond using touch-tone or speech.

vocabulary
The set of wordlists associated with a particular FlexWord application.

WholeWord speech recognition
The optional system capability that recognizes the language-specific words (and common synonyms) for the digits 0 through 9. Recognition is based on matching caller speech to word models fashioned from many samples of people saying each entire, whole word.

word
A FlexWord speech recognition vocabulary item consisting of either a single word or a phrase of several words.

word spotting
The speech recognition capability that allows the system to pick out key words from a stream of caller speech, which may include extraneous speech, background noise, or caller noises. Works in conjunction with phrase screening.

wordlist
A set of words used with FlexWord speech recognition that can be recognized by the system.
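
The way word spotting and phrase screening work together can be pictured with a text-only analogy. The sketch below is not how the system's recognizer works (that operates on audio and internal word models); it only assumes that each candidate word receives a match score and that a threshold decides whether the candidate is accepted. The function name `spot_key_word`, the use of string similarity as a score, and the 0.8 threshold are all illustrative assumptions.

```python
import difflib
import re


def spot_key_word(utterance, key_words, threshold=0.8):
    """Return the best-matching key word in the utterance, or None.

    Word spotting (analogy): every word in the utterance is compared
    with every key word, so the key word need not be said in isolation.
    Phrase screening (analogy): the best candidate is accepted only if
    its score reaches the threshold.
    """
    tokens = re.findall(r"[a-z']+", utterance.lower())
    best_word, best_score = None, 0.0
    for token in tokens:
        for key_word in key_words:
            # String similarity stands in for an acoustic match score.
            score = difflib.SequenceMatcher(None, token, key_word).ratio()
            if score > best_score:
                best_word, best_score = key_word, score
    # Screening step: reject candidates that are not close enough.
    return best_word if best_score >= threshold else None


if __name__ == "__main__":
    print(spot_key_word("Yes, I do", ["yes", "no"]))
    print(spot_key_word("um, maybe later", ["yes", "no"]))
```

In this toy version, an utterance that contains no word close to a key word is rejected rather than forced onto the nearest key word, which is the role the glossary assigns to phrase screening.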

2 Voice Response Advanced Technologies

Overview

This chapter provides an introduction to the INTUITY™ CONVERSANT® system advanced technologies, including touch-tone and dial pulse recognition, speech recognition, Text-to-Speech, and the Script Builder FAX Actions.

Purpose

The purpose of this chapter is to describe what each advanced technology does, the types of applications for which each is best suited, and how the technologies can work together.

Touch-Tone and Dial Pulse Recognition

The INTUITY CONVERSANT system provides touch-tone recognition and dial pulse recognition as two methods for callers to provide non-spoken input to the system.

Touch-Tone Recognition

Touch-tone recognition is a common feature of voice response applications. The majority of the telephones in the United States are equipped with touch-tone service, but touch-tone availability varies by location. A greater proportion of business locations have touch-tone capabilities than do residences. If your callers have touch-tone telephones, it is economical and efficient to allow touch-tone caller input for most transactions.

Touch-Tone Recognition Uses

Touch-tone input may be used to select choices from a spoken menu. It may also be used to enter numerical data such as credit card numbers or personal identification numbers.

Touch-Tone Recognition Capabilities

Touch-tone recognition recognizes the digits 0 through 9, as well as the asterisk (*) and pound sign (#).

With touch-tone recognition, callers have the option to respond with a tone while a prompt is playing. This capability is known as dial through (it may also be referred to as talk off). As soon as the system detects the tone, the prompt stops.

The system also supports dial ahead. This capability allows the system to collect touch-tone input as it is entered by callers, even before they are prompted for it. The touch-tone input is then used in the order in which it was received. This allows callers to respond to more than one prompt at a time, without having to listen to the intermediate prompts. If experienced callers are familiar with the upcoming prompts, they do not have to wait until the next prompt starts playing before pressing the required touch-tone buttons.

Touch-tone recognition can be used in conjunction with other recognition methods, such as the Dial Pulse Recognition (DPR) software package, as well as the two spoken-input recognition software packages (WholeWord and FlexWord™ speech recognition) discussed later in this chapter.
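
The difference between dial through and dial ahead can be sketched with a toy input buffer. This is only an illustration of the behavior described above, not a system or Script Builder interface: the class `TouchToneSession`, its method names, and the outcome strings are hypothetical, and each step collects a single key for simplicity.

```python
from collections import deque

# The 12 keys that touch-tone recognition accepts, per the text above.
VALID_KEYS = frozenset("0123456789*#")


class TouchToneSession:
    """Toy model of one caller's touch-tone input (hypothetical)."""

    def __init__(self):
        self._buffer = deque()  # keys are used in the order received

    def key_pressed(self, key):
        if key not in VALID_KEYS:
            raise ValueError(f"not a touch-tone key: {key!r}")
        self._buffer.append(key)

    def prompt_and_collect(self, prompt, keys_during_prompt="",
                           keys_after_prompt=""):
        """Collect one key and report what happened to the prompt."""
        if self._buffer:
            # Dial ahead: input was entered before this prompt began,
            # so the caller never has to listen to it.
            outcome = "not played (dial ahead)"
        else:
            for key in keys_during_prompt:
                self.key_pressed(key)
            if self._buffer:
                # Dial through: a tone arrived while the prompt was
                # playing, so playback stops.
                outcome = "interrupted (dial through)"
            else:
                for key in keys_after_prompt:
                    self.key_pressed(key)
                outcome = "played in full"
        key = self._buffer.popleft() if self._buffer else None
        return key, outcome


if __name__ == "__main__":
    session = TouchToneSession()
    # An experienced caller presses 2 then 1 during the first prompt.
    print(session.prompt_and_collect("Main menu", keys_during_prompt="21"))
    print(session.prompt_and_collect("Service menu"))
    print(session.prompt_and_collect("Confirm", keys_after_prompt="#"))
```

The first key interrupts the prompt that is playing, and the second key is held until the next step asks for input, which is why the intermediate prompt is never heard.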

Touch-Tone Recognition Accuracy

The INTUITY CONVERSANT system is very accurate when recognizing touch-tone input. However, callers may make mistakes when entering touch-tones, and a well-designed application must handle caller errors gracefully.

Dial Pulse Recognition

If touch-tone service is not widely available in your area or country of interest, you can offer dial pulse recognition to provide non-spoken input to the system.

Dial Pulse Recognition Uses

The DPR software allows callers with rotary telephones, or push-button telephones that generate dial pulses, to interact with system applications. Much like touch-tone input, dial pulse input may be used to enter menu choices and numerical data such as a bank account number.

Dial Pulse Recognition Capabilities

The DPR software recognizes the digits 0 through 9, but does not recognize the asterisk (*) or pound sign (#). DPR can be used together in an application with touch-tone recognition and with speech recognition (WholeWord or FlexWord). Dial ahead and dial through are not supported for DPR.

Dial Pulse Recognition Accuracy

DPR is slightly less accurate than touch-tone recognition, but it does allow callers to interact with the application without talking to an agent. The accuracy of DPR can be improved, however, by the use of specific application design techniques. These techniques are discussed in greater detail in Chapter 4, “Designing a Voice Response Application.”
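
One practical consequence of the capabilities above is that a menu meant to serve both touch-tone and dial pulse callers should avoid the asterisk and pound sign. The following sketch encodes the two key sets and checks a menu against them. The names `CAPABILITIES` and `keys_unavailable_to_dial_pulse` are hypothetical helpers for illustration, not part of the DPR software.

```python
TOUCH_TONE_KEYS = frozenset("0123456789*#")
DIAL_PULSE_KEYS = frozenset("0123456789")  # no asterisk or pound sign

# Capabilities as stated in the text; the dictionary layout is an assumption.
CAPABILITIES = {
    "touch-tone": {"keys": TOUCH_TONE_KEYS,
                   "dial_ahead": True, "dial_through": True},
    "dial pulse": {"keys": DIAL_PULSE_KEYS,
                   "dial_ahead": False, "dial_through": False},
}


def keys_unavailable_to_dial_pulse(menu_keys):
    """Return the menu keys that a dial pulse caller cannot enter."""
    unknown = set(menu_keys) - TOUCH_TONE_KEYS
    if unknown:
        raise ValueError(f"not telephone keys: {sorted(unknown)}")
    return sorted(set(menu_keys) - DIAL_PULSE_KEYS)


if __name__ == "__main__":
    print(keys_unavailable_to_dial_pulse("120"))
    print(keys_unavailable_to_dial_pulse("12*#"))
```

A designer could run such a check over every menu in an application before offering DPR, and reassign any option that relies on the asterisk or pound sign.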

Speech Recognition

It makes good business sense to provide the capabilities offered by the speech recognition software. Speech recognition allows your callers to:

- Speak their responses, a more natural interface for caller input.
- Provide input if they do not have touch-tone service.
- Provide input for some types of information, like names, that do not have a natural translation to touch-tone input.

WholeWord Speech Recognition

The WholeWord Speech Recognition software package is used to recognize spoken input of connected digits and yes/no responses.

WholeWord Speech Recognition Uses

WholeWord speech recognition is most successful when it is used to augment a touch-tone application to process callers who do not have touch-tone telephones. The best applications first ask callers to indicate whether they have a touch-tone telephone (usually by pressing one on the keypad). If no tone is detected, the application prompts callers to respond with spoken input (instead of transferring the call to an attendant). In this manner, callers who want to provide spoken input can be served by the system, instead of requiring an attendant.

If your application requires input that can easily be mapped to touch-tone signals, do not ignore touch-tone input in favor of speech recognition. For longer digit sequences, touch-tone input is more accurate.

WholeWord and FlexWord speech recognition can be used in the same application for increased flexibility. When developing your application, you can specify a Prompt & Collect action to use either WholeWord or FlexWord speech recognition, depending on what you want the caller to say. For example, you can first prompt “Please say your account number.” After the caller responds, the next prompt is “Choose from the following transactions. Say ‘account balance,’ or say ‘transfer,’ or say ‘attendant’ to speak to a service representative.” See “FlexWord Speech Recognition” below for more information on FlexWord speech recognition.

WholeWord Speech Recognition Capabilities

This section describes the different WholeWord speech recognition capabilities.

WholeWord Speech Recognition Languages

The WholeWord Speech Recognition software is used to recognize words in the following languages:

- Australian English
- Brazilian Portuguese
- Canadian French
- Castilian Spanish
- Dutch
- French
- German
- Japanese
- Latin-American Spanish
- UK English
- US English

Each language can recognize:

- Equivalents for “yes” and “no”
- Single digits (zero through nine) and commonly used synonyms
- A series of digits (also known as connected digits)

Bilingual Applications

Any two of the languages listed above can be used together on a single system to support bilingual applications. A bilingual person can recognize two languages simultaneously. A bilingual application, however, can only recognize speech in a single language at any one Prompt & Collect action. You can design an application that asks callers to indicate their preferred language, then play prompts and announcements, and recognize the words “yes” and “no” as well as spoken digits in one of the two languages installed in your system. This allows your system to understand callers who respond in either language you make available. See “Bilingual, Multilingual, and Non-US English Applications” in Chapter 4, “Designing a Voice Response Application,” for more information.
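
The rule that each Prompt & Collect action recognizes a single language can be sketched as a small data model. Everything in the example is illustrative: the `PromptAndCollect` class, the choice of US English and Latin-American Spanish as the two installed languages, and the word tables are assumptions, and matching on text stands in for recognizing speech.

```python
from dataclasses import dataclass

# Assumed pair of installed languages for this sketch.
INSTALLED_LANGUAGES = ("US English", "Latin-American Spanish")

# Illustrative yes/no words only; the system's actual word models and
# synonyms are not reproduced here.
YES_NO_WORDS = {
    "US English": {"yes": True, "no": False},
    "Latin-American Spanish": {"sí": True, "si": True, "no": False},
}


@dataclass(frozen=True)
class PromptAndCollect:
    """One step of the application, bound to exactly one language."""
    prompt: str
    language: str

    def __post_init__(self):
        if self.language not in INSTALLED_LANGUAGES:
            raise ValueError(f"language not installed: {self.language}")

    def recognize_yes_no(self, spoken):
        """Return True, False, or None if nothing matches this language."""
        return YES_NO_WORDS[self.language].get(spoken.strip().lower())


def choose_language(key):
    """Map a caller's menu key to a language (hypothetical menu layout)."""
    return {"1": INSTALLED_LANGUAGES[0], "2": INSTALLED_LANGUAGES[1]}[key]


if __name__ == "__main__":
    language = choose_language("2")
    step = PromptAndCollect("¿Desea continuar?", language)
    print(step.recognize_yes_no("Sí"))
    print(step.recognize_yes_no("yes"))
```

Because the language is fixed per step, a word from the other installed language is simply not recognized at that step, which is why the application asks for the caller's preferred language first.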

WholeWord Speech Recognition and Key Word Spotting

WholeWord speech recognition also supports key word spotting. A key word is one of the words in a list of words that the system is instructed to recognize at a particular point in the transaction. Key word spotting is the ability of the recognizer to isolate a key word from a stream of caller input, including extraneous speech, background noise, or caller noises. For this reason, callers do not have to say the key word in isolation.

For example, if the recognizer is listening for callers to say “yes” or “no,” it can also recognize “yes” if callers say “Yes, I do.” However, the recognizer finds a key word most accurately when it is said alone, without any other words or other noises before or after it.

WholeWord Speech Recognition and Barge-in

WholeWord speech recognition supports barge-in. Experienced callers do not have to wait until the end of a prompt to begin speaking their responses. As soon as the system recognizes something the caller says, the prompt stops playing. This allows a single application to support both inexperienced and experienced callers.

WholeWord Speech Recognition Accuracy

Decide carefully where to allow spoken input. Spoken input is not the most appropriate input for all applications. Touch-tone input may be faster and more accurate if your callers are often speaking from a noisy environment such as a car or an airport. Some callers may have security concerns about speaking private information (like an account number) aloud if they use your service from outside their homes. If callers must enter a long series of digits or several data items, touch-tone input may achieve better accuracy than WholeWord speech recognition.

FlexWord Speech Recognition

Another way for the system to process spoken input is with the FlexWord Speech Recognition software package. FlexWord speech recognition is used to recognize spoken words from a specific set of words, a vocabulary, that you, the application designer, define. Allowing callers to say the option they want instead of saying a number assigned to the option can make the interaction more natural and easier to use.

FlexWord Speech Recognition Uses

FlexWord speech recognition provides an intuitive, natural interface to the callers and may be most successful in applications where it would be awkward or inconvenient for callers to enter touch-tone input, such as when entering a name.

FlexWord speech recognition applications can support more menu choices than either touch-tone input or WholeWord speech recognition. Menus can range from small, such as a choice of clothing sizes, up through large, such as the names of all 500 people in your company.