This section describes the two sets of text character processing switches and the control sequences used to turn them on and off. By turning certain switches on or off, you can change or program the way text is converted to speech. One set of switches (which uses N for oN and F for oFf as function IDs) is called the On/Off set. The other set (which uses A for Activate and D for Deactivate as function IDs) is called the Activate/Deactivate set. Each set may have up to 16 individual switches.
The text character processing switches are listed below with their names and default values. Each list of switches is in priority order, with 1 being the highest priority. If the input of a sequence of characters causes a conflict between two switches, the lower numbered switch takes precedence.
These switches can be turned on and off using the Switch On and Switch Off Control Sequences. When the TTSC is first powered on, or is reset, the switches are set to their default values.
The Switch On and Switch Off Control Sequences and the switch functions are described on the following pages.
Table 4-1. Text character processing switches
On/Off Switches
Switch | Name | Default
1 | Alphabetic Pronunciation | Off
2 | Punctuation Pronunciation | Off
3 | Digit Pronunciation | Off
4 | Space Pronunciation | Off
5 | Minus Sign Pronunciation | Off
6 | Full Number Pronunciation | Off
7 | Upper-case Acronym Pronunciation | On
8 | Abbreviation Pronunciation | On
9 | Allophonic Processing | On
10 | Error Character Generation | On
12 | Monotone Pitch | Off
13 | Capital Pronunciation | On
14 | Fast Processing | Off
15 | Carriage Returns after Output | Off
16 | Do not use | --
Activate/Deactivate Switches
Switch | Name | Default
1 | High-frequency Energy Boost | On
2 | State Abbreviation and Zip | Off
3 | Direction Abbreviation Pronunciation | Off
4 | Automatic Ordinalization | Off
5 | Very High Speech Rate | Off
(6) | (Bracket Used as PlusBet Delimiter) | (On)
(7) | (PlusBet or Internal Phoneme Recognition) | (On)
4.2 On/Off Switches
4.2.1 Switch On
On/Off Switch that turns on one or more on/off switches.
Format:
<ESC> [ <p1> ; <p2> ... <pn>N
Parameters:
<p1>, <p2>,...<pn>
Parameters 1 through n specify one or more switches to turn on. Each switch is represented by a numeric value, as shown in Table 4-1. Each switch controls some aspect of the text-to-speech conversion.
The parameter values are separated by semicolons.
Values that are 0 or greater than 16 are invalid and are ignored. If a sequence mixes valid and invalid values, the valid values are used, unless one or more values is greater than 255.
A maximum of 16 values may be specified. If more than 16 are specified, or one or more values is greater than 255, the whole sequence is discarded and TruVoice returns an error message.
Switches that are on remain on until turned off by a Switch Off sequence or until a Reset Sequence occurs. The Switch Off sequence is described below.
N Switch On Function ID. Indicates that the on/off switches are to be turned on.
Example:
In the following example, the first four switches are turned on:
Input: <ESC>[1;2;3;4N<SP> We <SP> met <SP> at <SP> 3:35.
Speech: cap dub-you ee, space, em ee tee, space, ay tee, space, three colon three five period
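The format and parameter rules above can be sketched as a small helper that assembles a switch control sequence before it is sent to the TTSC. This is only an illustration: the function name `build_switch_sequence` is hypothetical, and the choice to reject out-of-range values up front (rather than let the device ignore or discard them) is an assumption of this sketch, not documented device behavior.

```python
ESC = "\x1b"  # the <ESC> control character

def build_switch_sequence(switches, function_id):
    """Return <ESC>[<p1>;<p2>...<pn><function_id> for the given switch numbers.

    function_id is N or F for the On/Off set, A or D for the
    Activate/Deactivate set.
    """
    if function_id not in ("N", "F", "A", "D"):
        raise ValueError("function ID must be N, F, A, or D")
    switches = list(switches)
    if len(switches) > 16:
        # The device would discard a sequence with more than 16 values.
        raise ValueError("more than 16 parameters")
    if any(s > 255 for s in switches):
        # The device would discard the whole sequence in this case too.
        raise ValueError("parameter greater than 255")
    if any(s < 1 or s > 16 for s in switches):
        # The device would silently ignore these; this sketch rejects them.
        raise ValueError("switch numbers must be between 1 and 16")
    return ESC + "[" + ";".join(str(s) for s in switches) + function_id

# The example above: turn on the first four on/off switches.
sequence = build_switch_sequence([1, 2, 3, 4], "N")
```

Rejecting invalid values on the host side keeps mistakes visible in the application instead of relying on the device's error message.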
4.2.2 Switch Off
On/Off Switch that turns off one or more on/off switches.
Format:
<ESC> [ <p1> ; <p2> ... <pn>F
Parameters:
<p1>, <p2>,...<pn>
Parameters 1 through n specify one or more switches to turn off. Each switch is represented by a numeric value, as shown in Table 4-1. Each switch controls some aspect of the text-to-speech conversion.
The parameter values are separated by semicolons.
Values that are 0 or greater than 16 are ignored. If a sequence mixes valid and invalid values, the valid values are used, unless one or more values is greater than 255.
A maximum of 16 values may be specified. If more than 16 are specified, or one or more values is greater than 255, the whole sequence is discarded and TruVoice returns an error message.
Switches that are off remain off until turned on by a Switch On sequence or until a Reset Sequence occurs. The Switch On sequence is described on the previous page.
F Switch Off function ID. Indicates that the on/off switches are to be turned off.
Example:
In the following example, switches 7, 8 and 13 are turned off.
Input: <ESC>[7;8;13F <SP> Mr. <SP> Jones <SP> works <SP> at <SP> BETA <SP> engineering.
Speech: em are Jones works at beta engineering.
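An application that needs to know which on/off switches are currently set could mirror the rules above on the host side. The following is a minimal sketch, not the TTSC's own logic: the names `DEFAULTS` and `apply_sequences` are hypothetical, and treating the unlisted switch 11 and the "do not use" switch 16 as ignored is an assumption.

```python
import re

# Power-on defaults of the on/off set, taken from the switch table.
# Switch 11 is not listed there and switch 16 is marked "Do not use",
# so this sketch does not track them (an assumption).
DEFAULTS = {1: False, 2: False, 3: False, 4: False, 5: False, 6: False,
            7: True, 8: True, 9: True, 10: True, 12: False, 13: True,
            14: False, 15: False}

# <ESC> [ <p1> ; <p2> ... <pn> followed by the function ID N or F.
SEQUENCE = re.compile(r"\x1b\[([0-9;]*)([NF])")

def apply_sequences(text, state=None):
    """Return the switch state after every N/F sequence in text is applied."""
    state = dict(DEFAULTS) if state is None else dict(state)
    for params, function_id in SEQUENCE.findall(text):
        values = [int(p) for p in params.split(";") if p]
        if len(values) > 16 or any(v > 255 for v in values):
            continue  # the whole sequence is discarded
        for value in values:
            if value in state:  # 0 and values above 16 are ignored
                state[value] = (function_id == "N")
    return state

state = apply_sequences("\x1b[7;8;13F Mr. Jones works at BETA engineering.")
# state[7], state[8], and state[13] are now False; the rest keep their defaults.
```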
4.2.3 Alphabetic Pronunciation
(Switch 1)
On/Off Switch that controls the pronunciation of alphabetic, apostrophe, or tilde text characters.
When the Alphabetic Pronunciation Switch is on, the TTSC pronounces each text character "a" through "z," "A" through "Z," apostrophe ('), and tilde (~) individually, thus spelling out words.
The upper case letters "A" through "Z" are preceded by the word "cap," unless Switch 13 is off. The letter "w" is pronounced "dub-you." A pause is inserted between each word, as if a comma appeared between them.
When Switch 1 is on, all other switches for pronunciation of words, abbreviations, acronyms, etc. are bypassed.
When this switch is off (default), the TTSC uses the other switches to determine how these characters are interpreted.
Example:
In this example, the Alphabetic Pronunciation Switch is on.
Input: Mr. <SP> Will's <SP> son <SP> called.
Speech: cap em are, cap dub-you eye ell ell apostrophe ess, ess oh en, see ay ell ell ee dee
4.2.4 Punctuation Pronunciation
(Switch 2)
On/Off Switch that controls the pronunciation of punctuation marks.
When the Punctuation Pronunciation Switch is on, the name of each character listed in Table 4-2 is spoken as indicated.
When this switch is off (default), the name of the character is not pronounced, and the TTSC uses the other switches to determine how these characters are interpreted.
For the hyphen character, this switch overrides the Minus Sign Pronunciation Switch. For the tilde character, this switch overrides the tilde as an emphatic indicator.
Table 4-2. Punctuation pronunciation
Character | Name | Pronunciation
& | Ampersand | ampersand
. | Period | period
? | Question Mark | question
! | Exclamation Point | exclamation
" | Quotation Mark | quote
$ | Dollar Sign | dollar
( | Opening Parenthesis | open paren
) | Closing Parenthesis | close paren
, | Comma | comma
- | Hyphen | dash
: | Colon | colon
; | Semicolon | semicolon
[ | Opening Bracket | open bracket
] | Closing Bracket | close bracket
{ | Opening Brace | open brace
} | Closing Brace | close brace
~ | Tilde | tilde
= | Equals Sign | equals
Example:
In these examples, the Alphabetic Pronunciation Switch is off and the Punctuation Pronunciation Switch is on.
Input: Who <SP> are <SP> you?
Speech: Who are you? Question.
Input: John, <SP> Mary, <SP> and <SP> Scott <SP> left <SP> at <SP>12:35.
Speech: John, comma, Mary, comma, and Scott left at twelve colon thirty five period.
Input: 25 - 12 = 13
Speech: twenty five dash twelve equals thirteen
Input: text-to-speech
Speech: text dash to dash speech
Input: going -- going -- gone
Speech: going dash dash going dash dash gone
Input: $1.10
Speech: dollar one period one zero.
Input: 2:05 a.m.
Speech: two colon oh five ay period em period
4.2.5 Digit Pronunciation
(Switch 3)
Note: TruVoice turns off this switch after it speaks a phrase using the spoken-digits attribute.
On/Off Switch that controls the pronunciation of digits 0 through 9. This switch has priority over the other switches regarding pronunciation of numbers.
When the Digit Pronunciation Switch is on, the TTSC pronounces a number one digit at a time instead of as a whole number. See Table 4-3.
When this switch is off (default), the TTSC uses the number switches described below to determine how these characters are interpreted.
Table 4-3. Digit pronunciation
Digit | Name | Pronunciation
0 | Zero | zero
1 | One | one
2 | Two | two
3 | Three | three
4 | Four | four
5 | Five | five
6 | Six | six
7 | Seven | seven
8 | Eight | eight
9 | Nine | nine
Example:
In these examples, the Punctuation and Digit Pronunciation Switches are on.
Input: 25 - 12 = 13
Speech: two five dash one two equals one three
Input: $1.10
Speech: dollar one period one zero
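The combined effect of the Punctuation and Digit Pronunciation Switches in the examples above can be approximated with two lookup tables built from Tables 4-2 and 4-3. This is a rough sketch under stated assumptions: the function name `speak_symbols` is hypothetical, other characters are simply passed through as words, and the pauses and intonation that punctuation also produces are not modeled.

```python
# Spoken names from Table 4-3 (digits) and Table 4-2 (punctuation).
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}
PUNCTUATION = {"&": "ampersand", ".": "period", "?": "question",
               "!": "exclamation", '"': "quote", "$": "dollar",
               "(": "open paren", ")": "close paren", ",": "comma",
               "-": "dash", ":": "colon", ";": "semicolon",
               "[": "open bracket", "]": "close bracket",
               "{": "open brace", "}": "close brace",
               "~": "tilde", "=": "equals"}

def speak_symbols(text):
    """Name every digit and punctuation mark; keep other text as words."""
    spoken = []
    word = ""
    for ch in text:
        name = DIGITS.get(ch) or PUNCTUATION.get(ch)
        if name or ch.isspace():
            if word:  # close the word collected so far
                spoken.append(word)
                word = ""
            if name:
                spoken.append(name)
        else:
            word += ch
    if word:
        spoken.append(word)
    return " ".join(spoken)

print(speak_symbols("25 - 12 = 13"))  # two five dash one two equals one three
print(speak_symbols("$1.10"))         # dollar one period one zero
```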
4.2.6 Space Pronunciation
(Switch 4)
On/Off Switch that controls the pronunciation of <SP> characters.
When the Space Pronunciation Switch is on, every <SP> character that is encountered is pronounced as the word "space" followed by a short pause. This switch does not affect the behavior of the <SP> character as a word separator.
When this switch is off (default), the word "space" is not spoken when a <SP> is encountered.
Several input control characters act like the <SP> character. These are carriage return <CR>, line feed <LF>, and <TAB>. All three are pronounced "space" if the Space Pronunciation Switch is on.
Example:
In this example, the Alphabetic, Punctuation, and Space Pronunciation Switches are on.
Input: Jill <SP> won.
Speech: cap jay eye ell ell, space, dub-you oh en period
4.2.7 Minus Sign Pronunciation
(Switch 5)
On/Off Switch that controls the pronunciation of the hyphen (-).
When the Minus Sign Pronunciation Switch is on, every hyphen (-) whose next non-space character is a digit ("0" through "9") is pronounced as the word "minus." This switch does not affect the behavior of the hyphen character when it is not followed by a digit.
When this switch is off (default), a hyphen whose next non-space character is a digit is not pronounced, but it does cause a short pause in the speech.
Example:
In this example, the Minus Sign Pronunciation Switch is on.
Input: 25 - 12 = 13
Speech: twenty five minus twelve equals thirteen
In this example, the Minus Sign Pronunciation Switch is off and the Digit Pronunciation Switch is on.
Input: Her <SP> number <SP> is 856-1626.
Speech: Her number is eight five six, one six two six.
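The rule "a hyphen whose next non-space character is a digit" maps naturally onto a regular expression with a lookahead. The sketch below only marks up the text; the function name `mark_minus_signs` is hypothetical, and using a comma to stand for the short pause is an assumption borrowed from the way the speech examples are written.

```python
import re

# A hyphen, with optional surrounding spaces, whose next non-space
# character is a digit. The lookahead leaves the digit in place.
HYPHEN_BEFORE_DIGIT = re.compile(r"\s*-\s*(?=\d)")

def mark_minus_signs(text, switch5_on):
    """Replace qualifying hyphens with 'minus' (on) or a pause (off)."""
    replacement = " minus " if switch5_on else ", "
    return HYPHEN_BEFORE_DIGIT.sub(replacement, text)

print(mark_minus_signs("25 - 12 = 13", True))    # 25 minus 12 = 13
print(mark_minus_signs("856-1626", False))       # 856, 1626
print(mark_minus_signs("text-to-speech", True))  # text-to-speech
```

Hyphens that are not followed by a digit are left untouched, matching the statement that the switch does not affect them.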
4.2.8 Full Number Pronunciation
(Switch 6)
On/Off Switch that controls the pronunciation of three- and four-digit numbers without commas.
If Switch 6 is on, three- and four-digit numbers without commas are pronounced including the words "thousand" or "hundred."
If Switch 6 is off (default), these words are not included when saying the number.
Switch 6 has no effect if the Digit Pronunciation Switch (Switch 3) is on.
Example:
In these examples the Digit Pronunciation Switch is off.
Input: 135
Speech: one thirty five (Switch 6 off)
Speech: one hundred thirty five (Switch 6 on)
Input: 1990
Speech: nineteen ninety (Switch 6 off)
Speech: one thousand nine hundred ninety (Switch 6 on)
Input: 856-1626
Speech: eight fifty six, sixteen twenty six (Switch 6 off)
Speech: eight hundred fifty six, one thousand six hundred twenty six (Switch 6 on)
In this example, the Digit Pronunciation Switch is on.
Input: 856-1626
Speech: eight five six, one six two six (Switch 6 off)
Speech: eight five six, one six two six (Switch 6 on)
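The two readings of a three- or four-digit number can be reproduced with a short routine. This is a sketch that matches the examples above, not the TTSC's actual rules: the function names are hypothetical, numbers with a leading zero are rejected, and numbers whose last two digits are below ten (which the examples do not cover) fall back to the full form as an assumption.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def two_digits(n):
    """Words for 0 through 99, for example 35 -> 'thirty five'."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]

def full_form(n):
    """Words including 'thousand' and 'hundred' (Switch 6 on)."""
    thousands, rest = divmod(n, 1000)
    hundreds, tail = divmod(rest, 100)
    parts = []
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if tail:
        parts.append(two_digits(tail))
    return " ".join(parts)

def say_number(digits, switch6_on):
    """Pronounce a three- or four-digit string without commas."""
    if len(digits) not in (3, 4) or not digits.isdigit() or digits[0] == "0":
        raise ValueError("expected a 3- or 4-digit number without leading zero")
    n = int(digits)
    head, tail = divmod(n, 100)
    if switch6_on or tail < 10:  # tail < 10 is not covered by the examples
        return full_form(n)
    return two_digits(head) + " " + two_digits(tail)

print(say_number("1990", False))  # nineteen ninety
print(say_number("1990", True))   # one thousand nine hundred ninety
```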
4.2.9 Upper Case Acronym Pronunciation
(Switch 7)
On/Off Switch that controls how the TTSC pronounces words in which all the characters are upper case.
When the Upper Case Acronym Pronunciation Switch is on (default), the TTSC pronounces each individual letter of an upper case word. A slower mode of spelling is available by putting a tilde (~) immediately before the upper case word. Upper case words with at least two letters are considered to be acronyms. If the last letter of an acronym is immediately followed by an apostrophe and the letter "s" (or "S"), it is pronounced plural.
When the switch is off, TTSC pronounces upper case words as regular words. Upper case words with vowels are spoken as words and those without vowels are spelled out. This switch should be turned off if the text sent to the TTSC is all upper case characters.
Example:
In these examples, the switch is off.
Input: She <SP> works <SP> at <SP> BETA <SP> engineering.
Speech: She works at beta engineering.
Input: He <SP> owns <SP> JLK <SP> industries.
Speech: He owns jay ell kay industries. NOTE: "JLK" is spelled out in this example because it does not contain a vowel.
Input: HE <SP> OWNS <SP> JLK <SP> INDUSTRIES.
Speech: He owns jay ell kay industries.
Input: It's <SP> SPI's <SP> speech.
Speech: Its spis speech.
Input: He <SP> owns <SP> MR <SP> industries.
Speech: He owns mister industries.
In these examples, the switch is on.
Input: HE <SP> OWNS <SP> JLK <SP> INDUSTRIES.
Speech: aych ee oh dub-you en ess jay ell kay eye en dee you ess tee are eye ee ess.
Input: It's <SP> SPI's <SP> speech.
Speech: Its ess pee eyes speech.
Input: He <SP> owns <SP> MR <SP> industries.
Speech: He owns em are industries.
4.2.10 Abbreviation Pronunciation
(Switch 8)
On/Off Switch that controls the pronunciation of abbreviations.
When the Abbreviation Pronunciation Switch is on (default), the abbreviations listed under "Abbreviations" in Section 3 are recognized and pronounced as shown there.
When the switch is off, abbreviations are not recognized and are pronounced as regular words.
Example:
In these examples, the switch is off.
Input: Mr. <SP> Jones <SP> saw <SP> her.
Speech: cap em are Jones saw her.
Input: It <SP> was <SP> Sept. 12.
Speech: It was sept. Twelve.
In these examples, the switch is on.
Input: Mr. <SP> Jones <SP> saw <SP> her.
Speech: Mister Jones saw her.
Input: It <SP> was <SP> Sept. 12.
Speech: It was September twelve.
4.2.11 Allophonic Processing
(Switch 9)
On/Off Switch that controls whether phoneme characters sent from application programs are processed with allophonic rules specific to the English language. When this switch is on (default), the TTSC applies these rules to the phoneme characters it receives. (See Section 6.)
When this switch is off, the TTSC bypasses the allophonic rules for phoneme characters sent from application programs. It does not bypass the allophonic processing of phonemes generated internally; allophonic processing is always performed on those phonemes.
Example:
Allophonic processing is necessary for high quality synthesis because the acoustic characteristics of phonemes depend on the nature of the surrounding speech material. For example, allophonic rules modify the way "p" is spoken when it occurs after "s" as in the word "spit" and the way "t" is spoken when it occurs before "on" as in "button."
Allophonic processing is also sensitive to the syntactic structure and stress pattern of the current sentence, as well as the speech rate and prosody mode specified by the application program. For example, at a fast speech rate, the string "fast speech" is pronounced "faspeech."
Because allophonic processing improves speech quality, this switch should generally be left on. However, because it uses rules specific to the English language, it may be necessary to turn it off when processing phoneme characters to synthesize speech in another language.
4.2.12 Error Character Generation
(Switch 10)
CAUTION! Do not use this switch.
On/Off Switch that controls whether or not the <BEL> Control Character is sent to the application program when an invalid control sequence is detected.
If Switch 10 is on (default), the <BEL> Character is sent to the application program. If the switch is turned off, the <BEL> character is not sent to the application program.
4.2.13 Monotone Pitch
(Switch 12)
Note: Use this switch with care.
On/Off Switch that controls the baseline pitch value selected by the "Baseline Pitch Select" control sequence.
When this switch is on, TTSC produces monotone speech.
When this switch is off, which is the default setting, the pitch varies like human speech.
This switch affects only speech produced from input characters received after the sequence which turned the switch on.
4.2.14 Capital Pronunciation
(Switch 13)
On/Off Switch that controls the pronunciation of the word "cap" before upper case letters.
The default position for Switch 13 is ON.
Switch 13 is used in conjunction with Switch 1, Alphabetic Pronunciation. The combination of these two switches produces the following situations:
When Switch 1 and Switch 13 are both on, words are spelled out and upper case letters "A" through "Z" are preceded by the word "cap."
When Switch 1 is on and Switch 13 is off, "cap" is not spoken.
When Switch 1 is off (default), Switch 13 has no effect.
Example:
Both Switch 1, Alphabetic Pronunciation, and Switch 13, Capital Pronunciation are on.
Input: Ed <SP> Jones <SP> lives <SP> in <SP> Detroit.
Speech: cap ee dee, cap jay oh en ee ess, ell eye vee ee ess, eye en, cap dee ee tee are oh eye tee.
Switch 1 is on and Switch 13 is off.
Input: Ed <SP> Jones <SP> lives <SP> in <SP> Detroit.
Speech: ee dee, jay oh en ee ess, ell eye vee ee ess, eye en, dee ee tee are oh eye tee.
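The interaction between Switch 1 and Switch 13 can be sketched as a small spelling routine. This is an illustration only: the function name `spell_out` is hypothetical, the letter names are limited to spellings that appear in this section's speech examples (other letters fall back to the letter itself), and punctuation other than the apostrophe and tilde is skipped because other switches govern it.

```python
# Letter names as they are written in this section's speech examples.
LETTER_NAMES = {"a": "ay", "c": "see", "d": "dee", "e": "ee", "h": "aych",
                "i": "eye", "j": "jay", "k": "kay", "l": "ell", "m": "em",
                "n": "en", "o": "oh", "p": "pee", "r": "are", "s": "ess",
                "t": "tee", "u": "you", "v": "vee", "w": "dub-you"}
SYMBOL_NAMES = {"'": "apostrophe", "~": "tilde"}

def spell_out(text, switch13_on=True):
    """Spell text as with Switch 1 on; Switch 13 controls the word 'cap'."""
    spoken_words = []
    for word in text.split():
        names = []
        for ch in word:
            if ch.isalpha():
                name = LETTER_NAMES.get(ch.lower(), ch.lower())
                if ch.isupper() and switch13_on:
                    name = "cap " + name
            elif ch in SYMBOL_NAMES:
                name = SYMBOL_NAMES[ch]
            else:
                continue  # left to the punctuation and digit switches
            names.append(name)
        if names:
            spoken_words.append(" ".join(names))
    # A pause is inserted between words, as if a comma separated them.
    return ", ".join(spoken_words)

print(spell_out("Ed Jones lives in Detroit."))
print(spell_out("Ed Jones lives in Detroit.", switch13_on=False))
```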
4.2.15 Fast Processing
(Switch 14)
On/Off Switch used in special applications to minimize the delay between the receipt of initial text by the TTSC and speech production.
Fast Processing is used when speech must begin quickly, such as in response to a pushed button.
When Switch 14 is off (default), speech is produced by full algorithmic processing, and the delay between receipt of initial text and the start of speech production can be as long as one second because of the time required to process an entire sentence.
When Switch 14 is on, the delay is minimized by producing speech with a subset of the text-to-speech processing rules.
Because the delay is reduced at the cost of some speech quality, Fast Processing is not recommended unless the application requires rapid start-up.
When Switch 14 is on, the durations of words and silences in certain portions of the speech are reduced. This is especially noticeable in Word Prosody, where fast processing reduces the time it normally takes to speak a word sequence.
Example:
The Fast Processing switch is useful when it is necessary to start speaking before a complete sentence is received from the application program.
4.2.16 Carriage Return On Output Sequences
(Switch 15)
CAUTION! Do not use this switch.
On/Off Switch that controls whether or not a carriage return is appended to all output sequences which the TTSC sends to the application program.
When this switch is on, a carriage return is appended to all output sequences which the TTSC sends to the application program. This accommodates programs written in high level languages (such as FORTRAN) which require a carriage return as a record terminator.
When this switch is off (default), carriage returns are not appended to output sequences.
4.3 Activate/Deactivate Switches
4.3.1 Switch On
Turns on one or more activate/deactivate switches.
Format:
<ESC> [ <p1> ; <p2> ... <pn> A
Parameters:
<p1>,<p2>,...<pn>
Parameters 1 through n specify one or more switches to turn on. Each switch is represented by a numeric value, as shown in Table 4-1.
Other characteristics of these parameters are the same as those described for the on/off switch parameters.
A Switch On function ID. Indicates that the activate/deactivate switches are to be turned on.
Example:
In the following example, switches 2 and 3 are turned on:
Input: <ESC>[2;3A<SP> We <SP> live <SP> in <SP> N. <SP> Portland, <SP> OR.
Speech: We live in north Portland, Oregon.
4.3.2 Switch Off
Turns off one or more activate/deactivate switches.
Format:
<ESC> [ <p1> ; <p2> ... <pn> D
Parameters:
<p1>,<p2>,...<pn>
Parameters 1 through n specify one or more switches to turn off. Each switch is represented by a numeric value, as shown in Table 4-1.
Other characteristics of these parameters are the same as those for the on/off switch parameters.
D Switch Off function ID. Indicates that the activate/deactivate switches are to be turned off.
Example:
In the following example, switch 7 is turned off.
Input: <ESC>[7D<SP> He <SP> pulled <SP> the <SP> [BO].
Speech: He pulled the bow.
4.3.3 High Frequency Energy Boost
(Switch 1)
Some sounds, namely fricatives, have their high-frequency energy boosted in order to enhance their intelligibility over long-distance telephone lines. However, as a result, these sounds may sound too loud when heard through loudspeakers, headphones, or some PBX equipment.
This switch, which is normally on, controls this energy boost. It should be left on when the TTSC output is heard over long distance telephone lines. It should be turned off when the TTSC is heard locally at a workstation or over some PBX equipment.
4.3.4 State Abbreviation And Zip Code Pronunciation
(Switch 2)
If Switch 2 is off (default), upper-case two-letter state abbreviations are treated as acronyms (normally spelled), and lower-case two-letter abbreviations are pronounced as two-letter words. Zip codes starting with "0" are pronounced "zero" followed by two two-digit numbers, and other zip codes are pronounced as a string of single digits.
When Switch 2 is on, two-letter state and U.S. territory abbreviations are recognized, and the state and territory names are fully pronounced. The case may be all upper, all lower, or mixed. A final period is optional. The following territories and other entities are included:
AS American Samoa
DC District of Columbia
GU Guam
PR Puerto Rico
TT Trust Territory
VI Virgin Islands
Also, when Switch 2 is on, all number sequences are treated as if they were zip codes, and are pronounced as strings of individual digits.
Example:
In these examples Switch 2 is off.
Input: Our <SP> address <SP> is <SP> Mtn. <SP> View, <SP> CA, <SP> 94043.
Speech: Our address is Mountain View, see aye, nine four zero four three.
Input: He <SP> lives <SP> at <SP> 4675 <SP> MA <SP> Ave <SP> in <SP> Cambridge, <SP> ma, <SP> 02139.
Speech: He lives at forty-six seventy-five em aye Avenue in Cambridge, ma, zero twenty-one thirty-nine.
In these examples Switch 2 is on.
Input: Our <SP> address <SP> is <SP> Mtn. <SP> View, <SP> CA, <SP> 94043.
Speech: Our address is Mountain View, California, nine four zero four three.
Input: He <SP> lives <SP> at <SP> 4675 <SP> MA <SP> Ave <SP> in <SP> Cambridge, <SP> ma, <SP> 02139.
Speech: He lives at four six seven five Massachusetts avenue in Cambridge, Massachusetts, zero two one three nine.
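The zip code rules can be sketched as follows. The function names are hypothetical, and one case is an assumption: the examples do not show how a two-digit group with a leading zero (such as "05") is spoken when Switch 2 is off, so this sketch simply reads it as the single digit.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def pair_words(n):
    """Words for 0 through 99, hyphenated as in the examples (twenty-one)."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else TENS[tens] + "-" + ONES[ones]

def say_zip(zip_code, switch2_on):
    """Pronounce a five-digit zip code under Switch 2 on or off."""
    if len(zip_code) != 5 or not zip_code.isdigit():
        raise ValueError("expected a five-digit zip code")
    if switch2_on or zip_code[0] != "0":
        # Read as a string of single digits.
        return " ".join(ONES[int(d)] for d in zip_code)
    # Switch 2 off and a leading zero: "zero" plus two two-digit numbers.
    first = pair_words(int(zip_code[1:3]))
    second = pair_words(int(zip_code[3:5]))
    return "zero " + first + " " + second

print(say_zip("02139", False))  # zero twenty-one thirty-nine
print(say_zip("02139", True))   # zero two one three nine
print(say_zip("94043", False))  # nine four zero four three
```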
4.3.5 Direction Abbreviation Pronunciation
(Switch 3)
When Switch 3 is off (default), one-letter direction abbreviations are pronounced as the letter, and two-letter abbreviations are treated as acronyms (that is, normally spelled).
When Switch 3 is on, four one-letter direction abbreviations (namely, N., S., W., E.) and four two-letter direction abbreviations (namely, NE, SE, NW, SW) are pronounced as the directions north, south, west, east, northeast, southeast, northwest, southwest. A period is mandatory after the one-letter abbreviations, but is optional for the two-letter abbreviations. Direction abbreviations are not case sensitive.
Example:
In these examples, Switch 3 is on.
Input: Cary <SP> Grant <SP> starred <SP> in <SP> the <SP> movie, <SP> "N. <SP> by <SP> NW."
Speech: Cary Grant starred in the movie, "north by northwest."
Input: 121 <SP> N. <SP> Oak <SP> Rd.
Speech: one twenty-one north Oak road.
Input: 4536 <SP> N <SP> Pl. <SP> NE.
Speech: forty-five thirty-six en place northeast.
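The matching rules for Switch 3 (a mandatory period after one-letter abbreviations, an optional one after two-letter abbreviations, no case sensitivity) can be expressed as a single regular expression. This is a sketch, not the TTSC's recognizer: the name `expand_directions` is hypothetical, SW/southwest is assumed here to complete the four intercardinal directions, and a real recognizer would need more context to avoid false matches in text such as "i.e."

```python
import re

ONE_LETTER = {"n": "north", "s": "south", "w": "west", "e": "east"}
TWO_LETTER = {"ne": "northeast", "se": "southeast",
              "nw": "northwest", "sw": "southwest"}

# Either a two-letter abbreviation standing alone (period optional, so it is
# not consumed) or a one-letter abbreviation followed by a mandatory period.
# The lookbehind keeps matches out of the middle of words and possessives.
DIRECTION = re.compile(r"(?<![\w'])(?:(NE|SE|NW|SW)\b|([NSWE])\.)",
                       re.IGNORECASE)

def expand_directions(text):
    """Expand direction abbreviations as with Switch 3 on."""
    def replace(match):
        if match.group(1):
            return TWO_LETTER[match.group(1).lower()]
        return ONE_LETTER[match.group(2).lower()]
    return DIRECTION.sub(replace, text)

print(expand_directions('the movie, "N. by NW."'))  # the movie, "north by northwest."
print(expand_directions("4536 N Pl. NE."))          # 4536 N Pl. northeast.
```

In the second call the bare "N" is left alone because it has no period, which is why the manual's example speaks it as the letter "en."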
4.3.6 Automatic Ordinalization
(Switch 4)
When Switch 4 is off (default), numbers are read without ordinalization unless explicitly indicated.
When Switch 4 is on, all numbers are automatically pronounced as ordinal numbers. This is useful when pronouncing dates. However, Switch 4 should be turned off before the year part of the date.
Example:
Input: <ESC>[4D <SP> November <SP> 5, <SP> 1986.
Speech: November five, nineteen eighty-six.
Input: <ESC>[4A <SP> November <SP> 5, <SP> 1986.
Speech: November fifth, nineteen eighty-sixth.
Input: <ESC>[4A <SP> November <SP> 5, <SP> <ESC>[4D <SP> 1986.
Speech: November fifth, nineteen eighty-six.
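An application that speaks dates can wrap only the day in the ordinalization switch, as in the last example above. The helper below is a minimal sketch; the function name `date_with_ordinal_day` is hypothetical, and it only assembles the input string shown in that example.

```python
ESC = "\x1b"  # the <ESC> control character

ORDINALS_ON = ESC + "[4A"   # activate Switch 4 (Automatic Ordinalization)
ORDINALS_OFF = ESC + "[4D"  # deactivate Switch 4

def date_with_ordinal_day(month, day, year):
    """Build input text in which the day is ordinalized but the year is not."""
    return f"{ORDINALS_ON} {month} {day}, {ORDINALS_OFF} {year}."

text = date_with_ordinal_day("November", 5, 1986)
# text is "<ESC>[4A November 5, <ESC>[4D 1986." with real escape characters
```

Turning the switch off before the year avoids the "nineteen eighty-sixth" reading shown in the second example.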
4.3.7 Very High Speech Rate
(Switch 5)
When Switch 5 is off (default), the speech rate is specified by the speech rate controls, which are described in Section 5.
When Switch 5 is on, the speech rate is set to 325 words per minute, which is faster than is otherwise available.
Although not necessary, when switch 5 is on, it is desirable to also turn the monotone pitch switch (on/off switch 12) on, and set one of the other speech rate controls to maximum (with <ESC>[25v or <ESC>[250r). These adjustments smooth out the allophone transitions and the pitch excursions.