


2 Input Format


2.1 Introduction

This section describes the format for text-to-speech input. It covers the following topics as they apply to formatting text for output as speech:

• The ASCII character set

• Input and output characters

• Control characters

• Data characters

• Text characters

• Phoneme characters

• Word separators

• One-letter words

• An example of formatting a text string

These topics are discussed in greater detail in subsequent sections.


2.2 ASCII Character Set

The ASCII character set is described in the two tables that follow. Table 2-1 lists the ASCII character set as it is used with the control (Ctrl) key. It shows the ASCII character, mnemonic name, hexadecimal value, and the function the character provides when combined with the Ctrl key.

Note: In this manual, a symbol such as <ESC> refers to a control character listed in Table 2-1, NOT the character sequence <-E-S-C->.

Table 2-2 lists the ASCII character set as it is used alone without the control key. This table includes the ASCII character, hexadecimal value, and the name of the character.

Table 2-1 ASCII Character Set with Control Key

ASCII Character With Control  Mnemonic Name  Hex Value  Function
----------------------------  -------------  ---------  -------------------------
@                             <NUL>          00         Null
A                             <SOH>          01         Start of Heading
B                             <STX>          02         Start of Text
C                             <ETX>          03         End of Text
D                             <EOT>          04         End of Transmission
E                             <ENQ>          05         Enquiry
F                             <ACK>          06         Acknowledge
G                             <BEL>          07         Bell
H                             <BS>           08         Backspace
I                             <HT>           09         Horizontal Tabulation
J                             <LF>           0A         Line Feed
K                             <VT>           0B         Vertical Tabulation
L                             <FF>           0C         Form Feed
M                             <CR>           0D         Carriage Return
N                             <SO>           0E         Shift Out
O                             <SI>           0F         Shift In
P                             <DLE>          10         Data Link Escape
Q                             <XON>          11         Device Control 1 (XON)
R                             <DC2>          12         Device Control 2
S                             <XOF>          13         Device Control 3 (XOFF)
T                             <DC4>          14         Device Control 4
U                             <NAK>          15         Negative Acknowledge
V                             <SYN>          16         Synchronous Idle
W                             <ETB>          17         End of Transmission Block
X                             <CAN>          18         Cancel
Y                             <EM>           19         End of Medium
Z                             <SUB>          1A         Substitute
[                             <ESC>          1B         Escape
\                             <FS>           1C         File Separator
]                             <GS>           1D         Group Separator
^                             <RS>           1E         Record Separator
_                             <US>           1F         Unit Separator
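The rows of Table 2-1 follow a single arithmetic rule: the hexadecimal value of each control character is the ASCII value of the key pressed with Ctrl, minus hex 40 (for example, "[" is 5B, and <ESC> is 1B). The following is a minimal sketch of that rule in Python; the function name ctrl_code is illustrative only and is not part of the TTSC.

```python
def ctrl_code(key: str) -> int:
    """Return the code of the control character produced by Ctrl + key.

    Follows the pattern in Table 2-1: control code = ASCII code - 0x40.
    Lower-case letters are treated like their upper-case forms.
    """
    if len(key) != 1:
        raise ValueError("key must be a single character")
    code = ord(key.upper())
    # Table 2-1 covers "@" (0x40) through "_" (0x5F) only.
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no control character is listed for {key!r}")
    return code - 0x40


if __name__ == "__main__":
    for key in "@", "M", "[":
        print(f"Ctrl-{key} -> {ctrl_code(key):02X}")
```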

Table 2-2 ASCII Character Set Without Control

ASCII Character  Hex Value  ASCII Name
---------------  ---------  -------------------
<SP>             20         Space
!                21         Exclamation Point
"                22         Quotation Mark
#                23         Number Sign
$                24         Dollar Sign
%                25         Percent Sign
&                26         Ampersand
'                27         Apostrophe
(                28         Opening Parenthesis
)                29         Closing Parenthesis
*                2A         Asterisk
+                2B         Plus
,                2C         Comma
-                2D         Hyphen
.                2E         Period
/                2F         Slant
0                30         Zero
1                31         One
2                32         Two
3                33         Three
4                34         Four
5                35         Five
6                36         Six
7                37         Seven
8                38         Eight
9                39         Nine
:                3A         Colon
;                3B         Semicolon
< or <LT>        3C         Less Than
=                3D         Equals
> or <GT>        3E         Greater Than
?                3F         Question Mark
@                40         Commercial At
A                41         Upper Case A
B                42         Upper Case B
C                43         Upper Case C
D                44         Upper Case D
E                45         Upper Case E
F                46         Upper Case F
G                47         Upper Case G
H                48         Upper Case H
I                49         Upper Case I
J                4A         Upper Case J
K                4B         Upper Case K
L                4C         Upper Case L
M                4D         Upper Case M
N                4E         Upper Case N
O                4F         Upper Case O
P                50         Upper Case P
Q                51         Upper Case Q
R                52         Upper Case R
S                53         Upper Case S
T                54         Upper Case T
U                55         Upper Case U
V                56         Upper Case V
W                57         Upper Case W
X                58         Upper Case X
Y                59         Upper Case Y
Z                5A         Upper Case Z
[                5B         Opening Bracket
\                5C         Reverse Slant
]                5D         Closing Bracket
^                5E         Circumflex
_                5F         Underline
`                60         Grave Accent
a                61         Lower Case A
b                62         Lower Case B
c                63         Lower Case C
d                64         Lower Case D
e                65         Lower Case E
f                66         Lower Case F
g                67         Lower Case G
h                68         Lower Case H
i                69         Lower Case I
j                6A         Lower Case J
k                6B         Lower Case K
l                6C         Lower Case L
m                6D         Lower Case M
n                6E         Lower Case N
o                6F         Lower Case O
p                70         Lower Case P
q                71         Lower Case Q
r                72         Lower Case R
s                73         Lower Case S
t                74         Lower Case T
u                75         Lower Case U
v                76         Lower Case V
w                77         Lower Case W
x                78         Lower Case X
y                79         Lower Case Y
z                7A         Lower Case Z
{                7B         Opening Brace
|                7C         Vertical Line
}                7D         Closing Brace
~                7E         Tilde
<DEL>            7F         Delete


2.3 Control and Data Characters

Control and data characters direct the TTSC: control characters control its operation, and data characters tell it what to speak.

The TTSC also uses control and data characters to report its status to the application program.

2.3.1 Control Characters

Control characters are used to control the TTSC operation. They are organized into input and output control sequences.

A control sequence consists of either a single character or an Escape sequence. Use of single control characters has been kept to a minimum in the TTSC because such characters may conflict with their standard usage by the application program. The format for the control sequence is:

<ESC>[p1;p2;p3...function id

where:

<ESC>[
    Introduces the sequence. Because the Escape character is equivalent to CTRL-[, this sequence is equivalent to CTRL-[[. In addition, TruVoice converts the sequence ^[[ to <ESC>[ so that it can be entered using standard text characters.

p1;p2;p3
    Optional numeric parameters. These parameters are digits separated by semicolons.

function id
    An alphabetic character, in either upper or lower case, that identifies the function.
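The control sequence format can be sketched in a few lines of Python. The helper names control_sequence and expand_caret_form are hypothetical and are not part of the TTSC; the sketch assumes that parameters are non-negative integers and covers only the textual layout described above, not the meaning of any particular function id.

```python
ESC = "\x1b"  # the Escape character, hex 1B in Table 2-1


def control_sequence(function_id: str, *params: int) -> str:
    """Build <ESC>[p1;p2;p3...function id as a string.

    function_id must be a single ASCII letter; params are optional and are
    written as decimal digits separated by semicolons.
    """
    if len(function_id) != 1 or not (function_id.isascii() and function_id.isalpha()):
        raise ValueError("function id must be a single ASCII letter")
    if any(p < 0 for p in params):
        raise ValueError("parameters must be non-negative (assumption of this sketch)")
    return ESC + "[" + ";".join(str(p) for p in params) + function_id


def expand_caret_form(text: str) -> str:
    """Replace the typed form ^[[ with <ESC>[, mirroring the conversion
    the manual says TruVoice performs."""
    return text.replace("^[[", ESC + "[")


if __name__ == "__main__":
    # repr() makes the otherwise invisible Escape character visible.
    print(repr(control_sequence("r", 200)))
    print(repr(expand_caret_form("^[[5s")))
```

Building sequences through one helper keeps the introducer and the semicolon separators consistent, so only the parameters and the function id vary from call to call.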

Control sequences send control information from the application program to the TTSC. The primary functions performed by control sequences are as follows:

• Text Character Processing

• Changing Text Character Processing Switches

• Speech Modification

• Speech Control and Synchronization

• Data Transfer Control and Synchronization

• Phoneme Character Processing

These functions are described in detail in later sections.

2.3.2 Data Characters

Data characters tell the TTSC what to speak. There are two types of data characters:

• Text Characters

• Phoneme Characters (Phonemes)

2.3.3 Text Characters

A text character is any letter in a word or digit in a number to be spoken, or punctuation or symbol associated with such words and numbers. Text characters include abbreviations and special characters such as in "$1.35," "120 kg," and "5,129," as well as regular text such as "John went home."

In general, text characters in standard typographical form can be used in text-to-speech conversion. The TTSC accepts and appropriately speaks the following:

• Words composed of lower case letters.

• Words composed of upper case letters.

• Words containing an initial upper-case letter.

• A large number of standard abbreviations.

• Numbers with or without commas, with decimal points, and with ordinalizers (for example, 32nd).

• Time of day (for example, 3:05).

• Monetary units if preceded by a dollar sign.

• Standard punctuation.

• Acronyms composed of upper case letters.

The TTSC also accepts other sequences of text characters (e.g., aB#123z), but may not speak them as intended.

Section 3, "Text Character Processing," discusses in detail how the TTSC handles text characters. Special rules controlled by software switches permit the TTSC to modify the way it processes text characters. These are discussed in detail in Section 4.

2.3.4 Phoneme Characters

Text characters can be pronounced differently in different words. For example, the letter C is pronounced like a K in the word cat, but like an S in the word dance.

Phoneme characters (phonemes) give more control over the particular sounds that are generated because, unlike text characters, only one sound is associated with each of them.

Phonemes are used in two situations:

1. When a word might be incorrectly pronounced by the TTSC.

2. To input syntactic data such as the location of ends of phrases, which can improve the speech output quality.

Section 6, "Phoneme Character Processing," explains how to convert a word into phonemes and describes the use of phonemes in detail.

2.3.5 Buffered and Immediate Characters

Because control and data characters may be sent at a faster rate than they are spoken, the TTSC stores them in a buffer that has a capacity of about 500 characters, and processes these buffered characters in sequence as it speaks. The sequence of buffered control and data characters is called the processing list.

It may be necessary to immediately reset or stop the TTSC. To provide for these actions, Immediate Characters are processed immediately upon receipt instead of being buffered and processed in sequence.
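The difference between buffered and immediate characters can be illustrated with a small model. This is a sketch under stated assumptions, not a description of the TTSC internals: this section does not say which characters are immediate, so the caller supplies that set, and the behavior when the buffer is full (here, the character is refused) is also an assumption. The class name InputQueue is hypothetical.

```python
from collections import deque


class InputQueue:
    """Toy model of a processing list with an immediate-character bypass."""

    def __init__(self, immediate_chars, capacity=500):
        # Which characters are immediate is NOT given in this section;
        # the caller must supply them (assumption of this sketch).
        self.immediate = frozenset(immediate_chars)
        self.capacity = capacity          # "about 500 characters"
        self.processing_list = deque()    # buffered characters, in arrival order
        self.handled_immediately = []     # record of characters that bypassed the buffer

    def receive(self, ch):
        """Accept one character; return False if it could not be stored."""
        if ch in self.immediate:
            # Immediate characters are acted on at once, never buffered.
            self.handled_immediately.append(ch)
            return True
        if len(self.processing_list) >= self.capacity:
            return False  # assumed behavior when the buffer is full
        self.processing_list.append(ch)
        return True

    def next_buffered(self):
        """Return the oldest buffered character, or None if the list is empty."""
        return self.processing_list.popleft() if self.processing_list else None
```

In this model a character in the immediate set is handled even when the processing list is full, which is the property that makes an immediate stop or reset possible while buffered text is still waiting to be spoken.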


2.4 Word Separators

Words are usually separated from each other by one or more space characters (<SP>), by control sequences, or by any character that is not "a" through "z" or an apostrophe ('). Multiple consecutive <SP> characters are permissible, but do not introduce additional silence between words.

Words are also separated when the maximum word length of 29 characters is reached. If a word separator has not been encountered by the 29th character, the word is terminated automatically at the 29th character, and the remainder of the word is treated as a new word. Word separators are explained in greater detail in Section 3.
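The two separation rules above can be sketched as follows. This is an illustration only, and it makes two assumptions that the text does not state outright: that "a" through "z" includes the upper-case letters, and that control sequences have already been removed from the input. The function name split_words is hypothetical.

```python
import re

MAX_WORD_LENGTH = 29

# A word is a run of letters and apostrophes; anything else separates words.
# Treating upper-case letters as word characters is an assumption.
_WORD_RUN = re.compile(r"[A-Za-z']+")


def split_words(text):
    """Split text into words, breaking any run longer than 29 characters."""
    words = []
    for run in _WORD_RUN.findall(text):
        # An over-long run ends at its 29th character; the remainder is
        # treated as a new word (and is split again if it is still too long).
        for start in range(0, len(run), MAX_WORD_LENGTH):
            words.append(run[start:start + MAX_WORD_LENGTH])
    return words


if __name__ == "__main__":
    print(split_words("John   went home."))
    print([len(w) for w in split_words("x" * 60)])
```

Because runs of separators collapse into a single boundary, several consecutive spaces yield the same words as one space, which matches the statement that extra <SP> characters add no silence.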


2.5 One-Letter Words

One-letter words (such as "S" and "P") are pronounced as the names of the letters ("ess" and "pee") unless the letter is "a" or "A". The letters "a" and "A" are excluded because "a" is a valid word that is sometimes pronounced in a reduced manner ("uh").

The letters "i" and "I" are pronounced "eye", but in a reduced manner such that when pronouncing "S P I" as "ess pee eye" the "eye" has less stress than "ess" and "pee."


2.6 Formatting Text Strings

The following examples illustrate how text is formatted to provide correct speech. Each example is organized as follows:

Description: Brief description of the example.

Message: The written text of the message.

Input: The control sequences and text as they are sent to the TTSC. The format <abc> refers to a single control character; for example, <CTRL-R> refers to the control character CONTROL R (or DC2), <ESC> refers to the control character ESCAPE (or CONTROL [), and <SP> refers to the space character.

Speech: The spoken TTSC output.

Definition: The definition of the control sequences and text as identified by the *#* symbols below the lines.

Note: The TTSC will not start to speak until it encounters the final period in a sentence.
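The <abc> notation used in the Input lines can be turned into the actual characters with a short script. The sketch below takes its mnemonics from Table 2-1 and Table 2-2. It assumes that literal blanks in a printed Input line are typographic only and that every real space is written as <SP>; the examples do not state this explicitly. The function name expand_notation is hypothetical.

```python
import re

# Mnemonics for hex 00 through 1F, in the order given in Table 2-1.
_CONTROL_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "XON", "DC2", "XOF", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]
MNEMONICS = {name: code for code, name in enumerate(_CONTROL_NAMES)}
# Mnemonics that appear in Table 2-2.
MNEMONICS.update({"SP": 0x20, "LT": 0x3C, "GT": 0x3E, "DEL": 0x7F})

# Matches <NAME> or <CTRL-x>, where x is "@" through "_" or a lower-case letter.
_TOKEN = re.compile(r"<(?:CTRL-([@-_a-z])|([A-Z0-9]+))>")


def expand_notation(notation):
    """Convert manual notation such as '<ESC>[5s Hi<SP>there.' to raw characters."""

    def replace(match):
        key, name = match.group(1), match.group(2)
        if key is not None:
            return chr(ord(key.upper()) & 0x1F)  # Ctrl + key, as in Table 2-1
        if name in MNEMONICS:
            return chr(MNEMONICS[name])
        return match.group(0)  # unknown symbol: leave it untouched

    # Assumption: blanks in the printed notation are for readability only.
    compact = re.sub(r"\s+", "", notation)
    return _TOKEN.sub(replace, compact)


if __name__ == "__main__":
    print(repr(expand_notation("The <SP> quick <SP> brown <SP> fox.")))
    print(repr(expand_notation("<ESC>[200r")))
```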

2.6.1 Example 1

Description: Simple message

Message: The quick brown fox jumped over the lazy dog.

Input: The <SP> quick <SP> brown <SP> fox <SP> jumped <SP> over <SP> the <SP> lazy <SP> dog.

Speech: The quick brown fox jumped over the lazy dog.

2.6.2 Example 2

Description: This example demonstrates the use of control sequences to solve specific problems. The object of this example is to program the TTSC to speak to persons calling a theater for information about a performance.

Message: Welcome to the Bijou theater. Currently showing is "Dull Movie", rated P.G. Show times are 6:00, 8:00, and 10:30 P.M. Admission price is $3.50 for adults & $1.00 for children under 5. For more information, call 856-8255.

Input: <ESC>[200r <ESC>[120p <ESC>[2a
       *1*        *2*        *3*

<ESC>[5s Welcome <SP> to <SP> the <ESC>[1I
*4*      *5*                      *6*

&BE1zb <ESC>[0I theater. <SP><SP>
*7*    *8*      *9*

Currently <SP> showing <SP> is <SP>

"Dull <SP> Movie", <SP> rated <SP> PG.
                                   *10*

<SP><SP>Show <SP> times <SP> are <SP>

6:00, <SP> 8:00, <SP> and <SP> 10:30
*11*

<SP>pm. <SP><SP> Admission <SP> price

<SP>is <SP> $3.50 <SP> for <SP>
            *12*

adults <SP> & <SP> $1 <SP> for
            *13*

<SP>children <SP> under <SP> 5.
                             *14*

<SP><SP>For <SP> more <SP> information,

<SP> call <ESC>[100r <ESC>[3N 856-8255
          *15*       *16*     *17*

<ESC>[3F.
*18*

Speech: Welcome to the beezhew theater. Currently showing is "Dull Movie", rated pee jee. Show times are six o’clock, eight o’clock, and ten thirty pee em. Admission price is three dollars and fifty cents for adults and one dollar for children under five. For more information, call eight five six, eight two five five.

Definition: *1* Select a fairly fast speech rate, so the telephone line will be used efficiently.

*2* Select a baseline pitch that sounds good over the telephone line.

*3* Adjust attenuation to match needs of equipment.

*4* Insert a pause of five tenths of a second to allow time for equipment to get ready.

*5* Begin message.

*6* Select phoneme character input.

*7* Phoneme spelling of BIJOU, an irregularly spelled word.

*8* Select text character input.

*9* Continue the message.

*10* Note acronym PG.

*11* Note times of day.

*12* Note monetary amount.

*13* Note special punctuation character.

*14* Note number.

*15* Lower the speech rate to allow time for the listener to write down the number.

*16* Turn on the Digit Pronunciation Switch.

*17* Note phone number.

*18* Turn off the Digit Pronunciation Switch in case it is not appropriate for the following text.
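To check an input string such as the one in Example 2 before sending it, it can help to split the stream back into control sequences and data. The following is a minimal sketch that recognizes only the <ESC>[p1;p2;p3...function id form described in Section 2.3.1. It does not interpret what any function id means, and the name tokenize is hypothetical.

```python
import re

# <ESC>[ followed by optional digits and semicolons, then one letter.
_CONTROL = re.compile(r"\x1b\[([0-9;]*)([A-Za-z])")


def tokenize(stream):
    """Split a character stream into ('control', id, params) and ('data', text)."""
    tokens = []
    position = 0
    for match in _CONTROL.finditer(stream):
        if match.start() > position:
            # Everything between two control sequences is data.
            tokens.append(("data", stream[position:match.start()]))
        params = [int(p) for p in match.group(1).split(";") if p]
        tokens.append(("control", match.group(2), params))
        position = match.end()
    if position < len(stream):
        tokens.append(("data", stream[position:]))
    return tokens


if __name__ == "__main__":
    # The closing part of Example 2, with <ESC> written as the real character.
    for token in tokenize("\x1b[100r\x1b[3N856-8255\x1b[3F."):
        print(token)
```

Applied to the closing part of Example 2, the sketch yields three control tokens (r with parameter 100, N with 3, F with 3) around the data "856-8255" and the final period, matching items *15* through *18* in the Definition list.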
