One of the cool things about the Macintosh is the presence of the Mac OS Toolbox, a set of libraries that allows developers to work with features of the Macintosh User Interface. The Toolbox provides programmatic access to a wide range of capabilities, including file and memory management, Apple Events, multimedia (Sound, Speech, and Video output), and more.
One of the cool things about MacPerl is that most of the Mac Toolbox calls are available as Perl modules. The MacPerl interface to the Toolbox allows programmers access to more than 2000 constants and 1300 API calls from the Mac Toolbox. In fact, the Toolbox modules form one of the biggest published sets of Perl XS (eXternal Subroutine) and interface code: 22000+ lines of XS code, 15000+ lines of Perl code.
This month's column focuses on one aspect of the Toolbox: the Speech Manager. Using "normal-looking" Perl code, a MacPerl programmer can make MacPerl speak, tell jokes and stories (with a little prompting)... even sing! If you'd love to have your Perl scripts talking to you, we'll show you how. Haven't got a Macintosh? Try one of the new iMacs; Apple probably has a flavor you'll like :-)
If you don't want to clip the code from this article, all examples are available from the Code Contributions department of The MacPerl Pages.
This month's guest columnist is Brian McNett.
From his earliest days on Apple IIs, until he encountered MacPerl, Brian McNett could frequently be heard to complain about having failed to learn every programming language he'd ever attempted. It may not have actually been that bad... Now, with no excuse left, he quite frequently engages in all facets of Perl on the Mac just short of building XS. He's an avid musician and amateur astronomer, and works as both webmaster and staff writer for Mycoinfo, a small mycology e-journal. When prodded, he's been known to teach the occasional class in gourmet mushroom cultivation. He's 36, and lives in Bremerton, WA, USA, where he's surrounded by both ocean and mountains.
And now, introductions over, readers, please welcome Brian McNett. I'll see you next month.
- Vicki
p.s. If you're a MacPerl user with knowledge to share, I'm looking for other guest columnists for future issues. Just send me a note at macperl@perlmonth.com
The Macintosh has had speech synthesis capabilities for a very long time. Oh, the original 128K Mac didn't do speech, but the first versions of MacinTalk existed then (running on 'souped-up' 512K Macs). Steve Jobs is alleged to have demoed speech to the press (1) on one of these "ahead of its time" Macs during the initial product launch.
Over the years, the code that underlies the Mac's speech capabilities has undergone numerous changes and even complete re-writes. The current version is called "PlainTalk"; it includes speech recognition as well as speech synthesis. Over two dozen voices are included, from the ordinary (if somewhat lugubrious) Fred and Bruce, to more fanciful choices (e.g., Bubbles, Pipe Organ). Besides being available to users in Talking Alerts, spoken email, Speakable Items ("Tell me a Joke") and other programs, the Speech Manager is accessible to Macintosh developers via the Mac Toolbox calls.
What does this have to do with MacPerl, you may ask?
Well, Matthias Neeracher, who is responsible for there being a MacPerl to begin with, has done a very thorough job of porting the Mac Toolbox to MacPerl. Toolbox calls are implemented as Perl XS code, as one might expect for Perl, and are called in much the same way as functions from ordinary modules. (Matthias recently got a job at Apple. He's part of the speech group there, and responsible for maintaining the code base for both the Speech Manager and the Speech Recognition Manager. Coincidence? :-)
When working with the Mac Toolbox, one very strong caveat applies to every call: you must read and understand the pertinent portions of Apple's technical documentation, which comes in the form of the massive tome "Inside Macintosh" (hereafter referred to as IM).
Now it used to be that you could get IM in one of two ways: become a certified Apple Developer (for which you paid a pricey annual fee) and Apple would send you a CD-ROM with IM on it (among other things), or shell out even more money to Addison-Wesley for print versions of each volume. Happily, these days, Apple is much more open with its documentation, and IM is available in its entirety in both HTML and PDF forms (2).
So, what specifically do we need to know about using the Speech Manager? Well, apart from the details of specific Speech Manager functions...
From Inside Macintosh (3):
"Managing Speech ChannelsTo take advantage of any but the most rudimentary of the Speech Manager's capabilities, you need to use speech channels. However, you cannot create a speech channel simply by declaring a variable of type SpeechChannel. Before your application calls any routine that requires a speech channel as a parameter, you must call the NewSpeechChannel function to allow the Speech Manager to allocate memory associated with the speech channel. Later, you can release the memory occupied by a speech channel by calling the DisposeSpeechChannel function. In general, it is a good idea to create a speech channel just before you need it and then dispose of it as soon as you have finished processing speech through it."
Fair enough, but the question then arises: what occurs if we somehow fail to dispose of a speech channel?
From Inside Macintosh (3):
"The Speech Manager releases any speech channels that have not been explicitly disposed of by an application when the application quits. In general, however, your application should dispose of any speech channels it has created whenever it receives a suspend event. This ensures that other applications can take full advantage of Speech Manager and Sound Manager capabilities."
However, Matthias, in response to a recent post to the MacPerl mailing list, says that the reference to "suspend events" here is "fairly bizarre" in a modern context and is probably not very relevant. "Consider that part of the documentation waived :-)" he says. So, don't worry about having to capture suspend events from MacPerl.
Here's a little subroutine which opens a new speech channel and passes some text to it. While the Speech Manager is busy, it does nothing; the instant the Speech Manager is done, it disposes of the speech channel. We assume "use Mac::Speech" occurs some place earlier in the script:
sub give_speech {
    if (! $speaker) { $speaker = "Fred" }
    $voice = $Voice{$speaker};
    $channel = NewSpeechChannel($voice) or die $^E;
    SpeakText($channel, $speech) or die $^E;
    while (SpeechBusy()) {}
    DisposeSpeechChannel($channel);
}
The first step is to supply a reasonable default voice. So, if we haven't already defined $speaker, we assign the string "Fred" to it. Now there's more to the voice than just its name. Here, we treat the voice records as a hash (%Voice), and the info we need is stored in $Voice{Fred}, or more generically $Voice{$speaker}. We assign that value to $voice. I'm not going to go into any detail at all about the full contents of a voice record; that's what IM is for.
Next, we open a speech channel, passing it the voice record, and assign the result to $channel. It's always a good idea to report $^E (spelled $EXTENDED_OS_ERROR under use English) when a Mac Toolbox call fails, so we do this for both NewSpeechChannel() and SpeakText().
Notice the empty while() loop. We're not doing anything at all while the Speech Manager is busy, but this loop could contain a WaitNextEvent() call (as long as we remember to use Mac::Events). Trapping that call would allow the user to abort the script with a mouse click or key press.
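The empty loop spins the processor for as long as speech is playing. As a rough sketch of one alternative, the loop can pause briefly between checks and give up after a timeout. The code below is plain Perl and does not call the Speech Manager: the anonymous sub is only a stand-in for SpeechBusy(), and wait_while_busy, $interval, and $timeout are names invented for illustration.

```perl
#!perl -w
use strict;

# Poll a "busy" test until it reports idle or a timeout expires.
# $is_busy is a code reference standing in for SpeechBusy();
# wait_while_busy, $interval, and $timeout are hypothetical names.
sub wait_while_busy {
    my ($is_busy, $interval, $timeout) = @_;
    my $waited = 0;
    while ($is_busy->()) {
        return 0 if $waited >= $timeout;          # gave up waiting
        select(undef, undef, undef, $interval);   # pause for a fraction of a second
        $waited += $interval;
    }
    return 1;                                     # went idle in time
}

# Stand-in for SpeechBusy(): reports "busy" three times, then idle.
my $ticks = 3;
my $finished = wait_while_busy(sub { $ticks-- > 0 }, 0.01, 1);
print $finished ? "done\n" : "timed out\n";
```

In a real script the first argument would presumably be sub { SpeechBusy() }, and the body of the loop is where an event check could go.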
Finally, we dispose of the speech channel. Okay, let's see this baby in action, shall we?
#!perl -w
# talking_hello_world.pl
# some voices have trouble pronouncing "circuits"
# "Bruce" seems to work okay. "Fred" doesn't.
use Mac::Speech;
$speaker = "Bruce";
$speech = <<EOS;
I am completely operational and all of my circuits are
functioning perfectly.
EOS
give_speech();
sub give_speech {
    if (! $speaker) { $speaker = "Fred" }
    $voice = $Voice{$speaker};
    $channel = NewSpeechChannel($voice) or die $^E;
    SpeakText($channel, $speech) or die $^E;
    while (SpeechBusy()) {}
    DisposeSpeechChannel($channel);
}
Here's a quick roll-call of the available voices, courtesy of a prolific Mac Speech aficionado, David Seay, and modified to mellifluous effect by this column's regular author:
#!perl
# "Voice Lister" v1.0
# by David Seay g-s@navix.net
# with addition by Vicki Brown, shamelessly stolen from other samples
#
use Mac::Speech;
# Prints alphabetized list of Mac's voices.
$count = CountVoices();
for $i (1..$count) {
    $voice = GetIndVoice($count - $i + 1);
    $desc = ${GetVoiceDescription($voice)};
    $nameLength = ord(substr($desc,16,1));
    $name = substr($desc,17,$nameLength);
    $channel = NewSpeechChannel($voice) or die $^E;
    print "$name\n";
    SpeakText $channel, "$name";
    while (SpeechBusy()) {}
    DisposeSpeechChannel $channel if $channel;
}
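The two substr() calls in the Voice Lister treat the voice description as a raw buffer in which the name is stored as a length byte followed by the characters. The sketch below isolates that step in plain Perl. It assumes only the layout implied by the offsets in the listing (16 bytes of header, then the length byte at offset 16); voice_name is a hypothetical helper, and the buffer is built by hand for illustration rather than obtained from GetVoiceDescription().

```perl
#!perl -w
use strict;

# Extract the voice name from a raw description buffer, assuming the
# layout implied by the Voice Lister: a 16-byte header, one length
# byte, then that many bytes of name. voice_name is a hypothetical helper.
sub voice_name {
    my ($desc) = @_;
    return undef if length($desc) < 17;           # too short to hold a name
    my $name_length = ord(substr($desc, 16, 1));
    return substr($desc, 17, $name_length);
}

# A hand-built buffer: 16 null bytes, a length byte, and a name field
# padded with nulls to 63 bytes.
my $fake = pack("x16 C a63", length("Zarvox"), "Zarvox");
print voice_name($fake), "\n";
```

The length byte matters because the name field is padded; without it the name would come back with trailing null characters.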
Easy as pie, right? Well, in fact, Mac::Speech did manage to rear its head during a recent discussion of calculating pi which occurred on the MacPerl mailing list. As this followed on a rather impressive demo of speech capabilities in MacPerl (by David Seay), Bruce van Allen issued a challenge of "an irrational prize" to the first person to "Make MacPerl speak pi." As you can see, with a basic speaking routine already in hand, all one really has to do is pass it the right string. However, there's a niggling little limitation. Here's the script; the explanation follows (4).
#!perl -w
# Speaking of Pi (speaking_of_pi.pl)
# How to Speak Pi to any precision!
# by Brian McNett <brianmc@telebyte.net>
#
# Thanks to Creede Lambard for supplying Pi to more
# precision than I'm actually using.
#
# "Fred" seems to be the best voice for this.
use Mac::Speech;
$speaker = "Fred";
$speech = <<INTRO;
I can speak Pie up to 512 significant digits.
How many digits do you wish me to speak?
INTRO
give_speech();
$pi_length = MacPerl::Ask('How many digits do you wish me to speak?');
$double_len = (2 * $pi_length);
$pi = <<EOS;
3.1 4 1 5 9 2 6 5 3 5 8 9 7 9 3 2 3 8 4 6 2 6 4 3 3 8 3 2 7 9 5 0 2 8
8 4 1 9 7 1 6 9 3 9 9 3 7 5 1 0 5 8 2 0 9 7 4 9 4 4 5 9 2 3 0 7 8 1 6
4 0 6 2 8 6 2 0 8 9 9 8 6 2 8 0 3 4 8 2 5 3 4 2 1 1 7 0 6 7 9 8 2 1 4
8 0 8 6 5 1 3 2 8 2 3 0 6 6 4 7 0 9 3 8 4 4 6 0 9 5 5 0 5 8 2 2 3 1 7
2 5 3 5 9 4 0 8 1 2 8 4 8 1 1 1 7 4 5 0 2 8 4 1 0 2 7 0 1 9 3 8 5 2 1
1 0 5 5 5 9 6 4 4 6 2 2 9 4 8 9 5 4 9 3 0 3 8 1 9 6 4 4 2 8 8 1 0 9 7
5 6 6 5 9 3 3 4 4 6 1 2 8 4 7 5 6 4 8 2 3 3 7 8 6 7 8 3 1 6 5 2 7 1 2
0 1 9 0 9 1 4 5 6 4 8 5 6 6 9 2 3 4 6 0 3 4 8 6 1 0 4 5 4 3 2 6 6 4 8
2 1 3 3 9 3 6 0 7 2 6 0 2 4 9 1 4 1 2 7 3 7 2 4 5 8 7 0 0 6 6 0 6 3 1
5 5 8 8 1 7 4 8 8 1 5 2 0 9 2 0 9 6 2 8 2 9 2 5 4 0 9 1 7 1 5 3 6 4 3
6 7 8 9 2 5 9 0 3 6 0 0 1 1 3 3 0 5 3 0 5 4 8 8 2 0 4 6 6 5 2 1 3 8 4
1 4 6 9 5 1 9 4 1 5 1 1 6 0 9 4 3 3 0 5 7 2 7 0 3 6 5 7 5 9 5 9 1 9 5
3 0 9 2 1 8 6 1 1 7 3 8 1 9 3 2 6 1 1 7 9 3 1 0 5 1 1 8 5 4 8 0 7 4 4
6 2 3 7 9 9 6 2 7 4 9 5 6 7 3 5 1 8 8 5 7 5 2 7 2 4 8 9 1 2 2 7 9 3 8
1 8 3 0 1 1 9 4 9 1 2 9 8 3 3 6 7 3 3 6 2 4 4 0 6 5 6 6 4 3 0 8 6 0 2
1 3 9 4 9 4 6 3
EOS
if ($double_len <= 1024) {
    $speak_pi = substr( $pi , 0, $double_len);
    $speech = "Pie is equal to approximately: $speak_pi";
    give_speech();
}
else {
    $speech = "I'm sorry, I can't speak Pie to that precision";
    give_speech();
}
sub give_speech {
    if (! $speaker) { $speaker = "Fred" }
    $voice = $Voice{$speaker};
    $channel = NewSpeechChannel($voice) or die $^E;
    SpeakText($channel, $speech) or die $^E;
    while (SpeechBusy()) {}
    DisposeSpeechChannel($channel);
    $speech=$speaker="";
}
What's with all the spaces? Why can't we just write out the digits in a more normal fashion? This is that niggling little detail I mentioned before. The Speech Manager seems able only to speak decimal notation accurately out to about the limit of double-precision floats. Beyond that, it reverts to reading the number as if it were a very large integer. Placing a space between each digit forces the Speech Manager to read off digits individually, which is what we want anyway...
Whereas "Bruce" is a good voice for speaking text, "Fred" seems to handle numbers better, moving briskly along, and only pausing occasionally as if to take a breath. I used the MacPerl package to put up a dialog, basically "Ask()"ing the user for some input. If it weren't just a quick & dirty hack, I'd do something more robust, perhaps using Mac::Dialogs, but that's beyond the scope of this article. Alas, I've yet to lay claim to my irrational prize.
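The pi script stores its digits already spaced out and relies on the fact that each digit plus its separator occupies two characters. As a rough sketch of another way to get the same string, the spacing can be generated from an ordinary number written as text. This is plain Perl with no Speech Manager calls, and spaced_digits is a hypothetical helper name.

```perl
#!perl -w
use strict;

# Return the first $count significant digits of $number (given as a
# string), with the fractional digits separated by spaces so that each
# one is a separate token. spaced_digits is a hypothetical helper.
sub spaced_digits {
    my ($number, $count) = @_;
    my ($whole, $fraction) = split(/\./, $number);
    $fraction = "" unless defined $fraction;
    my $room = $count - length($whole);           # digits left for the fraction
    return $whole if $room < 1 or $fraction eq "";
    my @digits = split(//, substr($fraction, 0, $room));
    return "$whole." . join(" ", @digits);
}

print spaced_digits("3.14159265", 5), "\n";       # prints 3.1 4 1 5
```

The result could then be interpolated into $speech in place of the substr() of the pre-spaced string.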
We've just seen the most basic use of the Mac::Speech module, but the Speech Manager also lets you control the pitch of the words spoken, making the voice higher or lower. This is done with SetSpeechPitch, which takes the speech channel and the pitch value as its arguments.
#!perl
# MacPerl Sings a Scale
# by David Seay <g-s@navix.net>
use Mac::Speech;
@scalePitches = split(",","48,50,52,53,55,57,59,60");
@scaleWords = split(",","dough,ray,me,fa,soul,la,tee,dough");
$voice = $Voice{Cellos};
$channel = NewSpeechChannel($voice) or die $^E;
for $p (0..$#scalePitches) {
    SetSpeechPitch $channel, $scalePitches[$p];
    SpeakText $channel, $scaleWords[$p] or die $^E;
    while (SpeechBusy()) {}
}
DisposeSpeechChannel $channel;
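The pitch values in the scale script (48 through 60) look like MIDI-style note numbers, where each step of 1 is a semitone and 60 is middle C. Assuming that reading is right (IM is the place to confirm it), the following plain Perl sketch shows what frequencies those numbers would correspond to in equal temperament. pitch_to_hertz is a hypothetical helper and is not part of Mac::Speech.

```perl
#!perl -w
use strict;

# Convert a MIDI-style note number to a frequency in hertz, assuming
# equal temperament with note 69 (the A above middle C) tuned to 440 Hz.
# pitch_to_hertz is a hypothetical helper, not a Speech Manager call.
sub pitch_to_hertz {
    my ($pitch) = @_;
    return 440 * 2 ** (($pitch - 69) / 12);
}

# The eight pitches used by the scale script.
for my $pitch (48, 50, 52, 53, 55, 57, 59, 60) {
    printf "%2d  %7.2f Hz\n", $pitch, pitch_to_hertz($pitch);
}
```

Under this assumption the scale runs from the C below middle C up to middle C, which explains the gaps of 2 and 1 between successive values: whole steps and half steps.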
Okay, now for something a bit more challenging. Let's make MacPerl really sing.
#!perl
# Author: David Seay <http://www.mastercall.com/g-s>
# Modified by Charles Albrecht to add accidentals & sing "Daisy Bell"
# Further modified by Jim Miner <jfm@winternet.com>
# to normalize tempo
#
# Still Further Modifications by
# Brian McNett <brianmc@telebyte.net>
# (removed messy string concatenation, added quit-if-runtime)
#
# Daisy Bell (A Bicycle Built for Two), Harry Dacre, 1892
# v 1.2
use Mac::Speech;
@noteNames = split(" ","c d e f g a b C D E F G A B C1");
@scalePitches = split(",","48,50,52,53,55,57,59,60,62,64,65,67,69,71,72");
for $p (0..$#scalePitches) {
    $pitch{$noteNames[$p]} = $scalePitches[$p];
}
# Increase if song is too fast or ends of words are clipped
$tempo_factor = .3;
# NOTE DURATIONS
$dur{e} = .50 * $tempo_factor; # eighth note
$dur{de} = .75 * $tempo_factor; # dotted eighth note
$dur{'q'} = 1.0 * $tempo_factor; # quarter note
$dur{dq} = 1.5 * $tempo_factor; # dotted quarter note
$dur{h} = 2.0 * $tempo_factor; # half note
$dur{he} = 2.5 * $tempo_factor; # half note + an eighth
$dur{dh} = 3.0 * $tempo_factor; # dotted half note
$dur{dhe} = 3.5 * $tempo_factor; # dotted half note + an eighth
$dur{w} = 4.0 * $tempo_factor; # whole note
# ACCIDENTALS
$acc{'s'} = 1; # sharp
$acc{n} = 0; # natural
$acc{f} = -1; # flat
# FORMAT FOR '$song' = syllable pitch accidental duration
$song = <<SONG;
. c n h, daay A s dh, zeee G n dh, daay D s dh, zeee a s dh,
give c n q, me d n q, your d s q, aaan c n h, sir d s q, doooo a s w,
I'm F n dh, halff A s dh, craay G n dh, zeee D s dh,
alll c n q, for d n q, the d s q, love f n h, of g n q, you f n w,
it g n q, won't g s q, be g n q, uh f n q, sty a s h, lish g n q,
mare f n q, ridge d s w,
I f n q, can't g n h, uh d s q, ford c n h,
uh d s q, care c n q, ridge a s w,
but a s q, you'll d s h, look g n q, sweet f n q, . c n h,
upon d s h, the g n q, seat f n q, of g n e, uh g s e,
buyy a s q, sick g n q, ul d s q, built f n h, for a s q, two d s w
SONG
@song = split(",",$song);
$voice = $Voice{Zarvox};
# some voices won't change pitch
# try 'Zarvox' or 'Pipe Organ' or 'Cellos' or 'Bad News'
$channel = NewSpeechChannel($voice) or die $^E;
for $n (0..$#song) {
    ($word,$note,$sharp,$dur) = split(" ",$song[$n]);
    SetSpeechPitch $channel, $pitch{$note} + $acc{$sharp};
    SpeakText $channel, $word or die $^E;
    select(undef, undef, undef, $dur{$dur});
}
DisposeSpeechChannel $channel;
MacPerl::Quit(1);
This is, admittedly, a rather stilted rendition of "Daisy Bell," but it demonstrates the basics of getting MacPerl to sing. There's most certainly more capability buried within the Speech Manager. For one thing, it's possible to open more than one simultaneous speech channel, assign different voices to each, and have them say different things. David Seay (5), to whom I'm deeply in debt for having provided many of the examples above, has an entire page at macperl.com devoted to speaking and singing scripts. No doubt there's plenty more that can be done with MacPerl and the Speech Manager.
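For anyone writing their own songs in the "syllable pitch accidental duration" format used by the Daisy Bell script, a typo in a note name silently produces an undefined pitch. The sketch below, in plain Perl with no Speech Manager calls, parses the format into records and stops on an unknown note. It uses a cut-down version of the script's lookup tables, and parse_song is a hypothetical helper name.

```perl
#!perl -w
use strict;

# Cut-down versions of the lookup tables from the Daisy Bell script.
my %pitch = (c => 48, d => 50, e => 52, f => 53, g => 55, a => 57, b => 59);
my %acc   = ('s' => 1, n => 0, f => -1);
my %beats = (e => 0.5, 'q' => 1, h => 2, dh => 3, w => 4);

# Turn comma-separated "syllable pitch accidental duration" entries
# into a list of [syllable, pitch number, beats] records.
# parse_song is a hypothetical helper, not part of Mac::Speech.
sub parse_song {
    my ($song) = @_;
    my @notes;
    for my $entry (split(/,/, $song)) {
        my ($word, $note, $accidental, $length) = split(" ", $entry);
        next unless defined $length;              # skip blank or short entries
        die "unknown note '$note'\n" unless exists $pitch{$note};
        push @notes, [$word, $pitch{$note} + $acc{$accidental}, $beats{$length}];
    }
    return @notes;
}

my @notes = parse_song("give c n q, me d n q, your d s q,\n");
print join(" ", @$_), "\n" for @notes;
```

Each record could then drive the same SetSpeechPitch, SpeakText, and select() sequence as the loop in the script.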