Text-To-Speech in Silverlight Using WCF
Back in February, I wrote a blog post showing how to access system devices from a Silverlight 4 OOB (out-of-browser) application running with elevated trust, including how to use the SAPI.SpVoice API through the new Silverlight 4 COM Interop feature to implement text to speech. The biggest problem with that approach is that it only works on the Windows platform.
So I started thinking that the real purpose behind Text-To-Speech isn’t just the cool factor; it is accessibility for users with impairments. I have been doing web development for nearly a decade, and I have always been conscious of web users with impairments that may make viewing or navigating my websites more difficult. Text-To-Speech isn’t very helpful if it only works in a fully trusted out-of-browser Silverlight 4 application on a Windows machine. So instead, let’s create a Text-To-Speech solution that will work inside the browser, on any browser, on any machine, by doing the speech synthesis on the server. And heck, I want this to be in Silverlight 3.
Let’s get started by creating a new Silverlight 3 application, and be sure to include a web project as well.
Right-click the web project and add a reference to System.Speech.dll.
Next, right-click the web project, click Properties -> Web, and set the specific port number to your liking. I set mine to 1914. This will come in handy when we create our WCF service.
Now we need to create the WCF service that will take our text and send it back as a WAV stream, so go ahead and right-click the web project and select “Add New Item”. From this list, add a new WCF service and name it SpeechService. Now this is important: when you create your WCF service, an endpoint is created for you in the Web.config file. You need to change the binding of that endpoint to basicHttpBinding. Silverlight does not support the binding the template generates (typically wsHttpBinding), and the ServiceReferences.ClientConfig file will not be created properly if you do not make this change.
Now, let’s add an OperationContract called Speak to your WCF service’s interface. It takes a string parameter and returns a byte[].
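As a rough illustration of that endpoint change, the relevant part of Web.config might end up looking something like the fragment below. The MyApp.Web namespace is only a placeholder I made up for this sketch; use whatever namespace your web project actually has, and leave the rest of the generated configuration alone.

```xml
<system.serviceModel>
  <services>
    <!-- "MyApp.Web" is a hypothetical namespace; substitute your own. -->
    <service name="MyApp.Web.SpeechService">
      <!-- The binding attribute is the part that matters for Silverlight. -->
      <endpoint address=""
                binding="basicHttpBinding"
                contract="MyApp.Web.ISpeechService" />
    </service>
  </services>
</system.serviceModel>
```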
[ServiceContract]
public interface ISpeechService
{
    [OperationContract]
    byte[] Speak(string textToSay);
}
The implementation of this method will look like the following:
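// Requires: using System.IO; using System.Speech.Synthesis;
public byte[] Speak(string textToSay)
{
    // Dispose the synthesizer and the stream once the audio has been copied out.
    using (SpeechSynthesizer ss = new SpeechSynthesizer())
    using (MemoryStream ms = new MemoryStream())
    {
        // Render the speech into memory as WAV data instead of playing it.
        ss.SetOutputToWaveStream(ms);
        ss.Speak(textToSay);
        return ms.ToArray();
    }
}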
The next thing we need to do is create our UI in Silverlight. Here is what mine looks like.
<Grid x:Name="LayoutRoot">
    <StackPanel>
        <TextBox x:Name="_txtTextToSay" />
        <Button Content="Speak To Me" Click="Button_Click" />
        <MediaElement x:Name="_audioPlayer" />
    </StackPanel>
</Grid>
Create an event handler for your button. Next, in the Silverlight project, add a service reference to the SpeechService hosted in your web project.
The next part is somewhat complicated and time consuming. You have to write your own WAV decoding class that takes the byte array that is returned from the service and converts it to a System.Windows.Media.MediaStreamSource. Luckily, I have already done this for you with the help of some resources on MSDN. In the button’s event handler, add this code:
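For reference, adding the service reference generates a ServiceReferences.ClientConfig file in the Silverlight project. Here is a sketch of roughly what to expect, assuming the port 1914 chosen earlier and a service file named SpeechService.svc. The SpeechServiceReference namespace is a placeholder for whatever you typed in the Add Service Reference dialog, and the exact attributes Visual Studio emits may differ from what I show here.

```xml
<configuration>
  <system.serviceModel>
    <bindings>
      <basicHttpBinding>
        <!-- WAV data gets large quickly, so the receive limits matter. -->
        <binding name="BasicHttpBinding_ISpeechService"
                 maxBufferSize="2147483647"
                 maxReceivedMessageSize="2147483647">
          <security mode="None" />
        </binding>
      </basicHttpBinding>
    </bindings>
    <client>
      <!-- "SpeechServiceReference" is a hypothetical namespace. -->
      <endpoint address="http://localhost:1914/SpeechService.svc"
                binding="basicHttpBinding"
                bindingConfiguration="BasicHttpBinding_ISpeechService"
                contract="SpeechServiceReference.ISpeechService"
                name="BasicHttpBinding_ISpeechService" />
    </client>
  </system.serviceModel>
</configuration>
```

The name on the endpoint element is the string you pass to the generated client’s constructor. If longer passages of text fail to come back, the message size limits on the binding are one place worth checking.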
private void Button_Click(object sender, RoutedEventArgs e)
{
    SpeechServiceClient client = new SpeechServiceClient("BasicHttpBinding_ISpeechService");
    // Silverlight service calls are asynchronous, so the result arrives in a callback.
    client.SpeakCompleted += (o, ea) =>
    {
        // Wrap the returned WAV bytes in a stream and hand it to the MediaElement.
        WavMediaStreamSource audioStream = new WavMediaStreamSource(new MemoryStream(ea.Result));
        _audioPlayer.SetSource(audioStream);
    };
    client.SpeakAsync(_txtTextToSay.Text);
}
Basically, this takes the byte[] returned from the SpeechService, converts it back to a stream, and passes it to the WavMediaStreamSource class I created, which inherits from MediaStreamSource. That class hands the stream off to my WAV decoding classes and then serves as the source for the MediaElement responsible for playing the audio.
All that’s left is to build your solution and start making your Silverlight applications more accessible.
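To give a feel for what WAV decoding involves, here is a minimal sketch of the first step: reading the RIFF header to find the audio format and where the samples start. This is not my actual WavMediaStreamSource code. The WavFormatInfo class and its members are names invented for this illustration, and it assumes a plain PCM file with a standard 16-byte fmt chunk.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only: reads the basic header
// fields of a PCM WAV file held in a byte array.
public sealed class WavFormatInfo
{
    public short Channels;
    public int SampleRate;
    public short BitsPerSample;
    public long DataOffset;   // position of the first audio sample
    public int DataLength;    // size of the audio payload in bytes

    public static WavFormatInfo Parse(byte[] wav)
    {
        using (var reader = new BinaryReader(new MemoryStream(wav)))
        {
            if (ReadTag(reader) != "RIFF") throw new FormatException("Not a RIFF file.");
            reader.ReadInt32();                      // overall RIFF size, unused here
            if (ReadTag(reader) != "WAVE") throw new FormatException("Not a WAVE file.");

            var info = new WavFormatInfo();
            bool haveFormat = false;

            // Walk the chunks until the "data" chunk turns up.
            while (reader.BaseStream.Position + 8 <= reader.BaseStream.Length)
            {
                string id = ReadTag(reader);
                int size = reader.ReadInt32();
                // Chunks are padded to an even number of bytes.
                long next = reader.BaseStream.Position + size + (size & 1);

                if (id == "fmt ")
                {
                    reader.ReadInt16();              // format tag (1 = PCM)
                    info.Channels = reader.ReadInt16();
                    info.SampleRate = reader.ReadInt32();
                    reader.ReadInt32();              // average bytes per second
                    reader.ReadInt16();              // block align
                    info.BitsPerSample = reader.ReadInt16();
                    haveFormat = true;
                }
                else if (id == "data")
                {
                    if (!haveFormat) throw new FormatException("data chunk before fmt chunk.");
                    info.DataOffset = reader.BaseStream.Position;
                    info.DataLength = size;
                    return info;
                }
                reader.BaseStream.Position = next;   // skip to the next chunk
            }
            throw new FormatException("No data chunk found.");
        }
    }

    private static string ReadTag(BinaryReader reader)
    {
        // Chunk ids are four ASCII characters; UTF8 decodes them the same way.
        byte[] bytes = reader.ReadBytes(4);
        return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
    }
}
```

A MediaStreamSource implementation would then use values like these to describe the audio stream to the MediaElement and to serve samples starting at DataOffset.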


