Delphi is 27, happy birthday and welcome to the new generation
Happy Birthday Delphi
It has become a tradition: it is February 14 again, and it is Delphi's birthday. Today we celebrate 27 years! Not even counting its predecessor Turbo Pascal, that is, for a whole generation of software developers, well over half a career in which Delphi has given us professional satisfaction as well as bread on the table. But today I wanted to devote this Delphi birthday to the next generation. The new generation that is at the start of a software development career, the new generation that discovers Delphi and feels the enthusiasm, and the new generation for which Delphi is still a valuable & productive tool to accomplish real-life software development tasks that have an impact on people's lives. Enough said: read the article Stephanie Bracke, our intern this year at TMS software, wrote herself, or watch the video she made herself! See or read & be amazed about what Stephanie accomplished for Delphi apps that can change people's lives, including the lives of people with disabilities.
Bruno
Stephanie.Introduce('Hello world');
When I first started my internship at TMS software, I felt overwhelmed, and that still feels like an understatement. Luckily my mentor Bruno repeatedly told me (and I needed to hear it every time as well) that in order to be able to run, you should first learn how to walk. To get started and prepare myself during the summer holidays, I was given two books, the first by Marco Cantù and another by Holger Flick, to get myself familiar with the Delphi IDE and the Object Pascal language.
My project
For my first real assignment I was given an open source library called Annyang! The task was to study this library and turn it into an easy to use speech-to-text component for TMS WEB Core.
I feel like creating the TSpeechToText component with Annyang! was like entering a rabbit hole: in school we learned the basics, but the deeper you go, the more there is to learn!
In short, I created a web application with TMS WEB Core containing only one button that starts Annyang. But of course, if you don't like that button, you can start the engine automatically at application startup or in other ways. You're as free as a bird here!
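As a minimal sketch (the button, form and component names here are just illustrative; the Start method itself is shown further below), starting the engine from a button click or automatically at startup could look like this:

// start listening when the button is clicked
procedure TForm1.WebButton1Click(Sender: TObject);
begin
  SpeechToText1.Start;
end;

// or, without any button, start the engine when the form is created
procedure TForm1.WebFormCreate(Sender: TObject);
begin
  SpeechToText1.Start;
end;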
Once the button on my webpage is clicked, the device microphone gets activated in your browser (see the little red icon in the tab caption) and Annyang starts to listen to your commands.
I added a few commands, for example:
- Start -> starts the camera
- Snap -> takes a picture
- Listen -> Annyang starts listening to sentences and adds them to the memo
Once you activate the "Listen" command, Annyang will still listen for single-word commands and execute those as a priority instead of adding the recognized words to a TWebMemo control. However, whenever such a single word is used inside a sentence, the entire sentence will be written down without executing that command.
There are also commands like zoom, reset, pause and resume but those are for you to find out in the demo!
A deeper look into my code
Well, Googling aside, I made a component for TMS software and poured that little rascal (or should I say Pascal?) into a demo. It's important to know that what I visualized here in the demo isn't actually the component. The component that does the work is a non-visual component, so you do not see it in the screenshot.
I'll try to show you a bit of my building bricks: how my component was molded into the shape I needed it to be.
procedure TSpeechToText.Start;
begin
  if UseAsDictaphone then
    Dictate()
  else
    Command();

  // direct JavaScript interface to the Annyang library
  asm
    if (annyang) {
      annyang.start();
      SpeechKITT.annyang();
      SpeechKITT.setStylesheet('css/skittUi.CSS');
      SpeechKITT.vroom();
    }
  end;
end;
But... what's the difference between the Command and the Dictate function?
procedure TSpeechToText.Dictate;
  // bridge from JavaScript back to the component: fires the OnDictate event
  procedure HandleDictaphone(ADictate: string);
  begin
    if Assigned(OnDictate) then
      OnDictate(Self, ADictate);
  end;
begin
  asm
    if (annyang) {
      // '*variable' is a splat: it captures everything the user says
      var commands = { '*variable': repeatUser };
      function repeatUser(userSentence) {
        HandleDictaphone(userSentence);
      }
      annyang.addCommands(commands);
    }
  end;
end;
Annyang understands commands with named variables (a single word, written as :name), splats (written as *name, greedily capturing everything that follows, which is what the dictate function uses above) and optional words (written between parentheses).
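As a small illustration of these three pattern types (a sketch in a raw asm block; the phrases and handler bodies are made up for illustration and are not part of the component):

asm
  if (annyang) {
    var commands = {
      'show :page': function(page) { console.log(page); },            // named variable: exactly one word
      'say *sentence': function(sentence) { console.log(sentence); }, // splat: everything spoken after "say"
      'hello (there)': function() { console.log('hi!'); }             // "(there)" is optional
    };
    annyang.addCommands(commands);
  }
end;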
When the dictate function gets called, Annyang will not only be listening to every word separately, but will convert the spoken words and sentences to text that can be added to a memo control. And this dictate part is also included in the demo:
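On the application side, consuming that text could look something like the sketch below (the event signature follows the HandleDictaphone call shown above; the handler name and WebMemo1 are illustrative, not the demo's actual code):

procedure TForm1.SpeechToText1Dictate(Sender: TObject; ADictate: string);
begin
  // append every dictated sentence to the memo
  WebMemo1.Lines.Add(ADictate);
end;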
The alternative way is to let Annyang listen for commands, and in the TSpeechToText component trigger an event handler when Annyang hears the command word:
procedure TSpeechToText.Command;
var
  i: Integer;
  cmd: string;
  // bridge from JavaScript back to the component: looks up the recognized
  // command in the collection and fires the matching event handlers
  procedure HandleCommand(ACommand: string);
  var
    vc: TVoiceCommand;
  begin
    vc := FVoiceCommands.Find(ACommand);
    if Assigned(vc) and Assigned(OnCommand) then
      OnCommand(Self, vc);
    if Assigned(vc) and Assigned(vc.OnCommand) then
      vc.OnCommand(vc);
  end;
begin
  // register every command from the collection with the Annyang engine
  for i := 0 to FVoiceCommands.Count - 1 do
  begin
    cmd := FVoiceCommands[i].Command;
    asm
      if (annyang) {
        var Function = function() {};
        var commands = {};
        commands[cmd] = Function;
        annyang.addCommands(commands);
      }
    end;
  end;
  // hook the Annyang callbacks for recognized and unrecognized phrases
  asm
    if (annyang) {
      annyang.addCallback('resultMatch', function(phrase, cmd, alt) {
        HandleCommand(phrase);
      });
      annyang.addCallback('resultNoMatch', function(phrases) {
        HandleNoMatch(phrases); // sibling handler (not shown here) for phrases Annyang did not recognize
      });
    }
  end;
end;
First, the commands stored in the collection get added to the engine. As you can see, when a command gets recognized by Annyang, it triggers the event handler, from which whatever action you want can be called. Which command is linked to what action is entirely up to you, thanks to this easy-to-use collection with collection item event handlers!
Commands can be added in two ways: by using the designer in your Delphi IDE like this:
or by writing the code for it manually like this:
var
  vc: TVoiceCommand;
begin
  vc := SpeechToText.VoiceCommands.Add;
  vc.Command := 'Start';
  vc.OnCommand := MyVoiceCommandStartHandler;
end;
Once the command has been added to the VoiceCommands collection as a TCollectionItem, handling the voice command event is just as simple as:
procedure TForm1.MyVoiceCommandStartHandler(Sender: TObject);
begin
  WebCamera1.Start;
end;
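The other demo commands can be wired up in exactly the same way; here is a sketch (the handler names are illustrative and their bodies are left as comments, since they depend on the rest of the demo):

var
  vc: TVoiceCommand;
begin
  vc := SpeechToText.VoiceCommands.Add;
  vc.Command := 'Snap';
  vc.OnCommand := MyVoiceCommandSnapHandler;   // e.g. take a picture with the camera

  vc := SpeechToText.VoiceCommands.Add;
  vc.Command := 'Listen';
  vc.OnCommand := MyVoiceCommandListenHandler; // e.g. switch to dictation mode
end;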
On the left is the camera that has not yet been started, and on the right is the camera acting upon the magic word "Start".
My component is coming!
Right now, I'm polishing and refactoring the component a little bit and creating a distribution package. When this is ready, the TSpeechToText component will be added as a free and open-source component to the page of extra components for TMS WEB Core. I'm curious what you envision using TSpeechToText for in your apps, so let me know in the comments.
Stephanie