Delphi is 27, happy birthday and welcome to the new generation

Happy Birthday Delphi

It has become a tradition: it is February 14 again, and that means it is Delphi's birthday. Today we celebrate 27 years! Not even counting its predecessor Turbo Pascal, that is already well over half a career for a whole generation of software developers, a career in which Delphi gives us professional satisfaction as well as bread on the table. But today, I wanted to devote this birthday of Delphi to the next generation: the new generation that is at the start of a software development career, the new generation that discovers Delphi with enthusiasm, and the new generation for which Delphi is still a valuable & productive tool to accomplish real-life software development tasks that have an impact on people's lives. Enough said! Read the article Stephanie Bracke, our intern this year at TMS software, wrote herself, or watch the video she made herself. See or read, and be amazed at what Stephanie accomplished for Delphi apps that can change people's lives, including the lives of people with disabilities.

Bruno

Stephanie.Introduce('Hello world');


When I first started my internship at TMS software, I felt overwhelmed, and that still feels like an understatement. Luckily my mentor Bruno repeatedly told me (and I needed to hear it every time as well) that in order to be able to run, you should first learn how to walk. To get started and prepare myself during the summer holidays, I was given two books, the first by Marco Cantù and another by Holger Flick, to get myself familiar with the Delphi IDE and the Object Pascal language.

My project

For my first real assignment I was given an open-source library called Annyang! The task was to study this library and turn it into an easy-to-use speech-to-text component for TMS WEB Core.

I feel like creating the TSpeechToText component with Annyang! is like entering a rabbit hole: in school we learned the basics, but the deeper you go, the more there is to learn!

In short, I created a web application with TMS WEB Core with only one button that starts Annyang. But of course, if you don't like that button, you can start the engine automatically at application startup or in other ways. You're as free as a bird here!
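For example, starting the engine automatically at application startup could look like this minimal sketch (the form name, the component name SpeechToText and the use of the WebFormCreate handler are my assumptions for illustration):

```delphi
procedure TForm1.WebFormCreate(Sender: TObject);
begin
  // start listening right away, no button needed
  SpeechToText.Start;
end;
```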

Once the button on my webpage is clicked, the device microphone gets activated in your browser (see the little red icon in the caption) and Annyang starts to listen to your commands.

I added a few commands, for example: 

  • Start -> starts the camera
  • Snap -> takes a picture
  • Listen -> Annyang starts listening to sentences and adds them to the memo

Once you activate the "Listen" command, Annyang will still listen for single-word commands and execute those as a priority instead of adding the recognized words to the TWebMemo control. But whenever such a single word is used inside a longer sentence, the entire sentence is written down without executing that command.

There are also commands like zoom, reset, pause and resume but those are for you to find out in the demo!


The more time I spend using Annyang as a component, the more ways I can think of to enhance it. There are just so many possibilities for using this component that I can't wait to see what other users will use it for!

A deeper look into my code

Well, Googling aside, I made a component for TMS software and poured that little rascal (or should I say Pascal?) into a demo. It's important to know that what you see in the demo isn't actually the component itself. The component that does the work is a non-visual component, so you do not see it in the screenshot.

I’ll try to show you a bit of my building brick, how my component was molded into the shape I needed it to be.

For example, this is the method implementation for how Annyang! gets started behind the scenes. This snippet shows that we first decide how the component is going to be used: if we're not using the dictate function, we're using the command function. Then we start Annyang.
procedure TSpeechToText.Start;
begin
  if UseAsDictaphone then
    Dictate
  else
    Command;
  // direct JavaScript interface to the Annyang library
  asm
    if (annyang) {
      annyang.start();
      SpeechKITT.annyang();
      SpeechKITT.setStylesheet('css/skittUi.CSS');
      SpeechKITT.vroom();
    }
  end;
end;

But... what's the difference between the command and the dictate functions?

procedure TSpeechToText.Dictate;

  procedure HandleDictaphone(ADictate: string);
  begin
    if Assigned(OnDictate) then
      OnDictate(Self, ADictate);
  end;

begin
  asm
    if (annyang) {
      // a splat ('*variable') greedily captures the entire spoken sentence
      var commands = {
        '*variable': repeatUser
      };

      function repeatUser(userSentence) {
        HandleDictaphone(userSentence);
      }
      annyang.addCommands(commands);
    }
  end;
end;

Annyang understands commands with named variables (:name), greedy splats (*name, which I used here for the dictate function) and optional words ((word)).
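To illustrate those three pattern types (the command phrases below are made-up examples, not part of the component), adding them directly to Annyang could look like this:

```delphi
asm
  if (annyang) {
    var commands = {
      // named variable: captures exactly one word into "color"
      'paint it :color': function(color) { console.log(color); },
      // splat: greedily captures the rest of the sentence
      'search for *query': function(query) { console.log(query); },
      // optional word: matches both "take picture" and "take a picture"
      'take (a) picture': function() { console.log('snap!'); }
    };
    annyang.addCommands(commands);
  }
end;
```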

When the dictate function gets called, Annyang will not only listen to every word separately, but will also convert the spoken words and sentences to text that can be added to a memo control. This dictate part is also included in the demo:

(screenshot: the dictation demo adding recognized sentences to the memo)
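Hooking the dictated text up to a memo is then just a matter of assigning the OnDictate event. A minimal sketch (the form, memo and handler names are my assumptions; the event signature matches the OnDictate(Self, ADictate) call shown above):

```delphi
procedure TForm1.SpeechToTextDictate(Sender: TObject; ADictate: string);
begin
  // append each recognized sentence to the memo
  WebMemo1.Lines.Add(ADictate);
end;
```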

The alternative way is to let Annyang listen for commands and, in the TSpeechToText component, trigger an event handler when Annyang hears the command word:

procedure TSpeechToText.Command;
var
  i: Integer;
  cmd: string;

  procedure HandleCommand(ACommand: string);
  var
    vc: TVoiceCommand;
  begin
    vc := FVoiceCommands.Find(ACommand);
    if Assigned(vc) and Assigned(OnCommand) then
      OnCommand(Self, vc);

    if Assigned(vc) and Assigned(vc.OnCommand) then
      vc.OnCommand(vc);
  end;

begin
  // register every command from the collection with an empty handler;
  // the real work happens in the resultMatch callback below
  for i := 0 to FVoiceCommands.Count - 1 do
  begin
    cmd := FVoiceCommands[i].Command;
    asm
      if (annyang) {
        var emptyHandler = function() {};
        var commands = {};
        commands[cmd] = emptyHandler;
        annyang.addCommands(commands);
      }
    end;
  end;

  asm
    if (annyang) {
      annyang.addCallback('resultMatch', function(phrase, cmd, alt) {
        HandleCommand(phrase);
      });
      annyang.addCallback('resultNoMatch', function(phrases) {
        HandleNoMatch(phrases);
      });
    }
  end;
end;

First the commands, stored in a collection, get added to the engine. As you can see, when a command gets recognized by Annyang, it triggers the event handler from which whatever action you want can be called.

Which command is linked to what action is entirely up to you, thanks to this easy-to-use collection with per-item event handlers!
Commands can be added in two ways: by using the designer in your Delphi IDE, like this:

(screenshot: adding voice commands in the Delphi IDE designer)

or by writing the code for it manually like this:

var
  vc: TVoiceCommand;
begin
  vc := SpeechToText.VoiceCommands.Add;
  vc.Command := 'Start';
  vc.OnCommand := MyVoiceCommandStartHandler;
end;

Once the command has been added to the VoiceCommands collection as a TCollectionItem, writing the voice command's event handler is just as simple as:

procedure TForm1.MyVoiceCommandStartHandler(Sender: TObject);
begin
  WebCamera1.Start;
end;

On the left is the camera that has not yet been started; on the right, the camera acting upon the magic word "Start":

(screenshot: the camera before and after the "Start" voice command)

My component is coming!

Right now, I'm polishing and refactoring the component a little bit and creating a distribution package. When this is ready, the TSpeechToText component will be added as a free and open-source component to the page of extra components for TMS WEB Core. I'm curious what you envision using TSpeechToText for in your apps, so let me know in the comments!

Stephanie