Detecting Logos On Images Using Google Cloud Vision API

It’s a cliché that developers like to work long hours fuelled by coffee, so if you are into programming you most probably know the logo in the image below: it belongs to Starbucks, the American multinational and the world’s largest coffeehouse chain!

The power of logos

The point is that there is no name written on the coffee mug, just a logo. Let’s imagine for a moment that I don’t know which brand the logo represents. Wouldn’t it be nice to have a way to find out what the logo stands for and to whom it belongs? Better still, wouldn’t it be great if we could build that ability to recognize logos into our own Windows, desktop and mobile applications? The good news is that’s exactly what we can do with the Google Cloud Vision API.

Google Cloud Vision API Logo Detection

“Logo Detection” is a feature of the Google Cloud Vision API that we can use to detect and extract information about multiple logos in an image. For each logo detected, Google provides a textual description of the entity identified, a confidence score (how certain the machine learning model is that the detection is accurate) and a bounding polygon, so we know where the logo is located within the image.

[Image: starbucks1]

We can easily use machine-learning AI in our Delphi applications

Google Cloud’s Vision API offers powerful pre-trained machine learning models that you can use in your desktop and mobile applications through REST or RPC API calls. Let’s say you want your application to detect objects, locations, activities, animal species or products. Maybe you want to detect not only faces but also their emotions, or you need to read printed or handwritten text. All of this and much more can be done for free (up to the first 1000 units/month per feature) or at affordable prices that scale with your usage, with no upfront commitment.

We can use the REST client library in RAD Studio and Delphi to take advantage of Google Cloud’s Vision API in our desktop and mobile applications. If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Our RAD Studio and Delphi applications can call the API in two ways. To run detection on a local image file, send the contents of the file as a base64-encoded string in the body of the request. Alternatively, point the API at an image file located in Google Cloud Storage or on the web, with no need to send the image contents in the body of the request.
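To make the two options concrete, here is a minimal, language-neutral sketch in Python (not Delphi) of the "image" part of the request body in each case. The helper names, the placeholder bytes and the bucket path are made up for illustration; the field names "content", "source" and "imageUri" are the ones the Vision API request uses.

```python
import base64


def image_from_bytes(data: bytes) -> dict:
    """Inline image: the raw bytes travel inside the request as base64 text."""
    return {"content": base64.b64encode(data).decode("ascii")}


def image_from_uri(uri: str) -> dict:
    """Remote image: only the address is sent, not the image contents."""
    return {"source": {"imageUri": uri}}


# Placeholder bytes standing in for a real image file read from disk.
inline = image_from_bytes(b"\x89PNG fake bytes")

# Hypothetical Cloud Storage location; a public https:// URL is passed the same way.
remote = image_from_uri("gs://my-bucket/mug.jpg")
```

The inline form makes the request larger but works for files that only exist on the device, while the URI form keeps the request tiny when the image is already online.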

How do I set up the Google Cloud Vision Logo Detection API?

Make sure you refer to the Detect Logos section of the Google Cloud Vision API documentation – https://cloud.google.com/vision/docs/detecting-logos – but in general terms this is what you need to do on Google’s side:

  • Visit https://cloud.google.com/vision and log in with your Google account
  • Create or select a Google Cloud Platform (GCP) project
  • Enable the Vision API for that project
  • Enable billing for that project
  • Create an API key credential

How do I call the Google Vision API Logo Detection endpoint?

Now all we need to do is call the API URL with an HTTP POST, passing a JSON request body that sets the feature type to LOGO_DETECTION and the image source to the link of the image we want to analyse. You can do that using the REST client libraries available for several programming languages; a quick start guide is available in Google’s documentation at https://cloud.google.com/vision/docs/quickstart-client-libraries

In fact, at the bottom of the Detect Logos guide in the Google Cloud Vision documentation – https://cloud.google.com/vision/docs/detecting-logos – there is a “Try This API” option that lets you post the JSON request body and inspect the JSON response.
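As a rough illustration of that call, the sketch below builds (but does not send) the POST request using only the Python standard library. It assumes the public images:annotate endpoint with the API key passed as the key query parameter; the function name, the example image link and the YOUR_API_KEY placeholder are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_logo_request(image_uri: str, api_key: str, max_results: int = 5):
    """Prepare a LOGO_DETECTION request for one remote image."""
    body = {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": "LOGO_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }
    url = ENDPOINT + "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical image link and placeholder key, for illustration only.
req = build_logo_request("https://example.com/mug.jpg", "YOUR_API_KEY")

# Sending needs a real key and network access, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```

The same three ingredients (URL with key, JSON content type, JSON body) are what we will later enter in the REST Debugger.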

What does the Google Vision API Logo Detection endpoint return?

After the call, the result is a list of logo annotations. Each one has a description of the logo, a confidence score ranging from 0 (no confidence) to 1 (very high confidence), and a bounding polygon showing where in the image the logo was found. You can use the polygon to draw a rectangle on top of the image and highlight the logos, so the final result looks something like the image below.
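Here is a small sketch, again in Python, of how such a response could be read and how the polygon could be reduced to a rectangle for drawing. The sample JSON uses made-up values, not real API output, and the function names are hypothetical. The code also assumes that a vertex with a missing x or y means 0.

```python
import json

# Illustrative response with invented values; a real response has more fields.
SAMPLE = """
{"responses": [{"logoAnnotations": [{
    "description": "Starbucks",
    "score": 0.92,
    "boundingPoly": {"vertices": [
        {"x": 120, "y": 40}, {"x": 310, "y": 40},
        {"x": 310, "y": 235}, {"x": 120, "y": 235}]}
}]}]}
"""


def bounding_rect(annotation: dict) -> tuple:
    """Return (left, top, right, bottom) enclosing the bounding polygon."""
    vertices = annotation["boundingPoly"]["vertices"]
    xs = [v.get("x", 0) for v in vertices]  # assumption: missing means 0
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def logos(response_text: str, min_score: float = 0.0) -> list:
    """List (description, score, rectangle) for each detected logo."""
    data = json.loads(response_text)
    found = []
    for annotation in data["responses"][0].get("logoAnnotations", []):
        if annotation["score"] >= min_score:
            found.append(
                (annotation["description"], annotation["score"],
                 bounding_rect(annotation))
            )
    return found


result = logos(SAMPLE)
```

Raising min_score is a simple way to ignore low-confidence detections before drawing anything on the image.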

[Image: starbucks2]

How do I connect my applications to the Google Cloud Vision Logo Detection API?

Once you have followed the basic steps to set up the Logo Detection API on Google’s side, go to the Console, open the Credentials menu item, click the Create Credentials button and add an API key. Copy this key as we will need it later.

[Screenshot]

RAD Studio Delphi and C++Builder make it very easy to connect to APIs, as you can use the REST Debugger to automatically create the REST components and paste them into your app.

In Delphi, the API call is made using three components: TRESTClient, TRESTRequest, and TRESTResponse. Once you have connected successfully in the REST Debugger and copied and pasted the components, you will notice that the API URL is set in the BaseURL property of TRESTClient. On the TRESTRequest component you will see that the request method is set to rmPOST, the ContentType is set to ctAPPLICATION_JSON, and that it contains one request body for the POST.

How do I set up the REST connection in my Delphi application?

Start RAD Studio Delphi and, on the main menu, click Tools > REST Debugger. Configure the REST Debugger by setting the content type to application/json and adding the POST URL, the JSON request body and the API key you created. Once you click the “Send Request” button you should see the JSON response.

How do I build a Windows desktop or Android/iOS mobile device application using the Google Cloud Vision API Logo Detection?

Now that you have successfully configured and tested your API calls in the REST Debugger, just click the Copy Components button, go back to Delphi, create a new application project and paste the components onto your main form.

Add some very simple code to a TButton OnClick event to make sure everything is configured correctly and voila! In five minutes we have made our very first call to the Google Vision API and we can receive JSON responses for whatever images we want to run Logo Detection on. Please note that on the TRESTResponse component the RootElement is set to ‘responses[0].logoAnnotations’. This means that the ‘logoAnnotations’ element of the JSON is the part selected to be loaded into the in-memory table, a TFDMemTable.
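To see what a root element path like that selects, the following Python sketch walks the same path through a parsed response. It only imitates the idea and is not how the Delphi REST library is implemented; the select helper and the sample values are invented for illustration.

```python
import json
import re


def select(data, path: str):
    """Walk a path such as 'responses[0].logoAnnotations' through parsed JSON."""
    for part in path.split("."):
        match = re.fullmatch(r"(\w+)((?:\[\d+\])*)", part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        data = data[match.group(1)]  # step into the named object member
        for index in re.findall(r"\[(\d+)\]", match.group(2)):
            data = data[int(index)]  # then into each array index, if any
    return data


# Made-up response trimmed to the fields needed for the demonstration.
doc = json.loads(
    '{"responses": [{"logoAnnotations": '
    '[{"description": "Starbucks", "score": 0.92}]}]}'
)

records = select(doc, "responses[0].logoAnnotations")
```

Everything outside the selected array is ignored, which is why only the logo records end up in the dataset.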

The sample Delphi application calling the Google Cloud Vision API using REST

The sample application features a TEdit where you paste the link to the image you want to analyze, another TEdit for the maxResults parameter, a TMemo to display the JSON result of the REST API call, and a TStringGrid to navigate and display the data in tabular form, demonstrating how easily the JSON response integrates with a TFDMemTable component. When the button is clicked the image is analyzed and the application presents the response JSON both as text and as data in a grid. Now you have everything you need to work with the response data and make your application process the information in the way that best suits your needs!
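The step from JSON records to grid rows can be pictured with the short Python sketch below, which flattens a list of annotations into rows of text and prints them as a fixed-width table. It stands in for the TFDMemTable and TStringGrid pairing only conceptually; the column choice, the function names and the sample records (including the "Acme" entry) are assumptions made for the example.

```python
COLUMNS = ("description", "score")


def to_rows(annotations, columns=COLUMNS):
    """Turn each annotation into a row of strings, one cell per column."""
    return [[str(a.get(c, "")) for c in columns] for a in annotations]


def format_grid(columns, rows) -> str:
    """Render a header line plus rows as a plain fixed-width text table."""
    widths = [
        max([len(c)] + [len(r[i]) for r in rows])
        for i, c in enumerate(columns)
    ]
    lines = ["  ".join(c.ljust(w) for c, w in zip(columns, widths)).rstrip()]
    for row in rows:
        lines.append(
            "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        )
    return "\n".join(lines)


# Invented records standing in for the logoAnnotations array.
annotations = [
    {"description": "Starbucks", "score": 0.92},
    {"description": "Acme", "score": 0.4},
]

rows = to_rows(annotations)
grid = format_grid(COLUMNS, rows)
print(grid)
```

Nested values such as the bounding polygon do not fit a flat grid cell directly, so a real application would either skip them or store them as text.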

[Screenshot]

Summary of using the Logo Detection Google Cloud Vision API in an application

In this blog post we’ve seen how to sign up for the Google Cloud Vision API in order to perform Logo Detection on images. We’ve seen how to use the RAD Studio REST Debugger to connect to the endpoint and copy the resulting components into a real application. And finally we’ve seen how easy and fast it is to use RAD Studio Delphi to create a real Windows (and Linux and macOS and Android and iOS) application which connects to the Google Cloud Vision API, runs Logo Detection image analysis and returns the result as an in-memory dataset ready for you to iterate over!


Head over to https://github.com/checkdigits/google_logo_detection_api_delphi_example for the desktop and mobile Google Cloud Vision API Logo Detection REST demo.