Detect Real-World Places Of Interest With A Super-Easy API

How cool would it be for your Windows or mobile app to let travelers take pictures at interesting places and instantly retrieve useful information about the location they are visiting and the landmarks they can see? In this article we will see how to do just that using RAD Studio and Delphi with only a few lines of code!

Google API – harnessing super computing power the Delphi way

Google may be a great search engine, but behind all of the search hits and map directions there is a massive stack of super-computing power driving it all. A search for a location in Google Maps often shows not only a detailed map and a street-level view of the area but also additional information about the locale, such as points of interest and local landmarks. Google allows us to tap into that intelligence and apply it not just to maps but to objects and places in images we have taken or have in our data. This is done via the Google Cloud Vision API.

Google’s Cloud Vision API – using machine learning to ‘understand’ our images

Google Cloud’s Vision API offers powerful pre-trained machine learning models that you can easily use in your desktop and mobile applications through REST or RPC API calls. Let’s say you want your application to detect objects, locations, activities, animal species or products. Maybe you want to detect not only faces but also their emotions, or you need to read printed or handwritten text. All this and much more can be done for free (up to the first 1000 units/month per feature) or at affordable, usage-based prices with no upfront commitments.

Detecting landmarks

“Detect Landmarks” is the part of the Vision API that we can use to detect and extract information about entities in an image in order to identify a landmark and retrieve its GPS coordinates.

We can use RAD Studio and Delphi to easily set up the REST client library and take advantage of Google Cloud’s Vision API in our desktop and mobile applications. If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

Our RAD Studio and Delphi applications can call the API in two ways. To run the detection on a local image file, we send the contents of the file as a base64-encoded string in the body of the request. Alternatively, we can point to an image file located in Google Cloud Storage or on the Web, without sending the contents of the image file in the body of the request.
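To make the two options concrete, here is a minimal sketch of the “image” part of the request in each case. It is written in Python only to keep it short and runnable; a Delphi app would send the same JSON through its REST components. The helper names, the stand-in bytes and the bucket path are assumptions made for illustration, while the “content” and “source.imageUri” field names follow Google’s Vision API request format.

```python
import base64
import json


def image_from_bytes(data: bytes) -> dict:
    """Inline image: the raw bytes travel base64-encoded in the 'content' field."""
    return {"content": base64.b64encode(data).decode("ascii")}


def image_from_uri(uri: str) -> dict:
    """Remote image: only its gs:// or http(s):// location is sent."""
    return {"source": {"imageUri": uri}}


# Stand-in bytes; a real app would read these from an image file on disk.
fake_bytes = b"\xff\xd8\xff\xe0 not a real JPEG"
inline_image = image_from_bytes(fake_bytes)

# Hypothetical Cloud Storage location, used only for illustration.
remote_image = image_from_uri("gs://example-bucket/landmark.jpg")

print(json.dumps(inline_image))
print(json.dumps(remote_image))
```

The inline form makes the request larger but works for pictures that only exist on the device; the URI form keeps the request tiny when the image is already hosted somewhere.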

How do I set up the Google Cloud Vision Landmark Detection API?

Make sure you refer to the “Detect Landmarks” section of the Google Cloud Vision API documentation (https://cloud.google.com/vision/docs/detecting-landmarks). In general, this is what you need to do on Google’s side:

  • Visit https://cloud.google.com/vision and log in with your Google (Gmail) account
  • Create or select a Google Cloud Platform (GCP) project
  • Enable the Vision API for that project
  • Enable billing for that project
  • Create an API key credential

AI vision, in your apps!

Let’s say a good friend of yours is travelling and sends you a picture they just took. With just a few lines of code we can develop an app that receives this picture, calls Google’s Vision API and, within seconds, receives back the name and GPS coordinates of the landmark. So now you know that this landmark is called “Elevador Lacerda” and your friend is visiting Salvador, Bahia, Brazil.

With one more click our app could use the landmark name to look it up on a site like Wikipedia and retrieve more information. Within minutes we have a very cool traveler’s app that allows users to take pictures of any landmark and retrieve useful information about the places they are visiting.

Google's Cloud Vision API Elevador Lacerda image

How do I use the Google Vision API Landmark Detection API in my Delphi app?

We need to call the API URL via an HTTP POST, passing a JSON request body that sets the feature type to LANDMARK_DETECTION and sets the image source to the link to the image we want to analyze.
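As a rough sketch of what that request body looks like, the snippet below assembles it and prints the JSON. Python is used here only for brevity; the Delphi REST components send the same text. The function name and the example image URL are assumptions for illustration, and maxResults is the optional limit on how many landmarks come back.

```python
import json


def build_landmark_request(image_uri: str, max_results: int = 5) -> str:
    """Return the JSON body for a single-image landmark detection request."""
    body = {
        "requests": [
            {
                # Where the image lives; nothing is uploaded in this variant.
                "image": {"source": {"imageUri": image_uri}},
                # Which analysis to run and how many results to return at most.
                "features": [
                    {"type": "LANDMARK_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }
    return json.dumps(body, indent=2)


# Placeholder URL; replace it with the link to the picture to analyze.
request_json = build_landmark_request("https://example.com/photo.jpg", max_results=3)
print(request_json)
```

The printed text is what gets pasted into a request body field, whether that is a browser-based API tester or the REST Debugger.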

We can do that using the REST Client libraries available for several programming languages. A quick-start guide is available on Google’s documentation here: https://cloud.google.com/vision/docs/quickstart-client-libraries.

Trying out the Google Cloud Vision API

At the bottom of the Google Cloud Vision documentation guide (https://cloud.google.com/vision/docs/detecting-landmarks) there is a “Try This API” option that allows you to post a JSON request body and see the JSON response.

What does the Google Vision API Landmark Detection API return?

The result of the call is a list of landmark annotations, each with a “Landmark” description and a confidence score (which ranges from 0, no confidence, to 1, very high confidence). There are also GPS coordinates for the landmark location and a bounding polygon showing where in the image the landmark was found.
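The sketch below shows one way to read those fields out of a response. It is Python for brevity, and the sample JSON is hand-written for illustration: the score, coordinates and polygon values are made up and are not real API output. Only the field names (landmarkAnnotations, description, score, locations, latLng, boundingPoly) follow the Vision API response format.

```python
import json

# Hand-written sample; the numbers are illustrative, not real API output.
SAMPLE_RESPONSE = """
{
  "responses": [
    {
      "landmarkAnnotations": [
        {
          "description": "Elevador Lacerda",
          "score": 0.87,
          "boundingPoly": {"vertices": [{"x": 120, "y": 40}, {"x": 480, "y": 40},
                                        {"x": 480, "y": 610}, {"x": 120, "y": 610}]},
          "locations": [{"latLng": {"latitude": -12.97, "longitude": -38.51}}]
        }
      ]
    }
  ]
}
"""


def summarize_landmarks(response_text: str) -> list:
    """Return (description, score, latitude, longitude) for each landmark found."""
    data = json.loads(response_text)
    summaries = []
    # The key is absent when nothing was detected, hence the empty default.
    for annotation in data["responses"][0].get("landmarkAnnotations", []):
        lat_lng = annotation["locations"][0]["latLng"]
        summaries.append(
            (annotation["description"], annotation["score"],
             lat_lng["latitude"], lat_lng["longitude"])
        )
    return summaries


landmarks = summarize_landmarks(SAMPLE_RESPONSE)
for name, score, lat, lng in landmarks:
    print(f"{name}: score={score:.2f}, lat={lat}, lng={lng}")
```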

You can use the GPS coordinates to show the landmark on a map, and you can use the polygon information to draw a rectangle on top of the image and highlight the landmark.
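Turning the polygon into something drawable is a small calculation. Here is a hedged sketch in Python, kept language-neutral so it ports easily to Delphi; the function name and sample vertices are assumptions for illustration. It treats a missing x or y as 0, a defensive choice because JSON responses of this kind can leave out zero-valued coordinates.

```python
def bounding_rectangle(vertices: list) -> tuple:
    """Return (left, top, width, height) enclosing all polygon vertices."""
    # A coordinate equal to zero may be missing from the JSON, so default to 0.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


# Made-up vertices in image pixel coordinates.
sample_vertices = [
    {"x": 120, "y": 40},
    {"x": 480, "y": 40},
    {"x": 480, "y": 610},
    {"x": 120, "y": 610},
]
rect = bounding_rectangle(sample_vertices)
print(rect)
```

The four numbers map directly onto a rectangle-drawing call on the canvas that shows the picture.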

The number of landmarks returned is limited to the maximum set in the maxResults parameter of the JSON request. Go ahead and try it out!

How do I connect my applications to Google Cloud Vision Landmark Detection API?

Once you have followed the basic steps to set up the Landmark Detection API on Google’s side, go to the Google Cloud Platform Console. In the Credentials menu item, click on the “Create Credentials” button and add an API key. Copy this key as we will need it later.

Google's Cloud Vision API - adding the credentials in the Google Cloud Platform console

Connecting RAD Studio to the Google Cloud Vision API

RAD Studio Delphi and C++Builder make it very easy to connect to APIs as you can use the REST Debugger to automatically create the required REST components and paste them into your app.

In Delphi all the work of making the API call is done by three components: TRESTClient, TRESTRequest and TRESTResponse. Once you have connected successfully in the REST Debugger and copied and pasted the components, you will notice that the API URL is set as the BaseURL of TRESTClient. On the TRESTRequest component you will see that the request method is set to rmPOST, the ContentType is set to ctAPPLICATION_JSON, and that it contains one request body for the POST.

Run RAD Studio Delphi and, on the main menu, click Tools > REST Debugger. Configure the REST Debugger by setting the content type to application/json and adding the POST URL, the JSON request body and the API key you created earlier. Once you click the “Send Request” button you should see the JSON response, just like the one from the “Try This API” page.
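For reference, the same three ingredients (POST URL, content type and API key) can be sketched outside the REST Debugger. The Python snippet below only prepares the request and does not send it, so it runs offline. The endpoint is the v1 images:annotate URL from Google’s documentation as we understand it, so confirm it against the current docs; the function name, the YOUR_API_KEY placeholder and the example image URL are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

# v1 annotate endpoint; verify against Google's current documentation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def prepare_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request, with the key as a query parameter."""
    url = VISION_ENDPOINT + "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = {
    "requests": [
        {
            "image": {"source": {"imageUri": "https://example.com/photo.jpg"}},
            "features": [{"type": "LANDMARK_DETECTION", "maxResults": 3}],
        }
    ]
}

# "YOUR_API_KEY" is a placeholder for the key created in the console.
req = prepare_request("YOUR_API_KEY", body)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; left out so the sketch runs offline.
```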

How do I build a Windows desktop or Android/iOS mobile device application using the Google Cloud Vision API Landmark Detection?

Now that you have successfully configured and tested your API calls in the REST Debugger, just click the “Copy Components” button, go back to Delphi, create a new application project and paste the components onto your main form. That’s the power of RAD Studio and Delphi: it could not be easier!

We add some very simple code to a TButton OnClick event to make sure everything is configured correctly and voilà! In five minutes we have made our very first call to the Google Vision API and we can receive a JSON response for whatever images we want to run Landmark Detection on. Please note that on the TRESTResponse component the RootElement is set to ‘responses[0].landmarkAnnotations’. This means that the ‘landmarkAnnotations’ element in the JSON is specifically selected to be pulled into the in-memory table (TFDMemTable).
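To illustrate what selecting a root element means, here is a small Python approximation that walks a path such as ‘responses[0].landmarkAnnotations’ through parsed JSON and returns the list that would become the table rows. This is a sketch of the idea only, not how the Delphi component is implemented; the select_root function and the sample data are assumptions for illustration.

```python
import json
import re


def select_root(data, path: str):
    """Follow a dotted path with optional [index] parts, e.g. 'responses[0].items'."""
    for part in path.split("."):
        match = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        data = data[match.group(1)]            # step into the named element
        if match.group(2) is not None:
            data = data[int(match.group(2))]   # then into the array index, if any
    return data


# Hand-written sample with made-up values.
response = json.loads(
    '{"responses": [{"landmarkAnnotations": '
    '[{"description": "Elevador Lacerda", "score": 0.87}]}]}'
)

rows = select_root(response, "responses[0].landmarkAnnotations")
for row in rows:
    print(row["description"], row["score"])
```

Each dictionary in rows corresponds to one record in the in-memory table, with the JSON keys acting as field names.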

The sample application features a TEdit where you paste the link to the image you want to analyze, another TEdit for the maxResults parameter, a TMemo to display the JSON results of the REST API call, and a TStringGrid to navigate and display the data in tabular form, demonstrating how easily the JSON response integrates with a TFDMemTable component. When the button is clicked the image is analyzed and the application presents the response JSON both as text and as data in a grid. Now you have everything you need to work with the response data and make your application process the information in the way that best suits your needs!

Google's Cloud Vision API - REST results are returned

So now we have a working example of using Google’s Cloud Vision API in our Windows and mobile applications

In this blog post we’ve seen how to sign up for the Google Cloud Vision API in order to perform Landmark Detection on images. We’ve seen how to use the RAD Studio REST Debugger to connect to the endpoint and copy the resulting components into a real application. And finally we’ve seen how easy and fast it is to use RAD Studio Delphi to create a real Windows (or Linux, macOS, Android and iOS) application which connects to the Google Cloud Vision API. Our app runs Landmark Detection image analysis and returns the result as a memory dataset ready for us to use in any way we choose!

Head over and download the full source code for the desktop and mobile Google Cloud Vision API Landmark Detect REST demo here: https://github.com/checkdigits/google_landmark_api_delphi_example


Are you going to make your Windows and mobile apps recognize landmarks and places of interest?