Automatically Label Objects In An Image With DenseCap API And JavaScript

As businesses and individuals move increasingly online, they accumulate large volumes of data in the form of digital images. If you have found yourself in this position, then you are definitely aware that to efficiently organize, analyze, and edit your images takes plenty of time. It takes so much time, in fact, that often the only good solution is for you to develop your own high-quality software to ease the ever-increasing load.

Thankfully, there are tools available to you that speed things up, and they do it using AI and machine learning. They perceive and recognize objects within your images and automatically label them appropriately. In addition to being very useful for organizing your directories full of digital images, they also help you search for and discover the important information contained within your images.

DeepAI.org has created a number of these machine learning and computer vision technology tools. DenseCap API in particular scans and adds captions to your images by identifying objects they contain. Most importantly, DenseCap API is fast. It takes seconds to scan and caption even your largest images.

If building your own AI and machine learning apps interests you, then read on to find out how you can quickly build a Sencha Ext JS app that automatically labels objects in an image. Let’s get started building an app that looks like this:

What is the DenseCap API by DeepAI?

So what does the DenseCap API do? Simply put, the DenseCap API takes your image as input and returns a JSON object that contains information about each object it detects within the image. For each object it identifies, it returns the following three attributes:

  • Object label
  • Confidence value of the object
  • Coordinates/bounding box of the object within the image.

The great thing about DeepAI is that the API key is provided for free. You can use this key to try out their interface and once you run out of a fixed number of queries, you can get your own key.

To see it in action for yourself, open a command line and run the following:

curl \
  -F 'image=https://images.pexels.com/photos/4198322/pexels-photo-4198322.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260' \
  -H 'api-key:42638949-3cf9-486b-a063-eaf5f4df0634' \
  https://api.deepai.org/api/densecap
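
The same request can also be put together in JavaScript. What follows is a minimal sketch, assuming a runtime that provides the standard fetch and FormData globals (a modern browser, or Node 18 and later). The helper name buildDenseCapRequest is hypothetical and is not part of any DeepAI library. It only builds the request, so you can inspect it before sending anything.

```javascript
// Hypothetical helper: builds the same multipart request as the curl command.
// Nothing is sent until the result is passed to fetch().
function buildDenseCapRequest(imageUrl, apiKey) {
  const body = new FormData();
  body.append('image', imageUrl); // same as curl's -F 'image=...'

  return {
    url: 'https://api.deepai.org/api/densecap',
    options: {
      method: 'POST', // curl switches to POST when -F is used
      headers: { 'api-key': apiKey }, // same as curl's -H 'api-key:...'
      body: body
    }
  };
}

// Usage (performs a real network call, so it is left commented out):
// const request = buildDenseCapRequest('https://example.com/photo.jpg', 'YOUR_API_KEY');
// fetch(request.url, request.options).then(r => r.json()).then(console.log);
```

Keeping the request construction separate from the network call makes it easy to check the header and form field before spending any of your API quota.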

Pasted below is a part of the JSON text that the API returns. The API actually identifies more objects for this image, but we are only showing two of them below to help you gain an understanding of how the API works:

{
  "id": "26f9959b-2001-4f98-a648-99c6f8856190",
  "output": {
    "captions": [
      {
        "caption": "a white plate with a white frosting",
        "bounding_box": [
          420,
          189,
          578,
          589
        ],
        "confidence": 0.9974018335342407
      },
      {
        "caption": "a small orange bowl",
        "bounding_box": [
          497,
          987,
          236,
          351
        ],
        "confidence": 0.9843024015426636
      }
    ]
  }
}
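
To make the shape of this response concrete, here is a small sketch that reads it in plain JavaScript. The sample object is trimmed from the response shown above, and topCaptions is a hypothetical helper written for illustration, not something the API provides.

```javascript
// Sample trimmed from the DenseCap response shown above.
const sampleResponse = {
  id: '26f9959b-2001-4f98-a648-99c6f8856190',
  output: {
    captions: [
      {
        caption: 'a white plate with a white frosting',
        bounding_box: [420, 189, 578, 589],
        confidence: 0.9974018335342407
      },
      {
        caption: 'a small orange bowl',
        bounding_box: [497, 987, 236, 351],
        confidence: 0.9843024015426636
      }
    ]
  }
};

// Hypothetical helper: keep captions at or above a confidence threshold,
// ordered from most to least confident.
function topCaptions(response, minConfidence) {
  const captions = (response && response.output && response.output.captions) || [];
  return captions
    .filter(c => c.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)
    .map(c => c.caption);
}

// Only the plate caption passes a 0.99 threshold.
console.log(topCaptions(sampleResponse, 0.99));
```

A threshold like this is one simple way to hide low-confidence labels when an image produces many captions.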

 

How can I set up the DenseCap Sencha App?

Setting up the app yourself is pretty simple. To begin, you need to generate a minimal desktop application using the Ext JS Modern Toolkit. If you are new to Sencha, you can follow this tutorial here to generate an empty project. For this example, let’s call our app ImgLabels, and place all the project files in a directory called img-labels.

Once you have done that, open the index.html file located in your main directory, and add the source for the DeepAI package. You can add the following line anywhere in the header of the HTML file.

Create the Object Labels Grid

To display your results you need to create your Object Labels Grid. This shows each object detected within the image along with its own information. To build your grid, create a new file called ImgLabelsGrid.js in the app/desktop/src/view folder and add the following code to it. Make sure to replace ImgLabels with the name of your app if you named it differently:

Ext.define('ImgLabels.view.ImgLabelsGrid', {
  extend: 'Ext.grid.Grid',
  xtype: 'imgLabelsGrid',

  columns: [
    {
      type: 'column',
      text: 'Caption',
      dataIndex: 'caption',
      width: 600
    },
    {
      text: 'Confidence',
      dataIndex: 'captionConfidence',
      width: 200
    },
    {
      text: 'Bounding Box',
      dataIndex: 'boundingBox',
      width: 200
    }
  ]
});
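
The Confidence and Bounding Box columns display the raw values from the API. If you would like friendlier text, one option is a pair of small formatting functions. The sketch below uses hypothetical names (formatConfidence, formatBoundingBox) and assumes you attach them through the column renderer config, for example renderer: formatConfidence on the Confidence column.

```javascript
// Hypothetical formatters; either could be assigned to a column's `renderer` config.

// Turns a confidence such as 0.9974018335342407 into "99.7%".
function formatConfidence(value) {
  return (value * 100).toFixed(1) + '%';
}

// Shows the four bounding box numbers as "[a, b, c, d]" without assuming
// which of them are coordinates and which are sizes.
function formatBoundingBox(box) {
  return Array.isArray(box) ? '[' + box.join(', ') + ']' : '';
}

console.log(formatConfidence(0.9974018335342407)); // 99.7%
console.log(formatBoundingBox([420, 189, 578, 589])); // [420, 189, 578, 589]
```

Because these are plain functions, you can try them in any JavaScript console before wiring them into the grid.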

Create the Main View

Now we need to create the Main View. It is quite straightforward and contains the following components:

  • A text field to input the source image URL. For testing purposes, let’s give this field a default value so the app is easy to try out.
  • A button with text ‘Label Image’
  • The displayed source image
  • The grid displaying the objects within the image and their attributes

To populate the grid we need a data store. We’ll call the data store imgLabelsStore and define it in the main view as well.

Open the MainView.js file in the app/desktop/src/view/main folder. Replace its contents with the following code, again making sure to replace ImgLabels with the name of your app:

Ext.define('ImgLabels.view.main.MainView', {
  extend: 'Ext.Container',
  xtype: 'mainview',
  align: 'stretch',
  controller: 'mainviewcontroller',

  // View model type plus the store that feeds the grid
  viewModel: {
    type: 'mainviewmodel',
    stores: {
      imgLabelsStore: {
        type: 'store',
        storeId: 'dStore',
        autoLoad: true,
        // Map keys in the API response to the grid's dataIndex names
        fields: [
          {
            name: 'caption',
            mapping: 'caption'
          },
          {
            name: 'captionConfidence',
            mapping: 'confidence'
          },
          {
            name: 'boundingBox',
            mapping: 'bounding_box'
          }
        ],
        proxy: {
          type: 'memory',
          data: null,
          reader: {
            // The records live under "captions" in the API output
            rootProperty: 'captions'
          }
        }
      }
    }
  },

  items: [
    {
      xtype: 'textfield',
      label: 'URL of Input Image: ',
      reference: 'imgUrl',
      value: 'https://images.pexels.com/photos/4198322/pexels-photo-4198322.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260'
    },
    {
      xtype: 'button',
      text: 'Label Image',
      handler: 'onLabelImage'
    },
    {
      xtype: 'image',
      reference: 'srcImage',
      width: 200,
      height: 200
    },
    {
      xtype: 'imgLabelsGrid',
      title: 'Image Labels',
      bind: { store: '{imgLabelsStore}' },
      height: 500,
      width: 800
    }
  ],

  defaults: {
    flex: 1,
    margin: 10
  }
});
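
The field mappings and the rootProperty setting in the store are what turn the API output into grid rows. As a rough illustration of that idea, and not of how Ext JS implements it internally, the sketch below does the same renaming in plain JavaScript. The function name toGridRows is hypothetical.

```javascript
// Mirrors the store's field definitions: grid field name -> key in the API response.
const fieldMappings = {
  caption: 'caption',
  captionConfidence: 'confidence',
  boundingBox: 'bounding_box'
};

// Hypothetical stand-in for the reader: takes the `output` object, reads the
// array stored under `rootProperty`, and renames keys according to the mappings.
function toGridRows(output, rootProperty, mappings) {
  const items = (output && output[rootProperty]) || [];
  return items.map(item => {
    const row = {};
    for (const [field, sourceKey] of Object.entries(mappings)) {
      row[field] = item[sourceKey];
    }
    return row;
  });
}

const rows = toGridRows(
  {
    captions: [
      {
        caption: 'a small orange bowl',
        bounding_box: [497, 987, 236, 351],
        confidence: 0.9843024015426636
      }
    ]
  },
  'captions',
  fieldMappings
);
console.log(rows[0].captionConfidence); // 0.9843024015426636
```

Each row ends up with exactly the three names used as dataIndex values in the grid columns, which is why the column definitions and the store fields have to agree.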

Add the Main Controller

Next, add the main controller so that clicking the ‘Label Image’ button calls the DenseCap API and populates the imgLabelsGrid with the returned data. The main controller handles the button press and takes the required action. Open the MainViewController.js file in the app/desktop/src/view/main folder and replace its contents with the code below. As before, replace ImgLabels with the name of your app if you named it differently.

Ext.define('ImgLabels.view.main.MainViewController', {
  extend: 'Ext.app.ViewController',
  alias: 'controller.mainviewcontroller',

  // Click handler for the 'Label Image' button
  onLabelImage: function (button) {
    var srcImage = this.lookupReference('srcImage');
    var imgUrl = this.lookupReference('imgUrl').getValue();

    // Clear the previous image, then show the new one
    srcImage.setSrc('');
    srcImage.setSrc(imgUrl);

    this.labelImage(imgUrl);
  },

  labelImage: async function (imgUrl) {
    try {
      // Replace with your API key
      deepai.setApiKey('42638949-3cf9-486b-a063-eaf5f4df0634');

      // Call the DeepAI API
      // ref: https://deepai.org/machine-learning-model/densecap
      var resp = await deepai.callStandardApi('densecap', {
        image: imgUrl
      });
      console.log(`Response:${JSON.stringify(resp)}`);

      // Connect the response to the data store through the proxy's setter,
      // then reload so the grid picks up the new captions
      var store = Ext.data.StoreManager.lookup('dStore');
      store.getProxy().setData(resp.output);
      store.reload();
    } catch (err) {
      alert(err);
    }
  }
});

In the code above, the onLabelImage() function is the button click handler. When the ‘Label Image’ button is clicked, it displays the source image and calls the labelImage() function. The labelImage() function does the main job of calling the DenseCap API and attaching the JSON response to the data store of the imgLabelsGrid.
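
If you want to exercise this flow without spending API queries, one possible approach is to pass the client in as a parameter so that a stub can replace the global deepai object. The sketch below is an illustration under that assumption. The names extractCaptions, fetchCaptions, and stubClient are hypothetical, and only callStandardApi comes from the controller code above.

```javascript
// Pull the captions array out of a DenseCap-style response, or return [].
function extractCaptions(resp) {
  const captions = resp && resp.output && resp.output.captions;
  return Array.isArray(captions) ? captions : [];
}

// Hypothetical variant of labelImage(): the client is passed in instead of
// using the global `deepai` object, so a stub can stand in for it.
async function fetchCaptions(client, imgUrl) {
  const resp = await client.callStandardApi('densecap', { image: imgUrl });
  return extractCaptions(resp);
}

// Stub client that records each call and returns a canned response
// built from the sample output shown earlier in this article.
const stubClient = {
  calls: [],
  callStandardApi: async function (model, params) {
    this.calls.push({ model: model, params: params });
    return {
      output: {
        captions: [
          {
            caption: 'a small orange bowl',
            bounding_box: [497, 987, 236, 351],
            confidence: 0.9843024015426636
          }
        ]
      }
    };
  }
};

fetchCaptions(stubClient, 'https://example.com/photo.jpg')
  .then(captions => console.log(captions.length + ' caption(s) returned'));
```

In the real controller you would pass the deepai object as the client and hand the returned array to the store.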

Where can I get the Ext JS DenseCap project source code?

Wonderful! We just created an app to automatically detect and label objects present in images both quickly and easily. The credit goes to Sencha’s Ext JS framework, which enables us to build awesome AI and machine learning apps for all modern devices.

You can download the full source code and try it out.

Happy coding!