If you have the chance to work with machine learning tools that offer advanced image recognition, you’d be wise not to pass it up, even if you’re new to the technology. Several high-profile tech companies have released image recognition tools for developers, so there is no need to build a neural network from scratch.
Here’s an overview of three mature image recognition and detection tools to help you choose the one that best meets your development needs.
Google Cloud Vision
With Google’s visual recognition API, you can easily add advanced computer vision functionality to your application:
- Face, landmark and logo detection recognizes multiple faces and related attributes such as emotions or headwear (note that facial recognition is not supported), natural and man-made structures, and product logos within one picture. You can analyze an image stored in Google Cloud Storage or on the web.
- Optical character recognition (OCR) spots and extracts text from files in various formats, from PDF and TIFF to PNG and GIF. The tool also automatically identifies a vast array of languages and can detect handwriting.
- Label detection and content moderation let you assign an image to categories and spot explicit material, such as adult or violent content, within it.
- Object localization and image attribute detection identify the location and type of each object in an image and detect general attributes such as dominant colors or suggested crop vertices.
After you enable the Cloud Vision API for your project, you can call it from a variety of programming languages via client libraries. Google also offers AutoML Vision, which lets you train high-quality custom machine learning models without prior machine learning experience.
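As a rough illustration of how the features listed above map onto a single request, here is a minimal Python sketch that builds the JSON body for the Cloud Vision REST method `images:annotate`. The helper name `build_annotate_request`, the short feature keys, and the bucket path are assumptions made for this example. Sending the request also requires credentials, which are left out here.

```python
import json

# Feature type names accepted by the Cloud Vision `images:annotate` method.
# The short keys on the left are made up for this sketch.
FEATURES = {
    "faces": "FACE_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
    "logos": "LOGO_DETECTION",
    "text": "TEXT_DETECTION",
    "labels": "LABEL_DETECTION",
    "moderation": "SAFE_SEARCH_DETECTION",
    "objects": "OBJECT_LOCALIZATION",
    "colors": "IMAGE_PROPERTIES",
    "crop_hints": "CROP_HINTS",
}


def build_annotate_request(image_uri, wanted, max_results=10):
    """Build a request body asking for several features on one image."""
    features = [
        {"type": FEATURES[name], "maxResults": max_results} for name in wanted
    ]
    return {
        "requests": [
            {
                # The URI may point to Google Cloud Storage (gs://) or the web.
                "image": {"source": {"imageUri": image_uri}},
                "features": features,
            }
        ]
    }


# Hypothetical bucket and file name, for illustration only.
body = build_annotate_request(
    "gs://example-bucket/storefront.jpg", ["labels", "logos", "moderation"]
)
print(json.dumps(body, indent=2))
```

Requesting several features in one call keeps the image upload or fetch to a single round trip, although each feature is typically billed separately.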
Clarifai
Clarifai’s API is another image recognition tool that doesn’t require any machine learning knowledge prior to implementation. It can recognize images and also perform thorough video analysis.
You can start making image or video predictions with the Clarifai API once you specify a model parameter. For example, if you pass in the “color” model, the system returns predictions about the dominant colors in an image. You can either use Clarifai’s pre-built models or train your own.
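The sketch below shows one way the "specify a model, then predict" step might look as a plain HTTP request. The base URL, the `/models/{model_id}/outputs` path, the `Key` authorization scheme and the body layout reflect Clarifai's v2 REST interface as best understood here and should be checked against the current documentation. The model ID `color`, the image URL and the API key are placeholders.

```python
import json

# Assumed base URL of Clarifai's v2 REST API; verify before use.
API_BASE = "https://api.clarifai.com/v2"


def build_predict_request(model_id, image_url, api_key):
    """Return (url, headers, body) for a single-image prediction call."""
    url = f"{API_BASE}/models/{model_id}/outputs"
    headers = {
        "Authorization": f"Key {api_key}",
        "Content-Type": "application/json",
    }
    # One input carrying one image referenced by URL.
    body = {"inputs": [{"data": {"image": {"url": image_url}}}]}
    return url, headers, body


# Placeholder values, for illustration only.
url, headers, payload = build_predict_request(
    "color", "https://example.com/photo.jpg", "YOUR_API_KEY"
)
print(url)
print(json.dumps(payload))
```

Swapping the model ID is the only change needed to ask a different question about the same image, which is what makes the model parameter the key input.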
Clarifai video analysis processes one video frame per second and provides a list of predicted concepts for every second of video. As with images, you need to specify the model parameter to begin, and you must split a video into smaller parts if it exceeds the maximum size limits.
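To make the one-frame-per-second sampling and the splitting step concrete, here is a small sketch. It assumes, purely for illustration, that a per-request duration cap stands in for the size limit. The 60-second cap and both function names are hypothetical, and the real limits should come from Clarifai's documentation.

```python
import math


def plan_video_chunks(duration_s, max_chunk_s):
    """Split a video of duration_s seconds into consecutive (start, end)
    spans, each no longer than max_chunk_s seconds."""
    if duration_s <= 0 or max_chunk_s <= 0:
        raise ValueError("durations must be positive")
    chunks = []
    start = 0
    while start < duration_s:
        end = min(start + max_chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks


def sampled_frames(duration_s, frames_per_second=1):
    """Number of frames analyzed when sampling at the given rate."""
    return math.ceil(duration_s * frames_per_second)


# A 150-second clip with a hypothetical 60-second cap per request.
print(plan_video_chunks(150, 60))
print(sampled_frames(150))
```

Because the service returns one set of concepts per second, the number of sampled frames is also the number of prediction lists to expect for a clip.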
Clarifai also offers additional tools for further experimentation and analysis. Explorer is a web application where you can introduce additional inputs, preview your applications and also create and train new models with your own images and concepts. The Model Evaluation tool can provide relevant performance metrics on custom-built models.
Amazon Rekognition
Amazon Rekognition is another image recognition tool to consider. Rekognition provides functionality similar to its counterparts, and adds facial comparison and celebrity recognition across a variety of pre-built categories, such as entertainment, business, sports and politics.
Rekognition Image can measure the likelihood that the same face appears in multiple pictures, and can verify a user against a reference photo in near real time.
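A minimal sketch of the verification step follows. It calls `compare_faces`, the Rekognition operation exposed by the boto3 client, but the wrapper `verify_against_reference`, the 90 percent threshold and the `FakeRekognition` stand-in are assumptions added so the example runs without AWS credentials.

```python
def verify_against_reference(client, reference_bytes, candidate_bytes,
                             threshold=90.0):
    """Return (is_match, best_similarity) for a candidate photo.

    `client` is expected to behave like boto3.client("rekognition").
    """
    response = client.compare_faces(
        SourceImage={"Bytes": reference_bytes},
        TargetImage={"Bytes": candidate_bytes},
        SimilarityThreshold=threshold,
    )
    matches = response.get("FaceMatches", [])
    # Similarity is reported as a percentage; 0.0 means no match was returned.
    best = max((m["Similarity"] for m in matches), default=0.0)
    return best >= threshold, best


class FakeRekognition:
    """Stand-in client that returns a canned response for local testing."""

    def compare_faces(self, SourceImage, TargetImage, SimilarityThreshold):
        return {"FaceMatches": [{"Similarity": 97.5}], "UnmatchedFaces": []}


is_match, score = verify_against_reference(FakeRekognition(), b"ref", b"new")
print(is_match, score)
```

In real use the fake would be replaced with `boto3.client("rekognition")` and the byte strings with the contents of actual image files. The right threshold depends on how costly a false match is for the application.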
Apart from image recognition, Amazon also offers near real-time analysis of streaming video. Amazon Rekognition Video automatically extracts rich metadata from the video and outputs it to a Kinesis data stream, which lets you detect objects and faces, create a searchable video library and carry out content moderation.
Which tool should you choose?
Each tool provides its own set of features that can potentially meet your image recognition demands. Here is a chart that compares Cloud Vision, Clarifai and Rekognition on several important parameters.
|Feature|Google Cloud Vision|Clarifai|Amazon Rekognition|
|---|---|---|---|
|Object and label detection|✔|✔|✔|
|Explicit content identification|✔|✔|✔|
|Video analysis and scene recognition|x|✔|✔|
|OS|Linux, macOS, Windows, iOS, Android|Linux, macOS, Windows, iOS, Android, IoT|Linux, macOS, Windows, iOS, Android, IoT|
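If you want to shortlist tools programmatically, the feature rows of the comparison above can be encoded as a small lookup. The values below are copied from the chart. The dictionary keys and the helper `tools_supporting` are names invented for this sketch.

```python
# Feature support as stated in the comparison chart.
TOOLS = {
    "Google Cloud Vision": {
        "object_and_label_detection": True,
        "explicit_content_identification": True,
        "video_analysis": False,
    },
    "Clarifai": {
        "object_and_label_detection": True,
        "explicit_content_identification": True,
        "video_analysis": True,
    },
    "Amazon Rekognition": {
        "object_and_label_detection": True,
        "explicit_content_identification": True,
        "video_analysis": True,
    },
}


def tools_supporting(required):
    """Return the tools that support every feature named in `required`."""
    return [
        name
        for name, features in TOOLS.items()
        if all(features.get(feature, False) for feature in required)
    ]


print(tools_supporting(["video_analysis", "explicit_content_identification"]))
```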
The image recognition space is crowded with tools that can potentially enhance your product. Weigh all of your options and compare their features before you make a decision. If none of these tools fits, consider alternatives such as Watson Visual Recognition from IBM or Ditto Labs.
Yana Yelina is a Technology Writer at Oxagile, a custom software development company with a focus on building machine learning solutions. Her articles have been featured on KDNuggets, ITProPortal, Business2Community, to name a few. Yana is passionate about the untapped potential of technology and explores the perks it can bring businesses of every stripe. You can reach Yana at firstname.lastname@example.org or connect via LinkedIn or Twitter.