Google AI releases its attention center model, which uses machine learning to determine which parts of an image catch a human’s attention first

Have you ever noticed that when you look at a picture, your eyes land on certain parts first? What are those parts, and do they share features that draw attention? Now imagine a machine that can predict where people will look. Knowing these regions is useful for image compression and decompression, because the regions that matter most to a viewer can be handled first.

To identify the regions that first catch human attention, researchers at Google Research recently open-sourced an attention center model, a machine learning model that predicts which part of an image draws a viewer’s attention first.

The model is provided in TensorFlow Lite format. It takes an RGB image as input and predicts the center of attention, which is shown as a green dot on the output image.
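
To make the green dot concrete, here is a minimal sketch of how a predicted attention center could be marked on an image. It does not run the model: the function name `draw_green_dot`, the nested-list image representation, and the `radius` parameter are all assumptions made for illustration, and the center is assumed to already be available as integer pixel coordinates.

```python
def draw_green_dot(image, center, radius=2):
    """Return a copy of `image` with a filled green disc at `center`.

    image  : list of rows, each row a list of (r, g, b) tuples (hypothetical format)
    center : (x, y) integer pixel coordinates, x = column, y = row
    radius : disc radius in pixels (illustrative default)
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cx, cy = center

    # Copy each row so the caller's image is left unchanged.
    out = [list(row) for row in image]

    # Only visit the bounding box of the disc, clipped to the image.
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                out[y][x] = (0, 255, 0)  # pure green
    return out
```

In practice the center would come from the model’s output for the given image; here it is simply passed in by the caller.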

Source: https://opensource.googleblog.com/2022/12/open-sourcing-attention-center-model.html

The attention center model is a deep neural network that takes an image as input and uses a pre-trained classification network, such as ResNet or MobileNet, as its backbone. The attention center prediction module takes its input from several intermediate layers of the backbone. For example, shallow layers often carry low-level information such as intensity, color, and texture, while deeper layers typically carry higher-level, more semantic information such as shapes and objects.

When the predicted attention center guides progressive image loading, a low-resolution version of the entire image is displayed first. By the time your visual system has decided where to direct your gaze, that part of the image has already started to sharpen. The program then predicts where your eyes will move next within the frame and adds extra detail to those areas. The remaining, blurrier areas are filled in last, after these sharper sections.
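
The loading order described above can be sketched as a simple prioritization: split the image into tiles and refine the tiles closest to the predicted attention center first. This is only an illustration under stated assumptions, not the method used by any particular image codec. The function name `tile_order`, the square tiling, and the `tile_size` parameter are hypothetical.

```python
import math


def tile_order(width, height, tile_size, center):
    """List tile (col, row) indices, nearest to `center` first.

    width, height : image size in pixels
    tile_size     : side length of each square tile (hypothetical parameter)
    center        : (x, y) predicted attention center in pixels
    """
    cx, cy = center
    cols = math.ceil(width / tile_size)
    rows = math.ceil(height / tile_size)

    def distance_sq(tile):
        col, row = tile
        # Midpoint of the tile, with edge tiles clipped to the image bounds.
        x0, y0 = col * tile_size, row * tile_size
        x1, y1 = min(x0 + tile_size, width), min(y0 + tile_size, height)
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return (mx - cx) ** 2 + (my - cy) ** 2

    tiles = [(c, r) for r in range(rows) for c in range(cols)]
    # sorted() is stable, so equidistant tiles keep their row-major order.
    return sorted(tiles, key=distance_sq)


# Example: a 300x200 image in 100-pixel tiles, attention center near the top right.
order = tile_order(300, 200, 100, (260, 30))
```

With this ordering, the tile containing the attention center is refined first and the tile farthest from it is refined last, which matches the behavior the paragraph describes.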

This model is useful because it makes images appear to load faster: the parts that matter most to the viewer are loaded first. It can also help in machine learning and image processing pipelines by pointing to the most salient parts of an image. The model therefore has a broad range of potential applications.


Check out the GitHub repository and the reference article. All credit for this research goes to the researchers on this project.


Rishabh Jain is a consulting intern at MarktechPost. He is currently pursuing a B.Tech in Computer Science at IIIT, Hyderabad. He is a machine learning enthusiast with a keen interest in statistical methods in artificial intelligence and data analysis. His passion is developing better algorithms for AI.