Delivering advanced face recognition capabilities at the edge

By Rick Bye, Senior Product Marketing Manager, NXP

Using face recognition for access control

The COVID-19 pandemic has highlighted concerns over how many different surfaces we touch during a typical day. For example, everyone entering a secured building, such as a workplace or an after-hours hotel entrance, touches access control keypads, which are an ideal surface for spreading germs and viruses unless cleaned after every use. The need to protect building occupants and guests is driving the adoption of touchless access control systems, and face recognition technology is a popular and effective way to achieve this goal. However, adding face recognition to access control systems can be challenging for the manufacturer. Many current access control systems are self-contained, without any internet connection, and are often battery-powered. One of the first design decisions when implementing face recognition is therefore whether to run the machine learning (ML) inference engine locally at the network edge or in the cloud, which requires the addition of internet connectivity.

The cloud has the advantage of offering almost limitless processing power to perform complex tasks such as face recognition. Still, cloud connectivity may result in unacceptable latency, even with a high-speed wireless or wired connection. Cloud-based face recognition also creates privacy concerns for many users who aren’t comfortable with their images being transmitted to the cloud.

To protect user privacy, face recognition can be implemented at the network edge so that inference runs entirely on the device and user images are never sent to the cloud. In some implementations, the only data that passes through the cloud is the face model of a registered user, which is transferred to the devices that the user has opted to use for access control. This small file contains a proprietary encoding of the facial features used by the face recognition engine. It is shared with the cloud only when a user opts in by registering their face with the system, either at a face recognition device or by using a mobile or PC-based application for remote registration.
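As a rough illustration of why only a compact face model needs to be stored, the sketch below matches a freshly captured feature vector against enrolled models entirely on the device. It assumes the face model is a plain list of floating-point features compared by cosine similarity, which is a common generic approach and not the proprietary encoding described above. The function names, the 0.8 threshold, and the three-element vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def identify(probe, enrolled, threshold=0.8):
    """Return the best-matching enrolled name, or None if no score reaches the threshold.

    `enrolled` maps a user name to that user's stored face model.
    The threshold is an illustrative assumption, not a recommended value.
    """
    best_name, best_score = None, threshold
    for name, model in enrolled.items():
        score = cosine_similarity(probe, model)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    # Toy three-element models; real face models are far longer.
    enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
    print(identify([0.9, 0.1, 0.0], enrolled))
```

In a scheme like this, no image ever needs to leave the device: only the short feature vector is stored or transferred.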

Until recently, face recognition at the edge required a multicore applications processor running an advanced operating system such as Linux or Android, a challenging requirement for a power-sensitive, battery-powered device such as a smart lock. Fortunately, face recognition solutions are now available that use ML vision pipelines optimized to run on resource-constrained microcontrollers (MCUs) running simple real-time operating systems such as FreeRTOS. This is of great benefit to battery-powered systems: not only do MCUs typically consume less power than applications processors, but they also take much less time to boot. That makes it practical to put them into a deep sleep mode to minimize power consumption and to wake them only when a passive infra-red (PIR) sensor detects a person in range.
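The benefit of sleeping between PIR-triggered wake-ups can be estimated with a simple time-weighted average. The sketch below is a back-of-the-envelope model only: the function names are hypothetical, and the current, timing, and battery figures in the example are illustrative assumptions, not measurements of any particular MCU.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_current_ma(sleep_ma, active_ma, wakeups_per_day, active_s_per_wakeup):
    """Time-weighted average current for a device that sleeps between wake events."""
    active_s = wakeups_per_day * active_s_per_wakeup
    if active_s > SECONDS_PER_DAY:
        raise ValueError("total active time exceeds one day")
    sleep_s = SECONDS_PER_DAY - active_s
    return (active_ma * active_s + sleep_ma * sleep_s) / SECONDS_PER_DAY


def battery_life_days(capacity_mah, avg_ma):
    """Ideal battery life in days, ignoring self-discharge and regulator losses."""
    return capacity_mah / avg_ma / 24


if __name__ == "__main__":
    # Assumed figures: 0.05 mA asleep, 150 mA while recognizing,
    # 50 PIR wake-ups per day, 5 s awake per wake-up, 3000 mAh battery.
    avg = average_current_ma(0.05, 150.0, 50, 5.0)
    print(f"average current: {avg:.3f} mA")
    print(f"estimated battery life: {battery_life_days(3000, avg):.0f} days")
```

Because the device is asleep for almost the whole day in this model, the average current is dominated by how short each wake-up can be made, which is why a fast boot time matters as much as a low active current.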

Spoofing face recognition systems

Unfortunately, adversaries have discovered that many face recognition systems can be fooled by merely placing a photograph of an authorized individual in front of the camera. This kind of spoofing can be countered with liveness detection, for example, by supplementing a visible light (red-green-blue or RGB) camera with a second camera operating in the infra-red (IR) spectrum. This simple RGB+IR dual-camera liveness detection technique is very effective under normal indoor lighting conditions, but it can struggle in some extreme outdoor conditions, for example, when the subject is backlit by the sun low in the sky.

A much more robust way to implement liveness detection uses an advanced 3D structured light module (SLM) based camera, which can create a depth map of an actual human face in the field of view. Because a photograph has no depth, it fails this check, which enables anti-spoofing access control in even the most challenging lighting conditions. A 3D SLM camera uses a type of semiconductor laser diode, called a vertical-cavity surface-emitting laser (VCSEL), with a diffractive optical element (DOE) to project an array of hundreds or thousands of dots onto the subject in a specific series of patterns. An IR camera detects the deformation of these light spots, which enables the creation of a depth map using triangulation techniques.
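The triangulation step can be illustrated with the standard pinhole relation Z = f × b / d, where f is the camera focal length in pixels, b is the baseline between projector and camera, and d is the observed shift (disparity) of a projected dot in pixels. The sketch below assumes an idealized, rectified projector-camera pair and adds a deliberately naive flatness test. The function names, the 10 mm relief threshold, and the example values are hypothetical, and a production system would need something more robust, for example a test that also rejects a tilted or curved photograph.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of one projected dot, from Z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_relief_m(disparities_px, focal_px, baseline_m):
    """Difference between the farthest and nearest dot depths, in metres."""
    depths = [depth_from_disparity(d, focal_px, baseline_m) for d in disparities_px]
    return max(depths) - min(depths)


def looks_three_dimensional(disparities_px, focal_px, baseline_m, min_relief_m=0.01):
    """Naive check: a flat target held parallel to the camera shows almost no relief."""
    return depth_relief_m(disparities_px, focal_px, baseline_m) >= min_relief_m


if __name__ == "__main__":
    focal_px, baseline_m = 500.0, 0.05  # assumed optics, for illustration only
    flat_photo = [50.0, 50.0, 50.0, 50.0]  # every dot at the same depth
    real_face = [48.0, 50.0, 52.0, 50.0]   # nose closer than cheeks and ears
    print(looks_three_dimensional(flat_photo, focal_px, baseline_m))
    print(looks_three_dimensional(real_face, focal_px, baseline_m))
```

Because depth is inversely proportional to disparity, small pixel shifts translate into measurable depth differences at typical access control distances, which is what lets a depth map separate a real face from a flat print.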

3D face recognition at the edge

As you can imagine, using 3D SLM cameras for face recognition requires significantly more processing power than using simple 2D cameras. Fortunately, there is a new class of embedded processing devices known as “crossover MCUs” that combines the high performance of an applications processor with the low-power characteristics, ease of use, and embedded functionality of an MCU capable of running a simple RTOS. A typical crossover MCU, such as an i.MX RT series device from NXP Semiconductors, contains an Arm Cortex-M7 core running at speeds ranging from 300 MHz to 1 GHz, and some devices integrate a second Cortex-M CPU for additional processing performance. These MCUs have sufficient processing performance to support the ML inferencing engines required by many face recognition systems, along with the low power consumption required for power-constrained edge applications such as 3D cameras. Figure 1 illustrates the functional block diagram of an i.MX RT106F crossover MCU for face recognition.

Figure 1 – Functional block diagram of an Arm Cortex-M i.MX RT106F crossover MCU for face recognition (source: NXP)

Many Arm processor-based MCUs optimized for edge applications include industry-standard Mobile Industry Processor Interface (MIPI) ports for connecting cameras. They also offer sufficient peripheral connectivity to interface to a 3D structured light module. MCU vendors addressing the growing access control market often provide developers with comprehensive software enablement environments and evaluation kits to streamline the development of 2D and 3D face recognition solutions.

With the processing power and RTOS support of crossover MCUs, it will soon be possible to implement 3D face recognition technology and optionally combine it with speech recognition to enable far-field voice control. These integrated face and speech recognition systems will deliver truly hands-free, touchless user interfaces built on feature-rich, low-power microcontrollers that are readily available to embedded developers today.

Rick Bye is a Senior Product Marketing Manager on NXP’s IoT Solutions team. Based in Austin, Rick has product marketing responsibility for NXP’s EdgeReady ML/AI voice control and face recognition solutions.