What is the Coherent Hub Interface?

The Coherent Hub Interface (CHI) is used in system-on-chip (SoC) designs to track which processor has the most recent copy of a data block, preventing other processors from using old data. CHI is used in a wide range of applications requiring high-performance cache coherence, such as mobile devices, networking equipment, automotive systems, and data centers.

CHI is a specification within the ARM Advanced Microcontroller Bus Architecture (AMBA) family. It defines how fully coherent processors and dynamic memory controllers interact so that cache coherence can be managed efficiently across multiple processing cores within an SoC.

In this context, a coherent processor is a processor that operates within an SoC where all processors maintain a consistent (coherent) view of shared memory. If one processor modifies data, all other processors accessing that data will see the updated version, preventing inconsistencies due to outdated cached copies.

What’s the problem?

SoC solutions are usually optimized for maximum performance, and each processor has its own dedicated L1 and L2 cache memory for the fastest access. That can result in memory coherence problems.
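The stale-data problem can be illustrated with a minimal sketch. The `PrivateCache` class below is a hypothetical toy model, not part of any AMBA specification: it assumes two cores with private write-back caches and no coherence protocol at all.

```python
class PrivateCache:
    """Toy per-core cache with no coherence protocol (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory   # shared backing store: address -> value
        self.lines = {}        # this core's locally cached copies

    def read(self, addr):
        # On a miss, fetch from shared memory; on a hit, trust the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Write-back behavior: only the local copy is updated.
        self.lines[addr] = value


memory = {0x40: 1}
core0 = PrivateCache(memory)
core1 = PrivateCache(memory)

core1.read(0x40)          # core1 caches the value 1
core0.write(0x40, 2)      # core0 updates only its own cache
stale = core1.read(0x40)  # core1 still reads its old copy
```

Here `stale` holds the old value 1 even though core0 has written 2. Nothing tells core1 that its copy is out of date, which is the gap a coherence protocol has to close.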

CHI was added in AMBA 5 to overcome the AXI coherency extensions (ACE) limitations. ACE was designed for smaller coherent clusters and doesn’t scale well to complex SoCs with many processors. In addition, ACE was replaced by ACE5-Lite, which is designed to work in coordination with CHI. ACE5-Lite is used by I/O coherent managers that need to communicate with other fully coherent managers with caches in the system (Figure 1).

Figure 1. CHI was added in AMBA 5 to overcome the limitations of ACE. (Image: Medium)

How does CHI fix the problem?

CHI uses a central hub as the point of contact for all memory transactions. It directs requests to the appropriate memory controller and handles cache coherency operations. CHI uses packet-based communication, where data, control signals, and addresses are transmitted in separate packets for efficient data transfers.

To access memory, a processor sends a request packet to the hub containing the target address and the operation (read or write). The hub then broadcasts a “snoop” transaction to the other connected processors to locate the most recent copy of the data. Each processor checks whether it holds a copy of the data in its cache and updates its cache state accordingly.

CHI maintains cache coherency across multiple processors by tracking the cache state of each data line and performing the required snoop operations when another processor accesses that line. Based on the snoop responses, CHI directs the corresponding memory controller to fetch the data from memory and return it to the requesting processor.
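The request, snoop, and response flow described above can be sketched in a few lines. This is a deliberately simplified model under stated assumptions, not the CHI protocol itself: the `Core` and `Hub` classes are hypothetical, the hub snoops every other core on each miss or write, and each line has only two valid states ("S" for shared and clean, "M" for modified), which is a smaller state set than the specification defines.

```python
class Core:
    """Hypothetical core with a private cache that answers snoops."""

    def __init__(self, name):
        self.name = name
        self.lines = {}  # addr -> (state, value), state is "S" or "M"

    def snoop(self, addr, invalidate):
        """Handle a snoop; return the value only if this copy is dirty."""
        if addr not in self.lines:
            return None
        state, value = self.lines[addr]
        if invalidate:
            del self.lines[addr]              # another core wants to write
        else:
            self.lines[addr] = ("S", value)   # another core wants to read
        return value if state == "M" else None


class Hub:
    """Hypothetical central hub: the single point of contact for requests."""

    def __init__(self, memory, cores):
        self.memory = memory
        self.cores = cores

    def _snoop_others(self, requester, addr, invalidate):
        for core in self.cores:
            if core is requester:
                continue
            dirty = core.snoop(addr, invalidate)
            if dirty is not None:
                self.memory[addr] = dirty     # write back the latest value

    def read(self, requester, addr):
        if addr in requester.lines:           # hit: local copy is still valid
            return requester.lines[addr][1]
        self._snoop_others(requester, addr, invalidate=False)
        value = self.memory[addr]
        requester.lines[addr] = ("S", value)
        return value

    def write(self, requester, addr, value):
        self._snoop_others(requester, addr, invalidate=True)
        requester.lines[addr] = ("M", value)


memory = {0x40: 1}
core0, core1 = Core("core0"), Core("core1")
hub = Hub(memory, [core0, core1])

hub.read(core1, 0x40)           # core1 caches the value 1
hub.write(core0, 0x40, 2)       # the snoop invalidates core1's copy
latest = hub.read(core1, 0x40)  # the snoop retrieves core0's dirty data
```

In this sketch `latest` is 2, because core1's old copy was invalidated by the write and the later read pulled the dirty line back from core0. Broadcasting every snoop keeps the model short; it says nothing about how a real interconnect decides which caches to snoop.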

CHI is scalable

Scalability is a key feature of CHI and differentiates it from earlier solutions based on ACE. Scalability begins with the separation of the protocol and transport layers. That enables different implementations to provide an optimal trade-off between power, area, and performance regardless of memory size or complexity. CHI is also compatible with mixed systems using ACE, ACE-Lite, and AXI (Figure 2).

Figure 2. CHI is highly scalable and is compatible with ACE, ACE-Lite, and AXI in heterogeneous platforms. (Image: SemiWiki)

As SoCs scale in complexity, quality of service (QoS) can become a concern. To address QoS, CHI includes a mechanism that supports prioritized traffic management. Different data streams can be allocated resources based on importance, with critical activities getting preferential treatment compared with lower-priority routine tasks.

CHI assigns each transaction a priority level through a dedicated QoS field carried with the request. Components called QoS regulators use these designated QoS levels to manage bandwidth and latency.

The QoS regulators monitor the incoming traffic and can dynamically adjust the priority of transactions based on pre-defined rules, ensuring critical data gets faster access to the network even when there is high traffic congestion. Designers can configure QoS parameters like bandwidth limits and latency thresholds for different types of transactions.
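As a rough illustration of prioritized traffic management, the sketch below models a strict-priority arbiter. The `QosArbiter` class and the traffic names are hypothetical, and the sketch assumes a 4-bit QoS value (0 to 15) where a higher number means higher priority. It shows only the ordering of grants; it does not model the bandwidth limits or dynamic priority adjustment that a real QoS regulator would apply.

```python
import heapq
import itertools


class QosArbiter:
    """Hypothetical arbiter: grants the highest-QoS pending request first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: arrival order

    def submit(self, name, qos):
        if not 0 <= qos <= 15:
            raise ValueError("qos must fit in 4 bits (0-15)")
        # heapq is a min-heap, so negate qos to pop the largest value first.
        heapq.heappush(self._heap, (-qos, next(self._order), name))

    def grant(self):
        _, _, name = heapq.heappop(self._heap)
        return name


arbiter = QosArbiter()
arbiter.submit("background DMA", qos=1)
arbiter.submit("display refresh", qos=14)
arbiter.submit("CPU read", qos=8)
arbiter.submit("audio stream", qos=14)

granted = [arbiter.grant() for _ in range(4)]
```

The grants come out as display refresh, audio stream, CPU read, then background DMA: the two QoS 14 requests are served first in arrival order, and the low-priority DMA waits until last even though it arrived first.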

Summary

ARM added CHI in the AMBA 5 release to overcome limitations in ACE. It can be combined with ACE, ACE-Lite, and AXI in heterogeneous systems. CHI uses packet-based communications for efficient implementation. It’s highly scalable to support complex systems and can deliver a deterministic QoS to support various activities.