The Coherent Hub Interface (CHI) is used in system-on-chip (SoC) designs to track which processor has the most recent copy of a data block, preventing other processors from using old data. CHI is used in a wide range of applications requiring high-performance cache coherence, such as mobile devices, networking equipment, automotive systems, and data centers.
CHI is a specification within the ARM Advanced Microcontroller Bus Architecture (AMBA) family. It defines how fully coherent processors and dynamic memory controllers communicate so that cache coherence can be managed efficiently between multiple processing cores within an SoC.
In this context, a coherent processor is a processor that operates within an SoC where all processors maintain a consistent (coherent) view of shared memory. If one processor modifies data, all other processors accessing that data will see the updated version, preventing inconsistencies due to outdated cached copies.
What’s the problem?
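SoC solutions are usually optimized for maximum performance, and each processor has its own dedicated L1 and L2 cache memory for the fastest access. Because each cache can hold its own copy of the same memory location, a write by one processor can leave stale copies in the other caches, which results in memory coherence problems.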
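The stale-copy problem can be illustrated with a minimal Python sketch. It models two cores with private write-back caches and no coherence mechanism at all. The class name `PrivateCache`, the address `0x40`, and the values are hypothetical, chosen only for illustration, and do not correspond to any real hardware interface.

```python
class PrivateCache:
    """Toy per-core cache with NO coherence mechanism (illustrative assumption)."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory, modeled as a dict
        self.lines = {}        # this core's private copies: addr -> value

    def read(self, addr):
        # On a miss, fill the line from main memory; on a hit, use the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Write-back behavior: only the local copy is updated.
        self.lines[addr] = value


memory = {0x40: 1}
core0 = PrivateCache(memory)
core1 = PrivateCache(memory)

core0.read(0x40)           # both cores cache the original value
core1.read(0x40)
core1.write(0x40, 2)       # core1 modifies its private copy

stale = core0.read(0x40)   # core0 still sees the old value
fresh = core1.read(0x40)   # core1 sees its own update
```

After the write, `core0` and `core1` disagree about the contents of the same address, and main memory is out of date as well. That disagreement is what a coherence protocol is meant to prevent.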
CHI was added in AMBA 5 to overcome the limitations of the AXI Coherency Extensions (ACE). ACE was designed for smaller coherent clusters and doesn’t scale well to complex SoCs with many processors. In addition, ACE was replaced by ACE5-Lite, which is designed to work in coordination with CHI. ACE5-Lite is used by I/O coherent managers that need to communicate with the fully coherent managers in the system that have caches (Figure 1).

How does CHI fix the problem?
CHI uses a central hub as the point of contact for all memory transactions. It directs requests to the appropriate memory controller and handles cache coherency operations. CHI uses packet-based communication, where data, control signals, and addresses are transmitted in separate packets for efficient data transfers.
To access memory, a processor sends the hub a request packet containing the target address and the operation (read or write). The hub then sends “snoop” transactions to the other connected processors to locate the most recent copy of the data. Each processor checks whether it has a copy of the data in its cache and updates its cache state accordingly.
CHI maintains cache coherency across multiple processors by tracking the cache state of each data line and performing the required snoop operations when another processor accesses that line. Based on the snoop responses, the hub determines where the most recent copy resides. When the data must come from memory, the hub directs the corresponding memory controller to fetch it and return it to the requesting processor.
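The request and snoop flow described above can be sketched as a toy model in Python. This is a simplified illustration under stated assumptions, not the CHI protocol itself: the `Hub` and `Cache` classes, the `(value, dirty)` line format, and the two-operation interface are all hypothetical, and the real specification defines many more cache states and message types.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """Toy private cache; each line is stored as addr -> (value, dirty)."""
    name: str
    lines: dict = field(default_factory=dict)

    def snoop(self, addr, invalidate):
        """Answer a snoop: return (value, dirty) if the line is held, else None."""
        entry = self.lines.get(addr)
        if entry is None:
            return None
        if invalidate:
            del self.lines[addr]                 # another core is about to write
        else:
            self.lines[addr] = (entry[0], False) # hub writes dirty data back
        return entry


class Hub:
    """Toy central hub: every memory request goes through it (assumption)."""

    def __init__(self, memory, caches):
        self.memory = memory
        self.caches = caches

    def _snoop_others(self, requester, addr, invalidate):
        for cache in self.caches:
            if cache is requester:
                continue
            response = cache.snoop(addr, invalidate)
            if response is not None and response[1]:
                self.memory[addr] = response[0]  # write back the dirty copy

    def read(self, requester, addr):
        if addr in requester.lines:              # local hit, copy is still valid
            return requester.lines[addr][0]
        self._snoop_others(requester, addr, invalidate=False)
        value = self.memory[addr]                # memory is now up to date
        requester.lines[addr] = (value, False)
        return value

    def write(self, requester, addr, value):
        self._snoop_others(requester, addr, invalidate=True)
        requester.lines[addr] = (value, True)    # requester holds the only copy


memory = {0x40: 1}
c0, c1 = Cache("core0"), Cache("core1")
hub = Hub(memory, [c0, c1])

hub.read(c0, 0x40)
hub.read(c1, 0x40)
hub.write(c1, 0x40, 2)       # invalidates core0's copy
seen = hub.read(c0, 0x40)    # core0 misses, hub snoops core1, returns 2
```

In this sketch a write invalidates every other copy, so a later read by another core misses in its own cache and is served with the updated value. The hub is the single place where that ordering is enforced, which mirrors the role the article describes.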
CHI is scalable
Scalability is a key feature of CHI and differentiates it from earlier solutions based on ACE. Scalability begins with the separation of the protocol and transport layers. That enables different implementations to provide an optimal trade-off between power, area, and performance regardless of memory size or complexity. CHI is also compatible with mixed systems using ACE, ACE-Lite, and AXI (Figure 2).

As SoCs scale in complexity, quality of service (QoS) can become a concern. To address QoS, CHI includes a mechanism that supports prioritized traffic management. Different data streams can be allocated resources based on importance, with critical activities getting preferential treatment compared with lower-priority routine tasks.
Each CHI transaction carries a QoS value that indicates its priority level, and interconnect components called QoS regulators can assign or adjust these values. The QoS mechanism manages bandwidth and latency based on the designated QoS levels.
The QoS regulators monitor the incoming traffic and can dynamically adjust the priority of transactions based on pre-defined rules, ensuring critical data gets faster access to the network even when there is high traffic congestion. Designers can configure QoS parameters like bandwidth limits and latency thresholds for different types of transactions.
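Priority-based arbitration of this kind can be sketched with a small Python example. It is a minimal illustration that assumes a larger QoS number means higher priority and that requests at the same level are served in arrival order. The `QosArbiter` class, the transaction names, and the QoS values are hypothetical, and the bandwidth limits and latency thresholds a real regulator would apply are left out.

```python
import heapq
from itertools import count


class QosArbiter:
    """Toy arbiter: grants the pending request with the highest QoS value."""

    def __init__(self):
        self._heap = []
        self._arrival = count()   # tie-breaker that preserves arrival order

    def submit(self, name, qos):
        # heapq is a min-heap, so the QoS value is negated to pop the largest first.
        heapq.heappush(self._heap, (-qos, next(self._arrival), name))

    def grant(self):
        _, _, name = heapq.heappop(self._heap)
        return name


arb = QosArbiter()
arb.submit("dma_bulk_copy", qos=2)
arb.submit("display_refresh", qos=12)
arb.submit("cpu_read", qos=8)
arb.submit("dma_bulk_copy_2", qos=2)

granted = [arb.grant() for _ in range(4)]
# granted == ["display_refresh", "cpu_read", "dma_bulk_copy", "dma_bulk_copy_2"]
```

Strict priority as shown here can starve low-priority traffic under sustained load, which is one reason configurable bandwidth limits and latency thresholds are useful alongside priority levels.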
Summary
ARM added CHI in the AMBA 5 release to overcome limitations in ACE. It can be combined with ACE, ACE-Lite, and AXI in heterogeneous systems. CHI uses packet-based communications for efficient implementation. It’s highly scalable to support complex systems and can deliver a deterministic QoS to support various activities.