What is ‘compute-in-memory’ and why is it important for AI?

July 3, 2025 By Rakesh Kumar

If you are running AI workloads, here is something that might surprise you. Your processors are wasting more energy shuffling data around than actually doing the calculations you care about. This inefficiency is becoming a serious limit for the next generation of artificial intelligence systems. As neural networks grow to billions of parameters, traditional von Neumann architectures are hitting physical barriers.

This article explains what compute-in-memory (CIM) technology is and how it works. We will examine how current implementations already deliver significant efficiency improvements over conventional processors and explore why this new approach could change AI computing.

Challenges with traditional computers

Traditional computers keep computational units and memory systems separate and constantly exchange data between them through energy-intensive transfers. Early processing-in-memory proposals such as Terasys, IRAM, and FlexRAM emerged in the 1990s, but these initial attempts had major limitations: the CMOS technology of the time wasn't advanced enough, and application demands were different.

The traditional von Neumann architecture (Figure 1a) maintains a strict separation between the central processing unit and memory. This approach requires constant data transfers across a bandwidth-limited bus. This separation creates the “memory wall” problem, which particularly hurts AI workloads.

Figure 1. Evolution of computing architectures from (a) traditional von Neumann with separated CPU and memory, through (b) near-memory computing, to true compute-in-memory approaches using (c) SRAM-based and (d) eNVM-based implementations. (Image: IEEE)

Understanding compute-in-memory

CIM, also known as processing-in-memory, is very different from the traditional von Neumann architecture that has dominated computing for decades. It performs computations directly within or very close to where the data is stored.

Near-memory computing (Figure 1b) brings memory closer to the processing units, while true in-memory computing approaches (Figures 1c and 1d) go further and embed computational capabilities directly within the memory arrays. Integrating storage and logic in this way reduces data movement, which decreases both latency and energy consumption, the two major bottlenecks in modern AI applications.
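To see why cutting data movement matters so much, the short Python sketch below estimates the energy of one fully connected layer under the three architectures of Figure 1. The per-word and per-MAC energies are illustrative assumptions (real values vary widely with process node and memory technology), not figures taken from this article.

# Illustrative (assumed) energy costs per 32-bit word or operation.
# These rough, node-dependent figures only demonstrate the trend the
# article describes; they are not measurements from the article.
E_DRAM_WORD_PJ = 640.0     # off-chip DRAM access (von Neumann, Fig. 1a)
E_STACKED_WORD_PJ = 100.0  # 3D-stacked / near-memory access (Fig. 1b)
E_LOCAL_WORD_PJ = 1.0      # bit-line access inside a CIM array (Fig. 1c/1d)
E_MAC_PJ = 4.0             # one 32-bit multiply-accumulate

def layer_energy_uj(n, e_weight_word_pj):
    """Energy for one n x n fully connected layer (n^2 weights, n^2 MACs)."""
    weights = n * n          # weight words that must be touched
    macs = n * n             # multiply-accumulates performed
    io = 2 * n               # input activations in, results out
    total_pj = (weights * e_weight_word_pj
                + io * E_DRAM_WORD_PJ    # activations still cross the bus
                + macs * E_MAC_PJ)
    return total_pj / 1e6    # picojoules -> microjoules

n = 1024
for name, e in [("von Neumann (off-chip DRAM)", E_DRAM_WORD_PJ),
                ("near-memory (3D-stacked)",    E_STACKED_WORD_PJ),
                ("compute-in-memory",           E_LOCAL_WORD_PJ)]:
    print(f"{name:30s} ~{layer_energy_uj(n, e):8.1f} uJ")

Even with these crude numbers, moving the weights dominates the von Neumann case, which is exactly the memory-wall behavior described above.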

The rapid growth of big data and machine learning applications, which demand high computational efficiency, has driven the rise of CIM.

Technical implementation approaches

CIM can be implemented using various memory technologies, each offering distinct advantages for different AI workloads.

Static Random-Access Memory (SRAM) has emerged as the most popular choice for CIM implementations. Its speed, robustness, and compatibility with existing fabrication processes make it ideal for AI accelerators. Researchers have developed modified SRAM bitcell structures, including 8T, 9T, and 10T configurations, along with auxiliary peripheral circuits to enhance performance.

Figure 2 illustrates the scope of SRAM-based CIM development, showing how circuit-level innovations enable sophisticated computing functions and real-world AI applications. At the circuit level (Figure 2a), SRAM-based CIM requires specialized bitcell structures and peripheral circuits, including analog-to-digital converters, time-control systems, and redundant reference columns. These circuit innovations enable a range of functional capabilities (Figure 2b).

Figure 2. Complete framework of SRAM-based compute-in-memory showing the progression from (a) circuit-level implementations with bitcell structures and peripheral circuits, through (b) functional capabilities including digital and mixed-signal operations, to (c) real-world AI applications like CNN, AES encryption, and classification algorithms. (Image: Researching)

Digital operations include Boolean logic and content-addressable memory. Mixed-signal operations support multiply-accumulate and sum of absolute difference computations that are fundamental to neural networks.
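For a concrete reference point, the NumPy snippet below spells out what those two mixed-signal primitives, multiply-accumulate (MAC) and sum of absolute differences (SAD), actually compute. The vector length and bit width are arbitrary illustrative choices; in a CIM macro each result would come from a single array access rather than a loop of separate memory reads.

import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 16, size=64)    # input activations (4-bit, as an example)
w = rng.integers(0, 16, size=64)    # weights stored in the array
ref = rng.integers(0, 16, size=64)  # reference pattern for SAD

# Multiply-accumulate (MAC): the dot product at the heart of every
# neural-network layer.
mac = int(np.dot(x, w))

# Sum of absolute differences (SAD): a common similarity measure used in
# template matching and motion estimation.
sad = int(np.sum(np.abs(x - ref)))

print("MAC =", mac, " SAD =", sad)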

As demonstrated in the application layer (Figure 2c), these technical capabilities translate into accelerated AI algorithms. These include convolutional neural networks for image classification, AES encryption for security applications, and k-nearest neighbor algorithms for pattern recognition. However, SRAM faces challenges, including low density and high leakage current, that limit its scalability for large AI processors.

Dynamic Random-Access Memory (DRAM), while less common for direct in-memory computation because of its refresh requirements, plays a central role in near-memory processing architectures. Technologies such as High Bandwidth Memory (HBM) and the Hybrid Memory Cube use 3D stacking to reduce the physical distance between computation and memory.

Resistive Random-Access Memory (ReRAM) is among the most promising emerging technologies for CIM. This non-volatile memory offers high density, is compatible with back-end-of-line fabrication processes, and is well suited to the matrix-vector multiplication operations that are fundamental to neural networks.
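ReRAM maps naturally onto matrix-vector multiplication because the array itself does the arithmetic: stored conductances act as weights, applied word-line voltages act as inputs, and each bit line sums the resulting currents. The minimal sketch below models an ideal crossbar; the array size and conductance range are arbitrary values chosen only for illustration.

import numpy as np

# Idealized ReRAM crossbar: weights are stored as conductances G (siemens)
# and inputs are applied as word-line voltages V (volts). Ohm's law gives
# each cell current V_i * G_ij, and Kirchhoff's current law sums every bit
# line, so the column currents I = G.T @ V form the matrix-vector product.
rng = np.random.default_rng(1)

G = rng.uniform(1e-6, 1e-4, size=(4, 3))  # 4 word lines x 3 bit lines
V = rng.uniform(0.0, 0.2, size=4)         # read voltages on the word lines

I = G.T @ V                               # bit-line currents (amperes)
print("bit-line currents:", I)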

CIM implementations also vary in their computational domain. Analog CIM uses the physical properties of the memory cells, typically current summation and charge collection, to perform operations; it offers higher weight density but can suffer from noise. Digital CIM provides high accuracy at the cost of one device per bit. Mixed-signal approaches try to balance the benefits of both analog and digital methods.
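A toy model makes the analog-versus-digital trade-off tangible: treat device variation in an analog array as a small multiplicative disturbance on each stored weight and compare the result with the exact answer a one-device-per-bit digital macro would return. The 2% variation level is purely an assumption for illustration, not a measured figure.

import numpy as np

rng = np.random.default_rng(2)
w = rng.uniform(-1, 1, size=(256, 10))   # weight matrix
x = rng.uniform(-1, 1, size=256)         # input vector

exact = x @ w                            # exact result (digital CIM)

# Crude model of an analog CIM column: each stored weight is disturbed by
# device variation and read noise (2% is an assumed, illustrative level).
noise = rng.normal(0.0, 0.02, size=w.shape)
analog = x @ (w * (1.0 + noise))

err = np.abs(analog - exact)
print("mean error relative to output scale: "
      "{:.3f}".format(err.mean() / np.abs(exact).mean()))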

Transformative benefits for AI applications

The practical benefits of CIM for AI are both measurable and compelling, as demonstrated in Figure 3. The energy efficiency comparison reveals the advantages of CIM architectures across different technology nodes. While traditional CPUs achieve only 0.01-0.1 TOPS/W (tera operations per second per watt), digital in-memory architectures deliver 1-100 TOPS/W, representing 100 to 1000 times better energy efficiency. Advanced CIM approaches like silicon photonics and optical systems push efficiency even higher.

Figure 3. Energy efficiency comparison across technology nodes (left) and energy consumption breakdown (right) for different processor types. (Image: ResearchGate)

The energy breakdown analysis (Figure 3, right) reveals why CIM is effective. Traditional CPUs are dominated by memory access energy (blue bars), while CIM architectures reduce this bottleneck by performing computation directly in memory. This fundamental advantage translates to measurable performance improvements across AI applications.
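One way to make those TOPS/W figures concrete is to invert them into energy per operation, since 1 TOPS/W corresponds to one operation per picojoule. The short calculation below does this for the endpoints of the ranges quoted above.

# Converting efficiency (TOPS/W) into energy per operation:
# 1 TOPS/W = 1e12 ops per joule = 1 pJ per op, so energy/op (pJ) = 1 / (TOPS/W).
for label, tops_per_watt in [("CPU, low end",          0.01),
                             ("CPU, high end",         0.1),
                             ("digital CIM, low end",  1.0),
                             ("digital CIM, high end", 100.0)]:
    print(f"{label:22s} {1.0 / tops_per_watt:10.2f} pJ per operation")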

The real-world impact of CIM on transformer and LLM acceleration is demonstrated by recent implementations shown in Table 1. Various CIM architectures have achieved performance improvements with speedups ranging from 2.3x to 200x compared to NVIDIA GPUs. Energy efficiency gains reach up to 1894x. These results span multiple transformer models, including BERT, GPT, and RoBERTa, demonstrating CIM’s broad applicability to modern language models.

Table 1. Comparison of various CIM architectures for transformer and LLM benchmarks, showing substantial speedup and efficiency improvements over NVIDIA GPUs across different models and memory technologies. (Image: arXiv)

Summary

As we enter the post-Moore’s Law era, CIM represents a significant architectural shift that addresses key challenges in AI computing. The technology is advancing rapidly, with SRAM-based solutions approaching commercial viability and emerging non-volatile memory solutions showing potential for future applications. As AI continues to expand across technology applications, CIM could become an important enabling technology for more efficient AI deployment.

References

Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference, arXiv
Energy-efficient computing-in-memory architecture for AI processor – device, circuit, architecture perspective, Science China
A review on SRAM-based computing in-memory: Circuits, functions, and applications, Researching
An Overview of Processing-in-Memory Circuits for Artificial Intelligence and Machine Learning, IEEE
Analog, In-memory Compute Architectures for Artificial Intelligence, ResearchGate
In-Memory Computing for Machine Learning and Deep Learning, IEEE
Emerging In-memory Computing for Neural Networks, Fraunhofer

