
AI applications and emerging computing architectures – Virtual Conversation (part 1 of 2)

October 12, 2020 By Jeff Shepard

EE World has organized this “virtual conversation,” hosted by Jeff Shepard (JS), with Gary Bronner (GB), Senior Vice President at Rambus Labs. Mr. Bronner has generously agreed to share his experience and insights into AI applications and emerging computing architectures.

JS: What is usually the biggest challenge designers face when first using artificial intelligence?

Gary Bronner (GB), Senior Vice President, Rambus Labs

GB: One of the biggest challenges designers face when first using AI is that it requires a careful balancing of processing and memory. A fast AI accelerator needs a memory system feeding it enough data to stay busy. It is very hard to get the ratio of accelerator performance to memory bandwidth right, or even know what it should be unless you understand the specific problem you are working on and what the workloads will be. This is why memory, or, more precisely, the type of memory, is so important for AI. Both GDDR6 and HBM2 are widely used in AI systems. The choice between them comes down to tradeoffs in cost, power, chip area, and design complexity.
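To make that compute/memory balance concrete, here is a minimal back-of-the-envelope sketch; the 100-TOPS throughput and the arithmetic-intensity figures are illustrative assumptions, not Rambus numbers. An accelerator stays fed only if memory bandwidth matches its compute rate divided by the workload’s operations per byte of data moved.

```python
# Back-of-the-envelope compute/memory balance (illustrative numbers only).
# An accelerator stays busy only when memory can supply:
#   required_bandwidth = peak_compute / arithmetic_intensity
# where arithmetic intensity = operations performed per byte moved.

def required_bandwidth_gbs(peak_tops: float, ops_per_byte: float) -> float:
    """Memory bandwidth (GB/s) needed to keep the accelerator saturated."""
    return peak_tops * 1e12 / ops_per_byte / 1e9

# Hypothetical 100-TOPS accelerator on two workload profiles:
for workload, intensity in [("large matmul (high data reuse)", 200.0),
                            ("memory-bound inference", 20.0)]:
    print(f"{workload}: needs ~{required_bandwidth_gbs(100.0, intensity):.0f} GB/s")
```

The first case (~500 GB/s) is within reach of a wide GDDR6 subsystem or a couple of HBM2 stacks; the second (~5,000 GB/s) is not, which is exactly the kind of mismatch that makes the accelerator-to-bandwidth ratio so hard to get right.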

JS: What is the most misunderstood aspect of AI, and how does that translate into specific design challenges?

GB: A common misconception is that all AI or deep learning applications work the same way. Depending on whether you are doing image recognition, speech, translation, or any of the other applications of deep learning, the architecture that works best for a specific system is actually very different. There are several approaches that can significantly change the architecture of the system, so when you design hardware, you can’t just think of one. For example, you could build the best machine for convolutional networks, one that does great image recognition, but it wouldn’t work as well if you used that same machine for translation.
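As a rough illustration of that point, the sketch below compares the arithmetic intensity (operations per byte of memory traffic) of a convolution layer against a single translation decoder step. The layer shapes and fp16 sizing are illustrative assumptions, not figures from the interview.

```python
# Arithmetic-intensity comparison with assumed shapes (fp16 = 2 bytes).
# A convolution reuses each weight across the whole image; an autoregressive
# translation step is essentially one matrix-vector product with little reuse.

BYTES = 2  # fp16

def conv_intensity(h, w, cin, cout, k):
    ops = 2 * h * w * cout * k * k * cin                  # multiply-accumulates x 2
    traffic = BYTES * (k * k * cin * cout + h * w * (cin + cout))
    return ops / traffic

def gemv_intensity(n, m):
    ops = 2 * n * m
    traffic = BYTES * (n * m + m + n)                     # weight matrix dominates
    return ops / traffic

print(f"3x3 conv, 224x224, 64->64 ch: {conv_intensity(224, 224, 64, 64, 3):.0f} ops/byte")
print(f"decoder step, 4096x4096 GEMV: {gemv_intensity(4096, 4096):.1f} ops/byte")
```

At hundreds of operations per byte, the convolution is compute-bound and rewards raw arithmetic throughput; at roughly one operation per byte, the decoder step is memory-bound, so a machine built for convolutions would mostly sit waiting on memory when used for translation.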

The other thing that is important to remember is that the field is still evolving, with new algorithms being produced almost every day. How do you design a system to be useful for algorithms that haven’t been invented yet? Some of the ability to answer that question depends on how good your crystal ball is.

More practically, it translates to keeping in mind how much bandwidth you need today as well as how much you’ll need in the future. Architecturally, it’s about figuring out how to flexibly route resources around the various compute elements. For example, the higher the performance of an AI accelerator, the more data it needs to keep itself busy and productive. That drives the need for high-performance memory systems and very high-bandwidth memory interfaces, either GDDR6 or the various HBM generations, both of which Rambus helps enable.

GDDR6 is a high-performance memory solution that can be used in various applications requiring high memory throughput, such as ADAS, data center, and AI. (Image: Rambus Labs)
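For a sense of scale on that GDDR6-versus-HBM choice, peak bandwidth is simply the per-pin data rate times the bus width. The rates below are typical published figures for each technology, not Rambus specifications.

```python
# Peak-bandwidth comparison: GB/s = (Gbps per pin) * (bus width in bits) / 8.
# Per-pin rates are representative published figures for each technology.

def peak_bw_gbs(gbps_per_pin: float, bus_width_bits: int) -> float:
    return gbps_per_pin * bus_width_bits / 8

print(f"GDDR6 device (16 Gbps x 32-bit):  {peak_bw_gbs(16.0, 32):.0f} GB/s")
print(f"HBM2 stack   (2 Gbps x 1024-bit): {peak_bw_gbs(2.0, 1024):.0f} GB/s")
```

GDDR6 reaches high bandwidth with very fast signaling over a narrow bus on a standard PCB, while HBM2 runs each pin slower but a thousand bits wide, which requires 2.5D interposer packaging. That difference is the root of the cost, power, area, and complexity tradeoffs mentioned above.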

What it ultimately boils down to is that you need to design a system that is highly optimized, but not completely specialized, so it retains some added flexibility.

JS: Considering AI accelerators using CPUs, GPUs, and FPGAs, what are the tradeoffs in terms of specific application features and benefits of each of those approaches?

GB: As the saying goes, “you can have something done cheap, fast, or right – pick two.” The same concept applies here; AI accelerators can be fast, energy-efficient, or flexible, but not all three.

CPUs are general-purpose, so they are the least expensive, but they provide the lowest performance for AI and are the least power efficient. FPGAs offer better performance and power efficiency than CPUs, though they are not as performant as GPUs; however, they offer a level of flexibility that makes them good for prototyping and for keeping pace with rapidly evolving algorithms. As a result, we’re currently seeing FPGAs used frequently for AI accelerators. GPUs lack the flexibility of FPGAs but offer better performance and power efficiency.

I think there is actually a fourth approach to building an AI accelerator – custom ASICs. An ASIC is probably the highest-performance and highest-efficiency approach once you are confident that the algorithms are relatively settled. The challenge is that it can take six months to design an ASIC, by which point the algorithm it was designed for may already be outdated, and the process has to start over.

JS: Do you expect to see new computing architectures emerge optimized specifically for AI? If so, what should we expect to see in this area?

GB: Yes, we are already seeing what is being referred to as a new “golden age of computing,” which is resulting in many specialized chips. We are seeing a tremendous need for ever-increasing performance, manifesting in new technology such as Google’s TPU and startups such as Cerebras Systems, along with probably more than 30 or 40 other designs in various stages of deployment. These chips are good at running AI, and because there is such high demand, there’s an astonishing ability to fund them, which continues to drive performance and efficiency.

This can be further broken down into data center and edge. At the edge, the big concern is the efficient implementation of machine learning functions, which are mostly inference tasks; for example, we need Alexa to recognize the “hot word.” In the data center—where the training takes place—what matters is attaining the highest performance to improve machine learning algorithms.

Right now, the number of architectures is growing rapidly as the industry evolves and tries to figure out what’s going to ultimately work best. This will eventually lead to a kind of natural selection where we’ll see a few of the best solutions become de facto standards.

JS: Thank you to Mr. Bronner for sharing his insights and experience – great conversation! You might also be interested in reading “Measuring the performance of AI and its impact on society,” Virtual Conversation (part 2 of 2).
