Multi-modal large language models running on mid- to high-end SoCs reduce power per inference

January 8, 2024 By Redding Traiger

Ambarella, Inc. announced during CES that it is demonstrating multi-modal large language models (LLMs) running on its new N1 SoC series at a fraction of the power per inference of leading GPU solutions. Ambarella aims to bring generative AI, a transformative technology that first appeared in servers because of the large processing power required, to edge endpoint devices and on-premise hardware for applications such as video security analysis, robotics, and a broad range of industrial uses.
Ambarella will initially offer optimized generative AI processing on its mid- to high-end SoCs, from the existing CV72 for on-device performance under 5 W through the new N1 series for server-grade performance under 50 W. Compared with GPUs and other AI accelerators, Ambarella's complete SoC solutions are up to 3x more power-efficient per generated token while enabling immediate and cost-effective deployment in products.
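"Power-efficient per generated token" reduces to a simple metric: sustained power divided by decode throughput gives the energy spent per token. A minimal sketch of that comparison, using purely illustrative wattage and throughput figures rather than measured Ambarella or GPU numbers:

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy spent per generated token (J/token) at a sustained power draw."""
    return power_watts / tokens_per_second

# Illustrative figures only, not vendor-measured numbers.
edge_soc = joules_per_token(power_watts=20.0, tokens_per_second=10.0)   # 2.0 J/token
gpu_card = joules_per_token(power_watts=300.0, tokens_per_second=50.0)  # 6.0 J/token

print(f"edge SoC: {edge_soc:.1f} J/token, GPU: {gpu_card:.1f} J/token, "
      f"advantage: {gpu_card / edge_soc:.1f}x")
```

The same formula is how any "x-times more efficient per token" claim can be checked against a datasheet: it needs only a power figure and a sustained tokens-per-second figure measured under comparable conditions.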
All of Ambarella’s AI SoCs are supported by the company’s new Cooper Developer Platform. Additionally, to reduce customers’ time-to-market, Ambarella has pre-ported and optimized popular LLMs, such as Llama-2, as well as the Large Language and Vision Assistant (LLaVA) model, which runs on the N1 for multi-modal vision analysis of up to 32 camera sources. These pre-trained and fine-tuned models will be available for partners to download from the Cooper Model Garden.
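Because the LLaVA family is open source, the kind of query these pre-ported models answer can be sketched off-target with the Hugging Face transformers API. This is a generic, single-frame illustration, not the Cooper Developer Platform workflow; the model checkpoint and frame path are assumptions for the example:

```python
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Assumed checkpoint and input frame -- illustrative only, not Ambarella tooling.
model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

frame = Image.open("camera_frame.jpg")  # e.g. a frame grabbed from a security camera
prompt = "USER: <image>\nList any people or vehicles visible in this frame. ASSISTANT:"

inputs = processor(text=prompt, images=frame, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

On Ambarella's silicon the equivalent model would presumably be deployed through the Cooper tooling rather than through transformers, but the image-plus-prompt, text-out pattern is the same one used for multi-camera vision analysis.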
For many real-world applications, visual input is a key modality alongside language, and Ambarella’s SoC architecture is natively well suited to processing video and AI simultaneously at very low power. Unlike a standalone AI accelerator, a full-function SoC can run multi-modal LLMs highly efficiently while still performing all other system functions.
Generative AI will be a step-function improvement for computer vision processing, bringing context and scene understanding to a variety of devices, from security installations and autonomous robots to industrial applications. Examples of the on-device LLM and multi-modal processing enabled by this new Ambarella offering include smart contextual searches of security footage, robots that can be controlled with natural-language commands, and AI assistants that can perform anything from code generation to text and image generation.
Most of these systems rely heavily on both camera and natural-language understanding and will benefit from on-device generative AI processing for speed and privacy, as well as a lower total cost of ownership. The local processing enabled by Ambarella’s solutions also suits application-specific LLMs, which are typically fine-tuned on the edge for each scenario, in contrast to the classical server approach of using bigger, more power-hungry LLMs to cover every use case.
Based on Ambarella’s powerful CV3-HD architecture, initially developed for autonomous driving applications, the N1 series of SoCs repurposes all of that performance for running multi-modal LLMs within an extremely low-power footprint. For example, the N1 SoC runs Llama2-13B at up to 25 output tokens per second in single-stream mode while drawing under 50 W. Combined with the ease of integrating pre-ported models, this new solution can quickly help OEMs deploy generative AI into any power-sensitive application, from an on-premise AI box to a delivery robot.
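Taking the quoted figures at face value (25 output tokens per second at under 50 W), the per-token and per-response budgets are easy to estimate; the 200-token response length below is an arbitrary example, not a figure from the announcement:

```python
power_w = 50.0         # quoted upper bound for the N1 running Llama2-13B
tokens_per_s = 25.0    # quoted single-stream decode rate
response_tokens = 200  # arbitrary example answer length

energy_per_token_j = power_w / tokens_per_s             # 2.0 J per output token
latency_s = response_tokens / tokens_per_s              # 8.0 s per response
energy_per_response_wh = power_w * latency_s / 3600.0   # ~0.11 Wh per response

print(f"{energy_per_token_j:.1f} J/token; {latency_s:.0f} s and "
      f"{energy_per_response_wh:.2f} Wh per {response_tokens}-token response")
```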
Both the N1 SoC and a demonstration of its multi-modal LLM capabilities will be on display this week at the Ambarella exhibition during CES.

