• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Skip to footer

Microcontroller Tips

Microcontroller engineering resources, new microcontroller products and electronics engineering news

  • Products
    • 8-bit
    • 16-bit
    • 32-bit
    • 64-bit
  • Applications
    • 5G
    • Automotive
    • Connectivity
    • Consumer Electronics
    • EV Engineering
    • Industrial
    • IoT
    • Medical
    • Security
    • Telecommunications
    • Wearables
    • Wireless
  • Learn
    • eBooks / Tech Tips
    • EE Training Days
    • FAQs
    • Learning Center
    • Tech Toolboxes
    • Webinars/Digital Events
  • Resources
    • Design Guide Library
    • DesignFast
    • LEAP Awards
    • Podcasts
    • White Papers
  • Videos
    • EE Videos & Interviews
    • Teardown Videos
  • EE Forums
    • EDABoard.com
    • Electro-Tech-Online.com
  • Engineering Training Days
  • Advertise
  • Subscribe

Edge inference co-processor delivers high throughput for edge apps

April 11, 2019 By Aimee Kalnoskas Leave a Comment

Edge inference co-processorFlex Logix Technologies, Inc. today announced that it has leveraged its core patent-protected interconnect technology from its embedded FPGA (eFPGA) line of business combined with inference-optimized nnMAX clusters to develop the InferXÔ X1 edge inference co-processor. Unveiled today in a presentation at the Linley Processor Conference in Santa Clara, the Flex Logix InferX X1 chip delivers high throughput in edge applications with a single DRAM, resulting in much higher throughput/watt than existing solutions. Its performance advantage is especially strong at low batch sizes which are required in edge applications where there is typically only one camera/sensor.

InferX X1’s performance at small batch sizes is close to data center inference boards and is optimized for large models which need 100s of billions of operations per image.  For example, for YOLOv3 real time object recognition, InferX X1 processes 12.7 frames/second of 2 megapixel images at batch size = 1. Performance is roughly linear with image size: so frame rate approximately doubles for a 1 megapixel image.  This is with a single DRAM.

InferX X1 will be available as chips for edge devices and on half-height, half-length PCIe cards for edge servers and gateways. It is programmed using the nnMAX Compiler which takes Tensorflow Lite or ONNX models. The internal architecture of the inference engine is hidden from the user.

InferX supports integer 8, 16 and bfloat 16 numerics with the ability to mix them across layers, enabling easy porting of models with optimized throughput at maximum precision. InferX supports Winograd transformation for integer 8 mode for common convolution operations which accelerates throughput by 2.25x for these functions while minimizing bandwidth by doing on-chip, on-the-fly conversion of weights to Winograd mode. To ensure no loss of precision, Winograd calculations are done with 12 bits of accuracy.

The new nnMAX neural inference engine leverages the same core interconnect technology used in eFPGA combined with multiplier-accumulators optimized for inference and aggregated into clusters of 64 with local weight storage for each layer.

In neural inference, computation is dominated by trillions of operations (multiplies and accumulates), typically using 8-bit integer inputs and weights, and sometimes 16-bit integer or 16-bit bfloat floating point. It is possible to mix these numerics layer-by-layer as needed to achieve target precision. The technology Flex Logix has developed for eFPGA is also ideally suited for inference because eFPGA allows for re-configurable data paths and fast control logic for each network stage. SRAM in eFPGA is reconfigurable as needed in neural networks where each layer can require different data sizes; and Flex Logix interconnects allow reconfigurable connections between SRAM input banks, MAC clusters, and activation to SRAM output banks at each stage.

The result is an nnMAX tile of 1024 MACs with local SRAM, which in 16nm has ~2.1 TOPS peak performance. nnMAX tiles can be arrayed into NxN arrays of any size, without any GDS change, with varying amounts of SRAM as needed to optimize for the target neural network model, up to to >100 TOPS peak performance.

Filed Under: Applications, Connectivity, Data centers, FPGA, microcontroller, Neural Networking Tagged With: flexlogixtechnologiesinc

Reader Interactions

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Primary Sidebar

Featured Contributions

Five challenges for developing next-generation ADAS and autonomous vehicles

Securing IoT devices against quantum computing risks

RISC-V implementation strategies for certification of safety-critical systems

What’s new with Matter: how Matter 1.4 is reshaping interoperability and energy management

Edge AI: Revolutionizing real-time data processing and automation

More Featured Contributions

EE TECH TOOLBOX

“ee
Tech Toolbox: 5G Technology
This Tech Toolbox covers the basics of 5G technology plus a story about how engineers designed and built a prototype DSL router mostly from old cellphone parts. Download this first 5G/wired/wireless communications Tech Toolbox to learn more!

EE Learning Center

EE Learning Center

EE ENGINEERING TRAINING DAYS

engineering
“bills
“microcontroller
EXPAND YOUR KNOWLEDGE AND STAY CONNECTED
Get the latest info on technologies, tools and strategies for EE professionals.

DesignFast

Design Fast Logo
Component Selection Made Simple.

Try it Today
design fast globle

Footer

Microcontroller Tips

EE World Online Network

  • 5G Technology World
  • EE World Online
  • Engineers Garage
  • Analog IC Tips
  • Battery Power Tips
  • Connector Tips
  • DesignFast
  • EDA Board Forums
  • Electro Tech Online Forums
  • EV Engineering
  • Power Electronic Tips
  • Sensor Tips
  • Test and Measurement Tips

Microcontroller Tips

  • Subscribe to our newsletter
  • Advertise with us
  • Contact us
  • About us

Copyright © 2025 · WTWH Media LLC and its licensors. All rights reserved.
The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media.

Privacy Policy