XMOS has jointly developed a new voice interface development kit, “codama”, with Yukai Engineering and NTT DOCOMO.
The “codama” voice interface development kit includes the XMOS VocalFusion XVF3100 voice processor and enables AI engineers and developers to build a voice interface based on the NTT DOCOMO “AI Agent API” into their products. This technology gives people the freedom to control their electronic devices and access a wide range of IoT services simply by using their voice, wherever they are in the room and whatever is happening around them.
The XVF3100 delivers outstanding voice capture accuracy over long distances. Its barge-in capability means the user can interrupt music playing on the consumer device, even when the music is loud and the wake-word is spoken softly. Additionally, the XVF3100 uses beamforming to determine the direction of arrival of the voice command and track the person speaking as they move around the room. Sophisticated algorithms deliver acoustic echo cancellation, dereverberation and noise suppression to capture the voice of interest and clean up the signal before it is sent on to the NTT DOCOMO “AI Agent API”.
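For readers unfamiliar with direction of arrival, the underlying idea can be illustrated with a minimal Python sketch. It is not the XVF3100 algorithm, which is not described here; it uses a textbook approach for just two microphones, finding the delay between them by cross-correlation and converting it to an angle with sin(θ) = c·τ / d. The function names, the 10 cm microphone spacing, the 48 kHz sample rate and the synthetic test signal are all assumptions made for illustration.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 °C


def best_lag(a, b, max_lag):
    """Return the lag in samples (positive if b trails a) with the highest cross-correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(a)):
            m = n + lag
            if 0 <= m < len(b):
                score += a[n] * b[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def doa_degrees(a, b, sample_rate, mic_spacing):
    """Estimate the arrival angle relative to broadside for a two-microphone pair.

    A positive angle means the sound reached microphone a first.
    """
    # Largest physically possible delay: sound travelling along the array axis.
    max_lag = math.ceil(mic_spacing / SPEED_OF_SOUND * sample_rate)
    tau = best_lag(a, b, max_lag) / sample_rate
    # Clamp to [-1, 1] so rounding cannot push asin out of its domain.
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_spacing))
    return math.degrees(math.asin(s))


# Synthetic demo (assumed values): a noise burst reaches microphone b 7 samples late.
rng = random.Random(0)
mic_a = [rng.gauss(0.0, 1.0) for _ in range(256)]
mic_b = [0.0] * 7 + mic_a[:-7]
angle = doa_degrees(mic_a, mic_b, sample_rate=48_000, mic_spacing=0.10)
print(f"estimated angle: {angle:.1f} degrees")
```

With these assumed values, a 7-sample delay corresponds to an angle of about 30 degrees from broadside. A real four-microphone array in a reverberant room needs considerably more robust processing than this two-channel sketch.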
Finally, Yukai have added an excellent personalization feature that lets people create their own wake-word on Yukai Engineering’s website and download it to their device.
The “codama” voice interface development kit will be sold from 21st December 2018 via Yukai Engineering, MACNICA and Amazon.
Key features
- Small form factor development kit: 64 mm (W) x 55 mm (L) x 20 mm (H, including pin headers)
- Sample programs for NTT DOCOMO AI Agent API (available via Yukai Engineering, ux-xu.com)
- Works with Raspberry Pi
- Linear microphone array: 4 x Infineon XENSIV™ IM69D130
- Wake-word engine by Sensory
- Full duplex Acoustic Echo Cancellation (AEC)
- Barge-in
- Adaptive beamformer
- Dereverberation
- Noise suppression
- Automatic Gain Control (AGC)
- Direction of arrival (DOA) indication