Asr And Communication Processing; Xvf3510-Int - For Integrated Voice Interface Applications - XMOS VocalFusion XVF3510 User Manual


of. A reference copy of the audio is provided to the AEC in order for it to accurately estimate the echo.

• Automatic Delay Estimation & Control (ADEC): Automatically monitors and compensates for the delay between the reference audio and the echo received by the microphone.

Following echo cancellation, the ASR and communications paths diverge to permit parameter tuning appropriate for the individual audio output use cases.

• Interference Cancellation (IC): Suppresses static noise from point sources such as cooker hoods, washing machines, or radios for which there is no reference audio signal available.

• Voice Activity Detection (VAD): Controls adaptation of the IC and AGC to optimise output for near-end speech.

• Noise Suppression (NS): Suppresses diffuse noise from sources whose frequency characteristics do not change rapidly over time (i.e., diffuse stationary noise).

• Automatic Gain Control (AGC): Controls the audio output level via separate AGC channels for Automatic Speech Recognition (ASR) and communications output. The VAD is used to prevent gain changes during speech to improve speech recognition performance.

The pipeline has been designed to minimise the need to tune and modify these functions. However, if tuning is required for specific use cases, later sections of this document provide details of the relevant parameters and processes.
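The interaction between the VAD and the AGC described above can be illustrated with a toy model. The sketch below is not the XVF3510 algorithm: the function name `vad_gated_agc`, the RMS target, the smoothing step and the gain limit are all assumptions chosen for illustration. It only shows the stated behaviour, namely that the gain is held constant while the VAD flags speech and is allowed to adapt otherwise.

```python
import math


def vad_gated_agc(frames, vad_flags, target_rms=0.1, step=0.1, max_gain=32.0):
    """Toy AGC whose gain adaptation is frozen while speech is flagged.

    frames     -- list of frames, each a list of float samples (hypothetical format)
    vad_flags  -- list of booleans, True where the VAD reports near-end speech
    Returns (scaled_frames, gain_per_frame).
    """
    gain = 1.0
    scaled_frames = []
    gains = []
    for frame, speech in zip(frames, vad_flags):
        if not speech and frame:
            # Adapt only when no speech is flagged (assumed policy for this sketch).
            rms = math.sqrt(sum(s * s for s in frame) / len(frame))
            if rms > 0.0:
                desired = min(target_rms / rms, max_gain)
                # Move a fraction of the way towards the desired gain.
                gain += step * (desired - gain)
        # During speech the gain from the previous frame is reused unchanged.
        scaled_frames.append([s * gain for s in frame])
        gains.append(gain)
    return scaled_frames, gains
```

Holding the gain steady across an utterance keeps the level within that utterance consistent, which is the property the manual associates with improved speech recognition performance.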

2.3. ASR AND COMMUNICATION PROCESSING

The audio pipeline discussed above produces two separate audio streams, one specifically tuned for integration with keyword and ASR services and the other designed for conferencing and communication applications. Both processed audio streams are available to be output at the same time using the left and right channels of USB and I2S. The default configuration is as follows:

Table 2-1 Default channel mapping (both USB and I2S)

CHANNEL        DEFAULT
[0] - Left     Automatic Speech Recognition (ASR) optimised
[1] - Right    Communications

In situations where an ASR is used to invoke a call, it may be necessary to continually monitor the ASR channel for an 'end call' intent. The parallel output of both ASR and Communications processed streams allows the combination of high-quality calling audio with the tuned ASR capability.

The IO_MAP configuration parameter (see Signal flow and processing section) allows users to also configure both channels to be ASR or Communications if required.

2.4. XVF3510-INT - FOR INTEGRATED VOICE INTERFACE APPLICATIONS

The XVF3510-INT product embeds the core audio processing pipeline in an audio infrastructure that supports rate conversion, filtering and signal routing. This infrastructure is controllable by the host system via a set of control registers. In addition, the XVF3510-INT provides a set of peripheral interfaces that connect the host system to other devices, e.g. digital inputs, LEDs, SPI peripherals, etc.

The peripheral interfaces supported include an interface to an optional QSPI Flash device containing the XVF3510 firmware and configuration information that is loaded by the processor on start-up.

XM-014232-PC
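For host software that consumes both streams described in section 2.3, the default mapping in Table 2-1 means the ASR and communications audio arrive as the left and right channels of one stereo stream. The following is a minimal sketch of separating them on the host. It assumes interleaved 16-bit signed PCM in the host's native byte order, which is an assumption about the capture format rather than something stated in this manual. The names `split_stereo`, `ASR_CHANNEL` and `COMMS_CHANNEL` are hypothetical.

```python
from array import array

# Default mapping from Table 2-1; swap these if IO_MAP has been changed.
ASR_CHANNEL = 0    # [0] - Left
COMMS_CHANNEL = 1  # [1] - Right


def split_stereo(pcm_bytes):
    """Split interleaved stereo PCM into (asr_samples, comms_samples).

    Assumes 16-bit signed samples, interleaved L/R, native byte order.
    """
    samples = array("h")
    samples.frombytes(pcm_bytes)  # raises ValueError on a partial sample
    if len(samples) % 2 != 0:
        raise ValueError("stereo buffer must hold an even number of samples")
    # Every second sample belongs to the same channel.
    return samples[ASR_CHANNEL::2], samples[COMMS_CHANNEL::2]
```

With this split, the ASR channel can be passed to a keyword or speech recognition engine while the communications channel is sent to the call, as in the 'end call' scenario described in section 2.3.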
