
Boosting Performance: A Comparative Analysis of NVIDIA Riva 2.1.0 vs Riva 1.10.0

Vladimir Nechaev

In a recent milestone, NVIDIA unveiled Riva Speech Skills release 2.1.0, packed with new features and enhancements, including a noteworthy improvement in Conformer ASR latency and throughput (see the full release notes for details). This caught our attention at Data Monsters, as we've been using the Conformer-CTC model in streaming mode for one of our ongoing projects, where performance is of paramount importance. Eager to explore the possibilities, we ran a series of tests to assess the performance gains.

Fig. 1: Unleashing the Power - Conformer-CTC Latency vs Throughput

Above, you can see the results of our performance comparison between the Riva 2.1.0 and Riva 1.10.0-beta versions of the Conformer-CTC model in streaming mode, measured on a single Tesla V100 GPU. To ensure consistency, we used a pretrained model with default riva-build options.

The models were built with Riva 1.10.0-beta and Riva 2.1.0 in three configurations: 'low_latency', 'intermediate', and 'high_throughput', denoting chunk sizes of 160, 400, and 800, respectively. The graph shows improvements in every configuration, with the most pronounced effect on the number of concurrent streams each configuration can serve effectively.
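To make the three configurations concrete, here is a minimal Python sketch of how chunk size relates to the number of audio chunks the server has to process. It assumes the chunk sizes above are in milliseconds, which the article does not state explicitly; `StreamingConfig` and `chunks_per_second_total` are hypothetical names used only for illustration and are not part of Riva.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamingConfig:
    """One of the three build configurations compared in the article."""
    name: str
    # ASSUMPTION: chunk sizes are in milliseconds (the article gives no unit).
    chunk_size_ms: int

    def chunks_per_second(self) -> float:
        """Chunks one real-time audio stream produces per second."""
        return 1000.0 / self.chunk_size_ms


# Configuration names and chunk sizes as listed in the article.
CONFIGS = [
    StreamingConfig("low_latency", 160),
    StreamingConfig("intermediate", 400),
    StreamingConfig("high_throughput", 800),
]


def chunks_per_second_total(config: StreamingConfig, num_streams: int) -> float:
    """Total chunks per second arriving from num_streams real-time streams."""
    return config.chunks_per_second() * num_streams


if __name__ == "__main__":
    for cfg in CONFIGS:
        total = chunks_per_second_total(cfg, 128)
        print(f"{cfg.name}: {total:.1f} chunks/s at 128 streams")
```

Under that assumption, a smaller chunk size means each stream submits audio more often, which is one plausible reason the low-latency configuration trades throughput for responsiveness.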

For up to approximately 20 streams, there is little discernible difference. As the number of streams increases, however, Riva 2.1.0 shows noticeably lower latency at the same RTFX and stream count. Take, for instance, the high-throughput model with 128 audio streams: its latency is approximately 100 milliseconds lower in Riva 2.1.0 than in Riva 1.10.0-beta at the same stream count.
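For readers unfamiliar with the metrics, the following Python sketch shows how RTFX and a latency difference between two versions could be computed. RTFX is taken here as the total duration of audio processed divided by the wall-clock time spent processing it. The function names and the sample numbers are hypothetical placeholders, not measurements from our tests.

```python
from statistics import mean
from typing import Sequence


def rtfx(total_audio_seconds: float, wall_clock_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio processed per second of wall time."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    return total_audio_seconds / wall_clock_seconds


def latency_improvement_ms(old_ms: Sequence[float], new_ms: Sequence[float]) -> float:
    """Mean latency of the old version minus mean latency of the new version.

    A positive result means the new version is faster.
    """
    return mean(old_ms) - mean(new_ms)


if __name__ == "__main__":
    # Placeholder values for illustration only; these are NOT benchmark results.
    print(rtfx(total_audio_seconds=600.0, wall_clock_seconds=10.0))
    print(latency_improvement_ms([300, 320, 310], [200, 220, 210]))
```

Comparing latency only between runs with the same stream count and similar RTFX, as in the figure, keeps the comparison fair: a version that is faster merely because it processes less audio would show up as a lower RTFX.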

If your workload involves 20 or more concurrent streams, the upgrade to Riva 2.1.0 promises tangible performance improvements. However, it's essential to consider other changes accompanying the upgrade when making your decision. Rest assured, the future of accelerated speech processing is bright with NVIDIA Riva at the helm.

Remember, this is just a glimpse into the exciting world of performance enhancements offered by NVIDIA Riva 2.1.0. Explore the comprehensive release notes and dive into the multitude of possibilities that lie ahead.

Written by Anna Mosolova, Vladimir Nechaev, and Marina Molchanova - Data Scientists at Data Monsters.
