Boosting Performance: A Comparative Analysis of NVIDIA Riva 2.1.0 vs Riva 1.10.0
NVIDIA recently released Riva Speech Skills 2.1.0, packed with new features and enhancements (full release notes can be found here), including a noteworthy improvement in Conformer ASR latency and throughput. This caught our attention at Data Monsters, because we use the Conformer-CTC model in streaming mode for one of our ongoing projects, where performance is critical. Eager to see what the release could deliver, we ran a series of tests to measure the gains.
Above, you can see the results of our performance comparison between the Riva 2.1.0 and Riva 1.10.0-beta versions of the Conformer-CTC model in streaming mode, measured on a single Tesla V100 GPU. To keep the comparison consistent, we used a pretrained model with default riva-build options.
We built the models with both Riva 1.10.0-beta and Riva 2.1.0 in three configurations: 'low_latency', 'intermediate', and 'high_throughput', which correspond to chunk sizes of 160, 400, and 800, respectively. The graph shows performance improvements in every configuration, with a more pronounced impact on the effective number of streams.
Up to approximately 20 streams, there is little discernible difference. As the number of streams grows, however, Riva 2.1.0 shows noticeably lower latency at the same RTFX and stream count. Take, for instance, the high-throughput model with 128 audio streams: its latency is approximately 100 milliseconds lower in Riva 2.1.0 than in Riva 1.10.0-beta at the same stream count.
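For readers who want to run a similar comparison, here is a minimal Python sketch of the two quantities discussed above. It assumes the common definition of RTFX as seconds of audio processed per second of wall-clock time. The function names and the sample numbers are hypothetical illustrations, not part of Riva and not taken from our measurements.

```python
def rtfx(num_streams: int, audio_seconds_per_stream: float, wall_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio processed per wall-clock second.

    Assumes every stream carries the same amount of audio (a simplification).
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return num_streams * audio_seconds_per_stream / wall_seconds


def latency_gain_ms(old_latency_ms: float, new_latency_ms: float) -> float:
    """Latency reduction in milliseconds; a positive value means the new version is faster."""
    return old_latency_ms - new_latency_ms


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the arithmetic.
    print(rtfx(num_streams=128, audio_seconds_per_stream=10.0, wall_seconds=20.0))
    print(latency_gain_ms(old_latency_ms=350.0, new_latency_ms=250.0))
```

Comparing the latency of two versions only makes sense when both runs use the same stream count and reach a similar RTFX, which is why both values are computed side by side here.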
If your workload involves 20 or more concurrent streams, upgrading to Riva 2.1.0 should bring tangible performance improvements. However, the upgrade comes with other changes, so weigh those as well when making your decision. Rest assured, the future of accelerated speech processing is bright with NVIDIA Riva at the helm.
Remember, this is just a glimpse into the exciting world of performance enhancements offered by NVIDIA Riva 2.1.0. Explore the comprehensive release notes and dive into the multitude of possibilities that lie ahead.
Written by Anna Mosolova, Vladimir Nechaev, and Marina Molchanova - Data Scientists at Data Monsters.