{"id":245116,"date":"2024-07-18T19:47:58","date_gmt":"2024-07-18T19:47:58","guid":{"rendered":"https:\/\/michigandigitalnews.com\/index.php\/2024\/07\/18\/optimizing-ivf-pq-performance-with-rapids-cuvs-key-tuning-techniques\/"},"modified":"2025-06-25T17:14:27","modified_gmt":"2025-06-25T17:14:27","slug":"optimizing-ivf-pq-performance-with-rapids-cuvs-key-tuning-techniques","status":"publish","type":"post","link":"https:\/\/michigandigitalnews.com\/index.php\/2024\/07\/18\/optimizing-ivf-pq-performance-with-rapids-cuvs-key-tuning-techniques\/","title":{"rendered":"Optimizing IVF-PQ Performance with RAPIDS cuVS: Key Tuning Techniques"},"content":{"rendered":"<div>\n<figure class=\"figure mt-2\">\n<p><a href=\"https:\/\/blockchain.news\/Profile\/Tony-Kim\">Tony Kim<\/a> <span class=\"publication-date ml-2\"> Jul 18, 2024 19:39<\/span><\/p>\n<p class=\"lead\">Learn how to optimize the IVF-PQ algorithm for vector search performance using RAPIDS cuVS, with practical tips on tuning hyper-parameters and improving recall.<\/p>\n<p><a href=\"https:\/\/image.blockchain.news:443\/features\/D8E08E86F8EDBDDCD68414CF49BDD8B1401B11A69515DFF98E6B2B03EE9CF9D7.jpg\"><img decoding=\"async\" class=\"rounded\" src=\"https:\/\/image.blockchain.news:443\/features\/D8E08E86F8EDBDDCD68414CF49BDD8B1401B11A69515DFF98E6B2B03EE9CF9D7.jpg\" alt=\"Optimizing IVF-PQ Performance with RAPIDS cuVS: Key Tuning Techniques\"\/><\/a><\/p>\n<\/figure>\n<p>In the first part of the series, an overview of the IVF-PQ algorithm was presented, explaining its foundation on the IVF-Flat algorithm and the use 
of Product Quantization (PQ) to compress the index and support larger datasets. Part two turns to the practical side of tuning IVF-PQ performance, which is crucial for good results, especially on billion-scale datasets.<\/p>\n<h2>Tuning Parameters for Index Building<\/h2>\n<p>IVF-PQ shares its coarse-level indexing and search hyper-parameters with IVF-Flat, and adds parameters that control compression. One critical parameter is <code>n_lists<\/code>, which sets the number of partitions (inverted lists) into which the input dataset is clustered. Search performance depends on both the number of lists probed and their sizes. Experiments suggest that an <code>n_lists<\/code> value in the range of 10K to 50K yields good performance across recall levels, though this can vary with the dataset.<\/p>\n<p>Another crucial parameter is <code>pq_dim<\/code>, which controls compression. A good tuning technique is to start at one quarter of the number of features in the dataset and increase the value in steps. Figure 2 in the original blog post shows significant drops in QPS during this tuning, which can be attributed to factors such as increased compute work and higher shared memory requirements per CUDA block.<\/p>\n<p>The <code>pq_bits<\/code> parameter, which ranges from 4 to 8, sets the number of bits used in each individual PQ code and therefore affects the codebook size and recall. Reducing <code>pq_bits<\/code> can improve search speed by allowing the look-up table (LUT) to fit in shared memory, although this comes at the cost of recall.<\/p>\n<h3>Additional Parameters<\/h3>\n<p>The <code>codebook_kind<\/code> parameter determines how the codebooks for the second-level quantizer are constructed: either one per subspace or one per cluster. The choice between these options can affect training time, GPU shared memory utilization, and recall. 
Parameters such as <code>kmeans_n_iters<\/code> and <code>kmeans_trainset_fraction<\/code> also affect index building, though they rarely need adjustment.<\/p>\n<h2>Tuning Parameters for Search<\/h2>\n<p>The <code>n_probes<\/code> parameter, discussed in the previous blog post on IVF-Flat, is essential for balancing search accuracy and throughput. IVF-PQ provides two additional search parameters: <code>internal_distance_dtype<\/code>, which controls the representation of distance or similarity during the search, and <code>lut_dtype<\/code>, which sets the datatype used to store the LUT. Adjusting these parameters can significantly affect performance, especially for high-dimensional datasets.<\/p>\n<h2>Improving Recall with Refinement<\/h2>\n<p>When parameter tuning alone is not enough to reach the desired recall, refinement offers a promising alternative. This separate operation, performed after the ANN search, recomputes exact distances for the selected candidates and reranks them. Refinement can significantly improve recall, as demonstrated in Figure 4 of the original blog post, though it requires access to the source dataset.<\/p>\n<h2>Summary<\/h2>\n<p>The series on accelerating vector search with inverted-file indexes covers two cuVS algorithms: IVF-Flat and IVF-PQ. IVF-PQ extends IVF-Flat with PQ compression, enabling faster searches and making it possible to handle billion-scale datasets with limited GPU memory. By fine-tuning the index-building and search parameters, data practitioners can get the best results efficiently. The RAPIDS cuVS library offers a range of vector search algorithms for various use cases, from exact search to low-accuracy, high-QPS ANN methods.<\/p>\n<p>For practical tuning of IVF-PQ parameters, refer to the <a rel=\"nofollow\" href=\"https:\/\/github.com\/rapidsai\/cuvs\/blob\/HEAD\/notebooks\/tutorial_ivf_pq.ipynb\">IVF-PQ notebook<\/a> on GitHub. 
For more details on the provided APIs, see the <a rel=\"nofollow\" href=\"https:\/\/docs.rapids.ai\/api\/cuvs\/nightly\/\">cuVS documentation<\/a>.<\/p>\n<p><span><i>Image source: Shutterstock<\/i><\/span><\/p>\n<\/div>\n<p><a href=\"https:\/\/blockchain.news\/news\/optimizing-ivf-pq-performance-rapids-cuvs\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Tony Kim Jul 18, 2024 19:39 Learn how to optimize the IVF-PQ algorithm for vector search performance using RAPIDS cuVS,<\/p>\n","protected":false},"author":1,"featured_media":245117,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[171],"tags":[],"_links":{"self":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/245116"}],"collection":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/comments?post=245116"}],"version-history":[{"count":0,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/posts\/245116\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/media\/245117"}],"wp:attachment":[{"href":"https:\/\/michigandigitaln
ews.com\/index.php\/wp-json\/wp\/v2\/media?parent=245116"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/categories?post=245116"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michigandigitalnews.com\/index.php\/wp-json\/wp\/v2\/tags?post=245116"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}