Vector Indexing and Optimization
1. Indexing Methods & Optimization Opportunities 易小萌
2. Background: Vector Search © 2020 Zilliz. All rights reserved.
3. Information Retrieval: from keywords to rich media
4. Embedding: represent rich media data as vectors
5. Vector Search: Indexing methods
6. Graph: HNSW/NSG/NGT
Reference: "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs"
7. Space Partition: IVF/LSH/Tree
Reference: "The Inverted Multi-Index"
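As an illustration of the IVF style of space partitioning, here is a minimal pure-Python sketch. It assumes squared Euclidean distance and a few rounds of plain k-means; the function names (`build_ivf`, `search_ivf`) are hypothetical and are not taken from the slides or from any library. `nlist` is the number of inverted lists and `nprobe` is how many of them a query scans.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(vectors, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
        lists[nearest].append(i)
    return lists

def build_ivf(vectors, nlist, iters=5, seed=0):
    """Train nlist centroids with a few k-means rounds, then build the lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iters):
        lists = assign(vectors, centroids)
        for j, members in enumerate(lists):
            if members:  # an empty cluster keeps its previous centroid
                cols = zip(*(vectors[i] for i in members))
                centroids[j] = [sum(col) / len(members) for col in cols]
    return centroids, assign(vectors, centroids)

def search_ivf(query, vectors, centroids, lists, nprobe, k):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda j: sq_dist(query, centroids[j]))
    candidates = [i for j in order[:nprobe] for i in lists[j]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]

# Toy usage on random data (sizes chosen arbitrarily for the example).
rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
centroids, lists = build_ivf(data, nlist=16)
top5 = search_ivf(data[0], data, centroids, lists, nprobe=4, k=5)
```

With `nprobe` equal to `nlist` the search degenerates to an exhaustive scan; smaller values trade recall for speed.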
8. Encoding: PQ/SQ
Reference: "Product quantization for nearest neighbor search"
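The idea behind product quantization can be sketched in a few lines of pure Python. This is a toy version under stated assumptions: the dimension is divisible by the number of subspaces `m`, each codebook is trained with a few rounds of plain k-means, and all function names are hypothetical. Each vector is replaced by `m` small integers, and query distances are approximated with per-subspace lookup tables (asymmetric distance computation).

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=5, seed=0):
    """A few rounds of plain k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: sq_dist(p, centroids[j]))].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def split(v, m):
    """Cut v into m contiguous sub-vectors (len(v) must be divisible by m)."""
    step = len(v) // m
    return [v[s * step:(s + 1) * step] for s in range(m)]

def train_pq(vectors, m, ksub):
    """Train one codebook of ksub centroids per subspace."""
    return [kmeans([split(v, m)[s] for v in vectors], ksub, seed=s) for s in range(m)]

def encode(v, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    return [min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
            for sub, cb in zip(split(v, len(codebooks)), codebooks)]

def adc_table(query, codebooks):
    """Distances from each query sub-vector to every centroid of its codebook."""
    return [[sq_dist(sub, c) for c in cb]
            for sub, cb in zip(split(query, len(codebooks)), codebooks)]

def adc_dist(code, table):
    """Approximate squared distance: one table lookup per subspace."""
    return sum(table[s][c] for s, c in enumerate(code))

# Toy usage: 8-dimensional vectors, 4 subspaces, 16 centroids each.
rng = random.Random(0)
data = [[rng.random() for _ in range(8)] for _ in range(200)]
codebooks = train_pq(data, m=4, ksub=16)
codes = [encode(v, codebooks) for v in data]
table = adc_table(data[0], codebooks)
approx = [adc_dist(c, table) for c in codes]
```

The lookup table is built once per query, so scoring each stored code costs only `m` additions.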
9. Comparison
Fast, accurate, and small, never reached at the same time…
[Figure: trade-off diagram with the goals Fast, Accurate, and Small; labels shown: HNSW, L&C, FLAT, ∅, IVF_PQ, IVF_SQ]
10. Observations
11. Larger nlist works with issues
Reference: "Revisiting the inverted indices for billion-scale approximate nearest neighbors"
12. Larger nlist works with issues
[Figure: Recall (y-axis, 0.2 to 1) versus Nprobe (x-axis, 1 to 32), with curves labeled 1024, 2048, and 4096]
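A recall-versus-nprobe curve of this kind could be measured with a small helper like the one below. This is a sketch under assumptions: recall is taken as the fraction of the exact top-k ids that the approximate search returns, and `search_fn(query, nprobe)` is a hypothetical callable standing in for whatever index is under test.

```python
def recall_at_k(found, truth):
    """Fraction of the true top-k neighbour ids present in the returned ids."""
    truth = set(truth)
    return len(truth & set(found)) / len(truth)

def recall_curve(search_fn, queries, truths, nprobes):
    """Mean recall per nprobe; search_fn(query, nprobe) returns a list of ids."""
    return {
        nprobe: sum(recall_at_k(search_fn(q, nprobe), t)
                    for q, t in zip(queries, truths)) / len(queries)
        for nprobe in nprobes
    }
```

Repeating the sweep for each nlist setting yields one curve per setting.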
13. Larger nlist works with issues
14. Filter and Validation works
[Figure: Search Time (y-axis, 0 to 10) versus Recall (x-axis, 0.7 to 1) for IVF_FLAT, IVF_SQ_FLAT, and IVF_PQ_FLAT]
15. Retrospect
• Larger nlist works with issues
• Filter and Validation works
[Figure: trade-off diagram with the goals Fast, Accurate, and Small; labels shown: HNSW, L&C, FLAT, ∅, IVF_PQ, IVF_SQ]
16. Idea: a three-layer framework
17. Layers: function decomposition brings optimization opportunity
Layer Function      | Data               | Size   | Candidates for a query | Requirement
Space Partition     | Clusters           | Small  | Full                   | Accurate, Fast
Candidate Filtering | Compressed vectors | Medium | Small portion          | Fast
Result Validation   | Original vectors   | Large  | Very small portion     | Accurate
18. Layers: function decomposition brings optimization opportunity
Layer Function      | Size   | Requirement    | Index Type (Adjustable) | Optimization Opportunity
Space Partition     | Small  | Accurate, Fast | HNSW                    | Cache-based optimization
Candidate Filtering | Medium | Fast           | SQ/PQ                   | Data locality, inter/intra query parallelism
Result Validation   | Large  | Accurate       | FLAT                    | SSD-based storage, compute-read pipeline
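The three layers can be composed roughly as in the sketch below. It is a toy under several assumptions and not the presenter's implementation: the partition layer scans randomly sampled centroids linearly (the table suggests an HNSW index there), the filtering layer uses 8-bit scalar quantization, and the validation layer reads the original vectors from memory rather than SSD. All function and parameter names (`build_index`, `search_three_layer`, `n_filter`) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_index(vectors, nlist, seed=0):
    """Build clusters (layer 1) and 8-bit scalar-quantized codes (layer 2)."""
    rng = random.Random(seed)
    # Simplification: sampled data points serve as centroids, no k-means.
    centroids = rng.sample(vectors, nlist)
    lists = [[] for _ in range(nlist)]
    for i, v in enumerate(vectors):
        nearest = min(range(nlist), key=lambda j: sq_dist(v, centroids[j]))
        lists[nearest].append(i)
    # Scalar quantization: map each dimension's [min, max] range onto 0..255.
    lo = [min(col) for col in zip(*vectors)]
    hi = [max(col) for col in zip(*vectors)]
    scale = [(h - l) / 255 or 1.0 for l, h in zip(lo, hi)]
    codes = [[round((x - l) / s) for x, l, s in zip(v, lo, scale)] for v in vectors]
    return {"centroids": centroids, "lists": lists,
            "lo": lo, "scale": scale, "codes": codes}

def search_three_layer(query, vectors, index, nprobe, n_filter, k):
    # Layer 1, space partition: choose the nprobe closest clusters.
    cents = index["centroids"]
    order = sorted(range(len(cents)), key=lambda j: sq_dist(query, cents[j]))
    candidates = [i for j in order[:nprobe] for i in index["lists"][j]]

    # Layer 2, candidate filtering: rank candidates by compressed vectors.
    def approx(i):
        decoded = [l + c * s for c, l, s in
                   zip(index["codes"][i], index["lo"], index["scale"])]
        return sq_dist(query, decoded)
    shortlist = sorted(candidates, key=approx)[:n_filter]

    # Layer 3, result validation: re-rank the shortlist with original vectors.
    return sorted(shortlist, key=lambda i: sq_dist(query, vectors[i]))[:k]

# Toy usage on random data (sizes chosen arbitrarily for the example).
rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(1000)]
index = build_index(data, nlist=32)
top10 = search_three_layer(data[0], data, index, nprobe=4, n_filter=50, k=10)
```

Only the shortlist of `n_filter` ids ever touches the original vectors, which is what makes it plausible to keep that layer on slower storage.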
19. Q&A