While running comparisons between HNSW and DiskANN on Milvus, I noticed that DiskANN was outperforming HNSW in both queries per second and recall at smaller dataset sizes (1M and 10M vectors, 768 dims).
While trying to understand why, I looked at the p95 and p99 latency data. As expected, HNSW had much lower p95 and p99 latency than DiskANN, since HNSW keeps everything in DRAM, whereas DiskANN stores only the compressed vectors there. Given the better tail latency, one would expect this to carry over to queries per second, with HNSW winning on throughput as well.
When moving to 100M vectors (768 dims), on a system with enough DRAM to still hold that much data, performance did shift in HNSW's favor. At this scale, the penalty of DiskANN's higher latency and additional random disk I/O against solid-state storage shows up as lower queries per second: the algorithm's efficiency is no longer enough to overcome the hardware latency gap between SSD and DRAM.
However, because DiskANN's underlying Vamana graph performs a single-layer search (as opposed to HNSW's multi-layer search), DiskANN can outperform HNSW at smaller dataset sizes thanks to its more efficient search over compressed vectors held in DRAM. But this outcome isn't obvious from the data VectorDBBench itself presents. It made me wonder: is the average latency between my HNSW and DiskANN runs actually quite similar, which would explain DiskANN outperforming at smaller dataset sizes? I'm unable to verify this, since only the p95 and p99 latency results are available for comparison.
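To illustrate why the average would tell a different story than the tails: for a fixed number of concurrent workers, throughput tracks mean latency (Little's law), not tail percentiles. The synthetic sketch below (illustrative numbers only, not my actual benchmark data) shows two latency distributions with nearly identical means, and hence nearly identical per-worker QPS, even though one has a far worse p99:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical per-query latencies in seconds. "hnsw" is tight around 10 ms;
# "diskann" is slightly faster on average but 2% of queries hit a slow ~40 ms
# disk path, inflating its tail.
hnsw = rng.normal(loc=0.0100, scale=0.001, size=n).clip(min=0.001)
diskann = np.where(
    rng.random(n) < 0.02,
    rng.normal(0.0400, 0.005, n),   # occasional slow disk reads
    rng.normal(0.0094, 0.001, n),   # fast common case
).clip(min=0.001)

for name, lat in [("hnsw", hnsw), ("diskann", diskann)]:
    print(
        f"{name:8s} mean={lat.mean() * 1e3:5.2f} ms  "
        f"p95={np.percentile(lat, 95) * 1e3:5.2f} ms  "
        f"p99={np.percentile(lat, 99) * 1e3:5.2f} ms  "
        f"QPS/worker≈{1 / lat.mean():.0f}"
    )
```

Looking only at p95/p99 here, DiskANN appears clearly worse, yet per-worker throughput is essentially the same, which is the effect I suspect is hiding in my runs.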
Thus, my request: report average latency alongside p95 and p99, so users get a fuller picture of a vector database's latency behavior and its potential impact on queries per second. Even better, allowing a configurable list of percentiles (plus the average) to be passed to vectordbbench command-line invocations would make latency comparisons far more informative. Please let me know if this is a feasible request. Thanks.
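For concreteness, a minimal sketch of what such a summary could look like; the function name, signature, and keys are illustrative, not VectorDBBench's actual API:

```python
import numpy as np


def latency_summary(latencies_s, percentiles=(95, 99)):
    """Summarize per-query latencies (in seconds) as an average plus a
    configurable list of percentiles. Purely illustrative of the request."""
    lat = np.asarray(latencies_s, dtype=float)
    summary = {"avg": float(lat.mean())}
    for p in percentiles:
        summary[f"p{p}"] = float(np.percentile(lat, p))
    return summary


# Example: one slow outlier barely moves the average but dominates p99.
print(latency_summary([0.010, 0.012, 0.011, 0.050], percentiles=(95, 99)))
```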