Performance Tuning¶
This page provides guidance on how to optimize HNSW for your specific use case, balancing search speed, memory usage, and accuracy.
Overview¶
HNSW performance is influenced by several factors:
- Graph construction parameters - Control the structure and connectivity of the graph
- Search parameters - Determine the trade-off between search speed and accuracy
- Hardware considerations - CPU, memory, and cache utilization
- Data characteristics - Dimensionality, distribution, and size of the dataset
By tuning these parameters, you can optimize HNSW for your specific requirements.
Key Performance Parameters¶
M (Maximum Number of Connections)¶
The M parameter controls the maximum number of connections per node in the graph:
- Higher M values (e.g., 32-64):
  - Improve search accuracy
  - Increase memory usage
  - Slow down graph construction
  - Potentially improve search speed (up to a point)
- Lower M values (e.g., 5-10):
  - Reduce memory usage
  - Speed up graph construction
  - May reduce search accuracy
Recommended values:
- For high-dimensional data (>100 dimensions): 16-32
- For medium-dimensional data (20-100 dimensions): 12-16
- For low-dimensional data (<20 dimensions): 5-10
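M is fixed at construction time. A minimal sketch using the WithM option that appears in the examples later on this page, configured for medium-dimensional data:
// Configure M for a medium-dimensional dataset (20-100 dimensions)
graph := hnsw.NewGraph[int](
    hnsw.WithM(16),
)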
ef Construction¶
The efConstruction parameter controls the size of the dynamic candidate list during graph construction:
- Higher efConstruction values (e.g., 200-500):
  - Improve graph quality and search accuracy
  - Slow down graph construction
  - No impact on search speed or memory usage after construction
- Lower efConstruction values (e.g., 40-100):
  - Speed up graph construction
  - May reduce search accuracy
Recommended values:
- For high recall requirements (>95%): 200-500
- For balanced performance: 100-200
- For fast construction with acceptable accuracy: 40-100
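Like M, efConstruction is set once when the graph is created. A minimal sketch using the WithEfConstruction option shown in the examples later on this page, tuned for balanced performance:
// Build the graph with a balanced efConstruction value
graph := hnsw.NewGraph[int](
    hnsw.WithM(16),
    hnsw.WithEfConstruction(150),
)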
ef Search¶
The ef parameter controls the size of the dynamic candidate list during search:
- Higher ef values (e.g., 100-1000):
  - Improve search accuracy (recall)
  - Slow down search
- Lower ef values (e.g., 10-50):
  - Speed up search
  - May reduce search accuracy
Recommended values:
- For high recall requirements (>95%): ef = 100-200
- For balanced performance: ef = 50-100
- For fast search with acceptable accuracy: ef = 10-50
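Unlike M and efConstruction, ef can be varied per query, so the same graph can serve both fast and high-recall workloads. A minimal sketch using the SearchWithEF call from the benchmarking example below:
// Fast, lower-recall lookup
fastResults := graph.SearchWithEF(query, 10, 20)

// Slower, high-recall lookup on the same graph
accurateResults := graph.SearchWithEF(query, 10, 200)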
Batch Size¶
When adding multiple vectors, the batch size can impact performance:
// Add vectors in batches
batchSize := 1000
for i := 0; i < len(vectors); i += batchSize {
    end := i + batchSize
    if end > len(vectors) {
        end = len(vectors)
    }
    graph.BatchAdd(vectors[i:end], ids[i:end])
}
- Larger batch sizes (e.g., 1000-10000):
  - Better memory locality and CPU cache utilization
  - More efficient parallel processing
  - Higher memory usage during batch processing
- Smaller batch sizes (e.g., 100-1000):
  - Lower memory usage
  - Potentially slower overall construction
Recommended values:
- For most systems: 1000-5000 vectors per batch
- For memory-constrained systems: 100-1000 vectors per batch
Performance Benchmarking¶
To find the optimal parameters for your specific use case, it's recommended to run benchmarks:
func benchmarkSearch(graph *hnsw.Graph[int], queries []Vector, groundTruth [][]int, k int, efValues []int) {
    for _, ef := range efValues {
        startTime := time.Now()
        var results [][]Node
        for _, query := range queries {
            results = append(results, graph.SearchWithEF(query, k, ef))
        }
        duration := time.Since(startTime)
        avgTime := duration.Seconds() / float64(len(queries))
        // Calculate recall against the known true neighbors (see the calculateRecall sketch below)
        recall := calculateRecall(results, groundTruth)
        fmt.Printf("ef=%d: avg_time=%.6fs, recall=%.4f\n", ef, avgTime, recall)
    }
}
This benchmark helps you find the optimal ef value for your requirements.
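The calculateRecall helper is not provided by the library. A minimal sketch, assuming ground truth is given as the IDs of each query's true nearest neighbors and that Node exposes an ID field (adjust the field name to your Node type):
// calculateRecall returns the fraction of true neighbors that appear in the
// returned results, averaged over all queries.
func calculateRecall(results [][]Node, groundTruth [][]int) float64 {
    if len(results) == 0 {
        return 0
    }
    var total float64
    for i, nodes := range results {
        truth := make(map[int]bool, len(groundTruth[i]))
        for _, id := range groundTruth[i] {
            truth[id] = true
        }
        found := 0
        for _, node := range nodes {
            if truth[node.ID] {
                found++
            }
        }
        if len(groundTruth[i]) > 0 {
            total += float64(found) / float64(len(groundTruth[i]))
        }
    }
    return total / float64(len(results))
}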
Memory Optimization¶
Compact Storage¶
HNSW can use a significant amount of memory, especially for large datasets. Compact storage reduces memory usage by:
- Storing vectors more efficiently
- Using less memory for graph connectivity
- Potentially improving cache locality
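A minimal sketch of enabling compact storage, using the WithCompactVectors option that also appears in the recommendation-system example below (assuming the option is available in your version of the library):
// Enable compact vector storage to reduce memory usage
graph := hnsw.NewGraph[int](
    hnsw.WithCompactVectors(true),
)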
Quantization¶
For very large datasets, you can use vector quantization. Quantization:
- Reduces memory usage by converting 32-bit floats to lower precision (e.g., 8-bit integers)
- May reduce search accuracy
- Improves cache locality and search speed
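A minimal sketch of enabling 8-bit quantization, using the WithQuantization option from the recommendation-system example below (again, assuming the option exists in your version of the library):
// Quantize vectors to 8-bit integers to cut vector memory roughly 4x
graph := hnsw.NewGraph[int](
    hnsw.WithQuantization(8),
)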
External Storage¶
For extremely large datasets that don't fit in memory, you can use external storage:
// Use memory-mapped storage for vectors
storage := hnsw.NewMMapVectorStorage("vectors.bin")
graph := hnsw.NewGraph[int](hnsw.WithExternalVectorStorage(storage))
External storage:
- Allows datasets larger than available RAM
- May slow down search due to disk I/O
- Can be combined with caching strategies for frequently accessed vectors
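One way to combine external storage with caching is to put a small in-process cache in front of the backing store. The sketch below assumes a hypothetical VectorStorage interface with a Get method; it is illustrative only and not part of any specific library API:
// Hypothetical storage interface: anything that can fetch a vector by ID.
type VectorStorage interface {
    Get(id int) []float32
}

// CachedStorage keeps recently fetched vectors in memory in front of a
// slower backend (e.g., memory-mapped or remote storage).
type CachedStorage struct {
    backend VectorStorage
    mu      sync.Mutex
    cache   map[int][]float32 // unbounded for simplicity; use an LRU in production
}

func NewCachedStorage(backend VectorStorage) *CachedStorage {
    return &CachedStorage{backend: backend, cache: make(map[int][]float32)}
}

func (c *CachedStorage) Get(id int) []float32 {
    c.mu.Lock()
    defer c.mu.Unlock()
    if v, ok := c.cache[id]; ok {
        return v
    }
    v := c.backend.Get(id)
    c.cache[id] = v
    return v
}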
CPU Optimization¶
SIMD Acceleration¶
HNSW can benefit from SIMD (Single Instruction, Multiple Data) instructions for distance calculations:
// Enable SIMD acceleration for distance calculations
graph := hnsw.NewGraph[int](hnsw.WithSIMD(true))
SIMD acceleration:
- Speeds up distance calculations by processing multiple vector elements in parallel
- Works best for Euclidean and cosine distance
- May not be available on all platforms
Thread Pool¶
For concurrent operations, using a thread pool can improve performance:
// Use a thread pool with 8 workers
pool := hnsw.NewThreadPool(8)
graph := hnsw.NewGraph[int](hnsw.WithThreadPool(pool))
A thread pool:
- Reduces thread creation overhead
- Allows better control of parallelism
- Can be shared across multiple HNSW graphs
Scaling Strategies¶
Sharding¶
For very large datasets, you can use sharding to distribute the data across multiple HNSW graphs:
// Create multiple HNSW graphs (shards)
shards := make([]*hnsw.Graph[int], numShards)
for i := range shards {
    shards[i] = hnsw.NewGraph[int]()
}

// Add vectors to shards based on some partitioning strategy
for i, vector := range vectors {
    shardIndex := getShardIndex(vector) // Custom sharding function (see the sketch below)
    shards[shardIndex].Add(vector, ids[i])
}
// Search across all shards and merge results
func searchAcrossShards(shards []*hnsw.Graph[int], query Vector, k int) []Node {
    var allResults []Node

    // Search each shard concurrently
    var wg sync.WaitGroup
    resultChan := make(chan []Node, len(shards))
    for _, shard := range shards {
        wg.Add(1)
        go func(shard *hnsw.Graph[int]) {
            defer wg.Done()
            results := shard.Search(query, k)
            resultChan <- results
        }(shard)
    }

    // Wait for all searches to complete
    wg.Wait()
    close(resultChan)

    // Collect and merge results
    for results := range resultChan {
        allResults = append(allResults, results...)
    }

    // Sort and return top k
    sort.Slice(allResults, func(i, j int) bool {
        return allResults[i].Distance < allResults[j].Distance
    })
    if len(allResults) > k {
        allResults = allResults[:k]
    }
    return allResults
}
Sharding:
- Allows scaling beyond the memory limits of a single machine
- Enables parallel processing across shards
- May reduce search accuracy if not implemented carefully
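The getShardIndex function in the sharding example is application-specific. A minimal sketch using hash-based partitioning over the vector's bytes (standard library packages hash/fnv, encoding/binary, and math), assuming Vector is a slice of float32 and numShards is defined as above; real systems often shard by ID or by a clustering step instead:
// Hash-based partitioning: deterministic and roughly uniform, but unaware
// of the data distribution.
func getShardIndex(vector Vector) int {
    h := fnv.New32a()
    buf := make([]byte, 4)
    for _, v := range vector {
        binary.LittleEndian.PutUint32(buf, math.Float32bits(v))
        h.Write(buf)
    }
    return int(h.Sum32() % uint32(numShards))
}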
Distributed HNSW¶
For extremely large datasets, you can implement a distributed HNSW system:
// Example of a simple distributed HNSW client
type DistributedHNSW struct {
    shardClients []*ShardClient
}

func (d *DistributedHNSW) Search(query Vector, k int) []Node {
    var allResults []Node

    // Search each shard concurrently
    var wg sync.WaitGroup
    resultChan := make(chan []Node, len(d.shardClients))
    for _, client := range d.shardClients {
        wg.Add(1)
        go func(client *ShardClient) {
            defer wg.Done()
            results := client.Search(query, k)
            resultChan <- results
        }(client)
    }

    // Wait for all searches to complete
    wg.Wait()
    close(resultChan)

    // Collect and merge results
    for results := range resultChan {
        allResults = append(allResults, results...)
    }

    // Sort and return top k
    sort.Slice(allResults, func(i, j int) bool {
        return allResults[i].Distance < allResults[j].Distance
    })
    if len(allResults) > k {
        allResults = allResults[:k]
    }
    return allResults
}
Distributed HNSW:
- Scales to billions of vectors across multiple machines
- Requires careful design of the distribution strategy
- Introduces network latency for search operations
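The ShardClient type above is left undefined. A minimal sketch, assuming each shard is exposed over HTTP at a hypothetical /search endpoint; the transport, endpoint, and wire format are illustrative, not part of any library:
// ShardClient sends a search request to one shard over HTTP.
type ShardClient struct {
    baseURL string
    client  *http.Client
}

func (c *ShardClient) Search(query Vector, k int) []Node {
    body, err := json.Marshal(map[string]interface{}{"query": query, "k": k})
    if err != nil {
        return nil
    }
    resp, err := c.client.Post(c.baseURL+"/search", "application/json", bytes.NewReader(body))
    if err != nil {
        return nil // in production, propagate the error instead of swallowing it
    }
    defer resp.Body.Close()

    var results []Node
    if err := json.NewDecoder(resp.Body).Decode(&results); err != nil {
        return nil
    }
    return results
}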
Real-world Performance Tuning Examples¶
High-Recall Text Search¶
For applications requiring high recall, such as semantic text search:
// Configuration for high-recall text search
graph := hnsw.NewGraph[int](
    hnsw.WithM(16),
    hnsw.WithEfConstruction(300),
    hnsw.WithDistance(hnsw.CosineDistance),
)

// Add vectors (text embeddings)
for i, embedding := range embeddings {
    graph.Add(embedding, i)
}

// Search with high ef for better recall
results := graph.SearchWithEF(queryEmbedding, 10, 200)
Fast Image Similarity Search¶
For applications requiring fast search, such as real-time image similarity:
// Configuration for fast image similarity search
graph := hnsw.NewGraph[int](
    hnsw.WithM(8),
    hnsw.WithEfConstruction(100),
    hnsw.WithDistance(hnsw.EuclideanDistance),
    hnsw.WithSIMD(true),
)

// Add vectors (image features)
for i, features := range imageFeatures {
    graph.Add(features, i)
}

// Search with low ef for faster results
results := graph.SearchWithEF(queryFeatures, 5, 20)
Memory-Efficient Recommendation System¶
For large-scale recommendation systems with memory constraints:
// Configuration for memory-efficient recommendation system
graph := hnsw.NewGraph[int](
    hnsw.WithM(10),
    hnsw.WithEfConstruction(150),
    hnsw.WithCompactVectors(true),
    hnsw.WithQuantization(8),
)

// Add vectors in batches to manage memory usage
batchSize := 1000
for i := 0; i < len(itemVectors); i += batchSize {
    end := i + batchSize
    if end > len(itemVectors) {
        end = len(itemVectors)
    }
    graph.BatchAdd(itemVectors[i:end], itemIDs[i:end])
}

// Search with moderate ef for balanced performance
results := graph.SearchWithEF(userPreferences, 20, 50)
Monitoring and Profiling¶
To continuously optimize HNSW performance, implement monitoring and profiling:
// Simple search profiling
func profileSearch(graph *hnsw.Graph[int], query Vector, k int, ef int) {
    startTime := time.Now()
    results := graph.SearchWithEF(query, k, ef)
    duration := time.Since(startTime)
    fmt.Printf("Search stats: time=%.6fs, results=%d, ef=%d\n",
        duration.Seconds(), len(results), ef)
}

// Memory usage monitoring
func monitorMemoryUsage() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    fmt.Printf("Memory stats: alloc=%.2f MB, sys=%.2f MB, gc=%d\n",
        float64(m.Alloc)/1024/1024,
        float64(m.Sys)/1024/1024,
        m.NumGC)
}
Regular profiling helps identify performance bottlenecks and opportunities for optimization.
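For deeper analysis, Go's standard runtime/pprof package can capture a CPU profile around a batch of searches; this uses only the standard library and is independent of any HNSW-specific API:
// Capture a CPU profile while running a batch of searches.
f, err := os.Create("cpu.prof")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

if err := pprof.StartCPUProfile(f); err != nil {
    log.Fatal(err)
}
for _, query := range queries {
    graph.SearchWithEF(query, 10, 100)
}
pprof.StopCPUProfile()

// Inspect hot spots with: go tool pprof cpu.prof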
Next Steps¶
Now that you understand how to tune HNSW performance, you can explore:
- Graph Structure: Learn about the hierarchical structure of HNSW
- Search Algorithms: Dive deeper into the search algorithms used in HNSW
- Distance Functions: Understand how to choose the right distance function
- Graph Analyzer: Evaluate and diagnose the quality of your HNSW graphs
- Advanced Techniques: Explore sophisticated usage patterns