Performance Tuning¶

This page provides guidance on how to optimize HNSW for your specific use case, balancing search speed, memory usage, and accuracy.

Overview¶

HNSW performance is influenced by several factors:

Graph construction parameters - Control the structure and connectivity of the graph
Search parameters - Determine the trade-off between search speed and accuracy
Hardware considerations - CPU, memory, and cache utilization
Data characteristics - Dimensionality, distribution, and size of the dataset

By tuning these parameters, you can optimize HNSW for your specific requirements.

Key Performance Parameters¶

M (Maximum Number of Connections)¶

The M parameter controls the maximum number of connections per node in the graph:

graph := hnsw.NewGraph[int](hnsw.WithM(16))

Higher M values (e.g., 32-64):
Improve search accuracy
Increase memory usage
Slow down graph construction
Potentially improve search speed (up to a point)
Lower M values (e.g., 5-10):
Reduce memory usage
Speed up graph construction
May reduce search accuracy

Recommended values:

For high-dimensional data (>100 dimensions): 16-32
For medium-dimensional data (20-100 dimensions): 12-16
For low-dimensional data (<20 dimensions): 5-10

ef Construction¶

The efConstruction parameter controls the size of the dynamic candidate list during graph construction:

graph := hnsw.NewGraph[int](hnsw.WithEfConstruction(200))

Higher efConstruction values (e.g., 200-500):
Improve graph quality and search accuracy
Slow down graph construction
No impact on search speed or memory usage after construction
Lower efConstruction values (e.g., 40-100):
Speed up graph construction
May reduce search accuracy

Recommended values:

For high recall requirements (>95%): 200-500
For balanced performance: 100-200
For fast construction with acceptable accuracy: 40-100

ef Search¶

The ef parameter controls the size of the dynamic candidate list during search:

results := graph.SearchWithEF(query, 10, 100)

Higher ef values (e.g., 100-1000):
Improve search accuracy (recall)
Slow down search
Lower ef values (e.g., 10-50):
Speed up search
May reduce search accuracy

Recommended values:

For high recall requirements (>95%): ef = 100-200
For balanced performance: ef = 50-100
For fast search with acceptable accuracy: ef = 10-50

Batch Size¶

When adding multiple vectors, the batch size can impact performance:

// Add vectors in batches
for i := 0; i < len(vectors); i += batchSize {
    end := i + batchSize
    if end > len(vectors) {
        end = len(vectors)
    }
    graph.BatchAdd(vectors[i:end], ids[i:end])
}

Larger batch sizes (e.g., 1000-10000):
Better memory locality and CPU cache utilization
More efficient parallel processing
Higher memory usage during batch processing
Smaller batch sizes (e.g., 100-1000):
Lower memory usage
Potentially slower overall construction

Recommended values:

For most systems: 1000-5000 vectors per batch
For memory-constrained systems: 100-1000 vectors per batch

Performance Benchmarking¶

To find the optimal parameters for your specific use case, it's recommended to run benchmarks:

func benchmarkSearch(graph *hnsw.Graph, queries []Vector, k int, efValues []int) {
    for _, ef := range efValues {
        startTime := time.Now()

        var results [][]Node
        for _, query := range queries {
            results = append(results, graph.SearchWithEF(query, k, ef))
        }

        duration := time.Since(startTime)
        avgTime := duration.Seconds() / float64(len(queries))

        // Calculate recall if ground truth is available
        recall := calculateRecall(results, groundTruth)

        fmt.Printf("ef=%d: avg_time=%.6fs, recall=%.4f\n", ef, avgTime, recall)
    }
}

This benchmark helps you find the optimal ef value for your requirements.

Memory Optimization¶

Compact Storage¶

HNSW can use a significant amount of memory, especially for large datasets. To reduce memory usage:

// Use compact storage for vectors
graph := hnsw.NewGraph[int](hnsw.WithCompactVectors(true))

Compact storage reduces memory usage by:

Storing vectors more efficiently
Using less memory for graph connectivity
Potentially improving cache locality

Quantization¶

For very large datasets, you can use vector quantization to reduce memory usage:

// Use 8-bit quantization for vectors
graph := hnsw.NewGraph[int](hnsw.WithQuantization(8))

Quantization reduces memory usage by:

Converting 32-bit floats to lower precision (e.g., 8-bit integers)
Potentially reducing search accuracy
Improving cache locality and search speed

External Storage¶

For extremely large datasets that don't fit in memory, you can use external storage:

// Use memory-mapped storage for vectors
storage := hnsw.NewMMapVectorStorage("vectors.bin")
graph := hnsw.NewGraph[int](hnsw.WithExternalVectorStorage(storage))

External storage:

Allows datasets larger than available RAM
May slow down search due to disk I/O
Can be combined with caching strategies for frequently accessed vectors

CPU Optimization¶

SIMD Acceleration¶

HNSW can benefit from SIMD (Single Instruction, Multiple Data) instructions for distance calculations:

// Enable SIMD acceleration for distance calculations
graph := hnsw.NewGraph[int](hnsw.WithSIMD(true))

SIMD acceleration:

Speeds up distance calculations by processing multiple vector elements in parallel
Works best for Euclidean and cosine distance
May not be available on all platforms

Thread Pool¶

For concurrent operations, using a thread pool can improve performance:

// Use a thread pool with 8 workers
pool := hnsw.NewThreadPool(8)
graph := hnsw.NewGraph[int](hnsw.WithThreadPool(pool))

A thread pool:

Reduces thread creation overhead
Allows better control of parallelism
Can be shared across multiple HNSW graphs

Scaling Strategies¶

Sharding¶

For very large datasets, you can use sharding to distribute the data across multiple HNSW graphs:

// Create multiple HNSW graphs (shards)
shards := make([]*hnsw.Graph, numShards)
for i := range shards {
    shards[i] = hnsw.NewGraph[int]()
}

// Add vectors to shards based on some partitioning strategy
for i, vector := range vectors {
    shardIndex := getShardIndex(vector) // Custom sharding function
    shards[shardIndex].Add(vector, ids[i])
}

// Search across all shards and merge results
func searchAcrossShards(query Vector, k int) []Node {
    var allResults []Node

    // Search each shard concurrently
    var wg sync.WaitGroup
    resultChan := make(chan []Node, len(shards))

    for _, shard := range shards {
        wg.Add(1)
        go func(shard *hnsw.Graph) {
            defer wg.Done()
            results := shard.Search(query, k)
            resultChan <- results
        }(shard)
    }

    // Wait for all searches to complete
    wg.Wait()
    close(resultChan)

    // Collect and merge results
    for results := range resultChan {
        allResults = append(allResults, results...)
    }

    // Sort and return top k
    sort.Slice(allResults, func(i, j int) bool {
        return allResults[i].Distance < allResults[j].Distance
    })

    if len(allResults) > k {
        allResults = allResults[:k]
    }

    return allResults
}

Sharding:

Allows scaling beyond the memory limits of a single machine
Enables parallel processing across shards
May reduce search accuracy if not implemented carefully

Distributed HNSW¶

For extremely large datasets, you can implement a distributed HNSW system:

// Example of a simple distributed HNSW client
type DistributedHNSW struct {
    shardClients []*ShardClient
}

func (d *DistributedHNSW) Search(query Vector, k int) []Node {
    var allResults []Node

    // Search each shard concurrently
    var wg sync.WaitGroup
    resultChan := make(chan []Node, len(d.shardClients))

    for _, client := range d.shardClients {
        wg.Add(1)
        go func(client *ShardClient) {
            defer wg.Done()
            results := client.Search(query, k)
            resultChan <- results
        }(client)
    }

    // Wait for all searches to complete
    wg.Wait()
    close(resultChan)

    // Collect and merge results
    for results := range resultChan {
        allResults = append(allResults, results...)
    }

    // Sort and return top k
    sort.Slice(allResults, func(i, j int) bool {
        return allResults[i].Distance < allResults[j].Distance
    })

    if len(allResults) > k {
        allResults = allResults[:k]
    }

    return allResults
}

Distributed HNSW:

Scales to billions of vectors across multiple machines
Requires careful design of the distribution strategy
Introduces network latency for search operations

Real-world Performance Tuning Examples¶

High-Recall Text Search¶

For applications requiring high recall, such as semantic text search:

// Configuration for high-recall text search
graph := hnsw.NewGraph[int](
    hnsw.WithM(16),
    hnsw.WithEfConstruction(300),
    hnsw.WithDistance(hnsw.CosineDistance),
)

// Add vectors (text embeddings)
for i, embedding := range embeddings {
    graph.Add(embedding, i)
}

// Search with high ef for better recall
results := graph.SearchWithEF(queryEmbedding, 10, 200)

Fast Image Similarity Search¶

For applications requiring fast search, such as real-time image similarity:

// Configuration for fast image similarity search
graph := hnsw.NewGraph[int](
    hnsw.WithM(8),
    hnsw.WithEfConstruction(100),
    hnsw.WithDistance(hnsw.EuclideanDistance),
    hnsw.WithSIMD(true),
)

// Add vectors (image features)
for i, features := range imageFeatures {
    graph.Add(features, i)
}

// Search with low ef for faster results
results := graph.SearchWithEF(queryFeatures, 5, 20)

Memory-Efficient Recommendation System¶

For large-scale recommendation systems with memory constraints:

// Configuration for memory-efficient recommendation system
graph := hnsw.NewGraph[int](
    hnsw.WithM(10),
    hnsw.WithEfConstruction(150),
    hnsw.WithCompactVectors(true),
    hnsw.WithQuantization(8),
)

// Add vectors in batches to manage memory usage
batchSize := 1000
for i := 0; i < len(itemVectors); i += batchSize {
    end := i + batchSize
    if end > len(itemVectors) {
        end = len(itemVectors)
    }
    graph.BatchAdd(itemVectors[i:end], itemIDs[i:end])
}

// Search with moderate ef for balanced performance
results := graph.SearchWithEF(userPreferences, 20, 50)

Monitoring and Profiling¶

To continuously optimize HNSW performance, implement monitoring and profiling:

// Simple search profiling
func profileSearch(graph *hnsw.Graph, query Vector, k int, ef int) {
    startTime := time.Now()
    results := graph.SearchWithEF(query, k, ef)
    duration := time.Since(startTime)

    fmt.Printf("Search stats: time=%.6fs, results=%d, ef=%d\n", 
               duration.Seconds(), len(results), ef)
}

// Memory usage monitoring
func monitorMemoryUsage() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    fmt.Printf("Memory stats: alloc=%.2f MB, sys=%.2f MB, gc=%d\n",
               float64(m.Alloc)/1024/1024,
               float64(m.Sys)/1024/1024,
               m.NumGC)
}

Regular profiling helps identify performance bottlenecks and opportunities for optimization.

Next Steps¶

Now that you understand how to tune HNSW performance, you can explore:

Graph Structure: Learn about the hierarchical structure of HNSW
Search Algorithms: Dive deeper into the search algorithms used in HNSW
Distance Functions: Understand how to choose the right distance function
Graph Analyzer: Evaluate and diagnose the quality of your HNSW graphs
Advanced Techniques: Explore sophisticated usage patterns