Skip to content

Performance Tuning

This page provides guidance on how to optimize HNSW for your specific use case, balancing search speed, memory usage, and accuracy.

Overview

HNSW performance is influenced by several factors:

  1. Graph construction parameters - Control the structure and connectivity of the graph
  2. Search parameters - Determine the trade-off between search speed and accuracy
  3. Hardware considerations - CPU, memory, and cache utilization
  4. Data characteristics - Dimensionality, distribution, and size of the dataset

By tuning these parameters, you can optimize HNSW for your specific requirements.

Key Performance Parameters

M (Maximum Number of Connections)

The M parameter controls the maximum number of connections per node in the graph:

graph := hnsw.NewGraph[int](hnsw.WithM(16))
  • Higher M values (e.g., 32-64):
  • Improve search accuracy
  • Increase memory usage
  • Slow down graph construction
  • Potentially improve search speed (up to a point)

  • Lower M values (e.g., 5-10):

  • Reduce memory usage
  • Speed up graph construction
  • May reduce search accuracy

Recommended values:

  • For high-dimensional data (>100 dimensions): 16-32
  • For medium-dimensional data (20-100 dimensions): 12-16
  • For low-dimensional data (<20 dimensions): 5-10

ef Construction

The efConstruction parameter controls the size of the dynamic candidate list during graph construction:

graph := hnsw.NewGraph[int](hnsw.WithEfConstruction(200))
  • Higher efConstruction values (e.g., 200-500):
  • Improve graph quality and search accuracy
  • Slow down graph construction
  • No impact on search speed or memory usage after construction

  • Lower efConstruction values (e.g., 40-100):

  • Speed up graph construction
  • May reduce search accuracy

Recommended values:

  • For high recall requirements (>95%): 200-500
  • For balanced performance: 100-200
  • For fast construction with acceptable accuracy: 40-100

The ef parameter controls the size of the dynamic candidate list during search:

results := graph.SearchWithEF(query, 10, 100)
  • Higher ef values (e.g., 100-1000):
  • Improve search accuracy (recall)
  • Slow down search

  • Lower ef values (e.g., 10-50):

  • Speed up search
  • May reduce search accuracy

Recommended values:

  • For high recall requirements (>95%): ef = 100-200
  • For balanced performance: ef = 50-100
  • For fast search with acceptable accuracy: ef = 10-50

Batch Size

When adding multiple vectors, the batch size can impact performance:

// Add vectors in batches
for i := 0; i < len(vectors); i += batchSize {
    end := i + batchSize
    if end > len(vectors) {
        end = len(vectors)
    }
    graph.BatchAdd(vectors[i:end], ids[i:end])
}
  • Larger batch sizes (e.g., 1000-10000):
  • Better memory locality and CPU cache utilization
  • More efficient parallel processing
  • Higher memory usage during batch processing

  • Smaller batch sizes (e.g., 100-1000):

  • Lower memory usage
  • Potentially slower overall construction

Recommended values:

  • For most systems: 1000-5000 vectors per batch
  • For memory-constrained systems: 100-1000 vectors per batch

Performance Benchmarking

To find the optimal parameters for your specific use case, it's recommended to run benchmarks:

func benchmarkSearch(graph *hnsw.Graph, queries []Vector, k int, efValues []int) {
    for _, ef := range efValues {
        startTime := time.Now()

        var results [][]Node
        for _, query := range queries {
            results = append(results, graph.SearchWithEF(query, k, ef))
        }

        duration := time.Since(startTime)
        avgTime := duration.Seconds() / float64(len(queries))

        // Calculate recall if ground truth is available
        recall := calculateRecall(results, groundTruth)

        fmt.Printf("ef=%d: avg_time=%.6fs, recall=%.4f\n", ef, avgTime, recall)
    }
}

This benchmark helps you find the optimal ef value for your requirements.

Memory Optimization

Compact Storage

HNSW can use a significant amount of memory, especially for large datasets. To reduce memory usage:

// Use compact storage for vectors
graph := hnsw.NewGraph[int](hnsw.WithCompactVectors(true))

Compact storage reduces memory usage by:

  • Storing vectors more efficiently
  • Using less memory for graph connectivity
  • Potentially improving cache locality

Quantization

For very large datasets, you can use vector quantization to reduce memory usage:

// Use 8-bit quantization for vectors
graph := hnsw.NewGraph[int](hnsw.WithQuantization(8))

Quantization reduces memory usage by:

  • Converting 32-bit floats to lower precision (e.g., 8-bit integers)
  • Potentially reducing search accuracy
  • Improving cache locality and search speed

External Storage

For extremely large datasets that don't fit in memory, you can use external storage:

// Use memory-mapped storage for vectors
storage := hnsw.NewMMapVectorStorage("vectors.bin")
graph := hnsw.NewGraph[int](hnsw.WithExternalVectorStorage(storage))

External storage:

  • Allows datasets larger than available RAM
  • May slow down search due to disk I/O
  • Can be combined with caching strategies for frequently accessed vectors

CPU Optimization

SIMD Acceleration

HNSW can benefit from SIMD (Single Instruction, Multiple Data) instructions for distance calculations:

// Enable SIMD acceleration for distance calculations
graph := hnsw.NewGraph[int](hnsw.WithSIMD(true))

SIMD acceleration:

  • Speeds up distance calculations by processing multiple vector elements in parallel
  • Works best for Euclidean and cosine distance
  • May not be available on all platforms

Thread Pool

For concurrent operations, using a thread pool can improve performance:

// Use a thread pool with 8 workers
pool := hnsw.NewThreadPool(8)
graph := hnsw.NewGraph[int](hnsw.WithThreadPool(pool))

A thread pool:

  • Reduces thread creation overhead
  • Allows better control of parallelism
  • Can be shared across multiple HNSW graphs

Scaling Strategies

Sharding

For very large datasets, you can use sharding to distribute the data across multiple HNSW graphs:

// Create multiple HNSW graphs (shards)
shards := make([]*hnsw.Graph, numShards)
for i := range shards {
    shards[i] = hnsw.NewGraph[int]()
}

// Add vectors to shards based on some partitioning strategy
for i, vector := range vectors {
    shardIndex := getShardIndex(vector) // Custom sharding function
    shards[shardIndex].Add(vector, ids[i])
}

// Search across all shards and merge results
func searchAcrossShards(query Vector, k int) []Node {
    var allResults []Node

    // Search each shard concurrently
    var wg sync.WaitGroup
    resultChan := make(chan []Node, len(shards))

    for _, shard := range shards {
        wg.Add(1)
        go func(shard *hnsw.Graph) {
            defer wg.Done()
            results := shard.Search(query, k)
            resultChan <- results
        }(shard)
    }

    // Wait for all searches to complete
    wg.Wait()
    close(resultChan)

    // Collect and merge results
    for results := range resultChan {
        allResults = append(allResults, results...)
    }

    // Sort and return top k
    sort.Slice(allResults, func(i, j int) bool {
        return allResults[i].Distance < allResults[j].Distance
    })

    if len(allResults) > k {
        allResults = allResults[:k]
    }

    return allResults
}

Sharding:

  • Allows scaling beyond the memory limits of a single machine
  • Enables parallel processing across shards
  • May reduce search accuracy if not implemented carefully

Distributed HNSW

For extremely large datasets, you can implement a distributed HNSW system:

// Example of a simple distributed HNSW client
type DistributedHNSW struct {
    shardClients []*ShardClient
}

func (d *DistributedHNSW) Search(query Vector, k int) []Node {
    var allResults []Node

    // Search each shard concurrently
    var wg sync.WaitGroup
    resultChan := make(chan []Node, len(d.shardClients))

    for _, client := range d.shardClients {
        wg.Add(1)
        go func(client *ShardClient) {
            defer wg.Done()
            results := client.Search(query, k)
            resultChan <- results
        }(client)
    }

    // Wait for all searches to complete
    wg.Wait()
    close(resultChan)

    // Collect and merge results
    for results := range resultChan {
        allResults = append(allResults, results...)
    }

    // Sort and return top k
    sort.Slice(allResults, func(i, j int) bool {
        return allResults[i].Distance < allResults[j].Distance
    })

    if len(allResults) > k {
        allResults = allResults[:k]
    }

    return allResults
}

Distributed HNSW:

  • Scales to billions of vectors across multiple machines
  • Requires careful design of the distribution strategy
  • Introduces network latency for search operations

Real-world Performance Tuning Examples

For applications requiring high recall, such as semantic text search:

// Configuration for high-recall text search
graph := hnsw.NewGraph[int](
    hnsw.WithM(16),
    hnsw.WithEfConstruction(300),
    hnsw.WithDistance(hnsw.CosineDistance),
)

// Add vectors (text embeddings)
for i, embedding := range embeddings {
    graph.Add(embedding, i)
}

// Search with high ef for better recall
results := graph.SearchWithEF(queryEmbedding, 10, 200)

For applications requiring fast search, such as real-time image similarity:

// Configuration for fast image similarity search
graph := hnsw.NewGraph[int](
    hnsw.WithM(8),
    hnsw.WithEfConstruction(100),
    hnsw.WithDistance(hnsw.EuclideanDistance),
    hnsw.WithSIMD(true),
)

// Add vectors (image features)
for i, features := range imageFeatures {
    graph.Add(features, i)
}

// Search with low ef for faster results
results := graph.SearchWithEF(queryFeatures, 5, 20)

Memory-Efficient Recommendation System

For large-scale recommendation systems with memory constraints:

// Configuration for memory-efficient recommendation system
graph := hnsw.NewGraph[int](
    hnsw.WithM(10),
    hnsw.WithEfConstruction(150),
    hnsw.WithCompactVectors(true),
    hnsw.WithQuantization(8),
)

// Add vectors in batches to manage memory usage
batchSize := 1000
for i := 0; i < len(itemVectors); i += batchSize {
    end := i + batchSize
    if end > len(itemVectors) {
        end = len(itemVectors)
    }
    graph.BatchAdd(itemVectors[i:end], itemIDs[i:end])
}

// Search with moderate ef for balanced performance
results := graph.SearchWithEF(userPreferences, 20, 50)

Monitoring and Profiling

To continuously optimize HNSW performance, implement monitoring and profiling:

// Simple search profiling
func profileSearch(graph *hnsw.Graph, query Vector, k int, ef int) {
    startTime := time.Now()
    results := graph.SearchWithEF(query, k, ef)
    duration := time.Since(startTime)

    fmt.Printf("Search stats: time=%.6fs, results=%d, ef=%d\n", 
               duration.Seconds(), len(results), ef)
}

// Memory usage monitoring
func monitorMemoryUsage() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    fmt.Printf("Memory stats: alloc=%.2f MB, sys=%.2f MB, gc=%d\n",
               float64(m.Alloc)/1024/1024,
               float64(m.Sys)/1024/1024,
               m.NumGC)
}

Regular profiling helps identify performance bottlenecks and opportunities for optimization.

Next Steps

Now that you understand how to tune HNSW performance, you can explore: