Performance Tuning

Want to make Quiver go even faster? You've come to the right place! This guide will help you squeeze every last drop of performance out of your vector database. Let's make those vectors fly! 🚀

Understanding Performance Factors

Quiver's performance depends on several factors:

  1. Vector Dimension - Higher dimensions require more computation
  2. Index Size - More vectors means more data to search through
  3. Search Parameters - Quality vs. speed trade-offs
  4. Hardware - CPU, memory, and storage capabilities
  5. Configuration - Optimal settings for your use case

Let's dive into how to optimize each of these factors.

Hardware Considerations

CPU

Quiver benefits from:

  • Multiple cores for parallel processing
  • Modern CPUs with SIMD support (AVX2, AVX-512)
  • High clock speeds for faster distance calculations

CPU Recommendation

For production use, aim for at least 4 cores with AVX2 support. For large indices (>10M vectors), consider 8+ cores.
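As a quick sanity check at startup, you can verify the host meets the core-count recommendation. This is a minimal sketch using only the standard library; `coreCheck` is a hypothetical helper, and SIMD detection (AVX2/AVX-512) is not exposed by the standard library — a package such as `golang.org/x/sys/cpu` would be needed for that.

```go
package main

import (
	"fmt"
	"runtime"
)

// coreCheck reports the number of logical cores and whether the
// host meets a minimum core count. SIMD support cannot be queried
// from the standard library alone.
func coreCheck(min int) (cores int, ok bool) {
	cores = runtime.NumCPU()
	return cores, cores >= min
}

func main() {
	cores, ok := coreCheck(4)
	fmt.Printf("logical cores: %d, meets 4-core recommendation: %v\n", cores, ok)
}
```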

Memory

Memory requirements depend on:

  • Number of vectors
  • Vector dimension
  • HNSW graph connectivity (M parameter)
  • Metadata size and caching

Approximate memory usage (4 bytes per float32 component, 8 bytes per HNSW link):

Memory (GB) ≈ (Vector Count × Vector Dimension × 4 bytes) + 
              (Vector Count × M × 8 bytes) + 
              (Metadata Size × Caching Factor)
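For example, 1M vectors of 128 dimensions with M=16 and negligible metadata works out to (1,000,000 × 128 × 4) + (1,000,000 × 16 × 8) ≈ 0.64 GB of raw data. A minimal sketch of the estimate (the `estimateMemoryGB` helper is illustrative, not part of Quiver):

```go
package main

import "fmt"

// estimateMemoryGB applies the approximation above:
// vectors (float32) + HNSW links (M neighbors × 8 bytes) + cached metadata.
func estimateMemoryGB(vectors, dim, m int, metadataBytes float64) float64 {
	vectorBytes := float64(vectors) * float64(dim) * 4
	graphBytes := float64(vectors) * float64(m) * 8
	return (vectorBytes + graphBytes + metadataBytes) / 1e9
}

func main() {
	gb := estimateMemoryGB(1_000_000, 128, 16, 0)
	fmt.Printf("estimated raw usage: %.2f GB (allocate 2-4x: %.2f-%.2f GB)\n", gb, 2*gb, 4*gb)
}
```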

Memory Recommendation

For production use, allocate at least 2-4x the size of your raw vector data.

Storage

Storage considerations:

  • SSD is strongly recommended over HDD
  • NVMe SSDs provide the best performance for persistence
  • Network storage may introduce latency

Storage Recommendation

Use local NVMe SSDs for the best performance. If using network storage, ensure low latency and high throughput.

HNSW Parameter Tuning

M (Maximum Connections)

The M parameter controls the maximum number of connections per node:

config := quiver.Config{
    // ... other settings ...
    HNSWM: 16, // Default is 16
}

Tuning recommendations:

  • Lower values (8-12): Faster construction, less memory, lower accuracy
  • Default (16): Good balance for most applications
  • Higher values (32-64): Better accuracy, more memory, slower construction

Real-world Example

In our benchmarks with 1M vectors of 128 dimensions:

M Value   Search Time   Memory Usage   Accuracy
8         40μs          1.2GB          92%
16        60μs          1.5GB          97%
32        85μs          2.1GB          99%

efConstruction

The efConstruction parameter controls the quality of graph construction:

config := quiver.Config{
    // ... other settings ...
    HNSWEfConstruct: 200, // Default is 200
}

Tuning recommendations:

  • Lower values (100-150): Faster construction, lower quality graph
  • Default (200): Good balance for most applications
  • Higher values (300-500): Better quality graph, slower construction

When to Increase

Increase efConstruction if you need higher search accuracy and can afford longer build times.

efSearch

The efSearch parameter controls the quality of search:

config := quiver.Config{
    // ... other settings ...
    HNSWEfSearch: 100, // Default is 100
}

Tuning recommendations:

  • Lower values (50-80): Faster search, lower accuracy
  • Default (100): Good balance for most applications
  • Higher values (200-400): Better accuracy, slower search

Real-world Example

In our benchmarks with 1M vectors of 128 dimensions:

efSearch   Search Time   Accuracy
50         35μs          95%
100        60μs          98%
200        110μs         99.5%

Batch Processing Optimization

Batch Size

The BatchSize parameter controls how many vectors are batched before insertion:

config := quiver.Config{
    // ... other settings ...
    BatchSize: 1000, // Default is 1000
}

Tuning recommendations:

  • Lower values (100-500): Less memory usage, more frequent updates
  • Default (1000): Good balance for most applications
  • Higher values (5000-10000): Better throughput for bulk loading, higher memory usage

Bulk Loading

For initial bulk loading, use a larger batch size (5000-10000) to maximize throughput.
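Bulk loading is mostly a matter of slicing your dataset into large chunks before insertion. A sketch of the chunking step (the batch-insert call is left out; `chunk` is an illustrative helper, not part of Quiver):

```go
package main

import "fmt"

// chunk splits ids into batches of at most size elements,
// preserving order; the final batch may be smaller.
func chunk(ids []uint64, size int) [][]uint64 {
	var batches [][]uint64
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

func main() {
	ids := make([]uint64, 12_500)
	for i := range ids {
		ids[i] = uint64(i)
	}
	for _, batch := range chunk(ids, 5000) {
		// Each batch would be handed to your insertion path here.
		fmt.Printf("would insert batch of %d vectors\n", len(batch))
	}
}
```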

Arrow Integration

For maximum throughput when loading large datasets, use Arrow integration:

// Create Arrow record with vectors and metadata
builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
// ... populate the builder ...
record := builder.NewRecord()

// Add all vectors from the Arrow record
err := idx.AppendFromArrow(record)

This can be 10-100x faster than individual additions.

DuckDB Optimization

Query Optimization

Optimize your metadata filters:

  • Be specific in your filters to reduce the result set
  • Use appropriate indexes for frequently queried fields
  • Avoid complex joins or subqueries

Caching Strategy

Quiver caches metadata in memory for better performance:

// Get metadata for a vector
meta := idx.getMetadata(id) // Uses internal caching

For large datasets, consider:

  • Adjusting cache size based on available memory
  • Preloading frequently accessed metadata
  • Monitoring cache hit rates
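Monitoring cache hit rates can be as simple as a pair of atomic counters around the lookup. This is a sketch under an assumption: Quiver does not expose its internal cache this way, so it wraps a lookup path you control.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cacheStats tracks hits and misses with atomic counters, so it is
// safe to update from concurrent search goroutines.
type cacheStats struct {
	hits, misses atomic.Int64
}

func (s *cacheStats) record(hit bool) {
	if hit {
		s.hits.Add(1)
	} else {
		s.misses.Add(1)
	}
}

func (s *cacheStats) hitRate() float64 {
	h, m := s.hits.Load(), s.misses.Load()
	if h+m == 0 {
		return 0
	}
	return float64(h) / float64(h+m)
}

func main() {
	var stats cacheStats
	for i := 0; i < 100; i++ {
		stats.record(i%5 != 0) // pretend 4 of every 5 lookups hit
	}
	fmt.Printf("cache hit rate: %.0f%%\n", stats.hitRate()*100)
}
```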

Concurrency Tuning

Read Concurrency

Quiver supports concurrent reads:

// These can run concurrently
go func() { idx.Search(query1, 10, 1, 10) }()
go func() { idx.Search(query2, 10, 1, 10) }()
go func() { idx.Search(query3, 10, 1, 10) }()

For high-throughput search scenarios:

  • Use a worker pool to manage concurrent searches
  • Monitor CPU usage and adjust concurrency accordingly
  • Consider using a load balancer for distributed setups
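A bounded worker pool keeps search concurrency in line with the available cores. The sketch below uses a stub search function standing in for a call like `idx.Search(query, 10, 1, 10)`; `runPool` is an illustrative helper, not part of Quiver.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// searchFn stands in for a real search call and returns a result count.
type searchFn func(query []float32) int

// runPool fans queries out to at most workers goroutines and
// returns the total number of results across all queries.
func runPool(queries [][]float32, workers int, search searchFn) int {
	jobs := make(chan []float32)
	var wg sync.WaitGroup
	var mu sync.Mutex
	total := 0

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for q := range jobs {
				n := search(q)
				mu.Lock()
				total += n
				mu.Unlock()
			}
		}()
	}
	for _, q := range queries {
		jobs <- q
	}
	close(jobs) // no more work; workers drain and exit
	wg.Wait()
	return total
}

func main() {
	queries := make([][]float32, 100)
	for i := range queries {
		queries[i] = make([]float32, 128)
	}
	stub := func(q []float32) int { return 10 } // pretend each search returns k=10 hits
	total := runPool(queries, runtime.NumCPU(), stub)
	fmt.Printf("processed %d queries, %d results\n", len(queries), total)
}
```

Sizing the pool near `runtime.NumCPU()` avoids oversubscribing the CPU during distance calculations.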

Write Concurrency

Writes are serialized internally, but you can still batch them efficiently:

// Batch writes in goroutines
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
    wg.Add(1)
    go func(offset int) {
        defer wg.Done()
        for j := 0; j < 1000; j++ {
            id := uint64(offset*1000 + j)
            vector := generateVector(dimension)
            metadata := generateMetadata(id)
            idx.Add(id, vector, metadata)
        }
    }(i)
}
wg.Wait() // wait for all batches to finish before relying on the writes

Hybrid Search Optimization

Strategy Selection

Quiver automatically chooses between two hybrid search strategies:

  1. Filter-then-search: For highly selective filters
  2. Search-then-filter: For less selective filters

You can optimize this by:

  • Making your filters as selective as possible
  • Using appropriate indexes on metadata fields
  • Monitoring query performance and adjusting as needed

For advanced use cases, implement custom hybrid search logic:

// Example: Two-stage search with custom logic
func customHybridSearch(idx *quiver.Index, query []float32, filter string, k int) ([]quiver.SearchResult, error) {
    // First stage: Get candidate IDs from metadata
    metaResults, err := idx.QueryMetadata(fmt.Sprintf("SELECT id FROM metadata WHERE %s", filter))
    if err != nil {
        return nil, err
    }

    // searchThenFilter and filterThenSearch are user-supplied helpers.
    // If the filter matches many rows, run the vector search first and
    // filter afterwards; otherwise restrict the search to the candidates.
    if len(metaResults) > 1000 {
        return searchThenFilter(idx, query, filter, k)
    }
    return filterThenSearch(idx, query, metaResults, k)
}

Persistence and Backup Optimization

Persistence Interval

Adjust the persistence interval based on your write patterns:

config := quiver.Config{
    // ... other settings ...
    PersistInterval: 5 * time.Minute, // Default is 5 minutes
}

  • Shorter intervals: Less data loss risk, more I/O overhead
  • Longer intervals: Less I/O overhead, more data loss risk

Backup Compression

Enable backup compression to save storage space:

config := quiver.Config{
    // ... other settings ...
    BackupCompression: true, // Default is true
}

This trades CPU usage for storage space.

Monitoring and Profiling

Metrics Collection

Collect performance metrics to identify bottlenecks:

// Get metrics
metrics := idx.CollectMetrics()

// Log or export metrics
fmt.Printf("Vector count: %d\n", metrics["vector_count"])
fmt.Printf("Search latency: %.2fms\n", metrics["search_latency_ms"])
fmt.Printf("Memory usage: %.2fMB\n", metrics["memory_usage_mb"])
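When hunting latency bottlenecks, per-request samples are more telling than a single average. A percentile helper is easy to add on top of collected latencies (a sketch with made-up sample values; not part of Quiver's API):

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the p-th percentile (0-100) of samples using
// nearest-rank on a sorted copy. It assumes samples is non-empty.
func percentile(samples []float64, p float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	rank := int(p / 100 * float64(len(s)-1))
	return s[rank]
}

func main() {
	// Hypothetical per-search latencies in microseconds.
	latencies := []float64{35, 40, 42, 55, 60, 61, 70, 85, 110, 400}
	fmt.Printf("p50: %.0fμs, p99: %.0fμs\n", percentile(latencies, 50), percentile(latencies, 99))
}
```

The gap between p50 and p99 is often where tail-latency problems (GC pauses, cold caches) show up first.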

Profiling

Use Go's built-in profiling tools:

# CPU profiling
go test -bench=BenchmarkSearch -cpuprofile=cpu.prof

# Memory profiling
go test -bench=BenchmarkSearch -memprofile=mem.prof

# Analyze with pprof
go tool pprof cpu.prof
go tool pprof mem.prof

Configuration Examples

Small Dataset (<100K vectors)

config := quiver.Config{
    Dimension:       128,
    StoragePath:     "./data/small.db",
    Distance:        quiver.Cosine,
    MaxElements:     100000,
    HNSWM:           12,
    HNSWEfConstruct: 150,
    HNSWEfSearch:    80,
    BatchSize:       500,
    PersistInterval: 1 * time.Minute,
}

Medium Dataset (100K-1M vectors)

config := quiver.Config{
    Dimension:       128,
    StoragePath:     "./data/medium.db",
    Distance:        quiver.Cosine,
    MaxElements:     1000000,
    HNSWM:           16,
    HNSWEfConstruct: 200,
    HNSWEfSearch:    100,
    BatchSize:       1000,
    PersistInterval: 5 * time.Minute,
}

Large Dataset (>1M vectors)

config := quiver.Config{
    Dimension:       128,
    StoragePath:     "./data/large.db",
    Distance:        quiver.Cosine,
    MaxElements:     10000000,
    HNSWM:           24,
    HNSWEfConstruct: 300,
    HNSWEfSearch:    150,
    BatchSize:       5000,
    PersistInterval: 10 * time.Minute,
}

Search-Optimized Configuration

config := quiver.Config{
    Dimension:       128,
    StoragePath:     "./data/search_optimized.db",
    Distance:        quiver.Cosine,
    MaxElements:     1000000,
    HNSWM:           32,
    HNSWEfConstruct: 400,
    HNSWEfSearch:    80,  // Lower for faster search
    BatchSize:       1000,
    PersistInterval: 5 * time.Minute,
}

Accuracy-Optimized Configuration

config := quiver.Config{
    Dimension:       128,
    StoragePath:     "./data/accuracy_optimized.db",
    Distance:        quiver.Cosine,
    MaxElements:     1000000,
    HNSWM:           48,  // Higher for better graph quality
    HNSWEfConstruct: 500, // Higher for better graph construction
    HNSWEfSearch:    200, // Higher for more accurate search
    BatchSize:       1000,
    PersistInterval: 5 * time.Minute,
}

Next Steps

Now that you've optimized Quiver's performance, check out: