# Performance Tuning
Want to make Quiver go even faster? You've come to the right place! This guide will help you squeeze every last drop of performance out of your vector database. Let's make those vectors fly! 🚀
## Understanding Performance Factors
Quiver's performance depends on several factors:
- **Vector Dimension** - Higher dimensions require more computation
- **Index Size** - More vectors means more data to search through
- **Search Parameters** - Quality vs. speed trade-offs
- **Hardware** - CPU, memory, and storage capabilities
- **Configuration** - Optimal settings for your use case
Let's dive into how to optimize each of these factors.

## Hardware Considerations

### CPU
Quiver benefits from:
- Multiple cores for parallel processing
- Modern CPUs with SIMD support (AVX2, AVX-512)
- High clock speeds for faster distance calculations

> **CPU Recommendation:** For production use, aim for at least 4 cores with AVX2 support. For large indices (>10M vectors), consider 8+ cores.

### Memory
Memory requirements depend on:
- Number of vectors
- Vector dimension
- HNSW graph connectivity (M parameter)
- Metadata size and caching
Approximate memory usage:

```
Memory (bytes) ≈ (Vector Count × Vector Dimension × 4)   # raw float32 vectors
               + (Vector Count × M × 8)                  # HNSW graph links
               + (Metadata Size × Caching Factor)        # cached metadata
```
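
For example, 1M vectors of 128 dimensions with M = 16 comes to roughly 1,000,000 × 128 × 4 ≈ 512 MB of raw vectors plus 1,000,000 × 16 × 8 ≈ 128 MB of graph links, about 640 MB before metadata; the 2-4x rule below then suggests budgeting 1-2 GB.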

> **Memory Recommendation:** For production use, allocate at least 2-4x the size of your raw vector data.

### Storage
Storage considerations:
- SSD is strongly recommended over HDD
- NVMe SSDs provide the best performance for persistence
- Network storage may introduce latency

> **Storage Recommendation:** Use local NVMe SSDs for the best performance. If you must use network storage, ensure low latency and high throughput.

## HNSW Parameter Tuning

### M (Maximum Connections)

The `M` parameter controls the maximum number of connections per node:
Tuning recommendations:

- **Lower values (8-12)**: Faster construction, less memory, lower accuracy
- **Default (16)**: Good balance for most applications
- **Higher values (32-64)**: Better accuracy, more memory, slower construction

> **Real-world Example:** In our benchmarks with 1M vectors of 128 dimensions:

| M Value | Search Time | Memory Usage | Accuracy |
|---------|-------------|--------------|----------|
| 8       | 40μs        | 1.2GB        | 92%      |
| 16      | 60μs        | 1.5GB        | 97%      |
| 32      | 85μs        | 2.1GB        | 99%      |

### efConstruction

The `efConstruction` parameter controls the quality of graph construction:
Tuning recommendations:

- **Lower values (100-150)**: Faster construction, lower quality graph
- **Default (200)**: Good balance for most applications
- **Higher values (300-500)**: Better quality graph, slower construction

> **When to Increase:** Increase `efConstruction` if you need higher search accuracy and can afford longer build times.

### efSearch

The `efSearch` parameter controls the size of the candidate list explored at query time, trading speed for accuracy:
Tuning recommendations:

- **Lower values (50-80)**: Faster search, lower accuracy
- **Default (100)**: Good balance for most applications
- **Higher values (200-400)**: Better accuracy, slower search

> **Real-world Example:** In our benchmarks with 1M vectors of 128 dimensions:

| efSearch | Search Time | Accuracy |
|----------|-------------|----------|
| 50       | 35μs        | 95%      |
| 100      | 60μs        | 98%      |
| 200      | 110μs       | 99.5%    |
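
If you want a single starting point for all three knobs, a sketch like the one below leans toward recall; it uses the same `Config` fields as the examples at the end of this guide, so verify them against your Quiver version and tune from your own measurements:

```go
// A recall-leaning starting point, not a definitive recipe; measure and
// adjust. Field names follow the configuration examples later in this guide.
config := quiver.Config{
	Dimension:       128,
	Distance:        quiver.Cosine,
	HNSWM:           32,  // more connections per node: higher accuracy, more memory
	HNSWEfConstruct: 300, // higher-quality graph, slower builds
	HNSWEfSearch:    150, // wider candidate list, slower queries
}
```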

## Batch Processing Optimization

### Batch Size

The `BatchSize` parameter controls how many vectors are batched before insertion:
Tuning recommendations:

- **Lower values (100-500)**: Less memory usage, more frequent updates
- **Default (1000)**: Good balance for most applications
- **Higher values (5000-10000)**: Better throughput for bulk loading, higher memory usage

> **Bulk Loading:** For initial bulk loading, use a larger batch size (5000-10000) to maximize throughput, as in the sketch below.
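
A bulk-load configuration might look like the following sketch; the only change from a steady-state setup is the larger `BatchSize` (the other values here are illustrative):

```go
// Sketch: bulk-loading configuration. Large batches amortize per-insert
// overhead during the initial ingest.
config := quiver.Config{
	Dimension:   128,
	StoragePath: "./data/bulk.db",
	Distance:    quiver.Cosine,
	MaxElements: 10000000,
	BatchSize:   10000, // use ~1000 instead for steady-state writes
}
```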

### Arrow Integration

For maximum throughput when loading large datasets, use the Arrow integration:
```go
// Create an Arrow record with vectors and metadata
builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
// ... populate the builder ...
record := builder.NewRecord()

// Append all vectors from the Arrow record in one call
err := idx.AppendFromArrow(record)
```
This can be 10-100x faster than adding vectors one at a time.
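
As a concrete starting point, the sketch below fills in the "populate the builder" step. It assumes `quiver.NewVectorSchema` lays out three columns (an id, a fixed-size float32 list for the vector, and a JSON string for metadata), so verify the actual column order and types in your Quiver version before relying on it:

```go
// Sketch only: the column order and types below are assumptions about
// quiver.NewVectorSchema, not a documented contract.
// Imports (adjust the Arrow module path/version to match your go.mod):
//   "github.com/apache/arrow/go/v12/arrow/array"
//   "github.com/apache/arrow/go/v12/arrow/memory"
func loadViaArrow(idx *quiver.Index, vectors [][]float32, dimension int) error {
	builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
	defer builder.Release()

	idB := builder.Field(0).(*array.Uint64Builder)         // assumed: id column
	vecB := builder.Field(1).(*array.FixedSizeListBuilder) // assumed: vector column
	valB := vecB.ValueBuilder().(*array.Float32Builder)
	metaB := builder.Field(2).(*array.StringBuilder)       // assumed: metadata column

	for i, vec := range vectors {
		idB.Append(uint64(i))
		vecB.Append(true) // mark the list entry as non-null
		valB.AppendValues(vec, nil)
		metaB.Append(`{"source":"bulk"}`)
	}

	record := builder.NewRecord()
	defer record.Release()
	return idx.AppendFromArrow(record)
}
```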

## DuckDB Optimization

### Query Optimization

Optimize your metadata filters:
- Be specific in your filters to reduce the result set (see the example after this list)
- Use appropriate indexes for frequently queried fields
- Avoid complex joins or subqueries
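
For example, using the `QueryMetadata` call shown later in this guide, a selective predicate keeps the candidate set small; the `category` and `created_at` fields here are hypothetical:

```go
// Selective: equality plus a range on indexed fields lets DuckDB touch only
// a small slice of the metadata table (category and created_at are
// hypothetical fields).
rows, err := idx.QueryMetadata(
	"SELECT id FROM metadata WHERE category = 'news' AND created_at > '2024-01-01'")

// Broad: a LIKE over a text column forces a full scan and hands the vector
// stage a much larger candidate set.
// rows, err = idx.QueryMetadata("SELECT id FROM metadata WHERE content LIKE '%news%'")
```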

### Caching Strategy

Quiver caches metadata in memory for better performance. For large datasets, consider:
- Adjusting cache size based on available memory
- Preloading frequently accessed metadata
- Monitoring cache hit rates

## Concurrency Tuning

### Read Concurrency

Quiver supports concurrent reads:
```go
// These searches can run concurrently
go func() { idx.Search(query1, 10, 1, 10) }()
go func() { idx.Search(query2, 10, 1, 10) }()
go func() { idx.Search(query3, 10, 1, 10) }()
```
For high-throughput search scenarios:
- Use a worker pool to manage concurrent searches (see the sketch after this list)
- Monitor CPU usage and adjust concurrency accordingly
- Consider using a load balancer for distributed setups
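
A minimal worker-pool sketch, assuming the same `Search` arguments used above; size the pool to roughly your core count and adjust based on observed CPU usage:

```go
// A fixed pool of workers drains a query channel, so the number of
// in-flight searches never exceeds numWorkers. Needs "sync" and "log".
func searchPool(idx *quiver.Index, queries <-chan []float32, numWorkers int) {
	var wg sync.WaitGroup
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for q := range queries {
				results, err := idx.Search(q, 10, 1, 10) // same arguments as above
				if err != nil {
					log.Printf("search failed: %v", err)
					continue
				}
				_ = results // hand results off to your consumer
			}
		}()
	}
	wg.Wait()
}
```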

### Write Concurrency

Writes are serialized internally, but you can still batch them efficiently from multiple goroutines:
```go
// Batch writes from multiple goroutines; Quiver serializes them internally.
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
	wg.Add(1)
	go func(offset int) {
		defer wg.Done()
		for j := 0; j < 1000; j++ {
			id := uint64(offset*1000 + j)
			vector := generateVector(dimension)
			metadata := generateMetadata(id)
			if err := idx.Add(id, vector, metadata); err != nil {
				log.Printf("add %d failed: %v", id, err)
			}
		}
	}(i)
}
wg.Wait() // wait for all batches to finish
```

## Hybrid Search Optimization

### Strategy Selection

Quiver automatically chooses between two hybrid search strategies:

- **Filter-then-search**: For highly selective filters
- **Search-then-filter**: For less selective filters
You can optimize this by:
- Making your filters as selective as possible
- Using appropriate indexes on metadata fields
- Monitoring query performance and adjusting as needed

### Custom Hybrid Search

For advanced use cases, you can implement custom hybrid search logic:
```go
// Example: two-stage search that picks a strategy based on filter selectivity.
// searchThenFilter and filterThenSearch are user-supplied helpers, not part
// of the Quiver API.
func customHybridSearch(idx *quiver.Index, query []float32, filter string, k int) ([]quiver.SearchResult, error) {
	// First stage: get candidate IDs from metadata
	metaResults, err := idx.QueryMetadata(fmt.Sprintf("SELECT id FROM metadata WHERE %s", filter))
	if err != nil {
		return nil, err
	}

	// If the filter is not selective, run the vector search first and filter afterwards
	if len(metaResults) > 1000 {
		return searchThenFilter(idx, query, filter, k)
	}
	// Otherwise restrict the vector search to the candidate set
	return filterThenSearch(idx, query, metaResults, k)
}
```

## Persistence and Backup Optimization

### Persistence Interval

Adjust the persistence interval based on your write patterns:
```go
config := quiver.Config{
	// ... other settings ...
	PersistInterval: 5 * time.Minute, // Default is 5 minutes
}
```

- **Shorter intervals**: Less data-loss risk, more I/O overhead
- **Longer intervals**: Less I/O overhead, more data-loss risk

### Backup Compression

Enable backup compression to save storage space; this trades CPU time during backups for smaller files on disk.

## Monitoring and Profiling

### Metrics Collection

Collect performance metrics to identify bottlenecks:
```go
// Get metrics
metrics := idx.CollectMetrics()

// Log or export them
fmt.Printf("Vector count: %d\n", metrics["vector_count"])
fmt.Printf("Search latency: %.2fms\n", metrics["search_latency_ms"])
fmt.Printf("Memory usage: %.2fMB\n", metrics["memory_usage_mb"])
```

### Profiling

Use Go's built-in profiling tools:

```bash
# CPU profiling
go test -bench=BenchmarkSearch -cpuprofile=cpu.prof

# Memory profiling
go test -bench=BenchmarkSearch -memprofile=mem.prof

# Analyze with pprof
go tool pprof cpu.prof
go tool pprof mem.prof
```
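
pprof can also serve an interactive web UI with flame graphs, which is often quicker than the CLI for spotting hot paths:

```bash
# Interactive web UI on localhost:8080 (graph views need Graphviz installed)
go tool pprof -http=:8080 cpu.prof
```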

## Configuration Examples

### Small Dataset (<100K vectors)

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/small.db",
	Distance:        quiver.Cosine,
	MaxElements:     100000,
	HNSWM:           12,
	HNSWEfConstruct: 150,
	HNSWEfSearch:    80,
	BatchSize:       500,
	PersistInterval: 1 * time.Minute,
}
```

### Medium Dataset (100K-1M vectors)

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/medium.db",
	Distance:        quiver.Cosine,
	MaxElements:     1000000,
	HNSWM:           16,
	HNSWEfConstruct: 200,
	HNSWEfSearch:    100,
	BatchSize:       1000,
	PersistInterval: 5 * time.Minute,
}
```

### Large Dataset (>1M vectors)

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/large.db",
	Distance:        quiver.Cosine,
	MaxElements:     10000000,
	HNSWM:           24,
	HNSWEfConstruct: 300,
	HNSWEfSearch:    150,
	BatchSize:       5000,
	PersistInterval: 10 * time.Minute,
}
```

### High-Throughput Search

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/search_optimized.db",
	Distance:        quiver.Cosine,
	MaxElements:     1000000,
	HNSWM:           32,
	HNSWEfConstruct: 400,
	HNSWEfSearch:    80, // Lower for faster search
	BatchSize:       1000,
	PersistInterval: 5 * time.Minute,
}
```

### High-Accuracy Search

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/accuracy_optimized.db",
	Distance:        quiver.Cosine,
	MaxElements:     1000000,
	HNSWM:           48,  // Higher for better graph quality
	HNSWEfConstruct: 500, // Higher for better graph construction
	HNSWEfSearch:    200, // Higher for more accurate search
	BatchSize:       1000,
	PersistInterval: 5 * time.Minute,
}
```

## Next Steps

Now that you've optimized Quiver's performance, check out:
- **Benchmarking** - Measure Quiver's performance
- **HNSW Algorithm** - Learn more about how HNSW works
- **DuckDB Integration** - Understand how Quiver uses DuckDB