# Performance Tuning
Want to make Quiver go even faster? You've come to the right place! This guide will help you squeeze every last drop of performance out of your vector database. Let's make those vectors fly! 🚀
## Understanding Performance Factors
Quiver's performance depends on several factors:
- **Vector Dimension** - Higher dimensions require more computation
- **Index Size** - More vectors means more data to search through
- **Search Parameters** - Quality vs. speed trade-offs
- **Hardware** - CPU, memory, and storage capabilities
- **Configuration** - Optimal settings for your use case
Let's dive into how to optimize each of these factors.

## Hardware Considerations

### CPU
Quiver benefits from:
- Multiple cores for parallel processing
- Modern CPUs with SIMD support (AVX2, AVX-512)
- High clock speeds for faster distance calculations

> **CPU Recommendation:** For production use, aim for at least 4 cores with AVX2 support. For large indices (>10M vectors), consider 8+ cores.

### Memory
Memory requirements depend on:
- Number of vectors
- Vector dimension
- HNSW graph connectivity (M parameter)
- Metadata size and caching
Approximate memory usage:

```
Memory (bytes) ≈ (Vector Count × Vector Dimension × 4)   # raw float32 vectors
               + (Vector Count × M × 8)                  # HNSW graph links
               + (Metadata Size × Caching Factor)        # cached metadata
```
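
For example, 1M vectors of 128 dimensions with M = 16 comes to roughly 1,000,000 × 128 × 4 ≈ 512 MB of raw vectors plus 1,000,000 × 16 × 8 ≈ 128 MB of graph links, about 640 MB before metadata; the 2-4x rule below then suggests budgeting 1-2 GB.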

> **Memory Recommendation:** For production use, allocate at least 2-4x the size of your raw vector data.

### Storage
Storage considerations:
- SSD is strongly recommended over HDD
- NVMe SSDs provide the best performance for persistence
- Network storage may introduce latency

> **Storage Recommendation:** Use local NVMe SSDs for the best performance. If you must use network storage, ensure low latency and high throughput.

## HNSW Parameter Tuning

### M (Maximum Connections)

The `M` parameter controls the maximum number of connections per node:
Tuning recommendations:

- **Lower values (8-12)**: Faster construction, less memory, lower accuracy
- **Default (16)**: Good balance for most applications
- **Higher values (32-64)**: Better accuracy, more memory, slower construction

> **Real-world Example:** In our benchmarks with 1M vectors of 128 dimensions:

| M Value | Search Time | Memory Usage | Accuracy |
|---------|-------------|--------------|----------|
| 8       | 40μs        | 1.2GB        | 92%      |
| 16      | 60μs        | 1.5GB        | 97%      |
| 32      | 85μs        | 2.1GB        | 99%      |

### efConstruction

The `efConstruction` parameter controls the quality of graph construction:
Tuning recommendations:

- **Lower values (100-150)**: Faster construction, lower quality graph
- **Default (200)**: Good balance for most applications
- **Higher values (300-500)**: Better quality graph, slower construction

> **When to Increase:** Increase `efConstruction` if you need higher search accuracy and can afford longer build times.

### efSearch

The `efSearch` parameter controls the size of the candidate list explored at query time, trading speed for accuracy:
Tuning recommendations:

- **Lower values (50-80)**: Faster search, lower accuracy
- **Default (100)**: Good balance for most applications
- **Higher values (200-400)**: Better accuracy, slower search

> **Real-world Example:** In our benchmarks with 1M vectors of 128 dimensions:

| efSearch | Search Time | Accuracy |
|----------|-------------|----------|
| 50       | 35μs        | 95%      |
| 100      | 60μs        | 98%      |
| 200      | 110μs       | 99.5%    |
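
If you want a single starting point for all three knobs, a sketch like the one below leans toward recall; it uses the same `Config` fields as the examples at the end of this guide, so verify them against your Quiver version and tune from your own measurements:

```go
// A recall-leaning starting point, not a definitive recipe; measure and
// adjust. Field names follow the configuration examples later in this guide.
config := quiver.Config{
	Dimension:       128,
	Distance:        quiver.Cosine,
	HNSWM:           32,  // more connections per node: higher accuracy, more memory
	HNSWEfConstruct: 300, // higher-quality graph, slower builds
	HNSWEfSearch:    150, // wider candidate list, slower queries
}
```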

## Batch Processing Optimization

### Batch Size

The `BatchSize` parameter controls how many vectors are batched before insertion:
Tuning recommendations:

- **Lower values (100-500)**: Less memory usage, more frequent updates
- **Default (1000)**: Good balance for most applications
- **Higher values (5000-10000)**: Better throughput for bulk loading, higher memory usage

> **Bulk Loading:** For initial bulk loading, use a larger batch size (5000-10000) to maximize throughput, as in the sketch below.
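
A bulk-load configuration might look like the following sketch; the only change from a steady-state setup is the larger `BatchSize` (the other values here are illustrative):

```go
// Sketch: bulk-loading configuration. Large batches amortize per-insert
// overhead during the initial ingest.
config := quiver.Config{
	Dimension:   128,
	StoragePath: "./data/bulk.db",
	Distance:    quiver.Cosine,
	MaxElements: 10000000,
	BatchSize:   10000, // use ~1000 instead for steady-state writes
}
```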

### Arrow Integration

For maximum throughput when loading large datasets, use the Arrow integration:
```go
// Create an Arrow record with vectors and metadata
builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
// ... populate the builder ...
record := builder.NewRecord()

// Append all vectors from the Arrow record in one call
err := idx.AppendFromArrow(record)
```
This can be 10-100x faster than adding vectors one at a time.
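
As a concrete starting point, the sketch below fills in the "populate the builder" step. It assumes `quiver.NewVectorSchema` lays out three columns (an id, a fixed-size float32 list for the vector, and a JSON string for metadata), so verify the actual column order and types in your Quiver version before relying on it:

```go
// Sketch only: the column order and types below are assumptions about
// quiver.NewVectorSchema, not a documented contract.
// Imports (adjust the Arrow module path/version to match your go.mod):
//   "github.com/apache/arrow/go/v12/arrow/array"
//   "github.com/apache/arrow/go/v12/arrow/memory"
func loadViaArrow(idx *quiver.Index, vectors [][]float32, dimension int) error {
	builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
	defer builder.Release()

	idB := builder.Field(0).(*array.Uint64Builder)         // assumed: id column
	vecB := builder.Field(1).(*array.FixedSizeListBuilder) // assumed: vector column
	valB := vecB.ValueBuilder().(*array.Float32Builder)
	metaB := builder.Field(2).(*array.StringBuilder)       // assumed: metadata column

	for i, vec := range vectors {
		idB.Append(uint64(i))
		vecB.Append(true) // mark the list entry as non-null
		valB.AppendValues(vec, nil)
		metaB.Append(`{"source":"bulk"}`)
	}

	record := builder.NewRecord()
	defer record.Release()
	return idx.AppendFromArrow(record)
}
```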

## DuckDB Optimization

### Query Optimization

Optimize your metadata filters:
- Be specific in your filters to reduce the result set (see the example after this list)
- Use appropriate indexes for frequently queried fields
- Avoid complex joins or subqueries
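
For example, using the `QueryMetadata` call shown later in this guide, a selective predicate keeps the candidate set small; the `category` and `created_at` fields here are hypothetical:

```go
// Selective: equality plus a range on indexed fields lets DuckDB touch only
// a small slice of the metadata table (category and created_at are
// hypothetical fields).
rows, err := idx.QueryMetadata(
	"SELECT id FROM metadata WHERE category = 'news' AND created_at > '2024-01-01'")

// Broad: a LIKE over a text column forces a full scan and hands the vector
// stage a much larger candidate set.
// rows, err = idx.QueryMetadata("SELECT id FROM metadata WHERE content LIKE '%news%'")
```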

### Caching Strategy

Quiver caches metadata in memory for better performance. For large datasets, consider:
- Adjusting cache size based on available memory
- Preloading frequently accessed metadata
- Monitoring cache hit rates

## Concurrency Tuning

### Read Concurrency

Quiver supports concurrent reads:
```go
// These searches can run concurrently
go func() { idx.Search(query1, 10, 1, 10) }()
go func() { idx.Search(query2, 10, 1, 10) }()
go func() { idx.Search(query3, 10, 1, 10) }()
```
For high-throughput search scenarios:
- Use a worker pool to manage concurrent searches (see the sketch after this list)
- Monitor CPU usage and adjust concurrency accordingly
- Consider using a load balancer for distributed setups
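
A minimal worker-pool sketch, assuming the same `Search` arguments used above; size the pool to roughly your core count and adjust based on observed CPU usage:

```go
// A fixed pool of workers drains a query channel, so the number of
// in-flight searches never exceeds numWorkers. Needs "sync" and "log".
func searchPool(idx *quiver.Index, queries <-chan []float32, numWorkers int) {
	var wg sync.WaitGroup
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for q := range queries {
				results, err := idx.Search(q, 10, 1, 10) // same arguments as above
				if err != nil {
					log.Printf("search failed: %v", err)
					continue
				}
				_ = results // hand results off to your consumer
			}
		}()
	}
	wg.Wait()
}
```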

### Write Concurrency

Writes are serialized internally, but you can still batch them efficiently from multiple goroutines:
```go
// Batch writes from multiple goroutines; Quiver serializes them internally.
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
	wg.Add(1)
	go func(offset int) {
		defer wg.Done()
		for j := 0; j < 1000; j++ {
			id := uint64(offset*1000 + j)
			vector := generateVector(dimension)
			metadata := generateMetadata(id)
			if err := idx.Add(id, vector, metadata); err != nil {
				log.Printf("add %d failed: %v", id, err)
			}
		}
	}(i)
}
wg.Wait() // wait for all batches to finish
```

## Hybrid Search Optimization

### Strategy Selection

Quiver automatically chooses between two hybrid search strategies:

- **Filter-then-search**: For highly selective filters
- **Search-then-filter**: For less selective filters
You can optimize this by:
- Making your filters as selective as possible
- Using appropriate indexes on metadata fields
- Monitoring query performance and adjusting as needed

### Custom Hybrid Search

For advanced use cases, you can implement custom hybrid search logic:
```go
// Example: two-stage search that picks a strategy based on filter selectivity.
// searchThenFilter and filterThenSearch are user-supplied helpers, not part
// of the Quiver API.
func customHybridSearch(idx *quiver.Index, query []float32, filter string, k int) ([]quiver.SearchResult, error) {
	// First stage: get candidate IDs from metadata
	metaResults, err := idx.QueryMetadata(fmt.Sprintf("SELECT id FROM metadata WHERE %s", filter))
	if err != nil {
		return nil, err
	}

	// If the filter is not selective, run the vector search first and filter afterwards
	if len(metaResults) > 1000 {
		return searchThenFilter(idx, query, filter, k)
	}
	// Otherwise restrict the vector search to the candidate set
	return filterThenSearch(idx, query, metaResults, k)
}
```

## Persistence and Backup Optimization

### Persistence Interval

Adjust the persistence interval based on your write patterns:
```go
config := quiver.Config{
	// ... other settings ...
	PersistInterval: 5 * time.Minute, // Default is 5 minutes
}
```

- **Shorter intervals**: Less data-loss risk, more I/O overhead
- **Longer intervals**: Less I/O overhead, more data-loss risk

### Backup Compression

Enable backup compression to save storage space; this trades CPU time during backups for smaller files on disk.

## Monitoring and Profiling

### Metrics Collection

Collect performance metrics to identify bottlenecks:
```go
// Get metrics
metrics := idx.CollectMetrics()

// Log or export them
fmt.Printf("Vector count: %d\n", metrics["vector_count"])
fmt.Printf("Search latency: %.2fms\n", metrics["search_latency_ms"])
fmt.Printf("Memory usage: %.2fMB\n", metrics["memory_usage_mb"])
```

### Profiling

Use Go's built-in profiling tools:

```bash
# CPU profiling
go test -bench=BenchmarkSearch -cpuprofile=cpu.prof

# Memory profiling
go test -bench=BenchmarkSearch -memprofile=mem.prof

# Analyze with pprof
go tool pprof cpu.prof
go tool pprof mem.prof
```
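
pprof can also serve an interactive web UI with flame graphs, which is often quicker than the CLI for spotting hot paths:

```bash
# Interactive web UI on localhost:8080 (graph views need Graphviz installed)
go tool pprof -http=:8080 cpu.prof
```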

## Configuration Examples

### Small Dataset (<100K vectors)

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/small.db",
	Distance:        quiver.Cosine,
	MaxElements:     100000,
	HNSWM:           12,
	HNSWEfConstruct: 150,
	HNSWEfSearch:    80,
	BatchSize:       500,
	PersistInterval: 1 * time.Minute,
}
```

### Medium Dataset (100K-1M vectors)

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/medium.db",
	Distance:        quiver.Cosine,
	MaxElements:     1000000,
	HNSWM:           16,
	HNSWEfConstruct: 200,
	HNSWEfSearch:    100,
	BatchSize:       1000,
	PersistInterval: 5 * time.Minute,
}
```

### Large Dataset (>1M vectors)

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/large.db",
	Distance:        quiver.Cosine,
	MaxElements:     10000000,
	HNSWM:           24,
	HNSWEfConstruct: 300,
	HNSWEfSearch:    150,
	BatchSize:       5000,
	PersistInterval: 10 * time.Minute,
}
```

### High-Throughput Search

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/search_optimized.db",
	Distance:        quiver.Cosine,
	MaxElements:     1000000,
	HNSWM:           32,
	HNSWEfConstruct: 400,
	HNSWEfSearch:    80, // Lower for faster search
	BatchSize:       1000,
	PersistInterval: 5 * time.Minute,
}
```

### High-Accuracy Search

```go
config := quiver.Config{
	Dimension:       128,
	StoragePath:     "./data/accuracy_optimized.db",
	Distance:        quiver.Cosine,
	MaxElements:     1000000,
	HNSWM:           48,  // Higher for better graph quality
	HNSWEfConstruct: 500, // Higher for better graph construction
	HNSWEfSearch:    200, // Higher for more accurate search
	BatchSize:       1000,
	PersistInterval: 5 * time.Minute,
}
```

## Next Steps

Now that you've optimized Quiver's performance, check out:
- **Benchmarking** - Measure Quiver's performance
- **HNSW Algorithm** - Learn more about how HNSW works
- **DuckDB Integration** - Understand how Quiver uses DuckDB