Benchmarking¶
Want to know exactly how fast Quiver can go? Let's dive into benchmarking! This guide will show you how to measure Quiver's performance and compare different configurations. Time to put those vectors to the test! ⏱️
Why Benchmark?¶
Benchmarking helps you:
- Understand Quiver's performance characteristics
- Compare different configurations
- Identify bottlenecks
- Set realistic expectations for production use
- Track performance improvements over time
Built-in Benchmarks¶
Quiver comes with a comprehensive suite of benchmarks that measure various aspects of performance:
# Run all benchmarks
go test -bench=. -benchmem ./...
# Run a specific benchmark
go test -bench=BenchmarkSearch -benchmem ./...
Understanding Benchmark Output¶
Here's an example benchmark output:

BenchmarkSearch-10    20054    59194 ns/op    24193 B/op    439 allocs/op

This tells you:

- BenchmarkSearch-10: The benchmark name and the number of CPU cores used
- 20054: Number of iterations run
- 59194 ns/op: Average time per operation (59.2 microseconds)
- 24193 B/op: Average memory allocated per operation (24.2 KB)
- 439 allocs/op: Average number of allocations per operation
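A handy conversion: throughput is the reciprocal of time per operation, so 59194 ns/op works out to roughly 1e9 / 59194 ≈ 16.9K searches per second, which matches the throughput figures quoted later on this page.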
Available Benchmarks¶
Quiver includes the following benchmarks:
Benchmark | Description |
---|---|
BenchmarkAdd | Measures vector addition performance |
BenchmarkSearch | Measures basic vector search performance |
BenchmarkHybridSearch | Measures hybrid search (vector + metadata) performance |
BenchmarkSearchWithNegatives | Measures search with negative examples performance |
BenchmarkBatchAdd | Measures batch addition performance with different batch sizes |
BenchmarkSearchWithDifferentK | Measures search performance with different K values |
BenchmarkSearchWithDifferentDimensions | Measures search performance with different vector dimensions |
Running Custom Benchmarks¶
Creating a Benchmark¶
You can create custom benchmarks to test specific scenarios:
// Example custom benchmark
func BenchmarkCustomSearch(b *testing.B) {
    // Setup: use a no-op logger so logging doesn't skew the measurement
    logger := zap.NewNop()
    idx, err := quiver.New(quiver.Config{
        Dimension:   128,
        StoragePath: ":memory:",
        HNSWM:       16,
        BatchSize:   1000,
    }, logger)
    if err != nil {
        b.Fatal(err)
    }
    defer idx.Close()

    // Add test vectors
    for i := 0; i < 10000; i++ {
        vector := generateRandomVector(128)
        metadata := map[string]interface{}{
            "category": "test",
            "id":       i,
        }
        idx.Add(uint64(i), vector, metadata)
    }

    // Create a query vector
    queryVector := generateRandomVector(128)

    // Reset the timer so setup cost is excluded from the measurement
    b.ResetTimer()

    // Run the benchmark
    for i := 0; i < b.N; i++ {
        idx.Search(queryVector, 10, 1, 10)
    }
}
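The examples on this page call generateRandomVector, which is not part of Quiver's API; it's a test helper you define yourself. A minimal sketch (the name and the unit-length normalization are assumptions, adjust to taste):

// generateRandomVector is a hypothetical test helper used by the examples
// on this page. It returns a random unit-length vector of the given dimension.
// Requires "math" and "math/rand" in your imports.
func generateRandomVector(dim int) []float32 {
    vec := make([]float32, dim)
    var sum float64
    for i := range vec {
        vec[i] = rand.Float32()
        sum += float64(vec[i]) * float64(vec[i])
    }
    norm := float32(math.Sqrt(sum))
    for i := range vec {
        vec[i] /= norm
    }
    return vec
}

Then run the benchmark on its own:

# Run only the custom benchmark
go test -bench=BenchmarkCustomSearch -benchmem ./...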
Benchmark Parameters¶
You can customize benchmark parameters:
# Run for at least 5 seconds per benchmark
go test -bench=. -benchtime=5s -benchmem ./...
# Run with a specific number of CPU cores
go test -bench=. -cpu=1,4,8 -benchmem ./...
# Run with verbose output
go test -bench=. -v -benchmem ./...
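Two more flags are worth knowing: -run=^$ skips regular unit tests so only benchmarks execute, and -count repeats each benchmark to collect the multiple samples that benchstat (covered below) needs:

# Skip unit tests and run only benchmarks
go test -bench=. -run=^$ -benchmem ./...
# Collect 10 samples per benchmark for statistical comparison
go test -bench=. -count=10 -benchmem ./...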
Benchmark Scenarios¶
Vector Addition¶
Benchmark vector addition with different batch sizes:
func BenchmarkAddWithDifferentBatchSizes(b *testing.B) {
    batchSizes := []int{100, 1000, 10000}
    for _, batchSize := range batchSizes {
        b.Run(fmt.Sprintf("BatchSize-%d", batchSize), func(b *testing.B) {
            // Setup with a specific batch size
            logger := zap.NewNop()
            idx, err := quiver.New(quiver.Config{
                Dimension:   128,
                StoragePath: ":memory:",
                BatchSize:   batchSize,
            }, logger)
            if err != nil {
                b.Fatal(err)
            }
            defer idx.Close()

            // Reset the timer so setup cost is excluded
            b.ResetTimer()

            // Run the benchmark
            for i := 0; i < b.N; i++ {
                vector := generateRandomVector(128)
                metadata := map[string]interface{}{
                    "category": "test",
                    "id":       i,
                }
                idx.Add(uint64(i), vector, metadata)
            }
        })
    }
}
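Each sub-benchmark appears in the output under its parent's name, e.g. BenchmarkAddWithDifferentBatchSizes/BatchSize-1000-10, and the -bench regex matches against that full path, so you can single out one batch size:

# Run only the 1000-vector batch size (the $ avoids also matching 10000)
go test -bench='BenchmarkAddWithDifferentBatchSizes/BatchSize-1000$' -benchmem ./...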
Search Performance¶
Benchmark search with different HNSW parameters:
func BenchmarkSearchWithDifferentEfSearch(b *testing.B) {
    efValues := []int{50, 100, 200, 400}
    for _, ef := range efValues {
        b.Run(fmt.Sprintf("Ef-%d", ef), func(b *testing.B) {
            // Setup with a specific efSearch value
            logger := zap.NewNop()
            idx, err := quiver.New(quiver.Config{
                Dimension:    128,
                StoragePath:  ":memory:",
                HNSWEfSearch: ef,
            }, logger)
            if err != nil {
                b.Fatal(err)
            }
            defer idx.Close()

            // Add test vectors
            for i := 0; i < 10000; i++ {
                vector := generateRandomVector(128)
                metadata := map[string]interface{}{
                    "category": "test",
                    "id":       i,
                }
                idx.Add(uint64(i), vector, metadata)
            }

            // Create a query vector
            queryVector := generateRandomVector(128)

            // Reset the timer so setup cost is excluded
            b.ResetTimer()

            // Run the benchmark
            for i := 0; i < b.N; i++ {
                idx.Search(queryVector, 10, 1, 10)
            }
        })
    }
}
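If you want the configuration value attached to each result line rather than only encoded in the sub-benchmark name, the testing package (Go 1.13+) supports custom metrics; a minimal sketch, placed at the end of the b.Run closure:

// Attach the efSearch value to the benchmark output as a custom metric
b.ReportMetric(float64(ef), "efSearch")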
Hybrid Search¶
Benchmark hybrid search with different filter selectivity:
func BenchmarkHybridSearchWithDifferentFilters(b *testing.B) {
    filters := []struct {
        name        string
        filter      string
        selectivity string
    }{
        {"HighlySelective", "id < 100", "1%"},
        {"MediumSelective", "id < 1000", "10%"},
        {"LowSelective", "id < 5000", "50%"},
    }
    for _, f := range filters {
        b.Run(fmt.Sprintf("%s-%s", f.name, f.selectivity), func(b *testing.B) {
            // Setup
            logger := zap.NewNop()
            idx, err := quiver.New(quiver.Config{
                Dimension:   128,
                StoragePath: ":memory:",
            }, logger)
            if err != nil {
                b.Fatal(err)
            }
            defer idx.Close()

            // Add 10,000 test vectors so the selectivity percentages hold
            for i := 0; i < 10000; i++ {
                vector := generateRandomVector(128)
                metadata := map[string]interface{}{
                    "category": "test",
                    "id":       i,
                }
                idx.Add(uint64(i), vector, metadata)
            }

            // Create a query vector
            queryVector := generateRandomVector(128)

            // Reset the timer so setup cost is excluded
            b.ResetTimer()

            // Run the benchmark
            for i := 0; i < b.N; i++ {
                idx.SearchWithFilter(queryVector, 10, f.filter)
            }
        })
    }
}
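To run just this comparison:

# Compare the three filter selectivities in one run
go test -bench=BenchmarkHybridSearchWithDifferentFilters -benchmem ./...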
Comparing Results¶
Using benchstat¶
The benchstat tool helps compare benchmark results:
# Install benchstat
go install golang.org/x/perf/cmd/benchstat@latest
# Run benchmarks before changes (-count=10 collects the samples benchstat needs)
go test -bench=. -count=10 -benchmem ./... > before.txt
# Run benchmarks after changes
go test -bench=. -count=10 -benchmem ./... > after.txt
# Compare results
benchstat before.txt after.txt
Example output:
name               old time/op    new time/op    delta
Search-10          59.2µs ± 2%    52.1µs ± 3%    -12.00%  (p=0.000 n=10+10)
HybridSearch-10     208µs ± 5%     187µs ± 4%    -10.10%  (p=0.000 n=10+10)

name               old alloc/op   new alloc/op   delta
Search-10          24.2kB ± 0%    22.1kB ± 0%     -8.68%  (p=0.000 n=10+10)
HybridSearch-10    80.6kB ± 0%    75.2kB ± 0%     -6.70%  (p=0.000 n=10+10)

name               old allocs/op  new allocs/op  delta
Search-10             439 ± 0%       412 ± 0%     -6.15%  (p=0.000 n=10+10)
HybridSearch-10       822 ± 0%       798 ± 0%     -2.92%  (p=0.000 n=10+10)
This shows the performance change between the old and new versions: delta is the relative change, and benchstat only reports a delta when the p-value indicates the difference is statistically significant (insignificant results are shown as ~).
Visualizing Results¶
You can visualize benchmark results using tools like:
- benchviz
- benchgraph
- Custom plotting with Python/matplotlib
Example Python script for plotting:
import matplotlib.pyplot as plt
import pandas as pd
import re

# Parse `go test -bench` output (run with -benchmem) into a DataFrame
def parse_benchmark(filename):
    data = []
    with open(filename, 'r') as f:
        for line in f:
            if line.startswith('Benchmark'):
                parts = re.split(r'\s+', line.strip())
                if len(parts) < 7:
                    continue  # skip lines without -benchmem columns
                name = parts[0]
                ops = int(parts[1])
                ns_per_op = float(parts[2])                # parts[3] is the "ns/op" label
                mb_per_op = float(parts[4]) / 1024 / 1024  # parts[5] is the "B/op" label
                allocs_per_op = int(parts[6])              # parts[7] is the "allocs/op" label
                data.append({
                    'name': name,
                    'ops': ops,
                    'ns_per_op': ns_per_op,
                    'mb_per_op': mb_per_op,
                    'allocs_per_op': allocs_per_op
                })
    return pd.DataFrame(data)

# Load data
df = parse_benchmark('benchmark_results.txt')

# Plot time per operation
plt.figure(figsize=(12, 6))
plt.bar(df['name'], df['ns_per_op'] / 1000)  # convert ns to microseconds
plt.ylabel('Time per operation (µs)')
plt.xticks(rotation=45, ha='right')
plt.title('Quiver Benchmark Performance')
plt.tight_layout()
plt.savefig('benchmark_performance.png')
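To use it, capture benchmark output to the file the script expects and run the script (the plot_benchmarks.py filename is just a suggestion):

# Capture benchmark output, then plot it
go test -bench=. -benchmem ./... > benchmark_results.txt
python plot_benchmarks.py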
Real-world Benchmarks¶
Here are some real-world benchmark results from Quiver running on an M2 Pro CPU:
Basic Operations¶
Operation | Throughput | Latency | Memory/Op | Allocs/Op |
---|---|---|---|---|
Add | 6.4K ops/sec | 156µs | 20.9 KB | 370 |
Search | 16.9K ops/sec | 59µs | 24.2 KB | 439 |
Hybrid Search | 4.8K ops/sec | 208µs | 80.6 KB | 822 |
Search with Negatives | 7.9K ops/sec | 126µs | 32.5 KB | 491 |
Batch Performance¶
Batch Size | Throughput | Latency | Memory/Op | Allocs/Op |
---|---|---|---|---|
100 | 63 batches/sec | 15.8ms | 2.0 MB | 35.8K |
1000 | 6.6 batches/sec | 152ms | 19.0 MB | 331K |
10000 | 0.64 batches/sec | 1.57s | 208 MB | 3.7M |
Search with Different K Values¶
K Value | Throughput | Latency | Memory/Op | Allocs/Op |
---|---|---|---|---|
10 | 16.5K ops/sec | 61µs | 23.8 KB | 441 |
50 | 2.1K ops/sec | 480µs | 190 KB | 2.9K |
100 | 1.9K ops/sec | 516µs | 317 KB | 2.9K |
Search with Different Dimensions¶
Dimension | Throughput | Latency | Memory/Op | Allocs/Op |
---|---|---|---|---|
32 | 27.6K ops/sec | 36µs | 23.3 KB | 429 |
128 | 16.0K ops/sec | 63µs | 25.6 KB | 457 |
512 | 7.0K ops/sec | 143µs | 24.2 KB | 455 |
Performance Tuning Based on Benchmarks¶
Based on benchmark results, here are some tuning recommendations:
For High-Throughput Addition¶
- Increase batch size (1000-5000)
- Use Arrow integration for bulk loading
- Consider parallel additions with multiple goroutines (see the sketch below)
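Here's a minimal sketch of the parallel-insertion idea, sharding the ID space across goroutines. It assumes idx.Add is safe for concurrent use; verify that against Quiver's documentation before relying on it (requires "sync" in your imports):

// Shard 10,000 inserts across 8 workers.
// Assumes idx.Add may be called from multiple goroutines.
const numWorkers, perWorker = 8, 1250
var wg sync.WaitGroup
for w := 0; w < numWorkers; w++ {
    wg.Add(1)
    go func(start int) {
        defer wg.Done()
        for i := start; i < start+perWorker; i++ {
            vector := generateRandomVector(128)
            idx.Add(uint64(i), vector, map[string]interface{}{"id": i})
        }
    }(w * perWorker)
}
wg.Wait()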
For Low-Latency Search¶
- Reduce efSearch (50-80)
- Use smaller vector dimensions if possible
- Keep index size manageable
- Consider in-memory storage (a config sketch follows this list)
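For example, a latency-oriented configuration using the Config fields shown earlier on this page (the values are starting points, not universal answers):

// Latency-oriented setup: in-memory storage and a small efSearch
idx, err := quiver.New(quiver.Config{
    Dimension:    128,
    StoragePath:  ":memory:", // avoid disk I/O on the query path
    HNSWEfSearch: 64,         // small search beam: faster, slightly lower recall
}, logger)
if err != nil {
    log.Fatal(err)
}
defer idx.Close()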
For High-Accuracy Search¶
- Increase M (32-64)
- Increase efConstruction (300-500)
- Increase efSearch (200-400); see the sketch below
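And an accuracy-oriented sketch. HNSWM and HNSWEfSearch appear in the examples above; the efConstruction field name is a guess here, so check quiver.Config for the exact spelling:

// Accuracy-oriented setup: denser graph and wider search beam
idx, err := quiver.New(quiver.Config{
    Dimension:    128,
    StoragePath:  ":memory:",
    HNSWM:        48,  // more neighbors per node improves recall
    HNSWEfSearch: 300, // wider beam improves recall at some latency cost
    // HNSWEfConstruction: 400, // hypothetical field name; verify before use
}, logger)
if err != nil {
    log.Fatal(err)
}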
Next Steps¶
Now that you've benchmarked Quiver's performance, check out:
- Performance Tuning - Optimize Quiver for your needs
- HNSW Algorithm - Learn more about how HNSW works
- DuckDB Integration - Understand how Quiver uses DuckDB