Benchmarking

Want to know exactly how fast Quiver can go? Let's dive into benchmarking! This guide will show you how to measure Quiver's performance and compare different configurations. Time to put those vectors to the test! ⏱️

Why Benchmark?

Benchmarking helps you:

  1. Understand Quiver's performance characteristics
  2. Compare different configurations
  3. Identify bottlenecks
  4. Set realistic expectations for production use
  5. Track performance improvements over time

Built-in Benchmarks

Quiver comes with a comprehensive suite of benchmarks that measure various aspects of performance:

# Run all benchmarks
go test -bench=. -benchmem ./...

# Run a specific benchmark
go test -bench=BenchmarkSearch -benchmem ./...

Understanding Benchmark Output

Here's an example benchmark output:

BenchmarkSearch-10                 20054             59194 ns/op           24193 B/op         439 allocs/op

This tells you:

  • BenchmarkSearch-10: The benchmark name; the -10 suffix is the GOMAXPROCS value (typically the number of CPU cores) the benchmark ran with
  • 20054: Number of iterations run
  • 59194 ns/op: Average time per operation (59.2 microseconds)
  • 24193 B/op: Average memory allocated per operation (24.2 KB)
  • 439 allocs/op: Average number of allocations per operation
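
A handy rule of thumb: dividing one second (1e9 ns) by the ns/op figure gives approximate throughput. A tiny sketch of that conversion, using the number from the example output above:

package main

import "fmt"

func main() {
    // ns/op taken from the example benchmark line above
    nsPerOp := 59194.0

    // One second is 1e9 ns, so throughput ≈ 1e9 / (ns per op)
    opsPerSec := 1e9 / nsPerOp
    fmt.Printf("≈ %.1fK ops/sec\n", opsPerSec/1000) // prints ≈ 16.9K ops/sec
}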

Available Benchmarks

Quiver includes the following benchmarks:

Benchmark                                  Description
BenchmarkAdd                               Measures vector addition performance
BenchmarkSearch                            Measures basic vector search performance
BenchmarkHybridSearch                      Measures hybrid search (vector + metadata) performance
BenchmarkSearchWithNegatives               Measures search with negative examples performance
BenchmarkBatchAdd                          Measures batch addition performance with different batch sizes
BenchmarkSearchWithDifferentK              Measures search performance with different K values
BenchmarkSearchWithDifferentDimensions     Measures search performance with different vector dimensions
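
To time just one of these without also running the package's unit tests, anchor the -bench regular expression and disable test selection with -run:

# Run only BenchmarkHybridSearch, skipping regular unit tests
go test -run='^$' -bench='^BenchmarkHybridSearch$' -benchmem ./...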

Running Custom Benchmarks

Creating a Benchmark

You can create custom benchmarks to test specific scenarios:

// Example custom benchmark
func BenchmarkCustomSearch(b *testing.B) {
    // Setup
    logger := zap.NewNop() // Use a no-op logger for benchmarks (NewNop returns a *zap.Logger directly, no error)
    idx, _ := quiver.New(quiver.Config{
        Dimension:   128,
        StoragePath: ":memory:",
        HNSWM:       16,
        BatchSize:   1000,
    }, logger)
    defer idx.Close()

    // Add test vectors
    for i := 0; i < 10000; i++ {
        vector := generateRandomVector(128)
        metadata := map[string]interface{}{
            "category": "test",
            "id": i,
        }
        idx.Add(uint64(i), vector, metadata)
    }

    // Create query vector
    queryVector := generateRandomVector(128)

    // Reset timer before the actual benchmark
    b.ResetTimer()

    // Run the benchmark
    for i := 0; i < b.N; i++ {
        idx.Search(queryVector, 10, 1, 10)
    }
}
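
The benchmarks in this guide call a generateRandomVector helper that isn't part of Quiver's API; it's just local test scaffolding. A minimal sketch of such a helper (assumes a "math/rand" import):

// generateRandomVector is local test scaffolding, not part of Quiver.
// It builds a vector of the requested dimension with components in [0, 1).
func generateRandomVector(dim int) []float32 {
    v := make([]float32, dim)
    for i := range v {
        v[i] = rand.Float32() // requires "math/rand"
    }
    return v
}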

Benchmark Parameters

You can customize benchmark parameters:

# Run for at least 5 seconds per benchmark
go test -bench=. -benchtime=5s -benchmem ./...

# Run with a specific number of CPU cores
go test -bench=. -cpu=1,4,8 -benchmem ./...

# Run with verbose output
go test -bench=. -v -benchmem ./...
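
-benchtime also accepts a fixed iteration count (the Nx form), which can be useful when you want directly comparable runs across machines:

# Run each benchmark for exactly 1000 iterations instead of a time budget
go test -bench=. -benchtime=1000x -benchmem ./...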

Benchmark Scenarios

Vector Addition

Benchmark vector addition with different batch sizes:

func BenchmarkAddWithDifferentBatchSizes(b *testing.B) {
    batchSizes := []int{100, 1000, 10000}

    for _, batchSize := range batchSizes {
        b.Run(fmt.Sprintf("BatchSize-%d", batchSize), func(b *testing.B) {
            // Setup with specific batch size
            logger := zap.NewNop()
            idx, _ := quiver.New(quiver.Config{
                Dimension: 128,
                StoragePath: ":memory:",
                BatchSize: batchSize,
            }, logger)
            defer idx.Close()

            // Reset timer
            b.ResetTimer()

            // Run benchmark
            for i := 0; i < b.N; i++ {
                vector := generateRandomVector(128)
                metadata := map[string]interface{}{
                    "category": "test",
                    "id": i,
                }
                idx.Add(uint64(i), vector, metadata)
            }
        })
    }
}
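
Because the batch sizes are registered with b.Run, each one appears as a sub-benchmark, and you can target a single case by including its name in the -bench pattern:

# Run only the 1000-element batch size sub-benchmark
go test -bench='BenchmarkAddWithDifferentBatchSizes/BatchSize-1000' -benchmem ./...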

Search Performance

Benchmark search with different HNSW parameters:

func BenchmarkSearchWithDifferentEfSearch(b *testing.B) {
    efValues := []int{50, 100, 200, 400}

    for _, ef := range efValues {
        b.Run(fmt.Sprintf("Ef-%d", ef), func(b *testing.B) {
            // Setup with specific efSearch
            logger := zap.NewNop()
            idx, _ := quiver.New(quiver.Config{
                Dimension: 128,
                StoragePath: ":memory:",
                HNSWEfSearch: ef,
            }, logger)
            defer idx.Close()

            // Add test vectors
            for i := 0; i < 10000; i++ {
                vector := generateRandomVector(128)
                metadata := map[string]interface{}{
                    "category": "test",
                    "id": i,
                }
                idx.Add(uint64(i), vector, metadata)
            }

            // Create query vector
            queryVector := generateRandomVector(128)

            // Reset timer
            b.ResetTimer()

            // Run benchmark
            for i := 0; i < b.N; i++ {
                idx.Search(queryVector, 10, 1, 10)
            }
        })
    }
}
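
Single-goroutine numbers don't always predict behaviour under concurrent load. If you also want to measure concurrent search, testing.B's RunParallel drives the benchmark body from multiple goroutines; here is a minimal sketch, reusing the setup pattern above and assuming idx.Search is safe for concurrent readers:

func BenchmarkParallelSearch(b *testing.B) {
    // Setup (same pattern as the benchmarks above)
    logger := zap.NewNop()
    idx, _ := quiver.New(quiver.Config{
        Dimension:   128,
        StoragePath: ":memory:",
    }, logger)
    defer idx.Close()

    for i := 0; i < 10000; i++ {
        idx.Add(uint64(i), generateRandomVector(128), map[string]interface{}{"id": i})
    }
    queryVector := generateRandomVector(128)

    b.ResetTimer()

    // Each goroutine executes a share of the b.N search iterations.
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            idx.Search(queryVector, 10, 1, 10)
        }
    })
}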

Benchmark hybrid search with different filter selectivity:

func BenchmarkHybridSearchWithDifferentFilters(b *testing.B) {
    filters := []struct{
        name string
        filter string
        selectivity string
    }{
        {"HighlySelective", "id < 100", "1%"},
        {"MediumSelective", "id < 1000", "10%"},
        {"LowSelective", "id < 5000", "50%"},
    }

    for _, f := range filters {
        b.Run(fmt.Sprintf("%s-%s", f.name, f.selectivity), func(b *testing.B) {
            // Setup
            logger := zap.NewNop()
            idx, _ := quiver.New(quiver.Config{
                Dimension: 128,
                StoragePath: ":memory:",
            }, logger)
            defer idx.Close()

            // Add test vectors
            for i := 0; i < 10000; i++ {
                vector := generateRandomVector(128)
                metadata := map[string]interface{}{
                    "category": "test",
                    "id": i,
                }
                idx.Add(uint64(i), vector, metadata)
            }

            // Create query vector
            queryVector := generateRandomVector(128)

            // Reset timer
            b.ResetTimer()

            // Run benchmark
            for i := 0; i < b.N; i++ {
                idx.SearchWithFilter(queryVector, 10, f.filter)
            }
        })
    }
}

Comparing Results

Using benchstat

The benchstat tool helps compare benchmark results:

# Install benchstat
go install golang.org/x/perf/cmd/benchstat@latest

# Run benchmarks before changes (-count gives benchstat multiple samples per benchmark)
go test -bench=. -benchmem -count=10 ./... > before.txt

# Run benchmarks after changes
go test -bench=. -benchmem -count=10 ./... > after.txt

# Compare results
benchstat before.txt after.txt

Example output:

name                old time/op    new time/op    delta
Search-10             59.2µs ± 2%    52.1µs ± 3%  -12.00%  (p=0.000 n=10+10)
HybridSearch-10       208µs ± 5%     187µs ± 4%   -10.10%  (p=0.000 n=10+10)

name                old alloc/op   new alloc/op   delta
Search-10             24.2kB ± 0%    22.1kB ± 0%   -8.68%  (p=0.000 n=10+10)
HybridSearch-10       80.6kB ± 0%    75.2kB ± 0%   -6.70%  (p=0.000 n=10+10)

name                old allocs/op  new allocs/op  delta
Search-10               439 ± 0%       412 ± 0%    -6.15%  (p=0.000 n=10+10)
HybridSearch-10         822 ± 0%       798 ± 0%    -2.92%  (p=0.000 n=10+10)

This shows the performance change between the old and new versions.

Visualizing Results

You can visualize benchmark results with standard plotting tools such as matplotlib and pandas.

Example Python script for plotting:

import matplotlib.pyplot as plt
import pandas as pd
import re

# Parse benchmark output
def parse_benchmark(filename):
    data = []
    with open(filename, 'r') as f:
        for line in f:
            if line.startswith('Benchmark'):
                parts = re.split(r'\s+', line.strip())
                # With -benchmem the columns are:
                #   name  iterations  <ns> ns/op  <bytes> B/op  <allocs> allocs/op
                name = parts[0]
                ops = int(parts[1])
                ns_per_op = float(parts[2])
                mb_per_op = float(parts[4]) / 1024 / 1024   # B/op value
                allocs_per_op = int(parts[6])               # allocs/op value
                data.append({
                    'name': name,
                    'ops': ops,
                    'ns_per_op': ns_per_op,
                    'mb_per_op': mb_per_op,
                    'allocs_per_op': allocs_per_op
                })
    return pd.DataFrame(data)

# Load data
df = parse_benchmark('benchmark_results.txt')

# Plot
plt.figure(figsize=(12, 6))
plt.bar(df['name'], df['ns_per_op'] / 1000)  # Convert to microseconds
plt.ylabel('Time per operation (µs)')
plt.xticks(rotation=45, ha='right')
plt.title('Quiver Benchmark Performance')
plt.tight_layout()
plt.savefig('benchmark_performance.png')
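
The script reads plain go test output from benchmark_results.txt, so a typical workflow looks something like this (the plot_benchmarks.py filename is just an example):

# Capture benchmark output, then plot it
go test -bench=. -benchmem ./... > benchmark_results.txt
python plot_benchmarks.py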

Real-world Benchmarks

Here are some real-world benchmark results from Quiver running on an M2 Pro CPU:

Basic Operations

Operation               Throughput       Latency   Memory/Op   Allocs/Op
Add                     6.4K ops/sec     156µs     20.9 KB     370
Search                  16.9K ops/sec    59µs      24.2 KB     439
Hybrid Search           4.8K ops/sec     208µs     80.6 KB     822
Search with Negatives   7.9K ops/sec     126µs     32.5 KB     491

Batch Performance

Batch Size   Throughput          Latency   Memory/Op   Allocs/Op
100          63 batches/sec      15.8ms    2.0 MB      35.8K
1000         6.6 batches/sec     152ms     19.0 MB     331K
10000        0.64 batches/sec    1.57s     208 MB      3.7M

Search with Different K Values

K Value   Throughput       Latency   Memory/Op   Allocs/Op
10        16.5K ops/sec    61µs      23.8 KB     441
50        2.1K ops/sec     480µs     190 KB      2.9K
100       1.9K ops/sec     516µs     317 KB      2.9K

Search with Different Dimensions

Dimension   Throughput       Latency   Memory/Op   Allocs/Op
32          27.6K ops/sec    36µs      23.3 KB     429
128         16.0K ops/sec    63µs      25.6 KB     457
512         7.0K ops/sec     143µs     24.2 KB     455

Performance Tuning Based on Benchmarks

Based on benchmark results, here are some tuning recommendations:

For High-Throughput Addition

  • Increase batch size (1000-5000)
  • Use Arrow integration for bulk loading
  • Consider parallel additions with multiple goroutines

For Low-Latency Search

  • Reduce efSearch (50-80)
  • Use smaller vector dimensions if possible
  • Keep index size manageable
  • Consider in-memory storage

For High Recall

  • Increase M (32-64)
  • Increase efConstruction (300-500)
  • Increase efSearch (200-400)
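
To make these recommendations concrete, here is a minimal sketch of a recall-oriented configuration. It uses only the Config fields that appear earlier in this guide; the efConstruction setting is not shown here, so check the configuration reference for its exact field name:

// Recall-oriented configuration: larger M and efSearch trade latency for accuracy.
// Assumes the same imports as the earlier examples.
logger := zap.NewNop()
idx, err := quiver.New(quiver.Config{
    Dimension:    128,
    StoragePath:  ":memory:", // in-memory storage also reduces latency
    HNSWM:        32,         // more graph connections per node (32-64)
    HNSWEfSearch: 200,        // wider candidate list at query time (200-400)
}, logger)
if err != nil {
    panic(err)
}
defer idx.Close()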

Next Steps

Now that you've benchmarked Quiver's performance, use the results to tune your configuration for your workload.