Vector Operations

Quiver offers a rich set of vector operations to help you manage and search your vector database. Let's explore all the ways you can sling vectors around! 🏹

Adding Vectors

Basic Addition

The most fundamental operation is adding a vector to the index:

// Add a single vector with ID 1
err := idx.Add(1, []float32{0.1, 0.2, 0.3, ...}, map[string]interface{}{
    "category": "science",
    "name": "black hole",
})

Each vector needs:

A unique ID (uint64)
The vector data ([]float32)
Metadata (map[string]interface{})

Required Metadata

Quiver requires at least a "category" field in the metadata. This helps with organization and filtering.
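Since a missing "category" field would cause an addition to be rejected, it can help to check metadata before calling Add. The sketch below is one way to do that. The validateMetadata helper is hypothetical (it is not part of Quiver's API), and the requirement that the category be a non-empty string is an assumption for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// validateMetadata is a hypothetical pre-check, not a Quiver function.
// ASSUMPTION: "category" must be present and hold a non-empty string.
func validateMetadata(meta map[string]interface{}) error {
	v, ok := meta["category"]
	if !ok {
		return errors.New(`metadata is missing the required "category" field`)
	}
	s, isString := v.(string)
	if !isString {
		return fmt.Errorf(`"category" must be a string, got %T`, v)
	}
	if s == "" {
		return errors.New(`"category" must not be empty`)
	}
	return nil
}

func main() {
	good := map[string]interface{}{"category": "science", "name": "black hole"}
	bad := map[string]interface{}{"name": "black hole"}

	fmt.Println(validateMetadata(good) == nil) // true
	fmt.Println(validateMetadata(bad) != nil)  // true
}
```

Running a check like this before each Add keeps validation failures close to the code that built the metadata.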

Batch Addition

For better performance when adding many vectors, Quiver automatically batches additions:

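// Add 1000 vectors in a loop.
// generateRandomVector stands in for your own code that returns a []float32 of length dimension.
for i := 0; i < 1000; i++ {
    vector := generateRandomVector(dimension)
    metadata := map[string]interface{}{
        "category": "batch",
        "index": i,
    }
    if err := idx.Add(uint64(i), vector, metadata); err != nil {
        log.Fatalf("failed to add vector %d: %v", i, err)
    }
}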

The vectors will be added to a batch buffer and inserted into the index when:

The batch size reaches the configured limit (BatchSize)
You explicitly call idx.flushBatch()
You close the index with idx.Close()
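To make the flush conditions above concrete, here is a minimal, self-contained sketch of a size-triggered buffer with an explicit flush. It is not Quiver's implementation: the batcher and item types are hypothetical, and "flushing" here only counts items instead of inserting them into an index.

```go
package main

import "fmt"

// item is a hypothetical pending insertion.
type item struct {
	id     uint64
	vector []float32
}

// batcher is an illustrative buffer, not Quiver's internal type.
type batcher struct {
	batchSize int
	pending   []item
	flushed   int // number of items handed off so far
}

// add buffers one item and flushes once the buffer reaches batchSize.
func (b *batcher) add(it item) {
	b.pending = append(b.pending, it)
	if len(b.pending) >= b.batchSize {
		b.flush()
	}
}

// flush hands off all pending items (here it only counts them) and empties the buffer.
func (b *batcher) flush() {
	b.flushed += len(b.pending)
	b.pending = b.pending[:0]
}

func main() {
	b := &batcher{batchSize: 100}
	for i := 0; i < 250; i++ {
		b.add(item{id: uint64(i), vector: []float32{0.1, 0.2, 0.3}})
	}
	fmt.Println(b.flushed, len(b.pending)) // 200 50

	b.flush() // explicit flush, as would happen on close
	fmt.Println(b.flushed, len(b.pending)) // 250 0
}
```

The 50 leftover items show why a final flush matters: anything still in the buffer is not searchable until it has been inserted.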

Arrow Integration

For even faster bulk loading, Quiver supports Apache Arrow:

// Create Arrow record with vectors and metadata
builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
// ... populate the builder ...
record := builder.NewRecord()

// Add all vectors from the Arrow record
err := idx.AppendFromArrow(record)

This is the fastest way to add vectors, especially when loading from external sources.

Searching Vectors

Basic Search

The most common operation is searching for similar vectors:

// Search for the 10 most similar vectors
results, err := idx.Search(queryVector, 10, 1, 10)

The parameters are:

queryVector: The vector to search for
k: Number of results to return
page: Page number (1-indexed)
pageSize: Number of results per page
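The interaction between k, page, and pageSize can be pictured as slicing a ranked list. The sketch below assumes that pages partition the top-k results, which is an assumption about the semantics and not something confirmed here. The pageBounds helper is hypothetical.

```go
package main

import "fmt"

// pageBounds returns the half-open range [start, end) of a ranked result list
// covered by a 1-indexed page.
// ASSUMPTION: pages partition the top-k list; Quiver's actual behavior may differ.
func pageBounds(k, page, pageSize int) (start, end int) {
	if k <= 0 || page < 1 || pageSize <= 0 {
		return 0, 0
	}
	start = (page - 1) * pageSize
	if start >= k {
		return 0, 0 // the page lies past the end of the top-k list
	}
	end = start + pageSize
	if end > k {
		end = k
	}
	return start, end
}

func main() {
	fmt.Println(pageBounds(10, 1, 10)) // 0 10
	fmt.Println(pageBounds(25, 3, 10)) // 20 25
	fmt.Println(pageBounds(10, 2, 10)) // 0 0
}
```

Under this reading, the call Search(queryVector, 10, 1, 10) would return all ten results on the first page.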

The results include:

Vector ID
Distance to the query vector
Metadata

// %v is used for the name because metadata values are interface{} and may be missing.
for i, result := range results {
    fmt.Printf("%d. ID: %d, Distance: %.4f, Name: %v\n",
        i+1, result.ID, result.Distance, result.Metadata["name"])
}
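Because metadata values are stored as interface{}, reading a field as a string needs a type assertion. The helper below is a small sketch of a safe accessor. The name metadataString is hypothetical and it works on any map[string]interface{}.

```go
package main

import "fmt"

// metadataString returns meta[key] as a string, or fallback when the key
// is absent or holds a non-string value.
func metadataString(meta map[string]interface{}, key, fallback string) string {
	if s, ok := meta[key].(string); ok {
		return s
	}
	return fallback
}

func main() {
	meta := map[string]interface{}{"category": "science", "index": 7}

	fmt.Println(metadataString(meta, "category", "(unknown)"))   // science
	fmt.Println(metadataString(meta, "name", "(unnamed)"))       // (unnamed)
	fmt.Println(metadataString(meta, "index", "(not a string)")) // (not a string)
}
```

The two-value form of the type assertion never panics, so a missing or mistyped field falls through to the fallback.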

Hybrid Search (Vector + Metadata)

Combine vector similarity with metadata filtering:

// Find vectors similar to queryVector that match the filter
results, err := idx.SearchWithFilter(queryVector, 10, 
    "category = 'science' AND json_array_contains(tags, 'physics')")

The filter is a SQL WHERE clause that operates on the metadata.

SQL Power

The filter uses DuckDB's SQL engine, so you can use any SQL expression that DuckDB supports!
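When a filter is built from user input, string values should be quoted so that an embedded single quote cannot break the clause. Here is a minimal sketch using the standard SQL rule of doubling single quotes. The sqlString helper is hypothetical, and the name column is used purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// sqlString quotes s as a SQL string literal by doubling embedded single quotes.
func sqlString(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

func main() {
	category := "science"
	name := "Hawking's radiation"

	filter := fmt.Sprintf("category = %s AND name = %s",
		sqlString(category), sqlString(name))
	fmt.Println(filter)
	// category = 'science' AND name = 'Hawking''s radiation'
}
```

The resulting string can then be passed as the filter argument. Quoting only protects string literals, so column names and operators should still come from trusted code.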

Search with Negative Examples

Find vectors similar to A but dissimilar to B:

// Define what we like and what we don't like
positiveVector := []float32{0.1, 0.2, 0.3, ...}
negativeVectors := [][]float32{
    {0.9, 0.8, 0.7, ...},
    {0.8, 0.7, 0.6, ...},
}

// Search with negative examples
results, err := idx.SearchWithNegatives(
    positiveVector,    // What we're looking for
    negativeVectors,   // What we want to avoid
    10, 1, 10)         // k, page, pageSize

This is great for:

Recommendation systems ("more like this, less like that")
Refining search results based on user feedback
Exploring different regions of the vector space
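One common way to think about "more like this, less like that" is a score that rewards similarity to the positive example and penalizes similarity to the negatives. The sketch below illustrates that idea with cosine similarity. It is not Quiver's actual ranking method, and the score function and its penalty weight are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of a and b.
// It returns 0 when the lengths differ or either vector has zero norm.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// score is an illustrative ranking function: similarity to the positive
// example minus penalty times the mean similarity to the negatives.
func score(candidate, positive []float32, negatives [][]float32, penalty float64) float64 {
	s := cosine(candidate, positive)
	if len(negatives) == 0 {
		return s
	}
	var neg float64
	for _, n := range negatives {
		neg += cosine(candidate, n)
	}
	return s - penalty*neg/float64(len(negatives))
}

func main() {
	positive := []float32{1, 0}
	negatives := [][]float32{{0, 1}}

	near := []float32{1, 0.1} // close to the positive, far from the negative
	mixed := []float32{1, 1}  // equally close to both

	fmt.Println(score(near, positive, negatives, 0.5) > score(mixed, positive, negatives, 0.5)) // true
}
```

Raising the penalty pushes results further away from the negative examples, at the cost of relevance to the positive one.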

Faceted Search

Restrict a search to vectors whose metadata matches the given facets:

// Search with facets
results, err := idx.FacetedSearch(queryVector, 10, map[string]string{
    "category": "science",
    "year": "2023",
})
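Conceptually, a facet map acts as a set of key/value conditions on the metadata. The sketch below shows one plausible matching rule, where every facet must be present and equal. Both the rule and the matchesFacets helper are assumptions for illustration, not Quiver's documented behavior.

```go
package main

import "fmt"

// matchesFacets reports whether every facet key is present in meta with an
// equal value. Non-string metadata values are compared using their default
// formatting (fmt.Sprint), so the int 2023 matches the facet "2023".
func matchesFacets(meta map[string]interface{}, facets map[string]string) bool {
	for key, want := range facets {
		got, ok := meta[key]
		if !ok || fmt.Sprint(got) != want {
			return false
		}
	}
	return true
}

func main() {
	meta := map[string]interface{}{"category": "science", "year": 2023}

	fmt.Println(matchesFacets(meta, map[string]string{"category": "science", "year": "2023"})) // true
	fmt.Println(matchesFacets(meta, map[string]string{"category": "history"}))                 // false
}
```

Because facet values are strings, it is worth deciding up front whether numeric metadata such as a year is stored as a number or as text.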

Multi-Vector Search

Search with multiple query vectors:

// Search with multiple query vectors
multiResults, err := idx.MultiVectorSearch(
    [][]float32{vector1, vector2, vector3}, 
    5)

This returns a slice of result slices, one for each query vector.
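The "slice of result slices" shape is easiest to see with a brute-force stand-in. The sketch below runs an exact nearest-neighbor scan per query over a tiny in-memory store. The hit type and multiSearch function are hypothetical, and a real index would use approximate search instead of a full scan.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical result entry.
type hit struct {
	ID       uint64
	Distance float64
}

// squaredL2 returns the squared Euclidean distance; a and b must have equal length.
func squaredL2(a, b []float32) float64 {
	var sum float64
	for i := range a {
		d := float64(a[i]) - float64(b[i])
		sum += d * d
	}
	return sum
}

// multiSearch returns, for each query, up to k nearest stored vectors,
// ordered by distance (ties broken by ID).
func multiSearch(store map[uint64][]float32, queries [][]float32, k int) [][]hit {
	if k < 0 {
		k = 0
	}
	out := make([][]hit, len(queries))
	for qi, q := range queries {
		hits := make([]hit, 0, len(store))
		for id, v := range store {
			hits = append(hits, hit{ID: id, Distance: squaredL2(q, v)})
		}
		sort.Slice(hits, func(i, j int) bool {
			if hits[i].Distance != hits[j].Distance {
				return hits[i].Distance < hits[j].Distance
			}
			return hits[i].ID < hits[j].ID
		})
		if len(hits) > k {
			hits = hits[:k]
		}
		out[qi] = hits
	}
	return out
}

func main() {
	store := map[uint64][]float32{1: {0, 0}, 2: {1, 1}, 3: {5, 5}}
	results := multiSearch(store, [][]float32{{0, 0}, {5, 4}}, 2)

	// results[i] holds the hits for the i-th query vector.
	for qi, hits := range results {
		for rank, h := range hits {
			fmt.Printf("query %d, rank %d: ID %d\n", qi, rank+1, h.ID)
		}
	}
}
```

The outer index always lines up with the order of the query vectors, so results for a given query can be looked up by position.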

Metadata Operations

Querying Metadata

You can query metadata directly without vector search:

// Query metadata using SQL
results, err := idx.QueryMetadata(
    "SELECT * FROM metadata WHERE category = 'science' ORDER BY created_at DESC LIMIT 10")

This returns a slice of metadata maps.

Performance Considerations

Search Performance

The HNSWEfSearch parameter controls the trade-off between search speed and accuracy
Higher values give more accurate results but slower searches
For most applications, values between 50 and 200 provide a good balance
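A practical way to pick a value is to measure recall: compare the IDs returned by the approximate search against exact results for the same query. The sketch below computes recall for one query. The recallAtK helper is hypothetical and assumes each ID list contains no duplicates.

```go
package main

import "fmt"

// recallAtK returns the fraction of exact (ground-truth) IDs that also
// appear in the approximate result list. It returns 0 when exact is empty.
func recallAtK(approx, exact []uint64) float64 {
	if len(exact) == 0 {
		return 0
	}
	seen := make(map[uint64]struct{}, len(approx))
	for _, id := range approx {
		seen[id] = struct{}{}
	}
	found := 0
	for _, id := range exact {
		if _, ok := seen[id]; ok {
			found++
		}
	}
	return float64(found) / float64(len(exact))
}

func main() {
	exact := []uint64{1, 2, 3, 4}  // IDs from an exhaustive search
	approx := []uint64{1, 2, 4, 9} // IDs from the approximate index

	fmt.Printf("recall@4 = %.2f\n", recallAtK(approx, exact)) // recall@4 = 0.75
}
```

Averaging this over a sample of queries at several HNSWEfSearch settings shows where extra search effort stops improving accuracy.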

Addition Performance

Batch additions are much faster than individual additions
The BatchSize parameter controls how many vectors are batched before insertion
Larger batch sizes improve throughput but increase memory usage
Arrow integration provides the best performance for bulk loading

Next Steps

Now that you've mastered vector operations, check out:

Metadata & Filtering - Learn more about metadata and filtering
Persistence & Backup - Keep your data safe
HTTP API - Use Quiver as a service