Vector Operations

Quiver offers a rich set of vector operations to help you manage and search your vector database. Let's explore all the ways you can sling vectors around! 🏹

Adding Vectors

Basic Addition

The most fundamental operation is adding a vector to the index:

// Add a single vector with ID 1
err := idx.Add(1, []float32{0.1, 0.2, 0.3, ...}, map[string]interface{}{
    "category": "science",
    "name": "black hole",
})

Each vector needs:

  • A unique ID (uint64)
  • The vector data ([]float32)
  • Metadata (map[string]interface{})

Required Metadata

Quiver requires at least a "category" field in the metadata. This helps with organization and filtering.
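
Since a missing "category" is an easy mistake to make, it can help to check metadata before handing it to idx.Add. The following is a minimal sketch only: validateMetadata is a hypothetical helper, not part of Quiver's API, and it assumes the category should be a non-empty string (the requirement above only says the field must exist).

```go
package main

import (
	"errors"
	"fmt"
)

// validateMetadata is a hypothetical pre-flight check (not a Quiver function).
// Assumption: "category" must be present and hold a non-empty string.
func validateMetadata(meta map[string]interface{}) error {
	value, ok := meta["category"]
	if !ok {
		return errors.New(`metadata is missing the required "category" field`)
	}
	category, isString := value.(string)
	if !isString || category == "" {
		return errors.New(`metadata "category" must be a non-empty string`)
	}
	return nil
}

func main() {
	good := map[string]interface{}{"category": "science", "name": "black hole"}
	bad := map[string]interface{}{"name": "no category here"}

	fmt.Println("good:", validateMetadata(good))
	fmt.Println("bad: ", validateMetadata(bad))
}
```

Running the check before each Add call lets you report a clear error close to where the metadata was built.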

Batch Addition

For better performance when adding many vectors, Quiver automatically batches additions:

// Add 1000 vectors in a loop
for i := 0; i < 1000; i++ {
    vector := generateRandomVector(dimension)
    metadata := map[string]interface{}{
        "category": "batch",
        "index": i,
    }
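    // Add can fail (for example, on invalid metadata), so check the error
    if err := idx.Add(uint64(i), vector, metadata); err != nil {
        log.Fatalf("adding vector %d: %v", i, err)
    }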
}

Vectors are held in a batch buffer and inserted into the index when any of the following happens:

  1. The batch size reaches the configured limit (BatchSize)
  2. You explicitly call idx.flushBatch()
  3. You close the index with idx.Close()
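
To make the flush conditions concrete, here is a toy buffer that flushes when it reaches a size limit or when asked to explicitly. It is a sketch of the general pattern, not Quiver's actual implementation: the entry and batchBuffer types and the flushFn callback are invented for illustration.

```go
package main

import "fmt"

// entry is one pending vector. All names in this sketch are illustrative.
type entry struct {
	id     uint64
	vector []float32
}

// batchBuffer collects entries and hands them to flushFn in groups.
type batchBuffer struct {
	limit   int
	pending []entry
	flushFn func([]entry)
}

// add appends an entry and flushes once the buffer reaches the limit.
func (b *batchBuffer) add(e entry) {
	b.pending = append(b.pending, e)
	if len(b.pending) >= b.limit {
		b.flush()
	}
}

// flush hands all pending entries to flushFn and empties the buffer.
// Calling it on an empty buffer does nothing.
func (b *batchBuffer) flush() {
	if len(b.pending) == 0 {
		return
	}
	b.flushFn(b.pending)
	b.pending = nil
}

func main() {
	flushes := 0
	buf := &batchBuffer{limit: 100, flushFn: func(batch []entry) {
		flushes++
		fmt.Printf("flushed %d entries\n", len(batch))
	}}

	for i := 0; i < 250; i++ {
		buf.add(entry{id: uint64(i), vector: []float32{0.1, 0.2, 0.3}})
	}
	buf.flush() // explicit flush for the leftovers, as closing the index would do

	fmt.Println("total flushes:", flushes)
}
```

The takeaway is that vectors added since the last flush are not yet in the index, which is why closing the index properly matters.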

Arrow Integration

For even faster bulk loading, Quiver supports Apache Arrow:

// Create Arrow record with vectors and metadata
builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
// ... populate the builder ...
record := builder.NewRecord()

// Add all vectors from the Arrow record
err := idx.AppendFromArrow(record)

This is the fastest way to add vectors, especially when loading from external sources.

Searching Vectors

The most common operation is searching for similar vectors:

// Search for the 10 most similar vectors
results, err := idx.Search(queryVector, 10, 1, 10)

The parameters are:

  • queryVector: The vector to search for
  • k: Number of results to return
  • page: Page number (1-indexed)
  • pageSize: Number of results per page

The results include:

  • Vector ID
  • Distance (similarity score)
  • Metadata

for i, result := range results {
    fmt.Printf("%d. ID: %d, Distance: %.4f, Name: %s\n", 
        i+1, result.ID, result.Distance, result.Metadata["name"])
}
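
If the page and pageSize parameters are new to you, the arithmetic behind 1-indexed paging can be sketched as below. This is an assumption about how paging typically works (a window over a ranked result list), not a description of Quiver's internals; pageBounds is a hypothetical helper.

```go
package main

import "fmt"

// pageBounds returns the half-open range [start, end) of items that belong to
// a 1-indexed page, clamped to total. Invalid input or a page past the end
// yields an empty range (start == end).
func pageBounds(total, page, pageSize int) (start, end int) {
	if total < 0 || page < 1 || pageSize < 1 {
		return 0, 0
	}
	start = (page - 1) * pageSize
	if start >= total {
		return total, total
	}
	end = start + pageSize
	if end > total {
		end = total
	}
	return start, end
}

func main() {
	// Pretend these are the IDs of 25 ranked results.
	ids := make([]uint64, 25)
	for i := range ids {
		ids[i] = uint64(i + 1)
	}

	start, end := pageBounds(len(ids), 3, 10)
	fmt.Println(ids[start:end]) // the third page holds the last five IDs
}
```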

Hybrid Search (Vector + Metadata)

Combine vector similarity with metadata filtering:

// Find vectors similar to queryVector that match the filter
results, err := idx.SearchWithFilter(queryVector, 10, 
    "category = 'science' AND json_array_contains(tags, 'physics')")

The filter is a SQL WHERE clause that operates on the metadata.

SQL Power

The filter uses DuckDB's SQL engine, so you can use any SQL expression that DuckDB supports!

Search with Negative Examples

Find vectors similar to A but dissimilar to B:

// Define what we like and what we don't like
positiveVector := []float32{0.1, 0.2, 0.3, ...}
negativeVectors := [][]float32{
    {0.9, 0.8, 0.7, ...},
    {0.8, 0.7, 0.6, ...},
}

// Search with negative examples
results, err := idx.SearchWithNegatives(
    positiveVector,    // What we're looking for
    negativeVectors,   // What we want to avoid
    10, 1, 10)         // k, page, pageSize

This is great for:

  • Recommendation systems ("more like this, less like that")
  • Refining search results based on user feedback
  • Exploring different regions of the vector space
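
One common way to get the "more like this, less like that" effect is to score each candidate by its distance to the positive example minus a weighted average distance to the negatives. The sketch below shows that idea on plain slices. It is not necessarily how SearchWithNegatives works internally; the squaredL2, score, and rank functions and the weight parameter are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// squaredL2 returns the squared Euclidean distance between two vectors.
// It assumes both vectors have the same length.
func squaredL2(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// score is lower for candidates that are near the positive example and far
// from the negatives. weight controls how strongly negatives push away.
func score(candidate, positive []float32, negatives [][]float32, weight float32) float32 {
	s := squaredL2(candidate, positive)
	if len(negatives) == 0 {
		return s
	}
	var negTotal float32
	for _, n := range negatives {
		negTotal += squaredL2(candidate, n)
	}
	return s - weight*negTotal/float32(len(negatives))
}

// rank returns candidate indices ordered from best (lowest score) to worst.
func rank(candidates [][]float32, positive []float32, negatives [][]float32, weight float32) []int {
	order := make([]int, len(candidates))
	scores := make([]float32, len(candidates))
	for i, c := range candidates {
		order[i] = i
		scores[i] = score(c, positive, negatives, weight)
	}
	sort.SliceStable(order, func(a, b int) bool {
		return scores[order[a]] < scores[order[b]]
	})
	return order
}

func main() {
	positive := []float32{0, 0}
	negatives := [][]float32{{1, 0}}
	candidates := [][]float32{
		{0.5, 0},  // same distance to positive, but close to the negative
		{-0.5, 0}, // same distance to positive, far from the negative
	}
	fmt.Println(rank(candidates, positive, negatives, 0.5))
}
```

Both candidates are equally close to the positive vector, so the negative example alone decides the order here.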

Faceted Search

Restrict results to vectors whose metadata matches specific facets:

// Search with facets
results, err := idx.FacetedSearch(queryVector, 10, map[string]string{
    "category": "science",
    "year": "2023",
})
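
Conceptually, a facet map like the one above acts as a set of exact-match conditions on metadata. The sketch below assumes all facets must match (AND semantics) and that values are compared by their string form; both are assumptions for illustration, and matchesFacets is a hypothetical helper rather than a Quiver function.

```go
package main

import "fmt"

// matchesFacets reports whether every facet key exists in meta with an equal
// value. Values are compared via their default string form, so an integer
// year 2023 matches the facet "2023".
func matchesFacets(meta map[string]interface{}, facets map[string]string) bool {
	for key, want := range facets {
		got, ok := meta[key]
		if !ok || fmt.Sprint(got) != want {
			return false
		}
	}
	return true
}

func main() {
	meta := map[string]interface{}{"category": "science", "year": 2023}

	fmt.Println(matchesFacets(meta, map[string]string{"category": "science", "year": "2023"}))
	fmt.Println(matchesFacets(meta, map[string]string{"category": "science", "year": "2022"}))
}
```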

Multi-Vector Search

Run several queries in one call:

// Search with multiple query vectors
multiResults, err := idx.MultiVectorSearch(
    [][]float32{vector1, vector2, vector3}, 
    5)

This returns a slice of result slices, one for each query vector.
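
If you want one combined list instead of a list per query, you can merge the per-query results yourself. The following sketch keeps the smallest distance seen for each ID and sorts by distance. The hit type is a stand-in with only the two fields needed here, not Quiver's real result type, and mergeHits is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a stand-in for a search result (illustrative, not Quiver's type).
type hit struct {
	ID       uint64
	Distance float32
}

// mergeHits flattens per-query result lists into one list, keeping the
// smallest distance for each ID, sorted by distance and then by ID.
func mergeHits(perQuery [][]hit) []hit {
	best := make(map[uint64]float32)
	for _, hits := range perQuery {
		for _, h := range hits {
			if d, seen := best[h.ID]; !seen || h.Distance < d {
				best[h.ID] = h.Distance
			}
		}
	}

	merged := make([]hit, 0, len(best))
	for id, d := range best {
		merged = append(merged, hit{ID: id, Distance: d})
	}
	sort.Slice(merged, func(i, j int) bool {
		if merged[i].Distance != merged[j].Distance {
			return merged[i].Distance < merged[j].Distance
		}
		return merged[i].ID < merged[j].ID
	})
	return merged
}

func main() {
	perQuery := [][]hit{
		{{ID: 1, Distance: 0.5}, {ID: 2, Distance: 0.2}},
		{{ID: 1, Distance: 0.1}, {ID: 3, Distance: 0.2}},
	}
	for _, h := range mergeHits(perQuery) {
		fmt.Printf("ID: %d, Distance: %.4f\n", h.ID, h.Distance)
	}
}
```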

Metadata Operations

Querying Metadata

You can query metadata directly without vector search:

// Query metadata using SQL
results, err := idx.QueryMetadata(
    "SELECT * FROM metadata WHERE category = 'science' ORDER BY created_at DESC LIMIT 10")

This returns a slice of metadata maps.

Performance Considerations

Search Performance

  • The HNSWEfSearch parameter controls the trade-off between search speed and accuracy
  • Higher values give more accurate results but slower searches
  • For most applications, values between 50 and 200 provide a good balance

Addition Performance

  • Batch additions are much faster than individual additions
  • The BatchSize parameter controls how many vectors are batched before insertion
  • Larger batch sizes improve throughput but increase memory usage
  • Arrow integration provides the best performance for bulk loading
