Vector Operations¶
Quiver offers a rich set of vector operations to help you manage and search your vector database. Let's explore all the ways you can sling vectors around! 🏹
Adding Vectors¶
Basic Addition¶
The most fundamental operation is adding a vector to the index:
// Add a single vector with ID 1
err := idx.Add(1, []float32{0.1, 0.2, 0.3, ...}, map[string]interface{}{
	"category": "science",
	"name":     "black hole",
})
Each vector needs:
- A unique ID (uint64)
- The vector data ([]float32)
- Metadata (map[string]interface{})
Required Metadata
Quiver requires at least a "category" field in the metadata. This helps with organization and filtering.
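Because a missing "category" field is an easy mistake, it can help to check metadata before handing it to the index. The sketch below is only an illustration: `validateMetadata` is a hypothetical helper, not part of Quiver, and the rule that the value must be a non-empty string is an assumption (the text above only says the field is required).

```go
package main

import (
	"errors"
	"fmt"
)

// validateMetadata is a hypothetical pre-flight check, not a Quiver API.
// Assumption: "category" must be present and hold a non-empty string.
func validateMetadata(meta map[string]interface{}) error {
	v, ok := meta["category"]
	if !ok {
		return errors.New(`metadata is missing the required "category" field`)
	}
	s, isString := v.(string)
	if !isString || s == "" {
		return errors.New(`metadata "category" must be a non-empty string`)
	}
	return nil
}

func main() {
	good := map[string]interface{}{"category": "science", "name": "black hole"}
	bad := map[string]interface{}{"name": "black hole"}

	fmt.Println("good:", validateMetadata(good))
	fmt.Println("bad: ", validateMetadata(bad))
}
```

Running a check like this before each `idx.Add` call reports the problem at the call site, whatever Quiver itself does with incomplete metadata.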
Batch Addition¶
For better performance when adding many vectors, Quiver automatically batches additions:
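// Add 1000 vectors in a loop
for i := 0; i < 1000; i++ {
	vector := generateRandomVector(dimension)
	metadata := map[string]interface{}{
		"category": "batch",
		"index":    i,
	}
	// Check the error: a failed Add would otherwise go unnoticed
	if err := idx.Add(uint64(i), vector, metadata); err != nil {
		log.Fatalf("adding vector %d: %v", i, err)
	}
}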
The vectors will be added to a batch buffer and inserted into the index when:
- The batch size reaches the configured limit (BatchSize)
- You explicitly call idx.flushBatch()
- You close the index with idx.Close()
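The three flush triggers listed above can be pictured with a toy buffer. This is a minimal sketch of the general batching pattern, not Quiver's implementation: the `batcher` and `entry` types and their fields are made up for illustration, and "inserting into the index" is reduced to a counter.

```go
package main

import "fmt"

// entry is one pending vector. Illustrative type, not Quiver's.
type entry struct {
	id     uint64
	vector []float32
}

// batcher is a toy model of a batch buffer (hypothetical, single-goroutine).
type batcher struct {
	batchSize int
	pending   []entry
	inserted  int // entries that have been "written to the index"
	flushes   int // number of flushes performed
}

// Add buffers the vector and flushes once the buffer reaches batchSize.
func (b *batcher) Add(id uint64, v []float32) {
	b.pending = append(b.pending, entry{id: id, vector: v})
	if len(b.pending) >= b.batchSize {
		b.flush()
	}
}

// flush moves everything that is pending into the (simulated) index.
func (b *batcher) flush() {
	if len(b.pending) == 0 {
		return
	}
	b.inserted += len(b.pending)
	b.flushes++
	b.pending = b.pending[:0]
}

// Close flushes whatever is still buffered.
func (b *batcher) Close() { b.flush() }

func main() {
	b := &batcher{batchSize: 100}
	for i := 0; i < 250; i++ {
		b.Add(uint64(i), []float32{0.1, 0.2, 0.3})
	}
	fmt.Println("before Close:", b.inserted, "inserted,", len(b.pending), "pending")
	b.Close()
	fmt.Println("after Close: ", b.inserted, "inserted,", len(b.pending), "pending")
}
```

The practical takeaway is that vectors added since the last flush sit in the buffer until one of the three triggers fires, so closing the index at shutdown matters.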
Arrow Integration¶
For even faster bulk loading, Quiver supports Apache Arrow:
// Create Arrow record with vectors and metadata
builder := array.NewRecordBuilder(memory.DefaultAllocator, quiver.NewVectorSchema(dimension))
defer builder.Release() // free the builder's buffers when done
// ... populate the builder ...
record := builder.NewRecord()
defer record.Release() // drop our reference to the record when done
// Add all vectors from the Arrow record
err := idx.AppendFromArrow(record)
This is the fastest way to add vectors, especially when loading from external sources.
Searching Vectors¶
Basic Search¶
The most common operation is searching for similar vectors. The search call takes these parameters:
- queryVector: The vector to search for
- k: Number of results to return
- page: Page number (1-indexed)
- pageSize: Number of results per page
The results include:
- Vector ID
- Distance (similarity score)
- Metadata
for i, result := range results {
	fmt.Printf("%d. ID: %d, Distance: %.4f, Name: %s\n",
		i+1, result.ID, result.Distance, result.Metadata["name"])
}
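To make the page and pageSize parameters concrete, here is a small sketch of 1-indexed paging over an already ranked result list. It rests on an assumption about how the parameters combine (k caps the ranked list, then page and pageSize pick a window from it); the guide does not spell this out. The `Result` struct only mirrors the three fields listed above, and `paginate` is a hypothetical helper.

```go
package main

import "fmt"

// Result mirrors the fields described in this guide. Illustrative type.
type Result struct {
	ID       uint64
	Distance float32
	Metadata map[string]interface{}
}

// paginate returns the 1-indexed page of size pageSize from results.
// It returns nil for invalid arguments or a page past the end.
func paginate(results []Result, page, pageSize int) []Result {
	if page < 1 || pageSize < 1 {
		return nil
	}
	start := (page - 1) * pageSize
	if start >= len(results) {
		return nil
	}
	end := start + pageSize
	if end > len(results) {
		end = len(results)
	}
	return results[start:end]
}

func main() {
	// Ten ranked results standing in for a k=10 search.
	top := make([]Result, 10)
	for i := range top {
		top[i] = Result{ID: uint64(i + 1), Distance: float32(i) * 0.1}
	}
	// Page 2 with 4 results per page covers ranks 5 through 8.
	for _, r := range paginate(top, 2, 4) {
		fmt.Printf("ID: %d, Distance: %.4f\n", r.ID, r.Distance)
	}
}
```

With k=10 and pageSize=4, the third page would hold only the last two results, and any later page would be empty.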
Hybrid Search (Vector + Metadata)¶
Combine vector similarity with metadata filtering:
// Find vectors similar to queryVector that match the filter
results, err := idx.SearchWithFilter(queryVector, 10,
	"category = 'science' AND json_array_contains(tags, 'physics')")
The filter is a SQL WHERE clause that operates on the metadata.
SQL Power
The filter uses DuckDB's SQL engine, so you can use any SQL expression that DuckDB supports!
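Since the filter is passed as a raw SQL string, values that come from users need care. The sketch below shows one minimal safeguard, doubling embedded single quotes so a value stays inside its string literal (standard SQL escaping). `quoteSQLString` and `categoryFilter` are hypothetical helpers, not Quiver functions, and this is not a complete defense against injection.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteSQLString wraps s in single quotes and doubles any embedded
// single quote, the standard way to escape a SQL string literal.
func quoteSQLString(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// categoryFilter builds a WHERE-clause fragment for a category value.
func categoryFilter(category string) string {
	return "category = " + quoteSQLString(category)
}

func main() {
	fmt.Println(categoryFilter("science"))
	fmt.Println(categoryFilter("children's books"))
}
```

The resulting string can then be passed as the filter argument, for example `idx.SearchWithFilter(queryVector, 10, categoryFilter(userInput))`.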
Search with Negative Examples¶
Find vectors similar to A but dissimilar to B:
// Define what we like and what we don't like
positiveVector := []float32{0.1, 0.2, 0.3, ...}
negativeVectors := [][]float32{
	{0.9, 0.8, 0.7, ...},
	{0.8, 0.7, 0.6, ...},
}
// Search with negative examples
results, err := idx.SearchWithNegatives(
	positiveVector,  // What we're looking for
	negativeVectors, // What we want to avoid
	10, 1, 10)       // k, page, pageSize
This is great for:
- Recommendation systems ("more like this, less like that")
- Refining search results based on user feedback
- Exploring different regions of the vector space
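One common way to realize "more like this, less like that" is to score each candidate by its similarity to the positive vector minus a penalty for similarity to the negatives. The sketch below illustrates that idea only; this guide does not document how Quiver actually scores negatives, and the cosine measure, the averaging, and the `weight` parameter are all assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of a and b.
// It assumes both slices have the same length.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// scoreWithNegatives is a hypothetical scoring rule: similarity to the
// positive vector minus weight times the mean similarity to the negatives.
func scoreWithNegatives(candidate, positive []float32, negatives [][]float32, weight float64) float64 {
	score := cosine(candidate, positive)
	if len(negatives) == 0 {
		return score
	}
	var penalty float64
	for _, n := range negatives {
		penalty += cosine(candidate, n)
	}
	return score - weight*penalty/float64(len(negatives))
}

func main() {
	positive := []float32{1, 0}
	negatives := [][]float32{{0, 1}}

	near := []float32{1, 0.1} // close to the positive, far from the negative
	mixed := []float32{1, 1}  // halfway between the two

	fmt.Printf("near:  %.3f\n", scoreWithNegatives(near, positive, negatives, 0.5))
	fmt.Printf("mixed: %.3f\n", scoreWithNegatives(mixed, positive, negatives, 0.5))
}
```

Under this rule the `near` candidate outranks `mixed`, because `mixed` is penalized for pointing partly toward the negative example.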
Faceted Search¶
Search with specific metadata facets:
// Search with facets
results, err := idx.FacetedSearch(queryVector, 10, map[string]string{
	"category": "science",
	"year":     "2023",
})
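Conceptually, a facet map like the one above acts as a set of equality constraints that every returned result must satisfy. The sketch below shows that matching logic in isolation. `matchesFacets` is a hypothetical helper, and comparing values by their string form is an assumption made because the facet values are strings while metadata values can be of any type.

```go
package main

import "fmt"

// matchesFacets reports whether meta satisfies every facet.
// Assumption: a metadata value matches when its default string
// form (fmt.Sprint) equals the facet value.
func matchesFacets(meta map[string]interface{}, facets map[string]string) bool {
	for key, want := range facets {
		got, ok := meta[key]
		if !ok || fmt.Sprint(got) != want {
			return false
		}
	}
	return true
}

func main() {
	meta := map[string]interface{}{"category": "science", "year": 2023}

	fmt.Println(matchesFacets(meta, map[string]string{"category": "science", "year": "2023"}))
	fmt.Println(matchesFacets(meta, map[string]string{"category": "science", "year": "2024"}))
}
```

All facets must match for a result to pass, so adding more facets can only narrow the result set.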
Multi-Vector Search¶
Search with multiple query vectors:
// Search with multiple query vectors
multiResults, err := idx.MultiVectorSearch(
	[][]float32{vector1, vector2, vector3},
	5)
This returns a slice of result slices, one for each query vector.
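The shape of that return value is easiest to see with a tiny brute-force stand-in. The sketch below is not how Quiver searches (it scans every vector and uses Euclidean distance as an assumed metric); it only illustrates the "one result slice per query, in query order" layout. The names `hit`, `searchOne`, and `multiSearch` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// hit is an illustrative result type: a vector ID and its distance.
type hit struct {
	ID       uint64
	Distance float64
}

// l2 returns the Euclidean distance; it assumes equal-length slices.
func l2(a, b []float32) float64 {
	var sum float64
	for i := range a {
		d := float64(a[i]) - float64(b[i])
		sum += d * d
	}
	return math.Sqrt(sum)
}

// searchOne scans all vectors and returns the k nearest to query,
// breaking distance ties by ID so the output is deterministic.
func searchOne(data map[uint64][]float32, query []float32, k int) []hit {
	hits := make([]hit, 0, len(data))
	for id, v := range data {
		hits = append(hits, hit{ID: id, Distance: l2(query, v)})
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Distance != hits[j].Distance {
			return hits[i].Distance < hits[j].Distance
		}
		return hits[i].ID < hits[j].ID
	})
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

// multiSearch returns one result slice per query, in query order.
func multiSearch(data map[uint64][]float32, queries [][]float32, k int) [][]hit {
	out := make([][]hit, len(queries))
	for i, q := range queries {
		out[i] = searchOne(data, q, k)
	}
	return out
}

func main() {
	data := map[uint64][]float32{1: {0, 0}, 2: {1, 0}, 3: {0, 1}, 4: {5, 5}}
	results := multiSearch(data, [][]float32{{0, 0}, {5, 5}}, 2)
	for i, hits := range results {
		fmt.Printf("query %d: %v\n", i, hits)
	}
}
```

Indexing the outer slice by query position (`results[i]`) gives the neighbors for the i-th query vector.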
Metadata Operations¶
Querying Metadata¶
You can query metadata directly without vector search:
// Query metadata using SQL
results, err := idx.QueryMetadata(
"SELECT * FROM metadata WHERE category = 'science' ORDER BY created_at DESC LIMIT 10")
This returns a slice of metadata maps.
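Because each row is a `map[string]interface{}`, reading a field means a type assertion, and the two-value form avoids a panic when a value is missing or has an unexpected type. The sketch below uses hand-built rows as a stand-in for query results; the concrete value types Quiver returns are not stated in this guide, and `stringField` is a hypothetical helper.

```go
package main

import "fmt"

// stringField returns row[key] as a string. The boolean is false when
// the key is absent or the value is not a string.
func stringField(row map[string]interface{}, key string) (string, bool) {
	s, ok := row[key].(string)
	return s, ok
}

func main() {
	// Stand-in rows; real rows would come from a metadata query.
	rows := []map[string]interface{}{
		{"category": "science", "name": "black hole"},
		{"category": "science", "name": 42}, // unexpected type
		{"category": "science"},             // field missing
	}
	for i, row := range rows {
		if name, ok := stringField(row, "name"); ok {
			fmt.Printf("%d. %s\n", i+1, name)
		} else {
			fmt.Printf("%d. (no string name)\n", i+1)
		}
	}
}
```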
Performance Considerations¶
Search Performance¶
- The HNSWEfSearch parameter controls the trade-off between search speed and accuracy
- Higher values give more accurate results but slower searches
- For most applications, values between 50 and 200 provide a good balance
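A practical way to pick a value in that range is to measure recall: compare the IDs returned by the approximate search with the true nearest neighbors for a sample of queries, and raise the setting until recall is acceptable. The sketch below computes recall for one query. `recallAtK` is a hypothetical helper, and how you obtain the exact neighbors (for example a brute-force scan) is left open.

```go
package main

import "fmt"

// recallAtK returns the fraction of exact neighbor IDs that also appear
// in the approximate result. It assumes IDs are unique within each slice.
func recallAtK(approx, exact []uint64) float64 {
	if len(exact) == 0 {
		return 0
	}
	truth := make(map[uint64]struct{}, len(exact))
	for _, id := range exact {
		truth[id] = struct{}{}
	}
	found := 0
	for _, id := range approx {
		if _, ok := truth[id]; ok {
			found++
		}
	}
	return float64(found) / float64(len(exact))
}

func main() {
	exact := []uint64{1, 2, 3, 4}  // true nearest neighbors
	approx := []uint64{1, 2, 5, 6} // what an approximate search returned
	fmt.Printf("recall@4 = %.2f\n", recallAtK(approx, exact))
}
```

Averaging this value over many queries at several settings gives a speed versus accuracy curve for your own data, which is a firmer basis than a generic range.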
Addition Performance¶
- Batch additions are much faster than individual additions
- The BatchSize parameter controls how many vectors are batched before insertion
- Larger batch sizes improve throughput but increase memory usage
- Arrow integration provides the best performance for bulk loading
Next Steps¶
Now that you've mastered vector operations, check out:
- Metadata & Filtering - Learn more about metadata and filtering
- Persistence & Backup - Keep your data safe
- HTTP API - Use Quiver as a service