Monitoring¶
Quiver provides comprehensive monitoring capabilities to help you understand the performance and health of your vector database.
Overview¶
Monitoring features include:
- Performance metrics
- Health checks
- Resource usage tracking
- Operation logging
- Prometheus integration
- Custom metric collection
Built-in Metrics¶
Core Metrics¶
// Collect basic metrics
metrics := idx.CollectMetrics()
// Available metrics
fmt.Printf("Vector count: %d\n", metrics["vector_count"])
fmt.Printf("Batch size: %d\n", metrics["batch_size"])
fmt.Printf("Cache size: %d\n", metrics["cache_size"])
fmt.Printf("DB connections: %d\n", metrics["db_connections"])
Performance Metrics¶
// Search performance metrics
metrics := idx.CollectMetrics()
fmt.Printf("Average search time: %.2fms\n", metrics["avg_search_time_ms"])
fmt.Printf("Search QPS: %.2f\n", metrics["search_qps"])
Health Checks¶
Basic Health Check¶
// Verify index health
if err := idx.HealthCheck(); err != nil {
log.Printf("Health check failed: %v", err)
}
Detailed Health Status¶
// Get detailed health status
status := idx.GetHealthStatus()
fmt.Printf("Index status: %s\n", status.Status)
fmt.Printf("Last backup: %s\n", status.LastBackup)
fmt.Printf("Disk usage: %.2f%%\n", status.DiskUsage)
Prometheus Integration¶
Metric Endpoints¶
Available Prometheus Metrics¶
# HELP quiver_vector_count Total number of vectors in the index
# TYPE quiver_vector_count gauge
quiver_vector_count 1000
# HELP quiver_search_duration_seconds Search operation duration
# TYPE quiver_search_duration_seconds histogram
# HELP quiver_batch_operations_total Total number of batch operations
# TYPE quiver_batch_operations_total counter
Resource Monitoring¶
Memory Usage¶
metrics := idx.CollectMetrics()
fmt.Printf("Memory usage: %d bytes\n", metrics["memory_usage"])
fmt.Printf("Cache memory: %d bytes\n", metrics["cache_memory"])
Disk Usage¶
metrics := idx.CollectMetrics()
fmt.Printf("Storage size: %d bytes\n", metrics["storage_size"])
fmt.Printf("Backup size: %d bytes\n", metrics["backup_size"])
Operation Logging¶
Configure Logging¶
config := zap.NewProductionConfig()
logger, _ := config.Build()
idx, err := quiver.New(quiver.Config{
// ... other config
}, logger)
Log Levels¶
// Debug level for detailed operations
logger.Debug("Processing batch",
zap.Int("size", batchSize),
zap.Duration("time", duration))
// Info level for important operations
logger.Info("Backup completed",
zap.String("path", backupPath),
zap.Int("vectors", vectorCount))
// Error level for issues
logger.Error("Search failed",
zap.Error(err),
zap.Any("query", query))
Performance Monitoring¶
Search Performance¶
// Track search latency
start := time.Now()
results, err := idx.Search(vector, k, page, pageSize)
duration := time.Since(start)
logger.Info("Search completed",
zap.Duration("duration", duration),
zap.Int("results", len(results)))
Batch Operations¶
// Monitor batch processing
metrics := idx.CollectMetrics()
fmt.Printf("Batch queue size: %d\n", metrics["batch_queue_size"])
fmt.Printf("Batch processing rate: %.2f/s\n", metrics["batch_rate"])
Custom Metrics¶
Define Custom Metrics¶
// Create custom metric collectors
var (
customCounter = promauto.NewCounter(prometheus.CounterOpts{
Name: "quiver_custom_operations_total",
Help: "Total number of custom operations",
})
customHistogram = promauto.NewHistogram(prometheus.HistogramOpts{
Name: "quiver_custom_duration_seconds",
Help: "Duration of custom operations",
})
)
Track Custom Metrics¶
Best Practices¶
- Metric Collection
- Regular health checks
- Monitor resource usage
- Track performance metrics
-
Set up alerts
-
Log Management
- Structured logging
- Appropriate log levels
- Log rotation
-
Error tracking
-
Performance Monitoring
- Track latency
- Monitor throughput
- Watch resource usage
- Set baselines
Alerting¶
Configure Alerts¶
// Example alert configuration
type Alert struct {
Metric string
Threshold float64
Duration time.Duration
}
alerts := []Alert{
{
Metric: "search_latency_p95",
Threshold: 100, // milliseconds
Duration: 5 * time.Minute,
},
{
Metric: "disk_usage_percent",
Threshold: 85, // percent
Duration: time.Hour,
},
}
Next Steps¶
- Learn about Performance Tuning
- Explore Scaling
- Understand Benchmarks