Command Line Interface

FlashFS provides a command-line interface (CLI) for working with snapshots, diffs, and expiry policies. The CLI is built with Cobra, a Go library for building command-line applications.

Overview

The FlashFS CLI includes commands for:

  • Creating and managing snapshots
  • Computing and applying diffs between snapshots
  • Querying snapshot contents
  • Managing snapshot expiry policies
  • Streaming operations for large datasets

Command Structure

flashfs
├── snapshot - Create a snapshot of a directory
├── diff - Compute differences between snapshots
├── apply - Apply a diff to a snapshot
├── query - Query snapshot contents
├── stream - Commands that use streaming for large datasets
│   ├── snapshot - Take a snapshot using streaming for large directories
│   └── diff - Compare snapshots using streaming for large datasets
└── expiry - Manage snapshot expiry policies
    ├── set - Set expiry policy
    ├── apply - Apply expiry policy
    └── show - Show current expiry policy

Global Flags

These flags apply to all commands:

--help, -h     Show help for a command
--verbose, -v  Enable verbose output

Snapshot Command

Create a snapshot of a directory.

flashfs snapshot [flags]

Flags

--path, -p string     Directory to snapshot (required)
--output, -o string   Output snapshot file (required)
--exclude string      Exclude pattern (e.g., "*.tmp,*.log")
--no-hash             Skip computing file hashes (faster but less accurate)
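
The --exclude value is a comma-separated list of glob patterns. The sketch below shows one way such a value could be split and matched using only the Go standard library. The helper names splitPatterns and isExcluded are hypothetical, and matching against the base name of each path is an assumption; FlashFS may match against full paths instead.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// splitPatterns turns a comma-separated value such as "*.tmp,*.log"
// into individual glob patterns, dropping empty items.
func splitPatterns(exclude string) []string {
	var patterns []string
	for _, p := range strings.Split(exclude, ",") {
		if p = strings.TrimSpace(p); p != "" {
			patterns = append(patterns, p)
		}
	}
	return patterns
}

// isExcluded reports whether the base name of path matches any pattern.
// Matching on the base name only is an assumption of this sketch.
func isExcluded(path string, patterns []string) bool {
	name := filepath.Base(path)
	for _, p := range patterns {
		// A malformed pattern returns an error and is treated as no match.
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := splitPatterns("*.tmp,*.log")
	for _, path := range []string{"docs/report.txt", "cache/session.tmp", "logs/app.log"} {
		fmt.Printf("%-20s excluded=%v\n", path, isExcluded(path, patterns))
	}
}
```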

Implementation Details

The snapshot command uses the streaming walker implementation (WalkStreamWithCallback) to efficiently process large directory structures. This provides several benefits:

  1. Memory Efficiency: Files are processed as they're discovered, keeping memory usage relatively constant regardless of directory size.
  2. Progress Reporting: The command shows real-time progress updates as files are processed.
  3. Responsiveness: Users see immediate feedback rather than waiting for the entire walk to complete.
  4. Cancellation Support: The operation can be cleanly cancelled with Ctrl+C.

This makes the snapshot command suitable for very large directory structures, as it avoids loading all file metadata into memory at once.
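
A minimal sketch of the callback-based streaming idea, using only the Go standard library. The function walkStream and its signature are hypothetical and are not the actual WalkStreamWithCallback API; the sketch only illustrates processing entries as they are discovered and stopping cleanly on Ctrl+C.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"os/signal"
	"path/filepath"
)

// errCancelled is returned when the walk is stopped through the done channel.
var errCancelled = errors.New("walk cancelled")

// walkStream (hypothetical) calls fn for each entry as soon as it is
// discovered, so no list of entries is accumulated in memory. It stops
// with errCancelled once done is closed.
func walkStream(done <-chan struct{}, root string, fn func(path string, isDir bool) error) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Check for cancellation between entries without blocking.
		select {
		case <-done:
			return errCancelled
		default:
		}
		return fn(path, d.IsDir())
	})
}

func main() {
	// The context is cancelled when the process receives an interrupt (Ctrl+C).
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	count := 0
	err := walkStream(ctx.Done(), ".", func(path string, isDir bool) error {
		count++
		if count%1000 == 0 {
			fmt.Fprintf(os.Stderr, "\rprocessed %d entries", count)
		}
		return nil
	})
	if err != nil {
		fmt.Fprintf(os.Stderr, "\nwalk stopped: %v\n", err)
		os.Exit(1)
	}
	fmt.Printf("processed %d entries\n", count)
}
```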

Examples

# Create a snapshot of the current directory
flashfs snapshot --path . --output backup.snap

# Create a snapshot excluding temporary files
flashfs snapshot --path /home/user/documents --output docs.snap --exclude "*.tmp,*.bak"

# Create a snapshot without computing file hashes (faster but less accurate)
flashfs snapshot --path /var/www --output web.snap --no-hash

Stream Command

Commands that use streaming for large datasets with enhanced progress reporting and cancellation support.

flashfs stream [command]

Available Commands

snapshot   Take a snapshot using streaming for large directories
diff       Compare snapshots using streaming for large datasets

Stream Snapshot Command

Take a snapshot of a directory using streaming processing with real-time progress reporting and cancellation support.

flashfs stream snapshot [flags]

Flags

--source string       Directory to snapshot (required)
--output string       Output file for snapshot (required)
--hash-algo string    Hashing algorithm (BLAKE3, MD5, SHA1, SHA256) (default "BLAKE3")
--skip-errors         Skip errors during directory traversal
--partial-hash        Use partial hashing for large files
--workers int         Number of worker goroutines (default 4)

Implementation Details

The stream snapshot command is specifically designed for large directories and provides:

  1. Two-Pass Processing: First counts files to provide accurate progress reporting, then processes them.
  2. Real-Time Progress Bars: Shows detailed progress for each phase of the operation.
  3. Detailed Statistics: Displays comprehensive statistics after completion, including file counts, sizes, compression ratios, and processing speeds.
  4. Graceful Cancellation: Can be safely interrupted at any point with Ctrl+C.
  5. Memory Efficiency: Processes files in a streaming fashion to minimize memory usage.
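
The two-pass approach with a worker pool can be sketched as follows. This is an illustration under stated assumptions, not the FlashFS implementation: countFiles, hashFile, and hashTree are hypothetical names, and SHA-256 from the standard library stands in for the default BLAKE3, which requires a third-party package.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
)

// countFiles is the first pass: it counts regular files so the second
// pass can report progress against a known total.
func countFiles(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			total++
		}
		return nil
	})
	return total, err
}

// hashFile streams a file through SHA-256 and returns the hex digest.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashTree is the second pass: it walks root again and feeds file paths to
// a pool of workers. It returns how many files were processed and how many
// of those could not be hashed.
func hashTree(root string, workers int, total int64) (done, failed int64, err error) {
	paths := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				if _, herr := hashFile(p); herr != nil {
					atomic.AddInt64(&failed, 1)
				}
				n := atomic.AddInt64(&done, 1)
				fmt.Fprintf(os.Stderr, "\rhashed %d/%d files", n, total)
			}
		}()
	}
	err = filepath.WalkDir(root, func(p string, d fs.DirEntry, werr error) error {
		if werr != nil {
			return werr
		}
		if d.Type().IsRegular() {
			paths <- p
		}
		return nil
	})
	close(paths) // lets the workers drain the channel and exit
	wg.Wait()
	return done, failed, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	total, err := countFiles(root)
	if err != nil {
		fmt.Fprintf(os.Stderr, "count failed: %v\n", err)
		os.Exit(1)
	}
	done, failed, err := hashTree(root, 4, total)
	if err != nil {
		fmt.Fprintf(os.Stderr, "\nwalk failed: %v\n", err)
		os.Exit(1)
	}
	fmt.Printf("\n%d files processed, %d failed\n", done, failed)
}
```

The unbuffered channel keeps memory use flat: the walker only advances as fast as the workers consume paths.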

Examples

# Create a snapshot of a large directory with progress reporting
flashfs stream snapshot --source /path/to/large/directory --output backup.snap

# Create a snapshot with partial hashing for large files
flashfs stream snapshot --source /data --output data.snap --partial-hash

# Create a snapshot with a specific hashing algorithm
flashfs stream snapshot --source /home/user --output home.snap --hash-algo SHA256

# Create a snapshot with more worker goroutines for faster processing
flashfs stream snapshot --source /var/www --output web.snap --workers 8

# Create a snapshot that skips errors during traversal
flashfs stream snapshot --source /system --output system.snap --skip-errors

Stream Diff Command

Compare snapshots using streaming processing with real-time progress reporting and cancellation support.

flashfs stream diff [flags]

Flags

--base string         Base snapshot file (required)
--target string       Target snapshot file (required)
--output string       Output file for diff (required)

Implementation Details

The stream diff command is designed for comparing large snapshots and provides:

  1. Progress Reporting: Shows detailed progress for loading snapshots, computing differences, and writing the diff file.
  2. Detailed Statistics: Displays comprehensive statistics after completion, including snapshot sizes, diff size, and diff ratio.
  3. Graceful Cancellation: Can be safely interrupted at any point with Ctrl+C.

Examples

# Compare two large snapshots with progress reporting
flashfs stream diff --base snapshot1.snap --target snapshot2.snap --output changes.diff

# Compare snapshots in different directories
flashfs stream diff --base /backups/old.snap --target /backups/new.snap --output /backups/changes.diff

Diff Command

Compute differences between snapshots and store them in a structured format.

flashfs diff [flags]

Flags

--base, -b string     Base snapshot file (required)
--target, -t string   Target snapshot file (required)
--output, -o string   Output diff file (required)
--detailed            Perform detailed comparison including file content hashes
--parallel int        Number of parallel workers for comparison (default: number of CPU cores)
--no-hash             Skip hash comparison (faster but less accurate)
--path-filter string  Only compare files matching the specified path pattern

Examples

# Compute differences between two snapshots
flashfs diff --base snapshot1.snap --target snapshot2.snap --output diff.diff

# Compute differences with detailed comparison
flashfs diff --base snapshot1.snap --target snapshot2.snap --output diff.diff --detailed

# Compute differences using 8 parallel workers
flashfs diff --base snapshot1.snap --target snapshot2.snap --output diff.diff --parallel 8

# Compute differences for a specific path only
flashfs diff --base snapshot1.snap --target snapshot2.snap --output diff.diff --path-filter "/home/user/documents/*"

Diff Format

The generated diff file contains a structured representation of changes:

  • Added files: Files that exist in the target snapshot but not in the base snapshot
  • Modified files: Files that exist in both snapshots but have different metadata or content
  • Deleted files: Files that exist in the base snapshot but not in the target snapshot

Each change is stored as a DiffEntry with:

  • Path of the changed file
  • Type of change (added, modified, deleted)
  • Before and after values for size, modification time, permissions, and content hash (as applicable)

This structured format enables efficient application of changes and provides detailed information about what has changed between snapshots.
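
To make the format concrete, the sketch below models a DiffEntry and computes the three kinds of change between two in-memory snapshots. The field names, the FileMeta type, and computeDiff are assumptions for illustration; the real on-disk encoding is not shown in this document. The numeric change types follow the values used in the apply command description (0 added, 1 modified, 2 deleted).

```go
package main

import (
	"fmt"
	"sort"
)

// FileMeta holds the per-file values a diff compares (hypothetical layout).
type FileMeta struct {
	Size    int64
	ModTime int64 // Unix seconds
	Mode    uint32
	Hash    string
}

// ChangeType identifies the kind of change recorded in a DiffEntry.
type ChangeType int

const (
	Added    ChangeType = iota // 0
	Modified                   // 1
	Deleted                    // 2
)

// DiffEntry records one change. Before is nil for added files and
// After is nil for deleted files.
type DiffEntry struct {
	Path   string
	Type   ChangeType
	Before *FileMeta
	After  *FileMeta
}

// computeDiff compares two snapshots keyed by path and returns the
// changes sorted by path.
func computeDiff(base, target map[string]FileMeta) []DiffEntry {
	var entries []DiffEntry
	for path, meta := range target {
		after := meta // copy so each entry points at its own value
		before, ok := base[path]
		switch {
		case !ok:
			entries = append(entries, DiffEntry{Path: path, Type: Added, After: &after})
		case before != after:
			entries = append(entries, DiffEntry{Path: path, Type: Modified, Before: &before, After: &after})
		}
	}
	for path, meta := range base {
		if _, ok := target[path]; !ok {
			before := meta
			entries = append(entries, DiffEntry{Path: path, Type: Deleted, Before: &before})
		}
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].Path < entries[j].Path })
	return entries
}

func main() {
	base := map[string]FileMeta{
		"a.txt": {Size: 10, Hash: "h1"},
		"b.txt": {Size: 20, Hash: "h2"},
	}
	target := map[string]FileMeta{
		"b.txt": {Size: 25, Hash: "h3"},
		"c.txt": {Size: 30, Hash: "h4"},
	}
	for _, e := range computeDiff(base, target) {
		fmt.Printf("%s type=%d\n", e.Path, e.Type)
	}
}
```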

Apply Command

Apply a diff to a snapshot to generate a new snapshot.

flashfs apply [flags]

Flags

--base, -b string    Base snapshot file (required)
--diff, -d string    Diff file to apply (required)
--output, -o string  Output snapshot file (required)

Examples

# Apply a diff to generate a new snapshot
flashfs apply --base snapshot1.snap --diff changes.diff --output snapshot2.snap

How Apply Works

The apply command:

  1. Loads the base snapshot into memory
  2. Deserializes the diff file into a structured Diff object
  3. Processes each DiffEntry based on its type:
       • Added files (type 0): creates a new entry in the snapshot
       • Modified files (type 1): updates the existing entry with new metadata
       • Deleted files (type 2): removes the entry from the snapshot
  4. Serializes the modified snapshot and writes it to the output file

This structured approach ensures that changes are applied correctly and efficiently, maintaining the integrity of your file system representation.
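
The apply steps can be sketched on in-memory data as follows. The types and the applyDiff function are hypothetical, and loading and serialization are left out. Rejecting an add for an existing path, or a modify or delete for a missing one, is an assumption of this sketch rather than documented FlashFS behavior.

```go
package main

import "fmt"

// FileMeta is a simplified snapshot entry (hypothetical layout).
type FileMeta struct {
	Size int64
	Hash string
}

// DiffEntry uses the numeric types from the description above:
// 0 = added, 1 = modified, 2 = deleted.
type DiffEntry struct {
	Path  string
	Type  int
	After FileMeta // ignored for deleted entries
}

// applyDiff returns a new snapshot with the diff applied; base is not modified.
func applyDiff(base map[string]FileMeta, diff []DiffEntry) (map[string]FileMeta, error) {
	out := make(map[string]FileMeta, len(base))
	for path, meta := range base {
		out[path] = meta
	}
	for _, e := range diff {
		_, exists := out[e.Path]
		switch e.Type {
		case 0: // added
			if exists {
				return nil, fmt.Errorf("add %q: entry already exists", e.Path)
			}
			out[e.Path] = e.After
		case 1: // modified
			if !exists {
				return nil, fmt.Errorf("modify %q: entry not found", e.Path)
			}
			out[e.Path] = e.After
		case 2: // deleted
			if !exists {
				return nil, fmt.Errorf("delete %q: entry not found", e.Path)
			}
			delete(out, e.Path)
		default:
			return nil, fmt.Errorf("unknown change type %d for %q", e.Type, e.Path)
		}
	}
	return out, nil
}

func main() {
	base := map[string]FileMeta{
		"a.txt": {Size: 10, Hash: "h1"},
		"b.txt": {Size: 20, Hash: "h2"},
	}
	diff := []DiffEntry{
		{Path: "c.txt", Type: 0, After: FileMeta{Size: 30, Hash: "h4"}},
		{Path: "b.txt", Type: 1, After: FileMeta{Size: 25, Hash: "h3"}},
		{Path: "a.txt", Type: 2},
	}
	result, err := applyDiff(base, diff)
	if err != nil {
		fmt.Println("apply failed:", err)
		return
	}
	fmt.Println(len(result), "entries after apply")
}
```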

Query Command

Query snapshots for files matching various criteria.

flashfs query [command] [flags]

Available Commands

query             Query snapshots for files matching criteria
find-duplicates   Find duplicate files across snapshots
find-changes      Find files that changed between snapshots
find-largest      Find the N largest files in a snapshot

Query Flags

--dir string         Base directory containing snapshots (required)
--pattern string     File pattern to match
--min-size string    Minimum file size (e.g., "1MB")
--max-size string    Maximum file size (e.g., "1GB")
--start-time string  Start time (RFC3339)
--end-time string    End time (RFC3339)
--hash string        File hash to match
--is-dir             Match directories only
--all-snapshots      Query all snapshots
--snapshot string    Snapshot ID to query
Find Duplicates Flags

--dir string         Base directory containing snapshots (required)
--snapshots strings  Snapshot IDs to search
--pattern string     File pattern to match

Find Changes Flags

--dir string      Base directory containing snapshots (required)
--old string      Old snapshot ID (required)
--new string      New snapshot ID (required)
--pattern string  File pattern to match

Find Largest Flags

--dir string       Base directory containing snapshots (required)
--snapshot string  Snapshot ID to query (required)
--n int            Number of files to return (default 10)
--pattern string   File pattern to match

Examples

# Query files in a snapshot
flashfs query --dir ~/.flashfs --snapshot my-snapshot --pattern "*.txt"

# Query files by size range
flashfs query --dir ~/.flashfs --snapshot my-snapshot --min-size 1MB --max-size 10MB

# Query files by modification time
flashfs query --dir ~/.flashfs --snapshot my-snapshot --start-time "2024-01-01T00:00:00Z" --end-time "2024-02-01T00:00:00Z"

# Find duplicate files across snapshots
flashfs query find-duplicates --dir ~/.flashfs --snapshots "snap1,snap2,snap3"

# Find files that changed between snapshots
flashfs query find-changes --dir ~/.flashfs --old snap1 --new snap2

# Find the 20 largest files in a snapshot
flashfs query find-largest --dir ~/.flashfs --snapshot my-snapshot --n 20
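
One plausible way to find duplicates across snapshots is to group entries by content hash. The sketch below illustrates that idea on in-memory data. The entry type and findDuplicates are hypothetical, and whether FlashFS identifies duplicates by hash alone is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a simplified snapshot record (hypothetical layout).
type entry struct {
	Snapshot string
	Path     string
	Hash     string
}

// findDuplicates groups entries by hash and keeps only the groups that
// contain two or more entries. Entries without a hash are skipped.
func findDuplicates(entries []entry) map[string][]entry {
	byHash := make(map[string][]entry)
	for _, e := range entries {
		if e.Hash == "" {
			continue
		}
		byHash[e.Hash] = append(byHash[e.Hash], e)
	}
	for hash, group := range byHash {
		if len(group) < 2 {
			delete(byHash, hash) // deleting during range is safe in Go
		}
	}
	return byHash
}

func main() {
	dups := findDuplicates([]entry{
		{Snapshot: "snap1", Path: "docs/a.txt", Hash: "h1"},
		{Snapshot: "snap2", Path: "backup/a.txt", Hash: "h1"},
		{Snapshot: "snap2", Path: "docs/b.txt", Hash: "h2"},
	})
	// Sort the hashes so the output order is stable.
	hashes := make([]string, 0, len(dups))
	for h := range dups {
		hashes = append(hashes, h)
	}
	sort.Strings(hashes)
	for _, h := range hashes {
		fmt.Printf("hash %s:\n", h)
		for _, e := range dups[h] {
			fmt.Printf("  %s:%s\n", e.Snapshot, e.Path)
		}
	}
}
```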

Expiry Command

Manage snapshot expiry policies.

flashfs expiry [command]

Subcommands

Set

Set the expiry policy for snapshots.

flashfs expiry set [flags]

Flags

--max-snapshots int    Maximum number of snapshots to keep (0 = unlimited)
--max-age string       Maximum age of snapshots to keep (e.g., 30d, 2w, 6m, 1y)
--keep-hourly int      Number of hourly snapshots to keep
--keep-daily int       Number of daily snapshots to keep
--keep-weekly int      Number of weekly snapshots to keep
--keep-monthly int     Number of monthly snapshots to keep
--keep-yearly int      Number of yearly snapshots to keep
--apply                Apply the policy immediately after setting it
--dir string           Base directory for snapshots (default: current directory)

Examples

# Keep only the 10 most recent snapshots
flashfs expiry set --max-snapshots 10

# Remove snapshots older than 30 days
flashfs expiry set --max-age 30d

# Set a comprehensive retention policy
flashfs expiry set --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --keep-yearly 5

# Set a policy and apply it immediately
flashfs expiry set --max-age 30d --apply
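
Values such as 30d, 2w, 6m, and 1y are not accepted by Go's time.ParseDuration, so a CLI needs its own parser for them. The sketch below is one possible approach. parseMaxAge is a hypothetical helper, and treating a month as 30 days and a year as 365 days is an assumption; FlashFS may use calendar arithmetic instead.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseMaxAge converts a value such as "30d" or "2w" into a duration.
// Supported units: d (days), w (weeks), m (30 days), y (365 days).
func parseMaxAge(s string) (time.Duration, error) {
	s = strings.TrimSpace(s)
	if len(s) < 2 {
		return 0, fmt.Errorf("invalid max age %q", s)
	}
	n, err := strconv.Atoi(s[:len(s)-1])
	if err != nil || n < 0 {
		return 0, fmt.Errorf("invalid max age %q", s)
	}
	const day = 24 * time.Hour
	switch s[len(s)-1] {
	case 'd':
		return time.Duration(n) * day, nil
	case 'w':
		return time.Duration(n) * 7 * day, nil
	case 'm':
		return time.Duration(n) * 30 * day, nil
	case 'y':
		return time.Duration(n) * 365 * day, nil
	}
	return 0, fmt.Errorf("invalid max age %q: unknown unit", s)
}

func main() {
	for _, v := range []string{"30d", "2w", "6m", "1y"} {
		d, err := parseMaxAge(v)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s = %.0f hours\n", v, d.Hours())
	}
}
```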

Apply

Apply the current expiry policy to snapshots.

flashfs expiry apply [flags]

Flags

--dir string  Base directory for snapshots (default: current directory)

Examples

# Apply the current expiry policy
flashfs expiry apply

# Apply the policy to snapshots in a specific directory
flashfs expiry apply --dir /path/to/snapshots
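
As an illustration of what applying a policy involves, the sketch below selects the snapshots that a count limit and an age limit would expire. The snapshot type and selectExpired are hypothetical. Expiring a snapshot when it violates either limit is an assumption, and the keep-hourly through keep-yearly rules are omitted.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// snapshot is a simplified record of a stored snapshot (hypothetical layout).
type snapshot struct {
	ID      string
	Created time.Time
}

// demoNow is a fixed reference time so the example output is reproducible.
var demoNow = time.Date(2024, 3, 1, 0, 0, 0, 0, time.UTC)

// selectExpired returns the IDs of snapshots to remove, newest first.
// A limit of 0 disables the corresponding rule.
func selectExpired(snaps []snapshot, now time.Time, maxSnapshots, maxAgeDays int) []string {
	// Sort a copy newest first so the caller's slice keeps its order.
	sorted := append([]snapshot(nil), snaps...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Created.After(sorted[j].Created) })

	maxAge := time.Duration(maxAgeDays) * 24 * time.Hour
	var expired []string
	for i, s := range sorted {
		tooMany := maxSnapshots > 0 && i >= maxSnapshots
		tooOld := maxAgeDays > 0 && now.Sub(s.Created) > maxAge
		if tooMany || tooOld {
			expired = append(expired, s.ID)
		}
	}
	return expired
}

func main() {
	snaps := []snapshot{
		{ID: "yesterday", Created: demoNow.AddDate(0, 0, -1)},
		{ID: "ten-days", Created: demoNow.AddDate(0, 0, -10)},
		{ID: "forty-days", Created: demoNow.AddDate(0, 0, -40)},
	}
	fmt.Println("max age 30 days:", selectExpired(snaps, demoNow, 0, 30))
	fmt.Println("max snapshots 1:", selectExpired(snaps, demoNow, 1, 0))
}
```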

Show

Show the current expiry policy.

flashfs expiry show [flags]

Flags

--dir string  Base directory for snapshots (default: current directory)

Examples

# Show the current expiry policy
flashfs expiry show

# Show the policy for snapshots in a specific directory
flashfs expiry show --dir /path/to/snapshots

Implementation Details

Command Registration

Commands are registered in the cmd package:

func init() {
    RootCmd.AddCommand(snapshotCmd)
    RootCmd.AddCommand(diffCmd)
    RootCmd.AddCommand(applyCmd)
    RootCmd.AddCommand(queryCmd)
    RootCmd.AddCommand(expiryCmd)

    // Register expiry subcommands
    expiryCmd.AddCommand(setExpiryCmd)
    expiryCmd.AddCommand(applyExpiryCmd)
    expiryCmd.AddCommand(showExpiryCmd)
}

Command Execution

Each command is implemented as a Cobra command with a RunE function:

var snapshotCmd = &cobra.Command{
    Use:   "snapshot",
    Short: "Create a snapshot of a directory",
    Long:  `Create a snapshot of a directory, capturing file metadata and optionally content hashes.`,
    RunE: func(cmd *cobra.Command, args []string) error {
        // Command implementation
        // ...
        return nil
    },
}

Error Handling

When a command fails, the CLI prints the error to standard error and exits with a non-zero status:

if err != nil {
    fmt.Fprintf(os.Stderr, "Error: %v\n", err)
    os.Exit(1)
}

Advanced Usage

Scripting

FlashFS commands can be used in scripts for automation:

#!/bin/bash
# Create daily snapshots and apply expiry policy

# Create a snapshot with the current date
DATE=$(date +%Y%m%d)
flashfs snapshot --path /home/user/documents --output "backup-$DATE.snap"

# Apply expiry policy to clean up old snapshots
flashfs expiry apply

Piping Output

Query results can be piped to other commands:

# Find files of at least 10MB and sort the output lines
flashfs query --dir ~/.flashfs --snapshot my-snapshot --min-size 10MB | sort

# Find files modified since 2023-01-01 and send a report by email
flashfs query --dir ~/.flashfs --snapshot my-snapshot --start-time "2023-01-01T00:00:00Z" | mail -s "Recent Changes" user@example.com

Integration with Other Tools

FlashFS can be integrated with other tools:

# Use with find to process multiple directories
find /home -type d -name "projects" | xargs -I{} flashfs snapshot --path {} --output {}.snap

# Use with cron for scheduled snapshots
# Add to crontab: 0 0 * * * /path/to/snapshot_script.sh