Schema¶
The Schema component in FlashFS defines the data structures used for serialization and deserialization of snapshots and diffs. It uses FlatBuffers to create an efficient, cross-platform binary representation of file system metadata.
Overview¶
The Schema component:
- Defines the structure of snapshots and diffs
- Enables efficient binary serialization
- Provides zero-copy access to serialized data
- Ensures cross-platform compatibility
- Allows for schema evolution over time
FlatBuffers Schema Definition¶
FlashFS uses FlatBuffers as its serialization framework. The schema is defined in the FlatBuffers Interface Definition Language (IDL):
namespace flashfs;

table FileEntry {
  path: string;
  size: long;
  mtime: long;
  isDir: bool;
  permissions: uint;
  hash: [ubyte];
}

table Snapshot {
  entries: [FileEntry];
}

table DiffEntry {
  path: string;
  type: byte; // 0 = added, 1 = modified, 2 = deleted
  oldSize: long;
  newSize: long;
  oldMtime: long;
  newMtime: long;
  oldPermissions: uint;
  newPermissions: uint;
  oldHash: [ubyte];
  newHash: [ubyte];
}

table Diff {
  entries: [DiffEntry];
}

root_type Snapshot;
Schema Components¶
FileEntry¶
The FileEntry table represents a single file or directory in a snapshot:
- path: The relative path within the snapshot
- size: File size in bytes (0 for directories)
- mtime: Modification time as a Unix timestamp
- isDir: Whether the entry is a directory
- permissions: File permissions as a uint32
- hash: BLAKE3 hash of the file contents (empty for directories)
Snapshot¶
The Snapshot table contains a list of file entries that represent the state of a file system at a specific point in time.
DiffEntry¶
The DiffEntry table represents a change between two snapshots:
- path: The path of the changed file or directory
- type: The type of change (0 = added, 1 = modified, 2 = deleted)
- oldSize/newSize: File size before and after the change
- oldMtime/newMtime: Modification time before and after the change
- oldPermissions/newPermissions: Permissions before and after the change
- oldHash/newHash: Content hash before and after the change
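To make the three change types concrete, here is a minimal sketch of how diff entries could be derived from two snapshots. It uses plain Go maps and structs (meta, change, diffSnapshots are hypothetical names) rather than the generated FlatBuffers types, and it treats an entry as modified when any compared field differs, which is an assumption about the comparison rule rather than a description of FlashFS itself.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Change types, matching the comment on DiffEntry.type in the schema.
const (
	changeAdded    int8 = 0
	changeModified int8 = 1
	changeDeleted  int8 = 2
)

// meta is a hypothetical stand-in for the FileEntry fields a diff compares.
type meta struct {
	Size, Mtime int64
	Permissions uint32
	Hash        []byte
}

// change is a hypothetical stand-in for DiffEntry.
type change struct {
	Path     string
	Type     int8
	Old, New meta
}

// same reports whether two entries agree on every compared field.
func same(a, b meta) bool {
	return a.Size == b.Size && a.Mtime == b.Mtime &&
		a.Permissions == b.Permissions && bytes.Equal(a.Hash, b.Hash)
}

// diffSnapshots compares two path-indexed snapshots and returns the
// changes sorted by path.
func diffSnapshots(oldSnap, newSnap map[string]meta) []change {
	var out []change
	for path, n := range newSnap {
		o, ok := oldSnap[path]
		switch {
		case !ok:
			out = append(out, change{Path: path, Type: changeAdded, New: n})
		case !same(o, n):
			out = append(out, change{Path: path, Type: changeModified, Old: o, New: n})
		}
	}
	for path, o := range oldSnap {
		if _, ok := newSnap[path]; !ok {
			out = append(out, change{Path: path, Type: changeDeleted, Old: o})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Path < out[j].Path })
	return out
}

func main() {
	oldSnap := map[string]meta{"a.txt": {Size: 1}, "b.txt": {Size: 2}}
	newSnap := map[string]meta{"a.txt": {Size: 5}, "c.txt": {Size: 3}}
	for _, c := range diffSnapshots(oldSnap, newSnap) {
		fmt.Println(c.Path, c.Type)
	}
}
```

For added entries only the new-side fields are meaningful, and for deleted entries only the old-side fields; the other side is left at its zero value in this sketch.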
Diff¶
The Diff table contains a list of diff entries that represent the changes between two snapshots.
Generated Code¶
The FlatBuffers compiler generates Go code from the schema definition:
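// Generated code provides type-safe accessors for all fields.
// Note: the Go generator returns string fields as []byte, not string.
func (e *FileEntry) Path() []byte
func (e *FileEntry) Size() int64
func (e *FileEntry) Mtime() int64
func (e *FileEntry) IsDir() bool
func (e *FileEntry) Permissions() uint32
func (e *FileEntry) Hash(j int) byte
func (e *FileEntry) HashLength() int
func (e *FileEntry) HashBytes() []byte // whole [ubyte] vector as a slice into the buffer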
Usage Examples¶
Accessing Snapshot Data¶
// Deserialize a snapshot
snapshot := flashfs.GetRootAsSnapshot(serializedData, 0)
// Get the number of entries
entriesLength := snapshot.EntriesLength()
// Access individual entries
for i := 0; i < entriesLength; i++ {
	var entry flashfs.FileEntry
	if snapshot.Entries(&entry, i) {
		fmt.Printf("Path: %s, Size: %d bytes\n", entry.Path(), entry.Size())
		// Access hash bytes if available
		if entry.HashLength() > 0 {
			hash := make([]byte, entry.HashLength())
			for j := 0; j < entry.HashLength(); j++ {
				hash[j] = entry.Hash(j)
			}
			fmt.Printf("Hash: %x\n", hash)
		}
	}
}
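The hash-copy loop in the example above can be factored into a small helper. The sketch below is written against an interface that lists only the HashLength and Hash accessors, so it can run without the generated package; hashAccessor, copyHash, and fakeEntry are hypothetical names. Copying matters when the hash must outlive the serialized buffer: the Go generator also emits a HashBytes() accessor for [ubyte] vectors, but the slice it returns points into the buffer rather than owning its data.

```go
package main

import "fmt"

// hashAccessor lists the two generated accessors the helper relies on.
// A *flashfs.FileEntry would be expected to satisfy it.
type hashAccessor interface {
	HashLength() int
	Hash(j int) byte
}

// copyHash returns an independent copy of the entry's hash,
// or nil when the entry has no hash (for example, a directory).
func copyHash(e hashAccessor) []byte {
	n := e.HashLength()
	if n == 0 {
		return nil
	}
	out := make([]byte, n)
	for j := range out {
		out[j] = e.Hash(j)
	}
	return out
}

// fakeEntry is a stand-in used only to exercise the helper.
type fakeEntry struct{ hash []byte }

func (f fakeEntry) HashLength() int { return len(f.hash) }
func (f fakeEntry) Hash(j int) byte { return f.hash[j] }

func main() {
	fmt.Printf("%x\n", copyHash(fakeEntry{hash: []byte{0xde, 0xad}}))
}
```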
Creating Serialized Data¶
// Create a FlatBuffers builder
builder := flatbuffers.NewBuilder(0)
// Create file entries
fileEntryOffsets := make([]flatbuffers.UOffsetT, len(entries))
for i, entry := range entries {
	// Create string and byte vector offsets
	pathOffset := builder.CreateString(entry.Path)
	hashOffset := builder.CreateByteVector(entry.Hash)
	// Start building a FileEntry
	flashfs.FileEntryStart(builder)
	flashfs.FileEntryAddPath(builder, pathOffset)
	flashfs.FileEntryAddSize(builder, entry.Size)
	flashfs.FileEntryAddMtime(builder, entry.ModTime)
	flashfs.FileEntryAddIsDir(builder, entry.IsDir)
	flashfs.FileEntryAddPermissions(builder, entry.Permissions)
	flashfs.FileEntryAddHash(builder, hashOffset)
	// Finish the FileEntry
	fileEntryOffsets[i] = flashfs.FileEntryEnd(builder)
}
// Create a vector of file entries
flashfs.SnapshotStartEntriesVector(builder, len(fileEntryOffsets))
// Vectors are built back to front, so prepend in reverse order
for i := len(fileEntryOffsets) - 1; i >= 0; i-- {
	builder.PrependUOffsetT(fileEntryOffsets[i])
}
entriesVector := builder.EndVector(len(fileEntryOffsets))
// Create the snapshot
flashfs.SnapshotStart(builder)
flashfs.SnapshotAddEntries(builder, entriesVector)
snapshot := flashfs.SnapshotEnd(builder)
// Finish the builder
builder.Finish(snapshot)
// Get the serialized data
serializedData := builder.FinishedBytes()
Schema Evolution¶
The FlatBuffers schema allows for backward-compatible evolution:
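- Adding Fields: New fields can be added at the end of a table (or anywhere, if every field carries an explicit id attribute) without breaking compatibility with older data; readers see the field's default value when it is absent
- Deprecating Fields: Fields can be marked with the deprecated attribute instead of being removed, which drops the generated accessors without changing the binary layout
- Versioning: Schema versioning can be implemented through optional fields, for example a version field whose default identifies data written before the field existed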
Performance Considerations¶
The Schema component is designed for high performance:
- Zero-Copy Deserialization: Accessing data doesn't require unpacking or parsing
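- Memory Efficiency: Scalar fields left at their default value are not written to the buffer, and readers work directly on the serialized bytes instead of building a second in-memory copy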
- Cross-Platform: Same binary format works across different platforms
- Fast Access: Direct access to fields without traversing the entire structure
Integration with Other Components¶
The Schema component integrates with:
- Serializer: Provides the structure for serialization
- Storage: Defines the format for stored snapshots and diffs
- Diff: Enables efficient representation of changes
Advanced Topics¶
Custom Schemas¶
For specialized use cases, the schema can be extended:
// Extended schema with additional metadata
table SnapshotWithMetadata {
  entries: [FileEntry];
  creationTime: long;
  description: string;
  tags: [string];
}
Schema Versioning¶
To maintain compatibility across versions:
// Versioned schema
table SnapshotV2 {
  entries: [FileEntry];
  version: uint = 1; // Data written before this field existed reads back as version 1; version 2 writers set it to 2
  compressionType: byte = 0; // New field in version 2; older data reads back as 0
}
Nested Structures¶
For more complex data representation:
// Nested structures
table Directory {
  path: string;
  files: [FileEntry];
  subdirectories: [Directory];
}

table HierarchicalSnapshot {
  rootDirectory: Directory;
}