Sepia: a object-hierarchy-to-disk serializer shard

I have not seen this around but it’s a pattern I use a lot in my own coding:

  • I want data structured in a hierarchy
  • I want to save it to disk
  • I don’t want to use a database, I want to use plain old files and directories

Why?

  • Because I don’t want to lock the user in.
  • Because I want to manipulate the data using unixy tools
  • Because I want to version-control it

So, that’s Sepia. You lightly annotate your classes, and write how they can be turned to/from a string. Then you can roundtrip from/to your data tree to a directory tree.

Here’s a small example:

require "sepia"

# Configure Sepia to use a local directory for storage.
Sepia::Storage::INSTANCE.path = "./_data"

# A Postit is a simple Serializable object.
class Postit
  include Sepia::Serializable

  property text : String

  def initialize(@text); end
  def initialize; @text = ""; end

  # The to_sepia method defines the content of the serialized file.
  def to_sepia : String
    @text
  end

  # The from_sepia class method defines how to deserialize the object.
  def self.from_sepia(sepia_string : String) : self
    new(sepia_string)
  end
end

# A Board is a Container that can hold other Boards and Postits.
class Board
  include Sepia::Container

  property boards : Array(Board)
  property postits : Array(Postit)

  def initialize(@boards = [] of Board, @postits = [] of Postit); end
end

# --- Create and Save ---

# A top-level board for "Work"
work_board = Board.new
work_board.sepia_id = "work_board"

# A nested board for "Project X"
project_x_board = Board.new
project_x_board.sepia_id = "project_x" # This ID is only used for top-level objects

# Create some Post-its
postit1 = Postit.new("Finish the report")
postit1.sepia_id = "report_postit"
postit2 = Postit.new("Review the code")
postit2.sepia_id = "code_review_postit"

# Assemble the structure
project_x_board.postits << postit2
work_board.boards << project_x_board
work_board.postits << postit1

# Save the top-level board. This will recursively save all its contents.
work_board.save

# --- Load ---

loaded_work_board = Board.load("work_board").as(Board)

puts loaded_work_board.postits[0].text # => "Finish the report"
puts loaded_work_board.boards[0].postits[0].text # => "Review the code"

And it produces this tree on disk:

./_data
├── Board
│   └── work_board
│       ├── boards
│       │   └── project_x
│       │       └── postits
│       │           └── 0 -> ./_data/Postit/code_review_postit
│       └── postits
│             └── 0 -> ./_data/Postit/report_postit
└── Postit
    ├── code_review_postit
    └── report_postit

You can nest containers all you want, some bits are to be implemented (like preserving order when roundtripping an array) and files with data are deduplicated (they have a canonical location and are symlinked to all the places where they are referenced)

UPDATE: And I forgot to add the link, of course. https://github.com/ralsina/sepia

2 Likes

Is there a link to a repo?

Repo seems to be at GitHub - ralsina/sepia: A serializer focused on storing a tree of objects to disk in an intuitive way

I love this!

So many things absolutely do not need the performance of a database, but would benefit greatly by simple access to data via 5 decades of filesystem tools.

Absolutely brilliant. (idea that is, can’t speak for the implementation, have not looked at it :) )

2 Likes

The implementation still needs work and to be used in anger by someone other than me at some point :-D

1 Like