Cofree will tear us apart
Published:
Recently I’ve been working on distributed systems and I often need to deal with the data that may be located anywhere. We’ll see how just one algebraic construction helped to describe a problem with recycling data in general and then to solve it.
I’ve created a library apart, whose definitions are used in this post.
Introduction and motivation
Let’s suppose you need to write a program that works with some data. If the data is too big, you should think about how to manage and store it because keeping it in RAM is too dangerous, so we need to split the data into the segment that we need right now and the rest that we don’t.
How we’re going to do that? First of all, we need an abstraction that allows us to describe various data structures algebraically. And such an abstraction exists!
Basic construction Cofree
Many Haskellers know Free
type, but Cofree
is unfairly bypassed. The difference is really small - while Free
is a sum of a
and t (Free t a)
, Cofree
is a product:
Free t a = a + t (Free t a)
Cofree t a = a * t (Cofree t a)
If we pick Cofree
as a basic construction, we’ll get those functions for free:
extract
- Get a focused value, so our structure can’t be emptyunwrap
- Get a segment of structure without focused valueextend
- Update values with a function that takes care of contextcoiter
- Generate structure from a seed
So, how are we going to build various data structures using Cofree
? All we need is a Functor
. To be more precise, we need to instantiate t
type in Cofree t a
.
Suppose we need a stack or non-empty list - fairly simple structure. We can take Maybe
, cause it has everything we need: Just
lets you continue describing a structure and Nothing
lets you terminate an expression (cause it’s a null constructor).
type Stack = Cofree Maybe
stack :: Stack Int
stack = 1 :< Just (2 :< Just (3 :< Nothing))
Helper construction Shape
Okay, now we know how to describe structures algebraically via Cofree
. But we talked about separation of data that is located in different places. So we’ll just add one more construction:
data Shape t raw value = Ready (t value) | Converted raw
data Apart t raw value = Apart (Cofree (Shape t raw) value)
Ready
constructor describes values in RAM, Converted
describes data located somewhere else. Cofree
and Shape
together can give us the wonderful type Apart
.
Simple example of usage Apart
Imagine that we want to work with binary tree. How can we describe it via Cofree
? We need a Functor
of branching here. A node of a binary tree can either have a left child, a right child, both, or no children. Let’s write it down.
data Crotch a = End | Less a | Greater a | Crotch a a
type Binary = Cofree Crotch
An example binary tree from Wikipedia:
example :: Binary Int
example = 8 :< Crotch
(3:< Crotch
(1 :< End)
(6 :< Crotch
(4 :< End)
(7 :< End)))
(10 :< Greater
(14 :< Less
(13 :< End)))
Suppose we need to keep a tree in memory with height not greater than 3. We can limit
it.
limit 3 do_something_with_the_rest example
I’m intentionally not focusing on how to persist the segments - it’s not the goal of this post. We We can persist an out of range segment in a file and the function do_something_with_the_rest
can return the filename and line number. Or we can put them in Redis
/Memcashed
/Tarantool
and return the connection parameters and keys for the saved segments. It doesn’t matter.
8 :< Crotch
(3 :< Crotch
(1 :< {RESTORE_INFO})
(6 :< {RESTORE_INFO}))
(10 :< Greater
(14 :< {RESTORE_INFO}))
So we cut off our tree and put information for restoring where these unnecessary segments were. With recover
we can put the whole structure back in memory.
recover back_to_RAM scattered
Or, we can traverse with effects on a scattered data structure and restore all the segments by applying this function to them as well - just use fluent
.
fluent do_something_wherever_they_are scattered
Conclusion
So far, I’ve successfully defined acyclic oriented graphs
, binary
, prefix
, rose
, AVL
and Splay
trees, as well as some functions to deal with them.
The idea of using Cofree
as base construction for another structures was captured from the description module Control.Comonad.Cofree
in package free
of Edward Kmett.
The idea of algebraic graphs comes from paper of Andrew Mokhov.
But there is a lot to do:
- Try to define Finger tree, Scapegoat, ART and other complicated structures.
- Check compatibility with streaming libraries (
pipes
,conduit
,io-streams
,machines
). - Natural transformations between structures (cause of
Functor
). - Benchmarks, with popular
containers
, for the first.
Leave a Comment