Cofree will tear us apart

5 minute read

Published:

Recently I’ve been working on distributed systems and I often need to deal with the data that may be located anywhere. We’ll see how just one algebraic construction helped to describe a problem with recycling data in general and then to solve it.

I’ve created a library apart, whose definitions are used in this post.

Introduction and motivation

Let’s suppose you need to write a program that works with some data. If the data is too big, you should think about how to manage and store it because keeping it in RAM is too dangerous, so we need to split the data into the segment that we need right now and the rest that we don’t.

How we’re going to do that? First of all, we need an abstraction that allows us to describe various data structures algebraically. And such an abstraction exists!

Basic construction Cofree

Many Haskellers know Free type, but Cofree is unfairly bypassed. The difference is really small - while Free is a sum of a and t (Free t a), Cofree is a product:

Free t a = a + t (Free t a)
Cofree t a = a * t (Cofree t a)

If we pick Cofree as a basic construction, we’ll get those functions for free:

  • extract - Get a focused value, so our structure can’t be empty
  • unwrap - Get a segment of structure without focused value
  • extend - Update values with a function that takes care of context
  • coiter - Generate structure from a seed

So, how are we going to build various data structures using Cofree? All we need is a Functor. To be more precise, we need to instantiate t type in Cofree t a.

Suppose we need a stack or non-empty list - fairly simple structure. We can take Maybe, cause it has everything we need: Just lets you continue describing a structure and Nothing lets you terminate an expression (cause it’s a null constructor).

type Stack = Cofree Maybe

stack :: Stack Int
stack = 1 :< Just (2 :< Just (3 :< Nothing))

Helper construction Shape

Okay, now we know how to describe structures algebraically via Cofree. But we talked about separation of data that is located in different places. So we’ll just add one more construction:

data Shape t raw value = Ready (t value) | Converted raw

data Apart t raw value = Apart (Cofree (Shape t raw) value)

Ready constructor describes values in RAM, Converted describes data located somewhere else. Cofree and Shape together can give us the wonderful type Apart.

Simple example of usage Apart

Imagine that we want to work with binary tree. How can we describe it via Cofree? We need a Functor of branching here. A node of a binary tree can either have a left child, a right child, both, or no children. Let’s write it down.

data Crotch a = End | Less a | Greater a | Crotch a a

type Binary = Cofree Crotch

An example binary tree from Wikipedia:

Alt text

example :: Binary Int
example = 8 :< Crotch
	(3:< Crotch
		(1 :< End)
		(6 :< Crotch
			(4 :< End)
			(7 :< End)))
	(10 :< Greater
		(14 :< Less
			(13 :< End)))

Suppose we need to keep a tree in memory with height not greater than 3. We can limit it.

limit 3 do_something_with_the_rest example

I’m intentionally not focusing on how to persist the segments - it’s not the goal of this post. We We can persist an out of range segment in a file and the function do_something_with_the_rest can return the filename and line number. Or we can put them in Redis/Memcashed/Tarantool and return the connection parameters and keys for the saved segments. It doesn’t matter.

8 :< Crotch
	(3 :< Crotch
		(1 :< {RESTORE_INFO})
		(6 :< {RESTORE_INFO}))
	(10 :< Greater
		(14 :< {RESTORE_INFO}))

So we cut off our tree and put information for restoring where these unnecessary segments were. With recover we can put the whole structure back in memory.

recover back_to_RAM scattered

Or, we can traverse with effects on a scattered data structure and restore all the segments by applying this function to them as well - just use fluent.

fluent do_something_wherever_they_are scattered

Conclusion

So far, I’ve successfully defined acyclic oriented graphs, binary, prefix, rose, AVL and Splay trees, as well as some functions to deal with them.

The idea of using Cofree as base construction for another structures was captured from the description module Control.Comonad.Cofree in package free of Edward Kmett.

The idea of algebraic graphs comes from paper of Andrew Mokhov.

But there is a lot to do:

  • Try to define Finger tree, Scapegoat, ART and other complicated structures.
  • Check compatibility with streaming libraries (pipes, conduit, io-streams, machines).
  • Natural transformations between structures (cause of Functor).
  • Benchmarks, with popular containers, for the first.

Leave a Comment