The graph data structure of Mastodon.¶
The mastodon-graph Java package can be used to implement directed graphs with small memory footprint.
Vertices and edges are not stored as individual objects. Instead vertex and edge data is laid out in a primitive
byte array and accessed via proxy objects.
This page describes some internals of the trackmate-graph package.
Each vertex and each edge maps to a contiguous portion of a
The size of a portion (elements) is fixed for a particular
Each element starts with a fixed part that represents the graph structure and then additional payload used by the subclass to describe some vertex attributes, etc.
There is one
byte array that stores all vertices, and one
byte array that stores all edges.
References between these arrays are in the form of element indices.
Whether these are indices refer to elements in the vertex or in the edge memory array is clear from the context.
The following diagram illustrates the layout of vertices in a
In the left-most column, element indices are shown (in blue), followed by byte indices (in grey). In the example, each element of the
Spot requires 57 bytes to store.
The data for the i th
Spot starts at byte i * 57.
AbstractVertex part comprises element indices
FIRST_OUT_EDGE, occupying 4 bytes each.
The remaining 49 bytes are
FIRST_IN_EDGE is the element index (in the edge memory array) of the first incoming edge, i.e., an edge pointing to this vertex.
The remaining incoming edges of the same vertex are stored as a linked list in the edge memory as described below.
If this vertex does not have any incoming edges
FIRST_IN_EDGE is -1.
FIRST_OUT_EDGE is the element index of the first outgoing edge, i.e., an edge starting from this vertex.
The remaining outgoing edges of the same vertex are stored as a linked list in the edge memory as described below. If this vertex does not have any outgoing edges
FIRST_OUT_EDGE is -1.
The following diagram illustrates the layout of edges in a
In the left-most column, element indices are shown (in red), followed by byte indices (in grey). In the example, each element of the
Edge requires 24 bytes to store.
The data for the i th
Edge starts at byte i * 24.
AbstractEdge part comprises element indices
NEXT_TARGET_EDGE, occupying 4 bytes each.
The remaining 8 bytes are
SOURCEis the element index (in the vertex memory array) of the vertex from which this edge starts.
TARGETis the element index (in the vertex memory array) of the vertex to which this edge points.
NEXT_SOURCE_EDGEis the element index (in the edge memory array) of the next outgoing edge of the source vertex, , the next edge that has the same
SOURCE. If there is no such edge then
NEXT_TARGET_EDGEis the element index (in the edge memory array) of the next incoming edge of the target vertex, , the next edge that has the same
TARGET. If there is no such edge then
Consider the following example graph comprising vertices A, B, C, D, E and edges a, b, c, d.
This is laid out in memory as follows
The links between vertex B and it’s adjacent edges a, b, c have been highlighted.
Let’s look at that in more detail:
Vertex B is stored at element index 1 in the vertex memory array.
B has one incoming edge a.
The edge a is stored at element index 0 in the edge memory array.
FIRST_IN_EDGE field of B is 0.
Apart from a, the vertex B has no further incoming edges.
NEXT_TARGET_EDGE field of a is -1, i.e., the list of edges entering B terminates here.
B has two outgoing edges b, c.
The edge b is stored at element index 1 in the edge memory array.
FIRST_OUT_EDGE field of B is 1.
The next outgoing edge of B is c which is stored at element index 2.
NEXT_SOURCE_EDGE field of b is 2.
After c, the vertex B has no further outgoing edges.
NEXT_SOURCE_EDGE field of c is -1, i.e., the list of edges leaving B terminates here.
Below is the same memory layout again, this time highlighting the references from edges a, b, c back to the vertex memory array.
For example, edge c is leaving vertex B (index 1) and entering vertex D (index 3).
SOURCE field of c is 1, and the
TARGET field of c is 3.
Free-list of unallocated elements.¶
The vertex and edge memory arrays can only ever grow. When elements are released, they are simply marked as free for re-use. Assume that in the above example vertex D is deleted, as well as it’s adjacent edges c, d, leaving this:
After removing c, d, D the memory layout looks like this:
The element 3 in the vertex memory array as well as elements 2 and 3 in the edge memory array have been marked as free.
This is done by putting the magic number “-2” into the first 4 bytes of the element.
In vertices and edges the first 4 bytes are always occupied by a (positive) index or a −1 index list terminator.
Therefore, occupied and free blocks can not be confused.
The next 4 bytes of a freed element are the index of the next freed element in the (same) memory array, or -1 if there is no next freed element.
Each memory array remembers the index
free of the first freed element.
Newly freed elements are enqueued at
free, that is, at the head of the free-list.
So in the above example, edge element 2 was freed first, followed by edge element 3.
The next edge element will be allocated at head of the free-list and the
free index move to the next element.
For example, assume that a new edge is created from B to E:
free is -1, then no more freed elements are available and the underlying memory array has to grow to fit newly added elements.