Feeds and Messages
Ssb is designed around a data model optimized for simple data replication and strong cryptographic integrity guarantees in a social setting. The central entities in this model are feeds and messages. Messages are the pieces of data inserted into the system; feeds describe who authored a piece of data.
Messages carry schemaless, free-form pieces of data, together with some metadata necessary for message replication and verification. Each message belongs to exactly one feed, and each message contains a backlink to the previous message posted to that feed. Feeds thus form linked lists. Links are implemented via cryptographically secure hashes; in that sense, feeds behave mostly like blockchains. But whereas blockchains are traditionally used to create a global, single source of truth, ssb chooses a different approach.
Instead of one single, global linked list for all data, each ssb user has their own linked list. To ensure that no malicious actor can append data to another user's list, all messages are signed. We call this data structure - a signed hash-based linked list - a sigchain.
The sigchain-per-user design has some very desirable properties:
- data can be moved across untrusted machines, and you still know that some piece of data has indeed been posted by a certain identity
- immutability of data, nothing gets lost
- the order between messages can not be confused
- replicating data becomes simple: just exchange the number of messages you know about for a certain feed, and your peer can send you everything that is newer
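The replication idea from the last point can be sketched in a few lines. This is only an illustration under stated assumptions: the in-memory Store type and the function names are hypothetical, messages are plain dictionaries, and a real implementation would verify every received message before storing it.

```python
from typing import Dict, List

# Hypothetical in-memory store: feed id -> messages ordered by sequence number.
Store = Dict[str, List[dict]]


def known_count(store: Store, feed_id: str) -> int:
    """Number of messages held for a feed (0 if the feed is unknown)."""
    return len(store.get(feed_id, []))


def messages_newer_than(store: Store, feed_id: str, peer_count: int) -> List[dict]:
    """Everything the peer is missing, given how many messages it already has."""
    return store.get(feed_id, [])[peer_count:]


def replicate(local: Store, remote: Store, feed_id: str) -> int:
    """Pull the missing messages of one feed from remote into local.

    Returns how many messages arrived. Verification is deliberately omitted.
    """
    missing = messages_newer_than(remote, feed_id, known_count(local, feed_id))
    local.setdefault(feed_id, []).extend(missing)
    return len(missing)
```

Because a feed is a single linked list, one integer per feed is enough to describe what a peer already has; this is also why partial subscription is not possible with this scheme.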
Of course, this design also has some drawbacks:
- the simple replication scheme does not allow subscription to only parts of the data - it's all or nothing
- immutability and non-repudiation are not appropriate for all use-cases
- a single identity can not append data from multiple machines in parallel, as that would result in a tree rather than a linked list
In some sense, ssb can be seen as an experiment to find out whether the advantages of a sigchain-based distributed database outweigh the disadvantages. So far, it appears to be working sufficiently well.
The remainder of this chapter describes the exact format of the metadata ssb maintains to build the sigchain. Conceptually, to form the sigchain, the metadata of each message must include:
- the feed the message belongs to
- the hash of the previous message from the same feed, or null
- the free-form content of the message
- a signature to prove that the author knew the private key of the feed
The actual metadata formats also need to include some extra information.
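To make the conceptual model concrete, here is a minimal sketch of the hash-linking part of a sigchain. It rests on several assumptions: toy_hash (SHA-256 over a sorted-key JSON dump) is a stand-in and not the real ssb message hash, the function names are hypothetical, and signatures are left out entirely because they need a signing primitive that is outside the scope of this sketch.

```python
import hashlib
import json
from typing import List, Optional


def toy_hash(message: dict) -> str:
    """Stand-in hash: SHA-256 over a sorted-key JSON dump (NOT the ssb message hash)."""
    data = json.dumps(message, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append(feed: List[dict], author: str, content: dict) -> dict:
    """Append a message that links back to the current tip of the feed."""
    previous: Optional[str] = toy_hash(feed[-1]) if feed else None
    message = {"author": author, "previous": previous, "content": content}
    feed.append(message)
    return message


def links_are_intact(feed: List[dict]) -> bool:
    """Check that every message points at the hash of its predecessor."""
    expected = None  # the first message must have previous == None
    for message in feed:
        if message["previous"] != expected:
            return False
        expected = toy_hash(message)
    return True
```

Changing any earlier message changes its hash, so the backlink stored in its successor no longer matches and the check fails.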
Metadata
The abstract model for metadata is a tuple containing the following entries:
previous
: either a multihash, or the distinct value null
author
: a multikey
sequence
: an unsigned 53 bit integer
timestamp
: an IEEE 754 64 bit float except the infinities, negative zero, and NaNs
content
: a data value that is either:
  - an object containing an entry "type", whose value is a string that takes between 3 and 53 (inclusive) code units when encoded as utf16
  - a multibox
swapped
: a boolean indicating how to encode the metadata
signature
: a signature of the message, generated by the cryptographic primitive of the author
  - this can only be computed once all other metadata is known
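The numeric and string constraints in this list are easy to get subtly wrong, so the following sketch spells them out. The function names are hypothetical, and only the object form of content is covered; the multibox form and the multihash and multikey checks are not handled here.

```python
import math

MAX_SEQUENCE = 2**53 - 1  # largest value of an unsigned 53 bit integer


def utf16_code_units(text: str) -> int:
    """Length of text in utf16 code units (astral characters count twice)."""
    return len(text.encode("utf-16-le", "surrogatepass")) // 2


def is_valid_sequence(value) -> bool:
    """True for integers in the unsigned 53 bit range."""
    return (isinstance(value, int) and not isinstance(value, bool)
            and 0 <= value <= MAX_SEQUENCE)


def is_valid_timestamp(value) -> bool:
    """True for 64 bit floats other than the infinities, negative zero, and NaNs."""
    if not isinstance(value, float):
        return False
    if math.isnan(value) or math.isinf(value):
        return False
    # Negative zero compares equal to 0.0, so inspect the sign explicitly.
    return not (value == 0.0 and math.copysign(1.0, value) < 0)


def is_valid_content_type(content) -> bool:
    """True if content is an object whose "type" is 3 to 53 utf16 code units long."""
    if not isinstance(content, dict):
        return False
    type_value = content.get("type")
    return isinstance(type_value, str) and 3 <= utf16_code_units(type_value) <= 53
```

Counting code units rather than characters matters for strings outside the basic multilingual plane, where one character occupies two utf16 code units.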
Json Encoding
Metadata can be encoded as json, just like regular data. The json encoding is a json object containing the entries listed above, with the following additional regulations:
- the previous multihash must use the message hash encoding as a string
- the author multikey must use the multikey encoding as a string
- the sequence is serialized as a float, since that's the only number type available
  - it must not be negative
  - it must not contain a decimal point
- the signature value is a string whose content is the concatenation of:
  - the canonical base64 encoding of the message's signature itself (see next section)
    - ietf rfc 4648, section 4, disallowing superfluous "=" characters inside the data or after the necessary padding "=" characters
  - the characters .sig. ([0x2E, 0x73, 0x69, 0x67, 0x2E])
  - a primitive-specific suffix, depending on the primitive of the author
    - for ed25519, this is ed25519 ([0x65, 0x64, 0x32, 0x35, 0x35, 0x31, 0x39])
- the swapped boolean is omitted
- an entry "hash": "sha256" is added
- if swapped, the order of entries must be previous, sequence, author, timestamp, hash, content, signature, else it must be previous, author, sequence, timestamp, hash, content, signature
- if content is a multibox, it must use the multibox encoding as a string
When creating new messages, the (purely logical) value of swapped should be false. Or you might set it to true, or generate it randomly - nobody can stop you, and everything will still work.
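As a rough illustration of the rules above, the following sketch builds the signature string and arranges the entries in the required order. The function names are hypothetical. It assumes that the previous, author, and content values have already been brought into their string or object encodings, and it says nothing about whitespace or other details of serialization.

```python
import base64


def encode_signature(raw_signature: bytes, primitive: str = "ed25519") -> str:
    """Base64 of the signature, then ".sig.", then the primitive-specific suffix."""
    # b64encode implements rfc 4648 section 4 and emits only the necessary padding.
    return base64.b64encode(raw_signature).decode("ascii") + ".sig." + primitive


def ordered_entries(meta: dict, swapped: bool = False) -> dict:
    """Arrange the metadata entries in the order the json encoding requires."""
    if swapped:
        head = ["previous", "sequence", "author", "timestamp"]
    else:
        head = ["previous", "author", "sequence", "timestamp"]
    message = {key: meta[key] for key in head}
    message["hash"] = "sha256"  # added entry; the swapped boolean itself is omitted
    message["content"] = meta["content"]
    message["signature"] = meta["signature"]
    return message
```

Python dictionaries keep insertion order, so the result can be handed to a serializer that respects that order. The swapped flag only influences the position of author and sequence and never appears in the output.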
Messages
Exactly one data value, with a fixed order of object entries, corresponds to this json encoding. This value is referred to as a message.
Computing a Message's Signature
To obtain the signature of a message, first compute the json encoding specified above, but without the "signature" entry. Then compute the signing encoding of the corresponding data value, using the entry order of the json as a tie-breaker for the entry-order of the signing encoding. Finally use the signing primitive of the message's author on the signing encoding, to obtain the signature.
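The three steps can be sketched as a small function. Both signing_encoding and sign are hypothetical callables supplied by the caller: the signing encoding is defined elsewhere, and the signing primitive (for example ed25519) would come from a cryptography library. The sketch only shows the order of operations.

```python
import base64
from typing import Callable


def sign_message(unsigned: dict,
                 signing_encoding: Callable[[dict], bytes],
                 sign: Callable[[bytes], bytes],
                 primitive: str = "ed25519") -> dict:
    """Return a copy of unsigned with a "signature" entry appended last.

    unsigned is the json encoding without its "signature" entry, with the
    entries already in the desired order.
    """
    if "signature" in unsigned:
        raise ValueError("the signature is computed without the signature entry")
    # Steps 2 and 3: compute the signing encoding, then sign those bytes.
    raw_signature = sign(signing_encoding(unsigned))
    signed = dict(unsigned)  # copying preserves the entry order
    signed["signature"] = (
        base64.b64encode(raw_signature).decode("ascii") + ".sig." + primitive
    )
    return signed
```

Keeping the encoder and the signer as parameters makes the dependency explicit: the signature covers the bytes of the signing encoding, so the entry order chosen for the unsigned value directly affects the result.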
Computing the Hash of a Message
To compute the hash of a message, use the value hash computation on the signing encoding, with the object entry order that would produce the correct signature.
Computing the Size of a Message
To compute the size of a message, use the value length computation on the signing encoding, with the object entry order that would produce the correct signature.
Message Validation
A message is considered valid if and only if all of the following conditions are met:
- its json encoding is a possible output of the message creation algorithm
  - in particular, the claimed signature must match the data and public key
- its length is smaller than 16385
- if previous is null, the sequence number of the message must be the float 1, else it must be one larger than that of the message whose hash is the value of previous
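The last two conditions can be sketched as follows. The function names and the by_hash lookup table are hypothetical, the length is assumed to have been computed already by the value length computation, and treating a missing predecessor as invalid is a choice of this sketch rather than something stated above. The signature check is not covered.

```python
from typing import Dict, Optional

MAX_LENGTH_EXCLUSIVE = 16385  # a valid message is smaller than this


def length_is_valid(length: int) -> bool:
    """True if the precomputed message length is within the limit."""
    return length < MAX_LENGTH_EXCLUSIVE


def sequence_is_valid(message: dict, by_hash: Dict[str, dict]) -> bool:
    """Check the sequence number against the message referenced by previous."""
    previous: Optional[str] = message["previous"]
    if previous is None:
        return message["sequence"] == 1
    predecessor = by_hash.get(previous)
    if predecessor is None:
        # Assumption of this sketch: without the predecessor the rule
        # cannot be checked, so the message is not accepted.
        return False
    return message["sequence"] == predecessor["sequence"] + 1
```

A first message therefore always carries sequence number 1, and every later message is checked against the message that its previous entry points to.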