Introduction
TODO (mw) re-write this.
This document is the specification of the secure-scuttlebutt protocol (ssb). The primary audience are developers who need a thorough understanding of protocol internals, for example because they want to implement it themselves. The specificaiton is not intended for developers who want to build applications on top of ssb. Ssb provides nice abstractions to those developer, hiding the nitty-gritty details. This spec however is all about those details.
It is possible to read through the spec from end to end, but it's not an easy read. It is structured into different sections, each dealing with a different aspect of the protocol. The sections begin with a high-level overview of the problem space, and then delve into the details. In the description of those details, the highest importance is given to unambiguity. If you can not tell exactly how something works, it is a bug in the spec.
Feed spec
Common Datatypes
A few datatypes appear throughout this spec, it makes sense to introduce them in one single place.
An overarching theme of various parts of the ssb protocol(s) is that of future-proofness. The protocol will need to adapt itself to new circumstances, such as new transport channels or broken cryptographic primitives. A simple way to keep flexibility are self-describing multiformats, where the data is prefixed by an identifier (and in some cases its expected length). New data types can be introduced by simply assigning a new identifier. Older software can detect data types it doesn't understand yet, and react accordingly.
Each format consists of some logical data type, and then one or more encodings. These encodings can serve different purposes, for example they might be optimized for machine-readability, human-readability, uniqueness, backwards-compatibility, etc.
Multikey
A multikey is the public key of some digital signature scheme, annotated with an identifier for the scheme itself. The only currently supported cryptographic primitive is ed25519 (which has a public key length of 32 bytes).
Multikey Encoding
The encoding of a multikey is defined as the concatenation of:
- the character
@
(0x40
) - the canonic base64 encoding of the key itself
- ietf rfc 4648, section 4, disallowing superflous
=
characters inside the data or after the necessary padding=
s - it may not omit
=
characters either - the amount of encoding bytes must always be a multiple of four
- ietf rfc 4648, section 4, disallowing superflous
- the character
.
(0x2E
) - the primitive-specific suffix
- for ed25519, this is
ed25519
([0x65, 0x64, 0x32, 0x35, 0x35, 0x31, 0x39]
)
- for ed25519, this is
Typically, this encoding is stored in a json string.
Multifeed
A multifeed represents a scuttlebutt feed. It consists of a feed kind indicator and some additional data depending on the feed kind. The only currently supported feed kind is multikey
, whose additional data surprisingly enough is a multikey.
Multifeed Encoding
The encoding of a multikey multifeed is the same as the encoding of the multikey. The encoding of possible future multifeed kinds will use a different first char than @
.
Multihash
A multihash is a pair of:
- the hash target
- the hash digest of some cryptographically secure hash function, annotated with an identifier for the hash function itself.
The only currently supported hash targets are Message
and Blob
The only currently supported cryptographic primitive is sha256 (which has a digest length of 32 bytes).
Multihash Encoding
The encoding of a multihash is defined as the concatenation of:
- depending on the hash target:
Message
: the character%
(0x25
)Blob
: the character&
(0x26
)
- the canonic base64 encoding of the digest itself
- ietf rfc 4648, section 4, disallowing superflous
=
characters inside the data or after the necessary padding=
s - it may not omit
=
characters either - the amount of encoding bytes must always be a multiple of four
- ietf rfc 4648, section 4, disallowing superflous
- the character
.
(0x2E
) - the primitive-specific suffix
- for sha256, this is
sha256
([0x73, 0x68, 0x61, 0x32, 0x35, 0x36]
)
- for sha256, this is
Typically, this encoding is stored in a json string.
Multibox
A multibox is a cyphertext, annotated with an identifier for the algorithm that produced it. The algorithm identifiers are natural numbers between 0 and 2^64 - 1 (inclusive). Even identifiers are reserved for assignment by the ssb protocol devs, odd identifiers are open for experimentation and/or custom usage.
The only currently specified algorithm is private-box, using the identifier 0.
Multibox Encoding
The encoding of a multibox is defined as the concatenation of:
- the canonic base64 encoding of the cyphertext
- ietf rfc 4648, section 4, disallowing superflous
=
- it may not omit
=
characters either - the amount of encoding bytes must always be a multiple of four
- ietf rfc 4648, section 4, disallowing superflous
- the characters
.box
([0x2E, 0x62, 0x6F, 0x78]
) - the uppercase base32 encoding without padding of the identifier, without leading zeros, using the following table (the canonic subset of Crockford's base32):
Value | Symbol |
---|---|
0 | 0 |
1 | 1 |
2 | 2 |
3 | 3 |
4 | 4 |
5 | 5 |
6 | 6 |
7 | 7 |
8 | 8 |
9 | 9 |
10 | A |
11 | B |
12 | C |
13 | D |
14 | E |
15 | F |
16 | G |
17 | H |
18 | J |
19 | K |
20 | M |
21 | N |
22 | P |
23 | Q |
24 | R |
25 | S |
26 | T |
27 | V |
28 | W |
29 | X |
30 | Y |
31 | Z |
Due to omitting leading zeros, the suffix for private box (identifier 0) is simply "box"
.
For large identifiers (between 2^60 and 2^64 - 1 inclusive), 13 characters are needed, and the most-significant bit of the leftmost character does not contribute to the decoded value. Identifiers must be encoded such that this ignored bit is set to zero. Put another way: If the identifier encoding takes up 13 characters, it must begin with one of (1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F)
.
Datamodel
Ssb accepts schemaless data, adhering to a certain format specified in this section. This is the data that application developers care about. Internally, ssb then adds some more metadata to maintain the append-only logs, discussed in the next chapter. This chapter only describes the data model of the schemaless content data.
It is important to clearly distinguish between the abstract data model of a format, and various encodings. For example, the strings 1
, 1.0
and 0.1e1
are all different json encodings of the same abstract value. And there can even be different encoding formats for the same abstract value, e.g. the bytes 0xf93c00
also encode the floating point number 1.0, but in cbor rather than json.
Ssb messages all use the same abstract data model, but there are different situations where they use different encodings. An encoding used for signing the data has different requirements than an encoding used for transmitting the data, which again has different requirements than an encoding used for persisten storage.
Encodings for persistent storage are not specified in this document, but is crucial for all ssb implementations to compute the exact same signatures, and to send data in a format that can be understood by other implementations. We call these encodings signing encoding and transport encoding respectively.
The ssb protocol was initially implemented in javascript, and it relied heavily on implicit behavior of the node js runtime. People (including the authors of this specification) have called the message format "bizarre" and worse, and that is entirely justified. But when during reading you inevitably thing "How could anybody end up with this, I could have done a much better job.", then remember: Maybe you could have, but you didn't.
Abstract Data Model
The data model describes all of the free-form data that can be carried in a message. It is close to the json data model, but with a few differences. The definition came about as the set of javascript values created by JSON.parse(json)
for which json === JSON.stringify(JSON.parse(json))
(javascript code).
Defined in a language-agnostic way:
Null
null
is a value that carries no information.
Booleans
true
and false
are values, called booleans.
Strings
An ordered sequence of bytes which form valid utf8 of length between 0
and 2^53 - 1
(inclusive) is a value, called a string. Such a string may include null bytes.
Floats
An IEEE 754 double precision (64 bit) floating point number that is none of Infinity
, -Infinity
, -0
or NaN
is a value, called a float.
Arrays
Let n
be a natural number less than 2^32
, and let v_0, ..., v_n
be values.
The ordered sequence [v_0, ..., v_n]
is a value, called an array.
Objects
Let n
be a natural number less than 2^32
, let s_0, ..., s_n
be pairwise distinct strings, and let v_0, ..., v_n
be values.
The (unordered) set of pairs (s_i, v_i)
for all 0 <= i <= n
is a value, called a object. The pairs are called entries, the strings are called keys, and the values are called values.
Signing Encoding
The encoding to turn values into a signeable array of bytes is based on json (the set of valid encodings is a subset of ECMA-404 json). There are multiple valid encodings of a single value, because some of the entries of an objects can be encoded in an arbitary order. But up to object entry order, the encoding is unique. When receiving a message over the network, the order of the freely-orderable object entries in the transport encoding is the order that must be used for verifying the signature. Thus the network encoding induces a unique signing encoding to use for signature checking.
The signing encoding is defined as follows:
Signing Encoding Null
null
is encoded as the utf-8 string null
([0x6E, 0x75, 0x6C, 0x6C]
).
Signing Encoding Booleans
true
is encoded as the utf-8 string true
([0x74, 0x72, 0x75, 0x65]
).
false
is encoded as the utf-8 string false
([0x66, 0x61, 0x6c, 0x73, 0x65]
).
Signing Encoding Strings
A string containing the unicode code points c_0, ..., c_n
is is encoded as follows:
- begin with the utf-8 string
"
(0x22
) - for each code point
c_i
inc_0, ..., c_n
:- if
c_i
is unicode code point0x000022
(quotation mark"
), append the utf-8 string\"
([0x5C, 0x22]
) - else if
c_i
is unicode code point0x00005C
(reverse solidus\
), append the utf-8 string\\
([0x5C, 0x5C]
) - else if
c_i
is unicode code point0x000008
(backspace), append the utf-8 string\b
([0x5C, 0x62]
) - else if
c_i
is unicode code point0x00000C
(form feed), append the utf-8 string\f
([0x5C, 0x66]
) - else if
c_i
is unicode code point0x00000A
(line feed), append the utf-8 string\n
([0x5C, 0x6E]
) - else if
c_i
is unicode code point0x00000D
(carriage return), append the utf-8 string\r
([0x5C, 0x72]
) - else if
c_i
is unicode code point0x000009
(line tabulation), append the utf-8 string\t
([0x5C, 0x74]
) - else if
c_i
is a unicode code point below0x000020
(space), append the utf-8 string\u<hex>
([0x5C, 0x75, <hex>]
), where<hex>
are the two utf-8 bytes of the hexadecimal encoding of the code point, using lower-case lettersa
-f
(0x61
-0x66
) for alphabetic hex digits - else append the utf-8 representation of
c_i
without any modifications
- if
- append the utf-8 string
"
(0x22
)
Signing Encoding Floats
A float m
is encoded as follows:
- if
m == 0
, the encoding is the utf-8 string0
(0x30
) - else if
m
is negative, the encoding is the utf-8 string-<abs(m)>
([0x2d, <abs(m)>]
), where<abs(m)>
is the encoding of the same float with the sign bit flipped - else (largely quoting from the ECMAScript specification, applying NOTE 2 from here on):
- let
n
,k
ands
be integers such that:k >= 1
10 ^ (k - 1) <= s <= 10 ^ k
s * (10 ^ (n - k))
ism
(or round-to-even-s tom
if it is not precisely representable in a 64 bit float)k
is as small as possible- if there are multiple values for
s
, choose the one for whichs * (10 ^ (n - k))
is closest in value tom
- if there are two such possible values of
s
, choose the one that is even - Intuitively,
s
is the integer you get by removing the point and all trailing zeros from the decimal representation ofm
,k
is the number of digits in the decimal representation ofs
, andn
specifies how to print the number: Ifn
is greater than0
, there aren
digits left of the point, else there areabs(n)
many zeros right of the point. The choice ofs
uniquely determinesk
andn
, the tricky part is finding the uniques
that rounds correctly and for whichk
is minimal.
- if
k <= n <= 21
, the encoding is the utf-8 string<k_decimals><trailing_zeros>
, where<k_decimals>
is the utf-8 encoding of the digits of the decimal representation ofs
, and<trailing_zeros
aren - k
zero digits (0x30
) - else if
0 <= n <= 21
, the encoding is the utf-8 string<pre_point>.<post_point>
([<pre_point>, 0x2E, <post_point>]
), where<pre_point>
is the utf-8 encoding of the most significantn
digits of the decimal representation ofs
, and<post_point>
is the utf-8 encoding of the remainingk - n
digits of the decimal representation ofs
- else if
-6 < n <= 0
, the encoding is the utf-8 string0.<zeros><k_decimals>
([0x30, 0x2E, <zeros>, <k_decimals>]
), where<zeros>
are-n
many zero digits (0x30
), and<k_decimals>
is the utf-8 encoding of the digits of the decimal representation ofs
- else if
k == 1
, the encoding is<base>e<sign><exp>
([<base>, 0x65, <sign>, <exp>]
), where<base>
is the utf-8 encoding of the single digit ofs
,<sign>
is the utf-8 string+
(0x2B
) ifn - 1
is positive or the utf-8 string-
(0x2D
) ifn - 1
is negative, and<exp>
is the utf-8 encoding of the decimal representation of the absolute value ofn - 1
- else, the encoding is the utf-8 string
<pre_point>.<post_point>e<sign><exp>
([<pre_point>, 0x2E, <post_point>, 0x65, <sign>, <exp>]
), where<pre_point>
is the utf-8 encoding of the most significant digit of the decimal representation ofs
,<post_point>
is the utf-8 encoding of the remainingk - 1
digits of the decimal representation ofs
,<sign>
is the utf-8 string+
(0x2B
) ifn - 1
is positive or the utf-8 string-
(0x2D
) ifn - 1
is negative, and<exp>
is the utf-8 encoding of the decimal representation of the absolute value ofn - 1
- let
- good to know: The maximum length for such an encoding is 25 bytes
Signing Encoding Arrays
Let n
be a natural number less than 2^32
, let v_0, ..., v_n
be values, and let e_0, ..., e_n
be functions that take a natural number as an argument and return the encodings of v_0, ..., v_n
respectively using the supplied number as the indentation level.
At indentation level indent
, the array [v_0, ..., v_n]
is encoded as follows:
- if the array is empty (
n == 0
), the encoding is the utf-8 string[]
([0x5B, 0x5D]
) - else, do the following:
- begin with the utf-8 string
[<line feed>
([0x5B, 0x0A]
) - for each
v_i
inv_0, ..., v_(n - 1)
(so skip this ifn == 1
):- append
indent + 2
many space characters (0x20
) - append
e_i(indent + 2)
- append the utf-8 string
,<line feed>
([0x2C, 0x0A]
)
- append
- append
indent + 2
many space characters (0x20
) - append
e_n(indent + 2)
- append the utf-8 string
<line feed>
(0x0A
) - append
indent
many space characters (0x20
) - append the utf-8 string
]
(0x5D
)
- begin with the utf-8 string
Signing Encoding Objects
Let n
be a natural number less than 2^32
, let s_0, ..., s_n
be pairwise distinct strings, let v_0, ..., v_n
be values, and let e_0, ..., e_n
be functions that take a natural number as an argument and return the encodings of v_0, ..., v_n
respectively using the supplied number as the indentation level.
At indentation level indent
, the object { s_0: v_0, ..., s_n: v_n}
is encoded as follows:
- if the object is empty (
n == 0
), the encoding is the utf-8 string{}
([0x7B, 0x7D]
) - else, do the following:
- begin with the utf-8 string
{<line feed>
([0x7B, 0x0A]
) - for each pair
(s_i, v_i)
fori
in0, ..., n - 1
(so skip this ifn == 1
):- append
indent + 2
many space characters (0x20
) - append the encoding of the string
s_i
- append the utf-8 string
:<space>
([0x3A, 0x20]
) - append
e_i(indent + 2)
- append the utf-8 string
,<line feed>
([0x2C, 0x0A]
)
- append
- append
indent + 2
many space characters (0x20
) - append the encoding of the string
s_i
- append the utf-8 string
:<space>
([0x3A, 0x20]
) - append
e_i(indent + 2)
- append the utf-8 string
<line feed>
(0x0A
) - append
indent
many space characters (0x20
) - append the utf-8 string
}
(0x7D
)
- begin with the utf-8 string
- The order in which to serialize the entries
s_i: v_i
is not fully specified, but there are some constraints:- intuitively: Natural numbers less than 2^32 - 1 are sorted ascendingly
- formally:
- a key is called a
more-or-less integer key
if it is either the string"0"
, or a nonzero decimal digit (1 - 9 (0x31
-0x39
)) followed by zero or more arbitrary decimal digits (0 - 9 (0x30
-0x39
)) and consists solely of such digits - a key is called an
int key
if it:- is a
more-or-less integer key
- its numeric value is strictly less than
2^32 - 1
- no, there's no off-by one error here:
0xfffffffe
is anint key
, whereas0xffffffff
is not - Why? We have no idea either
- no, there's no off-by one error here:
- is a
- all entries with
int key
s must be serialized before all other entries - among themselves, the
int key
s are sorted:- by length first (ascending), using
- numeric value as a tie breaker (the key whose raw bytes interpreted as a natural number are larger is serialized later)
- note that this coincides with sorting the decimally encoded numbers by numeric value
- a key is called a
- all other entries may be serialized in an arbitrary order
The string handling is equivalent to ECMAScript 2015 QuoteJSONString, but defined over utf-8 strings instead of utf-16 ones.
The float handling is equivalent to (and quotes from) ECMAScript 2015 ToString Applied to the Number Type, with step 5 replaced as specified in NOTE 2 to result in unique encodings. This spec provides a declarative description of the encoding process, for an algorithmic perspective, there are some papers on the subject such as:
- Steele Jr, Guy L., and Jon L. White. "How to print floating-point numbers accurately." ACM SIGPLAN Notices. Vol. 25. No. 6. ACM, 1990.
- Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers." ACM Sigplan Notices. Vol. 45. No. 6. ACM, 2010.
- Andrysco, Marc, Ranjit Jhala, and Sorin Lerner. "Printing floating-point numbers: a faster, always correct method." ACM SIGPLAN Notices. Vol. 51. No. 1. ACM, 2016.
- Adams, Ulf. "Ryū: fast float-to-string conversion." Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 2018.
The array and object handling is equivalent to JSON.stringify(value, null, 2)
, specified in ECMAScript 2015 (except for the object entry ordering, which is not specified in ECMAScript, but implemented this way in v8 and spidermonkey).
Hash Computation
To compute the hash of a value, you can not use the signing encoding directly, but the hash computation is based on it. The signing encoding always results in valid unicode. Represent this unicode in utf-16. This encoding is a sequence of code units, each consisting of two bytes. The data to hash is obtained from these code units by only keeping the less significant byte.
Example: Suppose you want to compute the hash for "ß"
, the corresponding utf8 is [0x22, 0xC3, 0x9F, 0x22]
. In big-endian utf16, this is [(0x22, 0x00), (0xDF, 0x00), (0x22, 0x00)]
, in little-endian utf16, this is [(0x00, 0x22), (0x00, 0xDF), (0x00, 0x22)]
. In both cases, the sequence of less signifiant bytes per code unit is [0x22, 0xDF, 0x22]
. That is the byte array over which to compute the hash.
Note that this means that two strings with different utf-8 encodings can result in the same hash, due to the information in the more significant byte of the utf-16 encoding being dropped.
Length Computation
Ssb places a limit on the size of messages. To compute the length of a value, compute the signing encoding (which is always valid unicode), reencode that unicode as utf16, then count the number of code units.
JSON Transport Encoding
In addition to the signing format, messages can be encoded as ECMA-404 json, with the following differences:
- numbers may not be negative zero
- numbers may not round to positive infinity, negative infinity or negative zero IEEE 754 64 bit floating point numbers
- strings may not be longer than
2^53 - 1
bytes - arrays and object may not contain more than
2^32 - 1
entries - objects may not contain multiple entries with the same key
- in strings, unicode escape sequences of code points greater than
U+FFFF
must be interpreted as a single code point, not as an explicit surrogate pair - escape sequences of surrogate code points must be matched: Each escape sequence for a high surrogate must be followed by an escape sequence for a low surrogate, and any escape sequence for a low surrogate must be preceded by an escape sequence for a high surrogate
The signing format itself is a subset of this, but this format can be more compact (by omitting all whitespace). This compact form is used by ssb server implementations for message exchange with other servers.
Feeds and Messages
Ssb is designed around a data model optimized for simple data replication and strong cryptographic integrity guarantees in a social setting. The central entities in this model are feeds and messages. Messages are the pieces of data inserted into the system, feeds describe who authored a piece of data.
Messages are carry schemaless, free-form pieces of data, together with some metadata necessary for message replication and verification. Each message belongs to exactly one feed, and each message contains a backlink to the previous message posted to that feed. Feeds thus form linked lists. Links are implemented via cryptographically secure hashes, in that sense feeds behave mostly like blockchains. But whereas blockchains are traditionally used to create a global, single source of truth, ssb chooses a different approach.
Instead of one single, global linked list for all data, each ssb user has their own linked list. To ensure that no malicious actor can append data to other user's list, all messages are signed. We call this data structure - a signed hash-based linked list - a sigchain.
The sigchain-per-user design has some very desirable properties:
- data can be moved across untrusted machines, you still know that some piece of data has indeed be posted by a certain identity
- immutability of data, nothing gets lost
- the order between messages can not be confused
- replicating data becomes simple: just exchange the number of messages you know about for a certain feed, and your peer can send you everything that is newer
Of course, this design also has some drawbacks:
- the simple replication scheme does not allow subscription to only parts of the data - it's all or nothing
- immutability and non-repudiation are not appropriate for all use-cases
- a single identity can not append data from multiple machines in parallel, as that would result in a tree rather than a linked list
In some sense, ssb can be seen as an experiment whether the advantages of a sigchain-based distributed database outweigh the disadvantages. So far, it appears to be working sufficiently well.
The remainder of this chapter describes the exact format of the metadata ssb maintains to build the sigchain. Conceptually, to form the sigchain, the metadata of each message must include:
- the feed the message belongs to
- the hash of the previous message from the same feed, or
null
- the free-form content of the message
- a signature to prove that the author knew the private key of the feed
The actual metadata formats also need to include some extra information.
Metadata
The abstract model for metadata is a tuple containing the following entries:
previous
: either a multihash, or the distinct valuenull
author
: a multikeysequence
: an unsigned 53 bit integertimestamp
: an IEEE 754 64 bit float except the infinities, negative zero, andNaN
scontent
: a data value that is either:- an object containing an entry
"type"
, whose value is a string that takes between 3 and 53 (inclusive) code units when encoded as utf16 - a multibox
- an object containing an entry
swapped
: a boolean indicating how to encode the metadatasignature
: A signature of the message, generated by the cryptographic primitive of theauthor
- this can only be computed once all other metadata is known
Json Encoding
Metadata can be encoded as json, just like regular data. The json encoding is a json object containing the entries listed above, with the following additional regulations:
- the
previous
multihash must use the message hash encoding as a string - the
author
multikey must use the multikey encoding as a string - the
sequence
is serialized as a float, since that's the only number type available- it must not be negative
- it must not contain a decimal point
- the
signature
value is a string whose content is the concatenation of:- the canonic base64 encoding of the message's signature itself (see next section)
- ietf rfc 4648, section 4, disallowing superflous
=
characters inside the data or after the necessary padding=
s
- ietf rfc 4648, section 4, disallowing superflous
- the characters
.sig.
([0x2E, 0x73, 0x69, 0x67, 0x2E]
) - a primitive-specific suffix, depending of the primitive of the
author
- for ed25519, this is
ed25519
([0x65, 0x64, 0x32, 0x35, 0x35, 0x31, 0x39]
)
- for ed25519, this is
- the canonic base64 encoding of the message's signature itself (see next section)
- the
swapped
boolean is omitted - an entry
"hash": "sha256"
is added - if
swapped
, the order of entries must beprevious, sequence, author, timestamp, hash, content, signature
, else it must beprevious, author, sequence, timestamp, hash, content, signature
- if
content
is a multibox, it must use the multibox encoding as a string
When creating new messages, the (purely logical) value of swapped
should be false
. Or you might set it to true
, or generate it randomly - nobody can stop you, and everything will still work.
Messages
To this json encoding corresponds exactly one data value with a fixed order of object entries. This value is referred to as a message.
Computing a Message's Signature
To obtain the signature of a message, first compute the json encoding specified above, but without the "signature"
entry. Then compute the signing encoding of the corresponding data value, using the entry order of the json as a tie-breaker for the entry-order of the signing encoding. Finally use the signing primitive of the message's author
on the signing encoding, to obtain the signature.
Computing the Hash of a Message
To compute the hash of a message, use the value hash computation, with the signing encoding where the object entry order that would produce the correct signature.
Computing the Size of a Message
To compute the size of a message, use the value length computation, with the signing encoding where the object entry order that would produce the correct signature.
Message Validation
A message is considered valid if and only if all of the following conditions are met:
- its json encoding is a possible output of the message creation algorithm
- in particular, the claimed signature must match the data and public key
- its length is smaller than
16385
- if
previous
isnull
, the sequence number of the message must be the float1
, else it must be one larger than that of the message whose hash is the value ofprevious