TODO (mw) re-write this.

This document is the specification of the secure-scuttlebutt protocol (ssb). The primary audience are developers who need a thorough understanding of protocol internals, for example because they want to implement it themselves. The specificaiton is not intended for developers who want to build applications on top of ssb. Ssb provides nice abstractions to those developer, hiding the nitty-gritty details. This spec however is all about those details.

It is possible to read through the spec from end to end, but it's not an easy read. It is structured into different sections, each dealing with a different aspect of the protocol. The sections begin with a high-level overview of the problem space, and then delve into the details. In the description of those details, the highest importance is given to unambiguity. If you can not tell exactly how something works, it is a bug in the spec.

A few datatypes appear throughout this spec, it makes sense to introduce them in one single place.

An overarching theme of various parts of the ssb protocol(s) is that of future-proofness. The protocol will need to adapt itself to new circumstances, such as new transport channels or broken cryptographic primitives. A simple way to keep flexibility are self-describing multiformats, where the data is prefixed by an identifier (and in some cases its expected length). New data types can be introduced by simply assigning a new identifier. Older software can detect data types it doesn't understand yet, and react accordingly.

Each format consists of some logical data type, and then one or more encodings. These encodings can serve different purposes, for example they might be optimized for machine-readability, human-readability, uniqueness, backwards-compatibility, etc.

A multikey is the public key of some digital signature scheme, annotated with an identifier for the scheme itself. The only currently supported cryptographic primitive is ed25519 (which has a public key length of 32 bytes).

The encoding of a multikey is defined as the concatenation of:

the character @ (0x40)
the canonic base64 encoding of the key itself
- ietf rfc 4648, section 4, disallowing superflous = characters inside the data or after the necessary padding =s
- it may not omit = characters either - the amount of encoding bytes must always be a multiple of four
the character . (0x2E)
the primitive-specific suffix
- for ed25519, this is ed25519 ([0x65, 0x64, 0x32, 0x35, 0x35, 0x31, 0x39])

Typically, this encoding is stored in a json string.

A multifeed represents a scuttlebutt feed. It consists of a feed kind indicator and some additional data depending on the feed kind. The only currently supported feed kind is multikey, whose additional data surprisingly enough is a multikey.

The encoding of a multikey multifeed is the same as the encoding of the multikey. The encoding of possible future multifeed kinds will use a different first char than @.

A multihash is a pair of:

the hash target
the hash digest of some cryptographically secure hash function, annotated with an identifier for the hash function itself.

The only currently supported hash targets are Message and Blob

The only currently supported cryptographic primitive is sha256 (which has a digest length of 32 bytes).

The encoding of a multihash is defined as the concatenation of:

depending on the hash target:
- Message: the character % (0x25)
- Blob: the character & (0x26)
the canonic base64 encoding of the digest itself
- ietf rfc 4648, section 4, disallowing superflous = characters inside the data or after the necessary padding =s
- it may not omit = characters either - the amount of encoding bytes must always be a multiple of four
the character . (0x2E)
the primitive-specific suffix
- for sha256, this is sha256 ([0x73, 0x68, 0x61, 0x32, 0x35, 0x36])

Typically, this encoding is stored in a json string.

A multibox is a cyphertext, annotated with an identifier for the algorithm that produced it. The algorithm identifiers are natural numbers between 0 and 2^64 - 1 (inclusive). Even identifiers are reserved for assignment by the ssb protocol devs, odd identifiers are open for experimentation and/or custom usage.

The only currently specified algorithm is private-box, using the identifier 0.

The encoding of a multibox is defined as the concatenation of:

the canonic base64 encoding of the cyphertext
- ietf rfc 4648, section 4, disallowing superflous =
- it may not omit = characters either - the amount of encoding bytes must always be a multiple of four
the characters .box ([0x2E, 0x62, 0x6F, 0x78])
the uppercase base32 encoding without padding of the identifier, without leading zeros, using the following table (the canonic subset of Crockford's base32):

Value	Symbol
0	0
1	1
2	2
3	3
4	4
5	5
6	6
7	7
8	8
9	9
10	A
11	B
12	C
13	D
14	E
15	F
16	G
17	H
18	J
19	K
20	M
21	N
22	P
23	Q
24	R
25	S
26	T
27	V
28	W
29	X
30	Y
31	Z

Due to omitting leading zeros, the suffix for private box (identifier 0) is simply "box".

For large identifiers (between 2^60 and 2^64 - 1 inclusive), 13 characters are needed, and the most-significant bit of the leftmost character does not contribute to the decoded value. Identifiers must be encoded such that this ignored bit is set to zero. Put another way: If the identifier encoding takes up 13 characters, it must begin with one of (1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F).

Ssb accepts schemaless data, adhering to a certain format specified in this section. This is the data that application developers care about. Internally, ssb then adds some more metadata to maintain the append-only logs, discussed in the next chapter. This chapter only describes the data model of the schemaless content data.

It is important to clearly distinguish between the abstract data model of a format, and various encodings. For example, the strings 1, 1.0 and 0.1e1 are all different json encodings of the same abstract value. And there can even be different encoding formats for the same abstract value, e.g. the bytes 0xf93c00 also encode the floating point number 1.0, but in cbor rather than json.

Ssb messages all use the same abstract data model, but there are different situations where they use different encodings. An encoding used for signing the data has different requirements than an encoding used for transmitting the data, which again has different requirements than an encoding used for persisten storage.

Encodings for persistent storage are not specified in this document, but is crucial for all ssb implementations to compute the exact same signatures, and to send data in a format that can be understood by other implementations. We call these encodings signing encoding and transport encoding respectively.

The ssb protocol was initially implemented in javascript, and it relied heavily on implicit behavior of the node js runtime. People (including the authors of this specification) have called the message format "bizarre" and worse, and that is entirely justified. But when during reading you inevitably thing "How could anybody end up with this, I could have done a much better job.", then remember: Maybe you could have, but you didn't.

Abstract Data Model

The data model describes all of the free-form data that can be carried in a message. It is close to the json data model, but with a few differences. The definition came about as the set of javascript values created by JSON.parse(json) for which json === JSON.stringify(JSON.parse(json)) (javascript code).

Defined in a language-agnostic way:

null is a value that carries no information.

true and false are values, called booleans.

An ordered sequence of bytes which form valid utf8 of length between 0 and 2^53 - 1 (inclusive) is a value, called a string. Such a string may include null bytes.

An IEEE 754 double precision (64 bit) floating point number that is none of Infinity, -Infinity, -0 or NaN is a value, called a float.

Let n be a natural number less than 2^32, and let v_0, ..., v_n be values.

The ordered sequence [v_0, ..., v_n] is a value, called an array.

Let n be a natural number less than 2^32, let s_0, ..., s_n be pairwise distinct strings, and let v_0, ..., v_n be values.

The (unordered) set of pairs (s_i, v_i) for all 0 <= i <= n is a value, called a object. The pairs are called entries, the strings are called keys, and the values are called values.

The encoding to turn values into a signeable array of bytes is based on json (the set of valid encodings is a subset of ECMA-404 json). There are multiple valid encodings of a single value, because some of the entries of an objects can be encoded in an arbitary order. But up to object entry order, the encoding is unique. When receiving a message over the network, the order of the freely-orderable object entries in the transport encoding is the order that must be used for verifying the signature. Thus the network encoding induces a unique signing encoding to use for signature checking.

The signing encoding is defined as follows:

null is encoded as the utf-8 string null ([0x6E, 0x75, 0x6C, 0x6C]).

true is encoded as the utf-8 string true ([0x74, 0x72, 0x75, 0x65]).
false is encoded as the utf-8 string false ([0x66, 0x61, 0x6c, 0x73, 0x65]).

A string containing the unicode code points c_0, ..., c_n is is encoded as follows:

begin with the utf-8 string " (0x22)
for each code point c_i in c_0, ..., c_n:
- if c_i is unicode code point 0x000022 (quotation mark "), append the utf-8 string \" ([0x5C, 0x22])
- else if c_i is unicode code point 0x00005C (reverse solidus \), append the utf-8 string \\ ([0x5C, 0x5C])
- else if c_i is unicode code point 0x000008 (backspace), append the utf-8 string \b ([0x5C, 0x62])
- else if c_i is unicode code point 0x00000C (form feed), append the utf-8 string \f ([0x5C, 0x66])
- else if c_i is unicode code point 0x00000A (line feed), append the utf-8 string \n ([0x5C, 0x6E])
- else if c_i is unicode code point 0x00000D (carriage return), append the utf-8 string \r ([0x5C, 0x72])
- else if c_i is unicode code point 0x000009 (line tabulation), append the utf-8 string \t ([0x5C, 0x74])
- else if c_i is a unicode code point below 0x000020 (space), append the utf-8 string \u<hex> ([0x5C, 0x75, <hex>]), where <hex> are the two utf-8 bytes of the hexadecimal encoding of the code point, using lower-case letters a - f (0x61 - 0x66) for alphabetic hex digits
- else append the utf-8 representation of c_i without any modifications
append the utf-8 string " (0x22)

A float m is encoded as follows:

if m == 0, the encoding is the utf-8 string 0 (0x30)
else if m is negative, the encoding is the utf-8 string -<abs(m)>([0x2d, <abs(m)>]), where <abs(m)> is the encoding of the same float with the sign bit flipped
else (largely quoting from the ECMAScript specification, applying NOTE 2 from here on):
- let n, k and s be integers such that:
  - k >= 1
  - 10 ^ (k - 1) <= s <= 10 ^ k
  - s * (10 ^ (n - k)) is m (or round-to-even-s to m if it is not precisely representable in a 64 bit float)
  - k is as small as possible
  - if there are multiple values for s, choose the one for which s * (10 ^ (n - k)) is closest in value to m
  - if there are two such possible values of s, choose the one that is even
  - Intuitively, s is the integer you get by removing the point and all trailing zeros from the decimal representation of m, k is the number of digits in the decimal representation of s, and n specifies how to print the number: If n is greater than 0, there are n digits left of the point, else there are abs(n) many zeros right of the point. The choice of s uniquely determines k and n, the tricky part is finding the unique s that rounds correctly and for which k is minimal.
- if k <= n <= 21, the encoding is the utf-8 string <k_decimals><trailing_zeros>, where <k_decimals> is the utf-8 encoding of the digits of the decimal representation of s, and <trailing_zeros are n - k zero digits (0x30)
- else if 0 <= n <= 21, the encoding is the utf-8 string <pre_point>.<post_point> ([<pre_point>, 0x2E, <post_point>]), where <pre_point> is the utf-8 encoding of the most significant n digits of the decimal representation of s, and <post_point> is the utf-8 encoding of the remaining k - n digits of the decimal representation of s
- else if -6 < n <= 0, the encoding is the utf-8 string 0.<zeros><k_decimals> ([0x30, 0x2E, <zeros>, <k_decimals>]), where <zeros> are -n many zero digits (0x30), and <k_decimals> is the utf-8 encoding of the digits of the decimal representation of s
- else if k == 1, the encoding is <base>e<sign><exp> ([<base>, 0x65, <sign>, <exp>]), where <base> is the utf-8 encoding of the single digit of s, <sign> is the utf-8 string + (0x2B) if n - 1 is positive or the utf-8 string - (0x2D) if n - 1 is negative, and <exp> is the utf-8 encoding of the decimal representation of the absolute value of n - 1
- else, the encoding is the utf-8 string <pre_point>.<post_point>e<sign><exp> ([<pre_point>, 0x2E, <post_point>, 0x65, <sign>, <exp>]), where <pre_point> is the utf-8 encoding of the most significant digit of the decimal representation of s, <post_point> is the utf-8 encoding of the remaining k - 1 digits of the decimal representation of s, <sign> is the utf-8 string + (0x2B) if n - 1 is positive or the utf-8 string - (0x2D) if n - 1 is negative, and <exp> is the utf-8 encoding of the decimal representation of the absolute value of n - 1
good to know: The maximum length for such an encoding is 25 bytes

Let n be a natural number less than 2^32, let v_0, ..., v_n be values, and let e_0, ..., e_n be functions that take a natural number as an argument and return the encodings of v_0, ..., v_n respectively using the supplied number as the indentation level.

At indentation level indent, the array [v_0, ..., v_n] is encoded as follows:

if the array is empty (n == 0), the encoding is the utf-8 string [] ([0x5B, 0x5D])
else, do the following:
- begin with the utf-8 string [<line feed> ([0x5B, 0x0A])
- for each v_i in v_0, ..., v_(n - 1) (so skip this if n == 1):
  - append indent + 2 many space characters (0x20)
  - append e_i(indent + 2)
  - append the utf-8 string ,<line feed> ([0x2C, 0x0A])
- append indent + 2 many space characters (0x20)
- append e_n(indent + 2)
- append the utf-8 string <line feed> (0x0A)
- append indent many space characters (0x20)
- append the utf-8 string ] (0x5D)

Let n be a natural number less than 2^32, let s_0, ..., s_n be pairwise distinct strings, let v_0, ..., v_n be values, and let e_0, ..., e_n be functions that take a natural number as an argument and return the encodings of v_0, ..., v_n respectively using the supplied number as the indentation level.

At indentation level indent, the object { s_0: v_0, ..., s_n: v_n} is encoded as follows:

if the object is empty (n == 0), the encoding is the utf-8 string {} ([0x7B, 0x7D])
else, do the following:
- begin with the utf-8 string {<line feed> ([0x7B, 0x0A])
- for each pair (s_i, v_i) for i in 0, ..., n - 1 (so skip this if n == 1):
  - append indent + 2 many space characters (0x20)
  - append the encoding of the string s_i
  - append the utf-8 string :<space> ([0x3A, 0x20])
  - append e_i(indent + 2)
  - append the utf-8 string ,<line feed> ([0x2C, 0x0A])
- append indent + 2 many space characters (0x20)
- append the encoding of the string s_i
- append the utf-8 string :<space> ([0x3A, 0x20])
- append e_i(indent + 2)
- append the utf-8 string <line feed> (0x0A)
- append indent many space characters (0x20)
- append the utf-8 string } (0x7D)
The order in which to serialize the entries s_i: v_i is not fully specified, but there are some constraints:
- intuitively: Natural numbers less than 2^32 - 1 are sorted ascendingly
- formally:
  - a key is called a more-or-less integer key if it is either the string "0", or a nonzero decimal digit (1 - 9 (0x31 - 0x39)) followed by zero or more arbitrary decimal digits (0 - 9 (0x30 - 0x39)) and consists solely of such digits
  - a key is called an int key if it:
    - is a more-or-less integer key
    - its numeric value is strictly less than 2^32 - 1
      - no, there's no off-by one error here: 0xfffffffe is an int key, whereas 0xffffffff is not
      - Why? We have no idea either
  - all entries with int keys must be serialized before all other entries
  - among themselves, the int keys are sorted:
    - by length first (ascending), using
    - numeric value as a tie breaker (the key whose raw bytes interpreted as a natural number are larger is serialized later)
      - note that this coincides with sorting the decimally encoded numbers by numeric value
- all other entries may be serialized in an arbitrary order

The string handling is equivalent to ECMAScript 2015 QuoteJSONString, but defined over utf-8 strings instead of utf-16 ones.

The float handling is equivalent to (and quotes from) ECMAScript 2015 ToString Applied to the Number Type, with step 5 replaced as specified in NOTE 2 to result in unique encodings. This spec provides a declarative description of the encoding process, for an algorithmic perspective, there are some papers on the subject such as:

The array and object handling is equivalent to JSON.stringify(value, null, 2), specified in ECMAScript 2015 (except for the object entry ordering, which is not specified in ECMAScript, but implemented this way in v8 and spidermonkey).

To compute the hash of a value, you can not use the signing encoding directly, but the hash computation is based on it. The signing encoding always results in valid unicode. Represent this unicode in utf-16. This encoding is a sequence of code units, each consisting of two bytes. The data to hash is obtained from these code units by only keeping the less significant byte.

Example: Suppose you want to compute the hash for "ß", the corresponding utf8 is [0x22, 0xC3, 0x9F, 0x22]. In big-endian utf16, this is [(0x22, 0x00), (0xDF, 0x00), (0x22, 0x00)], in little-endian utf16, this is [(0x00, 0x22), (0x00, 0xDF), (0x00, 0x22)]. In both cases, the sequence of less signifiant bytes per code unit is [0x22, 0xDF, 0x22]. That is the byte array over which to compute the hash.

Note that this means that two strings with different utf-8 encodings can result in the same hash, due to the information in the more significant byte of the utf-16 encoding being dropped.

Ssb places a limit on the size of messages. To compute the length of a value, compute the signing encoding (which is always valid unicode), reencode that unicode as utf16, then count the number of code units.

In addition to the signing format, messages can be encoded as ECMA-404 json, with the following differences:

numbers may not be negative zero
numbers may not round to positive infinity, negative infinity or negative zero IEEE 754 64 bit floating point numbers
strings may not be longer than 2^53 - 1 bytes
arrays and object may not contain more than 2^32 - 1 entries
objects may not contain multiple entries with the same key
in strings, unicode escape sequences of code points greater than U+FFFF must be interpreted as a single code point, not as an explicit surrogate pair
escape sequences of surrogate code points must be matched: Each escape sequence for a high surrogate must be followed by an escape sequence for a low surrogate, and any escape sequence for a low surrogate must be preceded by an escape sequence for a high surrogate

The signing format itself is a subset of this, but this format can be more compact (by omitting all whitespace). This compact form is used by ssb server implementations for message exchange with other servers.

Feeds and Messages

Ssb is designed around a data model optimized for simple data replication and strong cryptographic integrity guarantees in a social setting. The central entities in this model are feeds and messages. Messages are the pieces of data inserted into the system, feeds describe who authored a piece of data.

Messages are carry schemaless, free-form pieces of data, together with some metadata necessary for message replication and verification. Each message belongs to exactly one feed, and each message contains a backlink to the previous message posted to that feed. Feeds thus form linked lists. Links are implemented via cryptographically secure hashes, in that sense feeds behave mostly like blockchains. But whereas blockchains are traditionally used to create a global, single source of truth, ssb chooses a different approach.

Instead of one single, global linked list for all data, each ssb user has their own linked list. To ensure that no malicious actor can append data to other user's list, all messages are signed. We call this data structure - a signed hash-based linked list - a sigchain.

The sigchain-per-user design has some very desirable properties:

data can be moved across untrusted machines, you still know that some piece of data has indeed be posted by a certain identity
immutability of data, nothing gets lost
the order between messages can not be confused
replicating data becomes simple: just exchange the number of messages you know about for a certain feed, and your peer can send you everything that is newer

Of course, this design also has some drawbacks:

the simple replication scheme does not allow subscription to only parts of the data - it's all or nothing
immutability and non-repudiation are not appropriate for all use-cases
a single identity can not append data from multiple machines in parallel, as that would result in a tree rather than a linked list

In some sense, ssb can be seen as an experiment whether the advantages of a sigchain-based distributed database outweigh the disadvantages. So far, it appears to be working sufficiently well.

The remainder of this chapter describes the exact format of the metadata ssb maintains to build the sigchain. Conceptually, to form the sigchain, the metadata of each message must include:

the feed the message belongs to
the hash of the previous message from the same feed, or null
the free-form content of the message
a signature to prove that the author knew the private key of the feed

The actual metadata formats also need to include some extra information.

The abstract model for metadata is a tuple containing the following entries:

previous: either a multihash, or the distinct value null
author: a multikey
sequence: an unsigned 53 bit integer
timestamp: an IEEE 754 64 bit float except the infinities, negative zero, and NaNs
content: a data value that is either:
- an object containing an entry "type", whose value is a string that takes between 3 and 53 (inclusive) code units when encoded as utf16
- a multibox
swapped: a boolean indicating how to encode the metadata
signature: A signature of the message, generated by the cryptographic primitive of the author
- this can only be computed once all other metadata is known

Metadata can be encoded as json, just like regular data. The json encoding is a json object containing the entries listed above, with the following additional regulations:

the previous multihash must use the message hash encoding as a string
the author multikey must use the multikey encoding as a string
the sequence is serialized as a float, since that's the only number type available
- it must not be negative
- it must not contain a decimal point
the signature value is a string whose content is the concatenation of:
- the canonic base64 encoding of the message's signature itself (see next section)
  - ietf rfc 4648, section 4, disallowing superflous = characters inside the data or after the necessary padding =s
- the characters .sig. ([0x2E, 0x73, 0x69, 0x67, 0x2E])
- a primitive-specific suffix, depending of the primitive of the author
  - for ed25519, this is ed25519 ([0x65, 0x64, 0x32, 0x35, 0x35, 0x31, 0x39])
the swapped boolean is omitted
an entry "hash": "sha256" is added
if swapped, the order of entries must be previous, sequence, author, timestamp, hash, content, signature, else it must be previous, author, sequence, timestamp, hash, content, signature
if content is a multibox, it must use the multibox encoding as a string

When creating new messages, the (purely logical) value of swapped should be false. Or you might set it to true, or generate it randomly - nobody can stop you, and everything will still work.

Messages

To this json encoding corresponds exactly one data value with a fixed order of object entries. This value is referred to as a message.

Computing a Message's Signature

To obtain the signature of a message, first compute the json encoding specified above, but without the "signature" entry. Then compute the signing encoding of the corresponding data value, using the entry order of the json as a tie-breaker for the entry-order of the signing encoding. Finally use the signing primitive of the message's author on the signing encoding, to obtain the signature.

its json encoding is a possible output of the message creation algorithm
- in particular, the claimed signature must match the data and public key
its length is smaller than 16385
if previous is null, the sequence number of the message must be the float 1, else it must be one larger than that of the message whose hash is the value of previous

SSB Specification

Introduction

Feed spec

Common Datatypes

Multikey

Multikey Encoding

Multifeed

Multifeed Encoding

Multihash

Multihash Encoding

Multibox

Multibox Encoding

Datamodel

Abstract Data Model

Null

Booleans

Strings

Floats

Arrays

Objects

Signing Encoding

Signing Encoding Null

Signing Encoding Booleans

Signing Encoding Strings

Signing Encoding Floats

Signing Encoding Arrays

Signing Encoding Objects

Hash Computation

Length Computation

JSON Transport Encoding

Feeds and Messages

Metadata

Json Encoding

Messages

Computing a Message's Signature

Computing the Hash of a Message

Computing the Size of a Message

Message Validation