TCOBSv2 Specification

March 2, 2026

1. Preface

  • To understand the encoding principle, the number systems used are explained first.

(back to top)

2. Ternary and Quaternary Numbers

  • Common numbers:
    • decimal numbers with 10 digits, 0123456789, like 0d109 = 1 * $10^{2}$ + 0 * $10^{1}$ + 9 * $10^{0}$ = 109
    • hexadecimal numbers with 16 ciphers 0123456789abcdef, like 0xc0de = 12 * $16^{3}$ + 0 * $16^{2}$ + 13 * $16^{1}$ + 14 * $16^{0}$ = 49374
    • binary numbers with 2 ciphers 01, like 0b101 = 1 * $2^{2}$ + 0 * $2^{1}$ + 1 * $2^{0}$ = 5
    • octal numbers with 8 ciphers 01234567, like 0o77 = 7 * $8^{1}$ + 7 * $8^{0}$ = 63
  • Not so common numbers:
    • Ternary numbers with 3 digits 012, like 0t201 = 2 * $3^{2}$ + 0 * $3^{1}$ + 1 * $3^{0}$ = 19
    • Quaternary numbers with 4 digits 0123, like 0q123 = 1 * $4^{2}$ + 2 * $4^{1}$ + 3 * $4^{0}$ = 27
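The positional rule behind all of these notations can be checked with a short sketch. The function name `positional_value` is an illustrative choice and not part of TCOBS; the sketch assumes ciphers are written most significant first, as in the examples above.

```python
def positional_value(ciphers: str, base: int) -> int:
    """Evaluate a cipher string in the given base, most significant cipher first."""
    alphabet = "0123456789abcdef"[:base]  # the ciphers allowed in this base
    value = 0
    for cipher in ciphers:
        # Shift the accumulated value one position left, then add the new cipher.
        value = value * base + alphabet.index(cipher)
    return value


print(positional_value("201", 3))    # 0t201  -> 19
print(positional_value("c0de", 16))  # 0xc0de -> 49374
print(positional_value("33", 4))     # 0q33   -> 15
```

A cipher outside the alphabet of the base raises `ValueError`, so `positional_value("3", 3)` is rejected.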

(back to top)

3. Cipher Counted Notation

For TCOBS encoding, ternary and quaternary numbers are used in a way that also counts the number of ciphers. For example, the cipher sequence 022 is not equal to 22.

3.1. Cipher Counted Ternary Notation (CCTN)

  • Ternary notation uses prefix 0t.
  • CCTN notation uses prefix 0T.
  • Because values 0 and 1 are never needed in TCOBS here, CCTN numbers start at 2.

3.1.1. One CCTN Cipher

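2 = 1 + $3^{0}$

| index | decimal | CCTN | remark |
| --- | --- | --- | --- |
|  | 0 |  | impossible |
|  | 1 |  | impossible |
| 0 | 2 | 0T0 | exactly 1 cipher allowed |
| 1 | 3 | 0T1 | exactly 1 cipher allowed |
| 2 | 4 | 0T2 | exactly 1 cipher allowed |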

3.1.2. Two CCTN Ciphers

5 = 1 + $3^{0}$ + $3^{1}$

| index | decimal | CCTN | remark |
| --- | --- | --- | --- |
| 0 | 5 | 0T00 | exactly 2 ciphers allowed |
| ... | ... | ... | ... |
| 8 | 13 | 0T22 | exactly 2 ciphers allowed |

3.1.3. Three CCTN Ciphers

14 = 1 + $3^{0}$ + $3^{1}$ + $3^{2}$

| index | decimal | CCTN | remark |
| --- | --- | --- | --- |
| 0 | 14 | 0T000 | exactly 3 ciphers allowed |
| 1 | 15 | 0T001 | exactly 3 ciphers allowed |
| 2 | 16 | 0T002 | exactly 3 ciphers allowed |
| 3 | 17 | 0T010 | exactly 3 ciphers allowed |
| 4 | 18 | 0T011 | exactly 3 ciphers allowed |
| 5 | 19 | 0T012 | exactly 3 ciphers allowed |
| 6 | 20 | 0T020 | exactly 3 ciphers allowed |
| 7 | 21 | 0T021 | exactly 3 ciphers allowed |
| 8 | 22 | 0T022 | exactly 3 ciphers allowed |
| ... | ... | ... | ... |
| 26 | 40 | 0T222 | exactly 3 ciphers allowed |

3.1.4. Four CCTN Ciphers

41 = 1 + $3^{0}$ + $3^{1}$ + $3^{2}$ + $3^{3}$

| index | decimal | CCTN | remark |
| --- | --- | --- | --- |
| 0 | 41 | 0T0000 | exactly 4 ciphers allowed |
| ... | ... | ... | ... |
| 80 | 121 | 0T2222 | exactly 4 ciphers allowed |

3.1.5. Many CCTN Ciphers

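| Cipher Count | generic start | start | index range | value range |
| --- | --- | --- | --- | --- |
| 1 | 1 + $3^{0}$ | 2 | 0-2 | 2-4 |
| 2 | 1 + $3^{0}$ + $3^{1}$ | 5 | 0-8 | 5-13 |
| 3 | 1 + $3^{0}$ + $3^{1}$ + $3^{2}$ | 14 | 0-26 | 14-40 |
| 4 | 1 + $3^{0}$ + $3^{1}$ + $3^{2}$ + $3^{3}$ | 41 | 0-80 | 41-121 |
| 5 | 1 + $3^{0}$ + $3^{1}$ + $3^{2}$ + $3^{3}$ + $3^{4}$ | 122 | 0-242 | 122-364 |
| ... | ... | ... | ... | ... |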
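The CCTN tables can be reproduced with a minimal sketch. The helper names `cctn_start`, `to_cctn`, and `from_cctn` are illustrative and not taken from any TCOBS implementation; the sketch assumes only the start values and index ranges listed above.

```python
def cctn_start(n: int) -> int:
    """Smallest value written with n CCTN ciphers: 1 + 3^0 + ... + 3^(n-1)."""
    return 1 + sum(3**k for k in range(n))


def to_cctn(value: int) -> str:
    """Return the CCTN cipher string (without the 0T prefix) for value >= 2."""
    if value < 2:
        raise ValueError("CCTN numbers start at 2")
    n = 1
    # Find the cipher count whose value range contains the value.
    while value >= cctn_start(n + 1):
        n += 1
    index = value - cctn_start(n)
    ciphers = ""
    for _ in range(n):
        # Peel off the least significant ternary cipher of the index.
        index, cipher = divmod(index, 3)
        ciphers = str(cipher) + ciphers
    return ciphers


def from_cctn(ciphers: str) -> int:
    """Inverse of to_cctn: start value for this cipher count plus the ternary index."""
    return cctn_start(len(ciphers)) + int(ciphers, 3)


print(to_cctn(22))      # -> 022
print(from_cctn("21"))  # -> 12
```

The leading zero in `022` is significant: it is what selects the three-cipher range starting at 14.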

3.2. Cipher Counted Quaternary Notation (CCQN)

  • Quaternary notation uses prefix 0q.
  • CCQN notation uses prefix 0Q.
  • Because value 0 is never needed here, CCQN numbers start at 1.

3.2.1. One CCQN Cipher

1 = $4^{0}$

| index | decimal | CCQN | remark |
| --- | --- | --- | --- |
|  | 0 |  | impossible |
| 0 | 1 | 0Q0 | exactly 1 cipher allowed |
| ... | ... | ... | ... |
| 3 | 4 | 0Q3 | exactly 1 cipher allowed |

3.2.2. Two CCQN Ciphers

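5 = $4^{0}$ + $4^{1}$

| index | decimal | CCQN | remark |
| --- | --- | --- | --- |
| 0 | 5 | 0Q00 | exactly 2 ciphers allowed |
| 1 | 6 | 0Q01 | ... |
| 2 | 7 | 0Q02 | ... |
| 3 | 8 | 0Q03 | ... |
| 4 | 9 | 0Q10 | ... |
| ... | ... | ... | ... |
| 15 | 20 | 0Q33 | exactly 2 ciphers allowed |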

3.2.3. Three CCQN Ciphers

21 = $4^{0}$ + $4^{1}$ + $4^{2}$

| index | decimal | CCQN | remark |
| --- | --- | --- | --- |
| 0 | 21 | 0Q000 | exactly 3 ciphers allowed |
| ... | ... | ... | ... |
| 63 | 84 | 0Q333 | exactly 3 ciphers allowed |

3.2.4. Four CCQN Ciphers

85 = $4^{0}$ + $4^{1}$ + $4^{2}$ + $4^{3}$

| index | decimal | CCQN | remark |
| --- | --- | --- | --- |
| 0 | 85 | 0Q0000 | exactly 4 ciphers allowed |
| ... | ... | ... | ... |
| 255 | 340 | 0Q3333 | exactly 4 ciphers allowed |

3.2.5. Many CCQN Ciphers

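| Cipher Count | generic start | start | index range | value range |
| --- | --- | --- | --- | --- |
| 1 | $4^{0}$ | 1 | 0-3 | 1-4 |
| 2 | $4^{0}$ + $4^{1}$ | 5 | 0-15 | 5-20 |
| 3 | $4^{0}$ + $4^{1}$ + $4^{2}$ | 21 | 0-63 | 21-84 |
| 4 | $4^{0}$ + $4^{1}$ + $4^{2}$ + $4^{3}$ | 85 | 0-255 | 85-340 |
| 5 | $4^{0}$ + $4^{1}$ + $4^{2}$ + $4^{3}$ + $4^{4}$ | 341 | 0-1023 | 341-1364 |
| ... | ... | ... | ... | ... |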
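The generic start columns of the CCTN and CCQN tables are geometric sums, so they can be written in closed form. The symbols $s_T(n)$ and $s_Q(n)$ are introduced here only for illustration and denote the smallest value written with $n$ ciphers:

$$
s_T(n) = 1 + \sum_{k=0}^{n-1} 3^{k} = 1 + \frac{3^{n} - 1}{2},
\qquad
s_Q(n) = \sum_{k=0}^{n-1} 4^{k} = \frac{4^{n} - 1}{3}
$$

For $n = 5$ this gives $s_T(5) = 1 + 242/2 = 122$ and $s_Q(5) = 1023/3 = 341$, matching the tables. Since $n$ ciphers provide $3^{n}$ or $4^{n}$ index values, the value range for $n$ ciphers ends at $s_T(n+1) - 1$ or $s_Q(n+1) - 1$, so consecutive ranges neither overlap nor leave gaps.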

(back to top)

4. Encoding principle

  • Legend:

    • xx stands for any byte different from its neighbor.
    • AA stands for any byte that is neither FF nor 00; all AA bytes within one example have the same value.
  • For count encoding, different sigil byte types are used:

    • Z0, Z1, Z2, Z3 for 1 to n 00-bytes in a row
    • F0, F1, F2, F3 for 1 to n FF-bytes in a row
    • R0, R1, R2 for 2 to n equal other bytes in a row
  • Z and F sigils are CCQN ciphers 0..3, and R sigils are CCTN ciphers 0..2.

  • Examples:

| decoded | encoded | number notation / remark |
| --- | --- | --- |
| xx 00 xx | xx Z0 xx | 0Q0 = 1 zero |
| xx 00 00 00 00 xx | xx Z3 xx | 0Q3 = 4 zeros |
| xx 17 times FF xx | xx F3 F0 xx | 0Q30 = 17 |
| xx AA AA xx | xx AA AA xx | 2 times AA stays the same. |
| xx AA AA AA xx | xx AA R0 xx | 3 times AA becomes AA followed by R0, which codes the 2 remaining AA bytes |
| xx 13 times AA xx | xx AA R2 R1 xx | AA stands for itself and indicates what the following R sigils mean; 0T21 (= 12) stands for 12 following AA bytes |
  • Integer examples:
| decoded | encoded | number notation / remark |
| --- | --- | --- |
| 11 00 | 11 Z0 | 17 as 16-bit little-endian integer |
| FF FF | F1 | -1 as 16-bit little-endian integer |
| 11 00 00 00 | 11 Z2 | 17 as 32-bit little-endian integer |
| FF FF FF FF | F3 | -1 as 32-bit little-endian integer |
| 11 00 00 00 00 00 00 00 | 11 Z0 Z2 | 17 as 64-bit little-endian integer |
| FF FF FF FF FF FF FF FF | F0 F3 | -1 as 64-bit little-endian integer |

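As the examples show, small values stay short at any width: 17 takes two encoded bytes as a 16-bit or 32-bit integer and three as a 64-bit integer. Developers can therefore choose wide integer transfer widths in advance, even when bandwidth or storage is limited.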
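The mapping from a single run to sigil names, as used in the example tables above, can be sketched as follows. The names `counted_ciphers` and `encode_run` are hypothetical, the output is symbolic (sigil names rather than real sigil bytes), and link offsets are not modelled.

```python
def counted_ciphers(value: int, base: int, first: int) -> list[int]:
    """Cipher-counted digits of value; `first` is the smallest representable value."""
    start, n = first, 1
    # Advance to the cipher count whose value range contains the value.
    while value >= start + base**n:
        start += base**n
        n += 1
    index = value - start
    digits: list[int] = []
    for _ in range(n):
        index, digit = divmod(index, base)
        digits.insert(0, digit)  # most significant cipher ends up first
    return digits


def encode_run(byte: int, count: int) -> list[str]:
    """Symbolic encoding of `count` equal bytes (assumption: no offset bits)."""
    if byte == 0x00:
        return [f"Z{d}" for d in counted_ciphers(count, 4, 1)]  # CCQN
    if byte == 0xFF:
        return [f"F{d}" for d in counted_ciphers(count, 4, 1)]  # CCQN
    literal = f"{byte:02X}"
    if count < 3:
        return [literal] * count  # one or two equal bytes stay as they are
    # The first byte stands for itself; the remaining count-1 bytes become R sigils (CCTN).
    return [literal] + [f"R{d}" for d in counted_ciphers(count - 1, 3, 2)]


print(encode_run(0xFF, 17))  # -> ['F3', 'F0']
print(encode_run(0xAA, 13))  # -> ['AA', 'R2', 'R1']
print(encode_run(0x00, 7))   # -> ['Z0', 'Z2']
```

Here `0xAA` is only a sample value for the placeholder AA; any byte other than 00 and FF takes the same path.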

(back to top)

5. Sigil Bytes

  • Which bit patterns serve as sigil bytes is an optimization question. The table below assumes that FF and 00 occur more often in the data stream than other byte values, especially in short runs. Runs of any other equal bytes are covered as well.

| Value of bits 7-5 | Bits 7-0 | Hex Range | Byte Name | Single Cipher Value | Sign | Offset Bits | Offset Value | Usage | Remark |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 00000000 | 00 | forbidden |  |  |  |  |  | used later as delimiter byte |
| 0 | 000ooooo | 01...1F | NOP sigil |  | N | ooooo = 1-31 | 1-31 | more | no meaning, used for keeping the sigil chain linked, offset 0 not needed |
| 1 | 001ooooo | 20...3F | Zero 0 sigil | 1 | Z0 | ooooo = 0-31 | 0-31 | more | quaternary cipher 0 for a 0x00 count |
| 2 | 0100oooo | 40...4F | Repeat 1 sigil | 3 | R1 | oooo = 0-15 | 0-15 | less | ternary cipher 1 for a count of any other byte |
| 2 | 0101oooo | 50...5F | Zero 2 sigil | 3 | Z2 | oooo = 0-15 | 0-15 | less | quaternary cipher 2 for a 0x00 count |
| 3 | 011ooooo | 60...7F | Zero 1 sigil | 2 | Z1 | ooooo = 0-31 | 0-31 | more | quaternary cipher 1 for a 0x00 count |
| 4 | 100ooooo | 80...9F | Repeat 0 sigil | 2 | R0 | ooooo = 0-31 | 0-31 | more | ternary cipher 0 for a count of any other byte |
| 5 | 1010oooo | A0...AF | Repeat 2 sigil | 4 | R2 | oooo = 0-15 | 0-15 | less | ternary cipher 2 for a count of any other byte |
| 5 | 1011oooo | B0...BF | Zero 3 sigil | 4 | Z3 | oooo = 0-15 | 0-15 | less | quaternary cipher 3 for a 0x00 count |
| 6 | 110ooooo | C0...DF | Full 1 sigil | 2 | F1 | ooooo = 0-31 | 0-31 | more | quaternary cipher 1 for a 0xFF count |
| 7 | 1110oooo | E0...EF | Full 2 sigil | 3 | F2 | oooo = 0-15 | 0-15 | less | quaternary cipher 2 for a 0xFF count |
| 7 | 1111oooo | F0...FE | Full 3 sigil | 4 | F3 | oooo = 0-14 | 0-14 | less | quaternary cipher 3 for a 0xFF count, offset 15 forbidden to distinguish from F0=FF sigil byte |
| 7 | 11111111 | FF | Full 0 sigil | 1 | F0 |  | 0 |  | CCQN cipher 0; it does not need to be inside the sigil chain, but it can be |
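The bit layout in the table can be read back with a small sketch. The function name `classify_sigil` is illustrative; it assumes the byte is already known to be a sigil, because a plain data byte can carry any of these values.

```python
def classify_sigil(b: int) -> tuple[str, int]:
    """Return (sign, offset) for a byte known to be a sigil, following the table."""
    if not 0x00 <= b <= 0xFF:
        raise ValueError("not a byte")
    if b == 0x00:
        raise ValueError("00 is forbidden inside TCOBS-encoded data")
    if b == 0xFF:
        return ("F0", 0)  # F0 has no offset bits; its implicit offset is 0
    # Sigils with 5 offset bits are identified by bits 7-5.
    five_bit = {0: "N", 1: "Z0", 3: "Z1", 4: "R0", 6: "F1"}
    top3 = b >> 5
    if top3 in five_bit:
        return (five_bit[top3], b & 0x1F)
    # The remaining sigils have 4 offset bits and are identified by bits 7-4.
    four_bit = {0x4: "R1", 0x5: "Z2", 0xA: "R2", 0xB: "Z3", 0xE: "F2", 0xF: "F3"}
    return (four_bit[b >> 4], b & 0x0F)


print(classify_sigil(0x20))  # -> ('Z0', 0)
print(classify_sigil(0xC3))  # -> ('F1', 3)
print(classify_sigil(0xFE))  # -> ('F3', 14)
```

Because FF is handled before the 4-bit lookup, an F3 sigil can never report offset 15, which mirrors the restriction in the table.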

5.1. Symbol assumptions

  • N gets code 0 in bits 7-5 because its offset is never 0; an N sigil therefore never produces the byte 00, which is forbidden inside TCOBS-encoded data.

  • N, Z0, Z1, F1, and R0 are used more often, so they carry link offsets 0..31 in 5 offset bits.

  • F0 is also used often but cannot carry offset bits. Therefore its implicit offset value is 0 when used inside the sigil chain.

  • A single F0 can be treated as an ordinary byte because it has code FF and translates to FF.

  • When F0 is followed by an F sigil, an N sigil is needed in front to carry offset bits if offset > 0 at that point.

    • A possible improvement is delegating offset carrying to the next neighboring sigil that can hold it, but that makes the code more complex with very small benefit.
    • Concatenating offset bits across neighboring sigil bytes is not used because it makes code more complex with little benefit.
  • Z2, Z3, F2, F3, R1, and R2 are used less often, so they carry link offsets 0..15 in 4 offset bits.

  • Even though FF is de facto a sigil byte in the encoded data stream, it does not need to be part of the sigil chain when it is not in a run with other F sigils. Examples:

| decoded | encoded | number notation / remark |
| --- | --- | --- |
| xx FF xx | xx F0 xx | F0 == FF = 1 time FF. F0 does not need to be part of the sigil chain when neighboring bytes are not F sigils. |
| xx FF FF xx | xx F1 xx | 0Q1 = 2 times FF. F1 is part of sigil chain. |
| xx 3 times FF xx | xx F2 xx | 0Q2 = 3 times FF. F2 is part of sigil chain. |
| xx 4 times FF xx | xx F3 xx | 0Q3 = 4 times FF. F3 is part of sigil chain. |
| xx 5 times FF xx | xx F0 F0 xx | 0Q00 = 5 times FF. Both F0 bytes need to be part of the sigil chain. |
| xx 6 times FF xx | xx F0 F1 xx | 0Q01 = 6 times FF. F0 F1 both part of sigil chain. |
| xx 9 times FF xx | xx F1 F0 xx | 0Q10 = 9 times FF. F1 F0 both part of sigil chain. |
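Read in the other direction, a sequence of F sigils gives back the run length. The following is a minimal sketch with a hypothetical helper `f_run_length`, working on sigil names rather than on encoded bytes:

```python
def f_run_length(signs: list[str]) -> int:
    """Number of FF bytes coded by consecutive F sigils, e.g. ["F1", "F0"]."""
    ciphers = [int(sign[1]) for sign in signs]           # "F3" -> 3
    start = sum(4**k for k in range(len(ciphers)))      # 1, 5, 21, 85, ...
    index = 0
    for cipher in ciphers:
        index = index * 4 + cipher                       # plain quaternary index
    return start + index


print(f_run_length(["F0"]))        # -> 1
print(f_run_length(["F0", "F0"]))  # -> 5
print(f_run_length(["F1", "F0"]))  # -> 9
```

The same arithmetic applies to Z sigils; only the decoded byte value differs.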

(back to top)

6. Algorithm

  • Count equal bytes in a run.
  • Convert the count into a ternary or quaternary cipher-counted number.
  • Convert the cipher sequence into a same-sigil-type sequence.
  • Handle offsets to build the sigil chain (encoded buffer ends with a sigil byte).
  • Mathematical proof?
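The first three steps can be sketched for a whole buffer. This is a symbolic illustration only: `counted` and `tokenize` are hypothetical names, the result is a list of sigil names and hex literals rather than encoded bytes, and the offset handling of the fourth step is deliberately left out.

```python
from itertools import groupby


def counted(value: int, base: int, first: int) -> list[int]:
    """Cipher-counted digits of value; `first` is the smallest encodable value."""
    start, n = first, 1
    while value >= start + base**n:  # move on to the next cipher count
        start += base**n
        n += 1
    index = value - start
    return [(index // base**k) % base for k in reversed(range(n))]


def tokenize(data: bytes) -> list[str]:
    """Steps 1 to 3 as symbolic tokens (assumption: link offsets are ignored)."""
    tokens: list[str] = []
    for byte, run in groupby(data):
        count = sum(1 for _ in run)                    # step 1: count equal bytes
        if byte in (0x00, 0xFF):
            sign = "Z" if byte == 0x00 else "F"
            # steps 2 and 3: CCQN ciphers become same-type sigils
            tokens += [f"{sign}{c}" for c in counted(count, 4, 1)]
        elif count < 3:
            tokens += [f"{byte:02X}"] * count          # short runs stay literal
        else:
            tokens.append(f"{byte:02X}")               # first byte stands for itself
            tokens += [f"R{c}" for c in counted(count - 1, 3, 2)]
    return tokens


print(tokenize((17).to_bytes(8, "little")))               # -> ['11', 'Z0', 'Z2']
print(tokenize((-1).to_bytes(8, "little", signed=True)))  # -> ['F0', 'F3']
```

Both outputs agree with the 64-bit rows of the integer examples in section 4. A real encoder additionally has to place offset bits in each sigil and end the buffer with a sigil byte.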

(back to top)

7. Change Log

| Date | Version | Comment |
| --- | --- | --- |
| 2026-MAR-02 | 0.2.3 | Wording and typo cleanup, notation clarifications, and example corrections for consistency with implemented symbols. |
| 2022-AUG-03 | 0.2.2 | Ternary tables corrected and extended. |
| 2022-JUL-27 | 0.2.1 | Integer number examples added. |
| 2022-JUL-27 | 0.2.0 | Sigil code exchange R2 <-> N. Symbol assumptions reworked. |
| 2022-JUL-07 | 0.1.1 | Explanation and samples added. |
| 2022-JUL-07 | 0.1.0 | CCTN start now with 2. |
| 2022-JUN-00 | 0.0.0 | initial |

(back to top)