TCOBSv1 Specification

March 2, 2026

1. TCOBS Encoding Principle

1.1. Symbols

  • o = offset bits to the next sigil byte

  • 101ooooo NOP sigil byte N: ooooo = 0-31

  • 001ooooo Zero sigil byte Z1: ooooo = 0-31

  • 010ooooo Zero sigil byte Z2: ooooo = 0-31

  • 011ooooo Zero sigil byte Z3: ooooo = 0-31

  • 110ooooo Full sigil byte F2: ooooo = 0-31

  • 111ooooo Full sigil byte F3: ooooo = 0-31

  • 100ooooo Full sigil byte F4: ooooo = 0-31

  • 00001ooo Repeat sigil byte R2: ooo = 0-7

  • 00010ooo Repeat sigil byte R3: ooo = 0-7

  • 00011ooo Repeat sigil byte R4: ooo = 0-7

  • 00000ooo reserved bytes: ooo = 1-7

  • 00000000 forbidden byte
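
The bit patterns above can be turned into a small classifier. The following Go sketch is illustrative only: the `Sigil` type and the `ParseSigil` function are hypothetical names chosen for this example and are not part of any TCOBS package. Whether a byte is a sigil or plain data depends on its position in the chain, so the function only answers what a byte would mean if it were a sigil.

```go
package main

// Sigil is an illustrative type for this sketch; it is not part of any TCOBS API.
type Sigil struct {
	Name   string // "N", "Z1".."Z3", "F2".."F4", "R2".."R4", "reserved" or "forbidden"
	Offset int    // value of the offset bits; 0 for reserved and forbidden bytes
}

// ParseSigil interprets b according to the bit patterns in the symbol list.
func ParseSigil(b byte) Sigil {
	if top := b >> 5; top != 0 {
		// The top three bits select the sigil; the low five bits are the offset.
		names := [8]string{"", "Z1", "Z2", "Z3", "F4", "N", "F2", "F3"}
		return Sigil{names[top], int(b & 0x1F)}
	}
	switch b >> 3 { // 000xxooo: two selector bits for R2..R4, three offset bits
	case 1:
		return Sigil{"R2", int(b & 0x07)}
	case 2:
		return Sigil{"R3", int(b & 0x07)}
	case 3:
		return Sigil{"R4", int(b & 0x07)}
	}
	if b == 0 {
		return Sigil{"forbidden", 0}
	}
	return Sigil{"reserved", 0}
}
```

For example, `ParseSigil(0x7F)` yields Z3 with offset 31, and `ParseSigil(0x0F)` yields R2 with offset 7.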

1.1.1. NOP Sigil Byte N

This sigil does not represent data in the stream. It only keeps the sigil chain linked. The remaining 5 bits encode the distance to the next sigil (0 <= n <= 31).

  • N_0 = 10100000
  • ...
  • N_31 = 10111111

1.1.2. Zero Sigil Byte Z1, Z2, Z3

  • These sigils represent 1 to 3 zero bytes in the data stream: Z1 replaces 00, Z2 replaces 00 00, and Z3 replaces 00 00 00. This eliminates zeros from the encoded data, reduces its size, and keeps the chain linked.
  • The remaining 5 bits encode the distance to the next sigil (0 <= n <= 31).
  • Z1 = 001ooooo
    • Z1_0 = 00100000
    • ...
    • Z1_31 = 00111111
  • ...
  • Z3 = 011ooooo
    • Z3_0 = 01100000
    • ...
    • Z3_31 = 01111111

1.1.3. Full Sigil Byte F2, F3, F4

  • These sigils represent 2 to 4 consecutive FF bytes in the data stream: F2 replaces FF FF, F3 replaces FF FF FF, and F4 replaces FF FF FF FF. This reduces data size and keeps the chain linked.
  • The remaining 5 bits encode the distance to the next sigil (0 <= n <= 31).
  • F2 = 110ooooo
    • F2_0 = 11000000
    • ...
    • F2_31 = 11011111
  • ...
  • F4 = 100ooooo
    • F4_0 = 10000000
    • ...
    • F4_31 = 10011111

1.1.4. Repeat Sigil Byte R2, R3, R4

  • These sigils represent 2 to 4 repetitions of the preceding data byte (R2 adds two copies, R3 three, and R4 four), which reduces data size while keeping the chain linked.
  • The remaining 3 bits encode the distance to the next sigil (0 <= n <= 7).
  • R2 = 00001ooo
    • R2_0 = 00001000
    • ...
    • R2_7 = 00001111
  • ...
  • R4 = 00011ooo
    • R4_0 = 00011000
    • ...
    • R4_7 = 00011111
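
The different offset widths matter when composing sigil bytes: N, Z, and F sigils can carry offsets up to 31, while R sigils can only carry offsets up to 7. A minimal Go sketch of that rule follows. `ComposeSigil` and the `bases` table are hypothetical helpers written for this example, not functions of an existing TCOBS library.

```go
package main

import "fmt"

// bases maps the sigil names used in this document to their bit patterns
// with all offset bits cleared.
var bases = map[string]byte{
	"N": 0xA0, "Z1": 0x20, "Z2": 0x40, "Z3": 0x60,
	"F2": 0xC0, "F3": 0xE0, "F4": 0x80,
	"R2": 0x08, "R3": 0x10, "R4": 0x18,
}

// ComposeSigil returns the sigil byte for name with the given offset.
// R sigils have 3 offset bits (0..7); all other sigils have 5 (0..31).
func ComposeSigil(name string, offset int) (byte, error) {
	base, ok := bases[name]
	if !ok {
		return 0, fmt.Errorf("unknown sigil %q", name)
	}
	limit := 31
	if name[0] == 'R' {
		limit = 7
	}
	if offset < 0 || offset > limit {
		return 0, fmt.Errorf("offset %d out of range 0..%d for %s", offset, limit, name)
	}
	return base | byte(offset), nil
}
```

An encoder that needs an R sigil at an offset above 7 cannot express it directly, which is why the simple encoding inserts a NOP sigil in that case.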

1.2. TCOBS Encoding

Encoding can be implemented in a straightforward way on the sender side, touching each byte only once.

1.2.1. Simple Encoding Algorithm

  • aa represents any byte other than 00 and FF.

  • aa aa ... represents a run of equal bytes other than 00 and FF.

  • The algorithm is easy to implement and fast.

  • Longer runs are encoded by repeating sigils.

EXAMPLES:

| Unencoded Data | Encoded Data | Comment |
| --- | --- | --- |
| 00 | Z1 | |
| 00 00 | Z2 | |
| 00 00 00 | Z3 | |
| 00 00 00 00 | Z3 Z1 | repetition |
| 00 00 00 00 00 | Z3 Z2 | repetition |
| 00 00 00 00 00 00 | Z3 Z3 | repetition |
| 00 00 00 00 00 00 00 | Z3 Z3 Z1 | repetition |
| ... | ... | repetition |
| aa | aa | |
| aa aa | aa aa | |
| aa aa | aa N31 aa | offset reached 31, so a NOP sigil byte is added |
| aa aa aa | aa R2 | |
| aa aa aa aa | aa R3 | |
| aa aa aa aa | aa Nn R3 | n=8...31, offset exceeds 7, so a NOP sigil byte is inserted |
| aa aa aa aa aa | aa R4 | |
| aa aa aa aa aa aa | aa R4 aa | repetition |
| aa aa aa aa aa aa aa | aa R4 aa aa | repetition |
| ... | ... | repetition |
| FF | FF | |
| FF | FF N31 | offset reached 31, so a NOP sigil byte is added |
| FF FF | F2 | |
| FF FF FF | F3 | |
| FF FF FF FF | F4 | |
| FF FF FF FF FF | F4 FF | repetition |
| FF FF FF FF FF FF | F4 F2 | repetition |
| FF FF FF FF FF FF FF | F4 F3 | repetition |
| FF FF FF FF FF FF FF FF | F4 F4 | repetition |
| ... | ... | repetition |

  • If several encodings are possible, the encoder can choose any of them.
    • Example: 00 00 00 00 can be encoded as 60 20 (Z3 Z1) or 40 40 (Z2 Z2).
  • NOP sigil bytes carry no data and are logically ignored. They serve only as links in the sigil chain.
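
The rules and examples above can be sketched as a greedy single-pass encoder. This is a minimal illustration under stated assumptions, not the reference implementation: `EncodeSimple`, `encoder`, and `runLen` are hypothetical names, the sketch always picks the longest sigil available, and it appends a final NOP sigil when the input ends with plain data bytes so that the buffer ends with a sigil byte.

```go
package main

// Bit patterns with offset 0, taken from the symbol list in section 1.1.
const (
	nop = 0xA0 // N
	f2  = 0xC0 // F2
	f3  = 0xE0 // F3
	f4  = 0x80 // F4
)

// encoder collects the output and tracks the distance to the previous sigil.
type encoder struct {
	out    []byte
	offset int // data bytes written since the last sigil byte (or buffer start)
}

// sigil appends a sigil byte whose offset field can hold at most limit.
// If the current offset does not fit, a NOP sigil carries it instead.
func (e *encoder) sigil(base byte, limit int) {
	if e.offset > limit {
		e.out = append(e.out, nop|byte(e.offset))
		e.offset = 0
	}
	e.out = append(e.out, base|byte(e.offset))
	e.offset = 0
}

// data appends a plain data byte and inserts N31 once the offset reaches 31.
func (e *encoder) data(b byte) {
	e.out = append(e.out, b)
	e.offset++
	if e.offset == 31 {
		e.sigil(nop, 31)
	}
}

// runLen counts the equal bytes at the start of in, up to limit.
func runLen(in []byte, limit int) int {
	n := 1
	for n < len(in) && n < limit && in[n] == in[0] {
		n++
	}
	return n
}

// EncodeSimple is a sketch of the simple encoding; the name is an assumption.
func EncodeSimple(in []byte) []byte {
	e := &encoder{}
	for i := 0; i < len(in); {
		b := in[i]
		switch {
		case b == 0x00: // 1..3 zeros become Z1..Z3 (001, 010, 011 in the top bits)
			n := runLen(in[i:], 3)
			e.sigil(byte(n)<<5, 31)
			i += n
		case b == 0xFF && runLen(in[i:], 4) >= 2: // 2..4 FF become F2..F4
			n := runLen(in[i:], 4)
			e.sigil([]byte{f2, f3, f4}[n-2], 31)
			i += n
		default: // plain data byte, optionally followed by R2..R4
			e.data(b)
			i++
			if rest := runLen(in[i-1:], 5) - 1; rest >= 2 {
				e.sigil(byte(rest-1)<<3, 7) // R2=00001, R3=00010, R4=00011
				i += rest
			}
		}
	}
	if e.offset > 0 { // the encoded buffer must end with a sigil byte
		e.sigil(nop, 31)
	}
	return e.out
}
```

With this sketch, `EncodeSimple([]byte{0, 0, 0, 0})` produces `60 20` (Z3 Z1), and `AA BB BB BB 00 00` produces `AA BB 0A 40` (AA BB R2 Z2). The output never contains a 00 byte, because every sigil has at least one non-offset bit set and zeros in the input are always replaced.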

1.2.2. Sigil Bytes Chaining

  • Encoding starts at the first buffer address.

  • The encoded buffer ends with a sigil byte.

  • Decoding starts with that final sigil byte.

  • The first decoded sigil byte (at the end of the encoded buffer) carries, as offset, the byte distance to the previous sigil byte or to the buffer start.

  • If two sigil bytes are neighbors, the offset is 0.

  • Encoded examples (Sn = sigil byte with offset n, by = data byte):

    S0 // only one sigil byte like F4 (representing FF FF FF FF)
    by S1 // one byte and a sigil byte like AA R4 representing AA AA AA AA AA
    by by S2 S0 // like AA BB R2 Z2 representing AA BB BB BB 00 00
    by by by by by by by by by S9 S0 S0 by by by S3 // and so on ...
    
  • Each subsequent sigil byte, moving toward the buffer start, carries the byte distance to the sigil byte before it or, for the first sigil in the buffer, to the buffer start.
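
The chaining rules above suggest a decoder that walks the buffer from its last byte toward the start. The following Go sketch illustrates that walk under stated assumptions: `DecodeSimple` is a hypothetical name, it returns a freshly allocated slice instead of filling a caller-supplied buffer from the end, and for an R sigil with offset 0 it assumes the left neighbor is a NOP sigil and takes the repeated byte from the position left of that NOP.

```go
package main

import "errors"

var errChain = errors.New("tcobs sketch: broken sigil chain")

// DecodeSimple walks the sigil chain from the end of in to its start.
func DecodeSimple(in []byte) ([]byte, error) {
	var rev []byte // decoded bytes, collected back to front
	for end := len(in); end > 0; {
		s := in[end-1] // the byte at the current end is always a sigil
		offset, count, fill := int(s&0x1F), 0, byte(0)
		switch s >> 5 {
		case 1, 2, 3: // Z1..Z3: 1..3 zero bytes
			count = int(s >> 5)
		case 4: // F4
			count, fill = 4, 0xFF
		case 5: // N: carries no data
		case 6: // F2
			count, fill = 2, 0xFF
		case 7: // F3
			count, fill = 3, 0xFF
		default: // 000xxooo: R2..R4, reserved, or forbidden
			offset, count = int(s&0x07), int(s>>3)+1
			if count == 1 {
				return nil, errors.New("tcobs sketch: reserved or forbidden byte")
			}
			p := end - 2 // data byte directly left of the R sigil
			if offset == 0 {
				p-- // assumption: left neighbor is a NOP sigil, skip it
			}
			if p < 0 {
				return nil, errChain
			}
			fill = in[p]
		}
		start := end - 1 - offset // index of the first data byte of this segment
		if start < 0 {
			return nil, errChain
		}
		for i := 0; i < count; i++ { // bytes represented by the sigil itself
			rev = append(rev, fill)
		}
		for i := end - 2; i >= start; i-- { // plain data bytes left of the sigil
			rev = append(rev, in[i])
		}
		end = start // the byte left of this segment is the next sigil
	}
	for i, j := 0, len(rev)-1; i < j; i, j = i+1, j-1 {
		rev[i], rev[j] = rev[j], rev[i]
	}
	return rev, nil
}
```

Applied to `AA BB 0A 40` (AA BB R2 Z2), the sketch first handles Z2 with offset 0, then R2 with offset 2, and returns `AA BB BB BB 00 00`.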

2. TCOBS Software Interface

2.1. C Interface and Code

2.2. Go Interface and Code

    // CEncode encodes i into o and returns the number of encoded bytes.
    func CEncode(o, i []byte) int

    // Decode decodes in into d and returns n, the decoded length.
    // The decoded bytes occupy the last n bytes of d.
    func Decode(d, in []byte) (n int, e error)

3. Appendix: Extended Encoding Possibilities

  • The reserved bytes 00000ooo with ooo = 1..7 can be used for future extensions.

3.1. Example: RLE for longer rows of equal bytes (not implemented)

Compression can be improved by the following ideas. They would make the encoder more complex and are rarely worthwhile for messages produced by Trice. For user data with long runs of equal bytes, they may still be useful when compute power is limited and standard compression is too slow.

  • The reserved values 00000ooo with ooo = 001...111 can be used for additional compression extensions.
  • These sigil bytes then implicitly have offset 0. They are only allowed as the right neighbor of another sigil byte.
  • R repetition sigils repeat data bytes according to their count value if no M sigil (see below) is to their right.
  • Multiple repetition sigils can be chained. Example:
    • aa R4 R3 = (1 + 4 + 3) * aa = 8 * aa
  • M multiply sigils multiply their count with the count of the sigil to their left.
  • Multiplication between M sigils is possible an unlimited number of times.
  • If an R, Z, or F sigil is left of an M sigil, it is also multiplied, and the multiplication chain ends there.
  • Examples:
    • Z2 R3 R4 M8 = (2 + 3 + (4 * 8)) * 00 = 37 * 00
    • aa R4 R2 M3 M3 = (1 + 4 + (2 * 3 * 3)) * aa = 23 * aa
    • F2 M3 R4 M3 M8 R5 = ((2 * 3) + (4 * 3 * 8) + 5) * FF = 107 * FF
  • The encoder can choose among alternatives. The decoder follows a clear algorithm.
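
The counting rules for R and M sigils can be checked with a small evaluator. The Go sketch below is hypothetical (`ExpandedCount` is not part of any implementation, and the extension itself is not implemented). It reads a token string in the notation of the examples, scans it from right to left, and returns the number of bytes it stands for. Tokens of the form Z, F, R, or M followed by a digit are treated as sigils; every other token counts as one data byte.

```go
package main

import (
	"errors"
	"strings"
)

// ExpandedCount returns how many decoded bytes a token sequence such as
// "Z2 R3 R4 M8" represents under the proposed multiply rules.
func ExpandedCount(tokens string) (int, error) {
	fields := strings.Fields(tokens)
	total, factor := 0, 1 // factor collects the M sigils seen so far
	for i := len(fields) - 1; i >= 0; i-- {
		tok := fields[i]
		isSigil := len(tok) == 2 && strings.IndexByte("ZFRM", tok[0]) >= 0 &&
			tok[1] >= '1' && tok[1] <= '9'
		switch {
		case isSigil && tok[0] == 'M':
			// M multiplies the count of whatever is to its left.
			factor *= int(tok[1] - '0')
		case isSigil:
			// An R, Z, or F sigil is multiplied and ends the chain.
			total += int(tok[1]-'0') * factor
			factor = 1
		case factor != 1:
			return 0, errors.New("M sigil must follow an R, Z, F, or M sigil")
		default:
			total++ // plain data byte such as aa
		}
	}
	if factor != 1 {
		return 0, errors.New("M sigil without a sigil to its left")
	}
	return total, nil
}
```

For the examples above, `ExpandedCount("Z2 R3 R4 M8")` returns 37 and `ExpandedCount("aa R4 R2 M3 M3")` returns 23.
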
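
| Sigil | Code | Use count until 21 repetitions | Comment |
| --- | --- | --- | --- |
| | 001 | 0 | reserved |
| | 010 | 0 | reserved |
| RA | 011 | 6 | 10 data byte repetitions |
| M3 | 100 | 3 | multiply left count with 3 |
| M4 | 101 | 10 | multiply left count with 4 |
| M5 | 110 | 6 | multiply left count with 5 |
| M8 | 111 | 10 | multiply left count with 8 |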

These 5 sigils can minimize encoded length for runs of more than 20 equal bytes. The table below is exploratory only; tokens like F1, M7, or Z4 are historical placeholders and not part of the implemented v1 symbol set.

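| Decoded | TCOBS encoded | Decoded | TCOBS encoded | Decoded | TCOBS encoded |
| --- | --- | --- | --- | --- | --- |
| 1 * 00 | Z1 | 1 * FF | FF | 1 * aa | aa |
| 2 * 00 | Z2 | 2 * FF | F2 | 2 * aa | aa aa |
| 3 * 00 | Z3 | 3 * FF | F3 | 3 * aa | aa R2 |
| 4 * 00 | Z3 Z1 | 4 * FF | F4 | 4 * aa | aa R3 |
| 5 * 00 | Z3 Z2 | 5 * FF | F4 FF | 5 * aa | aa R4 |
| 6 * 00 | Z3 Z3 | 6 * FF | F4 F2 | 6 * aa | aa R3 R2 |
| 7 * 00 | Z3 R4 | 7 * FF | F4 F3 | 7 * aa | aa R4 R2 |
| 8 * 00 | Z2 M4 | 8 * FF | F2 M4 | 8 * aa | aa R4 R3 |
| 9 * 00 | Z3 M3 | 9 * FF | F3 M3 | 9 * aa | aa R2 M4 |
| 10 * 00 | Z2 M5 | 10 * FF | F2 M5 | 10 * aa | aa R3 M3 |
| 11 * 00 | Z1 RA | 11 * FF | F1 RA | 11 * aa | aa RA |
| 12 * 00 | Z3 M4 | 12 * FF | F3 M4 | 12 * aa | aa RA aa |
| 13 * 00 | Z3 RA | 13 * FF | F3 RA | 13 * aa | aa R3 M4 |
| 14 * 00 | Z2 M7 | 14 * FF | F2 M7 | 14 * aa | aa R3 RA |
| 15 * 00 | Z3 M5 | 15 * FF | F3 M5 | 15 * aa | aa R2 M7 |
| 16 * 00 | Z2 M8 | 16 * FF | F2 M8 | 16 * aa | aa R3 M5 |
| 17 * 00 | Z2 M8 Z1 | 17 * FF | F2 M8 FF | 17 * aa | aa R4 M4 |
| 18 * 00 | Z2 M8 R2 | 18 * FF | F2 M8 R2 | 18 * aa | aa R4 M4 aa |
| 19 * 00 | Z2 M8 R3 | 19 * FF | F2 M8 R3 | 19 * aa | aa R4 M4 R2 |
| 20 * 00 | Z2 M8 R4 | 20 * FF | F2 M8 R4 | 20 * aa | aa R4 M4 R3 |
| 21 * 00 | Z4 M5 Z1 | 21 * FF | F4 M5 FF | 21 * aa | aa R4 M5 |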

These extended possibilities are currently not implemented and are shown for discussion only. A decoder that can interpret such extensions will also decode the simple encoding.

3.2. Other Example: Any proposal?

F4 may not be used very often and could be reassigned for a different purpose, but that would define a different method.

4. Changelog

| Date | Version | Comment |
| --- | --- | --- |
| 2026-MAR-02 | 0.9.3 | Wording and typo cleanup, notation clarifications, Go interface snippet correction, and link fixes. |
| 2022-AUG-08 | 0.9.2 | Link corrected |
| 2022-AUG-06 | 0.9.1 | Assumptions moved to TCOBS ReadMe.md |
| 2022-JUL-30 | 0.9.0 | Common (v1 & v2) parts removed. |
| 2022-JUL-24 | 0.8.5 | Smaller wording improvements. |
| 2022-MAY-22 | 0.8.4 | F4 remark added. Correction: Trice -> message in chapter 2 and 3. |
| 2022-MAY-08 | 0.8.3 | Correction: in the worst case 1 additional byte per 31 bytes (not 32) |
| 2022-APR-02 | 0.8.2 | Preface reworked |
| 2022-APR-01 | 0.8.1 | Document slightly restructured |
| 2022-MAR-29 | 0.8.0 | Document slightly restructured, some comments added |
| 2022-MAR-28 | 0.7.1 | Multiply sigil byte idea more specified |
| 2022-MAR-28 | 0.7.0 | Multiply sigil byte idea added |
| 2022-MAR-24 | 0.6.1 | Smaller corrections |
| 2022-MAR-24 | 0.6.0 | Comment added to preface after talk with Sergii |
| 2022-MAR-23 | 0.5.2 | Sigil bytes offset correction, tcobs.h Link corrected. tcobs.c Link added |
| 2022-MAR-22 | 0.5.1 | Simple encoding example table extended. |
| 2022-MAR-21 | 0.5.0 | R5 removed |
| 2022-MAR-20 | 0.4.2 | Sigil corrected. Now the offset is the byte count between two sigil bytes. |
| 2022-MAR-20 | 0.4.1 | Sigil chaining better explained. |
| 2022-MAR-19 | 0.4.0 | TCOBS Encoding as C-Code in separate file TCOBS.C |
| 2022-MAR-18 | 0.3.1 | wip TCOBS Encoding |
| 2022-MAR-18 | 0.3.0 | Software Interface added |
| 2022-MAR-18 | 0.2.0 | Correction & simplification |
| 2022-MAR-17 | 0.1.1 | Clarification |
| 2022-MAR-17 | 0.0.0 | Moved from Trice1.0Specification and reworked |