Waveform data file format

July 3, 2021

This page describes the binary and JSON data formats produced by the audiowaveform(1) program.

Binary data format (.dat)

audiowaveform expects binary waveform data files to use the .dat file extension. This format consists of a header block followed by the actual waveform data. All values are little-endian.

The header block starts with a version field that identifies how the rest of the data in the file is structured.

| Byte offset | Type | Field |
| --- | --- | --- |
| 0-3 | int32_t | Version |

The version 1 header is structured as follows:

| Byte offset | Type | Field |
| --- | --- | --- |
| 4-7 | uint32_t | Flags |
| 8-11 | int32_t | Sample rate |
| 12-15 | int32_t | Samples per pixel |
| 16-19 | uint32_t | Length |

The version 2 header is structured as follows:

| Byte offset | Type | Field |
| --- | --- | --- |
| 4-7 | uint32_t | Flags |
| 8-11 | int32_t | Sample rate |
| 12-15 | int32_t | Samples per pixel |
| 16-19 | uint32_t | Length |
| 20-23 | int32_t | Channels |

Each of these fields is described in detail below.
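As a rough illustration of the header layouts above, the following Python sketch decodes the header block from a byte buffer using the standard `struct` module. The function name `parse_header` and the returned dictionary keys are hypothetical choices made for this example; they are not part of audiowaveform.

```python
import struct


def parse_header(buf: bytes) -> dict:
    """Decode the header block of a .dat file held in memory (illustrative sketch)."""
    # Version: little-endian int32_t at byte offsets 0-3.
    (version,) = struct.unpack_from("<i", buf, 0)
    if version not in (1, 2):
        raise ValueError(f"unsupported version: {version}")

    # Flags (uint32_t), sample rate (int32_t), samples per pixel (int32_t)
    # and length (uint32_t) occupy byte offsets 4-19 in both versions.
    flags, sample_rate, samples_per_pixel, length = struct.unpack_from("<IiiI", buf, 4)

    if version == 2:
        # Channels: int32_t at byte offsets 20-23 (version 2 only).
        (channels,) = struct.unpack_from("<i", buf, 20)
        header_size = 24
    else:
        # Version 1 files always hold a single channel.
        channels = 1
        header_size = 20

    return {
        "version": version,
        "flags": flags,
        "sample_rate": sample_rate,
        "samples_per_pixel": samples_per_pixel,
        "length": length,
        "channels": channels,
        "header_size": header_size,
    }
```

The `"<"` prefix in each format string selects little-endian byte order with no padding, which matches the statement that all values are little-endian.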

Version

This field indicates the version number of the waveform data format. The version 1 and 2 data formats are described here. If the format changes in future, the Version field will be incremented.

Flags

The Flags field describes the format of the waveform data that follows the header.

| Bit | Description |
| --- | --- |
| 0 (lsb) | 0: 16-bit resolution, 1: 8-bit resolution |
| 1-31 | Unused |
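A minimal sketch of how the resolution could be read from the Flags field is shown below. The helper name `bits_from_flags` is hypothetical, and ignoring bits 1-31 is an assumption of this example, since the format only says those bits are unused.

```python
def bits_from_flags(flags: int) -> int:
    """Return the waveform resolution in bits encoded in the Flags field."""
    # Bit 0 (the least significant bit): 0 means 16-bit, 1 means 8-bit.
    # Bits 1-31 are unused, so this sketch masks them off and ignores them.
    return 8 if flags & 0x1 else 16
```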

Sample rate

Sample rate of the original audio file (Hz).

Samples per pixel

Number of audio samples per waveform minimum/maximum pair.

Length

Length of waveform data (number of minimum and maximum value pairs per channel).
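Because each minimum/maximum pair covers a fixed number of audio samples, the Length, Samples per pixel, and Sample rate fields can be combined to estimate how much audio the waveform represents. The sketch below shows the arithmetic; the function name is hypothetical, and the result is only approximate if the final pair covers fewer samples than the rest.

```python
def approximate_duration_seconds(length: int, samples_per_pixel: int, sample_rate: int) -> float:
    """Estimate the audio duration covered by the waveform data."""
    # length pairs, each covering samples_per_pixel samples, played at sample_rate Hz.
    return length * samples_per_pixel / sample_rate
```

For example, a length of 3 with 512 samples per pixel at 48000 Hz covers 3 × 512 = 1536 samples, or about 0.032 seconds.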

Channels

The number of waveform channels present (version 2 only).
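Taken together, the header fields determine how large a well-formed file should be, which can serve as a basic sanity check before reading the data. The following is a sketch under that assumption; the function name `expected_file_size` is hypothetical.

```python
def expected_file_size(version: int, flags: int, length: int, channels: int = 1) -> int:
    """Compute the size in bytes implied by the header fields of a .dat file."""
    # The version 1 header ends at byte 19; version 2 adds Channels at bytes 20-23.
    header_size = 20 if version == 1 else 24
    # Version 1 supports only a single channel.
    if version == 1:
        channels = 1
    # Flags bit 0 selects 8-bit (1 byte) or 16-bit (2 bytes) values.
    bytes_per_value = 1 if flags & 0x1 else 2
    # Each of the `length` entries holds one minimum and one maximum per channel.
    return header_size + length * channels * 2 * bytes_per_value
```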

Waveform data

Waveform data follows the header block and consists of pairs of minimum and maximum values. Each pair summarizes a fixed-size block of samples from the original audio; the block size is given by the "samples per pixel" header field.

The version 1 data format supports only a single audio channel; the audiowaveform program converts stereo audio to mono when generating waveform data. The version 2 data format supports multiple channels, where the data from each channel is interleaved.

For 8-bit data, the waveform data is represented as follows. Each value lies in the range -128 to +127. The example shows a two channel waveform data file, which uses the version 2 header, so the data starts at byte offset 24.

| Byte offset | Type | Value |
| --- | --- | --- |
| 24 | int8_t | Minimum sample value, index 0, channel 0 |
| 25 | int8_t | Maximum sample value, index 0, channel 0 |
| 26 | int8_t | Minimum sample value, index 0, channel 1 |
| 27 | int8_t | Maximum sample value, index 0, channel 1 |
| 28 | int8_t | Minimum sample value, index 1, channel 0 |
| 29 | int8_t | Maximum sample value, index 1, channel 0 |
| 30 | int8_t | Minimum sample value, index 1, channel 1 |
| 31 | int8_t | Maximum sample value, index 1, channel 1 |
| ... | ... | ... |

Pairs of minimum and maximum values repeat to the end of the file.
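To illustrate the interleaving, the sketch below decodes an 8-bit data section and groups the minimum/maximum pairs by channel. It assumes `payload` holds only the bytes that follow the header; the function name `decode_8bit` is hypothetical.

```python
import struct


def decode_8bit(payload: bytes, channels: int) -> list:
    """Split 8-bit waveform data into one list of (min, max) pairs per channel."""
    # Every byte is one signed 8-bit value (int8_t).
    values = struct.unpack(f"<{len(payload)}b", payload)
    # Consecutive values form (minimum, maximum) pairs in file order.
    pairs = list(zip(values[0::2], values[1::2]))
    # Pairs cycle through the channels: ch 0, ch 1, ..., then the next index.
    return [pairs[ch::channels] for ch in range(channels)]
```

For a two channel file, `decode_8bit(payload, 2)[0]` would hold the pairs for channel 0 and `decode_8bit(payload, 2)[1]` those for channel 1.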

For 16-bit data, the waveform data is represented as follows. Each value lies in the range -32768 to +32767. The example shows a two channel waveform data file, which uses the version 2 header, so the data starts at byte offset 24.

| Byte offset | Type | Value |
| --- | --- | --- |
| 24-25 | int16_t | Minimum sample value, index 0, channel 0 |
| 26-27 | int16_t | Maximum sample value, index 0, channel 0 |
| 28-29 | int16_t | Minimum sample value, index 0, channel 1 |
| 30-31 | int16_t | Maximum sample value, index 0, channel 1 |
| 32-33 | int16_t | Minimum sample value, index 1, channel 0 |
| 34-35 | int16_t | Maximum sample value, index 1, channel 0 |
| 36-37 | int16_t | Minimum sample value, index 1, channel 1 |
| 38-39 | int16_t | Maximum sample value, index 1, channel 1 |
| ... | ... | ... |

Pairs of minimum and maximum values repeat to the end of the file.
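The byte offset of any single value follows from the interleaved layout, so a reader does not need to decode the whole file to fetch one data point. The sketch below shows one way to compute it for either resolution. The function names are hypothetical, and the default `header_size` of 24 assumes a version 2 file, whose header occupies bytes 0-23; a version 1 file would use 20.

```python
import struct


def value_offset(index: int, channel: int, is_max: bool,
                 channels: int, bits: int, header_size: int = 24) -> int:
    """Return the byte offset of one minimum or maximum value in a .dat file."""
    bytes_per_value = bits // 8  # 1 for 8-bit data, 2 for 16-bit data
    # Position in the interleaved sequence: pairs are ordered by index, then channel,
    # and within each pair the minimum comes before the maximum.
    position = (index * channels + channel) * 2 + (1 if is_max else 0)
    return header_size + position * bytes_per_value


def read_value(buf: bytes, index: int, channel: int, is_max: bool,
               channels: int, bits: int, header_size: int = 24) -> int:
    """Read one little-endian signed value from a whole-file buffer."""
    offset = value_offset(index, channel, is_max, channels, bits, header_size)
    fmt = "<b" if bits == 8 else "<h"
    (value,) = struct.unpack_from(fmt, buf, offset)
    return value
```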

JSON data format (.json)

The JSON data format contains the same information as the binary format. audiowaveform expects JSON data files to use the .json file extension. The format consists of a single JSON object containing fields as described below.

version

The version number of the waveform data format. The version 1 and 2 data formats are described here. If the format changes in future, the version field will be incremented.

channels

The number of waveform channels present (version 2 only).

sample_rate

Sample rate of the original audio file (Hz).

samples_per_pixel

Number of audio samples per waveform minimum/maximum pair.

bits

Resolution of waveform data. May be either 8 or 16.

length

Length of waveform data (number of minimum and maximum value pairs per channel).

data

Array of minimum and maximum waveform data points, interleaved. The example shows a two channel waveform data file. Depending on bits, each value may be in the range -128 to +127 or -32768 to +32767.

| Array offset | Value |
| --- | --- |
| 0 | Minimum sample value, index 0, channel 0 |
| 1 | Maximum sample value, index 0, channel 0 |
| 2 | Minimum sample value, index 0, channel 1 |
| 3 | Maximum sample value, index 0, channel 1 |
| 4 | Minimum sample value, index 1, channel 0 |
| 5 | Maximum sample value, index 1, channel 0 |
| 6 | Minimum sample value, index 1, channel 1 |
| 7 | Maximum sample value, index 1, channel 1 |
| ... | ... |
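Since the value range depends on bits, code that draws or compares waveforms may find it convenient to rescale the data to a common range first. The sketch below divides by 2^(bits - 1) to map values into the interval from -1.0 up to (but not including) 1.0. This scaling convention and the function name `normalize` are assumptions of this example; the format itself does not prescribe any normalization.

```python
def normalize(data: list, bits: int) -> list:
    """Scale integer waveform values to floats in the range [-1.0, 1.0)."""
    if bits not in (8, 16):
        raise ValueError(f"bits must be 8 or 16, got {bits}")
    # 128 for 8-bit data, 32768 for 16-bit data.
    scale = float(1 << (bits - 1))
    return [value / scale for value in data]
```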

Example

The following is an example of a (very short) waveform data file in JSON format.

```json
{
  "version": 2,
  "channels": 2,
  "sample_rate": 48000,
  "samples_per_pixel": 512,
  "bits": 8,
  "length": 3,
  "data": [-65,63,-66,64,-40,41,-39,45,-55,43,-55,44]
}
```
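A consumer of the JSON format might check that the fields agree with one another before using the data. The sketch below loads the example above with Python's standard `json` module and applies two checks derived from the field descriptions: the data array should hold length × channels × 2 values, and every value should fit the range implied by bits. The function name `check_waveform` is hypothetical, and treating a missing channels field as 1 is an assumption based on that field being present in version 2 only.

```python
import json

EXAMPLE = """
{
  "version": 2,
  "channels": 2,
  "sample_rate": 48000,
  "samples_per_pixel": 512,
  "bits": 8,
  "length": 3,
  "data": [-65,63,-66,64,-40,41,-39,45,-55,43,-55,44]
}
"""


def check_waveform(doc: dict) -> list:
    """Return a list of problems found in a parsed waveform object (empty if none)."""
    problems = []
    channels = doc.get("channels", 1)  # assumed to be 1 when the field is absent
    bits = doc["bits"]
    if bits not in (8, 16):
        return [f"bits must be 8 or 16, got {bits}"]
    # One minimum and one maximum per channel for each of the `length` entries.
    expected = doc["length"] * channels * 2
    if len(doc["data"]) != expected:
        problems.append(f"expected {expected} values, found {len(doc['data'])}")
    # Signed range: -128..127 for 8 bits, -32768..32767 for 16 bits.
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if any(not low <= value <= high for value in doc["data"]):
        problems.append(f"value outside {low}..{high}")
    return problems


print(check_waveform(json.loads(EXAMPLE)))
```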