Datatypes Guide
May 14, 2026
Complete guide to HDF5 datatype mapping and Go type conversion
Table of Contents
- Overview
- Numeric Types
- String Types
- Compound Types
- Type Conversion Rules
- Unsupported Types
- Best Practices
Overview
HDF5 uses its own type system that maps to native types in different programming languages. This library provides automatic conversion between HDF5 types and Go types.
Type Categories
| Category | HDF5 Class | Go Representation | Read | Write |
|---|---|---|---|---|
| Fixed-point | H5T_INTEGER | int8-64, uint8-64 | Yes | Yes |
| Floating-point | H5T_FLOAT | float32, float64 | Yes | Yes |
| String | H5T_STRING | string, []string | Yes | Yes |
| Compound | H5T_COMPOUND | map[string]interface{} | Yes | Yes |
| Array | H5T_ARRAY | [N]T (fixed arrays) | Yes | Yes |
| Enum | H5T_ENUM | Named integer constants | Yes | Yes |
| Reference | H5T_REFERENCE | uint64, [12]byte | Yes | Yes |
| Opaque | H5T_OPAQUE | []byte with tag | Yes | Yes |
| Time | H5T_TIME | - | No | No (deprecated) |
Numeric Types
Integer Types
32-bit Signed Integer
HDF5 Types:
- H5T_STD_I32LE (little-endian)
- H5T_STD_I32BE (big-endian)
- H5T_NATIVE_INT (platform-native, 32-bit)
Go Type: int32
Example:
// HDF5 file contains int32 dataset
data, err := ds.Read() // Returns []float64
// Or preserve original type information
info, _ := ds.Info()
// info shows: "Datatype: int32"
// Value conversion: int32 → float64
// Example: 42 (int32) becomes 42.0 (float64)
Range: -2,147,483,648 to 2,147,483,647
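The LE and BE variants listed above hold the same value with opposite byte order. As a minimal sketch of what that means for the raw bytes, the following uses only Go's standard `encoding/binary` package. The helper names `decodeInt32LE` and `decodeInt32BE` are illustrative assumptions and are not part of this library's API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeInt32LE interprets 4 raw bytes as a little-endian signed 32-bit integer
// (the layout of H5T_STD_I32LE). Hypothetical helper for illustration only.
func decodeInt32LE(raw []byte) int32 {
	return int32(binary.LittleEndian.Uint32(raw))
}

// decodeInt32BE interprets 4 raw bytes as a big-endian signed 32-bit integer
// (the layout of H5T_STD_I32BE). Hypothetical helper for illustration only.
func decodeInt32BE(raw []byte) int32 {
	return int32(binary.BigEndian.Uint32(raw))
}

func main() {
	raw := []byte{0x2A, 0x00, 0x00, 0x00}
	fmt.Println(decodeInt32LE(raw)) // 42
	fmt.Println(decodeInt32BE(raw)) // 704643072
}
```

The same four bytes decode to very different values, which is why the byte order is part of the HDF5 datatype rather than something the reader guesses.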
64-bit Signed Integer
HDF5 Types:
- H5T_STD_I64LE (little-endian)
- H5T_STD_I64BE (big-endian)
- H5T_NATIVE_LLONG (platform-native, 64-bit)
Go Type: int64
Example:
data, err := ds.Read() // Returns []float64
// Value conversion: int64 → float64
// Example: 9223372036854775807 (int64) becomes 9.223372036854776e+18 (float64)
Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Precision Note: When converting int64 to float64, integers whose magnitude exceeds $2^{53}$ (9,007,199,254,740,992) may lose precision because float64 has a 53-bit significand.
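To see where exactness ends, here is a small standalone sketch that round-trips values around $2^{53}$ through float64. It uses plain Go conversions rather than this library, and `roundTripsExactly` is a hypothetical helper name.

```go
package main

import "fmt"

// roundTripsExactly reports whether v survives int64 -> float64 -> int64 unchanged.
// Hypothetical helper for illustration; not part of the hdf5 package.
func roundTripsExactly(v int64) bool {
	f := float64(v)
	if f >= 1<<63 {
		// float64(v) rounded up to 2^63, which no longer fits in int64.
		return false
	}
	return int64(f) == v
}

func main() {
	const limit = int64(1) << 53 // 9,007,199,254,740,992
	for _, v := range []int64{limit - 1, limit, limit + 1, limit + 2} {
		fmt.Printf("%d -> %.0f exact=%v\n", v, float64(v), roundTripsExactly(v))
	}
}
```

Above $2^{53}$ only every second integer is representable, so `limit + 1` is rounded to `limit` while `limit + 2` still converts exactly.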
8-bit Signed Integer
HDF5 Types:
- H5T_STD_I8LE (little-endian)
- H5T_STD_I8BE (big-endian)
- H5T_NATIVE_INT8 (platform-native)
Go Type: int8
Range: -128 to 127
16-bit Signed Integer
HDF5 Types:
- H5T_STD_I16LE (little-endian)
- H5T_STD_I16BE (big-endian)
- H5T_NATIVE_INT16 (platform-native)
Go Type: int16
Range: -32,768 to 32,767
Unsigned Integers
Status: Fully supported (all widths)
HDF5 Types:
- H5T_STD_U8LE/BE (uint8)
- H5T_STD_U16LE/BE (uint16)
- H5T_STD_U32LE/BE (uint32)
- H5T_STD_U64LE/BE (uint64)
Go Conversion: Read as native unsigned integers, converted to float64 by Dataset.Read().
Precision Note: uint64 values above $2^{53}$ (9,007,199,254,740,992) may lose precision when converted to float64.
Floating-Point Types
32-bit Float (Single Precision)
HDF5 Types:
- H5T_IEEE_F32LE (little-endian)
- H5T_IEEE_F32BE (big-endian)
- H5T_NATIVE_FLOAT (platform-native)
Go Type: float32
Precision: ~7 decimal digits
Example:
data, err := ds.Read() // Returns []float64
// Value conversion: float32 → float64
// Example: 3.14159265f (float32) becomes 3.1415927410125732 (float64)
Range: ±1.18e-38 to ±3.40e+38
64-bit Float (Double Precision)
HDF5 Types:
- H5T_IEEE_F64LE (little-endian)
- H5T_IEEE_F64BE (big-endian)
- H5T_NATIVE_DOUBLE (platform-native)
Go Type: float64
Precision: ~15 decimal digits
Example:
data, err := ds.Read() // Returns []float64 (native)
// No conversion needed
// Example: 3.141592653589793 (float64) stays exact
Range: ±2.23e-308 to ±1.80e+308
Numeric Type Conversion Summary
| HDF5 Type | Size | Go Read Type | Conversion |
|---|---|---|---|
| H5T_STD_I8LE/BE | 1 byte | float64 | int8 → float64 |
| H5T_STD_I16LE/BE | 2 bytes | float64 | int16 → float64 |
| H5T_STD_I32LE/BE | 4 bytes | float64 | int32 → float64 |
| H5T_STD_I64LE/BE | 8 bytes | float64 | int64 → float64 * |
| H5T_STD_U8LE/BE | 1 byte | float64 | uint8 → float64 |
| H5T_STD_U16LE/BE | 2 bytes | float64 | uint16 → float64 |
| H5T_STD_U32LE/BE | 4 bytes | float64 | uint32 → float64 |
| H5T_STD_U64LE/BE | 8 bytes | float64 | uint64 → float64 * |
| H5T_IEEE_F32LE/BE | 4 bytes | float64 | float32 → float64 |
| H5T_IEEE_F64LE/BE | 8 bytes | float64 | No conversion |
* int64 and uint64 values with magnitude above $2^{53}$ may lose precision in float64.
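The widening described in the summary table can be pictured with a short generic function. This is only a sketch of the idea under the assumption of Go 1.18 or later. The `number` constraint and `toFloat64s` are hypothetical names, and the library's actual conversion code may look different.

```go
package main

import "fmt"

// number lists the numeric element types from the summary table.
// Hypothetical constraint for illustration only.
type number interface {
	~int8 | ~int16 | ~int32 | ~int64 |
		~uint8 | ~uint16 | ~uint32 | ~uint64 |
		~float32 | ~float64
}

// toFloat64s widens every element of in to float64.
// For 64-bit integers above 2^53 the result may be rounded.
func toFloat64s[T number](in []T) []float64 {
	out := make([]float64, len(in))
	for i, v := range in {
		out[i] = float64(v)
	}
	return out
}

func main() {
	fmt.Println(toFloat64s([]int32{-1, 42}))
	fmt.Println(toFloat64s([]uint8{0, 255}))
	fmt.Println(toFloat64s([]float32{0.5}))
}
```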
String Types
Fixed-Length Strings
HDF5 Type: H5T_STRING with fixed size
Padding Strategies:
- Null-terminated (C-style): "hello\0\0\0"
- Null-padded: "hello\0\0\0"
- Space-padded: "hello   "
Go Type: string
Automatic Handling: The library automatically strips padding.
Example:
// HDF5 file has fixed-length string dataset
strings, err := ds.ReadStrings() // Returns []string
// Padding is automatically removed:
// HDF5 bytes: "hello\0\0\0" → Go string: "hello"
// HDF5 bytes: "world   " → Go string: "world"
Python h5py equivalent:
# Creating fixed-length strings in Python
import h5py

with h5py.File('strings.h5', 'w') as f:
    # Fixed-length ASCII strings, 20 bytes each
    dt = h5py.string_dtype(encoding='ascii', length=20)
    f.create_dataset('names', data=[b'Alice', b'Bob'], dtype=dt)
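For readers curious what padding removal involves, here is a rough sketch over a raw fixed-width byte field. It is an assumption about the general technique, not the library's actual implementation, and `trimFixedString` is a hypothetical name. The sketch cuts at the first NUL byte, which covers null-terminated and null-padded fields, and then trims trailing spaces for space-padded fields.

```go
package main

import (
	"bytes"
	"fmt"
)

// trimFixedString converts a fixed-width HDF5 string field to a Go string.
// Hypothetical helper for illustration only.
func trimFixedString(raw []byte) string {
	// Null-terminated and null-padded: keep everything before the first NUL.
	if i := bytes.IndexByte(raw, 0); i >= 0 {
		raw = raw[:i]
	}
	// Space-padded: drop trailing spaces.
	return string(bytes.TrimRight(raw, " "))
}

func main() {
	fmt.Printf("%q\n", trimFixedString([]byte("hello\x00\x00\x00"))) // "hello"
	fmt.Printf("%q\n", trimFixedString([]byte("world   ")))          // "world"
	fmt.Printf("%q\n", trimFixedString([]byte("abcdefgh")))          // "abcdefgh"
}
```

One consequence of this approach is that a value that genuinely ends in spaces cannot be told apart from a space-padded one, which is inherent to space padding itself.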
Variable-Length Strings
HDF5 Type: H5T_STRING with variable size
Storage: Global Heap (separate area in HDF5 file)
Go Type: string
Example:
// HDF5 file has variable-length string dataset
strings, err := ds.ReadStrings() // Returns []string
// Strings can have different lengths:
// ["short", "a much longer string", "x"]
Python h5py equivalent:
import h5py

with h5py.File('vlen_strings.h5', 'w') as f:
    # Variable-length strings
    dt = h5py.string_dtype(encoding='utf-8')
    f.create_dataset('messages', data=["Hello", "World!"], dtype=dt)
Character Sets
| Encoding | Status | Notes |
|---|---|---|
| ASCII | Full | Standard ASCII (0-127) |
| UTF-8 | Full | Unicode support |
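Because Go strings are plain byte sequences, it can be useful to check which character set a value read from a file actually conforms to. The following is a minimal sketch using the standard `unicode/utf8` package. `isASCII` and `isValidUTF8` are hypothetical helper names, not library functions.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isASCII reports whether every byte of s is in the 7-bit range 0-127.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > 127 {
			return false
		}
	}
	return true
}

// isValidUTF8 reports whether s is a well-formed UTF-8 byte sequence.
func isValidUTF8(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	for _, s := range []string{"Lab A", "25.3°C", "\xff\xfe"} {
		fmt.Printf("%q ascii=%v utf8=%v\n", s, isASCII(s), isValidUTF8(s))
	}
}
```

Every ASCII string is also valid UTF-8, so a failed `isValidUTF8` check suggests the data was written with some other encoding.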
Compound Types
Compound types are struct-like data with named fields (similar to C structs or Go structs).
Basic Compound Type
HDF5 Type: H5T_COMPOUND
Go Type: map[string]interface{}
Example HDF5 Structure:
Compound Type:
- "temperature" : float64
- "humidity" : float64
- "location" : string (fixed-length, 20 bytes)
Reading Compound Data:
compounds, err := ds.ReadCompound() // Returns []map[string]interface{}
if err != nil {
	log.Fatal(err)
}
for i, record := range compounds {
	fmt.Printf("Record %d:\n", i)
	// Access fields by name (an unchecked assertion panics on a type mismatch)
	temp := record["temperature"].(float64)
	humid := record["humidity"].(float64)
	loc := record["location"].(string)
	fmt.Printf("  Temperature: %.1f°C\n", temp)
	fmt.Printf("  Humidity: %.1f%%\n", humid)
	fmt.Printf("  Location: %s\n", loc)
}
Output:
Record 0:
  Temperature: 25.3°C
  Humidity: 65.2%
  Location: Lab A
Record 1:
  Temperature: 26.1°C
  Humidity: 63.8%
  Location: Lab B
Nested Compound Types
Compound types can contain other compound types:
HDF5 Structure:
Compound Type "Measurement":
- "timestamp" : int64
- "sensor" : Compound {
    - "id" : int32
    - "name" : string
  }
- "value" : float64
Reading Nested Compounds:
compounds, err := ds.ReadCompound()
if err != nil {
	log.Fatal(err)
}
for _, record := range compounds {
	timestamp := record["timestamp"].(int64)
	value := record["value"].(float64)
	// Nested compound
	sensor := record["sensor"].(map[string]interface{})
	sensorID := sensor["id"].(int32)
	sensorName := sensor["name"].(string)
	fmt.Printf("Sensor %d (%s) at %d: %.2f\n",
		sensorID, sensorName, timestamp, value)
}
Compound Type with Arrays
HDF5 Structure:
Compound Type:
- "name" : string
- "scores" : array of 5 × float64
Status: Fully supported (including array fields within compounds).
Creating Compounds in Python
For testing or reference:
import h5py
import numpy as np

# Define compound datatype
dt = np.dtype([
    ('temperature', 'f8'),  # float64
    ('humidity', 'f8'),     # float64
    ('location', 'S20')     # fixed-length string (20 bytes)
])

# Create data
data = np.array([
    (25.3, 65.2, b'Lab A'),
    (26.1, 63.8, b'Lab B'),
    (24.8, 67.5, b'Lab C')
], dtype=dt)

# Write to HDF5
with h5py.File('compounds.h5', 'w') as f:
    f.create_dataset('measurements', data=data)
Type Conversion Rules
Automatic Conversions
The library performs these conversions automatically:
| From (HDF5) | To (Go) | Information Loss? |
|---|---|---|
| int32 | float64 | No (exact) |
| int64 | float64 | Yes (magnitude > $2^{53}$) |
| float32 | float64 | No (promoted) |
| float64 | float64 | No (exact) |
| fixed string | string | No (padding removed) |
| variable string | string | No (exact) |
Precision Considerations
Integer to Float Conversion
Safe Range (no precision loss):
- int32: All values (maximum magnitude $2^{31}$, far below $2^{53}$)
- int64: -$2^{53}$ to $2^{53}$ (±9,007,199,254,740,992)
Example of Precision Loss:
// int64 value in HDF5: 9223372036854775807 (2^63 - 1)
// Converted to float64: 9223372036854775808 (2^63, rounded up by 1)
// Just below 2^63, adjacent float64 values are 1024 apart,
// so other large values can be off by up to 512
// For most scientific data, this is acceptable
// If exact large integers are needed, wait for v1.0.0 (direct int64 support)
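When a dataset is known to hold integers, the `[]float64` result can be converted back with a guard for the safe range. The sketch below is a standalone illustration under that assumption. `toInt64s` is a hypothetical helper, not a function provided by this library. It rejects values at or beyond $2^{53}$ because a float64 equal to exactly $2^{53}$ could have started out as either $2^{53}$ or $2^{53}+1$.

```go
package main

import (
	"fmt"
	"math"
)

// exactLimit is 2^53; integers with smaller magnitude convert to float64 exactly.
const exactLimit = 1 << 53

// toInt64s converts float64 values back to int64, failing if any element
// is fractional, NaN, infinite, or outside the exact range.
// Hypothetical helper for illustration only.
func toInt64s(data []float64) ([]int64, error) {
	out := make([]int64, len(data))
	for i, f := range data {
		if f != math.Trunc(f) { // also true for NaN
			return nil, fmt.Errorf("element %d (%v) is not an integer", i, f)
		}
		if math.Abs(f) >= exactLimit { // also true for +Inf and -Inf
			return nil, fmt.Errorf("element %d (%v) may have lost precision", i, f)
		}
		out[i] = int64(f)
	}
	return out, nil
}

func main() {
	fmt.Println(toInt64s([]float64{1, -2, 3}))
	fmt.Println(toInt64s([]float64{9007199254740992}))
}
```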
Float32 to Float64 Conversion
Float32 values are promoted to float64 without precision loss, but the decimal representation changes:
// float32 in HDF5: 3.14159265f (stored as 0x40490FDB)
// Converted to float64: 3.1415927410125732 (0x400921FB60000000)
// Digits beyond roughly the 7th significant digit are conversion artifacts, not real data!
// For display, round appropriately:
fmt.Printf("%.6f\n", value) // 3.141593 (6 decimal places)
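An alternative to picking a fixed number of decimals is to let `strconv.FormatFloat` choose the shortest representation that still identifies the original float32, by passing a bit size of 32. The sketch below assumes the value is known to have come from a float32 dataset. `formatAsFloat32` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatAsFloat32 prints v with the fewest digits that uniquely identify
// the float32 it was promoted from. Hypothetical helper for illustration.
func formatAsFloat32(v float64) string {
	return strconv.FormatFloat(v, 'g', -1, 32)
}

func main() {
	promoted := float64(float32(3.14159265)) // what a float32 dataset yields after promotion
	fmt.Println(promoted)                    // full float64 digits, including artifacts
	fmt.Println(formatAsFloat32(promoted))   // digits meaningful at float32 precision

	// Promotion itself is lossless: narrowing back gives the same float32.
	fmt.Println(float32(promoted) == float32(3.14159265))
}
```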
Best Practices
1. Check Dataset Type Before Reading
info, err := ds.Info()
if err == nil {
	fmt.Println(info) // Shows datatype
	// Choose appropriate read method
	if strings.Contains(info, "string") {
		strs, _ := ds.ReadStrings() // named strs to avoid shadowing the strings package
		// ...
	} else if strings.Contains(info, "compound") {
		compounds, _ := ds.ReadCompound()
		// ...
	} else {
		data, _ := ds.Read() // Numeric
		// ...
	}
}
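The dispatch decision can also be pulled into a small function that is easy to unit-test on its own. This sketch assumes, as the snippet above does, that the info text mentions "string" or "compound" for those classes. The exact format of the info text is not specified here, and `readKind` and `chooseReader` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// readKind names the read method to use. Hypothetical type for illustration.
type readKind int

const (
	readNumeric readKind = iota
	readStrings
	readCompound
)

// chooseReader picks a read method from a dataset's info text.
// Compound is checked first because a compound description may
// itself mention string fields.
func chooseReader(info string) readKind {
	lower := strings.ToLower(info)
	switch {
	case strings.Contains(lower, "compound"):
		return readCompound
	case strings.Contains(lower, "string"):
		return readStrings
	default:
		return readNumeric
	}
}

func main() {
	fmt.Println(chooseReader("Datatype: int32") == readNumeric)
	fmt.Println(chooseReader("Datatype: string") == readStrings)
	fmt.Println(chooseReader("Datatype: compound") == readCompound)
}
```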
2. Handle Type Assertions Safely
for _, attr := range attrs {
	value, err := attr.ReadValue()
	if err != nil {
		fmt.Printf("error reading %s: %v\n", attr.Name, err)
		continue
	}
	switch v := value.(type) {
	case int32:
		fmt.Printf("int32: %d\n", v)
	case int64:
		fmt.Printf("int64: %d\n", v)
	case float64:
		fmt.Printf("float64: %.6f\n", v)
	case string:
		fmt.Printf("string: %q\n", v)
	default:
		fmt.Printf("unknown type: %T\n", v)
	}
}
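The same caution applies to compound records, where each field is an `interface{}` and an unchecked assertion panics on a mismatch. One possible approach, sketched here with Go 1.18+ generics, is a small accessor that returns an error instead. `field` is a hypothetical helper and not part of this library.

```go
package main

import "fmt"

// field returns record[name] as type T, or an error if the field is
// missing or holds a different type. Hypothetical helper for illustration.
func field[T any](record map[string]interface{}, name string) (T, error) {
	var zero T
	raw, ok := record[name]
	if !ok {
		return zero, fmt.Errorf("field %q not found", name)
	}
	v, ok := raw.(T)
	if !ok {
		return zero, fmt.Errorf("field %q has type %T, not %T", name, raw, zero)
	}
	return v, nil
}

func main() {
	record := map[string]interface{}{"temperature": 25.3, "location": "Lab A"}

	temp, err := field[float64](record, "temperature")
	fmt.Println(temp, err)

	_, err = field[int32](record, "location")
	fmt.Println(err)
}
```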
3. Document Precision Requirements
If your application requires exact integer values beyond ±$2^{53}$:
// Check if dataset contains large integers
info, _ := ds.Info()
if strings.Contains(info, "int64") {
	log.Println("Warning: int64 dataset may lose precision when converted to float64")
	log.Println("Safe range: -2^53 to 2^53 (±9,007,199,254,740,992)")
	log.Println("For exact int64 values, wait for v1.0.0")
}
4. Use Compound Types for Structured Data
Instead of separate datasets:
/measurement_temperature
/measurement_humidity
/measurement_location
Use compound types:
/measurements (compound with temperature, humidity, location fields)
Benefits:
- Keeps related data together
- More efficient storage
- Easier to maintain consistency
5. Create Test Files in Pure Go
Generate test files using this library (no Python needed!):
package main

import (
	"log"

	"github.com/scigolib/hdf5"
)

func main() {
	// Create test file
	fw, err := hdf5.CreateForWrite("test_types.h5", hdf5.CreateTruncate)
	if err != nil {
		log.Fatal(err)
	}
	defer fw.Close()

	// Test all supported types
	fw.CreateDataset("/int32", hdf5.Int32, []uint64{3})
	fw.CreateDataset("/int64", hdf5.Int64, []uint64{3})
	fw.CreateDataset("/float32", hdf5.Float32, []uint64{3})
	fw.CreateDataset("/float64", hdf5.Float64, []uint64{3})

	// Fixed-length strings
	fw.CreateDataset("/strings_fixed", hdf5.StringFixed(10), []uint64{2})

	// Arrays
	fw.CreateDataset("/arrays", hdf5.ArrayFloat32, []uint64{10},
		hdf5.WithArrayDims([]uint64{3, 3}))

	// Enums
	fw.CreateDataset("/status", hdf5.EnumInt8, []uint64{5},
		hdf5.WithEnumValues([]string{"OK", "ERROR"}, []int64{0, 1}))

	// Write data, verify with h5dump!
}
Pure Go workflow: write with this library, then validate with h5dump or Python h5py.
Next Steps
- Reading Data Guide - How to use these types in practice
- Troubleshooting - Common type-related issues
- Examples - Code examples with different datatypes
Last Updated: 2025-11-13