Compiler API Reference

June 26, 2026

Complete API reference for the SC-NeuroCore compilation pipeline. Covers the ODE-to-Verilog equation compiler, MLIR/CIRCT emitter, weight quantiser, adaptive precision, IR type checker, static analysis, and deployment orchestrator. This is the authoritative reference for all compiler-facing Python functions.

The root package boundary is defined in Compiler Surface Policy. That page states which sc_neurocore.compiler modules are package-facade exports, direct public modules, compatibility facades, or internal build tools.

1. Mathematical Formalism

1.1 ODE Discretisation

The compiler transforms continuous ODEs to discrete fixed-point computations using the forward Euler method:

x[n+1] = x[n] + \Delta t \cdot f(x[n], I[n])

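For a leaky integrator with $f = (I - (x - x_{\text{rest}}))/\tau$ and $\tau$ expressed in time steps ($\Delta t = 1$), the Q$m$.$f$ datapath replaces division by $\tau$ with an arithmetic right shift of $\lceil\log_2 \tau\rceil$ bits, which rounds $\tau$ up to the next power of two: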

x_{\text{next}} = x + \frac{I - (x - x_{\text{rest}})}{2^{\lceil\log_2 \tau\rceil}}
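
The shift-based update can be tried in plain integer arithmetic. The following is a minimal sketch, not the compiler's emitted logic: `to_fixed` and `euler_shift_step` are hypothetical helper names, and Q8.8 with $\tau = 10$ is chosen only for illustration.

```python
import math


def to_fixed(value: float, fraction: int) -> int:
    """Encode a real value as an integer code with `fraction` fractional bits."""
    return round(value * (1 << fraction))


def euler_shift_step(x: int, current: int, x_rest: int, tau: float) -> int:
    """One leaky-integrator update; division by tau becomes a right shift."""
    shift = math.ceil(math.log2(tau))  # tau = 10 -> shift = 4, i.e. divide by 16
    # Python's >> on ints is an arithmetic (sign-preserving) shift.
    return x + ((current - (x - x_rest)) >> shift)


FRACTION = 8  # Q8.8 (illustrative choice)
x = to_fixed(-65.0, FRACTION)        # -16640
x_rest = to_fixed(-65.0, FRACTION)   # -16640
current = to_fixed(16.0, FRACTION)   # 4096
x_next = euler_shift_step(x, current, x_rest, tau=10)
print(x_next / (1 << FRACTION))      # -64.0
```

Because the shift divides by 16 rather than 10, the state moves by 1.0 instead of 1.6 in this step; the effective time constant is the next power of two above $\tau$.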

1.2 Fixed-Point Encoding

Parameters and states are encoded in signed Q$m$.$f$ format:

Q(v) = \text{round}(v \cdot 2^f)

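Here $m$ counts the integer bits (including the sign bit) and $f$ counts the fractional bits. The representable range is $[-2^{m-1}, 2^{m-1} - 2^{-f}]$ with precision $2^{-f}$.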

Format	Total Bits	Integer Bits	Fraction Bits	Range (approx.)	Precision
Q8.8	16	8	8	±127	0.0039
Q16.16	32	16	16	±32767	0.000015
Q12.20	32	12	20	±2047	0.00000095
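
The encoding and its range limits can be illustrated with a few lines of integer arithmetic. This is a sketch under stated assumptions: `q_encode` and `q_decode` are hypothetical names, clamping out-of-range values to the format bounds is an assumed saturation policy, and Python's built-in `round` (which rounds exact ties to even) stands in for whichever rounding rule the quantiser applies.

```python
def q_encode(value: float, m: int, f: int) -> int:
    """Encode `value` as a signed Qm.f integer code, saturating at the bounds."""
    code = round(value * (1 << f))          # Q(v) = round(v * 2^f)
    lo = -(1 << (m + f - 1))                # -2^(m-1) expressed in codes
    hi = (1 << (m + f - 1)) - 1             # 2^(m-1) - 2^-f expressed in codes
    return max(lo, min(hi, code))


def q_decode(code: int, f: int) -> float:
    """Convert an integer code back to its real value."""
    return code / (1 << f)


print(q_encode(1.0, 8, 8))                  # 256
print(q_encode(-0.5, 8, 8))                 # -128
print(q_decode(q_encode(200.0, 8, 8), 8))   # 127.99609375 (saturated)
```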

1.3 Guard Bit Computation

Guard bits prevent intermediate overflow during multiply-accumulate:

G = \lceil \log_2(N_{\text{terms}}) \rceil

where $N_{\text{terms}}$ is the maximum number of additions in the datapath. The data width is extended to $W + G$ bits for intermediates, then saturated back to $W$ bits for the final result.
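
As a worked illustration of the guard-bit rule, the sketch below sums five worst-case 16-bit terms in a widened accumulator and then saturates the result. The helper names are hypothetical, and the integer expression `(n - 1).bit_length()` is used as an exact equivalent of $\lceil \log_2 n \rceil$ for $n \ge 1$.

```python
def guard_bits(n_terms: int) -> int:
    """ceil(log2(n_terms)) computed exactly with integer arithmetic."""
    return (n_terms - 1).bit_length()


def saturate(value: int, width: int) -> int:
    """Clamp `value` to the signed two's-complement range of `width` bits."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    return max(lo, min(hi, value))


W = 16
terms = [32767] * 5                 # five worst-case positive 16-bit terms
G = guard_bits(len(terms))          # 3
wide = sum(terms)                   # 163835, fits in W + G = 19 signed bits
result = saturate(wide, W)          # 32767 after saturating back to W bits
print(G, wide, result)              # 3 163835 32767
```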

1.4 Piecewise LUT Approximation

Transcendental functions ($\exp$, $\log$, $\tanh$, etc.) use 16-entry piecewise-constant lookup tables covering $[-8, +8)$:

f_{\text{LUT}}(x) = \text{table}\left[\left\lfloor \frac{x + 8}{1} \right\rfloor\right]

Accuracy: maximum error of roughly 1–2% over the useful range for neuron dynamics; per-function figures are listed in 7.3 LUT Accuracy.
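
The table indexing can be sketched as follows. This is an illustration rather than the emitted hardware: `build_lut` and `lut_eval` are hypothetical names, sampling each bin at its left edge is an assumption (the text does not say where within a bin the table value is taken), and clamping out-of-range inputs to the end entries is likewise assumed.

```python
import math


def build_lut(func, lo: float = -8.0, entries: int = 16, step: float = 1.0):
    """Sample `func` once per bin (left edge; an assumption of this sketch)."""
    return [func(lo + i * step) for i in range(entries)]


def lut_eval(table, x: float, lo: float = -8.0, step: float = 1.0) -> float:
    """Piecewise-constant lookup: index = floor((x - lo) / step), clamped."""
    index = math.floor((x - lo) / step)
    index = max(0, min(len(table) - 1, index))
    return table[index]


tanh_table = build_lut(math.tanh)
y = lut_eval(tanh_table, 2.7)       # falls in bin 10, which holds tanh(2.0)
print(y == math.tanh(2.0))          # True
```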

2. Architecture

2.1 Compilation Pipeline

flowchart TB
    subgraph Input
        A["ODE string<br/>'dv/dt = -(v-E_L)/tau + I/C'"]
    end
    subgraph Parse
        B["Python AST parser"]
        C["_VerilogExprEmitter"]
    end
    subgraph Emit
        D["Q-format parameters"]
        E["Multiply pipelines"]
        F["LUT for exp/log/tanh"]
        G["Saturating next-state"]
        H["Threshold + reset logic"]
    end
    subgraph Output
        I["Synthesizable Verilog"]
        J["Testbench"]
    end

    A --> B --> C
    C --> D & E & F & G & H
    D & E & F & G & H --> I
    I --> J

    style Input fill:#e1f5fe
    style Output fill:#e8f5e9

2.2 Module Dependency Graph

sc_neurocore.compiler
├── equation_compiler   # ODE → Verilog
├── pipeline            # Yosys → nextpnr → bitstream
├── mlir_emitter        # MLIR/CIRCT backend
├── quantizer           # Float → Q-format
├── adaptive_precision  # Dynamic width switching
├── ir_type_checker     # Stochastic IR validation
├── static_analysis     # Guard bits, SVA, power
└── deployment          # Constraints, drivers, multi-target

3. CLI Interface

3.1 Main Compilation Command

sc-neurocore compile "dv/dt = -(v-E_L)/tau_m + I/C" \
    --threshold "v > -50" --reset "v = -65" \
    --params "E_L=-65,tau_m=10,C=1" --init "v=-65" \
    --target ice40 --testbench --synthesize -o build/

3.2 CLI Flags

Flag	Default	Description
`--threshold`	None	Spike condition (e.g. `"v > -50"`)
`--reset`	None	Reset expression (e.g. `"v = -65; w = 0"`)
`--params`	None	Comma-separated `key=val` pairs
`--init`	None	Initial state `key=val` pairs
`--target`	`ice40`	FPGA target (`ice40`, `ecp5`, `artix7`, `zynq`)
`--module-name`	`sc_equation_neuron`	Generated Verilog module name
`--testbench`	off	Generate simulation testbench
`--synthesize`	off	Run Yosys synthesis (requires Yosys in PATH)
`-o` / `--output`	`build`	Output directory

3.3 NIR Network Compilation

sc-neurocore compile-nir model.nir --target artix7 -o build/

3.4 Multi-Target Compilation

sc-neurocore compile "dv/dt = -(v)/tau + I" \
    --target artix7,ecp5,ice40 --compare -o build/

4. Python API

4.1 Equation Compiler

from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.equation_compiler import compile_to_verilog

neuron = from_equations(
    "dv/dt = -(v - E_L)/tau_m + I/C",
    threshold="v > -50",
    reset="v = -65",
    params=dict(E_L=-65, tau_m=10, C=1),
    init=dict(v=-65),
)

verilog = compile_to_verilog(
    neuron,
    module_name="sc_lif",
    data_width=16,
    fraction=8,
)

Supported Functions

Category	Functions
Transcendental	`exp`, `log`, `sqrt`, `tanh`, `sigmoid`, `sin`, `cos`
Arithmetic	`abs`, `clip(x, lo, hi)`, `max(a, b)`, `min(a, b)`
Polynomial	`x2` through `x8`
Operators	`+`, `-`, `*`, `/` (by constant), unary `-`
Comparison	`>`, `>=`, `<`, `<=`

4.2 MLIR Emitter

from sc_neurocore.compiler import MLIREmitter, generate_mlir_bundle

emitter = MLIREmitter("sc_native_top")
lhs = emitter.emit_lfsr(8, 0x5A)
rhs = emitter.emit_lfsr(8, 0xC3)
emitter.emit_and(lhs, rhs)

bundle = generate_mlir_bundle(emitter, "build/mlir/sc_native_top")
print(bundle.mlir_path)
print(bundle.manifest_path)

The manifest records operation counts and whether firtool is available.

4.3 Weight Quantizer

import numpy as np

from sc_neurocore.compiler.quantizer import (
    PrecisionEnvelopeReport,
    PrecisionTrapReport,
    QFormatMixed,
    compile_dense_block_floating,
    compile_dense_mixed_precision,
    dequantize_weights,
    quantize_weights,
)

weights = np.array([0.5, -0.3, 1.2, 0.0], dtype=np.float64)

# Canonical fixed-point Q8.8 path: returns the integer tensor only.
q_weights = quantize_weights(weights, fmt="Q8.8", rounding="nearest")
restored = dequantize_weights(q_weights, fmt="Q8.8")

# Mixed hardware path: Q8.8 stored weights with Q16.16 accumulation metadata.
mixed = QFormatMixed()
q_mixed, tensor_scale = quantize_weights(weights, fmt=mixed)
restored_mixed = dequantize_weights(q_mixed, fmt=mixed, scale=tensor_scale)

QFormatMixed defaults to Q8.8 weights, a Q16.16 accumulator, nearest rounding, and per-tensor scale maximisation. Its accumulator format must be at least as wide as the weight format, preserve the weight fractional precision, and cover the full weight dynamic range. The returned tensor_scale is deterministic metadata for reconstructing values and for hardware emitters that need the scale alongside compact stored weights.

Dense deployment can compile a two-dimensional weight matrix into the same bit-true Q8.8-weight/Q16.16-accumulator contract used by the Rust and HDL reference paths:

compiled = compile_dense_mixed_precision(weights, fmt=QFormatMixed())
outputs_q1616, overflow = compiled.forward_with_overflow(inputs)
outputs = compiled.forward_float(inputs)
trap_report: PrecisionTrapReport = compiled.precision_trap_report(inputs)
envelope_report: PrecisionEnvelopeReport = compiled.precision_envelope_report(inputs)
manifest = compiled.manifest()

The mixed-dense HDL reference exposes the same lane-level overflow contract as the Python overflow mask and Rust overflow_count: overflow_vector[i] identifies output channel i, while the aggregate overflow line is asserted when any lane saturates.

The block-floating dense HDL reference uses the same lane convention, with overflow_vector[i] identifying the output channel that saturated after the shared-exponent product shift and Q16.16 accumulation. Both dense HDL references also export abs_bounds_q1616[i], an unsigned 64-bit conservative absolute Q16.16 bound for output channel i. This mirrors Python PrecisionEnvelopeReport.abs_bound_codes and Rust MixedDenseResult.abs_bounds_q1616, including cancellation cases where the realised output is small but the absolute product envelope is large.

PrecisionEnvelopeReport.manifest() also exposes the signed fixed-point width proof used by the Python and Rust deployment surfaces:

Field	Meaning
`proof_kind`	Fixed string `signed_symmetric_fixed_point_width` for this contract.
`required_total_bits`	Sign bit plus the bit length required by the largest conservative absolute Q16.16 bound.
`required_integer_bits`	`required_total_bits - 16`, clamped to at least one signed integer bit for Q16.16 reporting.
`width_headroom_bits`	`32 - required_total_bits`; negative values mean Q16.16 saturation is required.
`saturation_required`	True when the conservative bound cannot fit in signed 32-bit Q16.16.
`static_overflow_proven_safe`	Alias of the conservative overflow proof used by safety-gate callers.

These fields are static envelope claims over absolute product magnitudes. They do not rely on cancellation in the realised dot product, so a small output code does not weaken the predeployment overflow proof. The quantizer delegates these manifest fields to sc_neurocore.compiler.static_analysis.prove_fixed_point_envelope(), so the standalone static-analysis API and dense deployment reports share one Python proof authority.
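
The width-proof fields in the table can be reproduced with integer arithmetic. The sketch below is a hypothetical re-derivation from the field descriptions, not the library's `prove_fixed_point_envelope()` implementation; the function name `width_proof` and the dictionary return type are assumptions.

```python
def width_proof(abs_bound_codes, total_bits: int = 32, fractional_bits: int = 16):
    """Derive the signed-width fields from conservative absolute bound codes."""
    largest = max(abs_bound_codes)
    required_total_bits = 1 + largest.bit_length()      # sign bit + magnitude bits
    required_integer_bits = max(1, required_total_bits - fractional_bits)
    width_headroom_bits = total_bits - required_total_bits
    return {
        "required_total_bits": required_total_bits,
        "required_integer_bits": required_integer_bits,
        "width_headroom_bits": width_headroom_bits,
        "saturation_required": width_headroom_bits < 0,
    }


proof = width_proof([531_400])
print(proof["required_total_bits"], proof["width_headroom_bits"])   # 21 11
```

A bound code of 531400 needs 20 magnitude bits plus a sign bit, leaving 11 bits of headroom in a 32-bit Q16.16 word.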

Live-Control Parameter Banks

The live-control schema decouples long-lived parameters from static logic fabric. ParameterBankSpec describes writable Q-format or block-floating entries in BRAM/distributed RAM, including byte span, entry addresses, and raw encoded-word bounds. MMIOUpdateSpec adds a deterministic AXI4-Lite/PCIe control window with fixed registers for bank select, entry select, write-data low/high words, status, trap status, and trap clear.

Host code uses build_update_sequence(...) to stage a bank/index/value update guarded by a deterministic CRC32 checksum. A mismatch is rejected through a sticky checksum_mismatch trap; a valid update is loaded into a shadow bank and then applied explicitly, so operators can update weights or Kuramoto phase-coupling parameters without resynthesising the bitstream.

Successful shadow loads latch the bank and entry identity at load time. Apply and rollback use that latched identity rather than the mutable selection registers, so a later bank_select or entry_index write cannot retarget an in-flight transaction. The generated bus surface requires full-word writes; a partial write strobe is rejected with a sticky partial_write trap before any control or staged-data register is modified.

The status map exposes ready, busy, update_ack, trap_latched, shadow_loaded, applied, rollback_ack, checksum_valid, and the sticky trap bits checksum_mismatch, invalid_selection, read_only_bank, and partial_write.

Generated parameter-bank RTL reserves deterministic trap lanes for staged overflow, staged underflow, checksum mismatch, invalid bank/entry selection, and read-only bank or partial-write rejection before shadow loading. The trap vector latches and the shadow bank is not modified if any of the following holds: the host payload cannot be represented as either a zero-extended raw word or a valid signed extension for the selected bank width; the CRC32 guard does not match the staged payload; or the selected bank/index pair is not writable.

Trap clearing is a separate two-write sequence that records the intended flag width before asserting the clear command, preserving deterministic host intervention semantics.

sc_neurocore.hdl_gen.bus_interface.generate_live_parameter_bank(...) consumes the same manifest and emits the corresponding AXI4-Lite parameter-bank RTL. The generated RTL includes active/shadow memories, checksum-gated shadow loading, traps for staged-range, CRC32-mismatch, invalid-selection, read-only-bank, and partial-write conditions, explicit apply and rollback, and active-only parameter_words, so the Python control schema and hardware register map remain one contract.

forward_with_overflow returns saturated accumulator-format integer codes and per-output overflow flags. In canonical scale_per_tensor=False mode the division from Q8.8×Q16.16 products to Q16.16 outputs uses the same signed arithmetic shift as the hardware reference. With per-tensor scaling enabled, the host path carries tensor_scale in the manifest so deployment code can reconstruct compact stored weights without silently changing the physical output scale.
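
The arithmetic behind the unscaled path can be sketched in pure Python. This is an illustrative model, not the library's code: `dense_forward_q` is a hypothetical name, and performing the shift once after accumulation (rather than per product) is an assumption. A Q8.8 weight times a Q16.16 input carries 24 fractional bits, so an arithmetic right shift by 8 returns the sum to Q16.16 before saturation to 32 bits.

```python
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def dense_forward_q(weights_q88, inputs_q1616):
    """Return saturated Q16.16 output codes and per-output overflow flags."""
    outputs, overflow = [], []
    for row in weights_q88:
        acc = sum(w * x for w, x in zip(row, inputs_q1616))  # 24 fractional bits
        code = acc >> 8                                      # back to 16 fractional bits
        saturated = max(INT32_MIN, min(INT32_MAX, code))
        outputs.append(saturated)
        overflow.append(saturated != code)                   # True when clamped
    return outputs, overflow


weights = [[128, -64], [32767, 32767]]        # Q8.8 codes: [0.5, -0.25], [~128, ~128]
out_small, ovf_small = dense_forward_q(weights, [2 << 16, 1 << 16])
out_large, ovf_large = dense_forward_q(weights, [300 << 16, 300 << 16])
print(out_small, ovf_small)   # [49152, 25165056] [False, False]
print(out_large, ovf_large)   # [4915200, 2147483647] [False, True]
```

In the second call the lower row exceeds the signed 32-bit range, so its code clamps to 2147483647 and only that lane's flag is set.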

precision_trap_report packages the same saturated output codes and overflow mask into deterministic telemetry for host validation and HDL trap registers. The report manifest includes output_format, output_count, overflow_count, underflow_count, saturated_min_count, saturated_max_count, has_overflow, and has_underflow. Overflow means the realised output saturated at the configured Q-format bound. Underflow means a nonzero fixed-point product or BFP output collapsed below one output-code LSB and therefore produced a zero code that remains visible to safety review.

precision_envelope_report adds conservative predeployment range evidence. It returns realised output codes, realised overflow and underflow flags, per-output absolute bound codes, and a manifest containing observed_overflow_free, observed_underflow_free, conservative_overflow_free, max_abs_output_code, max_abs_bound_code, and min_headroom_code.
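
The difference between a realised output and its conservative bound shows up clearly in a cancellation case. The sketch below is a hypothetical model of the idea, not the library's report: the function names are illustrative, rounding the bound up (ceiling) is an assumption made to keep it conservative, and computing headroom as the distance from the signed 32-bit maximum is an assumed reading of min_headroom_code.

```python
INT32_MAX = (1 << 31) - 1


def realised_codes(weights_q88, inputs_q1616):
    """Signed dot products shifted back to Q16.16 (no saturation applied here)."""
    return [sum(w * x for w, x in zip(row, inputs_q1616)) >> 8 for row in weights_q88]


def abs_bound_codes(weights_q88, inputs_q1616):
    """Per-output bound on |output| from absolute products, rounded up."""
    return [
        (sum(abs(w) * abs(x) for w, x in zip(row, inputs_q1616)) + 255) >> 8
        for row in weights_q88
    ]


weights = [[32767, -32767]]                  # equal and opposite Q8.8 weights
inputs = [300 << 16, 300 << 16]              # two equal Q16.16 inputs
realised = realised_codes(weights, inputs)   # products cancel exactly
bounds = abs_bound_codes(weights, inputs)
conservative_overflow_free = max(bounds) <= INT32_MAX
min_headroom_code = INT32_MAX - max(bounds)
print(realised, bounds, conservative_overflow_free)   # [0] [5033011200] False
```

The realised output is zero, yet the absolute-product envelope exceeds the Q16.16 range, so a static proof based on the envelope does not pass for this input set.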

Block-floating dense deployment uses shared-exponent weight blocks with Q16.16 inputs and outputs:

compiled_bfp = compile_dense_block_floating(weights, fmt="BFP16E3X32")
outputs_q1616, overflow = compiled_bfp.forward_with_overflow(inputs)
outputs = compiled_bfp.forward_float(inputs)
trap_report = compiled_bfp.precision_trap_report(inputs)
envelope_report = compiled_bfp.precision_envelope_report(inputs)

BFP16E3X32 stores 16-bit signed mantissas and one 3-bit biased exponent per 32-weight block. The exponent range is the full encoded biased range: for three exponent bits, the unbiased range is [-3, +4]. The Python deployment path preserves the shared exponent metadata, saturates final Q16.16 output codes, and exposes overflow and sub-LSB underflow flags for hardware telemetry parity.

Compiler manifests record the exact exponent bias (3 for BFP16E3X32), encoded exponent range [0, 7], maximum signed mantissa magnitude 32767, minimum quantum 0.125, maximum absolute value 524272.0, and the contiguous flattened block-alignment rule required by downstream RTL emitters. When the parameter count is known, manifests also carry an exact block_exponent_layout with parameter_count, block_size, exponent_count, last_block_size, and the exponent-index formula. The Python and Rust BFP surfaces reject mismatched exponent-vector lengths before accumulation, preventing an emitter from silently applying a shared exponent to the wrong parameter block.

The maintained comparison benchmark also exercises a seeded BFP16E3X2 edge-sweep contract: exponent codes [0, 7, 0, 7] must produce exact safe Q16.16 codes [1056736, -1069024] with zero overflow/underflow, while a max-exponent saturating payload must raise one overflow trap and clamp to 2147483647 rather than wrapping.
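
The manifest constants follow directly from the bias and mantissa width, as the sketch below shows. It is illustrative only: the function names are hypothetical, the exponent-selection rule in `encode_block` (smallest exponent code whose range covers the block peak) is an assumption, and `i // block_size` is an assumed form of the exponent-index formula suggested by the contiguous flattened block rule.

```python
BIAS = 3            # exponent bias for three exponent bits
MAX_CODE = 7        # largest encoded exponent
MANT_MAX = 32767    # largest signed 16-bit mantissa magnitude


def bfp_decode(mantissa: int, exp_code: int) -> float:
    """value = mantissa * 2^(exp_code - BIAS)."""
    return mantissa * 2.0 ** (exp_code - BIAS)


def encode_block(values):
    """Pick one shared exponent for the block, then quantise each mantissa."""
    peak = max(abs(v) for v in values)
    code = 0
    while code < MAX_CODE and peak > MANT_MAX * 2.0 ** (code - BIAS):
        code += 1
    quantum = 2.0 ** (code - BIAS)
    mantissas = [max(-MANT_MAX, min(MANT_MAX, round(v / quantum))) for v in values]
    return code, mantissas


def block_layout(parameter_count: int, block_size: int = 32):
    """Exponent count and last-block size for a flattened parameter vector."""
    exponent_count = -(-parameter_count // block_size)      # ceiling division
    last_block_size = parameter_count - (exponent_count - 1) * block_size
    return {"exponent_count": exponent_count, "last_block_size": last_block_size}


print(bfp_decode(1, 0), bfp_decode(MANT_MAX, MAX_CODE))   # 0.125 524272.0
print(encode_block([5000.0, 1.0]))                        # (1, [20000, 4])
print(block_layout(70))   # {'exponent_count': 3, 'last_block_size': 6}
```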

Rounding Modes

Mode	Description	Use Case
`nearest`	Round to nearest representable value	Default
`stochastic`	Probabilistic rounding	Training
`floor`	Round toward negative infinity	Conservative
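
The three modes differ only in how a scaled real value is mapped to an integer code. The following sketch is an assumption-laden illustration, not the quantiser's implementation: `round_code` is a hypothetical name, ties in `nearest` are broken upward here, and `floor` is modelled with `math.floor`, which rounds toward negative infinity and therefore differs from truncation for negative inputs.

```python
import math
import random


def round_code(scaled: float, mode: str, rng=None) -> int:
    """Map a scaled real value (v * 2^f) to an integer code."""
    if mode == "nearest":
        return math.floor(scaled + 0.5)          # ties go upward in this sketch
    if mode == "floor":
        return math.floor(scaled)                # toward negative infinity
    if mode == "stochastic":
        rng = rng or random
        low = math.floor(scaled)
        # Round up with probability equal to the fractional part,
        # so the expected code equals the unrounded value.
        return low + (1 if rng.random() < scaled - low else 0)
    raise ValueError(f"unknown rounding mode: {mode}")


print(round_code(76.8, "nearest"), round_code(76.8, "floor"))     # 77 76
print(round_code(-76.2, "nearest"), round_code(-76.2, "floor"))   # -76 -77
rng = random.Random(0)
samples = [round_code(76.8, "stochastic", rng) for _ in range(20000)]
mean_code = sum(samples) / len(samples)          # close to 76.8 on average
```

Stochastic rounding is unbiased in expectation, which is why it suits training, where systematic rounding error would otherwise accumulate across updates.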

4.4 Adaptive Precision

from sc_neurocore.compiler.adaptive_precision import AdaptivePrecisionConfig

config = AdaptivePrecisionConfig(
    low_precision=8,        # LP mode (Q4.4)
    high_precision=16,      # HP mode (Q8.8)
    switch_threshold=0.1,   # Switch to HP when gradient > 0.1
    hysteresis=0.05,        # Stay in HP until gradient < 0.05
)
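
The two thresholds form a hysteresis band, which can be sketched as a small state machine. This is a hypothetical model of the switching rule described in the comments above, not the library's controller; `select_modes` and the "LP"/"HP" labels are illustrative names, and starting in low-precision mode is an assumption.

```python
def select_modes(gradients, switch_threshold: float = 0.1, hysteresis: float = 0.05):
    """Return the precision mode chosen at each step for a gradient trace."""
    mode = "LP"                                   # assumed initial mode
    modes = []
    for g in gradients:
        if mode == "LP" and g > switch_threshold:
            mode = "HP"                           # enter HP above the upper threshold
        elif mode == "HP" and g < hysteresis:
            mode = "LP"                           # leave HP only below the lower one
        modes.append(mode)
    return modes


trace = [0.02, 0.12, 0.08, 0.06, 0.04, 0.09]
print(select_modes(trace))   # ['LP', 'HP', 'HP', 'HP', 'LP', 'LP']
```

Values between 0.05 and 0.1 keep whichever mode is already active, which prevents rapid toggling when the gradient hovers near a single threshold.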

4.5 IR Type Checker

from sc_neurocore.compiler.ir_type_checker import check_ir_types

errors = check_ir_types(graph)
if errors:
    for e in errors:
        print(f"Type error: {e}")

Signal types: BITSTREAM, RATE, SPIKE, FIXED, ANY.

4.6 Static Analysis

from sc_neurocore.compiler.static_analysis import (
    prove_fixed_point_envelope,
    prove_no_overflow,
    generate_sva,
    estimate_power,
)

# Interval-based overflow proof
proof = prove_no_overflow(
    "-(v - E_L)/tau_m + I/C",
    bounds={"v": (-128, 127), "E_L": (-65, -65), "tau_m": (10, 10), "I": (0, 100), "C": (1, 1)},
    data_width=16,
    fraction=8,
)
print(f"Safe: {proof.proven_safe}, output range: {proof.expr_interval}")

# Conservative Q16.16 width proof for dense precision envelopes
envelope = prove_fixed_point_envelope(
    [531_400],
    total_bits=32,
    fractional_bits=16,
)
assert envelope.static_overflow_proven_safe
assert envelope.required_total_bits == 21
assert envelope.width_headroom_bits == 11

# SVA assertion generation
sva = generate_sva(
    state_vars=["v"],
    module_name="sc_lif",
    data_width=16,
    fraction=8,
)

# Power estimation
pe = estimate_power(
    verilog,
    data_width=16,
    freq_mhz=200.0,
    process_nm=28,
)

# Use measured VCD switching activity when available
pe_vcd = estimate_power(
    verilog,
    activity_vcd="build/sc_lif.vcd",
    vcd_time_units_per_cycle=5,
    freq_mhz=200.0,
)

4.7 Deployment

from sc_neurocore.compiler.deployment import (
    generate_constraints,
    generate_cocotb_testbench,
    generate_riscv_driver,
    generate_sby_script,
    compile_multi_target,
    format_comparison_table,
    estimate_resources,
)

# Timing constraints
xdc = generate_constraints("sc_lif", freq_mhz=200)

# Cocotb testbench
tb = generate_cocotb_testbench("sc_lif", data_width=16, fraction=8)

# RISC-V driver
driver = generate_riscv_driver(
    "sc_lif",
    params={"E_L": 16, "tau_m": 16},
    rtos="freertos",
)

# SymbiYosys formal
sby = generate_sby_script("sc_lif", mode="bmc", depth=20)

# Resource estimation
res = estimate_resources("sc_lif", verilog)
print(f"LUTs: {res.estimated_luts}, DSPs: {res.estimated_dsps}")

5. Pipeline Orchestration

5.1 Full Synthesis Flow

from sc_neurocore.compiler.pipeline import run_synthesis_pipeline

result = run_synthesis_pipeline(
    verilog_path="build/sc_lif.v",
    target="ice40",
    freq_mhz=100,
    output_dir="build/",
)
print(f"LUTs: {result.lut_count}")
print(f"FFs: {result.ff_count}")
print(f"Fmax: {result.fmax_mhz:.1f} MHz")

5.2 Pipeline Stages

Stage	Tool	Input	Output
Parse	Python AST	ODE string	IR graph
Emit	`_VerilogExprEmitter`	IR graph	Verilog RTL
Synthesis	Yosys	Verilog	BLIF/JSON
P&R	nextpnr	JSON netlist	Bitstream

5.3 MLIR/CIRCT Path

from sc_neurocore.compiler import MLIREmitter, generate_mlir_bundle

emitter = MLIREmitter("sc_native_top")
# ... emit operations ...
bundle = generate_mlir_bundle(emitter, "build/mlir/sc_native_top")

The MLIR backend generates .mlir files and mlir_bundle_manifest.json. The manifest records operation counts and does not claim CIRCT lowering unless a downstream tool execution record is attached.

6. Data Types and Structures

6.1 CompilationResult

@dataclass
class CompilationResult:
    target: str
    verilog: str
    verilog_lines: int
    data_width: int
    fraction: int
    overflow: str           # "saturate" or "wrap"
    rounding: str           # "nearest", "stochastic", "floor"
    estimated_luts: int
    estimated_dsps: int
    estimated_ffs: int
    guard_bits: int
    max_freq_mhz: float

6.2 OverflowProofResult

@dataclass
class OverflowProofResult:
    safe: bool
    guard_bits: int
    max_intermediate_bits: int
    overflow_possible_vars: list[str]

6.3 PowerEstimate

@dataclass
class PowerEstimate:
    dynamic_mw: float
    static_mw: float
    total_mw: float
    energy_per_spike_nj: float
    toggle_rate: float

6.4 ResourceEstimate

@dataclass
class ResourceEstimate:
    estimated_luts: int
    estimated_dsps: int
    estimated_ffs: int
    estimated_bram_18k: int
    mul_count: int
    add_count: int
    register_bits: int

7. Performance Characteristics

7.1 Compilation Speed

Neuron Type	State Vars	Compile Time	Lines
LIF	1	~5 ms	~80
Izhikevich	2	~8 ms	~120
AdEx	2	~10 ms	~140
HH	4	~20 ms	~250
Custom (10 vars)	10	~50 ms	~600

7.2 Generated Verilog Quality

Metric	LIF Q8.8	Izh Q16.16	HH Q16.16
Lines	80	120	250
LUTs (Artix-7)	~80	~200	~500
DSPs	1	3	8
Fmax	450 MHz	400 MHz	350 MHz
Power (28nm)	0.003 mW	0.008 mW	0.06 mW

7.3 LUT Accuracy

Function	LUT Entries	Range	Max Error
`exp`	16	[-8, 8)	1.5%
`log`	16	(0, 8)	2.0%
`tanh`	16	[-8, 8)	1.0%
`sigmoid`	16	[-8, 8)	1.2%
`sqrt`	16	[0, 8)	1.8%

8. Test Suite and Verification

8.1 Equation Compiler Test

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.equation_compiler import compile_to_verilog

n = from_equations('dv/dt = -(v-E_L)/tau_m + I/C',
    threshold='v > -50', reset='v = -65',
    params=dict(E_L=-65, tau_m=10, C=1), init=dict(v=-65))
v = compile_to_verilog(n, module_name='sc_lif')
assert 'module sc_lif' in v
assert 'spike' in v
assert len(v.splitlines()) > 50
print(f'Equation compiler: PASS ({len(v.splitlines())} lines)')
"

8.2 Multi-Target Compilation Test

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.deployment import compile_multi_target

n = from_equations('dv/dt = -(v)/tau + I',
    threshold='v > 0', reset='v = 0',
    params=dict(tau=10), init=dict(v=0))
results = compile_multi_target(n, ['artix7', 'ice40'], 'test')
assert len(results) == 2
print('Multi-target: PASS')
"

8.3 Quantizer Test

python -c "
import numpy as np
from sc_neurocore.compiler.quantizer import quantize_weights
q = quantize_weights(np.array([1.0, -0.5, 0.25]), fmt='Q8.8')
assert q[0] == 256   # 1.0 * 256
assert q[1] == -128   # -0.5 * 256
assert q[2] == 64     # 0.25 * 256
print('Quantizer: PASS')
"

8.4 Static Analysis Test

python -c "
from sc_neurocore.compiler.static_analysis import prove_no_overflow
r = prove_no_overflow(
    '-(v - E_L)/tau_m + I/C',
    bounds={'v': (-128, 127), 'E_L': (-65, -65), 'tau_m': (10, 10), 'I': (0, 100), 'C': (1, 1)},
    data_width=16, fraction=8,
)
assert r.proven_safe
print('Overflow proof: PASS')
"

8.5 SVA Generation Test

python -c "
from sc_neurocore.compiler.static_analysis import generate_sva
sva = generate_sva(['v'], module_name='sc_lif')
assert 'a_no_overflow_v' in sva
assert 'c_spike_reachable' in sva
print('SVA generation: PASS')
"

8.6 Power Estimation Test

python -c "
from sc_neurocore.compiler.static_analysis import estimate_power
v = 'wire signed [15:0] _mul0 = a * b; reg signed [15:0] v_reg;'
pe = estimate_power(v, data_width=16, freq_mhz=200, process_nm=28)
assert pe.total_mw > 0
print(f'Power: {pe.total_mw:.6f} mW — PASS')
"

8.7 Deployment Functions Test

python -c "
from sc_neurocore.compiler.deployment import (
    generate_constraints,
    generate_cocotb_testbench,
    generate_riscv_driver,
    generate_sby_script,
)

xdc = generate_constraints('test', freq_mhz=200)
assert 'create_clock' in xdc

tb = generate_cocotb_testbench('test', data_width=16, fraction=8)
assert 'cocotb' in tb

d = generate_riscv_driver('test', {'v': 16}, rtos='freertos')
assert 'xTaskCreate' in d

sby = generate_sby_script('test', mode='bmc', depth=10)
assert 'mode bmc' in sby

print('All deployment: PASS')
"

8.8 E2E Pipeline Test

python -m pytest tests/e2e/test_e2e_pipeline.py -v

8.9 Troubleshooting

Symptom	Cause	Fix
`compile_to_verilog` fails	Invalid ODE syntax	Check equation string format
Overflow in simulation	Guard bits insufficient	Increase data width
Yosys synthesis fails	Unsupported Verilog construct	Check target compatibility
Power estimate zero	Empty Verilog source	Verify compilation output
MLIR bundle missing firtool	firtool not installed	Install CIRCT toolchain


Live-control MMIO Parameter Banks

MMIOUpdateSpec supports axi4_lite and pcie bus contracts for live parameter updates. Both protocols use the same deterministic register map:

Register	Offset	Purpose
`control`	`0x00`	update, apply, rollback, and trap-clear control bits
`status`	`0x04`	ready, update acknowledgement, checksum, shadow, and trap status
`bank_select`	`0x08`	selected live parameter bank
`entry_index`	`0x0C`	selected entry inside the bank
`write_data_lo`	`0x10`	low 32 bits of the staged encoded parameter word
`write_data_hi`	`0x14`	high 32 bits for 64-bit staged words
`trap_status`	`0x18`	sticky generated and external trap bits
`trap_clear`	`0x1C`	sticky trap clear register
`write_checksum`	`0x20`	IEEE CRC32 guard over bank, entry, and staged value

generate_live_parameter_bank() emits the AXI4-Lite core directly for bus_protocol="axi4_lite". For bus_protocol="pcie" it emits a PCIe-MMIO register-window adapter over that same core. The PCIe wrapper is deliberately a register-window contract: upstream PCIe hard IP or a board integration wrapper must decode posted writes and reads into the generated single-clock MMIO strobes. It is not a generated PCIe endpoint PHY.

Valid updates are fail-closed. The host must write bank select, entry index, low/high staged data, and the crc32-ieee-le-4x32 guard before asserting CONTROL_UPDATE_VALID; the active parameter output changes only after a separate CONTROL_COMMIT. The CRC32 payload is four little-endian 32-bit words: bank select, entry index, low data word, and high data word. Range traps latch staged overflow or underflow attempts and prevent shadow mutation. Active readback is fail-closed as well: invalid bank or entry selections on read_data_lo or read_data_hi return a bus error and latch invalid_selection rather than returning an ambiguous zero coefficient.
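
The host-side staging step can be sketched with the standard library. This is a hypothetical illustration, not the output of build_update_sequence(...): the function names are invented for the example, `zlib.crc32` is assumed to match the crc32-ieee-le-4x32 guard because it implements the IEEE CRC-32 polynomial, and the write order among the staging registers is an assumption. Register offsets come from the table above. The control-bit encodings are not given here, so the sketch stops before the `control` write.

```python
import struct
import zlib

# Register offsets taken from the register map above.
BANK_SELECT, ENTRY_INDEX = 0x08, 0x0C
WRITE_DATA_LO, WRITE_DATA_HI, WRITE_CHECKSUM = 0x10, 0x14, 0x20


def split_words(value: int):
    """Split a staged value into low/high 32-bit words (two's complement)."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF


def update_checksum(bank: int, entry: int, value: int) -> int:
    """CRC32 over four little-endian 32-bit words: bank, entry, lo, hi."""
    lo, hi = split_words(value)
    payload = struct.pack("<4I", bank, entry, lo, hi)
    return zlib.crc32(payload) & 0xFFFFFFFF


def staging_writes(bank: int, entry: int, value: int):
    """(offset, word) pairs to issue before asserting the update control bit."""
    lo, hi = split_words(value)
    return [
        (BANK_SELECT, bank),
        (ENTRY_INDEX, entry),
        (WRITE_DATA_LO, lo),
        (WRITE_DATA_HI, hi),
        (WRITE_CHECKSUM, update_checksum(bank, entry, value)),
    ]


writes = staging_writes(bank=1, entry=2, value=0x1234)
for offset, word in writes:
    print(f"0x{offset:02X} <- 0x{word:08X}")
```

Because the guard covers the bank and entry words as well as the data, a write that lands on a different bank or entry than the one checksummed is expected to fail the check rather than load the shadow bank.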