Compiler API Reference

June 26, 2026

Complete API reference for the SC-NeuroCore compilation pipeline. Covers the ODE-to-Verilog equation compiler, MLIR/CIRCT emitter, weight quantiser, adaptive precision, IR type checker, static analysis, and deployment orchestrator. This is the authoritative reference for all compiler-facing Python functions.

The root package boundary is defined in Compiler Surface Policy. That page states which sc_neurocore.compiler modules are package-facade exports, direct public modules, compatibility facades, or internal build tools.


1. Mathematical Formalism

1.1 ODE Discretisation

The compiler transforms continuous ODEs to discrete fixed-point computations using the forward Euler method:

$$x[n+1] = x[n] + \Delta t \cdot f(x[n], I[n])$$

In Q$m$.$f$ format with shift-based division by $\tau$:

$$x_{\text{next}} = x + \frac{I - (x - x_{\text{rest}})}{2^{\lceil\log_2 \tau\rceil}}$$
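
The shift-based update above can be sketched in plain Python. This is a minimal illustration, assuming Q8.8 codes held in Python integers and a unit time step folded into the shift; `to_fixed` and `euler_step_shift` are hypothetical helper names, not part of sc_neurocore.

```python
import math

FRAC_BITS = 8  # assumption: Q8.8 codes for this sketch


def to_fixed(value: float, frac_bits: int = FRAC_BITS) -> int:
    """Encode a float as a fixed-point integer code."""
    return round(value * (1 << frac_bits))


def euler_step_shift(x: int, current: int, x_rest: int, tau: float) -> int:
    """One update x_next = x + (I - (x - x_rest)) / 2^ceil(log2(tau))."""
    shift = math.ceil(math.log2(tau))
    # Python's >> on ints is an arithmetic shift (rounds toward -infinity).
    return x + ((current - (x - x_rest)) >> shift)


# Drive a resting neuron with a constant input for three steps.
x = to_fixed(-65.0)
x_rest = to_fixed(-65.0)
current = to_fixed(16.0)
trace = []
for _ in range(3):
    x = euler_step_shift(x, current, x_rest, tau=10.0)
    trace.append(x)
```

With `tau=10` the divisor is rounded up to 16, so the effective time constant is longer than the one requested; that is the cost of replacing a divider with a shift.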

1.2 Fixed-Point Encoding

Parameters and states are encoded in Q$m$.$f$ signed format:

$$Q(v) = \text{round}(v \cdot 2^f)$$

The range is $[-2^{m-1}, 2^{m-1} - 2^{-f}]$ with precision $2^{-f}$.

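| Format | Total Bits | Integer | Fraction | Range | Precision |
|---|---|---|---|---|---|
| Q8.8 | 16 | 8 | 8 | ±127 | 0.0039 |
| Q16.16 | 32 | 16 | 16 | ±32767 | 0.000015 |
| Q12.20 | 32 | 12 | 20 | ±2047 | 0.00000095 |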
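
The encoding rule and range can be checked with a short sketch. It assumes saturation at the range limits and uses Python's built-in `round`, which rounds ties to even; `q_encode` and `q_decode` are illustrative names rather than library functions.

```python
def q_encode(value: float, int_bits: int, frac_bits: int) -> int:
    """Q(v) = round(v * 2^f), saturated to the signed Qm.f code range."""
    code = round(value * (1 << frac_bits))  # ties round to even in Python
    total_bits = int_bits + frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, code))


def q_decode(code: int, frac_bits: int) -> float:
    """Map an integer code back to its real value."""
    return code / (1 << frac_bits)


# Q8.8 examples: exact values, then saturation at both ends of the range.
codes = [q_encode(v, 8, 8) for v in (1.0, -0.5, 0.25, 200.0, -200.0)]
largest = q_decode(q_encode(200.0, 8, 8), 8)  # 2^(m-1) - 2^(-f) for m = f = 8
```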

1.3 Guard Bit Computation

Guard bits prevent intermediate overflow during multiply-accumulate:

$$G = \lceil \log_2(N_{\text{terms}}) \rceil$$

where $N_{\text{terms}}$ is the maximum number of additions in the datapath. The data width is extended to $W + G$ bits for intermediates, then saturated back to $W$ bits for the final result.
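
A minimal sketch of the guard-bit rule, assuming the count refers to the number of summed W-bit terms. The helper names are hypothetical, and the integer identity `(n - 1).bit_length()` is used in place of a floating-point logarithm.

```python
def guard_bits(n_terms: int) -> int:
    """G = ceil(log2(n_terms)), computed exactly on integers."""
    if n_terms < 1:
        raise ValueError("n_terms must be at least 1")
    return (n_terms - 1).bit_length()


def saturate(value: int, width: int) -> int:
    """Clamp a signed value to the range of a `width`-bit register."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    return max(lo, min(hi, value))


def accumulate(terms: list[int], width: int) -> int:
    """Sum W-bit terms in a (W + G)-bit intermediate, then saturate to W bits."""
    wide_width = width + guard_bits(len(terms))
    wide = sum(terms)
    # The extended width is enough to hold the worst-case sum without wrapping.
    if wide != saturate(wide, wide_width):
        raise OverflowError("intermediate exceeded W + G bits")
    return saturate(wide, width)
```

Four maximal 16-bit terms need two guard bits; the 18-bit intermediate holds the sum exactly and only the final result is clamped.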

1.4 Piecewise LUT Approximation

Transcendental functions ($\exp$, $\log$, $\tanh$, etc.) use 16-entry piecewise-constant lookup tables covering $[-8, +8)$:

$$f_{\text{LUT}}(x) = \text{table}\left[\left\lfloor \frac{x + 8}{1} \right\rfloor\right]$$

Accuracy: ~1–2% over the useful range for neuron dynamics.
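
The index computation can be sketched as follows. Two details are assumptions of this sketch and not stated in this reference: each entry is sampled at the left edge of its unit-wide bin, and inputs outside [-8, +8) are clamped to the first or last entry.

```python
import math

LUT_MIN = -8       # lower edge of the covered range
LUT_ENTRIES = 16   # one entry per unit-wide bin


def build_lut(fn):
    """Sample fn at the left edge of each bin (assumed sampling point)."""
    return [fn(LUT_MIN + i) for i in range(LUT_ENTRIES)]


def lut_eval(table, x: float) -> float:
    """Piecewise-constant lookup: index = floor((x + 8) / 1)."""
    index = math.floor(x - LUT_MIN)
    index = max(0, min(LUT_ENTRIES - 1, index))  # assumed clamp outside range
    return table[index]


tanh_lut = build_lut(math.tanh)
```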


2. Architecture

2.1 Compilation Pipeline

flowchart TB
    subgraph Input
        A["ODE string<br/>'dv/dt = -(v-E_L)/tau + I/C'"]
    end
    subgraph Parse
        B["Python AST parser"]
        C["_VerilogExprEmitter"]
    end
    subgraph Emit
        D["Q-format parameters"]
        E["Multiply pipelines"]
        F["LUT for exp/log/tanh"]
        G["Saturating next-state"]
        H["Threshold + reset logic"]
    end
    subgraph Output
        I["Synthesizable Verilog"]
        J["Testbench"]
    end

    A --> B --> C
    C --> D & E & F & G & H
    D & E & F & G & H --> I
    I --> J

    style Input fill:#e1f5fe
    style Output fill:#e8f5e9

2.2 Module Dependency Graph

sc_neurocore.compiler
├── equation_compiler   # ODE → Verilog
├── pipeline            # Yosys → nextpnr → bitstream
├── mlir_emitter        # MLIR/CIRCT backend
├── quantizer           # Float → Q-format
├── adaptive_precision  # Dynamic width switching
├── ir_type_checker     # Stochastic IR validation
├── static_analysis     # Guard bits, SVA, power
└── deployment          # Constraints, drivers, multi-target

3. CLI Interface

3.1 Main Compilation Command

sc-neurocore compile "dv/dt = -(v-E_L)/tau_m + I/C" \
    --threshold "v > -50" --reset "v = -65" \
    --params "E_L=-65,tau_m=10,C=1" --init "v=-65" \
    --target ice40 --testbench --synthesize -o build/

3.2 CLI Flags

| Flag | Default | Description |
|---|---|---|
| --threshold | None | Spike condition (e.g. "v > -50") |
| --reset | None | Reset expression (e.g. "v = -65; w = 0") |
| --params | None | Comma-separated key=val pairs |
| --init | None | Initial state key=val pairs |
| --target | ice40 | FPGA target (ice40, ecp5, artix7, zynq) |
| --module-name | sc_equation_neuron | Generated Verilog module name |
| --testbench | off | Generate simulation testbench |
| --synthesize | off | Run Yosys synthesis (requires Yosys in PATH) |
| -o / --output | build | Output directory |

3.3 NIR Network Compilation

sc-neurocore compile-nir model.nir --target artix7 -o build/

3.4 Multi-Target Compilation

sc-neurocore compile "dv/dt = -(v)/tau + I" \
    --target artix7,ecp5,ice40 --compare -o build/

4. Python API

4.1 Equation Compiler

from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.equation_compiler import compile_to_verilog

neuron = from_equations(
    "dv/dt = -(v - E_L)/tau_m + I/C",
    threshold="v > -50",
    reset="v = -65",
    params=dict(E_L=-65, tau_m=10, C=1),
    init=dict(v=-65),
)

verilog = compile_to_verilog(
    neuron,
    module_name="sc_lif",
    data_width=16,
    fraction=8,
)

Supported Functions

| Category | Functions |
|---|---|
| Transcendental | exp, log, sqrt, tanh, sigmoid, sin, cos |
| Arithmetic | abs, clip(x, lo, hi), max(a, b), min(a, b) |
| Polynomial | `x**2` through `x**8` |
| Operators | +, -, *, / (by constant), unary - |
| Comparison | >, >=, <, <= |

4.2 MLIR Emitter

from sc_neurocore.compiler import MLIREmitter, generate_mlir_bundle

emitter = MLIREmitter("sc_native_top")
lhs = emitter.emit_lfsr(8, 0x5A)
rhs = emitter.emit_lfsr(8, 0xC3)
emitter.emit_and(lhs, rhs)

bundle = generate_mlir_bundle(emitter, "build/mlir/sc_native_top")
print(bundle.mlir_path)
print(bundle.manifest_path)

The manifest records operation counts and whether firtool is available.

4.3 Weight Quantizer

import numpy as np

from sc_neurocore.compiler.quantizer import (
    PrecisionEnvelopeReport,
    PrecisionTrapReport,
    QFormatMixed,
    compile_dense_block_floating,
    compile_dense_mixed_precision,
    dequantize_weights,
    quantize_weights,
)

weights = np.array([0.5, -0.3, 1.2, 0.0], dtype=np.float64)

# Canonical fixed-point Q8.8 path: returns the integer tensor only.
q_weights = quantize_weights(weights, fmt="Q8.8", rounding="nearest")
restored = dequantize_weights(q_weights, fmt="Q8.8")

# Mixed hardware path: Q8.8 stored weights with Q16.16 accumulation metadata.
mixed = QFormatMixed()
q_mixed, tensor_scale = quantize_weights(weights, fmt=mixed)
restored_mixed = dequantize_weights(q_mixed, fmt=mixed, scale=tensor_scale)

QFormatMixed defaults to Q8.8 weights, a Q16.16 accumulator, nearest rounding, and per-tensor scale maximisation. Its accumulator format must be at least as wide as the weight format, preserve the weight fractional precision, and cover the full weight dynamic range. The returned tensor_scale is deterministic metadata for reconstructing values and for hardware emitters that need the scale alongside compact stored weights.

Dense deployment can compile a two-dimensional weight matrix into the same bit-true Q8.8-weight/Q16.16-accumulator contract used by the Rust and HDL reference paths:

compiled = compile_dense_mixed_precision(weights, fmt=QFormatMixed())
outputs_q1616, overflow = compiled.forward_with_overflow(inputs)
outputs = compiled.forward_float(inputs)
trap_report: PrecisionTrapReport = compiled.precision_trap_report(inputs)
envelope_report: PrecisionEnvelopeReport = compiled.precision_envelope_report(inputs)
manifest = compiled.manifest()

The mixed-dense HDL reference exposes the same lane-level overflow contract as the Python overflow mask and Rust overflow_count: overflow_vector[i] identifies output channel i, while the aggregate overflow line is asserted when any lane saturates.

The block-floating dense HDL reference uses the same lane convention, with overflow_vector[i] identifying the output channel that saturated after the shared-exponent product shift and Q16.16 accumulation. Both dense HDL references also export abs_bounds_q1616[i], an unsigned 64-bit conservative absolute Q16.16 bound for output channel i. This mirrors Python PrecisionEnvelopeReport.abs_bound_codes and Rust MixedDenseResult.abs_bounds_q1616, including cancellation cases where the realised output is small but the absolute product envelope is large.

PrecisionEnvelopeReport.manifest() also exposes the signed fixed-point width proof used by the Python and Rust deployment surfaces:

| Field | Meaning |
|---|---|
| proof_kind | Fixed string signed_symmetric_fixed_point_width for this contract. |
| required_total_bits | Sign bit plus the bit length required by the largest conservative absolute Q16.16 bound. |
| required_integer_bits | required_total_bits - 16, clamped to at least one signed integer bit for Q16.16 reporting. |
| width_headroom_bits | 32 - required_total_bits; negative values mean Q16.16 saturation is required. |
| saturation_required | True when the conservative bound cannot fit in signed 32-bit Q16.16. |
| static_overflow_proven_safe | Alias of the conservative overflow proof used by safety-gate callers. |

These fields are static envelope claims over absolute product magnitudes. They do not rely on cancellation in the realised dot product, so a small output code does not weaken the predeployment overflow proof. The quantizer delegates these manifest fields to sc_neurocore.compiler.static_analysis.prove_fixed_point_envelope(), so the standalone static-analysis API and dense deployment reports share one Python proof authority.
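
The width arithmetic behind these fields can be reproduced in a few lines. This is a sketch of the field definitions as described here, not the library's `prove_fixed_point_envelope()` implementation; the function name `width_proof` and the returned dictionary are illustrative.

```python
def width_proof(abs_bound_codes, total_bits: int = 32, fractional_bits: int = 16):
    """Derive the signed-width fields from conservative absolute bound codes."""
    max_bound = max(abs_bound_codes)
    # Sign bit plus the bits needed for the largest absolute bound.
    required_total_bits = 1 + max_bound.bit_length()
    required_integer_bits = max(required_total_bits - fractional_bits, 1)
    width_headroom_bits = total_bits - required_total_bits
    saturation_required = width_headroom_bits < 0
    return {
        "proof_kind": "signed_symmetric_fixed_point_width",
        "required_total_bits": required_total_bits,
        "required_integer_bits": required_integer_bits,
        "width_headroom_bits": width_headroom_bits,
        "saturation_required": saturation_required,
        "static_overflow_proven_safe": not saturation_required,
    }


safe = width_proof([531_400])   # bound code needs 20 magnitude bits
unsafe = width_proof([1 << 31])  # one past the largest signed 32-bit code
```

A bound code of 531400 lies between 2^19 and 2^20, so it needs 20 magnitude bits plus a sign bit, leaving 11 bits of headroom in a 32-bit word.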

Live-Control Parameter Banks

The live-control schema decouples long-lived parameters from static logic fabric. ParameterBankSpec describes writable Q-format or block-floating entries in BRAM/distributed RAM, including byte span, entry addresses, and raw encoded-word bounds. MMIOUpdateSpec adds a deterministic AXI4-Lite/PCIe control window with fixed registers for bank select, entry select, write-data low/high words, status, trap status, and trap clear. Host code uses build_update_sequence(...) to stage a bank/index/value update with a deterministic CRC32 checksum, reject mismatches through a sticky checksum_mismatch trap, load it into a shadow bank, and then apply it explicitly, so operators can update weights or Kuramoto phase-coupling parameters without resynthesising the bitstream. Successful shadow loads latch the bank and entry identity at load time. Apply and rollback use that latched identity rather than the mutable selection registers, so a later bank_select or entry_index write cannot retarget an in-flight transaction. The generated bus surface requires full-word writes; a partial write strobe is rejected with a sticky partial_write trap before any control or staged-data register is modified.

The status map exposes ready, busy, update_ack, trap_latched, shadow_loaded, applied, rollback_ack, checksum_valid, and sticky checksum_mismatch/invalid_selection/read_only_bank/partial_write trap bits. Generated parameter-bank RTL reserves deterministic trap lanes for staged overflow, staged underflow, checksum mismatch, invalid bank/entry selection, and read-only bank or partial-write rejection before shadow loading: if a host payload cannot be represented as either a zero-extended raw word or a valid signed extension for the selected bank width, if the CRC32 guard does not match the staged payload, or if the selected bank/index pair is not writable, the trap vector latches and the shadow bank is not modified. Trap clearing is a separate two-write sequence that records the intended flag width before asserting the clear command, preserving deterministic host intervention semantics. sc_neurocore.hdl_gen.bus_interface.generate_live_parameter_bank(...) consumes the same manifest and emits the corresponding AXI4-Lite parameter-bank RTL with active/shadow memories, checksum-gated shadow loading, generated staged-range, CRC32-mismatch, invalid-selection, read-only-bank, and partial-write traps, explicit apply, rollback, and active-only parameter_words, so the Python control schema and hardware register map remain one contract.
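
The staged-range rule (a payload must be either a zero-extended raw word or a valid sign extension for the bank width) can be sketched as a predicate over the 64-bit staged word. The function name and the assumption that the staged word is formed as `hi << 32 | lo` are illustrative; the generated RTL is the authority.

```python
def staged_word_fits(word64: int, bank_width: int) -> bool:
    """Return True when a 64-bit staged payload is representable in bank_width bits."""
    if not 1 <= bank_width <= 64:
        raise ValueError("bank_width must be in 1..64")
    word64 &= (1 << 64) - 1
    if bank_width == 64:
        return True
    upper = word64 >> bank_width
    if upper == 0:
        return True  # zero-extended raw word
    # Valid sign extension: all upper bits set and the sign bit of the word set.
    all_ones = (1 << (64 - bank_width)) - 1
    sign_bit = (word64 >> (bank_width - 1)) & 1
    return upper == all_ones and sign_bit == 1
```

For a 16-bit bank, `0xFFFF` passes as a raw word and `0xFFFF_FFFF_FFFF_8000` passes as the sign extension of -32768, while `0x1_0000` would latch a range trap and leave the shadow bank untouched.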

forward_with_overflow returns saturated accumulator-format integer codes and per-output overflow flags. In canonical scale_per_tensor=False mode the division from Q8.8×Q16.16 products to Q16.16 outputs uses the same signed arithmetic shift as the hardware reference. With per-tensor scaling enabled, the host path carries tensor_scale in the manifest so deployment code can reconstruct compact stored weights without silently changing the physical output scale.

precision_trap_report packages the same saturated output codes and overflow mask into deterministic telemetry for host validation and HDL trap registers. The report manifest includes output_format, output_count, overflow_count, underflow_count, saturated_min_count, saturated_max_count, has_overflow, and has_underflow. Overflow means the realised output saturated at the configured Q-format bound. Underflow means a nonzero fixed-point product or BFP output collapsed below one output-code LSB and therefore produced a zero code that remains visible to safety review.
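
The overflow and underflow definitions can be illustrated for a single output channel. This sketch assumes each Q8.8×Q16.16 product is shifted back to Q16.16 before accumulation and that underflow is flagged per product; the real ordering in the Python, Rust, and HDL paths may differ, and `dense_row_q1616` is a hypothetical name.

```python
Q1616_MIN = -(1 << 31)
Q1616_MAX = (1 << 31) - 1


def dense_row_q1616(weights_q88, inputs_q1616):
    """Return (saturated Q16.16 code, overflow flag, underflow flag) for one channel."""
    acc = 0
    underflow = False
    for w, x in zip(weights_q88, inputs_q1616):
        product = w * x          # 8 + 16 = 24 fractional bits
        shifted = product >> 8   # signed arithmetic shift back to Q16.16
        if product != 0 and shifted == 0:
            underflow = True     # nonzero product collapsed to a zero code
        acc += shifted
    saturated = max(Q1616_MIN, min(Q1616_MAX, acc))
    return saturated, saturated != acc, underflow


unit = dense_row_q1616([256], [65536])            # 1.0 * 1.0
tiny = dense_row_q1616([1], [1])                  # below one output LSB
big = dense_row_q1616([32767] * 2, [Q1616_MAX] * 2)
```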

precision_envelope_report adds conservative predeployment range evidence. It returns realised output codes, realised overflow and underflow flags, per-output absolute bound codes, and a manifest containing observed_overflow_free, observed_underflow_free, conservative_overflow_free, max_abs_output_code, max_abs_bound_code, and min_headroom_code.

Block-floating dense deployment uses shared-exponent weight blocks with Q16.16 inputs and outputs:

compiled_bfp = compile_dense_block_floating(weights, fmt="BFP16E3X32")
outputs_q1616, overflow = compiled_bfp.forward_with_overflow(inputs)
outputs = compiled_bfp.forward_float(inputs)
trap_report = compiled_bfp.precision_trap_report(inputs)
envelope_report = compiled_bfp.precision_envelope_report(inputs)

BFP16E3X32 stores 16-bit signed mantissas and one 3-bit biased exponent per 32-weight block. The exponent range is the full encoded biased range: for three exponent bits, the unbiased range is [-3, +4]. The Python deployment path preserves the shared exponent metadata, saturates final Q16.16 output codes, and exposes overflow and sub-LSB underflow flags for hardware telemetry parity. Compiler manifests record the exact exponent bias (3 for BFP16E3X32), encoded exponent range [0, 7], maximum signed mantissa magnitude 32767, minimum quantum 0.125, maximum absolute value 524272.0, and the contiguous flattened block-alignment rule required by downstream RTL emitters. When the parameter count is known, manifests also carry an exact block_exponent_layout with parameter_count, block_size, exponent_count, last_block_size, and the exponent-index formula. The Python and Rust BFP surfaces reject mismatched exponent-vector lengths before accumulation, preventing an emitter from silently applying a shared exponent to the wrong parameter block. The maintained comparison benchmark also exercises a seeded BFP16E3X2 edge-sweep contract: exponent codes [0, 7, 0, 7] must produce exact safe Q16.16 codes [1056736, -1069024] with zero overflow/underflow, while a max-exponent saturating payload must raise one overflow trap and clamp to 2147483647 rather than wrapping.
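
The numeric limits quoted for BFP16E3X32 follow from the bias and mantissa width, as this sketch shows. The exponent-index formula used here (`index // block_size`) and the layout helper are assumptions consistent with a contiguous flattened block layout; the manifest remains the authoritative source.

```python
BIAS = 3              # exponent bias for three exponent bits
EXP_BITS = 3
MANTISSA_MAX = 32767  # largest signed 16-bit mantissa magnitude
BLOCK_SIZE = 32


def bfp_decode(mantissa: int, exponent_code: int) -> float:
    """value = mantissa * 2^(exponent_code - bias)."""
    if not 0 <= exponent_code < (1 << EXP_BITS):
        raise ValueError("exponent code out of encoded range")
    if abs(mantissa) > MANTISSA_MAX:
        raise ValueError("mantissa magnitude out of range")
    return mantissa * 2.0 ** (exponent_code - BIAS)


def block_exponent_layout(parameter_count: int, block_size: int = BLOCK_SIZE):
    """Layout fields for a flattened parameter vector (assumed formula)."""
    exponent_count = -(-parameter_count // block_size)  # ceiling division
    last_block_size = parameter_count - (exponent_count - 1) * block_size
    return {
        "parameter_count": parameter_count,
        "block_size": block_size,
        "exponent_count": exponent_count,
        "last_block_size": last_block_size,
    }


def exponent_index(param_index: int, block_size: int = BLOCK_SIZE) -> int:
    """Shared-exponent slot for a flattened parameter index (assumed formula)."""
    return param_index // block_size
```

Decoding mantissa 1 with exponent code 0 gives the minimum quantum 0.125, and mantissa 32767 with code 7 gives the maximum absolute value 524272.0.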

Rounding Modes

| Mode | Description | Use Case |
|---|---|---|
| nearest | Round to nearest representable value | Default |
| stochastic | Probabilistic rounding | Training |
| floor | Round toward zero | Conservative |

4.4 Adaptive Precision

from sc_neurocore.compiler.adaptive_precision import AdaptivePrecisionConfig

config = AdaptivePrecisionConfig(
    low_precision=8,        # LP mode (Q4.4)
    high_precision=16,      # HP mode (Q8.8)
    switch_threshold=0.1,   # Switch to HP when gradient > 0.1
    hysteresis=0.05,        # Stay in HP until gradient < 0.05
)
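
The two thresholds form a hysteresis band. A minimal sketch of the switching rule implied by the comments above, with hypothetical mode labels "LP" and "HP" and a hypothetical `next_mode` helper (the library's actual controller is not shown in this reference):

```python
def next_mode(mode: str, gradient: float,
              switch_threshold: float = 0.1, hysteresis: float = 0.05) -> str:
    """Switch LP -> HP above switch_threshold, HP -> LP below hysteresis."""
    if mode == "LP" and gradient > switch_threshold:
        return "HP"
    if mode == "HP" and gradient < hysteresis:
        return "LP"
    return mode  # inside the band: keep the current mode


# Trace a gradient that rises, lingers inside the band, then falls.
mode = "LP"
modes = []
for g in [0.02, 0.12, 0.08, 0.06, 0.04, 0.07]:
    mode = next_mode(mode, g)
    modes.append(mode)
```

Gradients between 0.05 and 0.1 never change the mode, which prevents rapid toggling when the signal hovers near a single threshold.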

4.5 IR Type Checker

from sc_neurocore.compiler.ir_type_checker import check_ir_types

errors = check_ir_types(graph)
if errors:
    for e in errors:
        print(f"Type error: {e}")

Signal types: BITSTREAM, RATE, SPIKE, FIXED, ANY.

4.6 Static Analysis

from sc_neurocore.compiler.static_analysis import (
    prove_fixed_point_envelope,
    prove_no_overflow,
    generate_sva,
    estimate_power,
)

# Guard bit computation
proof = prove_no_overflow(
    "-(v - E_L)/tau_m + I/C",
    bounds={"v": (-128, 127), "E_L": (-65, -65), "tau_m": (10, 10), "I": (0, 100), "C": (1, 1)},
    data_width=16,
    fraction=8,
)
print(f"Safe: {proof.proven_safe}, output range: {proof.expr_interval}")

# Conservative Q16.16 width proof for dense precision envelopes
envelope = prove_fixed_point_envelope(
    [531_400],
    total_bits=32,
    fractional_bits=16,
)
assert envelope.static_overflow_proven_safe
assert envelope.required_total_bits == 21
assert envelope.width_headroom_bits == 11

# SVA assertion generation
sva = generate_sva(
    state_vars=["v"],
    module_name="sc_lif",
    data_width=16,
    fraction=8,
)

# Power estimation
pe = estimate_power(
    verilog,
    data_width=16,
    freq_mhz=200.0,
    process_nm=28,
)

# Use measured VCD switching activity when available
pe_vcd = estimate_power(
    verilog,
    activity_vcd="build/sc_lif.vcd",
    vcd_time_units_per_cycle=5,
    freq_mhz=200.0,
)
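
The bounds passed to prove_no_overflow suggest an interval-style range analysis. The following is a sketch of that general technique for the same expression and bounds, not the library's algorithm; it checks only the final expression range, and the `Interval` class and `fits_q_format` helper are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __neg__(self):
        return Interval(-self.hi, -self.lo)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def div_const(self, c: float):
        """Divide by a positive constant (the only division form handled here)."""
        if c <= 0:
            raise ValueError("divisor must be positive")
        return Interval(self.lo / c, self.hi / c)


def fits_q_format(iv: Interval, data_width: int, fraction: int) -> bool:
    """Check an interval against the signed Qm.f range with m = data_width - fraction."""
    int_bits = data_width - fraction
    lo = -(2.0 ** (int_bits - 1))
    hi = 2.0 ** (int_bits - 1) - 2.0 ** (-fraction)
    return lo <= iv.lo and iv.hi <= hi


# -(v - E_L)/tau_m + I/C with the bounds used above
v, e_l, i_in = Interval(-128, 127), Interval(-65, -65), Interval(0, 100)
expr = (-(v - e_l)).div_const(10) + i_in.div_const(1)
safe = fits_q_format(expr, data_width=16, fraction=8)
```

For these bounds the expression stays within roughly [-19.2, 106.3], which fits the Q8.8 range of [-128, 128).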

4.7 Deployment

from sc_neurocore.compiler.deployment import (
    generate_constraints,
    generate_cocotb_testbench,
    generate_riscv_driver,
    generate_sby_script,
    compile_multi_target,
    format_comparison_table,
    estimate_resources,
)

# Timing constraints
xdc = generate_constraints("sc_lif", freq_mhz=200)

# Cocotb testbench
tb = generate_cocotb_testbench("sc_lif", data_width=16, fraction=8)

# RISC-V driver
driver = generate_riscv_driver(
    "sc_lif",
    params={"E_L": 16, "tau_m": 16},
    rtos="freertos",
)

# SymbiYosys formal
sby = generate_sby_script("sc_lif", mode="bmc", depth=20)

# Resource estimation
res = estimate_resources("sc_lif", verilog)
print(f"LUTs: {res.estimated_luts}, DSPs: {res.estimated_dsps}")

5. Pipeline Orchestration

5.1 Full Synthesis Flow

from sc_neurocore.compiler.pipeline import run_synthesis_pipeline

result = run_synthesis_pipeline(
    verilog_path="build/sc_lif.v",
    target="ice40",
    freq_mhz=100,
    output_dir="build/",
)
print(f"LUTs: {result.lut_count}")
print(f"FFs: {result.ff_count}")
print(f"Fmax: {result.fmax_mhz:.1f} MHz")

5.2 Pipeline Stages

| Stage | Tool | Input | Output |
|---|---|---|---|
| Parse | Python AST | ODE string | IR graph |
| Emit | _VerilogExprEmitter | IR graph | Verilog RTL |
| Synthesis | Yosys | Verilog | BLIF/JSON |
| P&R | nextpnr | BLIF | Bitstream |

5.3 MLIR/CIRCT Path

from sc_neurocore.compiler import MLIREmitter, generate_mlir_bundle

emitter = MLIREmitter("sc_native_top")
# ... emit operations ...
bundle = generate_mlir_bundle(emitter, "build/mlir/sc_native_top")

The MLIR backend generates .mlir files and mlir_bundle_manifest.json. The manifest records operation counts and does not claim CIRCT lowering unless a downstream tool execution record is attached.


6. Data Types and Structures

6.1 CompilationResult

@dataclass
class CompilationResult:
    target: str
    verilog: str
    verilog_lines: int
    data_width: int
    fraction: int
    overflow: str           # "saturate" or "wrap"
    rounding: str           # "nearest", "stochastic", "floor"
    estimated_luts: int
    estimated_dsps: int
    estimated_ffs: int
    guard_bits: int
    max_freq_mhz: float

6.2 OverflowProofResult

@dataclass
class OverflowProofResult:
    safe: bool
    guard_bits: int
    max_intermediate_bits: int
    overflow_possible_vars: list[str]

6.3 PowerEstimate

@dataclass
class PowerEstimate:
    dynamic_mw: float
    static_mw: float
    total_mw: float
    energy_per_spike_nj: float
    toggle_rate: float

6.4 ResourceEstimate

@dataclass
class ResourceEstimate:
    estimated_luts: int
    estimated_dsps: int
    estimated_ffs: int
    estimated_bram_18k: int
    mul_count: int
    add_count: int
    register_bits: int

7. Performance Characteristics

7.1 Compilation Speed

| Neuron Type | State Vars | Compile Time | Lines |
|---|---|---|---|
| LIF | 1 | ~5 ms | ~80 |
| Izhikevich | 2 | ~8 ms | ~120 |
| AdEx | 2 | ~10 ms | ~140 |
| HH | 4 | ~20 ms | ~250 |
| Custom (10 vars) | 10 | ~50 ms | ~600 |

7.2 Generated Verilog Quality

| Metric | LIF Q8.8 | Izh Q16.16 | HH Q16.16 |
|---|---|---|---|
| Lines | 80 | 120 | 250 |
| LUTs (Artix-7) | ~80 | ~200 | ~500 |
| DSPs | 1 | 3 | 8 |
| Fmax | 450 MHz | 400 MHz | 350 MHz |
| Power (28nm) | 0.003 mW | 0.008 mW | 0.06 mW |

7.3 LUT Accuracy

| Function | LUT Entries | Range | Max Error |
|---|---|---|---|
| exp | 16 | [-8, 8) | 1.5% |
| log | 16 | (0, 8) | 2.0% |
| tanh | 16 | [-8, 8) | 1.0% |
| sigmoid | 16 | [-8, 8) | 1.2% |
| sqrt | 16 | [0, 8) | 1.8% |

8. Test Suite and Verification

8.1 Equation Compiler Test

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.equation_compiler import compile_to_verilog

n = from_equations('dv/dt = -(v-E_L)/tau_m + I/C',
    threshold='v > -50', reset='v = -65',
    params=dict(E_L=-65, tau_m=10, C=1), init=dict(v=-65))
v = compile_to_verilog(n, module_name='sc_lif')
assert 'module sc_lif' in v
assert 'spike' in v
assert len(v.splitlines()) > 50
print(f'Equation compiler: PASS ({len(v.splitlines())} lines)')
"

8.2 Multi-Target Compilation Test

python -c "
from sc_neurocore.neurons.equation_builder import from_equations
from sc_neurocore.compiler.deployment import compile_multi_target

n = from_equations('dv/dt = -(v)/tau + I',
    threshold='v > 0', reset='v = 0',
    params=dict(tau=10), init=dict(v=0))
results = compile_multi_target(n, ['artix7', 'ice40'], 'test')
assert len(results) == 2
print('Multi-target: PASS')
"

8.3 Quantizer Test

python -c "
import numpy as np
from sc_neurocore.compiler.quantizer import quantize_weights
q = quantize_weights(np.array([1.0, -0.5, 0.25]), fmt='Q8.8')
assert q[0] == 256   # 1.0 * 256
assert q[1] == -128   # -0.5 * 256
assert q[2] == 64     # 0.25 * 256
print('Quantizer: PASS')
"

8.4 Static Analysis Test

python -c "
from sc_neurocore.compiler.static_analysis import prove_no_overflow
r = prove_no_overflow(
    '-(v - E_L)/tau_m + I/C',
    bounds={'v': (-128, 127), 'E_L': (-65, -65), 'tau_m': (10, 10), 'I': (0, 100), 'C': (1, 1)},
    data_width=16, fraction=8,
)
assert r.proven_safe
print('Overflow proof: PASS')
"

8.5 SVA Generation Test

python -c "
from sc_neurocore.compiler.static_analysis import generate_sva
sva = generate_sva(['v'], module_name='sc_lif')
assert 'a_no_overflow_v' in sva
assert 'c_spike_reachable' in sva
print('SVA generation: PASS')
"

8.6 Power Estimation Test

python -c "
from sc_neurocore.compiler.static_analysis import estimate_power
v = 'wire signed [15:0] _mul0 = a * b; reg signed [15:0] v_reg;'
pe = estimate_power(v, data_width=16, freq_mhz=200, process_nm=28)
assert pe.total_mw > 0
print(f'Power: {pe.total_mw:.6f} mW — PASS')
"

8.7 Deployment Functions Test

python -c "
from sc_neurocore.compiler.deployment import (
    generate_constraints,
    generate_cocotb_testbench,
    generate_riscv_driver,
    generate_sby_script,
)

xdc = generate_constraints('test', freq_mhz=200)
assert 'create_clock' in xdc

tb = generate_cocotb_testbench('test', data_width=16, fraction=8)
assert 'cocotb' in tb

d = generate_riscv_driver('test', {'v': 16}, rtos='freertos')
assert 'xTaskCreate' in d

sby = generate_sby_script('test', mode='bmc', depth=10)
assert 'mode bmc' in sby

print('All deployment: PASS')
"

8.8 E2E Pipeline Test

python -m pytest tests/e2e/test_e2e_pipeline.py -v

8.9 Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| compile_to_verilog fails | Invalid ODE syntax | Check equation string format |
| Overflow in simulation | Guard bits insufficient | Increase data width |
| Yosys synthesis fails | Unsupported Verilog construct | Check target compatibility |
| Power estimate zero | Empty Verilog source | Verify compilation output |
| MLIR bundle missing firtool | firtool not installed | Install CIRCT toolchain |

References

  1. Fixed-point arithmetic: Yates, R.B. "Fixed-Point Arithmetic: An Introduction." Digital Signal Labs, Technical Report, 2013.

  2. Yosys synthesis framework: Wolf, C. "Yosys Open SYnthesis Suite." https://yosyshq.net/yosys/, 2024.

  3. CIRCT project: LLVM Foundation. "Circuit IR Compilers and Tools." https://circt.llvm.org/, 2024.


Further Reading

Live-control MMIO Parameter Banks

MMIOUpdateSpec supports axi4_lite and pcie bus contracts for live parameter updates. Both protocols use the same deterministic register map:

| Register | Offset | Purpose |
|---|---|---|
| control | 0x00 | update, apply, rollback, and trap-clear control bits |
| status | 0x04 | ready, update acknowledgement, checksum, shadow, and trap status |
| bank_select | 0x08 | selected live parameter bank |
| entry_index | 0x0C | selected entry inside the bank |
| write_data_lo | 0x10 | low 32 bits of the staged encoded parameter word |
| write_data_hi | 0x14 | high 32 bits for 64-bit staged words |
| trap_status | 0x18 | sticky generated and external trap bits |
| trap_clear | 0x1C | sticky trap clear register |
| write_checksum | 0x20 | IEEE CRC32 guard over bank, entry, and staged value |

generate_live_parameter_bank() emits the AXI4-Lite core directly for bus_protocol="axi4_lite". For bus_protocol="pcie" it emits a PCIe-MMIO register-window adapter over that same core. The PCIe wrapper is deliberately a register-window contract: upstream PCIe hard IP or a board integration wrapper must decode posted writes and reads into the generated single-clock MMIO strobes. It is not a generated PCIe endpoint PHY.

Valid updates are fail-closed. The host must write bank select, entry index, low/high staged data, and the crc32-ieee-le-4x32 guard before asserting CONTROL_UPDATE_VALID; the active parameter output changes only after a separate CONTROL_COMMIT. The CRC32 payload is four little-endian 32-bit words: bank select, entry index, low data word, and high data word. Range traps latch staged overflow or underflow attempts and prevent shadow mutation. Active readback is fail-closed as well: invalid bank or entry selections on read_data_lo or read_data_hi return a bus error and latch invalid_selection rather than returning an ambiguous zero coefficient.
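
The checksum payload and staging order can be sketched on the host side. This assumes the guard is the standard IEEE 802.3 CRC-32 as computed by Python's `zlib.crc32`, and uses the register offsets from the table above; `update_checksum` and `staged_writes` are hypothetical helpers, not the library's `build_update_sequence(...)`. The control-register write is omitted because its bit layout is not given here.

```python
import struct
import zlib

REG_BANK_SELECT = 0x08
REG_ENTRY_INDEX = 0x0C
REG_WRITE_DATA_LO = 0x10
REG_WRITE_DATA_HI = 0x14
REG_WRITE_CHECKSUM = 0x20


def update_checksum(bank: int, entry: int, word: int) -> int:
    """CRC32 over four little-endian 32-bit words: bank, entry, data lo, data hi."""
    lo = word & 0xFFFFFFFF
    hi = (word >> 32) & 0xFFFFFFFF
    payload = struct.pack("<4I", bank, entry, lo, hi)
    return zlib.crc32(payload) & 0xFFFFFFFF


def staged_writes(bank: int, entry: int, word: int):
    """Ordered (offset, value) register writes that stage one update."""
    return [
        (REG_BANK_SELECT, bank),
        (REG_ENTRY_INDEX, entry),
        (REG_WRITE_DATA_LO, word & 0xFFFFFFFF),
        (REG_WRITE_DATA_HI, (word >> 32) & 0xFFFFFFFF),
        (REG_WRITE_CHECKSUM, update_checksum(bank, entry, word)),
        # CONTROL_UPDATE_VALID is asserted after these writes (layout not shown).
    ]


writes = staged_writes(bank=1, entry=2, word=0x1_0000_0003)
```

Because the bank and entry identifiers are part of the checksummed payload, a write that lands on the wrong selection fails the guard instead of silently updating a different coefficient.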