Gene Expression Programming (GEP) in Go
May 11, 2026
github.com/gmlewis/gep/v2 is a typed Gene Expression Programming engine for
scientific and engineering search in Go.
The repository now has a clear default architecture:
- core defines typed genes, genomes, symbols, catalogs, and link operators
- evolution runs typed population search with configurable mutation, recombination, transposition, selection, statistics, and termination
- problems provides reusable typed scoring helpers for common boolean and floating-point tasks
- codegen renders evolved Karva programs through optional grammar backends
- env and gymnasium provide an exploratory environment/agent layer for discrete and tuple-space experimentation
Status
The primary workflow is the typed stack:
- core
- evolution
- problems
- codegen
The env subsystem is usable for discrete and tuple-space agent experiments,
but it is still an exploratory RL adapter rather than a complete modern RL
framework.
Legacy gene and genome packages remain only as compatibility/reference
layers. New workflow code should not build on them.
Package map
Core engine
| Package | Role | Use it when |
|---|---|---|
| core | Typed GEP representation and random genome construction | You need Node[T], Genome[T], Catalog[T], or direct genome evaluation |
| evolution | Typed population search engine | You need seeded experiments, operators, stopping criteria, or per-generation statistics |
| evolution/* | Operator and evaluation subsystems | You are tuning mutation, recombination, selection, transposition, termination, or statistics behavior |
| problems | Reusable typed scoring seams | Your problem is a reusable boolean or regression task instead of a one-off experiment |
| codegen | Grammar-backed code generation | You want Go (or other grammar-backed) source emitted from evolved Karva expressions |
| functions/*_nodes | Ready-made node catalogs | You want to start from the built-in boolean, integer, float, or vector-int operators |
| grammars | Code-generation grammars | You want to render evolved programs into source code |
| env / gymnasium | Exploratory environment integration | You are experimenting with Gymnasium-style environments and discrete action/observation spaces |
| experiments/* | End-to-end examples | You want concrete entrypoints that exercise the typed stack |
| gene, genome | Legacy compatibility layers | You are maintaining compatibility code, not building new features |
Applied-design substrate
The applied-design packages provide a shared pipeline contract for
multi-domain discovery experiments:
evolve → decode → constrain → validate → promote → export → checkpoint
| Package | Role |
|---|---|
| design | RunManifest schema, ArtifactRef, JSON helpers |
| design/scenarios | ScenarioSet, ScenarioRegistry, train/validation/test splits |
| design/promotion | PromotionReport, AcceptanceCriterion, threshold-driven promotion |
| design/checkpoint | Snapshot save/load, manifest replay |
| design/objectives | ObjectiveDef, AggregateResult, multi-objective scoring |
| domains/circuit | Serializable circuit model, structural validation |
| domains/circuit/artifacts | JSON, SPICE-netlist, and structural-Verilog emitters |
| domains/circuit/scenarios | Embedded half-adder circuit scenario fixtures |
| domains/voxel | Serializable voxel design types, occupancy validation |
| domains/voxel/artifacts | JSON, OBJ (Wavefront mesh), and summary emitters |
| domains/voxel/scenarios | Embedded bracket voxel scenario fixtures |
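The evolve → decode → constrain → validate → promote → export → checkpoint contract can be pictured as an ordered chain of steps that stops at the first failure. The sketch below is illustrative only: Stage, runPipeline, errBelowThreshold, and the no-op step bodies are hypothetical names made up for this example and are not the API of the design packages.

```go
package main

import (
	"errors"
	"fmt"
)

// Stage is a hypothetical named pipeline step.
// It is NOT a type from the design packages.
type Stage struct {
	Name string
	Run  func() error
}

// errBelowThreshold is a made-up failure used to show early termination.
var errBelowThreshold = errors.New("held-out score below threshold")

// runPipeline executes stages in order. It returns the names of the stages
// that completed and stops at the first stage that reports an error.
func runPipeline(stages []Stage) ([]string, error) {
	var done []string
	for _, s := range stages {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("stage %q: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	stages := []Stage{
		{"evolve", ok},
		{"decode", ok},
		{"constrain", ok},
		{"validate", func() error { return errBelowThreshold }},
		{"promote", ok},
		{"export", ok},
		{"checkpoint", ok},
	}
	done, err := runPipeline(stages)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

Stopping at the first failed stage means a candidate that fails validation is never promoted or exported, which is the point of treating the order as a contract.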
Quick start
The fastest path is:
- build or reuse a typed catalog
- define a typed scoring function over core.Genome[T]
- create a seeded evolution.Generation[T]
- evolve until the stop condition is met
- optionally render the result with codegen
    package main

    import (
        "fmt"
        "log"

        "github.com/gmlewis/gep/v2/core"
        "github.com/gmlewis/gep/v2/evolution"
        boolNodes "github.com/gmlewis/gep/v2/functions/bool_nodes"
    )

    // nandCases is the full truth table for a two-input NAND gate.
    var nandCases = []struct {
        in  []bool
        out bool
    }{
        {[]bool{false, false}, true},
        {[]bool{false, true}, true},
        {[]bool{true, false}, true},
        {[]bool{true, true}, false},
    }

    // scoreNAND scales the fraction of matched truth-table rows to 0..1000.
    // A genome that fails to evaluate scores 0.
    func scoreNAND(g core.Genome[bool]) float64 {
        hits := 0
        for _, tc := range nandCases {
            got, err := g.Eval(tc.in)
            if err != nil {
                return 0
            }
            if got == tc.out {
                hits++
            }
        }
        return 1000.0 * float64(hits) / float64(len(nandCases))
    }

    func main() {
        // Build a catalog from the built-in boolean operators.
        cat, err := boolNodes.CatalogFromNames([]string{"Not", "And", "Or"})
        if err != nil {
            log.Fatal(err)
        }

        // The link operator combines the outputs of multiple genes.
        link, err := boolNodes.LinkFuncFrom("Or")
        if err != nil {
            log.Fatal(err)
        }

        // Use the seeded constructor so the run can be replayed.
        gen, err := evolution.NewWithSeed(42, cat, 30, 7, 1, 2, 0, link, scoreNAND)
        if err != nil {
            log.Fatal(err)
        }

        best := gen.Evolve(250)
        fmt.Println(best.Score)
        fmt.Println(best.Genome.KarvaString())
    }
Optional code generation
If you want source code from an evolved genome, convert it to a codegen.Program
and render it with a grammar:
    // Fragment from a function that returns error; it needs os plus the
    // codegen and grammars packages imported.
    prog := codegen.ProgramFromSymbols(
        best.Genome.SymbolNamesPerGene(),
        nil,
        best.Genome.Link.Symbol(),
    )
    grammar, err := grammars.LoadGoBooleanAllGatesGrammar()
    if err != nil {
        return err
    }
    return codegen.Write(os.Stdout, prog, grammar)
See:
- experiments/nand
- experiments/symbolic_regression
Reproducible experiments
Use evolution.NewWithSeed whenever you care about deterministic replay.
For each run, record at least:
- seed
- package version / commit SHA
- catalog contents
- link operator
- population size and gene geometry
- mutation, recombination, and transposition configs
- stopping criteria
- scoring function definition and dataset/problem snapshot
If you emit code or downstream artifacts, store the final KarvaString,
SymbolNamesPerGene, constants, and rendered output together.
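One way to keep those items together is a small JSON record written next to each run. The sketch below is an assumption-laden illustration: RunRecord, its field names, encodeRecord, decodeRecord, and every placeholder value are hypothetical and do not mirror design.RunManifest or any other type in the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// RunRecord is a hypothetical replay record. The field names are
// illustrative only and are NOT the design.RunManifest schema.
type RunRecord struct {
	Seed           int64              `json:"seed"`
	Commit         string             `json:"commit"`
	Catalog        []string           `json:"catalog"`
	Link           string             `json:"link"`
	PopulationSize int                `json:"population_size"`
	GeneGeometry   map[string]int     `json:"gene_geometry"`
	Operators      map[string]float64 `json:"operators"`
	Stop           string             `json:"stop"`
	Dataset        string             `json:"dataset"`
	BestKarva      string             `json:"best_karva,omitempty"`
}

// encodeRecord renders the record as indented JSON.
func encodeRecord(r RunRecord) (string, error) {
	b, err := json.MarshalIndent(r, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeRecord parses a record previously produced by encodeRecord.
func decodeRecord(s string) (RunRecord, error) {
	var r RunRecord
	err := json.Unmarshal([]byte(s), &r)
	return r, err
}

func main() {
	// All values below are placeholders, not settings from a real run.
	rec := RunRecord{
		Seed:           42,
		Commit:         "<commit SHA>",
		Catalog:        []string{"Not", "And", "Or"},
		Link:           "Or",
		PopulationSize: 100,
		GeneGeometry:   map[string]int{"genes": 1, "head": 7},
		Operators:      map[string]float64{"mutation": 0.05},
		Stop:           "max generations or perfect score",
		Dataset:        "two-input NAND truth table",
	}
	out, err := encodeRecord(rec)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

Storing the record beside the rendered output keeps the inputs and the artifacts of a run in one place.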
Extending the engine
Add a new typed domain
- Define a Go type for your terminals and gene outputs.
- Implement core.Node[T] for each function/operator in the domain.
- Register those nodes in a core.Catalog[T].
- Define a typed link operator with core.NewLinkFunc.
- Write a scoring function over core.Genome[T].
- Evolve with evolution.New or evolution.NewWithSeed.
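The moving parts in these steps (typed operators, a catalog keyed by name, and a link operator that combines gene outputs) can be illustrated without the library. The sketch below uses stand-in types only: op, catalog, register, and linkGenes are hypothetical and deliberately do not reproduce the signatures of core.Node[T], core.Catalog[T], or core.NewLinkFunc, which should be taken from the package documentation.

```go
package main

import (
	"fmt"
	"log"
)

// op is a hypothetical stand-in for a typed function node.
// It is NOT core.Node[T].
type op[T any] struct {
	Name  string
	Arity int
	Apply func(args []T) T
}

// catalog is a hypothetical stand-in for a typed symbol table.
// It is NOT core.Catalog[T].
type catalog[T any] map[string]op[T]

// register adds an operator under its name.
func (c catalog[T]) register(o op[T]) { c[o.Name] = o }

// linkGenes folds per-gene outputs left to right with a binary link operator.
func linkGenes[T any](link op[T], outputs []T) (T, error) {
	var zero T
	if link.Arity != 2 {
		return zero, fmt.Errorf("link %q must be binary, got arity %d", link.Name, link.Arity)
	}
	if len(outputs) == 0 {
		return zero, fmt.Errorf("no gene outputs to link")
	}
	acc := outputs[0]
	for _, v := range outputs[1:] {
		acc = link.Apply([]T{acc, v})
	}
	return acc, nil
}

func main() {
	// The domain type here is float64; operators are registered by name.
	cat := catalog[float64]{}
	cat.register(op[float64]{"Add", 2, func(a []float64) float64 { return a[0] + a[1] }})
	cat.register(op[float64]{"Mul", 2, func(a []float64) float64 { return a[0] * a[1] }})

	// Pretend two genes evaluated to 2*3 and 1+4, then link them with Add.
	geneOutputs := []float64{
		cat["Mul"].Apply([]float64{2, 3}),
		cat["Add"].Apply([]float64{1, 4}),
	}
	total, err := linkGenes(cat["Add"], geneOutputs)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(total)
}
```

Fixing one type T across operators, terminals, and the link is what lets the compiler reject a catalog that mixes domains.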
Add reusable problems
Put reusable scoring/problem definitions into problems or a sibling package
with typed seams. Keep one-off experiment scoring logic close to the experiment
entrypoint.
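As a rough illustration of what a typed seam can look like, the quick-start scoring loop generalizes to any comparable output type once the genome is hidden behind a one-method interface. Everything named here (Evaluator, Case, ScoreCases, and the orGate and nandGate test doubles) is hypothetical and not part of the problems package; only the Eval call shape is taken from the quick start.

```go
package main

import "fmt"

// Evaluator is a hypothetical interface covering the one method the scorer
// needs. The method shape follows the g.Eval call in the quick start.
type Evaluator[T any] interface {
	Eval(in []T) (T, error)
}

// Case pairs an input vector with its expected output.
type Case[T comparable] struct {
	In   []T
	Want T
}

// ScoreCases returns 1000 times the fraction of matched cases.
// An evaluation error, or an empty case list, scores 0.
func ScoreCases[T comparable](e Evaluator[T], cases []Case[T]) float64 {
	if len(cases) == 0 {
		return 0
	}
	hits := 0
	for _, c := range cases {
		got, err := e.Eval(c.In)
		if err != nil {
			return 0
		}
		if got == c.Want {
			hits++
		}
	}
	return 1000.0 * float64(hits) / float64(len(cases))
}

// orGate and nandGate are stand-in evaluators used only to exercise the scorer.
type orGate struct{}

func (orGate) Eval(in []bool) (bool, error) {
	if len(in) != 2 {
		return false, fmt.Errorf("want 2 inputs, got %d", len(in))
	}
	return in[0] || in[1], nil
}

type nandGate struct{}

func (nandGate) Eval(in []bool) (bool, error) {
	if len(in) != 2 {
		return false, fmt.Errorf("want 2 inputs, got %d", len(in))
	}
	return !(in[0] && in[1]), nil
}

// nandTable is the two-input NAND truth table.
var nandTable = []Case[bool]{
	{[]bool{false, false}, true},
	{[]bool{false, true}, true},
	{[]bool{true, false}, true},
	{[]bool{true, true}, false},
}

func main() {
	fmt.Println(ScoreCases[bool](orGate{}, nandTable))
	fmt.Println(ScoreCases[bool](nandGate{}, nandTable))
}
```

Because the scorer depends only on the interface, the same function can be reused across experiments and tested with small stand-ins instead of evolved genomes.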
Add code generation
If the output can be expressed through the grammar system, use codegen and
grammars. If not, treat the evolved genome as an intermediate representation
and write a domain-specific emitter.
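A domain-specific emitter usually starts by decoding the Karva expression into a tree. Karva notation lists symbols in breadth-first order, so a queue rebuilds the tree in one pass. The sketch below assumes a flat slice of symbol names for a single gene and a caller-supplied arity table; expr, decodeKarva, and emit are hypothetical helpers, and the symbol names are examples rather than the library's actual symbol format.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// expr is one node of the decoded expression tree.
type expr struct {
	sym  string
	args []*expr
}

// decodeKarva rebuilds an expression tree from symbols given in breadth-first
// order. arity maps function names to argument counts; symbols missing from
// the map are treated as terminals. Symbols past the end of the tree are ignored.
func decodeKarva(symbols []string, arity map[string]int) (*expr, error) {
	if len(symbols) == 0 {
		return nil, fmt.Errorf("empty gene")
	}
	root := &expr{sym: symbols[0]}
	queue := []*expr{root}
	next := 1 // index of the next unread symbol
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		for i := 0; i < arity[n.sym]; i++ {
			if next >= len(symbols) {
				return nil, fmt.Errorf("gene too short: %q is missing arguments", n.sym)
			}
			child := &expr{sym: symbols[next]}
			next++
			n.args = append(n.args, child)
			queue = append(queue, child)
		}
	}
	return root, nil
}

// emit renders the tree as nested calls such as Or(a, b).
func emit(e *expr) string {
	if len(e.args) == 0 {
		return e.sym
	}
	parts := make([]string, len(e.args))
	for i, a := range e.args {
		parts[i] = emit(a)
	}
	return e.sym + "(" + strings.Join(parts, ", ") + ")"
}

func main() {
	// Assumed arities and terminal names, chosen for illustration.
	arity := map[string]int{"Not": 1, "And": 2, "Or": 2}
	gene := []string{"Not", "And", "a", "b", "a", "b", "a"}
	tree, err := decodeKarva(gene, arity)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(emit(tree))
}
```

Replacing emit with a walker that writes netlist entries, mesh operations, or controller parameters turns the same decoded tree into a domain artifact.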
Add RL or simulator-backed workflows
Use core and evolution as the search engine, then place simulator calls,
reward aggregation, train/validation splits, and artifact generation in a
domain-specific package. The current env package is a useful reference for
agent orchestration, but advanced RL work will often want a richer typed layer.
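To make the division of labor concrete, the sketch below keeps reward aggregation and the train/validation split in plain Go outside the search engine. It is a toy under stated assumptions: Policy, Simulator, meanReward, splitSeeds, and targetSim are hypothetical names, the simulator is a made-up one-step task, and none of this reflects the env or gymnasium package APIs.

```go
package main

import "fmt"

// Policy maps an observation to an action. It stands in for an evolved
// genome and is hypothetical.
type Policy func(obs float64) float64

// Simulator is a hypothetical episode runner, NOT the env package API.
type Simulator interface {
	RunEpisode(seed int64, p Policy) float64
}

// meanReward averages the episode reward over the given seeds.
func meanReward(sim Simulator, p Policy, seeds []int64) float64 {
	if len(seeds) == 0 {
		return 0
	}
	total := 0.0
	for _, s := range seeds {
		total += sim.RunEpisode(s, p)
	}
	return total / float64(len(seeds))
}

// splitSeeds reserves the last nVal seeds for validation.
func splitSeeds(seeds []int64, nVal int) (train, val []int64) {
	if nVal < 0 {
		nVal = 0
	}
	if nVal > len(seeds) {
		nVal = len(seeds)
	}
	cut := len(seeds) - nVal
	return seeds[:cut], seeds[cut:]
}

// targetSim is a toy one-step simulator: the observation is a seed-derived
// target and the reward is the negative squared error of the action.
type targetSim struct{}

func (targetSim) RunEpisode(seed int64, p Policy) float64 {
	target := float64(seed % 5)
	diff := p(target) - target
	return -(diff * diff)
}

func main() {
	train, val := splitSeeds([]int64{1, 2, 3, 4, 5, 6}, 2)
	constant := Policy(func(obs float64) float64 { return 0 })
	identity := Policy(func(obs float64) float64 { return obs })

	// A scoring function for evolution would use only the training seeds;
	// the validation seeds are held back for promotion decisions.
	fmt.Println("constant train:", meanReward(targetSim{}, constant, train))
	fmt.Println("constant val:  ", meanReward(targetSim{}, constant, val))
	fmt.Println("identity train:", meanReward(targetSim{}, identity, train))
}
```

Fixing the episode seeds makes the score a deterministic function of the policy, which keeps simulator-backed runs compatible with seeded replay.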
Included entrypoints
Classic GEP experiments
- go run ./experiments/nand
- go run ./experiments/odd-3-parity
- go run ./experiments/odd-7-parity
- go run ./experiments/6-multiplexer
- go run ./experiments/symbolic_regression
- go run ./examples/gymnasium/toy_text/blackjack-go
Applied-design pilots
These pilots demonstrate the full applied-design pipeline across three domains:
- go run ./experiments/circuit/half_adder: evolves a boolean half-adder and exports SPICE/Verilog artifacts
- go run ./experiments/voxel/bracket: evolves a voxel bracket geometry and exports JSON/OBJ artifacts
- go run ./experiments/control/mass_spring_damper: evolves a controller policy and exports a controller JSON artifact
Cross-domain regression suite (runs all three full pipelines as a single gate):
go test ./experiments/regression/...
Quality gates
Repo-level verification:
- ./scripts/test-all.sh
- ./scripts/bench-all.sh
GitHub Actions runs CI and benchmark workflows from .github/workflows/.
License
Copyright 2014-2026 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0. See LICENSE.