35 behavioral tests across 8 categories verify that WAFER and gforth
produce identical output. Performance benchmarks compare execution speed
on the Fibonacci, Factorial, GCD, NestedLoops, and Collatz workloads.
WAFER-only correctness tests run in CI without gforth; the cross-engine
comparison and the performance report are opt-in via --ignored.
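
The split between always-on and opt-in tests can be sketched as follows,
assuming the suite uses Rust's built-in test harness (which is where the
--ignored flag comes from). Everything named here is hypothetical:
run_wafer is a stand-in that only knows one snippet, because the real
WAFER entry point is not described above, and run_gforth, normalize, and
the test names are illustrative. The only external behavior relied on is
gforth's -e option, which evaluates a string of Forth code.

```rust
use std::process::Command;

/// Stand-in for the real WAFER entry point (an assumption: the actual API
/// is not shown in this document). It "evaluates" only the single snippet
/// used below so that the sketch compiles and runs on its own.
#[allow(dead_code)]
fn run_wafer(source: &str) -> String {
    match source {
        "1 2 + ." => "3 ".to_string(),
        _ => String::new(),
    }
}

/// Run a Forth snippet through gforth and capture stdout.
/// Returns None if the gforth binary cannot be started.
#[allow(dead_code)]
fn run_gforth(source: &str) -> Option<String> {
    let output = Command::new("gforth")
        .args(["-e", source, "-e", "bye"])
        .output()
        .ok()?;
    Some(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Collapse runs of whitespace so that trailing spaces and newlines
/// do not cause spurious mismatches between engines.
#[allow(dead_code)]
fn normalize(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Always runs, including in CI: needs no external tools.
    #[test]
    fn wafer_only_addition() {
        assert_eq!(normalize(&run_wafer("1 2 + .")), "3");
    }

    /// Skipped by default; runs only when ignored tests are requested.
    #[test]
    #[ignore = "requires gforth on PATH"]
    fn cross_engine_addition() {
        let src = "1 2 + .";
        let gforth = run_gforth(src).expect("gforth not available");
        assert_eq!(normalize(&run_wafer(src)), normalize(&gforth));
    }
}
```

Under these assumptions, a plain `cargo test` runs only the WAFER-only
test, while `cargo test -- --ignored` runs the gforth comparison.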