Files
WAFER/crates
ok2 4feeaeb0ba Optimize DO/LOOP: index/limit in WASM locals, J as IR primitive
Two-path DO/LOOP codegen based on static analysis of the loop body:

- Fast path (no calls, no >R/R> in body): index and limit live purely
  in WASM locals with zero return stack traffic per iteration. RFetch (I)
  and LoopJ (J) resolve to local.get instead of memory access.

- Slow path (body has calls or explicit RS ops): locals still used for
  loop control, but synced to return stack for LEAVE/UNLOOP compatibility.

Also converts J from a host function (WASM→Rust roundtrip per call) to
an IR primitive (IrOp::LoopJ) that compiles to local.get of the outer
loop's index local.

Performance impact (vs gforth, all opts enabled):
- Factorial: 1.02x → 0.94x (now faster than gforth)
- NestedLoops: 717x → 543x (24% faster, still bottlenecked by data stack)
- Fibonacci, GCD, Collatz: unchanged (don't use DO/LOOP)
2026-04-09 17:13:31 +02:00
..