Optimize DO/LOOP: index/limit in WASM locals, J as IR primitive
Two-path DO/LOOP codegen based on static analysis of the loop body:

- Fast path (no calls, no >R/R> in body): index and limit live purely in
  WASM locals with zero return stack traffic per iteration. RFetch (I) and
  LoopJ (J) resolve to local.get instead of memory access.
- Slow path (body has calls or explicit RS ops): locals still used for loop
  control, but synced to return stack for LEAVE/UNLOOP compatibility.

Also converts J from a host function (WASM→Rust roundtrip per call) to an
IR primitive (IrOp::LoopJ) that compiles to local.get of the outer loop's
index local.

Performance impact (vs gforth, all opts enabled):

- Factorial: 1.02x → 0.94x (now faster than gforth)
- NestedLoops: 717x → 543x (24% faster, still bottlenecked by data stack)
- Fibonacci, GCD, Collatz: unchanged (don't use DO/LOOP)
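The fast/slow decision described above could be sketched roughly as follows. This is an illustration only, not the code in this commit: the `Call`, `ToR`, `Add`, and `DoLoop` variants, the `LoopPath` enum, and both function names are assumptions made for the example, and whether the real analysis recurses into nested loops is also assumed.

```rust
// Hypothetical, trimmed-down IR. Only FromR, RFetch and LoopJ are known
// from the diff; every other variant is assumed for this sketch.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum IrOp {
    Call(u32),         // call to another word (assumed variant)
    ToR,               // >R (assumed variant name)
    FromR,             // R>
    RFetch,            // I
    LoopJ,             // J
    Add,               // stand-in for any op that never touches the return stack
    DoLoop(Vec<IrOp>), // nested DO ... LOOP with its body (assumed shape)
}

#[derive(Debug, PartialEq)]
enum LoopPath {
    Fast, // index and limit stay in WASM locals only
    Slow, // locals are additionally synced to the return stack
}

/// Returns true if the body contains a call or an explicit >R / R>.
/// Nested loop bodies are scanned too (an assumption of this sketch).
fn body_needs_return_stack(body: &[IrOp]) -> bool {
    body.iter().any(|op| match op {
        IrOp::Call(_) | IrOp::ToR | IrOp::FromR => true,
        IrOp::DoLoop(inner) => body_needs_return_stack(inner),
        _ => false,
    })
}

/// Picks the codegen path for one DO/LOOP from its body.
fn choose_path(body: &[IrOp]) -> LoopPath {
    if body_needs_return_stack(body) {
        LoopPath::Slow
    } else {
        LoopPath::Fast
    }
}
```

Under these assumptions, a body of `[IrOp::RFetch, IrOp::Add]` selects `LoopPath::Fast`, while any body containing `IrOp::Call(_)`, `IrOp::ToR`, or `IrOp::FromR` selects `LoopPath::Slow`.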
@@ -119,6 +119,9 @@ pub enum IrOp {
     FromR,
     /// Copy from return stack: ( -- x ) ( R: x -- x )
     RFetch,
+    /// Read outer DO/LOOP index (J): ( -- n )
+    /// Compiled to local.get when loop locals are available.
+    LoopJ,
 
     // -- Forth locals (from {: ... :} syntax) --
     /// Get Forth local variable N: ( -- x )
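As a rough illustration of how `LoopJ` might resolve to a `local.get`, the sketch below keeps a stack of per-loop local indices during codegen. The `LoopLocals` and `Emitted` types and both function names are hypothetical and not taken from this commit. The `None` case stands for "loop locals not available"; what the real compiler emits in that case is not shown here.

```rust
/// Hypothetical record of the WASM local indices assigned to one DO/LOOP.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct LoopLocals {
    index: u32, // local holding the current loop index
    limit: u32, // local holding the loop limit
}

/// Assumed stand-in for an emitted WASM instruction.
#[derive(Debug, PartialEq)]
enum Emitted {
    LocalGet(u32),
}

/// I reads the index local of the innermost enclosing loop.
fn emit_loop_i(loops: &[LoopLocals]) -> Option<Emitted> {
    loops.last().map(|l| Emitted::LocalGet(l.index))
}

/// J reads the index local of the next-outer loop, i.e. the second entry
/// from the top of the loop stack. Returns None when there is no such loop.
fn emit_loop_j(loops: &[LoopLocals]) -> Option<Emitted> {
    let outer = loops.len().checked_sub(2)?;
    Some(Emitted::LocalGet(loops[outer].index))
}
```

With an outer loop whose index is in local 3 and an inner loop whose index is in local 5, `emit_loop_i` yields `LocalGet(5)` and `emit_loop_j` yields `LocalGet(3)` in this sketch.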