From ca07d358fb5af2aca2fa7610e1132a9d6f6ffd49 Mon Sep 17 00:00:00 2001
From: Oleksandr Kozachuk
Date: Sun, 29 Mar 2026 23:14:54 +0200
Subject: [PATCH] Update documentation to reflect current implementation state

README now documents all 70+ implemented words, working examples,
architecture overview, and accurate compliance status. CLAUDE.md updated
with actual file descriptions, patterns for adding new words, and
current test count.
---
 CLAUDE.md | 71 +++++++++++++++++++++++++++-----------------
 README.md | 88 ++++++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 111 insertions(+), 48 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 3846c48..56722e4 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,40 +1,57 @@
 # WAFER Project Conventions
 
 ## What is WAFER?
-WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly.
+WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly. Currently a working Forth system with 70+ words and JIT compilation.
 
 ## Architecture
-- Rust kernel (~35 primitives) + Forth standard library (everything else in .fth files)
-- IR-based compilation pipeline: Forth -> IR -> type inference -> optimize -> WASM codegen
-- Multi-typed stack: use WASM's native typed stack when types are known via inference, fall back to linear memory for dynamic/polymorphic cases
-- Subroutine threading via WASM function tables
-- JIT mode: per-word WASM modules + shared function table
-- Consolidation mode: recompile all words into single optimized WASM module
+- Each Forth word compiles to its own WASM module via `wasm-encoder`
+- Modules share memory, globals (dsp/rsp), and a function table via wasmtime imports
+- IR-based compilation: Forth -> `Vec` -> WASM codegen -> wasmtime instantiation
+- Dictionary: linked-list in a `Vec` buffer simulating WASM linear memory
+- Primitives: either IR-based (compiled to WASM) or host functions (Rust closures in wasmtime)
+
+## Key Files
+- `crates/core/src/outer.rs` -- ForthVM: the main runtime, outer interpreter, compiler, all primitives
+- `crates/core/src/codegen.rs` -- IR-to-WASM translation, module generation, wasmtime execution tests
+- `crates/core/src/dictionary.rs` -- Dictionary data structure with create/find/reveal
+- `crates/core/src/ir.rs` -- IrOp enum (the intermediate representation)
+- `crates/core/src/memory.rs` -- Memory layout constants (stack regions, dictionary base, etc.)
+- `crates/cli/src/main.rs` -- CLI REPL with rustyline
+
+## Adding a New Word
+
+**IR primitive** (simple stack/arithmetic/logic -- preferred when possible):
+```rust
+self.register_primitive("WORD_NAME", false, vec![IrOp::Dup, IrOp::Mul])?;
+```
+
+**Host function** (needs Rust logic -- I/O, dictionary manipulation, complex stack access):
+```rust
+let func = Func::new(&mut self.store, func_type.clone(), move |mut caller, _params, _results| {
+    // manipulate memory/globals directly
+    Ok(())
+});
+self.register_host_primitive("WORD_NAME", false, func)?;
+```
+
+**Special interpreter token** (defining words like VARIABLE, CONSTANT, CREATE):
+Handle in `interpret_token_immediate()` or `compile_token()` as a special case.
 
 ## Code Style
-- `cargo fmt` and `cargo clippy` must pass with no warnings
+- `cargo fmt --all` and `cargo clippy --workspace` must pass with no warnings
 - Every public function needs a doc comment
-- Every module needs unit tests
-- Use `thiserror` for error types in core crate, `anyhow` for CLI crate
+- Use `thiserror` for error types in core crate, `anyhow` for CLI
 - Prefer returning `Result` over panicking
 
-## Testing (Critical)
-- **Specs-driven TDD**: Every feature starts with its failing test, then implementation
-- Run `cargo test --workspace` before committing
-- Forth 2012 compliance: `cargo test --test compliance`
-- Property-based tests with `proptest` for numeric operations and optimizer correctness
-- Snapshot tests with `insta` for IR and WASM output
-- 100% compliance is mandatory for each implemented word set before moving on
-- Never break existing compliance tests
-
-## Forth Source (.fth files)
-- One file per word set in `forth/`
-- Document each word with standard stack effect notation: `( before -- after )`
-- Maximize words written in Forth, minimize Rust primitives
-- Boot order: boot.fth -> core.fth -> core_ext.fth -> ... -> prelude.fth
+## Testing
+- Run `cargo test --workspace` before committing (currently 185 tests)
+- Forth 2012 compliance: `cargo test -p wafer-core --test compliance`
+- Test helper in outer.rs: `eval_output("forth code")` returns printed output as String
+- Test helper: `eval_stack("forth code")` returns data stack as Vec
 
 ## Key Principles
-1. Maximize Forth, minimize Rust (self-hosting goal)
-2. Correctness first, performance second
+1. Correctness first, performance second
+2. Maximize Forth, minimize Rust (self-hosting goal -- not yet started)
 3. Test-driven: if it's not tested, it doesn't work
-4. Every word set at 100% compliance before moving to the next
+4. Never break existing tests
+5. No Co-Authored-By or AI attribution in commits
diff --git a/README.md b/README.md
index 9441ccf..f583ba1 100644
--- a/README.md
+++ b/README.md
@@ -4,24 +4,51 @@
 
 An optimizing Forth 2012 compiler targeting WebAssembly.
 
+## Status
+
+WAFER is a working Forth system. It JIT-compiles each word definition to a separate WASM module and executes via `wasmtime`. 185 tests passing.
+
+**Working features:**
+- Colon definitions with full control flow (IF/ELSE/THEN, DO/LOOP/+LOOP, BEGIN/UNTIL, BEGIN/WHILE/REPEAT)
+- 70+ words: stack, arithmetic, comparison, logic, memory, I/O, defining words, system
+- Recursion (RECURSE), nested control structures, loop counters (I, J)
+- VARIABLE, CONSTANT, CREATE
+- Number bases (HEX, DECIMAL), number prefixes ($hex, #dec, %bin)
+- Comments (backslash, parentheses), string output (." ...)
+- Interactive REPL with line editing
+
+**Example session:**
+```forth
+: FIB DUP 2 < IF DROP 1 ELSE DUP 1 - RECURSE SWAP 2 - RECURSE + THEN ;
+: FIBS 0 DO I FIB . LOOP ;
+12 FIBS CR \ prints: 1 1 2 3 5 8 13 21 34 55 89 144
+
+VARIABLE COUNTER 0 COUNTER !
+: BUMP COUNTER @ 1 + COUNTER ! ;
+BUMP BUMP BUMP COUNTER @ . \ prints: 3
+```
+
 ## Goals
 
 - **Full Forth 2012 compliance** -- all word sets, 100% test suite pass rate
 - **Optimizing compiler** -- constant folding, inlining, peephole optimization, stack-to-local promotion
-- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible, memory fallback for dynamic cases
-- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else written in WAFER Forth
-- **Dual mode** -- JIT for interactive development, consolidation recompilation into a single optimized WASM module
+- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible
+- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else in WAFER Forth
+- **Consolidation mode** -- recompile all JIT words into a single optimized WASM module
 
 ## Architecture
 
 ```
-Forth Source -> Outer Interpreter -> IR (with type inference) -> Optimization -> WASM Codegen
+Forth Source -> Outer Interpreter -> IR -> [Optimize] -> WASM Codegen (wasm-encoder)
+                                                               |
+                                                     wasmtime instantiation
+                                                     (shared memory + table)
 ```
 
-- **Subroutine threading** via WASM function tables
-- **JIT mode**: each new word compiles to a separate WASM module
-- **Consolidation mode**: recompile all words into a single module with direct calls
-- **IR-based pipeline** enables optimization passes before WASM emission
+- **Subroutine threading** via WASM function tables and `call_indirect`
+- **JIT mode**: each new word compiles to a separate WASM module linked to shared memory/globals/table
+- **IR-based pipeline** enables future optimization passes before WASM emission
+- **Dictionary**: linked-list word headers in simulated linear memory
 
 ## Building
 
@@ -38,43 +65,62 @@ cargo run -p wafer
 # Run a Forth file
 cargo run -p wafer -- file.fth
 
-# Compile to optimized WASM
-cargo run -p wafer -- --consolidate file.fth -o output.wasm
+# Pipe input
+echo ': SQUARE DUP * ; 7 SQUARE .' | cargo run -p wafer
 ```
 
 ## Testing
 
 ```bash
-# All tests
+# All tests (185 currently passing)
 cargo test --workspace
 
 # Forth 2012 compliance dashboard
-cargo test --test compliance
+cargo test -p wafer-core --test compliance
 
-# Benchmarks
-cargo bench --workspace
+# Lints
+cargo clippy --workspace
 ```
 
 ## Project Structure
 
 ```
 crates/
-  core/ wafer-core: compiler, IR, optimizer, codegen
-  cli/ wafer: CLI REPL and AOT compiler
+  core/ wafer-core: dictionary, IR, codegen (wasm-encoder), outer interpreter
+  cli/ wafer: CLI REPL and file execution (wasmtime, rustyline)
   web/ wafer-web: browser bindings (planned)
-forth/ Standard library written in WAFER Forth
-tests/ Integration tests and Forth 2012 compliance suite
+forth/ Standard library in WAFER Forth (planned, currently stubs)
+tests/ Forth 2012 compliance suite (gerryjackson/forth2012-test-suite submodule)
 ```
 
+## Implemented Words
+
+### Core (Forth 2012 Section 6.1) -- In Progress
+
+**Stack:** DUP DROP SWAP OVER ROT NIP TUCK 2DUP 2DROP 2SWAP 2OVER ?DUP PICK DEPTH
+**Arithmetic:** + - * / MOD /MOD NEGATE ABS MIN MAX 1+ 1- 2* 2/
+**Comparison:** = <> < > U< 0= 0< 0<> 0> WITHIN
+**Logic:** AND OR XOR INVERT LSHIFT RSHIFT
+**Memory:** @ ! C@ C! +! HERE ALLOT , C, CELLS CELL+ CHARS CHAR+ ALIGNED ALIGN MOVE FILL
+**Control (compile-time):** IF ELSE THEN DO LOOP +LOOP I J UNLOOP LEAVE BEGIN UNTIL WHILE REPEAT RECURSE EXIT
+**Defining:** : ; VARIABLE CONSTANT CREATE IMMEDIATE
+**I/O:** . .S CR EMIT SPACE SPACES TYPE ." S"
+**Return stack:** >R R> R@
+**System:** EXECUTE ' CHAR [CHAR] ['] DECIMAL HEX BASE >BODY ENVIRONMENT? SOURCE ABORT TRUE FALSE BL
+**Compiler:** LITERAL POSTPONE [ ]
+
+### Not Yet Implemented
+
+DOES> EVALUATE >NUMBER ACCEPT WORD FIND COUNT CMOVE CMOVE> >IN #TIB STATE (as variable) ABORT" and others needed for full Core compliance.
+
 ## Compliance Status
 
-Targeting 100% Forth 2012 compliance, tested with [Gerry Jackson's forth2012-test-suite](https://github.com/gerryjackson/forth2012-test-suite).
+Targeting 100% Forth 2012 compliance via [Gerry Jackson's test suite](https://github.com/gerryjackson/forth2012-test-suite).
 
 | Word Set | Status |
 |----------|--------|
-| Core | Pending |
+| Core | In progress (~70%) |
 | Core Extensions | Pending |
-| Block | N/A |
 | Double-Number | Pending |
 | Exception | Pending |
 | Facility | Pending |