Update documentation to reflect current implementation state

README now documents all 70+ implemented words, working examples,
architecture overview, and accurate compliance status.
CLAUDE.md updated with actual file descriptions, patterns for adding
new words, and current test count.
2026-03-29 23:14:54 +02:00
parent 5e08f7a11f
commit ca07d358fb
2 changed files with 111 additions and 48 deletions
+44 -27
@@ -1,40 +1,57 @@
# WAFER Project Conventions
## What is WAFER?
WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly.
WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly. It is currently a working Forth system with 70+ words and JIT compilation.
## Architecture
- Rust kernel (~35 primitives) + Forth standard library (everything else in .fth files)
- IR-based compilation pipeline: Forth -> IR -> type inference -> optimize -> WASM codegen
- Multi-typed stack: use WASM's native typed stack when types are known via inference, fall back to linear memory for dynamic/polymorphic cases
- Subroutine threading via WASM function tables
- JIT mode: per-word WASM modules + shared function table
- Consolidation mode: recompile all words into single optimized WASM module
- Each Forth word compiles to its own WASM module via `wasm-encoder`
- Modules share memory, globals (dsp/rsp), and a function table via wasmtime imports
- IR-based compilation: Forth -> `Vec<IrOp>` -> WASM codegen -> wasmtime instantiation
- Dictionary: linked-list in a `Vec<u8>` buffer simulating WASM linear memory
- Primitives: either IR-based (compiled to WASM) or host functions (Rust closures in wasmtime)
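The linked-list dictionary described above can be pictured with a minimal sketch. The header layout used here (4-byte little-endian link, 1-byte name length, name bytes) and the names `Dict`, `create`, and `find` are assumptions for illustration only; the actual layout is whatever `dictionary.rs` defines.

```rust
/// Hypothetical header layout (an assumption, not WAFER's real format):
/// [link: u32 little-endian][name_len: u8][name bytes...]
struct Dict {
    mem: Vec<u8>, // stands in for WASM linear memory
    latest: u32,  // address of the newest header; 0 means "no entries"
}

impl Dict {
    fn new() -> Self {
        // Reserve the first four bytes so address 0 can act as the null link.
        Dict { mem: vec![0u8; 4], latest: 0 }
    }

    /// Append a header for `name` (assumed shorter than 256 bytes)
    /// and return the address of that header.
    fn create(&mut self, name: &str) -> u32 {
        let addr = self.mem.len() as u32;
        self.mem.extend_from_slice(&self.latest.to_le_bytes());
        self.mem.push(name.len() as u8);
        self.mem.extend_from_slice(name.as_bytes());
        self.latest = addr;
        addr
    }

    /// Walk the links from newest to oldest and return the first match.
    fn find(&self, name: &str) -> Option<u32> {
        let mut addr = self.latest;
        while addr != 0 {
            let a = addr as usize;
            let link = u32::from_le_bytes([
                self.mem[a],
                self.mem[a + 1],
                self.mem[a + 2],
                self.mem[a + 3],
            ]);
            let len = self.mem[a + 4] as usize;
            if &self.mem[a + 5..a + 5 + len] == name.as_bytes() {
                return Some(addr);
            }
            addr = link;
        }
        None
    }
}

fn main() {
    let mut d = Dict::new();
    let old = d.create("DUP");
    d.create("SWAP");
    let new = d.create("DUP"); // a redefinition shadows the older entry
    assert_ne!(old, new);
    assert_eq!(d.find("DUP"), Some(new));
    assert_eq!(d.find("NOPE"), None);
}
```

Searching from the newest header backwards is what makes redefinition work without deleting anything: the older `DUP` is still in the buffer, it is just never reached first.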
## Key Files
- `crates/core/src/outer.rs` -- ForthVM: the main runtime, outer interpreter, compiler, all primitives
- `crates/core/src/codegen.rs` -- IR-to-WASM translation, module generation, wasmtime execution tests
- `crates/core/src/dictionary.rs` -- Dictionary data structure with create/find/reveal
- `crates/core/src/ir.rs` -- IrOp enum (the intermediate representation)
- `crates/core/src/memory.rs` -- Memory layout constants (stack regions, dictionary base, etc.)
- `crates/cli/src/main.rs` -- CLI REPL with rustyline
## Adding a New Word
**IR primitive** (simple stack/arithmetic/logic -- preferred when possible):
```rust
self.register_primitive("WORD_NAME", false, vec![IrOp::Dup, IrOp::Mul])?;
```
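To make the meaning of an op list such as `vec![IrOp::Dup, IrOp::Mul]` concrete, here is a small self-contained sketch. The two-variant `IrOp` below is a stand-in for the real enum in `ir.rs`, and `run` is a hypothetical reference evaluator; WAFER itself compiles the ops to WASM rather than interpreting them.

```rust
/// Stand-in for two variants of WAFER's `IrOp` (the real enum has more).
#[derive(Debug, Clone, Copy)]
enum IrOp {
    Dup,
    Mul,
}

/// Hypothetical evaluator: applies each op, in order, to a data stack.
fn run(ops: &[IrOp], stack: &mut Vec<i32>) -> Result<(), String> {
    for op in ops {
        match op {
            IrOp::Dup => {
                let top = *stack.last().ok_or("stack underflow")?;
                stack.push(top);
            }
            IrOp::Mul => {
                let b = stack.pop().ok_or("stack underflow")?;
                let a = stack.pop().ok_or("stack underflow")?;
                // Wrapping multiply is an assumption made here so the sketch
                // never panics on overflow.
                stack.push(a.wrapping_mul(b));
            }
        }
    }
    Ok(())
}

fn main() {
    let mut stack = vec![7];
    run(&[IrOp::Dup, IrOp::Mul], &mut stack).unwrap();
    assert_eq!(stack, vec![49]); // DUP * squares the top of the stack
}
```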
**Host function** (needs Rust logic -- I/O, dictionary manipulation, complex stack access):
```rust
let func = Func::new(&mut self.store, func_type.clone(), move |mut caller, _params, _results| {
// manipulate memory/globals directly
Ok(())
});
self.register_host_primitive("WORD_NAME", false, func)?;
```
**Special interpreter token** (defining words like VARIABLE, CONSTANT, CREATE):
Handle in `interpret_token_immediate()` or `compile_token()` as a special case.
## Code Style
- `cargo fmt` and `cargo clippy` must pass with no warnings
- `cargo fmt --all` and `cargo clippy --workspace` must pass with no warnings
- Every public function needs a doc comment
- Every module needs unit tests
- Use `thiserror` for error types in core crate, `anyhow` for CLI crate
- Use `thiserror` for error types in core crate, `anyhow` for CLI
- Prefer returning `Result` over panicking
## Testing (Critical)
- **Specs-driven TDD**: Every feature starts with its failing test, then implementation
- Run `cargo test --workspace` before committing
- Forth 2012 compliance: `cargo test --test compliance`
- Property-based tests with `proptest` for numeric operations and optimizer correctness
- Snapshot tests with `insta` for IR and WASM output
- 100% compliance is mandatory for each implemented word set before moving on
- Never break existing compliance tests
## Forth Source (.fth files)
- One file per word set in `forth/`
- Document each word with standard stack effect notation: `( before -- after )`
- Maximize words written in Forth, minimize Rust primitives
- Boot order: boot.fth -> core.fth -> core_ext.fth -> ... -> prelude.fth
## Testing
- Run `cargo test --workspace` before committing (currently 185 tests)
- Forth 2012 compliance: `cargo test -p wafer-core --test compliance`
- Test helper in `outer.rs`: `eval_output("forth code")` returns printed output as `String`
- Test helper: `eval_stack("forth code")` returns data stack as `Vec<i32>`
## Key Principles
1. Maximize Forth, minimize Rust (self-hosting goal)
2. Correctness first, performance second
1. Correctness first, performance second
2. Maximize Forth, minimize Rust (self-hosting goal -- not yet started)
3. Test-driven: if it's not tested, it doesn't work
4. Every word set at 100% compliance before moving to the next
4. Never break existing tests
5. No Co-Authored-By or AI attribution in commits
+67 -21
@@ -4,24 +4,51 @@
An optimizing Forth 2012 compiler targeting WebAssembly.
## Status
WAFER is a working Forth system. It JIT-compiles each word definition to a separate WASM module and executes it via `wasmtime`. All 185 tests currently pass.
**Working features:**
- Colon definitions with full control flow (IF/ELSE/THEN, DO/LOOP/+LOOP, BEGIN/UNTIL, BEGIN/WHILE/REPEAT)
- 70+ words: stack, arithmetic, comparison, logic, memory, I/O, defining words, system
- Recursion (RECURSE), nested control structures, loop counters (I, J)
- VARIABLE, CONSTANT, CREATE
- Number bases (HEX, DECIMAL), number prefixes ($hex, #dec, %bin)
- Comments (backslash, parentheses), string output (." ...)
- Interactive REPL with line editing
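The number-prefix rule from the feature list can be sketched as follows. `parse_number` is a hypothetical helper written for this illustration, not WAFER's actual parser; it assumes single-cell signed results and rejects anything that does not fit in an `i32`.

```rust
/// Hypothetical helper: parse `token` as a number. `$`, `#`, and `%` force
/// hex, decimal, and binary; otherwise the current `base` is used.
fn parse_number(token: &str, base: u32) -> Option<i32> {
    let (radix, rest) = match *token.as_bytes().first()? {
        b'$' => (16, &token[1..]),
        b'#' => (10, &token[1..]),
        b'%' => (2, &token[1..]),
        _ => (base, token),
    };
    if !(2..=36).contains(&radix) {
        return None;
    }
    // An optional minus sign follows the prefix, e.g. `$-FF`.
    let (negative, digits) = match rest.strip_prefix('-') {
        Some(d) => (true, d),
        None => (false, rest),
    };
    // Require at least one digit, and only digits valid in this radix.
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    let magnitude = i64::from_str_radix(digits, radix).ok()?;
    let value = if negative { -magnitude } else { magnitude };
    if value < i32::MIN as i64 || value > i32::MAX as i64 {
        None
    } else {
        Some(value as i32)
    }
}

fn main() {
    assert_eq!(parse_number("$FF", 10), Some(255));
    assert_eq!(parse_number("%101", 10), Some(5));
    assert_eq!(parse_number("#10", 16), Some(10)); // prefix overrides HEX
    assert_eq!(parse_number("10", 16), Some(16)); // no prefix: BASE applies
    assert_eq!(parse_number("DUP", 10), None); // not a number: look up as a word
}
```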
**Example session:**
```forth
: FIB DUP 2 < IF DROP 1 ELSE DUP 1 - RECURSE SWAP 2 - RECURSE + THEN ;
: FIBS 0 DO I FIB . LOOP ;
12 FIBS CR \ prints: 1 1 2 3 5 8 13 21 34 55 89 144
VARIABLE COUNTER 0 COUNTER !
: BUMP COUNTER @ 1 + COUNTER ! ;
BUMP BUMP BUMP COUNTER @ . \ prints: 3
```
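As a cross-check on the session output, the two definitions can be transliterated into Rust. This is only a mirror of the Forth logic for readers less familiar with stack notation; the function names are chosen for this sketch.

```rust
/// Mirrors `FIB`: arguments below 2 yield 1, otherwise the sum of the
/// two preceding values.
fn fib(n: i32) -> i32 {
    if n < 2 {
        1
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

/// Mirrors `FIBS` for a positive limit: `.` prints each value followed by
/// a space. (Forth's `DO` behaves differently when limit equals start, so
/// this sketch is not equivalent for a limit of 0.)
fn fibs(limit: i32) -> String {
    (0..limit).map(|i| format!("{} ", fib(i))).collect()
}

fn main() {
    println!("{}", fibs(12));
    assert_eq!(fibs(12), "1 1 2 3 5 8 13 21 34 55 89 144 ");
}
```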
## Goals
- **Full Forth 2012 compliance** -- all word sets, 100% test suite pass rate
- **Optimizing compiler** -- constant folding, inlining, peephole optimization, stack-to-local promotion
- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible, memory fallback for dynamic cases
- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else written in WAFER Forth
- **Dual mode** -- JIT for interactive development, consolidation recompilation into a single optimized WASM module
- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible
- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else in WAFER Forth
- **Consolidation mode** -- recompile all JIT words into a single optimized WASM module
## Architecture
```
Forth Source -> Outer Interpreter -> IR (with type inference) -> Optimization -> WASM Codegen
Forth Source -> Outer Interpreter -> IR -> [Optimize] -> WASM Codegen (wasm-encoder)
|
wasmtime instantiation
(shared memory + table)
```
- **Subroutine threading** via WASM function tables
- **JIT mode**: each new word compiles to a separate WASM module
- **Consolidation mode**: recompile all words into a single module with direct calls
- **IR-based pipeline** enables optimization passes before WASM emission
- **Subroutine threading** via WASM function tables and `call_indirect`
- **JIT mode**: each new word compiles to a separate WASM module linked to shared memory/globals/table
- **IR-based pipeline** enables future optimization passes before WASM emission
- **Dictionary**: linked-list word headers in simulated linear memory
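The function-table dispatch described above can be modeled in plain Rust without any WASM tooling. In this sketch a `Vec` of boxed closures plays the role of the shared table, and `Table::call` plays the role of `call_indirect`; all names are hypothetical and the real system links separate WASM modules through wasmtime instead.

```rust
/// Stand-in for the shared WASM function table: one slot per word.
struct Table {
    slots: Vec<Box<dyn Fn(&Table, &mut Vec<i32>)>>,
}

impl Table {
    fn new() -> Self {
        Table { slots: Vec::new() }
    }

    /// Install a word and return its table index.
    fn add(&mut self, f: Box<dyn Fn(&Table, &mut Vec<i32>)>) -> usize {
        self.slots.push(f);
        self.slots.len() - 1
    }

    /// Analogue of `call_indirect`: dispatch through the table by index.
    fn call(&self, index: usize, stack: &mut Vec<i32>) {
        (self.slots[index])(self, stack);
    }

    /// A colon definition becomes a body that calls other slots in order.
    fn add_colon(&mut self, body: Vec<usize>) -> usize {
        self.add(Box::new(move |t: &Table, s: &mut Vec<i32>| {
            for &i in &body {
                t.call(i, s);
            }
        }))
    }
}

/// Builds DUP, *, and `: SQUARE DUP * ;`, then runs SQUARE on `n`.
fn square_via_table(n: i32) -> i32 {
    let mut t = Table::new();
    let dup = t.add(Box::new(|_t: &Table, s: &mut Vec<i32>| {
        let top = *s.last().expect("stack underflow");
        s.push(top);
    }));
    let mul = t.add(Box::new(|_t: &Table, s: &mut Vec<i32>| {
        let b = s.pop().expect("stack underflow");
        let a = s.pop().expect("stack underflow");
        s.push(a.wrapping_mul(b));
    }));
    let square = t.add_colon(vec![dup, mul]);

    let mut stack = vec![n];
    t.call(square, &mut stack);
    stack[0]
}

fn main() {
    assert_eq!(square_via_table(7), 49);
}
```

Because every word is reached through an index, a word defined later can be added to the table without touching the code of words that already exist, which is the property the per-word JIT mode relies on.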
## Building
@@ -38,43 +65,62 @@ cargo run -p wafer
# Run a Forth file
cargo run -p wafer -- file.fth
# Compile to optimized WASM
cargo run -p wafer -- --consolidate file.fth -o output.wasm
# Pipe input
echo ': SQUARE DUP * ; 7 SQUARE .' | cargo run -p wafer
```
## Testing
```bash
# All tests
# All tests (185 currently passing)
cargo test --workspace
# Forth 2012 compliance dashboard
cargo test --test compliance
cargo test -p wafer-core --test compliance
# Benchmarks
cargo bench --workspace
# Lints
cargo clippy --workspace
```
## Project Structure
```
crates/
core/ wafer-core: compiler, IR, optimizer, codegen
cli/ wafer: CLI REPL and AOT compiler
core/ wafer-core: dictionary, IR, codegen (wasm-encoder), outer interpreter
cli/ wafer: CLI REPL and file execution (wasmtime, rustyline)
web/ wafer-web: browser bindings (planned)
forth/ Standard library written in WAFER Forth
tests/ Integration tests and Forth 2012 compliance suite
forth/ Standard library in WAFER Forth (planned, currently stubs)
tests/ Forth 2012 compliance suite (gerryjackson/forth2012-test-suite submodule)
```
## Implemented Words
### Core (Forth 2012 Section 6.1) -- In Progress
**Stack:** DUP DROP SWAP OVER ROT NIP TUCK 2DUP 2DROP 2SWAP 2OVER ?DUP PICK DEPTH
**Arithmetic:** + - * / MOD /MOD NEGATE ABS MIN MAX 1+ 1- 2* 2/
**Comparison:** = <> < > U< 0= 0< 0<> 0> WITHIN
**Logic:** AND OR XOR INVERT LSHIFT RSHIFT
**Memory:** @ ! C@ C! +! HERE ALLOT , C, CELLS CELL+ CHARS CHAR+ ALIGNED ALIGN MOVE FILL
**Control (compile-time):** IF ELSE THEN DO LOOP +LOOP I J UNLOOP LEAVE BEGIN UNTIL WHILE REPEAT RECURSE EXIT
**Defining:** : ; VARIABLE CONSTANT CREATE IMMEDIATE
**I/O:** . .S CR EMIT SPACE SPACES TYPE ." S"
**Return stack:** >R R> R@
**System:** EXECUTE ' CHAR [CHAR] ['] DECIMAL HEX BASE >BODY ENVIRONMENT? SOURCE ABORT TRUE FALSE BL
**Compiler:** LITERAL POSTPONE [ ]
### Not Yet Implemented
DOES> EVALUATE >NUMBER ACCEPT WORD FIND COUNT CMOVE CMOVE> >IN #TIB STATE (as variable) ABORT" and others needed for full Core compliance.
## Compliance Status
Targeting 100% Forth 2012 compliance, tested with [Gerry Jackson's forth2012-test-suite](https://github.com/gerryjackson/forth2012-test-suite).
Targeting 100% Forth 2012 compliance via [Gerry Jackson's test suite](https://github.com/gerryjackson/forth2012-test-suite).
| Word Set | Status |
|----------|--------|
| Core | Pending |
| Core | In progress (~70%) |
| Core Extensions | Pending |
| Block | N/A |
| Double-Number | Pending |
| Exception | Pending |
| Facility | Pending |