Update documentation to reflect current implementation state
README now documents all 70+ implemented words, working examples, architecture overview, and accurate compliance status. CLAUDE.md updated with actual file descriptions, patterns for adding new words, and current test count.
@@ -1,40 +1,57 @@
# WAFER Project Conventions

## What is WAFER?
WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly.
WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly. Currently a working Forth system with 70+ words and JIT compilation.

## Architecture
- Rust kernel (~35 primitives) + Forth standard library (everything else in .fth files)
- IR-based compilation pipeline: Forth -> IR -> type inference -> optimize -> WASM codegen
- Multi-typed stack: use WASM's native typed stack when types are known via inference, fall back to linear memory for dynamic/polymorphic cases
- Subroutine threading via WASM function tables
- JIT mode: per-word WASM modules + shared function table
- Consolidation mode: recompile all words into single optimized WASM module
- Each Forth word compiles to its own WASM module via `wasm-encoder`
- Modules share memory, globals (dsp/rsp), and a function table via wasmtime imports
- IR-based compilation: Forth -> `Vec<IrOp>` -> WASM codegen -> wasmtime instantiation
- Dictionary: linked-list in a `Vec<u8>` buffer simulating WASM linear memory
- Primitives: either IR-based (compiled to WASM) or host functions (Rust closures in wasmtime)

## Key Files
- `crates/core/src/outer.rs` -- ForthVM: the main runtime, outer interpreter, compiler, all primitives
- `crates/core/src/codegen.rs` -- IR-to-WASM translation, module generation, wasmtime execution tests
- `crates/core/src/dictionary.rs` -- Dictionary data structure with create/find/reveal
- `crates/core/src/ir.rs` -- IrOp enum (the intermediate representation)
- `crates/core/src/memory.rs` -- Memory layout constants (stack regions, dictionary base, etc.)
- `crates/cli/src/main.rs` -- CLI REPL with rustyline

## Adding a New Word

**IR primitive** (simple stack/arithmetic/logic -- preferred when possible):
```rust
self.register_primitive("WORD_NAME", false, vec![IrOp::Dup, IrOp::Mul])?;
```

**Host function** (needs Rust logic -- I/O, dictionary manipulation, complex stack access):
```rust
let func = Func::new(&mut self.store, func_type.clone(), move |mut caller, _params, _results| {
    // manipulate memory/globals directly
    Ok(())
});
self.register_host_primitive("WORD_NAME", false, func)?;
```

**Special interpreter token** (defining words like VARIABLE, CONSTANT, CREATE):
Handle in `interpret_token_immediate()` or `compile_token()` as a special case.
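
To make the stack effect of an IR primitive such as `vec![IrOp::Dup, IrOp::Mul]` concrete, here is a rough sketch that interprets a cut-down op list against a plain `Vec<i32>`. Only `Dup` and `Mul` are taken from the snippet above; the other variants, the `pop` helper, and the `run` function are hypothetical. WAFER compiles IR to WASM rather than interpreting it, so this shows only what the ops are meant to do, not how they are executed.

```rust
/// Cut-down stand-in for the IR. `Dup` and `Mul` appear in the snippet above;
/// the other variants are assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IrOp {
    Push(i32),
    Dup,
    Drop,
    Swap,
    Add,
    Mul,
}

/// Pops the top cell, reporting underflow as an error instead of panicking.
fn pop(stack: &mut Vec<i32>) -> Result<i32, String> {
    stack.pop().ok_or_else(|| "stack underflow".to_string())
}

/// Interprets `ops` against `stack`, using wrapping 32-bit arithmetic.
fn run(ops: &[IrOp], stack: &mut Vec<i32>) -> Result<(), String> {
    for op in ops {
        match *op {
            IrOp::Push(n) => stack.push(n),
            IrOp::Dup => {
                let a = pop(stack)?;
                stack.push(a);
                stack.push(a);
            }
            IrOp::Drop => {
                pop(stack)?;
            }
            IrOp::Swap => {
                let b = pop(stack)?;
                let a = pop(stack)?;
                stack.push(b);
                stack.push(a);
            }
            IrOp::Add => {
                let b = pop(stack)?;
                let a = pop(stack)?;
                stack.push(a.wrapping_add(b));
            }
            IrOp::Mul => {
                let b = pop(stack)?;
                let a = pop(stack)?;
                stack.push(a.wrapping_mul(b));
            }
        }
    }
    Ok(())
}
```

Running `[IrOp::Dup, IrOp::Mul]` on a stack holding `7` leaves `49`, which is the `( n -- n*n )` effect a word registered this way would have.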

## Code Style
- `cargo fmt` and `cargo clippy` must pass with no warnings
- `cargo fmt --all` and `cargo clippy --workspace` must pass with no warnings
- Every public function needs a doc comment
- Every module needs unit tests
- Use `thiserror` for error types in core crate, `anyhow` for CLI crate
- Use `thiserror` for error types in core crate, `anyhow` for CLI
- Prefer returning `Result` over panicking

## Testing (Critical)
- **Specs-driven TDD**: Every feature starts with its failing test, then implementation
- Run `cargo test --workspace` before committing
- Forth 2012 compliance: `cargo test --test compliance`
- Property-based tests with `proptest` for numeric operations and optimizer correctness
- Snapshot tests with `insta` for IR and WASM output
- 100% compliance is mandatory for each implemented word set before moving on
- Never break existing compliance tests

## Forth Source (.fth files)
- One file per word set in `forth/`
- Document each word with standard stack effect notation: `( before -- after )`
- Maximize words written in Forth, minimize Rust primitives
- Boot order: boot.fth -> core.fth -> core_ext.fth -> ... -> prelude.fth
## Testing
- Run `cargo test --workspace` before committing (currently 185 tests)
- Forth 2012 compliance: `cargo test -p wafer-core --test compliance`
- Test helper in `outer.rs`: `eval_output("forth code")` returns printed output as `String`
- Test helper: `eval_stack("forth code")` returns data stack as `Vec<i32>`

## Key Principles
1. Maximize Forth, minimize Rust (self-hosting goal)
2. Correctness first, performance second
1. Correctness first, performance second
2. Maximize Forth, minimize Rust (self-hosting goal -- not yet started)
3. Test-driven: if it's not tested, it doesn't work
4. Every word set at 100% compliance before moving to the next
4. Never break existing tests
5. No Co-Authored-By or AI attribution in commits
@@ -4,24 +4,51 @@

An optimizing Forth 2012 compiler targeting WebAssembly.

## Status

WAFER is a working Forth system. It JIT-compiles each word definition to a separate WASM module and executes via `wasmtime`. 185 tests passing.

**Working features:**
- Colon definitions with full control flow (IF/ELSE/THEN, DO/LOOP/+LOOP, BEGIN/UNTIL, BEGIN/WHILE/REPEAT)
- 70+ words: stack, arithmetic, comparison, logic, memory, I/O, defining words, system
- Recursion (RECURSE), nested control structures, loop counters (I, J)
- VARIABLE, CONSTANT, CREATE
- Number bases (HEX, DECIMAL), number prefixes ($hex, #dec, %bin)
- Comments (backslash, parentheses), string output (." ...)
- Interactive REPL with line editing
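
The number prefixes listed above can be sketched as a small parsing routine. This is an illustrative sketch, not WAFER's parser: the function name `parse_number`, the 32-bit cell size, the wrapping of unsigned literals into the signed range, and the placement of the minus sign after the prefix (as in the Forth 2012 number syntax) are all assumptions.

```rust
/// Parses a numeric token, honoring `$` (hex), `#` (decimal) and `%` (binary)
/// prefixes; unprefixed tokens use `base`.
/// Assumption: cells are 32-bit, and literals up to u32::MAX wrap into i32.
fn parse_number(token: &str, base: u32) -> Option<i32> {
    let (radix, rest) = if let Some(r) = token.strip_prefix('$') {
        (16, r)
    } else if let Some(r) = token.strip_prefix('#') {
        (10, r)
    } else if let Some(r) = token.strip_prefix('%') {
        (2, r)
    } else {
        (base, token)
    };
    // `from_str_radix` panics outside 2..=36, so reject such bases up front.
    if !(2..=36).contains(&radix) {
        return None;
    }
    // An optional minus sign follows the prefix.
    let (negative, digits) = match rest.strip_prefix('-') {
        Some(d) => (true, d),
        None => (false, rest),
    };
    // `from_str_radix` would accept a sign of its own; reject a second one here.
    if digits.is_empty() || digits.starts_with('+') || digits.starts_with('-') {
        return None;
    }
    let magnitude = i64::from_str_radix(digits, radix).ok()?;
    let value = if negative { -magnitude } else { magnitude };
    if value < i64::from(i32::MIN) || value > i64::from(u32::MAX) {
        return None;
    }
    // Truncating cast: 0xFFFF_FFFF becomes -1, like a 32-bit cell.
    Some(value as i32)
}
```

With this sketch, `parse_number("$FF", 10)` yields `Some(255)` regardless of the current base, while `parse_number("FF", 10)` yields `None`, so the token would fall through to a dictionary lookup.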

**Example session:**
```forth
: FIB DUP 2 < IF DROP 1 ELSE DUP 1 - RECURSE SWAP 2 - RECURSE + THEN ;
: FIBS 0 DO I FIB . LOOP ;
12 FIBS CR \ prints: 1 1 2 3 5 8 13 21 34 55 89 144

VARIABLE COUNTER 0 COUNTER !
: BUMP COUNTER @ 1 + COUNTER ! ;
BUMP BUMP BUMP COUNTER @ . \ prints: 3
```

## Goals

- **Full Forth 2012 compliance** -- all word sets, 100% test suite pass rate
- **Optimizing compiler** -- constant folding, inlining, peephole optimization, stack-to-local promotion
- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible, memory fallback for dynamic cases
- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else written in WAFER Forth
- **Dual mode** -- JIT for interactive development, consolidation recompilation into a single optimized WASM module
- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible
- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else in WAFER Forth
- **Consolidation mode** -- recompile all JIT words into a single optimized WASM module

## Architecture

```
Forth Source -> Outer Interpreter -> IR (with type inference) -> Optimization -> WASM Codegen
Forth Source -> Outer Interpreter -> IR -> [Optimize] -> WASM Codegen (wasm-encoder)
                                                              |
                                                    wasmtime instantiation
                                                    (shared memory + table)
```

- **Subroutine threading** via WASM function tables
- **JIT mode**: each new word compiles to a separate WASM module
- **Consolidation mode**: recompile all words into a single module with direct calls
- **IR-based pipeline** enables optimization passes before WASM emission
- **Subroutine threading** via WASM function tables and `call_indirect`
- **JIT mode**: each new word compiles to a separate WASM module linked to shared memory/globals/table
- **IR-based pipeline** enables future optimization passes before WASM emission
- **Dictionary**: linked-list word headers in simulated linear memory
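
As a rough native-Rust analogue of the shared function table, the sketch below keeps every word in one `Vec` and calls words by index, which is the lookup step `call_indirect` performs in WASM. The `Word`, `Vm`, `define`, and `call` names are hypothetical. The sketch interprets a list of indices instead of emitting call instructions, so it models only the table lookup, not the generated WASM.

```rust
/// An entry in the shared table: a native primitive or a colon definition
/// stored as a list of table indices (hypothetical representation).
#[derive(Clone)]
enum Word {
    Native(fn(&mut Vec<i32>)),
    Colon(Vec<usize>),
}

struct Vm {
    stack: Vec<i32>,
    table: Vec<Word>,
}

impl Vm {
    fn new() -> Self {
        Vm { stack: Vec::new(), table: Vec::new() }
    }

    /// Adds a word to the table and returns its index.
    fn define(&mut self, word: Word) -> usize {
        self.table.push(word);
        self.table.len() - 1
    }

    /// Plays the role of `call_indirect`: look up the index, then run the entry.
    /// Panics on an out-of-range index; acceptable for a sketch.
    fn call(&mut self, index: usize) {
        match self.table[index].clone() {
            Word::Native(f) => f(&mut self.stack),
            Word::Colon(body) => {
                for callee in body {
                    self.call(callee);
                }
            }
        }
    }
}

/// ( n -- n n ) Panics on underflow to keep the sketch short.
fn dup(stack: &mut Vec<i32>) {
    let top = *stack.last().expect("stack underflow");
    stack.push(top);
}

/// ( a b -- a*b ) with wrapping 32-bit multiplication.
fn mul(stack: &mut Vec<i32>) {
    let b = stack.pop().expect("stack underflow");
    let a = stack.pop().expect("stack underflow");
    stack.push(a.wrapping_mul(b));
}
```

Defining `dup` and `mul` as natives, then `SQUARE` as `Colon(vec![dup_ix, mul_ix])` and a second word as `Colon(vec![square_ix, square_ix])`, shows how each new definition only needs the indices of earlier entries, which is what lets separately compiled modules call each other through one shared table.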

## Building

@@ -38,43 +65,62 @@ cargo run -p wafer
# Run a Forth file
cargo run -p wafer -- file.fth

# Compile to optimized WASM
cargo run -p wafer -- --consolidate file.fth -o output.wasm
# Pipe input
echo ': SQUARE DUP * ; 7 SQUARE .' | cargo run -p wafer
```

## Testing

```bash
# All tests
# All tests (185 currently passing)
cargo test --workspace

# Forth 2012 compliance dashboard
cargo test --test compliance
cargo test -p wafer-core --test compliance

# Benchmarks
cargo bench --workspace
# Lints
cargo clippy --workspace
```

## Project Structure

```
crates/
  core/ wafer-core: compiler, IR, optimizer, codegen
  cli/ wafer: CLI REPL and AOT compiler
  core/ wafer-core: dictionary, IR, codegen (wasm-encoder), outer interpreter
  cli/ wafer: CLI REPL and file execution (wasmtime, rustyline)
  web/ wafer-web: browser bindings (planned)
forth/ Standard library written in WAFER Forth
tests/ Integration tests and Forth 2012 compliance suite
forth/ Standard library in WAFER Forth (planned, currently stubs)
tests/ Forth 2012 compliance suite (gerryjackson/forth2012-test-suite submodule)
```

## Implemented Words

### Core (Forth 2012 Section 6.1) -- In Progress

**Stack:** DUP DROP SWAP OVER ROT NIP TUCK 2DUP 2DROP 2SWAP 2OVER ?DUP PICK DEPTH
**Arithmetic:** + - * / MOD /MOD NEGATE ABS MIN MAX 1+ 1- 2* 2/
**Comparison:** = <> < > U< 0= 0< 0<> 0> WITHIN
**Logic:** AND OR XOR INVERT LSHIFT RSHIFT
**Memory:** @ ! C@ C! +! HERE ALLOT , C, CELLS CELL+ CHARS CHAR+ ALIGNED ALIGN MOVE FILL
**Control (compile-time):** IF ELSE THEN DO LOOP +LOOP I J UNLOOP LEAVE BEGIN UNTIL WHILE REPEAT RECURSE EXIT
**Defining:** : ; VARIABLE CONSTANT CREATE IMMEDIATE
**I/O:** . .S CR EMIT SPACE SPACES TYPE ." S"
**Return stack:** >R R> R@
**System:** EXECUTE ' CHAR [CHAR] ['] DECIMAL HEX BASE >BODY ENVIRONMENT? SOURCE ABORT TRUE FALSE BL
**Compiler:** LITERAL POSTPONE [ ]

### Not Yet Implemented

DOES> EVALUATE >NUMBER ACCEPT WORD FIND COUNT CMOVE CMOVE> >IN #TIB STATE (as variable) ABORT" and others needed for full Core compliance.

## Compliance Status

Targeting 100% Forth 2012 compliance, tested with [Gerry Jackson's forth2012-test-suite](https://github.com/gerryjackson/forth2012-test-suite).
Targeting 100% Forth 2012 compliance via [Gerry Jackson's test suite](https://github.com/gerryjackson/forth2012-test-suite).

| Word Set | Status |
|----------|--------|
| Core | Pending |
| Core | In progress (~70%) |
| Core Extensions | Pending |
| Block | N/A |
| Double-Number | Pending |
| Exception | Pending |
| Facility | Pending |