Update documentation to reflect current implementation state
README now documents all 70+ implemented words, working examples, architecture overview, and accurate compliance status. CLAUDE.md updated with actual file descriptions, patterns for adding new words, and current test count.
@@ -1,40 +1,57 @@
|
|||||||
# WAFER Project Conventions
|
# WAFER Project Conventions
|
||||||
|
|
||||||
## What is WAFER?
|
## What is WAFER?
|
||||||
WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly.
|
WAFER (WebAssembly Forth Engine in Rust) is an optimizing Forth 2012 compiler targeting WebAssembly. It is currently a working Forth system with 70+ words and JIT compilation.
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
- Rust kernel (~35 primitives) + Forth standard library (everything else in .fth files)
|
- Each Forth word compiles to its own WASM module via `wasm-encoder`
|
||||||
- IR-based compilation pipeline: Forth -> IR -> type inference -> optimize -> WASM codegen
|
- Modules share memory, globals (dsp/rsp), and a function table via wasmtime imports
|
||||||
- Multi-typed stack: use WASM's native typed stack when types are known via inference, fall back to linear memory for dynamic/polymorphic cases
|
- IR-based compilation: Forth -> `Vec<IrOp>` -> WASM codegen -> wasmtime instantiation
|
||||||
- Subroutine threading via WASM function tables
|
- Dictionary: linked-list in a `Vec<u8>` buffer simulating WASM linear memory
|
||||||
- JIT mode: per-word WASM modules + shared function table
|
- Primitives: either IR-based (compiled to WASM) or host functions (Rust closures in wasmtime)
|
||||||
- Consolidation mode: recompile all words into single optimized WASM module
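
To make the function-table idea concrete, here is a minimal sketch in plain Rust that only imitates the mechanism: a `Vec` of function pointers stands in for the shared WASM table, and a method named `call_indirect` stands in for the WASM instruction of the same name. The `Vm` struct, the `Word` alias, and the table indices are assumptions made for illustration and are not taken from the WAFER source.

```rust
/// A word is a function that operates on the VM (hypothetical signature).
type Word = fn(&mut Vm);

/// Stand-in for the shared state: a data stack plus a function table.
struct Vm {
    stack: Vec<i32>,
    table: Vec<Word>,
}

impl Vm {
    /// Imitates WASM `call_indirect`: look up a table slot, then call it.
    fn call_indirect(&mut self, index: usize) {
        let word = self.table[index]; // fn pointers are Copy
        word(self);
    }
}

// Table slots chosen for this sketch.
const DUP: usize = 0;
const MUL: usize = 1;
const SQUARE: usize = 2;

fn dup(vm: &mut Vm) {
    let top = *vm.stack.last().expect("stack underflow");
    vm.stack.push(top);
}

fn mul(vm: &mut Vm) {
    let b = vm.stack.pop().expect("stack underflow");
    let a = vm.stack.pop().expect("stack underflow");
    vm.stack.push(a.wrapping_mul(b));
}

/// A "colon definition" in subroutine-threaded style:
/// its body is just a sequence of calls through the table.
fn square(vm: &mut Vm) {
    vm.call_indirect(DUP);
    vm.call_indirect(MUL);
}

fn new_vm() -> Vm {
    Vm {
        stack: Vec::new(),
        table: vec![dup as Word, mul as Word, square as Word],
    }
}

fn main() {
    let mut vm = new_vm();
    vm.stack.push(7);
    vm.call_indirect(SQUARE);
    println!("{:?}", vm.stack); // [49]
}
```

Because every call goes through a table slot rather than a direct reference, a newly compiled word only needs a new slot to become callable from the words defined after it.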
|
|
||||||
|
## Key Files
|
||||||
|
- `crates/core/src/outer.rs` -- ForthVM: the main runtime, outer interpreter, compiler, all primitives
|
||||||
|
- `crates/core/src/codegen.rs` -- IR-to-WASM translation, module generation, wasmtime execution tests
|
||||||
|
- `crates/core/src/dictionary.rs` -- Dictionary data structure with create/find/reveal
|
||||||
|
- `crates/core/src/ir.rs` -- IrOp enum (the intermediate representation)
|
||||||
|
- `crates/core/src/memory.rs` -- Memory layout constants (stack regions, dictionary base, etc.)
|
||||||
|
- `crates/cli/src/main.rs` -- CLI REPL with rustyline
|
||||||
|
|
||||||
|
## Adding a New Word
|
||||||
|
|
||||||
|
**IR primitive** (simple stack/arithmetic/logic -- preferred when possible):
|
||||||
|
```rust
self.register_primitive("WORD_NAME", false, vec![IrOp::Dup, IrOp::Mul])?;
```
|
||||||
|
|
||||||
|
**Host function** (needs Rust logic -- I/O, dictionary manipulation, complex stack access):
|
||||||
|
```rust
let func = Func::new(&mut self.store, func_type.clone(), move |mut caller, _params, _results| {
    // manipulate memory/globals directly
    Ok(())
});
self.register_host_primitive("WORD_NAME", false, func)?;
```
|
||||||
|
|
||||||
|
**Special interpreter token** (defining words like VARIABLE, CONSTANT, CREATE):
|
||||||
|
Handle in `interpret_token_immediate()` or `compile_token()` as a special case.
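
As an illustration of what an IR body such as `vec![IrOp::Dup, IrOp::Mul]` means, the sketch below interprets a tiny IR directly against a `Vec<i32>` data stack. It is a stand-alone model, not WAFER code: the enum here is a hypothetical four-variant subset (the real `IrOp` in `crates/core/src/ir.rs` is not shown in this document), and the `run` function is an assumption introduced only to show the stack effect.

```rust
/// Hypothetical subset of an IR; NOT the real `IrOp` from `ir.rs`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IrOp {
    Push(i32),
    Dup,
    Add,
    Mul,
}

/// Reference interpreter for the sketch: applies each op to the data stack.
/// Returns an error on stack underflow instead of panicking.
fn run(ops: &[IrOp], stack: &mut Vec<i32>) -> Result<(), String> {
    for op in ops {
        match *op {
            IrOp::Push(n) => stack.push(n),
            IrOp::Dup => {
                let top = *stack.last().ok_or("stack underflow")?;
                stack.push(top);
            }
            IrOp::Add | IrOp::Mul => {
                let b = stack.pop().ok_or("stack underflow")?;
                let a = stack.pop().ok_or("stack underflow")?;
                // Forth cells wrap on overflow, so use wrapping arithmetic.
                let result = if *op == IrOp::Add {
                    a.wrapping_add(b)
                } else {
                    a.wrapping_mul(b)
                };
                stack.push(result);
            }
        }
    }
    Ok(())
}

fn main() {
    // The body from the IR primitive example: ( n -- n*n )
    let square = [IrOp::Dup, IrOp::Mul];
    let mut stack = vec![7];
    run(&square, &mut stack).unwrap();
    println!("{:?}", stack); // [49]

    // 3 4 +
    let mut stack = Vec::new();
    run(&[IrOp::Push(3), IrOp::Push(4), IrOp::Add], &mut stack).unwrap();
    println!("{:?}", stack); // [7]
}
```

A model like this can also serve as a reference when checking that generated WASM for an IR primitive has the intended stack effect.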
|
||||||
|
|
||||||
## Code Style
|
## Code Style
|
||||||
- `cargo fmt` and `cargo clippy` must pass with no warnings
|
- `cargo fmt --all` and `cargo clippy --workspace` must pass with no warnings
|
||||||
- Every public function needs a doc comment
|
- Every public function needs a doc comment
|
||||||
- Every module needs unit tests
|
- Use `thiserror` for error types in core crate, `anyhow` for CLI
|
||||||
- Use `thiserror` for error types in core crate, `anyhow` for CLI crate
|
|
||||||
- Prefer returning `Result` over panicking
|
- Prefer returning `Result` over panicking
|
||||||
|
|
||||||
## Testing (Critical)
|
## Testing
|
||||||
- **Specs-driven TDD**: Every feature starts with its failing test, then implementation
|
- Run `cargo test --workspace` before committing (currently 185 tests)
|
||||||
- Run `cargo test --workspace` before committing
|
- Forth 2012 compliance: `cargo test -p wafer-core --test compliance`
|
||||||
- Forth 2012 compliance: `cargo test --test compliance`
|
- Test helper in `outer.rs`: `eval_output("forth code")` returns printed output as `String`
|
||||||
- Property-based tests with `proptest` for numeric operations and optimizer correctness
|
- Test helper: `eval_stack("forth code")` returns data stack as `Vec<i32>`
|
||||||
- Snapshot tests with `insta` for IR and WASM output
|
|
||||||
- 100% compliance is mandatory for each implemented word set before moving on
|
|
||||||
- Never break existing compliance tests
|
|
||||||
|
|
||||||
## Forth Source (.fth files)
|
|
||||||
- One file per word set in `forth/`
|
|
||||||
- Document each word with standard stack effect notation: `( before -- after )`
|
|
||||||
- Maximize words written in Forth, minimize Rust primitives
|
|
||||||
- Boot order: boot.fth -> core.fth -> core_ext.fth -> ... -> prelude.fth
|
|
||||||
|
|
||||||
## Key Principles
|
## Key Principles
|
||||||
1. Maximize Forth, minimize Rust (self-hosting goal)
|
1. Correctness first, performance second
|
||||||
2. Correctness first, performance second
|
2. Maximize Forth, minimize Rust (self-hosting goal -- not yet started)
|
||||||
3. Test-driven: if it's not tested, it doesn't work
|
3. Test-driven: if it's not tested, it doesn't work
|
||||||
4. Every word set at 100% compliance before moving to the next
|
4. Never break existing tests
|
||||||
|
5. No Co-Authored-By or AI attribution in commits
|
||||||
|
|||||||
@@ -4,24 +4,51 @@
|
|||||||
|
|
||||||
An optimizing Forth 2012 compiler targeting WebAssembly.
|
An optimizing Forth 2012 compiler targeting WebAssembly.
|
||||||
|
|
||||||
|
## Status
|
||||||
|
|
||||||
|
WAFER is a working Forth system. It JIT-compiles each word definition to a separate WASM module and executes it via `wasmtime`. The test suite currently has 185 passing tests.
|
||||||
|
|
||||||
|
**Working features:**
|
||||||
|
- Colon definitions with full control flow (IF/ELSE/THEN, DO/LOOP/+LOOP, BEGIN/UNTIL, BEGIN/WHILE/REPEAT)
|
||||||
|
- 70+ words: stack, arithmetic, comparison, logic, memory, I/O, defining words, system
|
||||||
|
- Recursion (RECURSE), nested control structures, loop counters (I, J)
|
||||||
|
- VARIABLE, CONSTANT, CREATE
|
||||||
|
- Number bases (HEX, DECIMAL), number prefixes ($hex, #dec, %bin)
|
||||||
|
- Comments (backslash, parentheses), string output (`." ..."`)
|
||||||
|
- Interactive REPL with line editing
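
The number-prefix rule in the list above can be sketched as a small stand-alone Rust function. This is an illustration under stated assumptions, not WAFER's parser: the name `parse_number` and its signature are hypothetical, the sign is accepted only after the prefix, and values outside the `i32` range are rejected rather than wrapped.

```rust
/// Parse a Forth numeric token (hypothetical helper, not WAFER's API).
/// `$` forces hex, `#` forces decimal, `%` forces binary;
/// otherwise the current `base` is used.
fn parse_number(token: &str, base: u32) -> Option<i32> {
    if !(2..=36).contains(&base) {
        return None;
    }
    // Pick the radix from the prefix character, if any.
    let (radix, rest) = match token.bytes().next()? {
        b'$' => (16, &token[1..]),
        b'#' => (10, &token[1..]),
        b'%' => (2, &token[1..]),
        _ => (base, token),
    };
    // An optional minus sign follows the prefix.
    let (negative, digits) = match rest.strip_prefix('-') {
        Some(d) => (true, d),
        None => (false, rest),
    };
    // Reject empty digit strings and stray signs such as "--5" or "+5".
    if digits.is_empty() || digits.starts_with('+') || digits.starts_with('-') {
        return None;
    }
    let magnitude = i64::from_str_radix(digits, radix).ok()?;
    let value = if negative { -magnitude } else { magnitude };
    // Simplification: out-of-range values are rejected, not wrapped.
    if value < i32::MIN as i64 || value > i32::MAX as i64 {
        None
    } else {
        Some(value as i32)
    }
}

fn main() {
    println!("{:?}", parse_number("$FF", 10)); // Some(255)
    println!("{:?}", parse_number("#-12", 16)); // Some(-12)
    println!("{:?}", parse_number("%1010", 10)); // Some(10)
    println!("{:?}", parse_number("ff", 10)); // None
}
```

A token that fails to parse as a number would then be treated by the outer interpreter as an unknown word.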
|
||||||
|
|
||||||
|
**Example session:**
|
||||||
|
```forth
: FIB DUP 2 < IF DROP 1 ELSE DUP 1 - RECURSE SWAP 2 - RECURSE + THEN ;
: FIBS 0 DO I FIB . LOOP ;
12 FIBS CR \ prints: 1 1 2 3 5 8 13 21 34 55 89 144

VARIABLE COUNTER 0 COUNTER !
: BUMP COUNTER @ 1 + COUNTER ! ;
BUMP BUMP BUMP COUNTER @ . \ prints: 3
```
|
||||||
|
|
||||||
## Goals
|
## Goals
|
||||||
|
|
||||||
- **Full Forth 2012 compliance** -- all word sets, 100% test suite pass rate
|
- **Full Forth 2012 compliance** -- all word sets, 100% test suite pass rate
|
||||||
- **Optimizing compiler** -- constant folding, inlining, peephole optimization, stack-to-local promotion
|
- **Optimizing compiler** -- constant folding, inlining, peephole optimization, stack-to-local promotion
|
||||||
- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible, memory fallback for dynamic cases
|
- **Multi-typed stack** -- type inference uses WASM's native typed stack when possible
|
||||||
- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else written in WAFER Forth
|
- **Self-hosting** -- minimal Rust kernel (~35 primitives), everything else in WAFER Forth
|
||||||
- **Dual mode** -- JIT for interactive development, consolidation recompilation into a single optimized WASM module
|
- **Consolidation mode** -- recompile all JIT words into a single optimized WASM module
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
```
|
```
|
||||||
Forth Source -> Outer Interpreter -> IR (with type inference) -> Optimization -> WASM Codegen
|
Forth Source -> Outer Interpreter -> IR -> [Optimize] -> WASM Codegen (wasm-encoder)
|
||||||
|
|
|
||||||
|
wasmtime instantiation
|
||||||
|
(shared memory + table)
|
||||||
```
|
```
|
||||||
|
|
||||||
- **Subroutine threading** via WASM function tables
|
- **Subroutine threading** via WASM function tables and `call_indirect`
|
||||||
- **JIT mode**: each new word compiles to a separate WASM module
|
- **JIT mode**: each new word compiles to a separate WASM module linked to shared memory/globals/table
|
||||||
- **Consolidation mode**: recompile all words into a single module with direct calls
|
- **IR-based pipeline** enables future optimization passes before WASM emission
|
||||||
- **IR-based pipeline** enables optimization passes before WASM emission
|
- **Dictionary**: linked-list word headers in simulated linear memory
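
The dictionary bullet can be pictured with the following sketch of a linked list of word headers stored in a `Vec<u8>`. The header layout used here (4-byte little-endian link, 1-byte name length, name bytes, padding to a 4-byte boundary) and the method names are assumptions for illustration; the actual layout in `crates/core/src/dictionary.rs` may differ, and the `reveal` step is omitted.

```rust
/// Address 0 is reserved to mean "no entry" in this sketch.
const NIL: u32 = 0;

/// Hypothetical dictionary: headers live in a byte buffer that
/// stands in for WASM linear memory.
struct Dictionary {
    mem: Vec<u8>,
    latest: u32, // address of the most recently created header
}

impl Dictionary {
    fn new() -> Self {
        // Reserve the first cell so that no header has address 0.
        Dictionary { mem: vec![0; 4], latest: NIL }
    }

    /// Append a header: [link: u32 LE][len: u8][name bytes][padding].
    fn create(&mut self, name: &str) -> u32 {
        assert!(name.len() <= 255, "name too long for a 1-byte length");
        let addr = self.mem.len() as u32;
        self.mem.extend_from_slice(&self.latest.to_le_bytes());
        self.mem.push(name.len() as u8);
        self.mem.extend(name.bytes().map(|b| b.to_ascii_uppercase()));
        while self.mem.len() % 4 != 0 {
            self.mem.push(0); // keep headers cell-aligned
        }
        self.latest = addr;
        addr
    }

    /// Walk the links from newest to oldest; lookup is case-insensitive.
    fn find(&self, name: &str) -> Option<u32> {
        let wanted: Vec<u8> = name.bytes().map(|b| b.to_ascii_uppercase()).collect();
        let mut addr = self.latest;
        while addr != NIL {
            let a = addr as usize;
            let link = u32::from_le_bytes([
                self.mem[a],
                self.mem[a + 1],
                self.mem[a + 2],
                self.mem[a + 3],
            ]);
            let len = self.mem[a + 4] as usize;
            if self.mem[a + 5..a + 5 + len] == wanted[..] {
                return Some(addr);
            }
            addr = link;
        }
        None
    }
}

fn main() {
    let mut dict = Dictionary::new();
    let first = dict.create("DUP");
    let second = dict.create("SWAP");
    println!("{:?} {:?}", dict.find("dup") == Some(first), dict.find("swap") == Some(second));
    println!("{:?}", dict.find("NOPE")); // None
}
```

Searching from the newest header backwards is what lets a later definition shadow an earlier one with the same name.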
|
||||||
|
|
||||||
## Building
|
## Building
|
||||||
|
|
||||||
@@ -38,43 +65,62 @@ cargo run -p wafer
|
|||||||
# Run a Forth file
|
# Run a Forth file
|
||||||
cargo run -p wafer -- file.fth
|
cargo run -p wafer -- file.fth
|
||||||
|
|
||||||
# Compile to optimized WASM
|
# Pipe input
|
||||||
cargo run -p wafer -- --consolidate file.fth -o output.wasm
|
echo ': SQUARE DUP * ; 7 SQUARE .' | cargo run -p wafer
|
||||||
```
|
```
|
||||||
|
|
||||||
## Testing
|
## Testing
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# All tests
|
# All tests (185 currently passing)
|
||||||
cargo test --workspace
|
cargo test --workspace
|
||||||
|
|
||||||
# Forth 2012 compliance dashboard
|
# Forth 2012 compliance dashboard
|
||||||
cargo test --test compliance
|
cargo test -p wafer-core --test compliance
|
||||||
|
|
||||||
# Benchmarks
|
# Lints
|
||||||
cargo bench --workspace
|
cargo clippy --workspace
|
||||||
```
|
```
|
||||||
|
|
||||||
## Project Structure
|
## Project Structure
|
||||||
|
|
||||||
```
|
```
|
||||||
crates/
|
crates/
|
||||||
core/ wafer-core: compiler, IR, optimizer, codegen
|
core/ wafer-core: dictionary, IR, codegen (wasm-encoder), outer interpreter
|
||||||
cli/ wafer: CLI REPL and AOT compiler
|
cli/ wafer: CLI REPL and file execution (wasmtime, rustyline)
|
||||||
web/ wafer-web: browser bindings (planned)
|
web/ wafer-web: browser bindings (planned)
|
||||||
forth/ Standard library written in WAFER Forth
|
forth/ Standard library in WAFER Forth (planned, currently stubs)
|
||||||
tests/ Integration tests and Forth 2012 compliance suite
|
tests/ Forth 2012 compliance suite (gerryjackson/forth2012-test-suite submodule)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Implemented Words
|
||||||
|
|
||||||
|
### Core (Forth 2012 Section 6.1) -- In Progress
|
||||||
|
|
||||||
|
**Stack:** DUP DROP SWAP OVER ROT NIP TUCK 2DUP 2DROP 2SWAP 2OVER ?DUP PICK DEPTH
|
||||||
|
**Arithmetic:** + - * / MOD /MOD NEGATE ABS MIN MAX 1+ 1- 2* 2/
|
||||||
|
**Comparison:** = <> < > U< 0= 0< 0<> 0> WITHIN
|
||||||
|
**Logic:** AND OR XOR INVERT LSHIFT RSHIFT
|
||||||
|
**Memory:** @ ! C@ C! +! HERE ALLOT , C, CELLS CELL+ CHARS CHAR+ ALIGNED ALIGN MOVE FILL
|
||||||
|
**Control (compile-time):** IF ELSE THEN DO LOOP +LOOP I J UNLOOP LEAVE BEGIN UNTIL WHILE REPEAT RECURSE EXIT
|
||||||
|
**Defining:** : ; VARIABLE CONSTANT CREATE IMMEDIATE
|
||||||
|
**I/O:** . .S CR EMIT SPACE SPACES TYPE ." S"
|
||||||
|
**Return stack:** >R R> R@
|
||||||
|
**System:** EXECUTE ' CHAR [CHAR] ['] DECIMAL HEX BASE >BODY ENVIRONMENT? SOURCE ABORT TRUE FALSE BL
|
||||||
|
**Compiler:** LITERAL POSTPONE [ ]
|
||||||
|
|
||||||
|
### Not Yet Implemented
|
||||||
|
|
||||||
|
`DOES>`, `EVALUATE`, `>NUMBER`, `ACCEPT`, `WORD`, `FIND`, `COUNT`, `CMOVE`, `CMOVE>`, `>IN`, `#TIB`, `STATE` (as variable), `ABORT"`, and other words needed for full Core compliance.
|
||||||
|
|
||||||
## Compliance Status
|
## Compliance Status
|
||||||
|
|
||||||
Targeting 100% Forth 2012 compliance, tested with [Gerry Jackson's forth2012-test-suite](https://github.com/gerryjackson/forth2012-test-suite).
|
Targeting 100% Forth 2012 compliance via [Gerry Jackson's test suite](https://github.com/gerryjackson/forth2012-test-suite).
|
||||||
|
|
||||||
| Word Set | Status |
|
| Word Set | Status |
|
||||||
|----------|--------|
|
|----------|--------|
|
||||||
| Core | Pending |
|
| Core | In progress (~70%) |
|
||||||
| Core Extensions | Pending |
|
| Core Extensions | Pending |
|
||||||
| Block | N/A |
|
|
||||||
| Double-Number | Pending |
|
| Double-Number | Pending |
|
||||||
| Exception | Pending |
|
| Exception | Pending |
|
||||||
| Facility | Pending |
|
| Facility | Pending |
|
||||||
|
|||||||