The syntax file landed in bcccdfb, one commit before `(LOCAL)` and
while several other recently-added words were already in the tree but
unhighlighted. Extend it to cover everything currently registered.
Added contexts:
- `locals` — `{:` `:}` `{F:` `TO` `LOCALS|` `END-LOCALS` `(LOCAL)`.
- `structures` — `BEGIN-STRUCTURE` (captures the following name),
`END-STRUCTURE`, `+FIELD`, `FIELD:`, `CFIELD:`, `FFIELD:`,
`SFFIELD:`, `DFFIELD:`.
- `hashing` — `SHA1`, `SHA256`, `SHA512`. Comment notes the list
mirrors `crypto::ALGOS`.
Extended:
- `definitions` — quotations `[:` / `;]` (Core-ext 6.2.0455).
- `parsing` — state-smart `S` (the string parser from d1a7d55).
- `wafer_extras` — `READ-PASSWORD` (web-side prompter from 9150696).
Context order in `main:` keeps `definitions` ahead of `locals`, so
`: foo` still wins over `{:` / `:}`, and `strings` / `arithmetic`
stay ahead of `parsing` so `S"` and `S>D` keep their existing
highlighting despite the new bare-`S` rule.
Implement `(LOCAL)` as a host primitive that defers its effect to the
outer-interpreter compile state via two new `PendingAction` variants:
- `DeclareLocal(name)` — a non-sentinel `(LOCAL)` call with `u > 0`
appends the name to `compiling_locals` as an int local.
- `DeclareLocalEnd` — the `0 0 (LOCAL)` sentinel emits reverse-order
`ForthLocalSet` IR for the batch declared since the last sentinel,
reusing the same IR shape as the `{: ... :}` locals flow.
`local_batch_base` tracks where the current batch started; it is
saved/restored across nested compile frames and cleared on
`finish_colon_def`. Int-only, per spec — float locals remain `{F: :}`.
Also fix `\` per §6.2.2535: parse-and-discard must stop at the next
`\n`, not at `#TIB`. Under line-wrapped `evaluate` calls (common in
test files) the old behaviour consumed the trailing `;` of a multi-line
`:` definition, silently leaving state in compile mode.
Tighten `compliance.rs`: `load_file` now returns a line-failure count,
every prerequisite is asserted against `expected_load_failures(path)`,
and a new `load_file_whole` handles multi-line definitions (`DOES>`
split across lines in `errorreport.fth`) that the per-line loader
cannot stitch. Baselines document known gaps for `core.fr` (nested
`:`, SOURCE/>IN via EVALUATE), `coreexttest.fth` (SAVE-INPUT, `.(`
inside `[...]`), `exceptiontest.fth` (one garbled parse after
CATCH/THROW source stacking), and `toolstest.fth` (37 `\?`-guarded
lines where `SOURCE >IN ! DROP` fails to skip under per-line
`evaluate`). Each entry is a tech-debt ledger item, not an allowlist.
Regression tests: LT32 (the localstest case that silently skipped
before `(LOCAL)` existed), the `0 0 (LOCAL)` sentinel-only no-op, a
multi-line `:` followed by `VARIABLE` after a `\` comment, and a
direct `\` stops-at-newline case.
Incidental: clear two `implicit_clone` clippy lints in the RANDOM
determinism test (`.to_vec()` → `.clone()`).
Fix rustfmt drift and two clippy lints (`doc_markdown` missing
backticks around `NativeRuntime`) that surfaced after the Rust 1.94
toolchain update. No functional change.
Ship tools/editor-support/bat/WAFER.sublime-syntax so any bat user
(including oked, which probes bat first) renders .fth files with
proper keyword colouring, including the WAFER extras CONSOLIDATE,
RANDOM, RND-SEED, and UTIME.
Keyword list derives from register_primitive/register_host_primitive
calls in crates/core/src/outer.rs plus the boot.fth definitions.
Internal underscore-prefixed words are deliberately omitted.
Install with `just install-syntax`.
compile_token matched hardcoded tokens (S, ." etc) before
checking compiling_locals. Local named `s` got hijacked by
the `S` string shortcut. Forth 2012 §13.3.3.2 — locals
supersede dict names in scope. Move locals check to top of
compile_token for uniform precedence.
Tests: S-hijack repro, get+set round-trip, int-uninit pipe
syntax coverage (`{: | name :}`).
`{: F: x F: y :}` now declares float-typed locals that live on the float
stack. `x x F* y y F* F+ FSQRT` writes real float code without manual
FSTACK juggling — previously WAFER had a 100%-compliant float wordset
but no way to name intermediate float values.
New IR ops `ForthFLocalGet(n)` / `ForthFLocalSet(n)` alongside the
existing int-local ops. Each kind has its own index namespace so mixed
declarations like `{: n F: f :}` compose cleanly. Codegen allocates f64
WASM locals after the existing f64 scratch pair; the fsp-bridge logic
mirrors the existing FDup/FSwap path.
Outer interpreter tracks a parallel `compiling_local_kinds` alongside
`compiling_locals` (keeps the 18 existing touch-points unchanged) and
extends `{:` to recognize `F:` as a per-next-name type marker. `TO` and
name resolution branch on kind to pick Int vs Float get/set ops.
Four tests: classic hypot, TO round-trip, mixed int/float args, and
uninitialized float via `|`. Inline-inhibit for the new ops added to
optimizer and is_promotable so they don't sneak into contexts that
would collide with the caller's WASM locals.
State-smart anonymous xt builder. Interpret mode leaves the xt on the
data stack; compile mode emits a literal push into the enclosing word,
so `: APPLY EXECUTE ; [: 1 2 + ;] APPLY` prints 3.
Supported nested inside colon definitions via a new compile-frame stack
(`Vec<CompileFrame>`). Each frame snapshots `compiling_name`,
`compiling_word_id`, `compiling_word_addr`, `compiling_ir`,
`control_stack`, `saw_create_in_def`, `compiling_locals`, and `state`.
The inner [: ... ;] compiles its body as an anonymous word; on ;] the
outer frame pops back and the xt is either pushed to the data stack
(interpret mode) or compiled as a literal (compile mode).
Also fixes a latent bug: `finish_colon_def` used to reveal `latest`,
which breaks when intermediate dict entries (now including quotations)
move `latest`. Each definition now tracks its own `compiling_word_addr`
and uses `reveal_at`, matching the existing DOES> pattern.
Five tests cover interpret, compile, inside-a-colon-def, two-level
nesting, and the control-stack-travels-with-frame regression (outer
IF/ELSE/THEN must still match around an inner [: ;]).
BEGIN-STRUCTURE, END-STRUCTURE, +FIELD, FIELD:, CFIELD:, FFIELD:,
SFFIELD:, DFFIELD: — the Forth 2012 structure-definition family plus
the float-typed variants for symmetry with WAFER's float wordset.
Each defining word carries its own inline CREATE .. DOES> — factoring
through a shared +FIELD helper doesn't work in WAFER, because DOES>-
defining words only dispatch at the outer interpreter, not from compiled
IR. So FIELD: can't call +FIELD and have the DOES> action fire; each
FIELD:/CFIELD:/... repeats the pattern directly.
Three tests cover size computation, field offsets, and mixed cell + char
fields with alignment.
Non-standard but ubiquitous in gforth/SwiftForth/VFX. Adds a shared
rng_state on ForthVM, seeded from nanosecond wall-clock at boot.
`RANDOM ( -- u )` returns a 32-bit pseudo-random cell; `RND-SEED ( u -- )`
reseeds, with 0 forced to a nonzero constant to avoid xorshift's fixed
point.
Three tests cover determinism after seeding, distinct-value spread
across 1000 pulls, and the zero-seed safeguard.
`S name` in interpret mode used to leave (c-addr u) pointing into the
input buffer, so the next REFILL clobbered the bytes. Typing `s test`
then `type` on a fresh line printed "pest" because the new input
overwrote the first chars of the old TIB content.
Move `S` from boot.fth to the Rust outer interpreter alongside `S"` /
`C"`: both interpret and compile modes now copy the token to HERE-space
(stable across REFILL). Compile-mode output is still bit-identical to
writing `S" name"` inline.
Adds `test_s_interpret_survives_refill` regression.
`S name` is the string analogue of `[CHAR] x` and `['] name`: parses the
next whitespace-delimited token, state-smart.
Interpret: leaves ( c-addr u ) pointing into the input buffer.
Compile: appends run-time push of the copied bytes (identical code
to writing S" name" inline).
One line in boot.fth, leverages the existing PARSE-NAME + SLITERAL.
Zero runtime overhead inside : definitions.
Browser consumers (kelvar) need a host-provided password prompt so the
master never appears on the command line. Exposes a single method:
WaferRepl::set_prompter(js_sys::Function) -> Result<(), JsError>
Given a JS function `(prompt: string) => string`, registers it as the
Forth word `READ-PASSWORD` with stack effect
( prompt-addr prompt-u -- pw-addr pw-u )
The returned bytes land in WAFER's PAD region. Enforces PAD_SIZE-1 as
a hard upper bound — a silent truncation would cause a derived password
to mismatch the one used during setup, which is exactly the failure
mode we are trying to avoid.
`js_sys::Function` is !Send/!Sync but `HostFn` requires both. In a
browser WASM build there is only ever one thread, so wrap it in
`send_wrapper::SendWrapper`, which panics if accessed off-thread — an
honest guard rather than a lie.
Introduces a `crypto` feature (on by default) that wires the RustCrypto
sha1/sha2 crates into a small `HashAlgo` registry. `register_primitives`
iterates `crypto::ALGOS` and installs one Forth host word per algorithm,
each with the stack effect
( c-addr u -- c-addr2 u2 )
reading `u` bytes from `c-addr` and writing the digest into a shared
`HASH_SCRATCH` region in linear memory (carved out between the float
stack and the dictionary).
Adding a new hash is a one-line entry in `ALGOS`. `register_host_primitive`
is now `pub` so downstream crates can extend the VM with their own I/O
host words without forking WAFER — kelvar (a deterministic password
manager on WAFER) is the first consumer.
- 4 unit tests (lib-level sha1/256/512 + registry sanity)
- 5 integration tests (in-VM `SHA1`/`SHA256`/`SHA512` against RFC-3174,
FIPS-180, and the first-round S/KEY seed used by `hel`)
- All 437 existing lib tests still pass; `wafer-web` still builds for
`wasm32-unknown-unknown` with the feature enabled
Implement PAGE (Facility word set) as IR primitive emitting form feed.
Web REPL clears output div on form feed, CLI REPL sends ANSI clear.
Fix init code panel: use default textarea content instead of placeholder
so init code actually executes on first visit. Update wasm-pack 0.10→0.14
and refresh Cargo.lock to latest compatible versions.
Decouple ForthVM from wasmtime via a Runtime trait so the same outer
interpreter, compiler, and 200+ word definitions work on both native
(wasmtime) and browser (js-sys WebAssembly API) backends.
Runtime trait (runtime.rs):
- HostAccess trait for memory/global ops inside host function closures
- HostFn type: Box<dyn Fn(&mut dyn HostAccess) -> Result<()>>
- Runtime trait: memory, globals, table, instantiate, call, register
NativeRuntime (runtime_native.rs):
- Wraps wasmtime Engine/Store/Memory/Table/Global/Func
- CallerHostAccess bridges HostAccess to wasmtime Caller API
- Feature-gated behind "native" (default)
outer.rs refactor:
- ForthVM<R: Runtime> — generic over execution backend
- All 87 host functions converted from Func::new closures to HostFn
- All memory access via rt.mem_read/write_*, global access via rt.get/set_*
- Zero logic changes — pure API conversion
wafer-core feature gates:
- default = ["native"] includes wasmtime + all native modules
- Without "native": pure Rust only (outer, codegen, optimizer, dictionary)
Browser REPL (crates/web):
- WebRuntime: js-sys WebAssembly.Memory/Table/Global/Module/Instance
- WaferRepl: wasm-bindgen entry point (evaluate, data_stack, reset)
- WebAssembly.Function with Safari fallback (wrapper module)
- Frontend: dark terminal UI, word panel, init code editor, history
- Build: wasm-pack build --target web
All 452 tests pass (431 unit + 1 benchmark + 9 comparison + 11 compliance).
wasmtime 31→43, wasm-encoder/wasmparser 0.228→0.246, rustyline 15→18.
API migrations: F64Const now takes Ieee64 wrapper, wasmtime has own
Error type (wasmtime::bail! in host closures), cache_config_load_default
removed. Add performance regression limits to benchmark tests.
Three compile-time words for unstructured control flow:
- AHEAD: unconditional forward branch (code to THEN skipped)
- CS-PICK: duplicate control-flow stack entries (enables multi-exit loops)
- CS-ROLL: rotate control-flow stack entries (reorder IF/THEN resolution)
Also adds POSTPONE support for compile-time keywords (IF, UNTIL, etc.)
via a __CTRL__ host function and unified pending_actions queue.
Key design:
- LoopRestartIfFalse IR op desugars into nested If nodes for CS-PICK'd
BEGIN+UNTIL patterns (multiple backward branches in one loop)
- Flat Block/BranchIfFalse/EndBlock IR ops for CS-ROLL'd IF/THEN
patterns where structured If nesting would consume wrong flags
- First-iteration flag local for AHEAD-into-BEGIN patterns (PT8)
Enables 12th compliance test (compliance_tools): all 11+1 now pass.
- README: add performance section (beats gforth 2-10x), update test
commands, note self-recursive direct calls and loop promotion
- CLAUDE.md: update test counts (427 unit + comparison tests)
- OPTIMIZATIONS.md: stack-to-local Phase 1→Phase 2 (loops + IF),
DO/LOOP locals done, J as IR done, add section 14 (self-recursive
direct call), add current performance table vs gforth
- WAFER.md: document self-recursive call optimization, CONSOLIDATE,
update test commands and line counts
- FORTH.md: expanded space history, add FORTH-IN-SPACE.md reference
- FORTH-IN-SPACE.md: new document with verified spacecraft history
Three bugs fixed to safely enable promotion for control flow:
1. compute_stack_needs now recurses into IF/DoLoop/Begin bodies,
correctly calculating preload counts for promoted words with
nested control flow (was flat, causing stack underflow).
2. BeginDoubleWhileRepeat rejected from promotion (boot.fth's
-TRAILING uses this pattern, handler had structural bugs).
3. IF/ELSE branches must have same net stack effect for promotion
(BITSSET? has asymmetric branches: 2 items vs 1).
Performance with promotion enabled:
- Factorial: 0.50x (2x faster than gforth)
- Collatz: 0.38x (2.6x faster than gforth)
- All 427 unit tests, 10/11 compliance, 35/35 behavioral pass
Extends the promoted codegen path (StackSim) with handlers for DoLoop,
BeginWhileRepeat, BeginUntil, BeginAgain, If/Else/Then, RFetch, LoopJ,
and Exit. Includes loop-iteration fixup to copy modified locals back to
loop-top positions, and IF branch state merging.
The promotion is currently gated off for control flow (is_promotable
rejects all loops/IF) pending fix for edge cases in the Forth 2012 test
suite. The infrastructure is ready to enable incrementally.
When briefly enabled for testing, showed dramatic results:
- Factorial: 0.49x (2x faster than gforth)
- Collatz: 0.17x (6x faster than gforth)
Two-path DO/LOOP codegen based on static analysis of the loop body:
- Fast path (no calls, no >R/R> in body): index and limit live purely
in WASM locals with zero return stack traffic per iteration. RFetch (I)
and LoopJ (J) resolve to local.get instead of memory access.
- Slow path (body has calls or explicit RS ops): locals still used for
loop control, but synced to return stack for LEAVE/UNLOOP compatibility.
Also converts J from a host function (WASM→Rust roundtrip per call) to
an IR primitive (IrOp::LoopJ) that compiles to local.get of the outer
loop's index local.
Performance impact (vs gforth, all opts enabled):
- Factorial: 1.02x → 0.94x (now faster than gforth)
- NestedLoops: 717x → 543x (24% faster, still bottlenecked by data stack)
- Fibonacci, GCD, Collatz: unchanged (don't use DO/LOOP)
35 behavioral tests across 8 categories verify identical output between
WAFER and gforth. Performance benchmarks compare execution speed for
Fibonacci, Factorial, GCD, NestedLoops, and Collatz workloads.
WAFER-only correctness tests run in CI without gforth; cross-engine
comparison and performance report are opt-in via --ignored.
Major compliance push bringing WAFER from 3 to 10 passing Forth 2012
compliance test suites (Core, Core Extensions, Core Plus, Double,
Exception, Facility, Locals, Memory, Search Order, String).
Compiler/runtime fixes:
- DEFER: host function via pending_define, works inside colon defs
- COMPILE,: handle_pending_compile in execute_word for [...] sequences
- MARKER: full save/restore with pending_marker_restore mechanism
- IMMEDIATE: changed from XOR toggle to OR set per Forth 2012 spec
- ABORT": throw -2 via THROW, no message display when caught
- M*/: symmetric division to match WAFER's / behavior
- pending_define: single i32 flag → Vec<i32> queue for multi-action words
- Optimizer: prevent inlining words containing EXIT or ForthLocal ops
- +LOOP: corrected boundary check formula with AND step comparison
- REPEAT: accept bare BEGIN (unstructured IF...BEGIN...REPEAT)
- Auto-close unclosed IFs at ; for unstructured control flow
- _create_part_: use reserve_fn_index to preserve dictionary.latest()
Memory layout:
- Separate PICT_BUF and WORD_BUF regions to prevent PAD overlap
- Updated DEPTH hardcoded DATA_STACK_TOP in boot.fth
New word sets:
- [IF]/[ELSE]/[THEN]/[DEFINED]/[UNDEFINED]: conditional compilation
- UNESCAPE/SUBSTITUTE/REPLACES: string substitution (host functions)
- Locals {: syntax: parser, ForthLocalGet/Set IR ops, WASM local codegen
- ENVIRONMENT? support for #LOCALS (returns 16)
- N>R/NR>/SYNONYM: programming-tools extensions
- Search Order: ONLY, ALSO, PREVIOUS, DEFINITIONS, FORTH,
FORTH-WORDLIST, GET-ORDER, SET-ORDER, GET-CURRENT, SET-CURRENT,
WORDLIST, SEARCH-WORDLIST with full multi-wordlist dictionary support
via Arc<Mutex> shared state for immediate effect from compiled code
Remaining: 1 cascade error in Programming-Tools from CS-PICK/CS-ROLL
(unstructured control-flow stack manipulation, requires flat IR).
- SOURCE-ID now returns -1 during EVALUATE (saves/restores SYSVAR_SOURCE_ID)
- BUFFER: aligns HERE to cell boundary before allocating
- S\" returns Vec<u8> instead of String to preserve raw escape bytes
Core_ext: 14→6 errors. Total: 46→44.
parse_s_escape returned String via from_utf8_lossy which replaces
non-UTF-8 bytes (like \xAB = 171) with the 3-byte U+FFFD replacement
character, corrupting both string length and content.
Changed to return Vec<u8> and write raw bytes directly to WASM memory.
Also registered ( as immediate word for FIND, added 'x' char literals.
Core_ext: 14→8 errors.
- Register ( in dictionary as immediate so FIND can discover it
(fixes search-order FIND test: 4→3 errors)
- Add character literal parsing: 'z' → 122 (Forth 2012 number prefix)
- Fix ALLOCATE/RESIZE -1 size validation (memory suite now passes)
ALLOCATE and RESIZE with size -1 (0xFFFFFFFF) were "succeeding" because
wrapping arithmetic made the block size tiny. Added early rejection for
sizes exceeding half the available memory.
Memory suite: 2→0 errors. Now 4 suites pass (Core, Facility, Memory).
Forth 2012 allows multiple ELSEs: IF 1 ELSE 2 ELSE 3 ELSE 4 ELSE 5 THEN
produces (1 3 5) for true and (2 4) for false. Desugars by saving the
condition flag on the return stack with >R/R@ and building nested
If/Else pairs. The final THEN cleans up with R> DROP.
- Implement Memory-Allocation word set (ALLOCATE/FREE/RESIZE) as
host functions using a top-down arena allocator in WASM linear memory.
Uses wrapping arithmetic for -1 size error cases.
- Fix DU< comparison order (same bug as D<: comparing d2-hi vs d1-hi).
- Register 2VARIABLE/2CONSTANT as callable host functions (pending
codes 9/10) so they work from compiled code like `: CD4 2VARIABLE ;`.
Memory suite: 62→2 errors. Double suite: 27→3 errors.
Total remaining: 56 failures across 9 suites.
- DU< had same comparison order bug as D< (comparing d2-hi < d1-hi
instead of d1-hi < d2-hi). Fixed with SWAP U<.
- 2VARIABLE and 2CONSTANT were handled as special tokens but not
registered in the dictionary, so they couldn't be called from
compiled code (e.g., : CD4 2VARIABLE ;). Added pending codes 9/10.
- D< used D- D0< which overflows for extreme signed doubles.
Replaced with high-cell comparison + unsigned low-cell comparison.
- COMPARE had inverted sign for length difference (u2-u1 vs u1-u2).
- Added -TRAILING (removed during Phase 6 refactoring, never re-added).
PARSE and PARSE-NAME were using the deferred pending mechanism which
broke when called from compiled code (the calling word continued
executing before PARSE ran). Replaced with inline host functions that
read >IN/#TIB directly from WASM memory and parse immediately.
This fixes utilities.fth $"/$2" failures that left stack residue
cascading into all subsequent compliance test suites.
Also: core_ext 17→14, string 27→17.
- Implement ROLL as host function (stack rotation by u positions)
- Fix CASE/ENDCASE: ENDCASE DROP was emitted before default code instead
of after, causing stack underflow in default branches
- Fix PARSE: skip one leading space (outer interpreter's trailing
delimiter) so parsed content starts at the argument, not the space
- Fix UNUSED: read SYSVAR_HERE from WASM memory (not just here_cell)
since Forth ALLOT/,/C, update WASM memory directly
- Register .( as immediate word in dictionary so FIND can discover it
Core and Facility compliance suites pass. Core Extensions down from
34 to 17 errors.
Six fixes for compliance test regressions introduced in Phases 7-8:
- LEAVE + +LOOP with step=0 caused infinite loop: the XOR termination
check yields 0 when index=limit and step=0. Added SYSVAR_LEAVE_FLAG
mechanism — LEAVE sets flag, +LOOP checks it, all loops clear on exit.
- DEPTH was off-by-one: `5440 SP@ -` pushed the literal before SP@
read the stack pointer, making SP@ see one extra cell. Reordered to
`SP@ 5440 SWAP -` so SP@ reads dsp before any literal push.
- */ and */MOD used FM/MOD (floored) but WAFER's / uses WASM i32.div_s
(symmetric). Changed to SM/REM for consistency.
- EVALUATE didn't sync input buffer to WASM memory, breaking SOURCE
and >IN manipulation inside evaluated strings. Added input-only sync
(without touching STATE/BASE) and >IN readback after each token.
- WORD didn't skip leading spaces when delimiter != space, causing
GN' and GS3 tests to read whitespace instead of content.
- Added ACCEPT stub returning 0 for non-interactive mode.
- Added bounds check in refresh_user_here to reject corrupted
SYSVAR_HERE values beyond WASM memory size.
Core and Facility compliance suites now pass. Other suites have
pre-existing regressions from Phases 1-8 still under investigation.
Move memory allocation words to boot.fth:
- ALLOT: `: ALLOT HERE + 12 ! ;`
- , (comma): `: , HERE ! 1 CELLS ALLOT ;`
- C, : `: C, HERE C! 1 ALLOT ;`
- ALIGN: `: ALIGN HERE ALIGNED 12 ! ;`
- FALIGN, SFALIGN, DFALIGN: float-aligned variants
These write directly to WASM memory[SYSVAR_HERE]. The Rust side picks up
Forth-side HERE changes via refresh_user_here() which now reads both
here_cell (for Rust host functions) and memory[12] (for Forth words),
taking the maximum to ensure no allocation is lost.
Removed 222 lines of Rust. All 426 tests pass.
DEFER! and DEFER@ are trivially `: DEFER! >BODY ! ;` and `: DEFER@ >BODY @ ;`.
COMPARE uses a byte-by-byte loop with early exit.
Removed 148 lines of Rust. All 426 tests pass.