<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://bur.gy/feed.xml" rel="self" type="application/atom+xml" /><link href="https://bur.gy/" rel="alternate" type="text/html" /><updated>2026-03-24T11:07:19+00:00</updated><id>https://bur.gy/feed.xml</id><title type="html">Elegant Programming</title><subtitle>Just a place to gather my thoughts and opinions about programming, science, and life.  I can&apos;t promise all of it is interesting, but I can definitely promise all of it is nerdy.</subtitle><entry><title type="html">How Many Roads Must a Man Walk Down?</title><link href="https://bur.gy/2025/11/29/how-many-roads.html" rel="alternate" type="text/html" title="How Many Roads Must a Man Walk Down?" /><published>2025-11-29T12:04:54+00:00</published><updated>2025-11-29T12:04:54+00:00</updated><id>https://bur.gy/2025/11/29/how-many-roads</id><content type="html" xml:base="https://bur.gy/2025/11/29/how-many-roads.html"><![CDATA[<p>Look at me being all highfalutin and referencing “The Bard” when I ought to
admit this is more an “Oops!… I Did It Again” situation! I promised myself
I wouldn’t and yet, here it is, I wrote yet another
<a href="https://rwmj.wordpress.com/2010/08/07/jonesforth-git-repository/">JONESFORTH</a>
clone. I have no excuse. Well, I kinda do. I was
<a href="https://xkcd.com/356/">nerd sniped</a> on discord. See, what happened was this:
I built all previous clones to <a href="https://webassembly.org/">WebAssembly</a> so
<a href="https://discord.com/users/762717526873341962">Paul Tarvydas</a> asked whether I had
considered “writing Forth-ish directly in WAT (WASM)”. I said no because I
was already aware of <a href="https://mko.re/blog/waforth/">WAForth</a> which took
the conversation into an exploration of WAForth with its author, Remko. WAForth
inlines much more aggressively than JONESFORTH. It uses
<a href="https://en.wikipedia.org/wiki/Threaded_code#Subroutine_threading">subroutine threading</a>
and “generates WebAssembly instructions in binary format”. Quite the feat
if I’m honest.</p>

<p>But that strays rather far from JONESFORTH’s “simple”
<a href="https://en.wikipedia.org/wiki/Threaded_code#Indirect_threading">indirect threading</a>.
So I couldn’t just let it go now, could I? And luckily, WebAssembly introduced
<code class="language-plaintext highlighter-rouge">return_call</code> and <code class="language-plaintext highlighter-rouge">return_call_indirect</code> around 2023.  So off I went.</p>

<p>I initially tried to have Claude do all the heavy lifting so I issued the
following prompt (with <code class="language-plaintext highlighter-rouge">jonesforth.S</code> in the context window):</p>

<blockquote>
  <p>Please translate <code class="language-plaintext highlighter-rouge">jonesforth.S</code> from x86 assembly to WAT the textual Webassembly representation</p>
</blockquote>

<p>And off it went, quickly producing reams of code.  I saved everything it produced
as a <a href="https://gist.github.com/jburgy/56c16cb3d5e366a1217949ac9d05ba7a">GitHub gist</a>.
It looked great at first but soon hit a bunch of “pointer out of bounds” errors
and Claude is not great at debugging, at least not yet.  I took a crack at it myself
but the horse was already out of the barn: Claude had produced too much code and
some of it was too different from the original source to be worth fixing.  Here is
an interesting data point though: Claude ported <code class="language-plaintext highlighter-rouge">WORD</code> and <code class="language-plaintext highlighter-rouge">NUMBER</code> from JONESFORTH 
but missed the, admittedly subtle, distinction between <code class="language-plaintext highlighter-rouge">WORD</code> (the primitive) and
<code class="language-plaintext highlighter-rouge">_WORD</code> (the actual implementation).  Wonder how I could have adjusted the prompt
to highlight it.</p>

<p>Claude also stumbled over the fact that JONESFORTH uses
<a href="https://en.wikipedia.org/wiki/Threaded_code#Indirect_threading">indirect threading</a>.
And that’s when I was humbly reminded of the quote, often attributed to Einstein:</p>

<blockquote>
  <p>If you can’t explain it simply, you don’t understand it well enough.</p>
</blockquote>

<p>I had already ported JONESFORTH three times then
(<a href="https://github.com/jburgy/blog/blob/main/forth/4th.c">4th.c</a>,
<a href="https://github.com/jburgy/blog/blob/main/forth/5th.c">5th.c</a>, and
<a href="https://github.com/jburgy/blog/blob/main/forth/6th.zig">6th.zig</a>) yet I <em>still</em>
couldn’t explain the difference between direct and indirect threading <strong>simply</strong>.
Ironically, I think 
<a href="https://discord.com/channels/1415980515806412813/1441786905980174458/1442326514065604710">I can now</a>!</p>

<p>I made a second attempt from scratch with Claude, this time using the following
prompt:</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">jonesforth.S</code> is an indirect threaded Forth interpreter written in 32-bit x86 assembly.
I want you to produce the closest possible equivalent in WAT, the textual representation
of WebAssembly. You will use <code class="language-plaintext highlighter-rouge">return_call_indirect</code> to replace <code class="language-plaintext highlighter-rouge">JMP *(%eax)</code> in the 
<code class="language-plaintext highlighter-rouge">NEXT</code> macro. The <code class="language-plaintext highlighter-rouge">name_*</code> labels which jonesforth.S introduces via the <code class="language-plaintext highlighter-rouge">defcode</code> macro 
will become WASM functions.</p>
</blockquote>

<p>This time I tried to explain the <code class="language-plaintext highlighter-rouge">WORD</code>/<code class="language-plaintext highlighter-rouge">_WORD</code> duality in follow-up prompts but we
were going in circles so I stopped, grabbed the outer shell, and started down
the good ole-fashioned “artisanal” route of typing the bloody code myself!
I will admit that Claude’s two failed attempts gave me plenty of good pointers
on the <a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Guides/Understanding_the_text_format">WebAssembly text format</a>.
I chose to use S-expressions throughout for readability although YMMV.</p>

<p><a href="https://rwmj.wordpress.com/about/">Richard WM Jones</a> uses a lot of
GNU assembler macros in <code class="language-plaintext highlighter-rouge">jonesforth.S</code>.  My C implementations use the
C Preprocessor to similar effect.  Zig prides itself on having “no preprocessor,
no macros” but offers comptime metaprogramming instead.  WebAssembly provides
none of that.  So I wrote a poor man’s assembler of sorts in
<a href="https://github.com/jburgy/blog/blob/main/forth/wasm/dictionary.py">python</a>.
At its core, it’s but a <code class="language-plaintext highlighter-rouge">dict</code> of word definitions and a loop to print them along with headers (using 
<a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Guides/Understanding_the_text_format#webassembly_memory"><code class="language-plaintext highlighter-rouge">data</code></a>)
and <code class="language-plaintext highlighter-rouge">funcref</code> registrations (using 
<a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Guides/Understanding_the_text_format#webassembly_tables"><code class="language-plaintext highlighter-rouge">elem</code></a>).
Take <code class="language-plaintext highlighter-rouge">SWAP</code> for example:</p>
<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nf">data</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">const</span> <span class="mi">0</span><span class="nv">x5054</span><span class="p">)</span> <span class="s">"\44\50\00\00\04SWAP\00\00\00\02\00\00\00"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">func</span> <span class="nv">$swap</span> <span class="p">(</span><span class="nf">type</span> <span class="mi">0</span><span class="p">)</span>
    <span class="p">(</span><span class="nf">local</span> <span class="nv">$t</span> <span class="nv">i32</span><span class="p">)</span>
    <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">set</span> <span class="nv">$t</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span> <span class="nv">offset=4</span> <span class="p">(</span><span class="nf">global</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">store</span> <span class="nv">offset=4</span> <span class="p">(</span><span class="nf">global</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span> <span class="p">(</span><span class="nf">global</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">store</span> <span class="p">(</span><span class="nf">global</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$t</span><span class="p">))</span>
    <span class="p">(</span><span class="nf">return_call</span> <span class="nv">$next</span><span class="p">)</span>
<span class="p">)</span>
<span class="p">(</span><span class="nf">elem</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">const</span> <span class="mi">0</span><span class="nv">x2</span><span class="p">)</span> <span class="nv">$swap</span><span class="p">)</span>
</code></pre></div></div>
<p>Its header starts at address <code class="language-plaintext highlighter-rouge">0x5054</code> with a link to the start of the preceding word
(in this case <code class="language-plaintext highlighter-rouge">DROP</code> at address <code class="language-plaintext highlighter-rouge">0x5044</code> and yes, WebAssembly is
<a href="https://en.wikipedia.org/wiki/Endianness">little-endian</a>).  The next byte (<code class="language-plaintext highlighter-rouge">0x04</code>) encodes
the length and potential flags like <code class="language-plaintext highlighter-rouge">HIDDEN</code> or <code class="language-plaintext highlighter-rouge">IMMEDIATE</code>.  The name itself (<code class="language-plaintext highlighter-rouge">"SWAP"</code>) follows,
with padding to the next 4 byte word boundary.  Finally, because <code class="language-plaintext highlighter-rouge">SWAP</code> is built in (i.e.
implemented in WebAssembly), its code word is just the <em>index</em> of <code class="language-plaintext highlighter-rouge">$swap</code> in the funcref table.
The first 2 and last 3 lines in this snippet were generated by Python.</p>
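
<p>The header layout described above can be sketched in a few lines of Python.  This is only an illustration, not the code in <code class="language-plaintext highlighter-rouge">dictionary.py</code>: the names <code class="language-plaintext highlighter-rouge">hex_escape</code> and <code class="language-plaintext highlighter-rouge">header_data</code> and the <code class="language-plaintext highlighter-rouge">flags</code> parameter are made up for the example, and it assumes word names contain no characters that need escaping inside a WAT string.</p>

```python
def hex_escape(raw):
    """Render every byte as a two-digit WAT escape such as \\44."""
    return "".join(f"\\{byte:02x}" for byte in raw)


def header_data(address, link, name, code_word, flags=0):
    """Build the (data ...) line for one dictionary entry.

    Layout: 4-byte little-endian link, one byte holding length and flags,
    the name, zero padding up to a 4-byte boundary, then the 4-byte code word.
    """
    length = bytes([len(name) | flags])
    # Padding needed so that the length byte plus the name fill whole cells.
    padding = bytes(-(1 + len(name)) % 4)
    body = (
        hex_escape(link.to_bytes(4, "little"))
        + hex_escape(length)
        + name  # assumed to be plain printable ASCII
        + hex_escape(padding)
        + hex_escape(code_word.to_bytes(4, "little"))
    )
    return f'(data (i32.const {address:#x}) "{body}")'


# SWAP lives at 0x5054, links back to DROP at 0x5044, and is entry 2 in the table.
print(header_data(0x5054, 0x5044, "SWAP", 2))
```

<p>With the values from the snippet above, this prints the same <code class="language-plaintext highlighter-rouge">data</code> line as the one shown for <code class="language-plaintext highlighter-rouge">SWAP</code>.</p>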

<p><a href="https://github.com/jburgy/blog/blob/main/forth/wasm/jonesforth.wast"><code class="language-plaintext highlighter-rouge">jonesforth.wast</code></a>
started as a <em>direct threading</em> interpreter which meant it could at best execute composite
words consisting <em>entirely</em> of builtin words.  One could imagine an implementation that
<a href="https://github.com/jburgy/blog/blob/main/forth/wasm/jonesforth.wast">inlines</a> aggressively
to maintain that invariant but that would lead to 
<a href="https://en.wikipedia.org/wiki/Code_bloat">code bloat</a> and is certainly not how JONESFORTH
works. <a href="https://github.com/jburgy/blog/pull/17/commits/2eb361565961b22dcd3a0768f123ebe0a808a208">2eb3615</a>
converted to <em>indirect threading</em> and the following two lines go a long way to
explain the difference:</p>
<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">-   (data (i32.const 0x53ec) "\dc\53\00\00\04&gt;DFA\00\00\00\00\00\00\00\42\00\00\00\0d\00\00\00\23\00\00\00")
</span><span class="gi">+   (data (i32.const 0x53ec) "\dc\53\00\00\04&gt;DFA\00\00\00\00\00\00\00\e8\53\00\00\fc\50\00\00\10\52\00\00")
</span></code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">&gt;DFA</code> is the first composite word to appear in <code class="language-plaintext highlighter-rouge">jonesforth.S</code>.  It decompiles to
<code class="language-plaintext highlighter-rouge">: &gt;DFA &gt;CFA 4+ ;</code>.  Before
<a href="https://github.com/jburgy/blog/pull/17/commits/2eb361565961b22dcd3a0768f123ebe0a808a208">2eb3615</a>,
its data consisted of the <em>indices</em> of <code class="language-plaintext highlighter-rouge">&gt;CFA</code>, <code class="language-plaintext highlighter-rouge">4+</code>, and <code class="language-plaintext highlighter-rouge">EXIT</code> in the funcref table.
Afterwards, its data became the <strong>code field addresses</strong> of those same words.  So <code class="language-plaintext highlighter-rouge">NEXT</code>
<a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Reference/Memory/load"><em>loads</em></a> the value
at memory address <code class="language-plaintext highlighter-rouge">0x53fc</code> (16 bytes <em>past</em> <code class="language-plaintext highlighter-rouge">0x53ec</code> to skip the link to <code class="language-plaintext highlighter-rouge">&gt;CFA</code>, flag, name,
padding, and <em>index</em> of <code class="language-plaintext highlighter-rouge">DOCOL</code> (0)), sees another memory address (<code class="language-plaintext highlighter-rouge">0x53e8</code>) which it needs to
<em>load</em> before it can <code class="language-plaintext highlighter-rouge">return_call_indirect</code> the corresponding function.  Note also how the first 
data word (<code class="language-plaintext highlighter-rouge">0x53e8</code>) is exactly 12 bytes past the link (<code class="language-plaintext highlighter-rouge">0x53dc</code>).  The link points to the
<code class="language-plaintext highlighter-rouge">&gt;CFA</code> <em>word</em> (with header) whereas <code class="language-plaintext highlighter-rouge">&gt;DFA</code>’s first data element points to the <em>address</em> of <code class="language-plaintext highlighter-rouge">&gt;CFA</code>’s
<em>code field</em>!  Oh, the joys of pointer arithmetic!</p>
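
<p>To make the two loads concrete, here is a rough Python model of both variants of <code class="language-plaintext highlighter-rouge">NEXT</code> using the addresses from the diff.  The dictionaries are a toy stand-in for linear memory, and I am assuming that each builtin’s code field holds the table index that appeared in the line before the commit.  The function names are invented for the example.</p>

```python
def code_field_address(header, name):
    """Skip the 4-byte link, then the length byte and name padded to 4 bytes."""
    name_field = 1 + len(name)
    return header + 4 + name_field + (-name_field % 4)


# Before the commit: the data cells of >DFA hold funcref table indices.
BEFORE = {0x53FC: 0x42, 0x5400: 0x0D, 0x5404: 0x23}

# After the commit: the data cells hold code field addresses instead.
# The first three entries are assumed code fields of the builtins.
AFTER = {
    0x53E8: 0x42,  # code field of >CFA
    0x50FC: 0x0D,  # code field of 4+
    0x5210: 0x23,  # code field of EXIT
    0x53FC: 0x53E8,
    0x5400: 0x50FC,
    0x5404: 0x5210,
}


def next_direct(memory, ip):
    """One load: the cell at ip already is a table index."""
    return memory[ip]


def next_indirect(memory, ip):
    """Two loads: first a code field address, then the index stored there."""
    cfa = memory[ip]
    return memory[cfa]


# The link to >CFA (0x53dc) and its code field (0x53e8) are 12 bytes apart.
print(hex(code_field_address(0x53DC, ">CFA")))
for ip in (0x53FC, 0x5400, 0x5404):
    print(hex(next_direct(BEFORE, ip)), hex(next_indirect(AFTER, ip)))
```

<p>Both variants end up with the same table index for each cell.  The indirect one simply gets there through a code field, which is what lets a cell point at a composite word (whose code field holds the index of <code class="language-plaintext highlighter-rouge">DOCOL</code>) just as easily as at a builtin.</p>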

<p>To recap, it’s somehow fitting that this should be my fourth Forth!</p>

<table>
  <thead>
    <tr>
      <th>#</th>
      <th style="text-align: left">Language</th>
      <th style="text-align: left">Strategy</th>
      <th style="text-align: right">Source</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td style="text-align: left">C</td>
      <td style="text-align: left">Labels as Values</td>
      <td style="text-align: right"><a href="https://github.com/jburgy/blog/blob/main/forth/4th.c">4th.c</a></td>
    </tr>
    <tr>
      <td>2</td>
      <td style="text-align: left">C</td>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">musttail</code></td>
      <td style="text-align: right"><a href="https://github.com/jburgy/blog/blob/main/forth/5th.c">5th.c</a></td>
    </tr>
    <tr>
      <td>3</td>
      <td style="text-align: left">Zig</td>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">.always_tail</code></td>
      <td style="text-align: right"><a href="https://github.com/jburgy/blog/blob/main/forth/6th.zig">6th.zig</a></td>
    </tr>
    <tr>
      <td>4</td>
      <td style="text-align: left">WAT</td>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">return_call_indirect</code></td>
      <td style="text-align: right"><a href="https://github.com/jburgy/blog/blob/main/forth/wasm/jonesforth.wast">jonesforth.wast</a></td>
    </tr>
  </tbody>
</table>

<p>And it’s entirely unacceptable that the “native” WebAssembly implementation is the only one to
<strong>not</strong> have a browser demo!  That’s because I developed it on <a href="https://wasi.dev/">WASI</a> to
iterate quickly in the command line.  I’ll work on a <a href="https://en.wikipedia.org/wiki/Shim_(computing)">shim</a>
<del>real soon</del> now:</p>

<div id="terminal"></div>
<script src="/blog/main.js" type="module"></script>

<h2 id="epilogue">Epilogue</h2>

<p>I had so much fun writing <a href="https://github.com/jburgy/blog/blob/main/forth/wasm/jonesforth.wast">jonesforth.wast</a>
that I continued to experiment.  First I replaced the 4 most often modified global variables by local ones.
With that, <code class="language-plaintext highlighter-rouge">$swap</code> becomes</p>
<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nf">data</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">const</span> <span class="mi">0</span><span class="nv">x5054</span><span class="p">)</span> <span class="s">"\44\50\00\00\04SWAP\00\00\00\02\00\00\00"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">func</span> <span class="nv">$swap</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$cfa</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$ip</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$sp</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$rsp</span> <span class="nv">i32</span><span class="p">)</span>
    <span class="p">(</span><span class="nf">local</span> <span class="nv">i32</span><span class="p">)</span>
    <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">set</span> <span class="mi">4</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span> <span class="nv">offset=4</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">store</span> <span class="nv">offset=4</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">store</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="mi">4</span><span class="p">))</span>
    <span class="p">(</span><span class="nf">return_call</span> <span class="nv">$next</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$cfa</span><span class="p">)</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$ip</span><span class="p">)</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$rsp</span><span class="p">))</span>
<span class="p">)</span>
<span class="p">(</span><span class="nf">elem</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">const</span> <span class="mi">0</span><span class="nv">x2</span><span class="p">)</span> <span class="nv">$swap</span><span class="p">)</span>
</code></pre></div></div>
<p>You can see the whole thing in <a href="https://github.com/jburgy/blog/blob/main/forth/wasm/localize.wast">localize.wast</a>.</p>

<p>After that, I remembered something Remko shared on Discord: <a href="https://mko.re/blog/uxn-wasm/">uxn.wasm</a>.
He uses <a href="https://developer.mozilla.org/en-US/docs/WebAssembly/Reference/Control_flow/br_table"><code class="language-plaintext highlighter-rouge">br_table</code></a>
to implement the <a href="https://100r.co/site/uxn.html">UXN</a> virtual machine.  That’s quite cool so I took a
crack at it.  The result is in <a href="https://github.com/jburgy/blog/blob/main/forth/wasm/tabulate.wast">tabulate.wast</a>.
The whole point is how parentheses are balanced.  More than a hundred are <em>opened</em> between lines 223 and 234 to
introduce the nested blocks that <code class="language-plaintext highlighter-rouge">br_table</code> requires.  They are <em>closed</em> one at a time after each builtin word, e.g.</p>
<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">;; swap</span>
    <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">set</span> <span class="mi">4</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span> <span class="nv">offset=4</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">store</span> <span class="nv">offset=4</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">store</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$sp</span><span class="p">)</span> <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="mi">4</span><span class="p">))</span>
    <span class="p">(</span><span class="nf">br</span> <span class="nv">$next</span><span class="p">))</span> <span class="c1">;; ⇐ note 2 right parentheses here! </span>
</code></pre></div></div>
<p>The only clever bits I came up with in that implementation are the jumps in <code class="language-plaintext highlighter-rouge">INTERPRET</code> and <code class="language-plaintext highlighter-rouge">EXECUTE</code>.
I realized I could simply branch “in the middle” of <code class="language-plaintext highlighter-rouge">NEXT</code> (after the <em>first</em> indirection from <code class="language-plaintext highlighter-rouge">ip</code>
into <code class="language-plaintext highlighter-rouge">cfa</code> but before the <em>second</em> one).  All this takes is another label, which I named (rather
unimaginatively) <code class="language-plaintext highlighter-rouge">$dispatch</code>.</p>

<p>That led me to realize that good ole <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a> offers
just enough flexibility to achieve the same thing without non-standard extensions like labels as values 
or tail calls.  <code class="language-plaintext highlighter-rouge">break</code> and <code class="language-plaintext highlighter-rouge">continue</code> offer just enough daylight to skip to the end of a <code class="language-plaintext highlighter-rouge">switch</code>
statement or the top of the surrounding loop.  This lends itself to the following scheme:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">while</span> <span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">switch</span> <span class="p">(</span><span class="n">memory</span><span class="p">[</span><span class="n">cfa</span><span class="p">])</span> <span class="p">{</span>
            <span class="k">case</span> <span class="n">DOCOL</span><span class="p">:</span>
                <span class="n">memory</span><span class="p">[</span><span class="o">--</span><span class="n">rsp</span><span class="p">]</span> <span class="o">=</span> <span class="n">ip</span><span class="p">;</span>
                <span class="n">ip</span> <span class="o">=</span> <span class="n">cfa</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
                <span class="k">break</span><span class="p">;</span>
            <span class="k">case</span> <span class="n">DROP</span><span class="p">:</span>
                <span class="o">++</span><span class="n">sp</span><span class="p">;</span>
                <span class="k">break</span><span class="p">;</span>
            <span class="p">...</span>
            <span class="k">case</span> <span class="n">EXECUTE</span><span class="p">:</span>
                <span class="n">cfa</span> <span class="o">=</span> <span class="n">memory</span><span class="p">[</span><span class="n">sp</span><span class="o">++</span><span class="p">]</span> <span class="o">&gt;&gt;</span> <span class="mi">2</span><span class="p">;</span>
                <span class="k">continue</span><span class="p">;</span>
            <span class="p">...</span>
        <span class="p">}</span>
        <span class="n">cfa</span> <span class="o">=</span> <span class="n">memory</span><span class="p">[</span><span class="n">ip</span><span class="o">++</span><span class="p">]</span> <span class="o">&gt;&gt;</span> <span class="mi">2</span><span class="p">;</span>
    <span class="p">}</span>
</code></pre></div></div>
<p>(extracted from <a href="https://github.com/jburgy/blog/blob/main/forth/jansforth.c">here</a>).
As an added bonus, this last implementation (and I hope to everything I hold dear this is
the last one) is <a href="https://en.wikipedia.org/wiki/Relocation_(computing)">relocatable</a>.
Addresses are <em>relative</em> to <code class="language-plaintext highlighter-rouge">memory</code> (which explains the <code class="language-plaintext highlighter-rouge">SYSCALL1</code> hack).  The 2-bit shifts
are required to conform to 
<a href="https://github.com/nornagon/jonesforth/blob/master/jonesforth.f">jonesforth.f</a> address arithmetic,
specifically around control structures (<code class="language-plaintext highlighter-rouge">IF</code>, <code class="language-plaintext highlighter-rouge">THEN</code>, …) and decompilation (<code class="language-plaintext highlighter-rouge">CFA&gt;</code>, <code class="language-plaintext highlighter-rouge">SEE</code>).</p>
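
<p>For readers who would rather poke at the scheme than read C, here is a toy Python rendition of the same loop.  It is a sketch under several assumptions of mine: the opcode set is cut down to six entries, <code class="language-plaintext highlighter-rouge">HALT</code> and the memory layout in <code class="language-plaintext highlighter-rouge">demo</code> are invented for the example, and integer division by 4 stands in for the shift by 2.  Falling out of the <code class="language-plaintext highlighter-rouge">if</code> chain plays the role of <code class="language-plaintext highlighter-rouge">break</code>, while Python’s own <code class="language-plaintext highlighter-rouge">continue</code> plays the role of the C one.</p>

```python
DOCOL, EXIT, LIT, ADD, EXECUTE, HALT = range(6)


def run(memory, ip, sp, rsp):
    """Interpret a thread; cells hold byte addresses, indices count cells."""
    cfa = memory[ip] // 4  # first NEXT, done by hand to enter the loop
    ip += 1
    while True:
        op = memory[cfa]
        if op == DOCOL:
            rsp -= 1
            memory[rsp] = ip  # save the caller's position
            ip = cfa + 1      # the thread starts right after the code field
        elif op == EXIT:
            ip = memory[rsp]
            rsp += 1
        elif op == LIT:
            sp -= 1
            memory[sp] = memory[ip]  # push the inline literal
            ip += 1
        elif op == ADD:
            memory[sp + 1] += memory[sp]
            sp += 1
        elif op == EXECUTE:
            cfa = memory[sp] // 4  # pop a code field address
            sp += 1
            continue  # dispatch on the new cfa and skip NEXT
        elif op == HALT:
            return memory[sp]  # hypothetical opcode to end the demo
        # Every other branch lands here, like `break` in the C version: NEXT
        cfa = memory[ip] // 4
        ip += 1


def demo():
    memory = [0] * 64
    # Cells 0..4 are the code fields of five builtins.
    memory[0:5] = [LIT, ADD, EXIT, EXECUTE, HALT]
    lit, add, exit_, execute, halt = (4 * cell for cell in range(5))
    # Cells 5..9 define a composite word that adds 5 to the top of the stack.
    add5 = 5
    memory[5:10] = [DOCOL, lit, 5, add, exit_]
    # Cells 10..15 are the main thread: push 37, then EXECUTE the composite.
    memory[10:16] = [lit, 37, lit, 4 * add5, execute, halt]
    return run(memory, ip=10, sp=64, rsp=48)


print(demo())
```

<p>The main thread pushes 37 and the byte address of the composite word’s code field, then <code class="language-plaintext highlighter-rouge">EXECUTE</code> jumps straight to dispatch without advancing <code class="language-plaintext highlighter-rouge">ip</code>, so the call returns 42.</p>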

<p>In hindsight, my long and convoluted arc with JONESFORTH reminds me of
<a href="https://www.smart-words.org/jokes/programmer-evolution.html">this joke</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Look at me being all highfalutin and referencing “The Bard” when I ought to admit this is more an “Oops!… I Did It Again” situation! I promised myself I wouldn’t and yet, here it is, I wrote yet another JONESFORTH clone. I have no excuse. Well, I kinda do. I was nerd sniped on discord. See, what happened was this: I built all previous clones to WebAssembly so Paul Tarvydas whether I considered “writing Forth-ish directly in WAT (WASM)”. I said no because I was already aware of WAForth which took the conversation into an exploration of WAForth with its author, Remko. WAForth inlines much more aggressively than JONESFORTH. It uses subroutine threading and “generates WebAssembly instructions in binary format”. Quite the feat if I’m honest.]]></summary></entry><entry><title type="html">How do you visualize data?</title><link href="https://bur.gy/2025/10/28/how-do-you-visualize.html" rel="alternate" type="text/html" title="How do you visualize data?" /><published>2025-10-28T18:52:24+00:00</published><updated>2025-10-28T18:52:24+00:00</updated><id>https://bur.gy/2025/10/28/how-do-you-visualize</id><content type="html" xml:base="https://bur.gy/2025/10/28/how-do-you-visualize.html"><![CDATA[<p>We live in an era with “data is king” and that’s great.  Unfortunately, data is hard
to interpret.  <a href="https://en.wikipedia.org/wiki/Edward_Tufte">Edward Tufte</a> has written
phenomenal books on “The Visual Display of Quantitative Information”.  In them, he
coined the term “data-ink ratio” and introduced other best practices.</p>

<p>That’s all well and good but how are mere mortals supposed to apply these theories
on the web?  As with most things web, we are faced with an overwhelming
<a href="https://en.wikipedia.org/wiki/The_Paradox_of_Choice">paradox of choice</a>.  Type
<a href="https://letmegooglethat.com/?q=dashboard+framework">dashboard framework</a> in your favorite
search engine if you don’t believe me.</p>

<p>I have tried many of these “build dashboards in 10 easy steps” tutorials.  Some are
really great.  The opinionated ones frustrate you as soon as you try swimming outside
the lane they picked for you.  In my experience, the more mature ones are still
implemented in JavaScript or <a href="https://www.typescriptlang.org/">TypeScript</a> which
makes data scientists recoil in horror.</p>

<p><a href="https://d3js.org/">D3</a> is incredibly powerful but its API is much too vast for
my <a href="https://grugbrain.dev/">Grug brain</a>.  <a href="https://dc-js.github.io/dc.js/">dc.js</a>
is slightly less intimidating, yet retains many of D3’s cool features (like
<a href="https://developer.mozilla.org/en-US/docs/Web/SVG">Scalable Vector Graphics</a> and
transitions).  I wondered whether <a href="https://pyscript.net/">PyScript</a> was flexible
enough to support <a href="https://dc-js.github.io/dc.js/">dc.js</a>.  Turns out, it was:</p>

<iframe src="https://examples.pyscriptapps.com/dc-js-dimensional-charting-library/latest/" width="100%" height="1337.5px" allow="cross-origin-isolated"></iframe>

<p>No discussion on data visualization is complete without a mention of 
<a href="https://en.wikipedia.org/wiki/Hans_Rosling">Hans Rosling</a>’s excellent
2006 TED Talk:</p>

<div style="max-width:100%">
    <div style="position:relative;height:0;padding-bottom:56.25%">
        <iframe src="https://embed.ted.com/talks/hans_rosling_the_best_stats_you_ve_ever_seen" width="100%" height="576px" title="The best stats you've ever seen" style="position:absolute;left:0;top:0;width:100%;height:100%" frameborder="0" scrolling="no" allowfullscreen="" onload="window.parent.postMessage('iframeLoaded', 'https://embed.ted.com')"></iframe>
    </div>
</div>]]></content><author><name></name></author><summary type="html"><![CDATA[We live in an era with “data is king” and that’s great. Unfortunately, data is hard to interpret. Edward Tufte has written phenomenal books on “The Visual Display of Quantitative Information”. In them, he coined the term “data-ink ratio” and introduced other best practices.]]></summary></entry><entry><title type="html">What’s a Veg-O-Matic?</title><link href="https://bur.gy/2025/10/16/veg-o-matic.html" rel="alternate" type="text/html" title="What’s a Veg-O-Matic?" /><published>2025-10-16T19:01:51+00:00</published><updated>2025-10-16T19:01:51+00:00</updated><id>https://bur.gy/2025/10/16/veg-o-matic</id><content type="html" xml:base="https://bur.gy/2025/10/16/veg-o-matic.html"><![CDATA[<h2 id="or-how-do-you-slice-and-dice-data">Or How do you Slice and Dice Data?</h2>

<p>A former employer of mine rolled out a graphical tool years ago which
let us interrogate tabular data sets with minimal coding.  Think 
Excel PivotTable except better.  You could drag column headers and drop
them into the left margin to summarize (group) data by the corresponding
values (or dimension).  There was a filter dialog that let you compose
complex <a href="https://en.wikipedia.org/wiki/Boolean_expression">Boolean expressions</a>
with a few mouse clicks.  That was <em>so</em> helpful and I miss it to this day.</p>

<p>After I left, I started looking around for a substitute and came across
<a href="https://pivottable.js.org/">PivotTable.js</a>.  I liked it but was immediately
<a href="https://xkcd.com/356/">nerd sniped</a> by 
<a href="https://github.com/nicolaskruchten/pivottable/wiki/Frequently-Asked-Questions#input-data-size">this comment</a>.
I looked at the code and thought that implementing 
<a href="https://en.wikipedia.org/wiki/Online_analytical_processing">OLAP</a> in JavaScript
probably contributed to data size limitations.  Around the same time, I noticed
that Chrome, like many industrial applications, embedded <a href="https://sqlite.org">SQLite</a>.
Not only that, it even <a href="https://www.w3.org/TR/webdatabase/">exposed it to JavaScript</a>.
Yes, the W3C page already sported the deprecation warning but I was young and naive
so I worked my way through the quirky Web SQL API and did a thing.</p>
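<p>To make the idea concrete, here is a minimal sketch of why pushing the pivot into SQLite is
attractive: dragging a column header into the margin amounts to a <code class="language-plaintext highlighter-rouge">GROUP BY</code>
and the filter dialog amounts to a <code class="language-plaintext highlighter-rouge">WHERE</code> clause.  It is written in python
rather than JavaScript, and the <code class="language-plaintext highlighter-rouge">violations</code> table, its columns, and the
<code class="language-plaintext highlighter-rouge">summarize</code> helper are all invented for illustration.  This is not the code
from <code class="language-plaintext highlighter-rouge">data-grid</code>.</p>

```python
import sqlite3

# Hypothetical toy data: the table and column names are invented for illustration.
ROWS = [
    ("MD", "speeding", 2),
    ("MD", "parking", 1),
    ("VA", "speeding", 3),
    ("VA", "speeding", 1),
    ("DC", "parking", 4),
]


def summarize(rows, dimension, min_points=0):
    """Count rows and sum points per value of `dimension`, after filtering."""
    # Column names cannot be bound as parameters, so check them against an allow-list.
    if dimension not in ("state", "kind"):
        raise ValueError(f"unknown dimension: {dimension}")
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE violations (state TEXT, kind TEXT, points INTEGER)")
        con.executemany("INSERT INTO violations VALUES (?, ?, ?)", rows)
        query = (
            f"SELECT {dimension}, COUNT(*), SUM(points) FROM violations "
            f"WHERE points >= ? GROUP BY {dimension} ORDER BY {dimension}"
        )
        return con.execute(query, (min_points,)).fetchall()
    finally:
        con.close()


print(summarize(ROWS, "state"))
print(summarize(ROWS, "kind", min_points=2))
```

<p>The database engine does the grouping and aggregation, so the scripting layer only ever sees
the (small) summary rather than every row.</p>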

<p><a href="https://react.dev">React</a> and <a href="https://angular.dev">Angular</a> were the leading Web UI
frameworks at the time but frameworks just irked me and I wanted something more
lightweight so I reached for <code class="language-plaintext highlighter-rouge">&lt;gasp&gt;</code><a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_components">Web Components</a><code class="language-plaintext highlighter-rouge">&lt;/gasp&gt;</code>!
<a href="https://github.com/nicolaskruchten/pivottable">PivotTable.js</a> also relied on
<a href="https://jquery.com/">jQuery</a> and I wouldn’t stand for that so I learned
the <a href="https://developer.mozilla.org/en-US/docs/Web/API/HTML_Drag_and_Drop_API">HTML Drag and Drop API</a>.
And that crazy gizmo kinda worked back in 2019!</p>

<p><a href="https://github.com/jburgy/data-grid">jburgy/data-grid</a> sat there, gathering dust
(and <a href="https://en.wikipedia.org/wiki/Software_rot">bit rot</a>) until I created an
issue to <a href="https://github.com/jburgy/data-grid/issues/1">Replace Web SQL</a> on the
Spring Equinox.  You see, Chrome finally did
<a href="https://developer.chrome.com/blog/deprecating-web-sql">deprecate and remove Web SQL</a>
as <del>threatened</del> promised.  And just like that, the little spark that set <code class="language-plaintext highlighter-rouge">data-grid</code> apart
was extinguished.  However, a great many things happened to the Web during that time.
Most importantly, <a href="https://webassembly.org/">WebAssembly</a> democratized embedding
non-web software in web browsers.  Much to their credit, the SQLite maintainers decided
to offer an official <a href="https://sqlite.org/wasm">WASM port</a> of their awesome product.</p>

<p><a href="https://www.npmjs.com/package/@sqlite.org/sqlite-wasm"><code class="language-plaintext highlighter-rouge">@sqlite.org/sqlite-wasm</code></a>
brought <code class="language-plaintext highlighter-rouge">data-grid</code> back from the brink.  I should also say that its 
<a href="https://sqlite.org/wasm/doc/trunk/api-worker1.md#promiser">Promise-based Wrapper</a>
is much less quirky than the old Web SQL API.  That first step was the easy part.
The heavy lift came from the fact that I insisted on embedding a demo in this blog
post.  For better or worse, I really liked how the most recent iteration of the
<a href="/2022/03/10/what-python-slow.html">What, python, slow?</a> post turned out:
using <a href="https://jupyterlite.readthedocs.io/"><code class="language-plaintext highlighter-rouge">JupyterLite</code></a>.  Had I known how much
additional work it would have created, I might have chosen a simpler approach.
(Nah, who am I kidding, since when do I do these because they’re easy?)</p>

<p>2019 <code class="language-plaintext highlighter-rouge">data-grid</code> also came as an incredibly clunky custom
<a href="https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20Custom.html">Jupyter Widget</a>.
You need to remember that 
<a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules">es6 modules</a>
were not widely supported back then so you still needed to contend with nonsense
like AMD and UMD.  <a href="https://anywidget.dev/"><code class="language-plaintext highlighter-rouge">anywidget</code></a> appeared in 2024 precisely
to make custom Jupyter Widgets easy (or at least easier if you don’t need to worry
about bundling dependencies).</p>

<p>Fortunately, many of the pieces were already in place.  Unfortunately, previous demos
on this blog pale in comparison with the complexity of this one, particularly when it
comes to <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import"><code class="language-plaintext highlighter-rouge">import</code></a>
depth.  <a href="https://github.com/gzuidhof/coi-serviceworker">gzuidhof/coi-serviceworker</a>, which
let me circumvent GitHub’s persistent lack of
<a href="https://github.com/orgs/community/discussions/54257">custom headers</a> many times before,
just <a href="https://stackoverflow.com/questions/79790645/active-service-worker-logging-but-not-intercepting-requests">wouldn’t work this time</a>.
I asked Claude and Copilot for help but they came up empty so I threw in the towel and
used a <a href="https://github.com/jburgy/ogoz">Cloudflare Worker</a>.  Its name is a nod to my
ancestors since my grandmother grew up on a farm that is now at the bottom of the lake
which surrounds <a href="https://en.wikipedia.org/wiki/%C3%8Ele_d%27Ogoz">l’Île d’Ogoz</a>.</p>

<p>Lastly, I needed to figure out how to mount the
<a href="https://developer.mozilla.org/en-US/docs/Web/API/File_System_API/Origin_private_file_system">Origin Private File System</a>
in a <a href="https://jupyterlite-pyodide-kernel.readthedocs.io/en/latest/">pyodide kernel</a>.
I had never heard of <a href="https://pyodide.org/en/stable/usage/api/js-api.html#module-pyodide">pyodide_js</a>
until today.  The amount of back-and-forth between python and JavaScript reminds me of
Robert Downey Jr’s character in “Tropic Thunder” (“I’m a dude playing a dude disguised as another dude”).</p>

<p>The notebook below runs entirely in your browser.  After installing
<a href="https://pypi.org/project/anywidget/">anywidget</a>, it fetches a public dataset on traffic
violations, massages it with <a href="https://pandas.pydata.org/docs/index.html"><code class="language-plaintext highlighter-rouge">pandas</code></a> before
saving it to a SQLite file in OPFS.  The widget uses
<a href="https://sqlite.org/wasm"><code class="language-plaintext highlighter-rouge">@sqlite.org/sqlite-wasm</code></a> to summarize that data.</p>

<iframe src="https://ogoz.jburgy.workers.dev/jupyter/notebooks/index.html?path=data_grid.ipynb" width="100%" height="900px" allow="cross-origin-isolated"></iframe>

<p>And if you read this far, first of all congratulations!  I realize you might still like an answer
to the question posed in this article’s title.  Besides sounding like a
<a href="https://en.wikipedia.org/wiki/Wallace_%26_Gromit">Wallace &amp; Gromit</a> invention,
the <a href="https://en.wikipedia.org/wiki/Veg-O-Matic">Veg-O-Matic</a> is the appliance that coined the
“It slices! It dices!” catchphrase.  I thought it was relevant since we’re discussing a tool
to slice and dice data.  Womp womp.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Or How do you Slice and Dice Data?]]></summary></entry><entry><title type="html">What is vibe coding?</title><link href="https://bur.gy/2025/09/27/what-is-vibe-coding.html" rel="alternate" type="text/html" title="What is vibe coding?" /><published>2025-09-27T15:09:18+00:00</published><updated>2025-09-27T15:09:18+00:00</updated><id>https://bur.gy/2025/09/27/what-is-vibe-coding</id><content type="html" xml:base="https://bur.gy/2025/09/27/what-is-vibe-coding.html"><![CDATA[<h2 id="i-put-machine-learning-in-your-machine-learning">I put machine learning in your machine learning</h2>

<p>First a disclaimer: I am a bit of an AI skeptic.  Not a luddite, not quite.  I do see
(the current iteration of) AI as impressive and impactful.  On the other hand, I do not
subscribe to the hype that AGI is just around the corner.  With that out of the way, let us try
and have fun with it all the same.</p>

<p>After several semi-successful attempts at coaxing answers to brain teasers out of
<a href="https://chatgpt.com/">ChatGPT</a> and <a href="https://www.deepseek.com/">DeepSeek</a> (and being
pleasantly surprised that it explained its solution to $\int_0^\infty \frac{\cos x}{1 + x^2} dx$
rather well), I thought of a project that <a href="https://copilot.microsoft.com/">Copilot</a>
could probably help with.  And help it did!  My initial prompt was</p>

<blockquote>
  <p>I want to build a web page in html5 and vanilla es6 modules that renders user facing webcam in a <code class="language-plaintext highlighter-rouge">&lt;media&gt;</code> tag so I can practice adding effects</p>
</blockquote>

<p>I realized later that I really meant <code class="language-plaintext highlighter-rouge">&lt;video&gt;</code> but that didn’t matter, Copilot created
a folder, named it <code class="language-plaintext highlighter-rouge">webcam-demo</code>, then created two files named <code class="language-plaintext highlighter-rouge">index.html</code> and
<code class="language-plaintext highlighter-rouge">main.js</code>.  And just like that, I could open the first file in a browser, grant it
video permissions, and see my face in a rectangle.  I had read just enough of
<a href="https://web.dev/articles/getusermedia-intro">this tutorial</a> to understand that the
generated code followed modern practices.  I continued with</p>

<blockquote>
  <p>I would like you to add a button. Upon clicking it a pair of pixelated sunglasses drops from the top of the <code class="language-plaintext highlighter-rouge">&lt;video&gt;</code> box and lands on the subject’s nose.</p>
</blockquote>

<p>Copilot outlined its plan then applied changes in-place.  The pixelated sunglasses didn’t 
look anything like I had in mind (more on that later) and the “dropping” animation
landed them arbitrarily ⅔ of the way down.  I tried describing the shape I wanted for the
sunglasses several times in vain.  That’s one of the cases where I jumped in
and edited the code by hand.  Then came the <em>real</em> surprise:</p>

<blockquote>
  <p>Do not estimate nose position. Use face detection so the sunglasses line up with the subject’s nose. Track that position as it changes</p>
</blockquote>

<p>And that was enough for Copilot to pull in <a href="https://justadudewhohacks.github.io/face-api.js/docs/index.html"><code class="language-plaintext highlighter-rouge">face-api.js</code></a>!
I knew enough to understand that I was never going to train a bespoke model for
such a standard task but had no idea how easy it would be to grab one off-the-shelf.
Twenty minutes in, I was essentially done.  The rest was asking Copilot to fix some
URLs that returned 404 (bit surprised it gave broken links initially if it “knew”
how to fix them) and reordering some functions to make <a href="https://eslint.org">ESLint</a>
less angry.  I cleaned up some other stylistic choices I didn’t like (e.g. passing nose position 
as a parameter to <code class="language-plaintext highlighter-rouge">drawTrackedSunglasses</code> instead of mutating a module-scoped variable,
using <code class="language-plaintext highlighter-rouge">.forEach</code> in <code class="language-plaintext highlighter-rouge">drawSunglasses</code> instead of nested <code class="language-plaintext highlighter-rouge">for</code> loops) but that was very
minor.  I even agreed with function names and their operations.</p>

<p>This little experiment is a far cry from what’s happening on <a href="https://simonwillison.net/">Simon Willison’s Weblog</a> 
but <a href="https://en.wikipedia.org/wiki/That%27s_My_Story_(song)">“that’s <em>my</em> story and I’m sticking to it”</a>.
As luck would have it, I recently grokked how to 
<a href="https://docs.github.com/en/pages/getting-started-with-github-pages/configuring-a-publishing-source-for-your-github-pages-site#publishing-with-a-custom-github-actions-workflow">publish to GitHub Pages with GitHub Actions</a>
so I can let you try this silly little “app” for yourself:</p>

<button id="drop-btn">Drop Sunglasses</button>
<br>
<div style="position: relative; display: inline-block;">
<video id="webcam" autoplay playsinline disablepictureinpicture style="display: block;"></video>
<canvas id="sunglasses-canvas" style="position: absolute; left: 0; top: 0; pointer-events: none;"></canvas>
</div>
<script type="module" src="https://bur.gy/blog/thug-life.js"></script>]]></content><author><name></name></author><summary type="html"><![CDATA[I put machine learning in your machine learning]]></summary></entry><entry><title type="html">Why are some functions foreign?</title><link href="https://bur.gy/2025/07/02/what-are-foreign-functions.html" rel="alternate" type="text/html" title="Why are some functions foreign?" /><published>2025-07-02T21:04:02+00:00</published><updated>2025-07-02T21:04:02+00:00</updated><id>https://bur.gy/2025/07/02/what-are-foreign-functions</id><content type="html" xml:base="https://bur.gy/2025/07/02/what-are-foreign-functions.html"><![CDATA[<p>In real-world software, layers accumulate like archeological
<a href="https://en.wikipedia.org/wiki/Stratigraphy_(archaeology)">stratigraphy</a>.
I recently cooked up a little utility which ended up touching layers spanning several
decades.  It all started when I found the 
<a href="https://github.com/awsdocs/aws-lambda-developer-guide/blob/main/sample-apps/layer-python/layer/2-package.sh">official recommendation</a>
from AWS to create a <a href="https://docs.aws.amazon.com/lambda/latest/dg/python-layers.html">python layer</a>
underwhelming.</p>

<p>Per the <a href="https://docs.aws.amazon.com/lambda/latest/dg/python-layers.html#python-layers-package">AWS documentation</a>,
a layer is a <code class="language-plaintext highlighter-rouge">.zip</code> archive which includes a <code class="language-plaintext highlighter-rouge">python</code> top-level directory.  That root
directory contains the <code class="language-plaintext highlighter-rouge">platlib</code> path of the relevant
<a href="https://docs.python.org/3/glossary.html#term-virtual-environment">virtual environment</a>.
The <a href="https://github.com/awsdocs/aws-lambda-developer-guide">AWS Lambda Developer Guide</a>
addresses this in 3 steps:</p>

<ol>
  <li>create a new directory named <code class="language-plaintext highlighter-rouge">python</code> (<code class="language-plaintext highlighter-rouge">mkdir</code>)</li>
  <li>copy <code class="language-plaintext highlighter-rouge">&lt;venv&gt;/lib</code> <em>recursively</em> to the newly created <code class="language-plaintext highlighter-rouge">python</code> (<code class="language-plaintext highlighter-rouge">cp -r</code>)</li>
  <li>zip the <code class="language-plaintext highlighter-rouge">python</code> directory <em>recursively</em> (<code class="language-plaintext highlighter-rouge">zip -r</code>)</li>
</ol>

<p>This is objectively silly.  You shouldn’t need to copy an entire folder just so you can rename it.
<a href="https://www.7-zip.org/">7-Zip</a> lets you rename files (even folders) inside a zip archive in-place.
But that still requires two separate commands (oh, the humanity)!  But hang on, the
<a href="https://docs.python.org/3/library/zlib.html">python standard library</a> supports the 
<a href="https://en.wikipedia.org/wiki/ZIP_(file_format)">ZIP format</a> (through its <code class="language-plaintext highlighter-rouge">zipfile</code> module)!  And if you’re worried about
performance, fear not, python’s <code class="language-plaintext highlighter-rouge">zlib</code>, which <code class="language-plaintext highlighter-rouge">zipfile</code> relies on for deflate compression, is a <a href="https://github.com/python/cpython/blob/main/Modules/zlibmodule.c">C wrapper</a>
around this <a href="https://www.zlib.net/">zlib</a>.</p>

<p>So we can write a simple python program that looks roughly like</p>
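<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">sysconfig</span>
<span class="kn">from</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>  <span class="c1"># Path.walk needs python 3.12+</span>
<span class="kn">from</span> <span class="nn">zipfile</span> <span class="kn">import</span> <span class="n">ZipFile</span>

<span class="n">platlib</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">sysconfig</span><span class="p">.</span><span class="n">get_path</span><span class="p">(</span><span class="s">"platlib"</span><span class="p">))</span>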
<span class="n">data</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">sysconfig</span><span class="p">.</span><span class="n">get_path</span><span class="p">(</span><span class="s">"data"</span><span class="p">))</span>

<span class="k">with</span> <span class="n">ZipFile</span><span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">zipfile</span><span class="p">,</span> <span class="s">"w"</span><span class="p">)</span> <span class="k">as</span> <span class="n">zf</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">root</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">files</span> <span class="ow">in</span> <span class="n">platlib</span><span class="p">.</span><span class="n">walk</span><span class="p">():</span>
        <span class="n">arcroot</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="s">"python"</span><span class="p">)</span> <span class="o">/</span> <span class="n">root</span><span class="p">.</span><span class="n">relative_to</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
        <span class="k">for</span> <span class="nb">file</span> <span class="ow">in</span> <span class="n">files</span><span class="p">:</span>
            <span class="n">zf</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">filename</span><span class="o">=</span><span class="n">root</span> <span class="o">/</span> <span class="nb">file</span><span class="p">,</span> <span class="n">arcname</span><span class="o">=</span><span class="n">arcroot</span> <span class="o">/</span> <span class="nb">file</span><span class="p">)</span>
</code></pre></div></div>

<p>(slightly circular to package your own virtual environment but very reasonable if you think about it.)
This is great but it runs for a while and gives you no sense of what it’s doing.  Gotta fix that.</p>

<p>And this is where the story gets interesting because I decided to get cute.  I didn’t want my utility
to just “vomit” a wall of text to its standard output.  That’s not particularly helpful.  It would be
much more helpful to update one or a few lines of output as we progress.  I knew that was possible but
the details were hazy.  So I read up on 
<a href="https://en.wikipedia.org/wiki/ANSI_escape_code#Control_Sequence_Introducer_commands">Control Sequence Introducer</a>
commands.  Unfortunately, <code class="language-plaintext highlighter-rouge">python</code> does not <a href="https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences">understand</a> 
<code class="language-plaintext highlighter-rouge">"\e"</code> like <a href="https://tldp.org/LDP/Bash-Beginners-Guide/html/Bash-Beginners-Guide.html#tab_08_01">Bash</a>.
People often use <code class="language-plaintext highlighter-rouge">"\033"</code> or <code class="language-plaintext highlighter-rouge">"\x1b"</code> instead but that’s not super readable.  Fear not, <code class="language-plaintext highlighter-rouge">python</code>
accepts <code class="language-plaintext highlighter-rouge">"\N{escape}"</code> which is rather readable.  With that,</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"</span><span class="se">\N{escape}</span><span class="s">[</span><span class="si">{</span><span class="n">n</span><span class="si">}</span><span class="s">F</span><span class="se">\N{escape}</span><span class="s">[J"</span><span class="p">,</span> <span class="n">end</span><span class="o">=</span><span class="s">""</span><span class="p">)</span>
</code></pre></div></div>

<p>F (or Cursor Previous Line) “moves the cursor to beginning of the line <em>n</em> (default 1) lines up” then
J (or Erase in Display) “clears part of the screen. If <em>n</em> is 0 (or missing), clear from cursor to end of screen.”
Printing this rather cryptic string lets you print a few lines (like, for example, the last <em>n</em> files
added to the archive), erase them, and print some more.  The refresh rate on your terminal is likely high
enough that your output looks animated.  Furthermore, you can use
<a href="https://docs.python.org/3/library/collections.html#collections.deque"><code class="language-plaintext highlighter-rouge">collections.deque</code></a>’s <em>maxlen</em>
parameter to easily keep track of those last <em>n</em> files.  That’s quite nice but is it nice enough?</p>
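<p>Put together, the refresh loop might look like the sketch below.  The helper names
(<code class="language-plaintext highlighter-rouge">refresh</code>, <code class="language-plaintext highlighter-rouge">show_progress</code>) and the fake
file names are mine, chosen for illustration, and the sketch assumes no line is wider than the terminal.</p>

```python
import sys
from collections import deque

ESC = "\N{escape}"  # same character as "\x1b" and "\033"


def refresh(lines, previous_count):
    """Build the string that replaces `previous_count` printed lines with `lines`."""
    # CSI n F jumps to the start of the line n rows up; CSI J erases to end of screen.
    rewind = f"{ESC}[{previous_count}F{ESC}[J" if previous_count else ""
    return rewind + "".join(f"{line}\n" for line in lines)


def show_progress(names, n=3, out=sys.stdout):
    """Keep only the last `n` names on screen while iterating over `names`."""
    recent = deque(maxlen=n)  # older entries fall off the left automatically
    printed = 0
    for name in names:
        recent.append(name)
        out.write(refresh(recent, printed))
        out.flush()
        printed = len(recent)


if __name__ == "__main__":
    # Hypothetical file names, just to have something to scroll through.
    show_progress(f"file_{i}.py" for i in range(10))
```

<p>Separating the string building from the writing keeps the escape-sequence logic easy to
check without a real terminal.</p>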

<p>Actually, one can use <a href="https://en.wikipedia.org/wiki/Box-drawing_characters">box-drawing characters</a> to
make those few lines look like the output of <a href="https://en.wikipedia.org/wiki/Tree_(command)"><code class="language-plaintext highlighter-rouge">tree</code></a>.
That lets you shrink the width of the output since virtual environments nest, leading
to long paths. (Funny side note, my initial implementation was not always clearing the screen
properly because it didn’t account for line wraps requiring clearing more lines than printed).  At
that point, I remembered that building zig generated precisely that kind of “scrolling tree” output.
A quick web search took me to
<a href="https://ziggit.dev/t/zigs-new-cli-progress-bar-explained/4499">Zig’s New CLI Progress Bar Explained</a>.
Yikes, Andrew really went nuts on that “infallible and non-heap-allocating” implementation!  For once,
my <a href="https://wiki.c2.com/?LazinessImpatienceHubris">laziness</a> beat out my hubris and I decided to not
reimplement <a href="https://github.com/ziglang/zig/blob/master/lib/std/Progress.zig"><code class="language-plaintext highlighter-rouge">Progress.zig</code></a> in python.
Instead, I decided to <em>expose</em> it to python.</p>
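<p>As an aside, the line-wrap bug mentioned above boils down to counting terminal rows rather
than printed lines.  Here is a rough sketch of the two usual remedies, with function names that are
hypothetical and the simplifying assumption that every character is exactly one cell wide (not true for
all of Unicode):</p>

```python
import shutil


def rows_occupied(lines, columns):
    """Number of terminal rows `lines` take up once long lines wrap at `columns`."""
    # An empty line still occupies one row; -(-a // b) is ceiling division.
    return sum(max(1, -(-len(line) // columns)) for line in lines)


def truncate(line, columns):
    """Alternative remedy: cut the line so it can never wrap."""
    return line[:columns]


columns = shutil.get_terminal_size().columns  # falls back to 80 when unknown
lines = ["short", "x" * 200]
print(rows_occupied(lines, columns), "rows to move up instead of", len(lines))
```

<p>Either approach keeps the number of rows erased in sync with the number of rows actually drawn.</p>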

<p>As luck would have it, I recently explored
<a href="/2025/05/01/zig-from-python.html">How do you call Zig from python?</a> With that knowledge,
I quickly whipped up</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">std</span> <span class="o">=</span> <span class="err">@</span><span class="n">import</span><span class="p">(</span><span class="s">"std"</span><span class="p">);</span>
<span class="k">const</span> <span class="n">py</span> <span class="o">=</span> <span class="err">@</span><span class="n">import</span><span class="p">(</span><span class="s">"pydust"</span><span class="p">);</span>

<span class="k">const</span> <span class="n">root</span> <span class="o">=</span> <span class="err">@</span><span class="n">This</span><span class="p">();</span>

<span class="n">pub</span> <span class="k">const</span> <span class="n">Progress</span> <span class="o">=</span> <span class="n">py</span><span class="p">.</span><span class="n">class</span><span class="p">(</span><span class="k">struct</span> <span class="p">{</span>
    <span class="k">const</span> <span class="n">Self</span> <span class="o">=</span> <span class="err">@</span><span class="n">This</span><span class="p">();</span>
    <span class="nl">index:</span> <span class="n">std</span><span class="p">.</span><span class="n">Progress</span><span class="p">.</span><span class="n">Node</span><span class="p">.</span><span class="n">OptionalIndex</span><span class="p">,</span>

    <span class="n">pub</span> <span class="n">fn</span> <span class="n">__init__</span><span class="p">(</span><span class="n">self</span><span class="o">:</span> <span class="o">*</span><span class="n">Self</span><span class="p">)</span> <span class="o">!</span><span class="kt">void</span> <span class="p">{</span>
        <span class="n">self</span><span class="p">.</span><span class="n">index</span> <span class="o">=</span> <span class="n">std</span><span class="p">.</span><span class="n">Progress</span><span class="p">.</span><span class="n">start</span><span class="p">(.{}).</span><span class="n">index</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">pub</span> <span class="n">fn</span> <span class="n">start</span><span class="p">(</span>
        <span class="nl">self:</span> <span class="o">*</span><span class="k">const</span> <span class="n">Self</span><span class="p">,</span>
        <span class="nl">args:</span> <span class="k">struct</span> <span class="p">{</span> <span class="n">name</span><span class="o">:</span> <span class="n">py</span><span class="p">.</span><span class="n">PyString</span><span class="p">,</span> <span class="n">estimated_total_items</span><span class="o">:</span> <span class="n">usize</span> <span class="p">},</span>
    <span class="p">)</span> <span class="o">!*</span><span class="k">const</span> <span class="n">Self</span> <span class="p">{</span>
        <span class="k">const</span> <span class="n">parent</span><span class="o">:</span> <span class="n">std</span><span class="p">.</span><span class="n">Progress</span><span class="p">.</span><span class="n">Node</span> <span class="o">=</span> <span class="p">.{</span> <span class="p">.</span><span class="n">index</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">index</span> <span class="p">};</span>
        <span class="k">const</span> <span class="n">node</span> <span class="o">=</span> <span class="n">parent</span><span class="p">.</span><span class="n">start</span><span class="p">(</span><span class="n">try</span> <span class="n">args</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">asSlice</span><span class="p">(),</span> <span class="n">args</span><span class="p">.</span><span class="n">estimated_total_items</span><span class="p">);</span>
        <span class="k">return</span> <span class="n">py</span><span class="p">.</span><span class="n">init</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">Self</span><span class="p">,</span> <span class="p">.{</span> <span class="p">.</span><span class="n">index</span> <span class="o">=</span> <span class="n">node</span><span class="p">.</span><span class="n">index</span> <span class="p">});</span>
    <span class="p">}</span>

    <span class="n">pub</span> <span class="n">fn</span> <span class="n">end</span><span class="p">(</span><span class="n">self</span><span class="o">:</span> <span class="o">*</span><span class="k">const</span> <span class="n">Self</span><span class="p">)</span> <span class="kt">void</span> <span class="p">{</span>
        <span class="k">const</span> <span class="n">node</span><span class="o">:</span> <span class="n">std</span><span class="p">.</span><span class="n">Progress</span><span class="p">.</span><span class="n">Node</span> <span class="o">=</span> <span class="p">.{</span> <span class="p">.</span><span class="n">index</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">index</span> <span class="p">};</span>
        <span class="n">node</span><span class="p">.</span><span class="n">end</span><span class="p">();</span>
    <span class="p">}</span>

    <span class="n">pub</span> <span class="n">fn</span> <span class="n">complete_one</span><span class="p">(</span><span class="n">self</span><span class="o">:</span> <span class="o">*</span><span class="k">const</span> <span class="n">Self</span><span class="p">)</span> <span class="kt">void</span> <span class="p">{</span>
        <span class="k">const</span> <span class="n">node</span><span class="o">:</span> <span class="n">std</span><span class="p">.</span><span class="n">Progress</span><span class="p">.</span><span class="n">Node</span> <span class="o">=</span> <span class="p">.{</span> <span class="p">.</span><span class="n">index</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">index</span> <span class="p">};</span>
        <span class="n">node</span><span class="p">.</span><span class="n">completeOne</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">});</span>

<span class="n">comptime</span> <span class="p">{</span>
    <span class="n">py</span><span class="p">.</span><span class="n">rootmodule</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>based on <a href="https://andrewkelley.me/post/zig-new-cli-progress-bar-explained.html#zp.zig"><code class="language-plaintext highlighter-rouge">zp.zig</code></a>
(with thanks to <a href="https://github.com/bridgeQiao"><code class="language-plaintext highlighter-rouge">@bridgeQiao</code></a> for removing <code class="language-plaintext highlighter-rouge">root</code> where it was
not strictly needed).</p>

<p>Writing the code was trivial but getting <a href="https://mesonbuild.com/meson-python/">meson-python</a>
to build and install it was a real trip, see
<a href="https://github.com/mesonbuild/meson/discussions/14763">mesonbuild/meson#14763</a>.  And it made
<a href="https://github.com/jburgy/blog/actions">CI</a> almost 30s slower!  I started wondering how much
Ziggy Pydust’s comptime magic contributed to that slowdown so I 
<a href="https://github.com/jburgy/blog/pull/3">ripped it out</a>!  That led me to discover and
<a href="https://github.com/ziglang/zig/issues/24413">report a bug</a> with the Zig toolchain.  And the
savings were barely noticeable.  But the poet laureate of 
<a href="https://en.wikipedia.org/wiki/Bedford%E2%80%93Stuyvesant,_Brooklyn">Bed-Stuy</a> said it best:
“And if you don’t know, now you know”.</p>

<p>Ok, cool, so what did we learn from all this?  To me, the real amusing part was the relative
ages of the various bits of software involved in the making of this silly utility.  In chronological
order:</p>

<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>Year</th>
      <th>Role played</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Unix Signals</td>
      <td>1972</td>
      <td>Handle SIGWINCH to truncate lines and erase correctly</td>
    </tr>
    <tr>
      <td>Control Sequence Introducers</td>
      <td>1976</td>
      <td>Manipulate standard output beyond appending to it</td>
    </tr>
    <tr>
      <td>ZIP file format<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></td>
      <td>1989</td>
      <td>Because that’s what AWS chose in 2014</td>
    </tr>
    <tr>
      <td>Python</td>
      <td>1989</td>
      <td>High-level and general-purpose</td>
    </tr>
    <tr>
      <td>Meson</td>
      <td>2013</td>
      <td>Best build backend for compiled extensions</td>
    </tr>
    <tr>
      <td>Zig</td>
      <td>2016</td>
      <td>“programming language designed for making perfect software”</td>
    </tr>
    <tr>
      <td>Ziggy Pydust<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup></td>
      <td>2023</td>
      <td>“Framework for building native Python extension in Zig”</td>
    </tr>
    <tr>
      <td>uv</td>
      <td>2024</td>
      <td>“extremely fast Python package installer and resolver”</td>
    </tr>
  </tbody>
</table>
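<p>For the curious, the two oldest rows fit in a few lines of python.  This is a minimal sketch rather than the utility’s actual code: <code class="language-plaintext highlighter-rouge">status</code>, <code class="language-plaintext highlighter-rouge">on_resize</code> and <code class="language-plaintext highlighter-rouge">install_resize_handler</code> are hypothetical names, and it assumes a Unix terminal that honours the standard “erase in line” sequence (<code class="language-plaintext highlighter-rouge">CSI K</code>).</p>

```python
import shutil
import signal

CSI = "\N{escape}["  # Control Sequence Introducer: ESC followed by "["

# Terminal width, refreshed by the SIGWINCH handler below.
columns = shutil.get_terminal_size().columns


def on_resize(signum, frame):
    """SIGWINCH handler: the window changed size, so re-read its width."""
    global columns
    columns = shutil.get_terminal_size().columns


def install_resize_handler():
    """Register the handler (main thread only; SIGWINCH is Unix-specific)."""
    if hasattr(signal, "SIGWINCH"):
        signal.signal(signal.SIGWINCH, on_resize)


def status(line: str, width: int) -> str:
    """Build a string that overwrites the current terminal line with `line`."""
    # Carriage return goes back to column 0, the slice truncates so the text
    # never wraps, and CSI K erases leftovers from a longer previous line.
    return "\r" + line[:width] + CSI + "K"
```

<p>Calling <code class="language-plaintext highlighter-rouge">install_resize_handler()</code> once and then writing <code class="language-plaintext highlighter-rouge">status(text, columns)</code> to standard output for each update should keep a progress line on a single row even when the window is resized.</p>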

<p>If anything, this proves that, in the world of software, maybe you <em>can</em> teach old dogs new tricks!
Obviously, all of this goes back to the
<a href="https://en.wikipedia.org/wiki/Von_Neumann_architecture">Von Neumann architecture</a> from 1945
but that’s true of every software project so I chose to leave it out.  The above selection
is trying to enumerate choices that are particularly salient to the tool being discussed.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p><a href="https://discuss.python.org/t/disable-shell-word-wrapping/4297/5">Truncating</a> is one way to avoid line wraps but the <a href="https://en.wikipedia.org/wiki/VT100">VT100</a> has escape sequences to control <a href="https://www.vt100.net/docs/vt102-ug/chapter5.html#S5.5.2.8">auto wrap</a> (<code class="language-plaintext highlighter-rouge">"\N{escape}[?7l"</code>) <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>“It’s not the destination, it’s the journey.” ― Ralph Waldo Emerson <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><summary type="html"><![CDATA[In real-world software, layers accumulate like archeological stratigraphy. I recently cooked up a little utility which ended up touching layers spanning several decades. It all started when I found the official recommendation from AWS to create a python layer underwhelming.]]></summary></entry><entry><title type="html">How do you call Zig from python?</title><link href="https://bur.gy/2025/05/01/zig-from-python.html" rel="alternate" type="text/html" title="How do you call Zig from python?" /><published>2025-05-01T14:27:41+00:00</published><updated>2025-05-01T14:27:41+00:00</updated><id>https://bur.gy/2025/05/01/zig-from-python</id><content type="html" xml:base="https://bur.gy/2025/05/01/zig-from-python.html"><![CDATA[<p>5/19/25 update: See my <a href="https://zignyc.github.io/">Zig NYC #4</a>
<a href="https://html-preview.github.io/?url=https://github.com/jburgy/blog/blob/main/talks/pydust.html">slides</a></p>

<p>As this <a href="/2024/08/31/why-not-zig.html">previous post</a> demonstrates,
I rather enjoy <a href="https://ziglang.org/">Zig</a>.  However, software I write
professionally is almost exclusively in <a href="https://www.python.org/">python</a>.
Now, because <a href="https://github.com/python/cpython">CPython</a> is the most common
implementation, people often write the performance-critical parts of their
python applications in <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>.
There is an <a href="https://docs.python.org/3/extending/extending.html">official tutorial</a>
documenting that process.  The amount of boilerplate required stands out rather
quickly in that tutorial.  People have come up with a number of utilities to
mitigate that boilerplate including, but not limited to, 
<a href="https://docs.python.org/3/library/ctypes.html">ctypes</a>,
<a href="https://cython.org/">Cython</a>,
<a href="https://cffi.rtfd.io/">CFFI</a>,
<a href="https://www.swig.org/">SWIG</a>,
<a href="https://numpy.org/doc/stable/f2py/index.html">F2PY</a>,
and <a href="http://pybind11.rtfd.io/">pybind11</a>.</p>

<p>At the same time, Zig advertises its 
<a href="https://ziglang.org/learn/overview/#integration-with-c-libraries-without-ffibindings">integration with C libraries without FFI/bindings</a>.
Moreover, Zig also offers
<a href="https://ziglang.org/learn/overview/#compile-time-reflection-and-compile-time-code-execution">compile-time reflection and compile-time code execution</a>
which can surely mitigate boilerplate.
It stands to reason that Zig is a good candidate to
implement performance-critical sections.  The fine folks at <a href="https://spiraldb.com/">spiral</a>
realized that opportunity and released <a href="https://pydust.fulcrum.so/latest/">Ziggy Pydust</a>
in 2023.  They focused a great deal of energy on ergonomics as illustrated by their
first example:</p>
<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">py</span> <span class="o">=</span> <span class="nb">@import</span><span class="p">(</span><span class="s">"pydust"</span><span class="p">);</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="n">fibonacci</span><span class="p">(</span><span class="n">args</span><span class="p">:</span> <span class="k">struct</span> <span class="p">{</span> <span class="n">n</span><span class="p">:</span> <span class="kt">u64</span> <span class="p">})</span> <span class="kt">u64</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="py">n</span> <span class="o">&lt;</span> <span class="mi">2</span><span class="p">)</span> <span class="k">return</span> <span class="n">args</span><span class="p">.</span><span class="py">n</span><span class="p">;</span>

    <span class="k">var</span> <span class="n">sum</span><span class="p">:</span> <span class="kt">u64</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">var</span> <span class="n">last</span><span class="p">:</span> <span class="kt">u64</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">var</span> <span class="n">curr</span><span class="p">:</span> <span class="kt">u64</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="mi">1</span><span class="o">..</span><span class="n">args</span><span class="p">.</span><span class="py">n</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">sum</span> <span class="o">=</span> <span class="n">last</span> <span class="o">+</span> <span class="n">curr</span><span class="p">;</span>
        <span class="n">last</span> <span class="o">=</span> <span class="n">curr</span><span class="p">;</span>
        <span class="n">curr</span> <span class="o">=</span> <span class="n">sum</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">sum</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">comptime</span> <span class="p">{</span>
    <span class="n">py</span><span class="p">.</span><span class="nf">rootmodule</span><span class="p">(</span><span class="nb">@This</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Unfortunately for them, Zig has not reached 
<a href="https://en.wikipedia.org/wiki/Software_versioning#Version_1.0_as_a_milestone">version 1.0</a>
and makes virtually no promise of stability.  True to that spirit, Zig 0.12
banned <a href="https://ziggit.dev/t/comptime-mutable-memory-changes/3702#what-about-my-global-mutable-comptime-state-3">global mutable comptime state</a> which
Ziggy Pydust <a href="https://github.com/spiraldb/ziggy-pydust/discussions/428">relied on heavily</a>.
Ziggy Pydust was stuck on Zig 0.11 and upgrading “isn’t an easy thing to do.”  Such was the
state of affairs when I came across Ziggy Pydust.  And that’s when I remembered
<a href="https://en.wikipedia.org/wiki/Larry_Wall">Larry Wall</a>’s <a href="https://threevirtues.dev/">three virtues</a>.</p>

<dl>
  <dt>Laziness</dt>
  <dd>There are so many things I should be doing instead of this.  Sounds like a great excuse to procrastinate!</dd>
  <dt>Impatience</dt>
  <dd>What do you mean “We can try to reach consensus”?  I want to be able to write python extensions in Zig now!</dd>
  <dt>Hubris</dt>
  <dd>What do you mean “This isn’t an easy thing to do”?  Surely I am clever or at the very least stubborn enough!</dd>
</dl>

<p>So I made the rather ill-advised decision to throw my hat in that race.  Much to his credit,
<a href="https://github.com/gatesn">Nicholas Gates</a> pointed me straight to the 
<a href="https://github.com/spiraldb/ziggy-pydust/discussions/428#discussioncomment-12303082">problem area</a>
and please believe me when I say that I tried
<a href="https://github.com/spiraldb/ziggy-pydust/discussions/428#discussioncomment-12671308">a lot of approaches</a>.
I ended up finding one approach leveraging
<a href="https://ziglang.org/documentation/master/#struct">comptime memoization</a>.
Unfortunately, this approach also requires passing the top-level (<code class="language-plaintext highlighter-rouge">root</code>) type all the way
down the call stack.  Depending on 
<a href="https://softwareengineering.stackexchange.com/questions/335005/is-there-a-name-for-the-anti-pattern-of-passing-parameters-that-will-only-be">who you ask</a>,
this is either an anti-pattern called
<a href="https://en.wiktionary.org/wiki/tramp_data">“tramp data”</a> or, more generously,
a pattern called <a href="https://en.wikipedia.org/wiki/Dependency_injection">“Dependency Injection”</a>.
For what it’s worth, I admitted to not being a fan of what I called this
“root pollution” in private emails with 
<a href="https://github.com/robert3005">Robert Kruszewski</a> where I was asking (nay begging) him
to review <a href="https://github.com/spiraldb/ziggy-pydust/pull/429">#429</a>.  In the end, I reasoned
that passing an extra argument around felt like a fitting substitute for
<a href="https://news.ycombinator.com/item?id=9874824">global state</a>.</p>
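<p>The trade-off may be easier to see in a rough python analogy (this is not Ziggy Pydust’s actual code).  Zig memoizes comptime function calls, so <code class="language-plaintext highlighter-rouge">py.PyString(root)</code> names the same type every time it is evaluated with the same <code class="language-plaintext highlighter-rouge">root</code>; <code class="language-plaintext highlighter-rouge">functools.cache</code> plays that role below.  The names <code class="language-plaintext highlighter-rouge">ExampleRoot</code>, <code class="language-plaintext highlighter-rouge">module_name</code> and <code class="language-plaintext highlighter-rouge">describe</code> are purely illustrative.</p>

```python
import functools


@functools.cache
def PyString(root: type) -> type:
    """Return a string wrapper specialized for `root`, built once per root."""

    class _PyString:
        # The root travels with the type instead of living in global state.
        module = root

        def __init__(self, value: str):
            self.value = value

        @classmethod
        def create(cls, value: str) -> "_PyString":
            return cls(value)

        def describe(self) -> str:
            return f"{self.value} (from {self.module.module_name})"

    return _PyString


class ExampleRoot:
    """Stand-in for the top-level module type threaded through every call."""

    module_name = "example"


def hello(root: type):
    # Every helper has to accept `root` and hand it on: the "tramp data".
    return PyString(root).create("Hello!")
```

<p>Because of the cache, <code class="language-plaintext highlighter-rouge">PyString(ExampleRoot) is PyString(ExampleRoot)</code> holds, which is what lets an extra argument stand in for shared state.</p>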

<p>Now the good news, bad news of it all.  Ziggy Pydust is no longer stuck on Zig 0.11.
I think we can all agree that’s reasonably good news.  This work paved the way
for <a href="https://github.com/bridgeQiao">@bridgeQiao</a> to update Ziggy Pydust to
<a href="https://github.com/spiraldb/ziggy-pydust/pull/441">Zig 0.14</a>!  On the flip side, I saw
no way but to break the Ziggy Pydust API.</p>

<p>Before <a href="https://github.com/spiraldb/ziggy-pydust/pull/429">#429</a>:</p>
<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">py</span> <span class="o">=</span> <span class="nb">@import</span><span class="p">(</span><span class="s">"pydust"</span><span class="p">);</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="n">hello</span><span class="p">()</span> <span class="o">!</span><span class="n">py</span><span class="p">.</span><span class="py">PyString</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">try</span> <span class="n">py</span><span class="p">.</span><span class="py">PyString</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="s">"Hello!"</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">comptime</span> <span class="p">{</span>
    <span class="n">py</span><span class="p">.</span><span class="nf">rootmodule</span><span class="p">(</span><span class="nb">@This</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p>After <a href="https://github.com/spiraldb/ziggy-pydust/pull/429">#429</a>:</p>
<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">py</span> <span class="o">=</span> <span class="nb">@import</span><span class="p">(</span><span class="s">"pydust"</span><span class="p">);</span>

<span class="k">const</span> <span class="n">root</span> <span class="o">=</span> <span class="nb">@This</span><span class="p">();</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="n">hello</span><span class="p">()</span> <span class="o">!</span><span class="n">py</span><span class="p">.</span><span class="nf">PyString</span><span class="p">(</span><span class="n">root</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">try</span> <span class="n">py</span><span class="p">.</span><span class="nf">PyString</span><span class="p">(</span><span class="n">root</span><span class="p">).</span><span class="nf">create</span><span class="p">(</span><span class="s">"Hello!"</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">comptime</span> <span class="p">{</span>
    <span class="n">py</span><span class="p">.</span><span class="nf">rootmodule</span><span class="p">(</span><span class="n">root</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>And for that,</p>
<div style="padding-top:42.600%;position:relative;">
    <iframe src="https://gifer.com/embed/RJ8L" width="100%" height="100%" style="position:absolute;top:0;left:0;" frameborder="0" allowfullscreen="">
    </iframe>
</div>]]></content><author><name></name></author><summary type="html"><![CDATA[5/19/25 update: See my Zig NYC #4 slides]]></summary></entry><entry><title type="html">How much can you pack into bytes?</title><link href="https://bur.gy/2025/01/21/how-much-can-you-pack.html" rel="alternate" type="text/html" title="How much can you pack into bytes?" /><published>2025-01-21T18:26:26+00:00</published><updated>2025-01-21T18:26:26+00:00</updated><id>https://bur.gy/2025/01/21/how-much-can-you-pack</id><content type="html" xml:base="https://bur.gy/2025/01/21/how-much-can-you-pack.html"><![CDATA[<p>I like <a href="https://ziglang.org/">Zig</a> because it encourages me to pay attention to details when I use
it.  I have recently come up with an interesting pattern in two separate use cases.  I think it
warrants a quick blog post.  The pattern is a curious mix of abstraction levels.  On the one hand,
it’s high-level because it prefers Zig’s 
<a href="https://ziglang.org/documentation/master/std/#std.array_list.ArrayList">std.ArrayList</a>
over the lower level
<a href="https://ziglang.org/documentation/master/std/#std.mem.Allocator">std.mem.Allocator</a>.
On the other, it’s low-level because it works best when you take memory alignment into account.
The TL;DR version is that I use <code class="language-plaintext highlighter-rouge">std.ArrayList</code> as a special-purpose
<a href="https://ziglang.org/documentation/master/std/#std.heap.ArenaAllocator">std.heap.ArenaAllocator</a>.
This opens a memory saving opportunity by converting pointers to indices.</p>

<h2 id="flattening-asts">Flattening ASTs</h2>

<p>The first use case came about through a confluence of three events:</p>

<ul>
  <li>I <a href="https://discord.com/channels/605571803288698900/1325848433965400105">struggled with a memory leak</a>
in a hand-written <a href="https://en.wikipedia.org/wiki/Recursive_descent_parser">recursive descent parser</a></li>
  <li>I read <a href="https://www.cs.cornell.edu/~asampson/">Adrian Sampson</a>’s
<a href="https://www.cs.cornell.edu/~asampson/blog/flattening.html">Flattening ASTs (and Other Compiler Data Structures)</a></li>
  <li>I recalled watching <a href="https://andrewkelley.me/">Andrew Kelley</a>’s
<a href="https://www.youtube.com/watch?v=IroPQ150F6c">Practical Data Oriented Design (DoD)</a></li>
</ul>

<p>Although I didn’t realize it at the time, one sentence in Adrian’s post summarizes the idea:
“<em>Arenas</em> or <em>regions</em> mean many different things to different people, so I’m going to call
the specific flavor I’m interested in here <em>data structure flattening</em>.”  My initial motivation
was that I didn’t like how a seemingly simple <code class="language-plaintext highlighter-rouge">Node</code> data structure required such a complex
<code class="language-plaintext highlighter-rouge">.destroy</code> method:</p>

<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">Node</span> <span class="o">=</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="n">token</span><span class="p">:</span> <span class="n">Token</span><span class="p">,</span>
    <span class="n">args</span><span class="p">:</span> <span class="p">[]</span><span class="k">const</span> <span class="o">*</span><span class="k">const</span> <span class="n">Node</span><span class="p">,</span>

    <span class="k">pub</span> <span class="k">fn</span> <span class="n">destroy</span><span class="p">(</span><span class="n">self</span><span class="p">:</span> <span class="o">*</span><span class="n">Node</span><span class="p">,</span> <span class="n">allocator</span><span class="p">:</span> <span class="n">Allocator</span><span class="p">)</span> <span class="k">void</span> <span class="p">{</span>
        <span class="k">for</span> <span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="py">args</span><span class="p">)</span> <span class="p">|</span><span class="n">arg</span><span class="p">|</span> <span class="n">arg</span><span class="p">.</span><span class="nf">destroy</span><span class="p">(</span><span class="n">allocator</span><span class="p">);</span>
        <span class="n">allocator</span><span class="p">.</span><span class="nf">free</span><span class="p">(</span><span class="n">self</span><span class="p">.</span><span class="py">args</span><span class="p">);</span>
        <span class="n">allocator</span><span class="p">.</span><span class="nf">destroy</span><span class="p">(</span><span class="n">self</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>

<p><a href="https://discord.com/channels/605571803288698900/1325848433965400105/1325857847917154325">Cloudef suggested an arena</a>
but I was reluctant because I feared it would create too much
<a href="https://en.wikipedia.org/wiki/Coupling_(computer_programming)">coupling</a> between a type and the allocator
that creates its instances.  Teralux also warned that
<a href="https://discord.com/channels/605571803288698900/1325848433965400105/1325860143820439585">“[f]reeing graphs is generally a slow operation.”</a>  Furthermore, echoes of Andrew’s “Data Oriented Design” talk were still brewing in my
left hippocampus.  I put them all together into</p>

<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">const</span> <span class="n">Node</span> <span class="o">=</span> <span class="k">packed</span> <span class="k">union</span> <span class="p">{</span>
    <span class="n">head</span><span class="p">:</span> <span class="k">packed</span> <span class="k">struct</span><span class="p">(</span><span class="kt">u32</span><span class="p">)</span> <span class="p">{</span> <span class="n">token</span><span class="p">:</span> <span class="kt">u24</span><span class="p">,</span> <span class="n">count</span><span class="p">:</span> <span class="kt">u8</span> <span class="p">},</span>
    <span class="n">node</span><span class="p">:</span> <span class="kt">u32</span><span class="p">,</span>
<span class="p">};</span>
</code></pre></div></div>

<p>This definition replaces a <em>copy</em> of each token with an index <em>into</em> <code class="language-plaintext highlighter-rouge">[]const Token</code>
(weighing in at a mere <code class="language-plaintext highlighter-rouge">3</code> bytes instead of <code class="language-plaintext highlighter-rouge">24</code>).  It also breaks up <code class="language-plaintext highlighter-rouge">[]const *const Node</code> into
a 1-byte <code class="language-plaintext highlighter-rouge">count</code> squeezed in next to the token index, followed by <em>indices</em> into <code class="language-plaintext highlighter-rouge">[]const Node</code>.  The memory
footprint of the old <code class="language-plaintext highlighter-rouge">Node</code> was <code class="language-plaintext highlighter-rouge">40 + 8n</code> bytes where <code class="language-plaintext highlighter-rouge">n = node.args.len</code>:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">.token.tag</code>: <code class="language-plaintext highlighter-rouge">8</code> bytes</li>
  <li><code class="language-plaintext highlighter-rouge">.token.src.ptr</code>: <code class="language-plaintext highlighter-rouge">8</code> bytes</li>
  <li><code class="language-plaintext highlighter-rouge">.token.src.len</code>: <code class="language-plaintext highlighter-rouge">8</code> bytes</li>
  <li><code class="language-plaintext highlighter-rouge">.args.ptr</code>: <code class="language-plaintext highlighter-rouge">8</code> bytes</li>
  <li><code class="language-plaintext highlighter-rouge">.args.len</code>: <code class="language-plaintext highlighter-rouge">8</code> bytes</li>
  <li>each additional <code class="language-plaintext highlighter-rouge">arg: *const Node</code>: <code class="language-plaintext highlighter-rouge">8</code> bytes</li>
</ul>

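<p>Each slot of the new <code class="language-plaintext highlighter-rouge">Node</code>, by contrast, only occupies <code class="language-plaintext highlighter-rouge">4</code> bytes, so a node with <code class="language-plaintext highlighter-rouge">n</code> arguments takes <code class="language-plaintext highlighter-rouge">4 + 4n</code> bytes.
That’s a 10-fold size reduction for a node with zero arguments (<code class="language-plaintext highlighter-rouge">40</code> bytes down to <code class="language-plaintext highlighter-rouge">4</code>) and 4⅔-fold for a binary node (<code class="language-plaintext highlighter-rouge">56</code> down to <code class="language-plaintext highlighter-rouge">12</code>)!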
Of course, the main attraction is that <code class="language-plaintext highlighter-rouge">nodes</code> can be freed using either <code class="language-plaintext highlighter-rouge">nodes.deinit()</code>
if you’re still holding a reference to the <code class="language-plaintext highlighter-rouge">std.ArrayList</code> or <code class="language-plaintext highlighter-rouge">allocator.free(nodes)</code> if you
have invoked <code class="language-plaintext highlighter-rouge">.toOwnedSlice()</code> instead.</p>
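<p>To make the layout concrete, here is a small python sketch of the flattening step.  It is an illustration under assumptions, not a port of my parser: I assume the token index sits in the low 24 bits of the head word with the count in the high 8 bits, that children are emitted before their parent, and that <code class="language-plaintext highlighter-rouge">array("I")</code> uses 4-byte items (true on common platforms).  <code class="language-plaintext highlighter-rouge">Tree</code>, <code class="language-plaintext highlighter-rouge">flatten</code> and <code class="language-plaintext highlighter-rouge">read</code> are hypothetical names.</p>

```python
from array import array
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Pointer-style node: a token index plus references to child nodes."""

    token: int
    args: list["Tree"] = field(default_factory=list)


def pack_head(token: int, count: int) -> int:
    """Pack a 24-bit token index and an 8-bit child count into one word."""
    assert 0 <= token < 1 << 24 and 0 <= count < 1 << 8
    return token | (count << 24)


def unpack_head(head: int) -> tuple[int, int]:
    """Inverse of pack_head: return (token index, child count)."""
    return head & 0xFFFFFF, head >> 24


def flatten(tree: Tree, nodes: array) -> int:
    """Append `tree` to `nodes` and return the index of its head slot."""
    # Children go first so their indices are known when the parent is written.
    child_indices = [flatten(arg, nodes) for arg in tree.args]
    index = len(nodes)
    nodes.append(pack_head(tree.token, len(child_indices)))
    nodes.extend(child_indices)  # one slot per child index
    return index


def read(nodes: array, index: int) -> tuple[int, list[int]]:
    """Return the token index and child head indices stored at `index`."""
    token, count = unpack_head(nodes[index])
    return token, list(nodes[index + 1 : index + 1 + count])
```

<p>Flattening <code class="language-plaintext highlighter-rouge">Tree(0, [Tree(1), Tree(2)])</code> into an empty <code class="language-plaintext highlighter-rouge">array("I")</code> produces five slots: two leaf heads, the parent head, and the two child indices.  Dropping the array releases the whole tree at once.</p>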

<h2 id="hybrid-bump-allocator">Hybrid bump allocator</h2>

<p>The second use case for this idea goes back to
<a href="/2024/08/31/why-not-zig.html">my first Zig project</a>:
I initially <a href="https://discord.com/channels/605571803288698900/1254033012299927573">failed to grok ArrayList</a>.
This first project was translating a <a href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a>
implementation of <a href="https://en.wikipedia.org/wiki/Forth_(programming_language)">FORTH</a> (which in turn
was translating an <a href="https://en.wikipedia.org/wiki/X86">x86</a> implementation).  As a consequence,
my focus was more on the how than the why.  Re-reading my discord question, I realize now that I lacked
the vocabulary to precisely articulate what I was trying to achieve.  The original x86 was using
<a href="https://man7.org/linux/man-pages/man2/brk.2.html">brk(2)</a> and raw pointers to implement a
<a href="https://en.wikipedia.org/wiki/Region-based_memory_management">bump allocator</a> which supports
individual bytes (for <code class="language-plaintext highlighter-rouge">C!</code> and <code class="language-plaintext highlighter-rouge">C@</code>), words (for <code class="language-plaintext highlighter-rouge">!</code> and <code class="language-plaintext highlighter-rouge">@</code>), and code addresses (for <code class="language-plaintext highlighter-rouge">,</code>).
I didn’t make the effort to understand <code class="language-plaintext highlighter-rouge">std.ArrayList</code> to see which parts lent themselves to my
purpose.  The trick turned out to be</p>

<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="n">InterpAligned</span><span class="p">(</span><span class="k">comptime</span> <span class="n">alignment</span><span class="p">:</span> <span class="kt">u29</span><span class="p">)</span> <span class="k">type</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">struct</span> <span class="p">{</span>
        <span class="n">state</span><span class="p">:</span> <span class="kt">isize</span><span class="p">,</span>
        <span class="n">latest</span><span class="p">:</span> <span class="o">*</span><span class="n">Word</span><span class="p">,</span>
        <span class="n">s0</span><span class="p">:</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="kt">isize</span><span class="p">,</span>
        <span class="n">base</span><span class="p">:</span> <span class="kt">isize</span><span class="p">,</span>
        <span class="n">r0</span><span class="p">:</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">,</span>
        <span class="n">buffer</span><span class="p">:</span> <span class="p">[</span><span class="mi">32</span><span class="p">]</span><span class="kt">u8</span><span class="p">,</span>
        <span class="n">memory</span><span class="p">:</span> <span class="n">std</span><span class="p">.</span><span class="nf">ArrayListAligned</span><span class="p">(</span><span class="kt">u8</span><span class="p">,</span> <span class="n">alignment</span><span class="p">),</span>  <span class="c">// ⇐ here</span>
        <span class="n">here</span><span class="p">:</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="kt">u8</span><span class="p">,</span>

        <span class="o">...</span>
    <span class="p">};</span>
<span class="p">}</span>

<span class="k">const</span> <span class="n">Interp</span> <span class="o">=</span> <span class="n">InterpAligned</span><span class="p">(</span><span class="nb">@alignOf</span><span class="p">(</span><span class="n">Instr</span><span class="p">));</span>
</code></pre></div></div>

<p>(Had to make it a <a href="https://ziglang.org/documentation/master/#Generic-Data-Structures">generic</a>
to avoid the dreaded <code class="language-plaintext highlighter-rouge">struct Instr depends on itself</code>).  <code class="language-plaintext highlighter-rouge">Interp.memory.items</code> is a slice of
bytes for maximum flexibility but it’s <code class="language-plaintext highlighter-rouge">Instr</code>-<em>aligned</em> which obviates the manual
alignment math in <code class="language-plaintext highlighter-rouge">CREATE</code>.</p>
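<p>For readers who think in python, the same idea can be sketched with a <code class="language-plaintext highlighter-rouge">bytearray</code> standing in for the byte-oriented <code class="language-plaintext highlighter-rouge">ArrayList</code> and offsets standing in for pointers.  This is only an analogy: the <code class="language-plaintext highlighter-rouge">Memory</code> class and its method names are made up, cells are assumed to be 8 bytes, and a <code class="language-plaintext highlighter-rouge">bytearray</code> offers no alignment guarantee, so the sketch pads offsets by hand.</p>

```python
import struct

CELL = struct.calcsize("<q")  # one signed 64-bit cell, 8 bytes


class Memory:
    """Bump allocator over one growable byte buffer; addresses are offsets."""

    def __init__(self) -> None:
        self.data = bytearray()

    @property
    def here(self) -> int:
        """Next free offset, i.e. FORTH's HERE."""
        return len(self.data)

    def c_comma(self, byte: int) -> int:
        """Append a single byte and return its offset."""
        offset = self.here
        self.data.append(byte & 0xFF)
        return offset

    def align(self) -> None:
        """Pad with zero bytes until HERE is a multiple of the cell size."""
        self.data.extend(bytes(-self.here % CELL))

    def comma(self, cell: int) -> int:
        """Append one cell (FORTH's `,`) and return its offset."""
        offset = self.here
        self.data.extend(struct.pack("<q", cell))
        return offset

    def fetch(self, offset: int) -> int:
        """Read a cell (FORTH's `@`)."""
        return struct.unpack_from("<q", self.data, offset)[0]

    def store(self, offset: int, cell: int) -> None:
        """Overwrite a cell (FORTH's `!`)."""
        struct.pack_into("<q", self.data, offset, cell)

    def c_fetch(self, offset: int) -> int:
        """Read a byte (FORTH's `C@`)."""
        return self.data[offset]

    def c_store(self, offset: int, byte: int) -> None:
        """Overwrite a byte (FORTH's `C!`)."""
        self.data[offset] = byte & 0xFF
```

<p>Every allocation lands right after the previous one, which is the contiguity the interpreter relies on; bytes and cells simply share the same buffer.</p>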

<h2 id="artificial-constraints">Artificial constraints</h2>

<p><a href="https://mitchellh.com/">Mitchell Hashimoto</a> said something on
<a href="https://changelog.com/podcast/622">Changelog</a> which resonated with me: embrace constraints
when programming.  Programmers often have a tendency to go for the most generic
solution.  This leads to unnecessary complexity and 
<a href="https://en.wikipedia.org/wiki/Anti-pattern">anti-patterns</a> like
<a href="https://en.wikipedia.org/wiki/Hard_coding#Softcoding">softcoding</a> or
<a href="https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule">Greenspun’s tenth rule</a>.
In this case, the constraint was making sure that consecutive memory allocations
remain contiguous to facilitate FORTH’s poor man’s 
<a href="https://en.wikipedia.org/wiki/Reflective_programming">reflection</a>.
After a few detours, the constraint took me to a reasonably useful idea.
Meanwhile, my code remained legible and efficient in the process.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I like Zig because it encourages me to pay attention to details when I use it. I have recently come up with an interesting pattern in two separate use cases. I think it warrants a quick blog post. The pattern is a curious mix of abstraction levels. On the one hand, it’s high-level because it prefers Zig’s std.ArrayList over the lower level std.mem.Allocator. On the other, it’s low-level because it works best when you take memory alignment into account. The TL;DR version is that I use std.ArrayList as a special-purpose std.heap.ArenaAllocator. This opens a memory saving opportunity by converting pointers to indices.]]></summary></entry><entry><title type="html">What is satisfiability?</title><link href="https://bur.gy/2024/12/19/what-is-satisfiability.html" rel="alternate" type="text/html" title="What is satisfiability?" /><published>2024-12-19T17:32:12+00:00</published><updated>2024-12-19T17:32:12+00:00</updated><id>https://bur.gy/2024/12/19/what-is-satisfiability</id><content type="html" xml:base="https://bur.gy/2024/12/19/what-is-satisfiability.html"><![CDATA[<h2 id="or-how-much-is-ever-enough">Or How Much is ever Enough?</h2>

<p>This year, I am having fun taking part in my first ever <a href="https://adventofcode.com/">Advent of Code</a>!
Day 17 threw me for a loop and forced me to learn a new technique so that deserves a quick write-up.
As with other days, it’s a two-part challenge which starts easy.  The first challenge involves a
bytecode interpreter and you know how much I like those!  My input is equivalent to the following
python code:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">collections.abc</span> <span class="kn">import</span> <span class="n">Iterable</span>

<span class="k">def</span> <span class="nf">device</span><span class="p">(</span><span class="n">a</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">int</span><span class="p">]:</span>
    <span class="k">while</span> <span class="n">a</span><span class="p">:</span>
        <span class="n">b</span> <span class="o">=</span> <span class="n">a</span> <span class="o">&amp;</span> <span class="mi">7</span>    <span class="c1"># bst A
</span>        <span class="n">b</span> <span class="o">^=</span> <span class="mi">6</span>       <span class="c1"># bxl B
</span>        <span class="n">c</span> <span class="o">=</span> <span class="n">a</span> <span class="o">&gt;&gt;</span> <span class="n">b</span>   <span class="c1"># cdv B
</span>        <span class="n">b</span> <span class="o">^=</span> <span class="n">c</span>       <span class="c1"># bxc
</span>        <span class="n">b</span> <span class="o">^=</span> <span class="mi">4</span>       <span class="c1"># bxl 4
</span>        <span class="k">yield</span> <span class="n">b</span> <span class="o">&amp;</span> <span class="mi">7</span>  <span class="c1"># out B
</span>        <span class="n">a</span> <span class="o">&gt;&gt;=</span> <span class="mi">3</span>      <span class="c1"># adv 3
</span>        <span class="k">continue</span>     <span class="c1"># jnz 0
</span>
<span class="k">print</span><span class="p">(</span><span class="o">*</span><span class="n">device</span><span class="p">(</span><span class="mi">66171486</span><span class="p">),</span> <span class="n">sep</span><span class="o">=</span><span class="s">","</span><span class="p">)</span>
</code></pre></div></div>

<p>So far so good.  Part 2 gets hairy and involves searching for the smallest argument <code class="language-plaintext highlighter-rouge">a</code> that
turns the device into a <a href="https://en.wikipedia.org/wiki/Quine_(computing)">quine</a>.  Simplifying
the program as shown below highlights the fact that <code class="language-plaintext highlighter-rouge">device</code> yields one value per octal digit
of <code class="language-plaintext highlighter-rouge">a</code>.  So we’re looking for an <code class="language-plaintext highlighter-rouge">a</code> with 16 octal digits.  There are \(2^{48}\) of those!
This reminds us of the <a href="https://en.wikipedia.org/wiki/Wheat_and_chessboard_problem">wheat and chessboard problem</a>
and we quickly realize that a 
<a href="https://en.wikipedia.org/wiki/Brute-force_search">brute-force search</a> will run into
<a href="https://en.wikipedia.org/wiki/Heat_death_of_the_universe">heat death of the universe</a> issues
(although not quite as quickly as I would like to admit).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">device</span><span class="p">(</span><span class="n">b</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Iterable</span><span class="p">[</span><span class="nb">int</span><span class="p">]:</span>
    <span class="k">while</span> <span class="n">a</span> <span class="p">:</span><span class="o">=</span> <span class="n">b</span><span class="p">:</span>
        <span class="n">b</span><span class="p">,</span> <span class="n">c</span> <span class="o">=</span> <span class="nb">divmod</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="mi">8</span><span class="p">)</span>
        <span class="k">yield</span> <span class="p">(</span><span class="n">a</span> <span class="o">&gt;&gt;</span> <span class="p">(</span><span class="n">c</span> <span class="o">^</span> <span class="mi">6</span><span class="p">)</span> <span class="o">^</span> <span class="n">c</span> <span class="o">^</span> <span class="mi">2</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">7</span>
</code></pre></div></div>
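
<p>To convince ourselves of the one-output-per-octal-digit claim, here is a quick check.  It is only a sketch: <code class="language-plaintext highlighter-rouge">octal_digits</code> is a helper name introduced for illustration and the sample inputs are arbitrary.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from typing import Iterator


def device(b: int) -&gt; Iterator[int]:
    # Same simplified loop as above: each pass consumes one octal digit.
    while a := b:
        b, c = divmod(a, 8)
        yield (a &gt;&gt; (c ^ 6) ^ c ^ 2) &amp; 7


def octal_digits(a: int) -&gt; int:
    # Hypothetical helper: number of digits of a when written in base 8.
    return len(format(a, "o"))


for a in (1, 7, 8, 66171486, 8**15, 8**16 - 1):
    assert len(list(device(a))) == octal_digits(a)

# Each octal digit carries 3 bits, hence the 2**48 candidates.
assert 8**16 == 2**48
</code></pre></div></div>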

<p>I could vaguely remember reading something about <a href="https://en.wikipedia.org/wiki/SAT_solver">SAT solvers</a>
and <a href="https://graphics.stanford.edu/~seander/bithacks.html">Bit Twiddling Hacks</a>.  The combined search
took me to <a href="https://ericpony.github.io/z3py-tutorial/guide-examples.htm">this page</a> and the following
Z3-based solution:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">z3</span> <span class="kn">import</span> <span class="n">BitVec</span><span class="p">,</span> <span class="n">solve</span>

<span class="n">a</span> <span class="o">=</span> <span class="n">BitVec</span><span class="p">(</span><span class="s">"A"</span><span class="p">,</span> <span class="mi">51</span><span class="p">)</span>
<span class="n">constraints</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">out</span> <span class="ow">in</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">]:</span>
    <span class="n">b</span> <span class="o">=</span> <span class="n">a</span> <span class="o">&amp;</span> <span class="mi">7</span>
    <span class="n">b</span> <span class="o">^=</span> <span class="mi">6</span>
    <span class="n">c</span> <span class="o">=</span> <span class="n">a</span> <span class="o">&gt;&gt;</span> <span class="n">b</span>
    <span class="n">b</span> <span class="o">^=</span> <span class="n">c</span>
    <span class="n">b</span> <span class="o">^=</span> <span class="mi">4</span>
    <span class="n">constraints</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">b</span> <span class="o">&amp;</span> <span class="mi">7</span> <span class="o">==</span> <span class="n">out</span><span class="p">)</span>
    <span class="n">a</span> <span class="o">&gt;&gt;=</span> <span class="mi">3</span>
<span class="n">constraints</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">a</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">solve</span><span class="p">(</span><span class="o">*</span><span class="n">constraints</span><span class="p">)</span>
</code></pre></div></div>

<p>Z3 solves this problem so fast that its runtime is barely noticeable. That’s impressive on
several levels:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">pip install z3-solver</code> just worked™</li>
  <li>the API is n00b-friendly</li>
  <li>solution is accepted by AoC</li>
</ul>

<p>In my defense, I had worked with
<a href="https://en.wikipedia.org/wiki/Algebraic_modeling_language">algebraic modeling languages</a>
before but <a href="https://en.wikipedia.org/wiki/Z3_Theorem_Prover">Z3</a> remains an impressive bit of
tech.</p>

<p>Still, I couldn’t help but feel that I <a href="https://xkcd.com/1890/">brought a gun to a knife fight</a>.
(Don’t get me wrong, I absolutely took the AoC credit but I didn’t love the 
<a href="https://en.wikipedia.org/wiki/Black_box">black box</a> solution).  So I stared at the short
implementation for a while longer and found an alternative solution:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">a</span> <span class="o">=</span> <span class="p">{</span><span class="mi">0</span><span class="p">}</span>
<span class="k">for</span> <span class="n">out</span> <span class="ow">in</span> <span class="nb">reversed</span><span class="p">([</span><span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">]):</span>
    <span class="n">a</span> <span class="o">=</span> <span class="p">{</span>
        <span class="n">d</span> <span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="n">a</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">)</span>
        <span class="k">if</span> <span class="p">((</span><span class="n">d</span> <span class="p">:</span><span class="o">=</span> <span class="p">(</span><span class="n">b</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span><span class="p">)</span> <span class="o">|</span> <span class="n">c</span><span class="p">)</span> <span class="o">&gt;&gt;</span> <span class="p">(</span><span class="n">c</span> <span class="o">^</span> <span class="mi">6</span><span class="p">)</span> <span class="o">^</span> <span class="n">c</span> <span class="o">^</span> <span class="mi">2</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">7</span> <span class="o">==</span> <span class="n">out</span>
    <span class="p">}</span>
<span class="nb">min</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
</code></pre></div></div>
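
<p>As a sanity check (a sketch, not part of my submitted solution), we can feed every surviving candidate back into the simplified device and confirm that it reproduces the program.  The names <code class="language-plaintext highlighter-rouge">PROGRAM</code> and <code class="language-plaintext highlighter-rouge">search</code> are introduced here purely for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from typing import Iterator

PROGRAM = [2, 4, 1, 6, 7, 5, 4, 6, 1, 4, 5, 5, 0, 3, 3, 0]


def device(b: int) -&gt; Iterator[int]:
    # Simplified device: one output per octal digit of the input.
    while a := b:
        b, c = divmod(a, 8)
        yield (a &gt;&gt; (c ^ 6) ^ c ^ 2) &amp; 7


def search(program: list[int]) -&gt; set[int]:
    # Rebuild the input one octal digit at a time, most significant digit
    # first, keeping only the prefixes that emit the expected output.
    candidates = {0}
    for out in reversed(program):
        candidates = {
            d
            for b in candidates
            for c in range(8)
            if ((d := (b &lt;&lt; 3) | c) &gt;&gt; (c ^ 6) ^ c ^ 2) &amp; 7 == out
        }
    return candidates


candidates = search(PROGRAM)
# Every survivor must turn the device into a quine.
assert all(list(device(a)) == PROGRAM for a in candidates)
</code></pre></div></div>

<p>The search never holds more than 8 times the previous step’s survivors, which is a far cry from enumerating all 16-digit inputs.</p>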

<p>Ultimately, this amounts to <a href="https://en.wikipedia.org/wiki/Backward_induction">backward induction</a>
which is likely one of the many strategies that <code class="language-plaintext highlighter-rouge">Z3</code> implements.  Unlike the brute-force approach,
this implementation <a href="https://en.wikipedia.org/wiki/Prune_and_search">prunes</a> aggressively which
keeps its complexity manageable.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Or How Much is ever Enough?]]></summary></entry><entry><title type="html">Why not try Zig next?</title><link href="https://bur.gy/2024/08/31/why-not-zig.html" rel="alternate" type="text/html" title="Why not try Zig next?" /><published>2024-08-31T12:41:26+00:00</published><updated>2024-08-31T12:41:26+00:00</updated><id>https://bur.gy/2024/08/31/why-not-zig</id><content type="html" xml:base="https://bur.gy/2024/08/31/why-not-zig.html"><![CDATA[<p>As this blog shows, I am deeply interested in interpreters.  I find the meta
nature of programs that run programs fascinating.  That said, I am also too
lazy to care for the fiddly bookkeeping needed to write a robust parser.  That
makes <a href="https://en.wikipedia.org/wiki/Forth_(programming_language)">FORTH</a>’s
complete lack of syntax particularly appealing (look ma, no parser)!</p>

<p>I first experimented with FORTH as mere shorthand to 
<a href="/2022/05/28/what-is-forth.html">generate python bytecode</a>,
then translated <a href="/2023/02/24/what-forth-again.html">jonesforth</a> 
from x86 to GNU C with <a href="https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html">labels as values</a>,
and finally to C with <a href="/2024/03/29/tail-recursion.html">tail recursion</a>.
You would think that would be enough, wouldn’t you?</p>

<p>Yet somehow that itch still needed scratching.  I also thought it was time to
learn a new programming language that would accommodate my most recent
implementation.  I obviously thought of <a href="https://rust-lang.org">Rust</a>
first because of all the buzz but <a href="https://github.com/rust-lang/rfcs/pull/81">#81</a>
put the kibosh on that idea.  <a href="https://ziglang.org/">Zig</a> seemed a natural choice
as well, particularly because of the “better C than C” tagline.  And <code class="language-plaintext highlighter-rouge">Zig</code> supports
an <a href="https://ziglang.org/documentation/master/#call"><code class="language-plaintext highlighter-rouge">.always_tail</code></a> call modifier!</p>

<p>Alright then, porting a C program that uses <code class="language-plaintext highlighter-rouge">musttail</code> to a “better C than C” that
supports <code class="language-plaintext highlighter-rouge">.always_tail</code> should be
<a href="https://en.wiktionary.org/wiki/like_shooting_fish_in_a_barrel">like shooting fish in a barrel</a>.
But things rarely go as you wish they would.  Which brings us to our first challenge:</p>

<h2 id="macros">Macros</h2>

<p>Besides the excellent literate comments, <a href="https://rwmj.wordpress.com/">Richard WM Jones</a>
made liberal use of <a href="https://en.wikipedia.org/wiki/GNU_Assembler">gas</a> <code class="language-plaintext highlighter-rouge">.macro</code> to
keep his x86 code readable.  Let that sink in: readable x86 is no small feat!  That
influenced me to rely heavily on the <a href="https://en.wikipedia.org/wiki/C_preprocessor">C preprocessor</a>
in my C translations.  Case in point: in <code class="language-plaintext highlighter-rouge">jonesforth.s</code>, the <code class="language-plaintext highlighter-rouge">+</code> primitive is
defined by</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nf">defcode</span> <span class="s">"+"</span><span class="p">,</span><span class="mi">1</span><span class="p">,,</span><span class="nv">ADD</span>
    <span class="nf">pop</span> <span class="o">%</span><span class="nb">eax</span>            <span class="err">#</span> <span class="nv">get</span> <span class="nv">top</span> <span class="nv">of</span> <span class="nv">stack</span>
    <span class="nf">addl</span> <span class="o">%</span><span class="nb">eax</span><span class="p">,(</span><span class="o">%</span><span class="nb">esp</span><span class="p">)</span>	<span class="err">#</span> <span class="nv">and</span> <span class="nv">add</span> <span class="nv">it</span> <span class="nv">to</span> <span class="nv">next</span> <span class="kt">word</span> <span class="nv">on</span> <span class="nv">stack</span>
    <span class="nf">NEXT</span>
</code></pre></div></div>

<p>which expands (transitively) to</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nf">.section</span> <span class="nv">.rodata</span>
    <span class="nf">.align</span> <span class="mi">4</span>
    <span class="nf">.globl</span> <span class="nv">name_ADD</span>
<span class="nf">name_ADD</span> <span class="p">:</span>
    <span class="nf">.int</span> <span class="nv">link</span>           <span class="err">#</span> <span class="nv">link</span> <span class="nv">to</span> <span class="nv">previous</span> <span class="kt">word</span> <span class="p">(</span><span class="nv">name_DECR4</span><span class="p">)</span>
    <span class="nf">.set</span> <span class="nv">link</span><span class="p">,</span><span class="nv">name_ADD</span>
    <span class="nf">.byte</span> <span class="mi">1</span>             <span class="err">#</span> <span class="nv">flags</span> <span class="o">+</span> <span class="nv">length</span> <span class="kt">byte</span>
    <span class="nf">.ascii</span> <span class="s">"+"</span>		<span class="err">#</span> <span class="nv">the</span> <span class="nv">name</span>
    <span class="nf">.align</span> <span class="mi">4</span>		<span class="err">#</span> <span class="nv">padding</span> <span class="nv">to</span> <span class="nv">next</span> <span class="mi">4</span> <span class="kt">byte</span> <span class="nv">boundary</span>
    <span class="nf">.globl</span> <span class="nv">ADD</span>
<span class="nf">ADD</span> <span class="p">:</span>
    <span class="nf">.int</span> <span class="nv">code_ADD</span>	<span class="err">#</span> <span class="nv">codeword</span>
    <span class="nf">.text</span>
    <span class="err">//</span><span class="nf">.align</span> <span class="mi">4</span>
    <span class="nf">.globl</span> <span class="nv">code_ADD</span>
<span class="nf">code_ADD</span> <span class="p">:</span>
    <span class="nf">pop</span> <span class="o">%</span><span class="nb">eax</span>		<span class="err">#</span> <span class="nv">get</span> <span class="nv">top</span> <span class="nv">of</span> <span class="nv">stack</span>
    <span class="nf">addl</span> <span class="o">%</span><span class="nb">eax</span><span class="p">,(</span><span class="o">%</span><span class="nb">esp</span><span class="p">)</span>    <span class="err">#</span> <span class="nv">and</span> <span class="nv">add</span> <span class="nv">it</span> <span class="nv">to</span> <span class="nv">next</span> <span class="kt">word</span> <span class="nv">on</span> <span class="nv">stack</span>
    <span class="nf">lodsl</span>               <span class="err">#</span> <span class="nv">NEXT</span>
    <span class="nf">jmp</span> <span class="o">*</span><span class="p">(</span><span class="o">%</span><span class="nb">eax</span><span class="p">)</span>
</code></pre></div></div>

<p>Woof.  Which version do you prefer reading?  I, for one, am glad not to
be constantly reminded of all these alignment directives.  For comparison,
my second C implementation’s <code class="language-plaintext highlighter-rouge">+</code> primitive looks like this</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">DEFCODE</span><span class="p">(</span><span class="n">DECRP</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s">"+"</span><span class="p">,</span> <span class="n">ADD</span><span class="p">)</span> 
<span class="p">{</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
    <span class="o">++</span><span class="n">sp</span><span class="p">;</span>
    <span class="n">NEXT</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Neat and tidy, right?  Unfortunately, this also expands into the gorier</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">intptr_t</span> <span class="o">*</span><span class="nf">ADD</span><span class="p">(</span>  <span class="cm">/* forward declaration */</span>
    <span class="k">struct</span> <span class="n">interp_t</span> <span class="o">*</span><span class="p">,</span>
    <span class="kt">intptr_t</span> <span class="o">*</span><span class="p">,</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">**</span><span class="p">,</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="p">,</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span>
<span class="p">);</span>
<span class="k">static</span> <span class="k">struct</span> <span class="n">word_t</span> <span class="n">name_ADD</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">used</span><span class="p">))</span> <span class="o">=</span> <span class="p">{</span>
    <span class="p">.</span><span class="n">link</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">name_DECRP</span><span class="p">,</span>
    <span class="p">.</span><span class="n">flag</span> <span class="o">=</span> <span class="mi">0</span> <span class="o">|</span> <span class="p">((</span><span class="k">sizeof</span> <span class="s">"+"</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">),</span>
    <span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">"+"</span><span class="p">,</span>
    <span class="p">.</span><span class="n">code</span> <span class="o">=</span> <span class="p">{.</span><span class="n">code</span> <span class="o">=</span> <span class="n">ADD</span><span class="p">}</span>
<span class="p">};</span>
<span class="kt">intptr_t</span> <span class="o">*</span><span class="nf">ADD</span><span class="p">(</span>
    <span class="k">struct</span> <span class="n">interp_t</span> <span class="o">*</span><span class="n">env</span><span class="p">,</span>
    <span class="kt">intptr_t</span> <span class="o">*</span><span class="n">sp</span><span class="p">,</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">**</span><span class="n">rsp</span><span class="p">,</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="n">ip</span><span class="p">,</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="n">target</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">unused</span><span class="p">))</span>
<span class="p">)</span>
<span class="p">{</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
    <span class="o">++</span><span class="n">sp</span><span class="p">;</span>
    <span class="n">__attribute__</span><span class="p">((</span><span class="n">musttail</span><span class="p">))</span> <span class="k">return</span> <span class="n">ip</span><span class="o">-&gt;</span><span class="n">word</span><span class="o">-&gt;</span><span class="n">code</span><span class="p">(</span><span class="n">env</span><span class="p">,</span> <span class="n">sp</span><span class="p">,</span> <span class="n">rsp</span><span class="p">,</span> <span class="n">ip</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">ip</span><span class="o">-&gt;</span><span class="n">word</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Macros are great for hiding <a href="https://en.wikipedia.org/wiki/Boilerplate_code">boilerplate</a>.</p>

<p>Yet Zig proudly <em>eschews</em> both preprocessor and macros!  It’s right there on the
<a href="https://ziglang.org/">front page</a>:</p>

<ul>
  <li>No hidden control flow.</li>
  <li>No hidden memory allocations.</li>
  <li>No preprocessor, no macros.</li>
</ul>

<p>Oh no, what are we going to do!?  Fear not, friends, for Zig offers
<a href="https://ziglang.org/documentation/master/#comptime"><code class="language-plaintext highlighter-rouge">comptime</code></a>!
(It took me a while to wrap my head around <code class="language-plaintext highlighter-rouge">comptime</code> but when I did, 
I found it poetic to implement an interpreter that also compiles using
a compiler that also interprets).  My mental model for Zig’s <code class="language-plaintext highlighter-rouge">comptime</code>-based
<a href="https://en.wikipedia.org/wiki/Metaprogramming">metaprogramming</a> is that Zig
will eagerly resolve expressions that are “known at compile-time”.  This
reminds me very much of good ole C++
<a href="https://en.wikipedia.org/wiki/Template_metaprogramming">template metaprogramming</a>
except Zig doesn’t restrict you to an esoteric
<a href="https://en.wikipedia.org/wiki/Domain-specific_language">DSL</a>.  No, Zig gives
you access to the entire language syntax for compile-time expressions subject
to static constraints.</p>

<p>We need to discuss another Zig limitation before exploring how <code class="language-plaintext highlighter-rouge">comptime</code>
saved us from the preprocessor quagmire.  If Zig supports 
<a href="https://www.reddit.com/r/Zig/comments/sr7p3f/does_zig_support_flexible_array_members/">flexible array members</a>,
then I wasn’t smart enough to make them work.  But Zig does have a very
<a href="https://discord.com/channels/605571803288698900/1274691126833578087">friendly and helpful community</a>
who gave me pointers (pun intended) to help me <em>reframe</em> my approach:
instead of implementing FORTH words as a <code class="language-plaintext highlighter-rouge">struct</code> with an array of instructions
bolted to the end, let’s use an array of instructions whose first few (5 to be
precise) entries are hijacked to represent a <code class="language-plaintext highlighter-rouge">struct</code>.  You might recall that
FORTH implementations have an <a href="https://en.wikipedia.org/wiki/Association_list">association list</a>
at their core.  <code class="language-plaintext highlighter-rouge">name_ADD</code> in the snippets above represents one node of
that association list, keyed by <code class="language-plaintext highlighter-rouge">"+"</code> and linked to <code class="language-plaintext highlighter-rouge">name_DECR4</code>.</p>
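
<p>For readers who have not met the FORTH dictionary before, here is a toy Python model of that association list.  It is an illustration only: the <code class="language-plaintext highlighter-rouge">Word</code> dataclass and <code class="language-plaintext highlighter-rouge">find</code> helper are hypothetical names, real nodes also carry flags and code, and the chain is cut short after <code class="language-plaintext highlighter-rouge">4-</code> (the name jonesforth gives <code class="language-plaintext highlighter-rouge">DECR4</code>).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    # Toy dictionary node: a link to the previously defined word and a name.
    link: Optional["Word"]
    name: str


def find(latest: Optional[Word], name: str) -&gt; Optional[Word]:
    # Walk the chain from the most recent definition back to the oldest.
    node = latest
    while node is not None:
        if node.name == name:
            return node
        node = node.link
    return None


name_DECR4 = Word(link=None, name="4-")  # simplification: chain ends here
name_ADD = Word(link=name_DECR4, name="+")

assert find(name_ADD, "+") is name_ADD
assert find(name_ADD, "4-") is name_DECR4
assert find(name_ADD, "DUP") is None
</code></pre></div></div>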

<p>In Zig, preprocessor macros are replaced by a handful of comptime functions
which all ultimately call</p>
<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="n">Word</span> <span class="o">=</span> <span class="k">extern</span> <span class="k">struct</span> <span class="p">{</span>
    <span class="n">link</span><span class="p">:</span> <span class="o">?*</span><span class="k">const</span> <span class="n">Word</span><span class="p">,</span>
    <span class="n">flag</span><span class="p">:</span> <span class="kt">u8</span><span class="p">,</span>
    <span class="n">name</span><span class="p">:</span> <span class="p">[</span><span class="n">F_LENMASK</span><span class="p">]</span><span class="kt">u8</span> <span class="k">align</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span>
<span class="p">};</span>

<span class="k">const</span> <span class="n">Instr</span> <span class="o">=</span> <span class="k">packed</span> <span class="k">union</span> <span class="p">{</span>
    <span class="n">code</span><span class="p">:</span> <span class="o">*</span><span class="k">const</span> <span class="k">fn</span> <span class="p">(</span><span class="o">*</span><span class="n">Interp</span><span class="p">,</span> <span class="p">[]</span><span class="kt">isize</span><span class="p">,</span> <span class="p">[][</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">,</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">,</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">)</span> <span class="k">anyerror</span><span class="o">!</span><span class="k">void</span><span class="p">,</span>
    <span class="n">literal</span><span class="p">:</span> <span class="kt">isize</span><span class="p">,</span>
    <span class="n">word</span><span class="p">:</span> <span class="p">[</span><span class="o">*</span><span class="p">]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">,</span>
<span class="p">};</span>

<span class="k">const</span> <span class="n">offset</span> <span class="o">=</span> <span class="nb">@divTrunc</span><span class="p">(</span><span class="nb">@sizeOf</span><span class="p">(</span><span class="n">Word</span><span class="p">),</span> <span class="nb">@sizeOf</span><span class="p">(</span><span class="n">Instr</span><span class="p">));</span>

<span class="k">fn</span> <span class="n">defword</span><span class="p">(</span>
    <span class="k">comptime</span> <span class="n">last</span><span class="p">:</span> <span class="o">?</span><span class="p">[]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">,</span>
    <span class="k">comptime</span> <span class="n">flag</span><span class="p">:</span> <span class="n">Flag</span><span class="p">,</span>
    <span class="k">comptime</span> <span class="n">name</span><span class="p">:</span> <span class="p">[]</span><span class="k">const</span> <span class="kt">u8</span><span class="p">,</span>
    <span class="k">comptime</span> <span class="n">code</span><span class="p">:</span> <span class="p">[]</span><span class="k">const</span> <span class="n">Instr</span><span class="p">,</span>
<span class="p">)</span> <span class="p">[</span><span class="n">offset</span> <span class="o">+</span> <span class="n">code</span><span class="p">.</span><span class="py">len</span><span class="p">]</span><span class="n">Instr</span> <span class="p">{</span>
    <span class="k">var</span> <span class="n">instrs</span><span class="p">:</span> <span class="p">[</span><span class="n">offset</span> <span class="o">+</span> <span class="n">code</span><span class="p">.</span><span class="py">len</span><span class="p">]</span><span class="n">Instr</span> <span class="o">=</span> <span class="k">undefined</span><span class="p">;</span>
    <span class="k">const</span> <span class="n">p</span><span class="p">:</span> <span class="o">*</span><span class="n">Word</span> <span class="o">=</span> <span class="nb">@ptrCast</span><span class="p">(</span><span class="o">&amp;</span><span class="n">instrs</span><span class="p">[</span><span class="mi">0</span><span class="p">]);</span>
    <span class="n">p</span><span class="p">.</span><span class="py">link</span> <span class="o">=</span> <span class="k">if</span> <span class="p">(</span><span class="n">last</span><span class="p">)</span> <span class="p">|</span><span class="n">link</span><span class="p">|</span> <span class="nb">@ptrCast</span><span class="p">(</span><span class="n">link</span><span class="p">.</span><span class="py">ptr</span><span class="p">)</span> <span class="k">else</span> <span class="kc">null</span><span class="p">;</span>
    <span class="n">p</span><span class="p">.</span><span class="py">flag</span> <span class="o">=</span> <span class="n">name</span><span class="p">.</span><span class="py">len</span> <span class="p">|</span> <span class="n">@intFromEnum</span><span class="p">(</span><span class="n">flag</span><span class="p">);</span>
    <span class="nb">@memcpy</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="py">name</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="n">name</span><span class="p">.</span><span class="py">len</span><span class="p">],</span> <span class="n">name</span><span class="p">);</span>
    <span class="nb">@memset</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="py">name</span><span class="p">[</span><span class="n">name</span><span class="p">.</span><span class="py">len</span><span class="o">..</span><span class="n">F_LENMASK</span><span class="p">],</span> <span class="mi">0</span><span class="p">);</span>
    <span class="nb">@memcpy</span><span class="p">(</span><span class="n">instrs</span><span class="p">[</span><span class="n">offset</span><span class="o">..</span><span class="p">],</span> <span class="n">code</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">instrs</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><a href="https://ziglang.org/documentation/master/#ptrCast"><code class="language-plaintext highlighter-rouge">@ptrCast</code></a> lets us
treat the first <code class="language-plaintext highlighter-rouge">offset</code> entries, <code class="language-plaintext highlighter-rouge">instrs[0..offset]</code>, as a <code class="language-plaintext highlighter-rouge">Word</code>.  Zig lets you get away
with that because <a href="https://discord.com/channels/605571803288698900/1254033012299927573/1254039887703969794">it’s just bytes all the way down</a>.</p>
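
<p>The 5 hijacked entries mentioned earlier follow from the sizes involved.  Here is the back-of-the-envelope arithmetic, assuming a 64-bit target (8-byte pointers and <code class="language-plaintext highlighter-rouge">isize</code>) and <code class="language-plaintext highlighter-rouge">F_LENMASK = 0x1f</code> as in <code class="language-plaintext highlighter-rouge">jonesforth.s</code>; neither assumption is spelled out in the snippet above.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POINTER_SIZE = 8   # assumption: 64-bit target
F_LENMASK = 0x1F   # assumption: same value as in jonesforth.s

# Word is an extern struct: link pointer, one flag byte, then the name bytes.
word_size = POINTER_SIZE + 1 + F_LENMASK
# Every variant of the Instr union is pointer-sized.
instr_size = POINTER_SIZE

offset, remainder = divmod(word_size, instr_size)
assert (offset, remainder) == (5, 0)
</code></pre></div></div>

<p>Under these assumptions the header fills exactly five instruction slots with no padding to spare.</p>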

<p>You’re probably wondering why I’m willing to abuse Zig’s semantics to
guarantee this very specific memory layout.  I’m doing this to respect as many
details of the <a href="https://github.com/nornagon/jonesforth">jonesforth</a> implementation
as possible in order to make fiddly words like <code class="language-plaintext highlighter-rouge">CFA&gt;</code> and <code class="language-plaintext highlighter-rouge">ID.</code> work.  Those
meta words exploit the precise memory layout to go from a word’s first instruction
to the word itself as well as compute the number of instructions in a word.</p>

<p>We used one more <code class="language-plaintext highlighter-rouge">comptime</code> trick to keep our code
<a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY</a>: abusing <code class="language-plaintext highlighter-rouge">struct</code> to create
<a href="https://gencmurat.com/en/posts/zig-anonymus-functions-and-closures/">closures</a>.
This let us implement <code class="language-plaintext highlighter-rouge">+</code> as</p>
<div class="language-zig highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">inline</span> <span class="k">fn</span> <span class="mi">_</span><span class="n">add</span><span class="p">(</span><span class="n">sp</span><span class="p">:</span> <span class="p">[]</span><span class="kt">isize</span><span class="p">)</span> <span class="o">!</span><span class="p">[]</span><span class="kt">isize</span> <span class="p">{</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
    <span class="k">return</span> <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="p">];</span>
<span class="p">}</span>
<span class="k">const</span> <span class="n">add</span> <span class="o">=</span> <span class="n">defword</span><span class="p">(</span><span class="o">&amp;</span><span class="n">decrp</span><span class="p">,</span> <span class="n">Flag</span><span class="p">.</span><span class="py">ZERO</span><span class="p">,</span> <span class="s">"+"</span><span class="p">,</span> <span class="n">wrap</span><span class="p">(</span><span class="mi">_</span><span class="n">add</span><span class="p">));</span>
</code></pre></div></div>
<p>which is somewhat reminiscent of the C implementation above.  Note that <code class="language-plaintext highlighter-rouge">sp</code>
is a <a href="https://ziglang.org/documentation/master/#Slices">slice</a> which gives the
Zig compiler an opportunity to generate bounds checks.</p>

<h2 id="next-steps">Next steps</h2>
<p><a href="https://github.com/jburgy/blog/blob/main/forth/6th.zig">6th.zig</a> mostly works.
It executes the bootstrapping FORTH code that adds control structures, strings,
introspection, and a bunch of other goodies.  I don’t love how I skirted
memory allocation.  I would prefer switching to
<a href="https://github.com/ziglang/zig/blob/master/lib/std/heap/sbrk_allocator.zig">SbrkAllocator</a>
and supporting <code class="language-plaintext highlighter-rouge">MORECORE</code> seamlessly.  It would also be neat to generate a
<a href="https://webassembly.org/">WebAssembly</a> build to include on this page.  Finally,
this is my first ever Zig code.  The compiler and VSCode language server
already act as style guides but I would appreciate feedback from experienced
Ziguanas.</p>

<h2 id="bonus-material">Bonus material</h2>

<p>Richard WM Jones added some pretty fantastic <a href="https://en.wikipedia.org/wiki/ASCII_art">ASCII art</a>
to his literate x86 jonesforth.  Turns out <a href="https://ivanceras.github.io/svgbob-editor/">Svgbob</a>
does a really good job of converting them to 
<a href="https://en.wikipedia.org/wiki/SVG">SVG</a>.  Here’s the diagram Richard uses to explain
<a href="https://en.wikipedia.org/wiki/Threaded_code#Indirect_threading">indirect threading</a>:</p>

<svg xmlns="http://www.w3.org/2000/svg" width="736" height="416" class="svgbob"><style>.svgbob line, .svgbob path, .svgbob circle, .svgbob rect, .svgbob polygon {
    stroke: black;
    stroke-width: 2;
    stroke-opacity: 1;
    fill-opacity: 1;
    stroke-linecap: round;
    stroke-linejoin: miter;
}
.svgbob text {
    white-space: pre;
    fill: black;
    font-family: Iosevka Fixed, monospace;
    font-size: 14px;
}
.svgbob rect.backdrop {
    stroke: none;
    fill: white;
}
.svgbob .broken {
    stroke-dasharray: 8;
}
.svgbob .filled {
    fill: black;
}
.svgbob .bg_filled {
    fill: white;
    stroke-width: 1;
}
.svgbob .nofill {
    fill: white;
}
.svgbob .end_marked_arrow {
    marker-end: url(#arrow);
}
.svgbob .start_marked_arrow {
    marker-start: url(#arrow);
}
.svgbob .end_marked_diamond {
    marker-end: url(#diamond);
}
.svgbob .start_marked_diamond {
    marker-start: url(#diamond);
}
.svgbob .end_marked_circle {
    marker-end: url(#circle);
}
.svgbob .start_marked_circle {
    marker-start: url(#circle);
}
.svgbob .end_marked_open_circle {
    marker-end: url(#open_circle);
}
.svgbob .start_marked_open_circle {
    marker-start: url(#open_circle);
}
.svgbob .end_marked_big_open_circle {
    marker-end: url(#big_open_circle);
}
.svgbob .start_marked_big_open_circle {
    marker-start: url(#big_open_circle);
}<!--separator--></style><defs><marker id="arrow" viewBox="-2 -2 8 8" refX="4" refY="2" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><polygon points="0,0 0,4 4,2 0,0"></polygon></marker><marker id="diamond" viewBox="-2 -2 8 8" refX="4" refY="2" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><polygon points="0,2 2,0 4,2 2,4 0,2"></polygon></marker><marker id="circle" viewBox="0 0 8 8" refX="4" refY="4" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><circle cx="4" cy="4" r="2" class="filled"></circle></marker><marker id="open_circle" viewBox="0 0 8 8" refX="4" refY="4" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><circle cx="4" cy="4" r="2" class="bg_filled"></circle></marker><marker id="big_open_circle" viewBox="0 0 8 8" refX="4" refY="4" markerWidth="7" markerHeight="7" orient="auto-start-reverse"><circle cx="4" cy="4" r="3" class="bg_filled"></circle></marker></defs><rect class="backdrop" x="0" y="0" width="736" height="416"></rect><text x="2" y="12">:</text><text x="18" y="12">QUADRUPLE</text><text x="98" y="12">DOUBLE</text><text x="154" y="12">DOUBLE</text><text x="210" y="12">;</text><text x="18" y="60">codeword</text><text x="18" y="92">addr</text><text x="58" y="92">of</text><text x="82" y="92">DOUBLE</text><line x1="144" y1="88" x2="264" y2="88" class="solid"></line><polygon points="264,84 272,88 264,92" class="filled"></polygon><text x="18" y="124">addr</text><text x="58" y="124">of</text><text x="82" y="124">DOUBLE</text><text x="18" y="156">addr</text><text x="58" y="156">of</text><text x="82" y="156">EXIT</text><text x="282" y="60">:</text><text x="298" y="60">DOUBLE</text><text x="354" y="60">DUP</text><text x="386" y="60">+</text><text x="402" y="60">;</text><text x="298" y="108">codeword</text><text x="298" y="140">addr</text><text x="338" y="140">of</text><text x="362" y="140">DUP</text><line x1="408" y1="136" x2="520" y2="136" class="solid"></line><polygon points="520,132 528,136 520,140" 
class="filled"></polygon><text x="298" y="172">addr</text><text x="338" y="172">of</text><text x="362" y="172">+</text><text x="298" y="204">addr</text><text x="338" y="204">of</text><text x="362" y="204">EXIT</text><polygon points="520,276 528,280 520,284" class="filled"></polygon><text x="554" y="156">codeword</text><text x="554" y="188">assembly</text><text x="626" y="188">to</text><polygon points="680,180 672,184 680,188" class="filled"></polygon><text x="554" y="204">implement</text><text x="634" y="204">DUP</text><text x="578" y="220">..</text><text x="554" y="236">NEXT</text><text x="218" y="172">%esi</text><line x1="256" y1="168" x2="264" y2="168" class="solid"></line><polygon points="264,164 272,168 264,172" class="filled"></polygon><text x="554" y="300">codeword</text><text x="554" y="332">assembly</text><text x="626" y="332">to</text><polygon points="672,324 664,328 672,332" class="filled"></polygon><text x="554" y="348">implement</text><text x="634" y="348">+</text><text x="578" y="364">..</text><text x="554" y="380">NEXT</text><g><line x1="4" y1="40" x2="156" y2="40" class="solid"></line><line x1="4" y1="40" x2="4" y2="168" class="solid"></line><line x1="156" y1="40" x2="156" y2="72" class="solid"></line><line x1="4" y1="72" x2="156" y2="72" class="solid"></line><line x1="4" y1="104" x2="156" y2="104" class="solid"></line><line x1="156" y1="104" x2="156" y2="168" class="solid"></line><line x1="4" y1="136" x2="156" y2="136" class="solid"></line><line x1="4" y1="168" x2="156" y2="168" class="solid"></line></g><g><line x1="284" y1="88" x2="436" y2="88" class="solid"></line><line x1="284" y1="88" x2="284" y2="216" class="solid"></line><line x1="436" y1="88" x2="436" y2="120" class="solid"></line><line x1="284" y1="120" x2="436" y2="120" class="solid"></line><line x1="284" y1="152" x2="436" y2="152" class="solid"></line><line x1="284" y1="184" x2="436" y2="184" class="solid"></line><line x1="436" y1="184" x2="436" y2="216" class="solid"></line><line 
x1="284" y1="216" x2="436" y2="216" class="solid"></line></g><g><line x1="408" y1="168" x2="476" y2="168" class="solid"></line><line x1="476" y1="168" x2="476" y2="280" class="solid"></line><line x1="476" y1="280" x2="520" y2="280" class="solid"></line></g><g><line x1="540" y1="136" x2="692" y2="136" class="solid"></line><line x1="540" y1="136" x2="540" y2="248" class="solid"></line><line x1="540" y1="168" x2="692" y2="168" class="solid"></line><line x1="692" y1="192" x2="692" y2="248" class="solid"></line><line x1="540" y1="248" x2="692" y2="248" class="solid"></line></g><g><line x1="664" y1="152" x2="724" y2="152" class="solid"></line><line x1="724" y1="152" x2="724" y2="184" class="solid"></line><line x1="680" y1="184" x2="724" y2="184" class="solid"></line></g><g><line x1="540" y1="280" x2="692" y2="280" class="solid"></line><line x1="540" y1="280" x2="540" y2="392" class="solid"></line><line x1="540" y1="312" x2="692" y2="312" class="solid"></line><line x1="692" y1="336" x2="692" y2="392" class="solid"></line><line x1="540" y1="392" x2="692" y2="392" class="solid"></line></g><g><line x1="664" y1="296" x2="724" y2="296" class="solid"></line><line x1="724" y1="296" x2="724" y2="328" class="solid"></line><line x1="672" y1="328" x2="724" y2="328" class="solid"></line></g></svg>

<h2 id="update-1-10242024">Update 1: 10/24/2024</h2>

<p>I presented Zorth to the first NYC Zig meetup!</p>

<p><a href="https://www.youtube.com/watch?v=Bar4IFc1NpM" title="FORTH in Zig"><img src="http://img.youtube.com/vi/Bar4IFc1NpM/0.jpg" alt="ZORTH" /></a></p>

<h2 id="update-2-11122024">Update 2: 11/12/2024</h2>

<p>After much dragon fighting, I managed to <a href="https://discord.com/channels/605571803288698900/1305550948713627669">build zorth for the web</a>.  Enjoy!</p>

<div id="terminal"></div>
<script type="module">
    import "/blog/node_modules/@xterm/xterm/lib/xterm.js";
    import { openpty } from "/blog/node_modules/xterm-pty/index.mjs";
    import initEmscripten from "/blog/6th.mjs";

    const xterm = new Terminal();
    xterm.open(document.getElementById("terminal"));

    const { master, slave } = openpty();
    xterm.loadAddon(master);

    const response = await fetch("/blog/jonesforth.f");
    const preamble = new Uint8Array(await response.arrayBuffer());
    slave.ldisc.toUpperBuf.push(...preamble);

    await initEmscripten({ pty: slave });
    slave.ldisc.flushToUpper();
</script>]]></content><author><name></name></author><summary type="html"><![CDATA[As this blog shows, I am deeply interested in interpreters. I find the meta nature of programs that run programs fascinating. That said, I am also too lazy to care for the fiddly bookkeeping needed to write a robust parser. That makes FORTH’s complete lack of syntax particularly appealing (look ma, no parser)!]]></summary></entry><entry><title type="html">What is Tail Call Elimination?</title><link href="https://bur.gy/2024/03/29/tail-recursion.html" rel="alternate" type="text/html" title="What is Tail Call Elimination?" /><published>2024-03-29T13:00:07+00:00</published><updated>2024-03-29T13:00:07+00:00</updated><id>https://bur.gy/2024/03/29/tail-recursion</id><content type="html" xml:base="https://bur.gy/2024/03/29/tail-recursion.html"><![CDATA[<p>Previous posts highlight my interest in (obsession with?) interpreters.  I keep marveling at
techniques people have come up with over the years to make our interactions with computers feel conversational.
I was recently reminded of <a href="https://en.wikipedia.org/wiki/Guy_L._Steele_Jr.">Guy Steele</a>’s seminal 
<a href="https://en.wikisource.org/wiki/Lambda:_The_Ultimate_GOTO">Lambda: The Ultimate GOTO</a>
article.  Around the same time, I came across old release notes for
<a href="https://releases.llvm.org/13.0.0/tools/clang/docs/ReleaseNotes.html">Clang 13.0.0</a> announcing, among other things, a
<a href="https://clang.llvm.org/docs/AttributeReference.html#musttail">musttail attribute</a>.  This immediately reminded me
of my decision to use <a href="https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html">GNU’s labels as values</a> to implement
<a href="/2023/02/24/what-forth-again.html">jonesforth in C</a>.  Naturally, I started wondering what a FORTH
implemented in C would look like with <code class="language-plaintext highlighter-rouge">goto</code> replaced by <code class="language-plaintext highlighter-rouge">musttail</code>.  A quick google search turned up
a public <a href="https://gist.github.com/snej/9ba59d90689843b22dc5be2730ef0d49/2d55f844b7622aa117b9275bbc9d189613f7ff7f">gist</a>
by <a href="https://gist.github.com/snej">Jens Alfke</a>.  As an aside, I didn’t notice Jens’s more elaborate
<a href="https://github.com/snej/tails/">Tails</a> project until later.</p>

<p>The first step to migrating from <code class="language-plaintext highlighter-rouge">goto</code> to <code class="language-plaintext highlighter-rouge">musttail</code> is to abandon all the <code class="language-plaintext highlighter-rouge">void **</code> (and occasional <code class="language-plaintext highlighter-rouge">void ***</code> because
of <a href="https://en.wikipedia.org/wiki/Threaded_code#Indirect_threading">indirect threading</a>) in favor of explicit
types.  I adapted the union below from Jens:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">union</span> <span class="n">instr_t</span> <span class="p">{</span>
    <span class="kt">intptr_t</span> <span class="o">*</span><span class="p">(</span><span class="o">*</span><span class="n">code</span><span class="p">)(</span><span class="k">struct</span> <span class="n">interp_t</span> <span class="o">*</span><span class="p">,</span> <span class="kt">intptr_t</span> <span class="o">*</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">**</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="p">);</span>
    <span class="kt">intptr_t</span> <span class="n">literal</span><span class="p">;</span>
    <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="n">word</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>A FORTH instruction is either the address of a function with a complicated signature (more on that later),
a literal word (branching instructions encode their offsets directly in the instruction stream), or a pointer
to an instruction (for indirect threading).  Let’s pick one example apart:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">intptr_t</span> <span class="o">*</span><span class="nf">SWAP</span><span class="p">(</span><span class="k">struct</span> <span class="n">interp_t</span> <span class="o">*</span><span class="p">,</span> <span class="kt">intptr_t</span> <span class="o">*</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">**</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="p">);</span>
<span class="k">static</span> <span class="k">struct</span> <span class="n">word_t</span> <span class="n">name_SWAP</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">used</span><span class="p">))</span> <span class="o">=</span> <span class="p">{</span>
    <span class="p">.</span><span class="n">link</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">name_DROP</span><span class="p">,</span>
    <span class="p">.</span><span class="n">flag</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span>
    <span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">"SWAP"</span><span class="p">,</span>
    <span class="p">.</span><span class="n">code</span> <span class="o">=</span> <span class="p">{.</span><span class="n">code</span> <span class="o">=</span> <span class="n">SWAP</span><span class="p">}</span>
<span class="p">};</span>
<span class="kt">intptr_t</span> <span class="o">*</span><span class="nf">SWAP</span><span class="p">(</span><span class="k">struct</span> <span class="n">interp_t</span> <span class="o">*</span><span class="n">env</span><span class="p">,</span> <span class="kt">intptr_t</span> <span class="o">*</span><span class="n">sp</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">**</span><span class="n">rsp</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="n">ip</span><span class="p">,</span> <span class="k">union</span> <span class="n">instr_t</span> <span class="o">*</span><span class="n">target</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">unused</span><span class="p">)))</span>
<span class="p">{</span>
    <span class="k">register</span> <span class="kt">intptr_t</span> <span class="n">tmp</span> <span class="o">=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">tmp</span><span class="p">;</span>
    <span class="n">__attribute__</span><span class="p">((</span><span class="n">musttail</span><span class="p">))</span> <span class="k">return</span> <span class="n">ip</span><span class="o">-&gt;</span><span class="n">word</span><span class="o">-&gt;</span><span class="n">code</span><span class="p">(</span><span class="n">env</span><span class="p">,</span> <span class="n">sp</span><span class="p">,</span> <span class="n">rsp</span><span class="p">,</span> <span class="n">ip</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">ip</span><span class="o">-&gt;</span><span class="n">word</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This example begins with a <a href="https://en.wikipedia.org/wiki/Forward_declaration">forward declaration</a> of
the <code class="language-plaintext highlighter-rouge">SWAP</code> function so we can include it in the <code class="language-plaintext highlighter-rouge">name_SWAP</code> struct.  That struct is part of the
<a href="https://en.wikipedia.org/wiki/Linked_list#Singly_linked_list">singly linked list</a> that implements the
FORTH <a href="https://www.forth.com/starting-forth/11-forth-compiler-defining-words/">vocabulary</a>.
The <code class="language-plaintext highlighter-rouge">name_*</code> structs let the FORTH interpreter/compiler associate textual names of FORTH words
with their implementations.  That logic lives in <code class="language-plaintext highlighter-rouge">INTERPRET</code> which invokes <code class="language-plaintext highlighter-rouge">WORD</code> to
recognize user input then either appends to the word being compiled or transfers control when
in interpreter mode.  <code class="language-plaintext highlighter-rouge">STATE</code> distinguishes between interpreter and compiler mode.  By convention,
<code class="language-plaintext highlighter-rouge">INTERPRET</code> also tries to interpret strings it couldn’t map to words as literal numbers.  If it
fails, it emits an error message and moves on to the next token.  This illustrates FORTH parsimony:
no tokenizer, lexer, or syntax tree.  FORTH is a <a href="https://en.wikipedia.org/wiki/WYSIWYG">WYSIWYG</a> language.</p>
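<p>A minimal C sketch of that decision, under loud assumptions: the <code class="language-plaintext highlighter-rouge">classify</code> helper, the <code class="language-plaintext highlighter-rouge">latest</code> pointer, and the stripped-down <code class="language-plaintext highlighter-rouge">word_t</code> below are made up for illustration and are not taken from <code class="language-plaintext highlighter-rouge">5th.c</code>.  It only shows the order of operations: search the linked list first, fall back to parsing a number in the current base, give up otherwise.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, stripped-down dictionary entry: link and name only. */
struct word_t {
    struct word_t *link;
    const char *name;
};

static struct word_t name_DROP = {NULL, "DROP"};
static struct word_t name_SWAP = {&name_DROP, "SWAP"};
static struct word_t *latest = &name_SWAP; /* most recently defined word */

/* Walk the singly linked list from the newest word to the oldest. */
static struct word_t *find(const char *token)
{
    for (struct word_t *w = latest; w != NULL; w = w->link)
        if (strcmp(w->name, token) == 0)
            return w;
    return NULL;
}

enum kind { KIND_WORD, KIND_NUMBER, KIND_ERROR };

/* Known word first, literal number second, error last. */
static enum kind classify(const char *token, int base, long *value)
{
    char *end;

    assert(token != NULL && value != NULL);
    if (find(token) != NULL)
        return KIND_WORD;
    *value = strtol(token, &end, base);
    if (end != token && *end == '\0')
        return KIND_NUMBER;
    return KIND_ERROR;
}
```

<p>A real <code class="language-plaintext highlighter-rouge">INTERPRET</code> would then either execute the word or append it to the definition being compiled, depending on <code class="language-plaintext highlighter-rouge">STATE</code>.</p>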

<p>Performance specialists might worry that hot functions with so many parameters will challenge
clang’s <a href="https://en.wikipedia.org/wiki/Register_allocation">register allocation</a>.  Fortunately,
those functions will never be called by a
<a href="https://www.quora.com/What-does-the-call-instruction-do-in-an-assembly-language">CALL instruction</a>!
The parameters are as follows:</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">struct interp_t *env</code></p>
</blockquote>

<p>A number of interpreter global variables like <code class="language-plaintext highlighter-rouge">BASE</code>, <code class="language-plaintext highlighter-rouge">STATE</code> (whether we’re compiling or interpreting), and <code class="language-plaintext highlighter-rouge">HERE</code>.</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">intptr_t *sp</code></p>
</blockquote>

<p>A pointer to the top of the value stack.  Both stacks grow downward.</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">union instr_t **rsp</code></p>
</blockquote>

<p>A pointer to the top of the return stack.</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">union instr_t *ip</code></p>
</blockquote>

<p>A pointer to the <em>next</em> instruction.  Because we’re using indirect threading, <code class="language-plaintext highlighter-rouge">ip-&gt;code</code> rarely makes
sense, only <code class="language-plaintext highlighter-rouge">ip-&gt;literal</code> and <code class="language-plaintext highlighter-rouge">ip-&gt;word</code> do.</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">union instr_t *target</code></p>
</blockquote>

<p>A pointer to the <em>current</em> instruction.  Used by <code class="language-plaintext highlighter-rouge">DOCOL</code> to transfer control to the next word.
Having it as rightmost argument is reminiscent of
<a href="https://en.wikipedia.org/wiki/Continuation-passing_style">continuation-passing style</a>.  If anything,
having so many parameters simplifies clang’s register allocation as the 
<a href="https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI">System V ABI</a> mandates that
the first six integer or pointer arguments are passed in registers <code class="language-plaintext highlighter-rouge">%rdi</code>, <code class="language-plaintext highlighter-rouge">%rsi</code>, <code class="language-plaintext highlighter-rouge">%rdx</code>, <code class="language-plaintext highlighter-rouge">%rcx</code>,
<code class="language-plaintext highlighter-rouge">%r8</code>, and <code class="language-plaintext highlighter-rouge">%r9</code>.</p>
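<p>To see how <code class="language-plaintext highlighter-rouge">rsp</code>, <code class="language-plaintext highlighter-rouge">ip</code>, and <code class="language-plaintext highlighter-rouge">target</code> cooperate, here is a small sketch that runs Richard’s <code class="language-plaintext highlighter-rouge">QUADRUPLE</code> example.  It is not lifted from <code class="language-plaintext highlighter-rouge">5th.c</code>: the <code class="language-plaintext highlighter-rouge">MUSTTAIL</code> and <code class="language-plaintext highlighter-rouge">PARAMS</code> macros, the one-field <code class="language-plaintext highlighter-rouge">interp_t</code>, the <code class="language-plaintext highlighter-rouge">BYE</code> word, and the <code class="language-plaintext highlighter-rouge">quadruple</code> driver are assumptions made to keep it self-contained.  On compilers without the attribute, <code class="language-plaintext highlighter-rouge">MUSTTAIL</code> degrades to an ordinary call, which is fine for a program this short.</p>

```c
#include <assert.h>
#include <stdint.h>

#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* assumption: plain calls are good enough for this demo */
#endif

struct interp_t { intptr_t base; }; /* hypothetical stand-in */
union instr_t;
typedef intptr_t *code_t(struct interp_t *, intptr_t *, union instr_t **,
                         union instr_t *, union instr_t *);
union instr_t {
    code_t *code;
    intptr_t literal;
    union instr_t *word;
};

#define PARAMS struct interp_t *env, intptr_t *sp, union instr_t **rsp, \
               union instr_t *ip, union instr_t *target
#define NEXT MUSTTAIL return ip->word->code(env, sp, rsp, ip + 1, ip->word)

/* Enter a colon definition: save the return address, jump to the body. */
static intptr_t *DOCOL(PARAMS)
{
    *--rsp = ip;     /* where to resume once the callee hits EXIT */
    ip = target + 1; /* the body starts right after the codeword */
    NEXT;
}

/* Leave a colon definition: pop the saved return address. */
static intptr_t *EXIT(PARAMS) { (void)target; ip = *rsp++; NEXT; }
static intptr_t *DUP(PARAMS) { (void)target; sp[-1] = sp[0]; --sp; NEXT; }
static intptr_t *ADD(PARAMS) { (void)target; sp[1] += sp[0]; ++sp; NEXT; }

/* The only word that really returns: it hands sp back to C. */
static intptr_t *BYE(PARAMS)
{
    (void)env; (void)rsp; (void)ip; (void)target;
    return sp;
}

static union instr_t dup_[] = {{.code = DUP}};
static union instr_t add_[] = {{.code = ADD}};
static union instr_t exit_[] = {{.code = EXIT}};
static union instr_t bye_[] = {{.code = BYE}};
/* : DOUBLE DUP + ; */
static union instr_t double_[] = {
    {.code = DOCOL}, {.word = dup_}, {.word = add_}, {.word = exit_}};
/* : QUADRUPLE DOUBLE DOUBLE ; */
static union instr_t quadruple_[] = {
    {.code = DOCOL}, {.word = double_}, {.word = double_}, {.word = exit_}};

static intptr_t quadruple(intptr_t x)
{
    struct interp_t env = {10};
    intptr_t stack[16];
    union instr_t *rstack[16];
    union instr_t program[] = {{.word = quadruple_}, {.word = bye_}};
    union instr_t *ip = program;
    intptr_t *sp = stack + 15; /* both stacks grow downward */

    *sp = x;
    sp = ip->word->code(&env, sp, rstack + 16, ip + 1, ip->word);
    assert(sp == stack + 15); /* one value in, one value out */
    return *sp;
}
```

<p>Only <code class="language-plaintext highlighter-rouge">DOCOL</code> reads <code class="language-plaintext highlighter-rouge">target</code>; every other word ignores it, which matches the <code class="language-plaintext highlighter-rouge">unused</code> attribute on <code class="language-plaintext highlighter-rouge">SWAP</code> above.</p>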

<p>All word initializers start with a forward declaration so the first eight lines (before the
last opening curly) can be generated by a <a href="https://gcc.gnu.org/onlinedocs/cpp/Macros.html">preprocessor macro</a>.
A previous version injected the code (from the last opening to the end) as a macro argument.  This
made for a sub-optimal debugging experience since macros expand everything into a single line.
The macro version of <code class="language-plaintext highlighter-rouge">SWAP</code> looks much tidier, like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">DEFCODE</span><span class="p">(</span><span class="n">DROP</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s">"SWAP"</span><span class="p">,</span> <span class="n">SWAP</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">register</span> <span class="kt">intptr_t</span> <span class="n">tmp</span> <span class="o">=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
    <span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">tmp</span><span class="p">;</span>
    <span class="n">NEXT</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
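<p>For readers wondering what such a macro could look like, here is one possible shape.  Treat it as a guess rather than the definition used in <code class="language-plaintext highlighter-rouge">5th.c</code>: the field layout of <code class="language-plaintext highlighter-rouge">word_t</code>, the <code class="language-plaintext highlighter-rouge">name_ROOT</code> sentinel, the <code class="language-plaintext highlighter-rouge">MUSTTAIL</code> fallback, the <code class="language-plaintext highlighter-rouge">BYE</code> word, and the <code class="language-plaintext highlighter-rouge">swap_top</code> driver are all assumptions added so the sketch stands on its own.</p>

```c
#include <assert.h>
#include <stdint.h>

#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* assumption: fall back to an ordinary call */
#endif

struct interp_t { intptr_t state; }; /* hypothetical stand-in */
union instr_t;
typedef intptr_t *code_t(struct interp_t *, intptr_t *, union instr_t **,
                         union instr_t *, union instr_t *);
union instr_t {
    code_t *code;
    intptr_t literal;
    union instr_t *word;
};
struct word_t { /* assumed layout */
    struct word_t *link;
    int flag;
    const char *name;
    union instr_t code;
};

/* Forward declaration, dictionary entry, then the function header. */
#define DEFCODE(prev, flags, text, label)                               \
    static code_t label;                                                \
    static struct word_t name_##label __attribute__((used)) = {         \
        .link = &name_##prev, .flag = (flags), .name = (text),          \
        .code = {.code = label}};                                       \
    static intptr_t *label(struct interp_t *env, intptr_t *sp,          \
                           union instr_t **rsp, union instr_t *ip,      \
                           union instr_t *target __attribute__((unused)))

#define NEXT MUSTTAIL return ip->word->code(env, sp, rsp, ip + 1, ip->word)

/* Hypothetical sentinel so the first real word has something to link to. */
static struct word_t name_ROOT = {.link = 0, .flag = 0, .name = ""};

DEFCODE(ROOT, 0, "DROP", DROP)
{
    ++sp;
    NEXT;
}

DEFCODE(DROP, 0, "SWAP", SWAP)
{
    intptr_t tmp = sp[1];
    sp[1] = sp[0];
    sp[0] = tmp;
    NEXT;
}

DEFCODE(SWAP, 0, "BYE", BYE)
{
    (void)env; (void)rsp; (void)ip;
    return sp; /* stop threading and hand the stack back to C */
}

/* Push a then b, run SWAP BYE, and report the new top of stack. */
static intptr_t swap_top(intptr_t a, intptr_t b)
{
    struct interp_t env = {0};
    intptr_t stack[4] = {0, 0, b, a}; /* grows downward: b is on top */
    union instr_t *rstack[4];
    union instr_t program[] = {{.word = &name_SWAP.code},
                               {.word = &name_BYE.code}};
    union instr_t *ip = program;
    intptr_t *sp = ip->word->code(&env, stack + 2, rstack + 4, ip + 1, ip->word);

    assert(sp == stack + 2); /* SWAP leaves the depth unchanged */
    return sp[0];
}
```

<p>Because the macro stops at the function header, each word’s body stays on its own source lines, which is what keeps the debugger usable.</p>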

<p>Amazingly, the <a href="https://en.wikipedia.org/wiki/X86">x86</a> version of <code class="language-plaintext highlighter-rouge">SWAP</code> is barely longer than the C version:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">SWAP:</span>                           <span class="err">#</span> <span class="err">@</span><span class="nf">SWAP</span>
    <span class="nf">movdqu</span>  <span class="p">(</span><span class="o">%</span><span class="nb">rsi</span><span class="p">),</span> <span class="o">%</span><span class="nv">xmm0</span>
    <span class="nf">pshufd</span>  <span class="kc">$</span><span class="mi">78</span><span class="p">,</span> <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="o">%</span><span class="nv">xmm0</span>   <span class="err">#</span> <span class="nv">xmm0</span> <span class="err">=</span> <span class="nv">xmm0</span><span class="p">[</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">]</span>
    <span class="nf">movdqu</span>  <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="nb">rsi</span><span class="p">)</span>
    <span class="nf">movq</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rcx</span><span class="p">),</span> <span class="o">%</span><span class="nv">r8</span>         <span class="err">#</span> <span class="nv">target</span> <span class="err">=</span> <span class="nv">ip</span><span class="o">-&gt;</span><span class="kt">word</span>
    <span class="nf">movq</span>    <span class="p">(</span><span class="o">%</span><span class="nv">r8</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>         <span class="err">#</span> <span class="nb">rax</span> <span class="err">=</span> <span class="nv">target</span><span class="o">-&gt;</span><span class="nv">code</span>
    <span class="nf">addq</span>    <span class="kc">$</span><span class="mi">8</span><span class="p">,</span> <span class="o">%</span><span class="nb">rcx</span>            <span class="err">#</span> <span class="nv">ip</span> <span class="o">+</span> <span class="mi">1</span>
    <span class="nf">jmpq</span>    <span class="o">*%</span><span class="nb">rax</span>               <span class="err">#</span> <span class="nv">TAILCALL</span>
</code></pre></div></div>

<p>The first three lines actually swap <code class="language-plaintext highlighter-rouge">sp[0]</code> and <code class="language-plaintext highlighter-rouge">sp[1]</code> (using <code class="language-plaintext highlighter-rouge">pshufd</code> to
<a href="https://www.felixcloutier.com/x86/pshufd">shuffle packed doublewords</a>).
The last four lines dereference <code class="language-plaintext highlighter-rouge">ip</code> <em>twice</em> (<em>indirect</em> threading)
then perform the tail call using an <a href="https://en.wikipedia.org/wiki/Indirect_branch">indirect branch</a>.
Staring at that generated assembly highlights a clever bit of
<a href="https://en.wikipedia.org/wiki/Code_golf">code golf</a> in 
<a href="https://rwmj.wordpress.com/">Richard WM Jones</a>’s handcrafted 
<a href="https://github.com/nornagon/jonesforth/blob/master/jonesforth.S">jonesforth.S</a>:
he condensed the last 4 instructions above into just 2 
(<code class="language-plaintext highlighter-rouge">lodsl</code> is equivalent to <code class="language-plaintext highlighter-rouge">mov (%esi), %eax; add $4, %esi</code> while
<code class="language-plaintext highlighter-rouge">jmp *(%eax)</code> is equivalent to <code class="language-plaintext highlighter-rouge">mov (%eax), %eax; jmp *%eax</code>)!
<a href="https://github.com/jburgy/blog/blob/97e00d7795ee0c97cbea901aae46d98a7bb5ebf5/fun/5th.c">97e00d7</a>
is a (failed) attempt at using <a href="https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html">extended asm</a>
to achieve the same compression.  Unfortunately, I found no better way to enforce
pre-conditions than explicit <code class="language-plaintext highlighter-rouge">mov</code> instructions (which more than offset the saving).</p>

<p>The <a href="https://webassembly.org/">WebAssembly</a> version, courtesy of
<a href="https://webassembly.github.io/wabt/demo/wasm2wat/">wasm2wat</a>, is only slightly longer:</p>
<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="nf">module</span> <span class="nv">$5th</span><span class="o">.</span><span class="nv">wasm</span>
  <span class="p">(</span><span class="nf">type</span> <span class="nv">$t0</span> <span class="p">(</span><span class="nf">func</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">i32</span> <span class="nv">i32</span> <span class="nv">i32</span> <span class="nv">i32</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">result</span> <span class="nv">i32</span><span class="p">)))</span>
  <span class="o">...</span>
  <span class="p">(</span><span class="nf">func</span> <span class="nv">$SWAP</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">$t0</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$p0</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$p1</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$p2</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$p3</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">param</span> <span class="nv">$p4</span> <span class="nv">i32</span><span class="p">)</span> <span class="p">(</span><span class="nf">result</span> <span class="nv">i32</span><span class="p">)</span>
    <span class="p">(</span><span class="nf">i64</span><span class="o">.</span><span class="nv">store</span> <span class="nv">align=4</span>
      <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p1</span><span class="p">)</span>
      <span class="p">(</span><span class="nf">i64</span><span class="o">.</span><span class="nv">rotl</span>
        <span class="p">(</span><span class="nf">i64</span><span class="o">.</span><span class="nv">load</span> <span class="nv">align=4</span>
          <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p1</span><span class="p">))</span>
        <span class="p">(</span><span class="nf">i64</span><span class="o">.</span><span class="nv">const</span> <span class="mi">32</span><span class="p">)))</span>
    <span class="p">(</span><span class="nf">return_call_indirect</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">$t0</span><span class="p">)</span>
      <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p0</span><span class="p">)</span>
      <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p1</span><span class="p">)</span>
      <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p2</span><span class="p">)</span>
      <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">add</span>
        <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p3</span><span class="p">)</span>
        <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">const</span> <span class="mi">4</span><span class="p">))</span>
      <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">tee</span> <span class="nv">$p3</span>
        <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span>
          <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p3</span><span class="p">)))</span>
      <span class="p">(</span><span class="nf">i32</span><span class="o">.</span><span class="nv">load</span>
        <span class="p">(</span><span class="nf">local</span><span class="o">.</span><span class="nv">get</span> <span class="nv">$p3</span><span class="p">))))</span>
  <span class="o">...</span>
<span class="p">)</span>
</code></pre></div></div>

<p>In summary, <a href="https://github.com/jburgy/blog/blob/main/forth/5th.c"><code class="language-plaintext highlighter-rouge">5th.c</code></a> looks
different from <a href="https://github.com/jburgy/blog/blob/main/forth/4th.c"><code class="language-plaintext highlighter-rouge">4th.c</code></a>, at least
when considering their C sources. It turns out that compilers generate very similar code
from those different sources.  That code is also reminiscent of
<a href="https://github.com/nornagon/jonesforth/blob/master/jonesforth.S">jonesforth.S</a>
which was hand-crafted in 32-bit x86 assembly.  <code class="language-plaintext highlighter-rouge">5th.c</code> is probably the easiest to extend
of the bunch.  New native instructions only require implementing a new C function with
a well-defined signature.  It might be the
<a href="https://en.wikipedia.org/wiki/Shiny_object_syndrome">shiny object syndrome</a>, but I think that
<a href="https://github.com/jburgy/blog/blob/main/forth/5th.c"><code class="language-plaintext highlighter-rouge">5th.c</code></a> is a reasonable
<a href="https://en.wikipedia.org/wiki/Blueprint">blueprint</a> for implementing your next interpreter.</p>

<div id="terminal"></div>
<script type="module">
    import "/blog/node_modules/@xterm/xterm/lib/xterm.js";
    import { openpty } from "/blog/node_modules/xterm-pty/index.mjs";
    import initEmscripten from "/blog/5th.mjs";

    const xterm = new Terminal();
    xterm.open(document.getElementById("terminal"));

    const { master, slave } = openpty();
    xterm.loadAddon(master);

    const response = await fetch("/blog/jonesforth.f");
    const preamble = new Uint8Array(await response.arrayBuffer());
    slave.ldisc.toUpperBuf.push(...preamble);

    await initEmscripten({ pty: slave });
    slave.ldisc.flushToUpper();
</script>]]></content><author><name></name></author><summary type="html"><![CDATA[Previous posts highlight my interest in (obsession with?) interpreters. I keep marveling at techniques people have come up with over the years to make our interactions with computers feel conversational. I was recently reminded of Guy Steele’s seminal Lambda: The Ultimate GOTO article. Around the same time, I came across old release notes for Clang 13.0.0 announcing, among other things, a musttail attribute. This immediately reminded me of my decision to use GNU’s labels as values to implement jonesforth in C. Naturally, I started wondering what a FORTH implemented in C would look like with goto replaced by musttail. A quick google search turned up a public gist by Jens Alfke. As an aside, I didn’t notice Jens’s more elaborate Tails project until later.]]></summary></entry></feed>