Confused Life - Reloaded

Saturday, March 7, 2026

Two Agents, One Codebase: An F1 Race Team Approach to Porting ACOLITE to Rust

In Formula 1, every team fields two drivers. Not as a backup plan – as a strategy. One driver pushes the pace, forcing rivals to respond. The other holds position, manages tyres, and covers the alternative strategy. They share telemetry, they share a garage, but they are running different races on the same track. The team wins when both cars score points, not when one driver tries to do everything.

Porting a scientific Python codebase to Rust feels remarkably similar. You need the aggressive driver – the one who charges into unfamiliar code and lays down fast laps of Rust implementation. And you need the calculating driver – the one who reads the data, watches for degradation, and calls out when the numerical precision is drifting. Two AI coding agents, paired like Norris and Piastri, sharing a codebase but operating on different parts of the problem.

The Starting Grid: Why Rust for ACOLITE?

ACOLITE is RBINS’ atmospheric correction toolkit for aquatic remote sensing. It handles everything from Landsat and Sentinel-2 to hyperspectral sensors like PACE OCI (286 bands) and PRISMA (239 bands). The Dark Spectrum Fitting (DSF) algorithm is elegant – image-based, no external atmospheric inputs – but in Python, processing a full PACE scene involves reading 291 NetCDF variables, interpolating multi-dimensional LUTs, and correcting each pixel’s reflectance through a chain of gas transmittance, Rayleigh scattering, and aerosol models. On a decent machine, this takes around 230 seconds.

The seed was planted at FOSS4G 2025 in Auckland when Leo Hardtke ran a tutorial on Earth Observation processing with Rust. It was plagued by Nix environment issues (as I noted in my conference write-up), but when the code ran, it was fast. Zero-cost abstractions and fearless concurrency are not just slogans at that point – they are wall-clock seconds you are not spending waiting for your atmospheric correction to finish.

I had also been watching Rob Woodcock’s acolite-mp branch, which tackled the same performance problem from within Python. His approach was clever: per-band parallelism with memory budgets tuned to cloud CPU-to-RAM ratios (2, 4, or 8 GiB per core), replacing NumPy’s interpolation with the multithreaded pyinterp, and carefully managing the GIL contention that Python’s threading model inflicts on you. He got Sentinel-2 from 791s down to 197s and Landsat from 312s to 99s on a 24-core i9 – roughly a 3-4x speedup.

But the GIL is still there. The memory model is still Python’s. And as Rob himself noted, “further performance improvements are possible but require more extensive changes to the file handling” and “there is a fair amount of GIL contention which limits threading being caused by some structural choices in the implementation.” At some point, you are fighting the language rather than the problem.

Rust sidesteps all of this. No GIL. No garbage collector. Rayon gives you data-parallel iterators that map across bands or tiles with work-stealing. Memory usage is deterministic and known at compile time – you can profile it statically before deploying, which is a sentence that makes no sense in Python-land but is table stakes in systems programming.

The Pit Crew: Two Agents via ACP

Here is where the teammate analogy really kicks in. In F1, a team with only one driver is not half a team – it is no team at all. You cannot run a split strategy with a single car. You cannot use one driver to hold up a rival while the other pulls a gap. The performance of the pair exceeds the sum of the individuals because they create options that a solo driver simply cannot.

Porting 40,000+ lines of scientific Python to Rust is the same. A single AI agent writing Rust will drift – the implementation slowly diverging from Python’s numerical behaviour until your reflectance values are off by just enough to be scientifically useless. You need the second driver to keep it honest.

The solution I landed on was a multi-agent orchestration harness using the Agent Client Protocol (ACP), a JSON-RPC 2.0 protocol over NDJSON stdio that lets coding agents communicate in a structured way:

Agent	Role	F1 Equivalent
Kiro	Executor – writes Rust code, runs tests, reads files	Lead driver – pushes the pace, sets fast laps
Copilot	Proposer – reviews output, suggests next steps, cross-checks Python	Second driver – covers the strategy, watches the gaps
Human	Approver – filters proposals before dispatch	Team principal – makes the call on when to pit

The workflow per sensor port looks like this:

Human provides --task to the orchestrator (tools/agent_harness.py)
Kiro receives the task via ACP session/prompt and starts writing code
Kiro streams output via session/update chunks
Output goes to Copilot for review against the Python source
Copilot proposes ACTION: lines – “fix the gas transmittance interpolation order”, “the Rayleigh LUT needs pressure stacking”
Human approves or rejects
Approved actions go back to Kiro
Repeat until regression tests pass or maximum cycles reached
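
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of tools/agent_harness.py: the helper names (ndjson_request, extract_actions, run_cycles), the sessionId/prompt parameter layout used in the usage note, and the callback signatures are all assumptions made for the example. Only the JSON-RPC 2.0 envelope, the one-object-per-line NDJSON framing, the session/prompt method name, and the ACTION: prefix come from the description above.

```python
import json
from typing import Callable, List, Optional


def ndjson_request(request_id: int, method: str, params: dict) -> str:
    """Frame one JSON-RPC 2.0 request as a single NDJSON line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    # NDJSON: one compact JSON object per line, terminated by a newline.
    return json.dumps(message, separators=(",", ":")) + "\n"


def extract_actions(review_text: str) -> List[str]:
    """Collect the proposals a reviewer emitted as 'ACTION: ...' lines."""
    actions = []
    for line in review_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ACTION:"):
            actions.append(stripped[len("ACTION:"):].strip())
    return actions


def run_cycles(
    task: str,
    execute: Callable[[List[str]], str],   # hypothetical: sends work to the executor agent
    review: Callable[[str], str],          # hypothetical: asks the proposer agent for a review
    approve: Callable[[str], bool],        # hypothetical: the human gate
    tests_pass: Callable[[], bool],        # hypothetical: runs the regression suite
    max_cycles: int = 5,
) -> Optional[int]:
    """Return the cycle on which tests passed, or None if they never did."""
    pending = [task]
    for cycle in range(1, max_cycles + 1):
        output = execute(pending)
        if tests_pass():
            return cycle
        proposals = extract_actions(review(output))
        pending = [p for p in proposals if approve(p)]
        if not pending:
            # Assumption: with nothing approved there is nothing left to dispatch.
            break
    return None
```

A call such as ndjson_request(1, "session/prompt", {"sessionId": "s1", "prompt": [{"type": "text", "text": "port the loader"}]}) yields one line that can be written to the agent's stdin. Keeping the human gate as a plain callback means the same loop can run interactively or against a recorded approval list.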

This is not vibe coding. This is a two-car team running a split strategy.

Think about how McLaren or Red Bull operate. The lead driver qualifies on pole and sets the pace in clean air. The second driver starts on a different tyre compound, runs a longer first stint, and emerges from the pits into a different part of the field. They are solving complementary problems – one optimises for raw speed, the other for strategic coverage. Neither is redundant.

Kiro is the lead driver. It attacks the Rust implementation aggressively – writing loaders, porting DSF algorithms, wiring up rayon parallelism. It sets fast laps. It also occasionally bins it into the gravel trap by hallucinating a NumPy broadcasting rule that does not exist in ndarray.

Copilot is the second driver. It reads the Python source, cross-references the Rust output, and spots where the gap to parity is growing. “The gas transmittance interpolation order is wrong” is exactly the kind of radio call a second driver makes – not flashy, but it prevents a DNF.

The human is the team principal. You do not override the drivers on every corner, but you make the strategic calls: do we pit now and fix this RMSE regression, or do we push on and address it in the next stint? Is a 0.002 RMSE difference in Sentinel-2 reflectance acceptable? (It is – that is within float32 precision.) When do we switch from tiled DSF to fixed DSF mode for this sensor?

Together, they converge faster than either alone, for the same reason that two cars gathering tyre data in free practice gives the team more information than one car doing twice as many laps.

The Telemetry: Regression Tests Against Real Data

In F1, both drivers generate telemetry. The team overlays their data – braking points, throttle application, cornering speed – to find where one is faster and why. The overlay is the truth. Not the driver’s feeling, not the engineer’s simulation, but what the car actually did on the track.

Regression tests are our telemetry overlay. The Python ACOLITE output is Driver 1’s trace. The Rust output is Driver 2’s. We overlay them pixel-by-pixel, band-by-band, and look at the delta. When the traces diverge, something real has changed and we need to understand whether it is a genuine improvement or an error we need to correct.

There are currently 141 Python regression tests that compare Rust output against Python output pixel-by-pixel across real satellite scenes:

Landsat 8/9: 13 regression + 13 Rust-vs-Python + 7 benchmark tests
Sentinel-2 A/B: 19 regression + 15 Rust-vs-Python + 9 benchmark tests
PACE OCI: 17 regression + 14 Rust-vs-Python + 12 DSF comparison + 12 ROI + 10 full-scene tests

The tolerances are tight. Sentinel-2 achieves RMSE < 0.002 (physics-equivalent). Landsat gets RMSE < 0.02. PACE full-scene (1710 x 1272 pixels x 291 bands) hits mean RMSE of 0.004 with 100% of pixels within 0.05 of Python. Correlation coefficients are R > 0.999 across all sensors.

These are not toy tests on synthetic data. They run against actual L1 scenes downloaded from USGS and NASA. When the tests break, something real is wrong.
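
For readers who want the overlay spelled out, here is a minimal sketch of the three numbers quoted above: RMSE, the fraction of pixels within a tolerance, and the correlation coefficient. It is written in plain Python over flat lists so it runs anywhere. The function names are hypothetical, and a real test would operate on arrays and would need to mask no-data pixels first, which this sketch assumes has already been done.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Root-mean-square difference between two equally sized pixel lists."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum((c - r) ** 2 for r, c in zip(reference, candidate))
    return math.sqrt(squared / len(reference))


def fraction_within(reference: Sequence[float], candidate: Sequence[float],
                    tolerance: float) -> float:
    """Share of pixels whose absolute difference is at most `tolerance`."""
    hits = sum(1 for r, c in zip(reference, candidate) if abs(c - r) <= tolerance)
    return hits / len(reference)


def pearson_r(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Pearson correlation coefficient between the two traces."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_c = sum((c - mean_c) ** 2 for c in candidate)
    return cov / math.sqrt(var_r * var_c)


# Made-up reflectance values for one band, purely for illustration.
python_rhos = [0.010, 0.020, 0.030, 0.040]
rust_rhos = [0.011, 0.019, 0.031, 0.039]
print(f"RMSE={rmse(python_rhos, rust_rhos):.4f}",
      f"within 0.05: {fraction_within(python_rhos, rust_rhos, 0.05):.0%}",
      f"r={pearson_r(python_rhos, rust_rhos):.4f}")
```

A regression test then reduces to asserting these values against the per-sensor thresholds listed above.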

The Performance Gap: Where the Seconds Go

Sensor	Scene Size	Rust	Python	Speedup
Landsat 8	62M px x 7 bands	66s	180s	2.7x
Landsat 9	62M px x 7 bands	56s	180s	3.2x
Sentinel-2 A	30M px x 11 bands	52s	182s	3.5x
Sentinel-2 B	30M px x 11 bands	64s	173s	2.7x
PACE OCI (full)	1710 x 1272 x 291 bands	84s	230s	2.7x
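
The speedup column is simply Python time divided by Rust time. A small sketch that recomputes it from the timings in the table (the dictionary layout and function name are just for this example):

```python
# (Rust seconds, Python seconds) copied from the table above.
TIMINGS = {
    "Landsat 8": (66, 180),
    "Landsat 9": (56, 180),
    "Sentinel-2 A": (52, 182),
    "Sentinel-2 B": (64, 173),
    "PACE OCI (full)": (84, 230),
}


def speedup(rust_s: float, python_s: float) -> float:
    """Speedup is the Python wall-clock time divided by the Rust wall-clock time."""
    return python_s / rust_s


for sensor, (rust_s, python_s) in TIMINGS.items():
    share = rust_s / python_s  # fraction of the Python runtime that Rust still needs
    print(f"{sensor:16s} {speedup(rust_s, python_s):.1f}x  ({share:.0%} of the Python runtime)")
```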

The PACE result is particularly satisfying. The key optimisation was switching from 291 per-band NetCDF reads to 3 bulk detector reads, then applying rayon-parallel atmospheric correction across tiles. Load is 12 seconds, AC is 34 seconds, write is 35 seconds. That write phase for a 291-band hyperspectral cube goes to GeoZarr V3 with gzip compression – try doing that in a Python event loop without your memory allocator throwing a tantrum.

Energy Efficiency: The Fuel Strategy Nobody Talks About

Here is the part where I get philosophical – and where the F1 analogy turns from metaphor into mirror.

Formula 1 underwent a fuel efficiency revolution in 2014. The FIA introduced hybrid power units, capped fuel flow at 100 kg/hour (monitored 2,200 times per second), and forced teams to extract maximum performance from minimum fuel. The result was not slower cars – it was faster cars that used less. The 2026 regulations go further: fossil carbon is prohibited entirely, the MGU-K will deliver three times the electrical power (350kW vs today’s 120kW), producing up to 1,000 horsepower while burning sustainable fuel. Less fuel, more power. That is not a trade-off – it is an engineering constraint that drives innovation.

The same constraint applies to scientific computing; we just pretend it does not. Cloud computing bills are denominated in dollars, but the underlying unit is energy. Every CPU cycle your atmospheric correction burns is energy drawn from a power grid somewhere. When you are processing continental-scale Sentinel-2 archives or the full PACE ocean colour mission, those watt-hours add up. Python is the V10 era of scientific computing – glorious, unrestricted, and profligate with resources.

Rust is the hybrid power unit. Its advantage is not just speed – it is energy per unit of work. A 3x speedup roughly translates to using a third of the compute time, which means a third of the energy, a third of the carbon footprint, and a third of your AWS bill. The Rust Foundation and others have pointed to studies showing compiled languages like Rust and C using an order of magnitude less energy than interpreted languages for equivalent workloads. Just as F1 teams discovered that fuel efficiency constraints forced them to build fundamentally better engines, switching to Rust forces you to think about memory layout, allocation patterns, and data flow in ways that Python’s garbage collector lets you ignore – until the bill arrives.

And here is the irony that would make an F1 sustainability officer wince: Earth observation processing is meant to monitor the planet’s health. Burning excess energy to do it is like running your emissions-monitoring car on leaded fuel. F1 recognised that the sport’s 20-car grid is only 1% of its total carbon footprint, but pursued fuel efficiency anyway because the technology trickles down. The same logic applies to EO processing pipelines. The individual savings per scene are modest, but at continental archive scale they compound – just like how F1’s hybrid innovations now power road cars from Ferrari’s SF90 to the electric components in every modern turbo engine.

Static memory profiling makes this tangible. In Rust, I can tell you at compile time that a Sentinel-2 full-scene atmospheric correction will peak at approximately N gigabytes of memory, because the allocations are deterministic. In Python, you find out at runtime – usually when the OOM killer visits your pod. F1 teams know their fuel load to the gram before the formation lap. Rust gives you the same certainty for compute.

Kubernetes 1.35 and Vertical Pod Autoscaling

This deterministic memory behaviour dovetails nicely with Kubernetes 1.35’s improvements to Vertical Pod Autoscaler (VPA). VPA watches your pod’s actual resource usage and adjusts CPU and memory requests/limits accordingly. When your workload has predictable resource usage – as Rust workloads tend to – VPA converges quickly to the right allocation instead of oscillating between OOM kills and wasted headroom.

For a processing pipeline that ingests satellite scenes of varying sizes (a Landsat scene is 62 million pixels across 7 bands; a PACE scene is 2.2 million pixels across 291 bands), VPA can right-size pods per sensor type. Rust’s static memory profile means the VPA recommendations stabilise fast, which means tighter bin-packing, which means more scenes processed per node, which means lower cost per scene.
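
To see why per-sensor sizing matters, here is a back-of-the-envelope sketch of the raw data cube for the two scene shapes mentioned above. It assumes float32 samples (4 bytes) and a single resident copy of the cube. Both are assumptions for illustration: real peak memory also includes LUTs, intermediate buffers, and output staging, so treat these figures as a floor rather than a measured profile.

```python
def raw_cube_gib(pixels: int, bands: int, bytes_per_sample: int = 4) -> float:
    """Size in GiB of one pixels x bands cube (float32 by default, an assumption)."""
    return pixels * bands * bytes_per_sample / 2**30


landsat_gib = raw_cube_gib(62_000_000, 7)      # 62 million pixels x 7 bands
pace_gib = raw_cube_gib(1710 * 1272, 291)      # about 2.2 million pixels x 291 bands

print(f"Landsat cube:  {landsat_gib:.2f} GiB")
print(f"PACE OCI cube: {pace_gib:.2f} GiB")
```

Under these assumptions the PACE cube is the larger of the two despite having far fewer pixels, which is the kind of difference a per-sensor VPA target would capture.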

Compare this to Python pods where memory usage is non-deterministic, garbage collection spikes are unpredictable, and the VPA has to overprovision to avoid OOM. The 2 GiB/core cloud ratio that Rob’s acolite-mp was carefully designed around becomes less of a constraint when your language does not waste half of it on interpreter overhead.

Out-of-Band Development: Preventing Merge Conflicts with Upstream

One design decision I am particularly happy with is keeping the Rust port on a separate feature branch (feature/rust-port) and treating it as out-of-band from the Python codebase. ACOLITE upstream is actively maintained by Quinten Vanhellemont at RBINS, with regular additions of new sensors, algorithm refinements, and bug fixes. A traditional “rewrite in Rust” approach would create an immediate fork that diverges with every upstream commit.

Instead, the Rust code lives in src/, benches/, and tests/ directories that do not exist in upstream Python ACOLITE. The Python code in acolite/ stays untouched. The regression tests are the synchronisation mechanism – they import both the Python ACOLITE modules and the compiled Rust binary, run the same scene through both, and compare outputs.

When upstream adds a new sensor or changes a gas transmittance coefficient, the regression tests fail in the Rust port. That failure is the trigger: it goes into the agent harness as a --task, Kiro investigates the numerical difference, Copilot cross-references the upstream commit, and the fix lands in Rust without touching a single Python file. No merge conflicts. No rebasing nightmares. Just tests that enforce parity.

This is how you keep an acceleration layer in sync with a moving target – you do not try to merge them. You test them against each other.

What Is Next: The Gap to Full Sensor Parity

The roadmap has the current state at 48 Rust tests, 141 Python regression tests, and three sensors fully validated (Landsat 8/9, Sentinel-2 A/B, PACE OCI). The architecture – loader, AC, writer – is clean and extensible. But three sensors out of 30+ is a qualifying lap, not a race win. Here is what closing the gap to full ACOLITE parity actually looks like.

Sensor Coverage: 3 down, 30+ to go

Python ACOLITE supports a sprawling constellation of sensors. The Rust port has ticked off the three highest-priority ones but the remaining fleet breaks into tiers:

Tier 1 – Near-term (shared loader patterns exist):

Sensor	Bands	Loader Type	Blocker
Sentinel-3 OLCI	21	NetCDF	Sensor def exists, needs full pipeline
PRISMA	239	HDF5	Shares pattern with PACE
DESIS	235	HDF5	Shares pattern with PACE
EnMAP	224	HDF5	Shares pattern with PACE
EMIT	285	NetCDF	Similar to PACE OCI

These are the low-hanging fruit. The PACE port proved out the NetCDF and hyperspectral GeoZarr writer path; PRISMA/DESIS/EnMAP share the HDF5 loader pattern. Each is a well-scoped --task for the agent harness – Kiro writes the loader and wires up the AC pipeline, Copilot validates against Python output on a reference scene.

Tier 2 – Medium-term (new loader work required):

Sensor	Bands	Notes
Landsat 5 TM / 7 ETM+	7-8	Older calibration metadata formats
PlanetScope (Dove/SuperDove)	4-8	Commercial format, GeoTIFF based
WorldView-2/3	6-29	Multi-resolution, pan-sharpening
Pleiades	5-7	DIMAP format
QuickBird-2	5	Legacy but still used
VIIRS (NPP/J1/J2)	22	Swath-based HDF5, three platforms
Aqua/Terra MODIS	36	HDF4/HDF-EOS
GOCI-2	12	Korean ocean colour mission

Each of these needs a dedicated loader – different metadata formats, different calibration approaches, different file layouts. The atmospheric correction core (DSF, gas transmittance, Rayleigh, aerosol models) is shared, but getting the radiometrically calibrated top-of-atmosphere reflectance array into the pipeline is the per-sensor work.

Tier 3 – Geostationary and niche (lowest priority for aquatic applications):

GOES ABI, Himawari AHI, MTG-I FCI, SEVIRI, Sentinel-3 SLSTR, AMAZONIA-1 WFI, CHRIS, HYPERION, HICO, HyperField, HYPSO, Tanager. Some of these (HYPERION, HICO) are decommissioned but their archives are still processed. Others (Tanager at 420 bands, HYPSO at 120) are newer hyperspectral missions that would benefit most from Rust’s performance advantage.

Beyond Loaders: The Algorithm Gap

Sensor parity is not just about reading files. Python ACOLITE has several processing features the Rust port does not yet implement:

ROI subsetting: Limit processing to a bounding box or polygon – critical for operational workflows that do not need a full scene
Ancillary data retrieval: NCEP ozone, pressure, and wind speed from NASA OBPG; currently the Rust port uses default values
DEM-derived pressure: Copernicus DEM at 30/90m for surface pressure estimation in mountainous coastal regions
Glint correction: Sun glint removal for low-latitude ocean scenes
RAdCor adjacency correction: The physics-based adjacency effect correction developed under the STEREO program
TACT thermal processing: Surface temperature from Landsat thermal bands via libRadtran – this one is architecturally interesting because it requires calling an external Fortran radiative transfer code
Interface reflectance (rsky): Sky reflection correction at the air-water interface
L2W water products: Chlorophyll-a (OC algorithms), TSS (Nechad, Dogliotti), turbidity, Secchi depth – the derived products that downstream scientists actually use

The L2W gap is the most consequential. Most ACOLITE users do not care about surface reflectance per se; they want chlorophyll maps or turbidity time series. Until the Rust port can produce L2W outputs, it remains a fast atmospheric correction engine rather than a complete aquatic remote sensing toolkit.

The Realistic Path

Closing this gap is not a sprint; it is an endurance race – appropriately enough. The agent harness makes each sensor port a repeatable, testable unit of work. The pattern is established: write loader, wire to AC pipeline, run regression tests against Python, fix deltas, validate on real data. Each sensor port takes the agents a day or two of focused work plus human review.

At the current pace, Tier 1 sensors are within reach in the near term. Tier 2 will follow as the loader library matures. The algorithm features (ancillary data, glint, TACT) are orthogonal to sensor coverage and can be developed in parallel. L2W is the final milestone – when the Rust port can ingest a Sentinel-2 scene and produce a chlorophyll-a map that matches Python to within measurement uncertainty, the port will be race-ready for production.

Each of these is a --task for the agent harness. Two drivers, one constructor’s championship. The lead driver pushes into unfamiliar sensor territory, the second driver validates against Python, and the telemetry overlay catches every divergence before it compounds into a retirement.

If the intersection of Rust, Earth observation, and AI-assisted development interests you, the code is all on GitHub. Feel free to ping me with ideas, bug reports, or competing approaches – especially if you have a cleverer way to handle the N-dimensional LUT interpolation. That one was a fun 3 days of Rapid Rust Rewrite fuelled by AI Amphetamine Analogs.

Friday, February 27, 2026

Copilot + Arduino CLI + Saleae: Driver Development for V93XX

I have been spending time building out the V93XX_Arduino library for Vangotech energy-monitoring ASICs, and this is one of those projects where tooling makes or breaks momentum.

The pairing that worked best for me was:

GitHub Copilot for repetitive protocol code, tests, and workflow glue
arduino-cli for deterministic compile and upload loops
Saleae Logic Analyzer captures to confirm what is actually on the wire

The short version: Copilot gets you to a plausible driver quickly, but the logic analyzer gets you to a correct driver.

V93XX driver workflow walkthrough

Why this trio works

Developing protocol drivers is usually a game of “spec says one thing, silicon does another thing”.

If you rely only on serial logs, you can miss timing and framing issues. If you rely only on captures, you can waste hours writing boilerplate and one-off scripts. Using these three together gives a tighter loop:

Copilot proposes code and tests from your intent
Arduino CLI compiles and flashes quickly from scripts
Saleae confirms framing, parity, baud behavior, and CRC bytes
Copilot helps refactor after you learn what the bus is doing

In practice, this turned the V9381 UART ChecksumMode and waveform/FFT work from a stop-start activity into a repeatable pipeline.

Project context: V93XX_Arduino

Repository: whatnick/V93XX_Arduino

The library currently covers:

V93XX_UART for V9360 and V9381 UART modes (including address pins and checksum modes)
V93XX_SPI for V9381 SPI paths (faster acquisition, MHz-range clocks)
Examples for baseline comms, waveform capture, and FFT


Practical workflow

1) Start with a machine-checkable baseline

Before touching hardware, run the local CRC and framing tests.

python tools/test_checksum_mode.py

If this is red, do not flash anything yet.

2) Use Copilot for structured implementation work

I get better results when prompts are specific and constrained. Example prompt:

Implement V9381 UART read path with explicit ChecksumMode behavior.
Keep Clean mode strict and Dirty mode permissive.
Add unit tests for CRC-8 edge cases and frame parsing.
Do not change public method names.

Copilot is strongest here at:

Building test scaffolding quickly
Keeping repetitive register access code consistent
Producing incremental refactors after captures reveal protocol edge cases

3) Consume datasheets with PDF MCP and keep them in-repo

Before generating more driver code, I now ingest the vendor datasheet with pdf-reader-mcp so Copilot can work from extracted register tables and command framing notes.

My workflow is:

Commit the original datasheet PDF to the repository, for example under docs/datasheets/v93xx/.
Use PDF MCP to extract the high-value sections (register map, UART/SPI timing, checksum/CRC rules, waveform buffer details).
Save extracted artifacts back into the repo as text or markdown notes, for example docs/datasheets/v93xx/register_notes.md.
Prompt Copilot using both code and extracted notes so generated changes are tied to concrete datasheet language.

This gives you a versioned paper trail from silicon docs to driver behavior, which is very useful when reconciling captures from the logic analyzer.

4) Compile and flash with Arduino CLI

For ESP32-S3 targets, this keeps the build deterministic and scriptable:

arduino-cli board list
arduino-cli compile --fqbn esp32:esp32:esp32s3 examples/V9381_UART_DIRTY_MODE
arduino-cli upload --fqbn esp32:esp32:esp32s3 --port COM3 examples/V9381_UART_DIRTY_MODE

PowerShell users can run the project helper script:

.\tools\run_automated_tests.ps1 -Port COM3

That pipeline can run unit tests, compile, upload, serial verification, and optional capture/analysis phases.
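
If you are not on PowerShell, the compile and upload steps are easy to script yourself. The sketch below is a hypothetical Python wrapper, not the project's helper script: it only assembles the same arduino-cli invocations shown above and hands them to subprocess.run. The function names and the injectable run parameter (handy for dry runs) are my own additions.

```python
import shlex
import subprocess
from typing import Callable, List


def compile_cmd(fqbn: str, sketch: str) -> List[str]:
    """arduino-cli compile invocation, same flags as the command above."""
    return ["arduino-cli", "compile", "--fqbn", fqbn, sketch]


def upload_cmd(fqbn: str, port: str, sketch: str) -> List[str]:
    """arduino-cli upload invocation, same flags as the command above."""
    return ["arduino-cli", "upload", "--fqbn", fqbn, "--port", port, sketch]


def build_and_flash(fqbn: str, port: str, sketch: str,
                    run: Callable = subprocess.run) -> None:
    """Compile, then upload; stop at the first failing step."""
    for cmd in (compile_cmd(fqbn, sketch), upload_cmd(fqbn, port, sketch)):
        print("+", shlex.join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a failed compile never reaches the upload step.
        run(cmd, check=True)
```

Calling build_and_flash("esp32:esp32:esp32s3", "COM3", "examples/V9381_UART_DIRTY_MODE") reproduces the two commands above in order.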

5) Validate on-wire behavior with Saleae

The logic analyzer phase is where protocol assumptions get tested.

For UART work:

Confirm bus settings (baud, parity, stop bits) match the target mode
Verify frame boundaries and inter-frame timing
Check CRC byte placement and value against expected payload sum/complement logic

If using the helper scripts in this repo:

python tools/capture_v9381_uart.py
python tools/analyze_checksum_captures.py [capture_dir]

This gives a concrete report of expected vs observed CRC and whether behaviour matches Clean vs Dirty semantics.
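
The expected-versus-observed comparison can be sketched as below. This is an illustration of the idea, not the repo's analysis script and not the datasheet rule: it assumes an 8-bit sum-then-complement checksum carried in the last byte of the frame, with a hypothetical offset parameter standing in for any constant the device adds. The Python ChecksumMode enum and check_frame are likewise made up for this example. Take the exact byte coverage and constants from the datasheet extracts.

```python
from enum import Enum
from typing import Tuple


class ChecksumMode(Enum):
    CLEAN = "clean"  # strict: reject frames whose checksum does not match
    DIRTY = "dirty"  # permissive: accept the frame but report the mismatch


def sum_complement(payload: bytes, offset: int = 0) -> int:
    """8-bit checksum: bitwise complement of the byte sum, plus an optional offset."""
    return (offset + ~sum(payload)) & 0xFF


def check_frame(frame: bytes, mode: ChecksumMode, offset: int = 0) -> Tuple[bytes, bool]:
    """Split payload from trailing checksum byte and validate it per mode."""
    if len(frame) < 2:
        raise ValueError("frame too short to carry a checksum")
    payload, observed = frame[:-1], frame[-1]
    expected = sum_complement(payload, offset)
    ok = observed == expected
    if not ok and mode is ChecksumMode.CLEAN:
        raise ValueError(
            f"checksum mismatch: expected 0x{expected:02X}, observed 0x{observed:02X}"
        )
    return payload, ok
```

Running the same captured frames through both modes shows at a glance which ones Dirty mode would have let through.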

What changed in my debugging habits

Old approach

Edit sketch
Upload
Stare at serial output
Guess

New approach

Ask Copilot for narrow, test-backed change
Build and flash with Arduino CLI from script
Capture with Saleae and compare against expected frame math
Feed mismatch details back into Copilot for targeted fixes

It feels closer to hardware TDD than trial-and-error firmware hacking.

Trade-offs and cost notes

Arduino CLI is free and excellent for repeatable CI-style local loops
Copilot is a paid productivity tool, but pays for itself when protocol code churn is high
Saleae hardware is not cheap, but it can save days when parity/timing/CRC bugs are subtle

If you are doing serious serial or SPI driver work, a logic analyzer is not optional for long.

Troubleshooting notes that keep coming up

Wrong serial configuration (especially parity) can look like random CRC failure.
Upload success does not imply runtime protocol correctness.
Dirty mode can hide issues by design; keep a Clean mode path for strict validation.
Keep datasheet extracts in-repo so Copilot prompts reference real tables, not memory.
A scripted workflow (pdf-reader-mcp + arduino-cli + capture + analysis) beats ad-hoc manual steps every time.

Closing

For this V93XX work, the winning combination was AI-assisted coding plus old-school instrumentation.

Copilot helps me move faster, Arduino CLI keeps the loop reproducible, and Saleae captures keep me honest.

If you are building protocol drivers, treat those as complementary tools, not alternatives.

Tuesday, January 27, 2026

Loading FPGA Bitstreams with OpenOCD and Raspberry Pi - From Captain DMA to NeTV2

A Hot Adelaide Afternoon and a Pile of FPGAs

I was recruited to wire the NeTV2 to the Raspberry Pi 5 PCIe bus by some means. Tim donated a bunch of hardware to make it work, and I picked it up from his place while he sorted through the contents of his container. The ultimate aim is to get the boards going as part of fpgas.online.

The box contained a treasure trove: NeTV2 boards from bunnie’s (Andrew Huang) video overlay project, and lots of Raspberry Pis. As I went down the rabbit hole I acquired some more hardware for myself: Captain DMA development boards and various other Artix-7 based experiments (Acorn, LiteFury). These boards share a common trait – they’re designed to appear as network devices on a PCI Express bus, which makes them interesting for all sorts of applications from video processing to… let’s say “creative” system analysis and game hacking using auto-target features.

The challenge with inheriting random FPGA boards is getting new gateware onto them. The original programming cables might be lost, proprietary software might be Windows-only and in Chinese, and documentation can be scarce. This is where OpenOCD and a humble Raspberry Pi come to the rescue, as long as you can figure out which pins carry JTAG and can get a PR merged.

The Problem: DMA Cards and Their Programming Woes

Captain DMA cards (and similar Artix-7 based DMA devices) typically come with a CH347 USB JTAG interface. The vendor provides a GUI tool, but it’s entirely in Chinese and only runs on Windows. For those of us who prefer reproducible, scripted workflows on Linux – or want to program boards headless on a Raspberry Pi – this is less than ideal.

The workflow the GUI implements is actually straightforward:

Connect to the CH347 JTAG interface
Detect the FPGA/JTAG chain
Load a BSCAN SPI bitstream for JTAG-to-SPI access
Program the SPI flash with the target FPGA image
Refresh the FPGA to load the new bitstream

The key insight is that we’re not programming the FPGA directly – we’re using JTAG to access the SPI flash chip that stores the bitstream. The BSCAN primitive acts as a bridge between JTAG and the SPI flash.

OpenOCD Scripts for Captain DMA

I’ve put together a set of scripts at https://github.com/whatnick/fpga-dma-scripts that replicate the Chinese GUI’s functionality using pure OpenOCD commands. The scripts support Linux/macOS and Windows, and handle:

Automatic BSCAN bitstream download from quartiq’s prebuilt collection
CH347 interface configuration installation
Version checking for OpenOCD (needs 0.12.0+dev for CH347 support)
Cross-platform flashing with proper error handling

Captain DMA clone attached to Raspberry Pi

Basic Usage

On Linux/macOS:

./flash.sh 75t /path/to/pcileech_75t484_x1.bin

On Windows:

flash.cmd 75t C:\path\to\pcileech_75t484_x1.bin

The scripts support three Artix-7 variants: 35T, 75T, and 100T. The underlying OpenOCD command is:

openocd -c "set BSCAN_FILE bscan_spi_xc7a75t.bit" \
        -c "set FPGAIMAGE pcileech_75t484_x1.bin" \
        -f flash.cfg

The flash.cfg orchestrates the JTAG chain detection, BSCAN loading, and SPI programming sequence.
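
For illustration, here is how a wrapper might assemble that command for each supported variant. This is a hypothetical Python sketch, not the contents of flash.sh. It assumes the BSCAN proxy files follow the bscan_spi_xc7a<variant>.bit naming seen in the commands in this post; the 35T file name is inferred from that pattern.

```python
from typing import List

SUPPORTED_VARIANTS = ("35t", "75t", "100t")


def bscan_file(variant: str) -> str:
    """Map an Artix-7 variant to its BSCAN SPI proxy bitstream name (assumed pattern)."""
    normalized = variant.lower()
    if normalized not in SUPPORTED_VARIANTS:
        raise ValueError(f"unsupported variant {variant!r}; expected one of {SUPPORTED_VARIANTS}")
    return f"bscan_spi_xc7a{normalized}.bit"


def openocd_args(variant: str, image: str, cfg: str = "flash.cfg") -> List[str]:
    """Build the argument list for the openocd invocation shown above."""
    return [
        "openocd",
        "-c", f"set BSCAN_FILE {bscan_file(variant)}",
        "-c", f"set FPGAIMAGE {image}",
        "-f", cfg,
    ]


print(" ".join(openocd_args("75t", "pcileech_75t484_x1.bin")))
```

Passing the list straight to subprocess.run avoids shell quoting issues, although a path containing spaces would still need Tcl quoting inside the set command.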

Example Output

A successful flash looks like this:

Open On-Chip Debugger 0.12.0+dev-02377-gb9e401616 (2026-01-25-09:47)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
xc7_program
jtagspi_program
Info : CH347 USB To UART+JTAG from vendor wch.cn with serial number 0123456789 found. (Chip version=5.44, Firmware=0x45)
Info : clock speed 7500 kHz
Info : JTAG tap: xc7.tap tap/device found: 0x13632093 (mfg: 0x049 (Xilinx), part: 0x3632, ver: 0x1)
Info : [xc7.proxy] Examination succeed
Info : Found flash device 'win w25q64fv/jv' (ID 0x1740ef)
Info : sector 0 took 234 ms
Info : sector 1 took 250 ms
...
Info : sector 32 took 259 ms
Info : sector 33 took 256 ms

Each sector takes about 230-260ms to program. A full 75T bitstream covers around 34 sectors, so expect 8-10 seconds for the flash operation.
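
The arithmetic behind that estimate, as a quick sketch using the sector count and per-sector timings from the log above:

```python
def flash_seconds(sectors: int, ms_per_sector: float) -> float:
    """Total sector programming time in seconds."""
    return sectors * ms_per_sector / 1000.0


SECTORS = 34  # sectors 0 through 33 in the log above
fastest = flash_seconds(SECTORS, 230)
slowest = flash_seconds(SECTORS, 260)
print(f"{SECTORS} sectors: {fastest:.1f} s to {slowest:.1f} s of sector writes")
```

Sector writes alone come to a little under nine seconds. Attributing the rest of the 8-10 second figure to adapter start-up and loading the BSCAN proxy is an assumption, not something the log shows.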

NeTV2: Bunnie’s Video Overlay Board

The NeTV2 (Network Television version 2) is bunnie’s open-source HDMI video overlay board. It uses the same Xilinx Artix-7 family (XC7A35T or XC7A100T depending on variant) and was designed for real-time HDMI signal processing. The board features:

Artix-7 FPGA (35T or 100T variant)
DDR3 SDRAM (K4B2G1646F)
PCIe x4 interface
RMII Ethernet
HDMI input and output
SD card slot
SPI flash for bitstream storage

The original NeTV2 scripts from bunnie’s repo (https://github.com/bunnie/netv2mvp-scripts) were designed for Raspberry Pi 2/3 using the older sysfs GPIO interface. I’ve submitted a PR (https://github.com/AlphamaxMedia/netv2mvp-scripts/pull/1) that updates these for Raspberry Pi 4 and upstream OpenOCD.

Raspberry Pi 4 GPIO JTAG Configuration

The key changes for RPi4 compatibility involve the bcm2835gpio driver configuration:

# Boilerplate setup for Rpi on Alphamax configuration

adapter driver bcm2835gpio

transport select jtag

set _CHIPNAME xc7a35t

# Raspi4 uses a different peripheral base address
# RPi 2/3: 0x3F000000
# RPi 4:   0xFE000000
bcm2835gpio peripheral_base 0xFE000000

# Speed coefficients tuned for RPi 3B+
bcm2835gpio speed_coeffs 100000 5

# GPIO pin assignments for JTAG
adapter gpio tck 4
adapter gpio tms 17
adapter gpio tdi 27
adapter gpio tdo 22
adapter gpio srst 24

reset_config none

adapter speed 10000

The GPIO pins map directly to the Raspberry Pi’s header:

Signal	GPIO	Physical Pin
TCK	4	7
TMS	17	11
TDI	27	13
TDO	22	15
SRST	24	18
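
Since the peripheral base is the one value that differs between Pi generations, it can help to generate the interface fragment rather than hand-edit it. The sketch below is a hypothetical helper, not part of the PR. The model keys and function name are made up, while the base addresses and pin numbers are the ones listed above.

```python
# Peripheral base addresses as listed in the config comments above.
PERIPHERAL_BASE = {
    "rpi2": 0x3F000000,
    "rpi3": 0x3F000000,
    "rpi4": 0xFE000000,
}

# signal -> (BCM GPIO number, physical header pin), from the table above.
JTAG_PINS = {
    "tck": (4, 7),
    "tms": (17, 11),
    "tdi": (27, 13),
    "tdo": (22, 15),
    "srst": (24, 18),
}


def interface_cfg(model: str) -> str:
    """Render the driver, peripheral base, and pin lines for one Pi model."""
    if model not in PERIPHERAL_BASE:
        raise ValueError(f"unknown model {model!r}")
    lines = [
        "adapter driver bcm2835gpio",
        f"bcm2835gpio peripheral_base 0x{PERIPHERAL_BASE[model]:08X}",
    ]
    for signal, (gpio, header_pin) in JTAG_PINS.items():
        # The config uses BCM GPIO numbers; the header pin is kept as a comment.
        lines.append(f"adapter gpio {signal} {gpio}  ;# header pin {header_pin}")
    return "\n".join(lines)


print(interface_cfg("rpi4"))
```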

SPI Flash Programming with Raspberry Pi

The cl-spifpga-rpi4.cfg config file handles the actual programming:

# Burn an FPGA image onto the SPI ROM

source [find interface/alphamax-rpi.cfg]

source [find cpld/xilinx-xc7.cfg]
source [find cpld/jtagspi.cfg]

init

jtagspi_init xc7.pld $BSCAN_FILE
jtagspi_program $FPGAIMAGE 0

virtex2 refresh xc7.pld

exit

The command to flash a NeTV2 from a Raspberry Pi:

/opt/openocd/src/openocd \
  -c "set BSCAN_FILE /home/k8s/netv2mvp-scripts/bscan_spi_xc7a100t.bit" \
  -c "set FPGAIMAGE /home/k8s/pcileech_netv2_top.bin" \
  -f /home/k8s/netv2mvp-scripts/cl-spifpga-rpi4.cfg
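For scripted or remote flashing, the same invocation can be assembled in Python. This is a sketch under the assumption that the paths match the command above; build_flash_command, missing_inputs and flash are hypothetical helper names, and paths containing spaces would need extra quoting on the Tcl side.

```python
import shlex
import subprocess
from pathlib import Path


def build_flash_command(openocd, bscan_file, fpga_image, cfg):
    """Build the OpenOCD argument list used to burn an image to SPI flash."""
    return [
        str(openocd),
        "-c", f"set BSCAN_FILE {bscan_file}",
        "-c", f"set FPGAIMAGE {fpga_image}",
        "-f", str(cfg),
    ]


def missing_inputs(*paths):
    """Return the paths that do not exist, so we fail before touching the board."""
    return [str(p) for p in paths if not Path(p).is_file()]


def flash(cmd):
    """Run OpenOCD and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_flash_command(
        "/opt/openocd/src/openocd",
        "/home/k8s/netv2mvp-scripts/bscan_spi_xc7a100t.bit",
        "/home/k8s/pcileech_netv2_top.bin",
        "/home/k8s/netv2mvp-scripts/cl-spifpga-rpi4.cfg",
    )
    print(shlex.join(cmd))
    # Call flash(cmd) only once missing_inputs(...) returns an empty list.
```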

The output shows the Macronix SPI flash being programmed:

Info : BCM2835 GPIO JTAG/SWD bitbang driver
Info : clock speed 10000 kHz
Info : JTAG tap: xc7.tap tap/device found: 0x13631093 (mfg: 0x049 (Xilinx), part: 0x3631, ver: 0x1)
Info : [xc7.proxy] Examination succeed
Info : Found flash device 'mac 25l6405' (ID 0x1720c2)
Info : sector 0 took 286 ms
Info : sector 1 took 282 ms
...

Note the device ID difference: 0x13631093 for the 100T part versus 0x13632093 for the 75T. The flash chip is a Macronix MX25L6405 (ID 0x1720c2) versus the Winbond W25Q64FV (ID 0x1740ef) on the Captain DMA boards.
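Both IDs can be unpacked with a few bit shifts. The sketch below follows the standard JTAG IDCODE layout (version in bits 31:28, part number in 27:12, manufacturer in 11:1). For the flash ID it assumes OpenOCD prints the three JEDEC bytes with the manufacturer in the low byte, and that the capacity byte is a power-of-two exponent, which holds for these two parts but is not guaranteed in general. The helper names are mine.

```python
def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into version, part number and manufacturer."""
    return {
        "version": (idcode >> 28) & 0xF,
        "part": (idcode >> 12) & 0xFFFF,
        "manufacturer": (idcode >> 1) & 0x7FF,
    }


# JEDEC manufacturer bytes for the two flash vendors mentioned above.
FLASH_VENDORS = {0xC2: "Macronix", 0xEF: "Winbond"}


def decode_flash_id(flash_id):
    """Decode the flash ID as printed by OpenOCD (manufacturer in the low byte)."""
    manufacturer = flash_id & 0xFF
    memory_type = (flash_id >> 8) & 0xFF
    capacity_code = (flash_id >> 16) & 0xFF
    return {
        "vendor": FLASH_VENDORS.get(manufacturer, "unknown"),
        "memory_type": memory_type,
        # Assumption: capacity byte is log2 of the size in bytes.
        "size_bytes": 1 << capacity_code,
    }


if __name__ == "__main__":
    for idcode in (0x13631093, 0x13632093):
        fields = decode_idcode(idcode)
        print(hex(idcode), {k: hex(v) for k, v in fields.items()})
    for flash_id in (0x1720C2, 0x1740EF):
        print(hex(flash_id), decode_flash_id(flash_id))
```

Decoding 0x13631093 this way gives the same manufacturer 0x049, part 0x3631 and version 0x1 that OpenOCD reports in the log.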

LiteX Builds for NeTV2

If you want to run something other than the original gateware, LiteX has excellent support for the NeTV2. The board definition lives at https://github.com/litex-hub/litex-boards/blob/master/litex_boards/targets/kosagi_netv2.py.

Building a Basic LiteX SoC

Install LiteX following the standard setup, then:

cd litex-boards/litex_boards/targets

# Build for XC7A35T variant with Ethernet
./kosagi_netv2.py --variant=a7-35 --with-ethernet --build

# Or for the 100T variant with PCIe
./kosagi_netv2.py --variant=a7-100 --with-pcie --build

The target supports:

DDR3 SDRAM: Using the K4B2G1646F module with LiteDRAM
Ethernet: RMII PHY via LiteEth
PCIe: x4 Gen2 via LitePCIe
SD Card: Both SPI mode and native SDIO
LED Chaser: For the obligatory blinky demo

Platform Definition Highlights

The platform file defines the IO constraints:

_io = [
    # 50 MHz clock input
    ("clk50", 0, Pins("J19"), IOStandard("LVCMOS33")),
    
    # User LEDs
    ("user_led", 0, Pins("M21"),  IOStandard("LVCMOS33")),
    ("user_led", 1, Pins("N20"),  IOStandard("LVCMOS33")),
    # ...
    
    # SPI Flash for bitstream storage
    ("spiflash", 0,
        Subsignal("cs_n", Pins("T19")),
        Subsignal("mosi", Pins("P22")),
        Subsignal("miso", Pins("R22")),
        Subsignal("wp",   Pins("P21")),
        Subsignal("hold", Pins("R21")),
        IOStandard("LVCMOS33")
    ),
]

The platform also defines the programmer configuration, which now supports both OpenOCD and OpenFPGALoader:

def create_programmer(self, programmer="openocd"):
    if programmer == "openfpgaloader":
        return OpenFPGALoader(cable="digilent_hs2")
    elif programmer == "openocd":
        bscan_spi = "bscan_spi_xc7a100t.bit" if "xc7a100t" in self.device else "bscan_spi_xc7a35t.bit"
        return OpenOCD("openocd_netv2_rpi.cfg", bscan_spi)

Provisioning for fpgas.online

The ultimate destination for these NeTV2 boards is fpgas.online – a community project that provides free remote access to FPGA development boards. The service currently has around 10 FPGA boards connected, each with a webcam pointed at the LEDs so you can see your bitstream running in real-time.

Each board is attached to a Raspberry Pi that handles:

JTAG programming via the GPIO bitbang interface
Webcam streaming for visual feedback
Web-based terminal access for interactive development
Queue management so multiple users can share the hardware

This is where the Raspberry Pi JTAG programming approach really shines – you can reflash gateware remotely without any physical access to the boards. Combined with LiteX’s rapid build cycle, it becomes practical to iterate on FPGA designs from anywhere in the world.

If you want to try FPGA development without buying hardware, head over to https://ps1.fpgas.online/fpgas/ and claim a board. There’s usually one free and ready to use.

Why This Matters

These techniques apply to any Artix-7 board with SPI flash storage:

Reproducibility: Scripts in version control beat clicking through GUIs
Automation: CI/CD pipelines can flash test firmware automatically
Headless operation: A Raspberry Pi in a lab can program boards remotely
Cross-platform: The same OpenOCD configs work on Linux, macOS, and Windows
Tool preservation: When vendors disappear, open source tooling remains
Remote labs: Services like fpgas.online can offer FPGA access to anyone

The BSCAN SPI approach is particularly elegant – it uses a small “helper” bitstream that turns the FPGA’s JTAG interface into a SPI master, allowing you to program the flash chip that normally stores the FPGA configuration. Once programmed, a power cycle or refresh command loads the new bitstream.

Resources

fpgas.online Remote Lab: https://ps1.fpgas.online/fpgas/
Captain DMA Scripts: https://github.com/whatnick/fpga-dma-scripts
NeTV2 RPi4 Scripts: https://github.com/AlphamaxMedia/netv2mvp-scripts/pull/1
LiteX NeTV2 Target: https://github.com/litex-hub/litex-boards/blob/master/litex_boards/targets/kosagi_netv2.py
BSCAN SPI Bitstreams: https://github.com/quartiq/bscan_spi_bitstreams
PCILeech FPGA Project: https://github.com/ufrisk/pcileech-fpga
Video Walkthrough: https://youtu.be/CNQnafoOTCM

The next time you inherit a box of mystery FPGA boards, you’ll know what to do. OpenOCD and a Raspberry Pi can breathe new life into almost any Xilinx 7-series board – no proprietary tools, no Windows VM, no Chinese language skills required.

Feel free to ping me with other interesting DMA or video processing boards you’ve encountered. There’s always more gateware to explore.

Wednesday, December 24, 2025

Seven Years to Kubernetes: A Turing Pi 1 Christmas Miracle

The Seven Year Itch (For Hardware)

Often I buy more hardware than I need. Actually, strike that – I always buy more hardware than I need. It’s a disease, really. This particular affliction manifested in 2018 when I pre-ordered a Turing Pi 1 because I had convinced myself that building a Raspberry Pi cluster would be the perfect way to learn Kubernetes.

It was not the perfect way.

Little did I realize that it would take me seven years to gather all seven Raspberry Pi Compute Module 3+ boards and finally bootstrap a k3s cluster. In that time:

Kubernetes went through approximately 47 major versions
The Raspberry Pi 4 and 5 came out (and experienced their own chip shortages)
I discovered my Turing Pi board had a faulty ethernet switch
I aged visibly (just look at my GitHub profile and videos from recent conference presentations)

The Recipe for Homelab Kubernetes Suffering

In this write-up, I’ll outline what it actually takes to set up a Raspberry Pi 3+ cluster in 2025. Consider this a cautionary tale wrapped in a tutorial. I’ll probably resell this now-functioning cluster to another masochist – er, enthusiast – and use the recouped capital to buy something newer that will sit on my shelf for another seven years.

Step 1: Acquire the Base Board

The Turing Pi 1 was a great option back in the day. It’s a mini-ITX form factor board that accepts up to 7 Raspberry Pi Compute Modules in SODIMM slots. The on-board gigabit ethernet switch was supposed to be the killer feature – no external networking required!

Pro tip: Make sure the on-board switch actually works. Test it before you commit to this path. Mine didn’t, which I discovered approximately 6 years too late.

Step 2: Collect Your Compute Modules (Like Pokemon, But Expensive)

You’ll need Raspberry Pi Compute Module 3+ boards. The Turing Pi 1 can handle up to 7 of them. I sourced mine from Mouser Electronics, though availability has been… variable… over the years.

I really wish there were more alternatives in the SODIMM compute module format. If you’re in the business of making one with a newer processor and more RAM, let’s talk. Seriously. My DMs are open.

Step 3: Flash the OS (The Easy Part, They Said)

The Compute Modules have onboard eMMC storage, which is the preferred boot device. Trying to use SD cards will lead to disappointment, inconsistent boots, and existential questioning of your life choices.

Here’s the gear you’ll need:

A Compute Module IO Board - Something like the Waveshare CM3/CM3+ IO Board or the official Raspberry Pi IO Board to put the module in USB mass storage mode
rpiboot/usbboot - The tool that makes the eMMC appear as a USB drive
Raspberry Pi Imager - The official tool for flashing OS images

Critical step: Bake in your SSH public key during the imaging process. This will save you from having to find 7 spare HDMI cables and keyboards. The Pi Imager has a settings gear icon that lets you configure hostname, SSH keys, and WiFi – use it.


# Generate an SSH key if you don't have one

ssh-keygen -t ed25519 -C "kubernetes-cluster"

Step 4: Network Configuration (Here Be Dragons)

Plug in all the modules and fire up the on-board Turing Pi ethernet. If you’re lucky, the on-board network works and you can access all the nodes. Marvel at how easy this was.

If you’re me, you’ll discover the switch is dead and enter the five stages of homelab grief:

Denial: “It’s probably just a loose connection”
Anger: Unprintable, plus emails to Turing Pi support, only to learn that the board is End-of-Life and unsupported
Bargaining: “Maybe I only need 4 nodes anyway”
Depression: stares at pile of unused compute modules
Acceptance: “I guess I’m buying USB ethernet adapters”

The Workaround: Get a bunch of USB-to-Ethernet adapters like the TP-Link UE300 and wire them into an external switch.

Unfortunately, only 4 of the compute modules have their USB ports exposed on the Turing Pi 1. For the other 3, you’ll need to do some creative soldering to expose the USB D+/D- and power lines. That’s just 12 more flying wires on the board. What could go wrong?

USB Ethernet Adapters and Working cluster

Step 5: The Case Mod (Optional But Satisfying)

I got a nice acrylic case to put it all in. It has a fan connection on top for cooling, which you’ll need because 7 Pis generate surprising heat.

There were no extra slots for the 3 additional USB connections I needed. But I have a Dremel, two weeks of Christmas holidays, and absolutely no fear of voiding warranties.

Step 6: Actually Installing Kubernetes (The Easy Part, For Real This Time)

With SSH keys baked in, installing k3s is delightfully straightforward using k3sup (pronounced “ketchup”, because of course it is).


# Install k3sup

curl -sLS https://get.k3sup.dev | sh

sudo install k3sup /usr/local/bin/



# Bootstrap the first node as the server

k3sup install --ip 192.168.1.101 --user k8s



# Join additional nodes as agents

k3sup join --ip 192.168.1.102 --server-ip 192.168.1.101 --user k8s

k3sup join --ip 192.168.1.103 --server-ip 192.168.1.101 --user k8s

# ... repeat for remaining nodes

k3sup SSHes into each machine, downloads the necessary bits, and bootstraps a low-resource-friendly cluster with SQLite (or embedded etcd) as the datastore. It’s genuinely magical compared to kubeadm.
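Rather than typing the join command six times, the command list can be generated. This is a minimal sketch that reuses only the k3sup flags shown above. The addresses beyond 192.168.1.103 are a hypothetical continuation of the numbering, and k3sup_commands is my own helper name.

```python
import shlex


def k3sup_commands(server_ip, agent_ips, user="k8s"):
    """Return the k3sup invocations for one server plus any number of agents."""
    commands = [["k3sup", "install", "--ip", server_ip, "--user", user]]
    for ip in agent_ips:
        commands.append(
            ["k3sup", "join", "--ip", ip, "--server-ip", server_ip, "--user", user]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical addressing: .101 is the server, .102 to .107 are the agents.
    agents = [f"192.168.1.{n}" for n in range(102, 108)]
    for cmd in k3sup_commands("192.168.1.101", agents):
        print(shlex.join(cmd))
```

Printing the commands first, instead of running them directly, gives you a chance to check the node list before k3sup starts opening SSH sessions.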

Reality check: After the k3s install, the Pi 3 doesn’t have much headroom left for actually running applications. We’re talking about 1GB of RAM shared between the OS, kubelet, and your workloads. It’s a great testbed for learning the k3s API and running ARM binaries natively, but don’t expect to run your company’s microservices on it.

K3Sup based cluster setup

The Final Result

After seven years of procrastination, hardware hunting, debugging dead ethernet switches, creative soldering, and Dremel work, I finally have a working 7-node Kubernetes cluster.

It also serves as a rather festive Christmas decoration with the green PCB and red blinking LEDs. Very on-brand for the holidays.

What’s Next?

Hopefully I’ve been a good boy this year and Santa will bring me some newer clustering hardware to play with. The Turing Pi 2.5 looks tempting with its support for CM4, Jetson, and the Turing RK1 modules.

But knowing me, I’ll buy it in 2025 and finally get it working by 2032.

Hardware Shopping List

For anyone brave enough to follow this path, here’s what you’ll need:

Item	Link	Notes
Turing Pi 1 Board	Turing Pi	Check if ethernet works!
Raspberry Pi CM3+ (x7)	Mouser	8GB/16GB/32GB eMMC options
CM IO Board for flashing	Waveshare	Or official RPi IO Board
USB Ethernet Adapters	Amazon	Just in case
Ethernet Switch	Your choice	8+ ports recommended
Acrylic Case	Various	With fan for cooling

Software & Tools

Raspberry Pi Imager - OS flashing tool
rpiboot/usbboot - For eMMC flashing
k3s - Lightweight Kubernetes distribution
k3sup - k3s installer via SSH
etcd - Distributed key-value store

Feel free to ping me with your own homelab Kubernetes horror stories. Misery loves company.

Friday, December 12, 2025

Refactoring the Austender Scraper: From Colly to OCDS

The AusTender analyser started life as a straight HTML scraper built with Colly, walking the procurement portal page by page. It worked, but it was always one redesign away from a slow death: layout shifts, odd pagination edges, and the constant need to throttle hard so I could sleep at night.

Then the Australian Government exposed an Open Contracting Data Standard (OCDS) API. That changed the whole game. Instead of scraping tables and div soup, I can treat the portal like a versioned data feed.

Part of why I care: I am kind of fascinated by government spending as a system. Budgets read like a mixture of engineering constraints and political storytelling, and I keep wanting to trace the thread from “budget line item” to “actual contract award” without hand-waving. The Treasurer’s Final Budget Outcome release (2022-23, “first surplus in 15 years”) is exactly the sort of headline that makes me want to drill down into the mechanics: Final Budget Outcome shows first surplus in 15 years.

So the redesign in austender_analyser does three things differently:

Fetch via OCDS, not HTML: Reduce breakage by consuming the API’s canonical JSON, not scraped pages.
Persist to Ducklake: Store releases, parties, and contracts in Ducklake so you can query locally without rerunning the whole pipeline. This does not quite work yet; I am treating it as a learning exercise with Ducklake. It is much easier to learn on a real problem than on toy demo datasets.
Treat caching as optional: Counterintuitively, the local cache is sometimes slower than pulling fresh data. Ducklake’s startup and query overhead can outweigh a simple, parallelized upstream call. The new design keeps the cache but makes it opt-in and measurable.

If you prefer Python, the upstream API team ships a reference walkthrough in the austender-ocds-api repo (see also the SwaggerHub docs and an example endpoint like findById).

Why move off Colly?

Scraping HTML is like doing accounting by screenshot. OCDS is the ledger export.
Less breakage: OCDS is documented and versioned; DOM scraping is brittle.
Faster iteration: You model on structured data immediately, not after a fragile extraction layer.
Clear rate behavior: You can respect API limits without guessing at dynamic page loads.

Why keep Ducklake in the loop?

Ducklake is the reproducibility knob. It lets me freeze a snapshot, replay transforms, and run offline queries when I am iterating on analysis (or when the upstream is slow, or when I just do not want to be a bad citizen).

But caches are not free. Ducklake has startup and query overhead, and that can be slower than simply pulling fresh JSON in parallel. So the pipeline treats Ducklake like a tool, not a religion: measure the latency, pick the faster path, keep an escape hatch when you need repeatability.
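The "measure, then pick" idea can be sketched in a few lines. This is not the analyser's actual code (which is not shown here), just an illustration: fetch_upstream and fetch_cache are hypothetical zero-argument callables, and the mode names are my own.

```python
import time


def timed(fn):
    """Call fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def choose_source(fetch_upstream, fetch_cache, mode="auto"):
    """Return (source_name, result, elapsed_seconds).

    mode="upstream" bypasses the cache, mode="cache" forces it, and
    mode="auto" runs both once and keeps whichever finished faster.
    """
    sources = {"upstream": fetch_upstream, "cache": fetch_cache}
    if mode in sources:
        result, elapsed = timed(sources[mode])
        return mode, result, elapsed
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    timings = {name: timed(fn) for name, fn in sources.items()}
    winner = min(timings, key=lambda name: timings[name][1])
    result, elapsed = timings[winner]
    return winner, result, elapsed


if __name__ == "__main__":
    name, _, seconds = choose_source(lambda: "fresh", lambda: "cached")
    print(name, f"{seconds:.6f}s")
```

Running both paths on every call would defeat the purpose, so in practice the "auto" probe would run occasionally and its verdict would be remembered for the rest of the session.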

Current flow

Pull OCDS releases in batches, keyed by release date and procurement identifiers.
Normalize the JSON into Ducklake tables (releases, awards, suppliers, items).
Emit lightweight summaries for quick diffing between runs.
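The three steps above can be sketched without any network or database dependency. The sample record is hypothetical and only follows the general OCDS release shape (ocid, awards, suppliers, value); the real AusTender payload may place values elsewhere. The items table is omitted for brevity, and the row dictionaries stand in for whatever Ducklake tables they would be written to.

```python
from datetime import date, timedelta


def date_windows(start, end, days=7):
    """Yield (window_start, window_end) pairs covering [start, end) in batches."""
    cursor = start
    step = timedelta(days=days)
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def normalize_release(release):
    """Flatten one OCDS release into a release row, award rows and supplier rows."""
    ocid = release.get("ocid")
    release_row = {"ocid": ocid, "release_id": release.get("id"), "date": release.get("date")}
    award_rows, supplier_rows = [], []
    for award in release.get("awards", []):
        value = award.get("value") or {}
        award_rows.append({
            "ocid": ocid,
            "award_id": award.get("id"),
            "amount": value.get("amount"),
            "currency": value.get("currency"),
        })
        for supplier in award.get("suppliers", []):
            supplier_rows.append({
                "ocid": ocid,
                "award_id": award.get("id"),
                "supplier_id": supplier.get("id"),
                "name": supplier.get("name"),
            })
    return release_row, award_rows, supplier_rows


def summarize(releases):
    """Lightweight run summary that is cheap to diff between runs."""
    award_count = 0
    total = 0.0
    for release in releases:
        _, awards, _ = normalize_release(release)
        award_count += len(awards)
        total += sum(a["amount"] or 0 for a in awards)
    return {"releases": len(releases), "awards": award_count, "total_amount": total}


# Hypothetical sample record, for illustration only.
SAMPLE_RELEASE = {
    "ocid": "ocds-example-0001",
    "id": "release-1",
    "date": "2025-12-01T00:00:00Z",
    "awards": [{
        "id": "award-1",
        "value": {"amount": 125000.0, "currency": "AUD"},
        "suppliers": [{"id": "supplier-1", "name": "Example Pty Ltd"}],
    }],
}

if __name__ == "__main__":
    print(list(date_windows(date(2025, 1, 1), date(2025, 1, 15))))
    print(summarize([SAMPLE_RELEASE]))
```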

Lessons learned

A stable API beats heroic HTML scraping almost every time, even in times of AI and Firecrawl (https://www.firecrawl.dev/).
Caches are not free; measure them. Sometimes stressing the upstream lightly is faster and still acceptable within published rate limits.
Keep escape hatches: allow forcing cache use, bypassing it, and snapshotting runs for reproducibility.

Next steps. Going deeper: tighten validation against the OCDS schema, add minimal observability (latency histograms for API vs cache), and ship a “fast path” mode that only hydrates the fields needed for high-level spend dashboards. Going broader: find sites and build API and web aggregators for Australian state tender sites (e.g. VicTender) and international ones.

Saturday, December 6, 2025

Solar Ceilings and Compounding Dreams

It is fashionable to wave away physical constraints with vague references to solar abundance and human ingenuity. Yet every balance sheet eventually meets a balance of energy. Solar photons may shower Earth with roughly 170,000 terawatts, but financial markets expect growth that compounds on top of itself forever. The math linking those stories rarely appears in the same paragraph—so let’s put them together.

Setting the Stage

I keep coming back to Tom Murphy’s dialogue in Exponential Economist Meets Finite Physicist. In Act One, Murphy plots U.S. energy use from 1650 onward and it traces a remarkably straight exponential line at ~3% per year. Economists in the conversation shrug; after all, 2–3% feels modest. But compounding at even 2.3% means energy demand multiplies by ten every century, and at 3% it multiplies by nearly twenty. Our economic models implicitly assume something even more optimistic: 8–10% returns in equity markets, pension targets, and venture decks, without asking what energy supply function supports that.

Thermodynamic Guardrails

Murphy distills the second law of thermodynamics into plain language:

“At a 2.3% growth rate (conveniently chosen to represent a 10× increase every century), we would reach boiling temperature in about 400 years… Even if we don’t have a name for the energy source yet, as long as it obeys thermodynamics, we cook ourselves with perpetual energy increase.”

That thought experiment matters less for the literal 400-year timer and more because it shows energy growth must decelerate to avoid turning Earth into a heat engine. Solar panels, fusion, space mirrors … pick your technology. The waste heat still has to radiate away. We cannot spreadsheet, app and AI our way around Stefan–Boltzmann and Black Body radiation.
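The growth figures quoted above are easy to check. A small sketch, using annual compounding (Murphy's 2.3% is a rounded figure, so the results land near, not exactly on, a factor of ten); the function names are my own.

```python
import math


def rate_for_factor(factor, years):
    """Annual growth rate that multiplies a quantity by `factor` over `years`."""
    return factor ** (1.0 / years) - 1.0


def factor_after(rate, years):
    """Multiplication factor after compounding `rate` annually for `years`."""
    return (1.0 + rate) ** years


def doubling_time(rate):
    """Years needed to double at a given annual growth rate."""
    return math.log(2.0) / math.log(1.0 + rate)


if __name__ == "__main__":
    print(f"rate for 10x per century: {rate_for_factor(10, 100):.4%}")
    print(f"factor after a century at 2.3%: {factor_after(0.023, 100):.2f}")
    print(f"factor after a century at 3.0%: {factor_after(0.03, 100):.2f}")
    print(f"doubling time at 2.3%: {doubling_time(0.023):.1f} years")
```

The gap between 2.3% and 3% looks small on a slide, but over a century it is roughly the difference between a tenfold and a twentyfold increase.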

Solar Arithmetic vs Demand Curves

Let’s grant the optimists a heroic build-out: cover roughly 0.3% of Earth’s land area (about 500,000 km²) with 20%-efficient photovoltaic arrays, assume a generous 200 W/m² of average sunlight reaching the panels, and we net roughly 20 TW, about the entire human primary energy demand today. That is fantastic news for decarbonization, but it is not a blank check for compounding GDP. If demand keeps growing at 3%, we would need 20 TW × (1.03)ⁿ in perpetuity. Within 250 years we’d be trying to harvest tens of thousands of terawatts, which means orders of magnitude more land, materials, storage, and transmission than our initial miracle project. Solar abundance is real; solar infinity is fiction.
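To make the 20 TW × (1.03)ⁿ curve concrete, here is a short sketch that projects demand and asks when it would reach the roughly 170,000 TW of sunlight quoted at the top of this post. The constants are taken from the text, the function names are mine, and the model is nothing more than constant-rate compounding.

```python
import math

SOLAR_INPUT_TW = 170_000   # approximate solar power reaching Earth, from the text
BASE_DEMAND_TW = 20        # approximate primary energy demand today, from the text


def demand_tw(years, rate=0.03, base=BASE_DEMAND_TW):
    """Projected demand in TW after `years` of constant compounding growth."""
    return base * (1.0 + rate) ** years


def years_to_reach(target_tw, rate=0.03, base=BASE_DEMAND_TW):
    """Years until compounding demand reaches `target_tw`."""
    return math.log(target_tw / base) / math.log(1.0 + rate)


if __name__ == "__main__":
    for n in (50, 100, 250):
        print(f"after {n:3d} years: {demand_tw(n):,.0f} TW")
    print(f"years until demand equals total solar input: {years_to_reach(SOLAR_INPUT_TW):.0f}")
```

At 3% the projection passes 30,000 TW by year 250 and reaches the entire solar input a little after year 300, which is the sense in which solar infinity is fiction.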

Finance Is an Energy IOU

Money is a claim on future work, and work requires energy. When pensions assume 7–8% annual returns, when startups pledge 10× growth, and when national budgets bake in permanent productivity gains, they are effectively promising that future societies will deliver 2–3 doublings of net energy per century. If we instead hit a solar plateau, because land, materials, or social license cap expansion, those financial promises become unmoored. We can pretend that virtual goods, algorithmic trading, or luxury desserts (to borrow Murphy’s Act Four anecdote) deliver infinite utility without added energy, but the chefs, coders, and data centers still eat, commute, and cool their CPUs, GPUs, and tensor processors. The intangible economy rides on a very tangible energy base.

Rewriting the Business Plan

Accepting a solar ceiling does not doom us to stagnation. It just forces different design constraints:

grow quality, not quantity: prioritize outcomes per unit energy … do proof of useful work rather than roll the dice and gamble.
align finance with expected energy supply rather than mythical exponentials … and I am not talking of wasting energy on crypto.
treat efficiency gains as buying time, not as a perpetual motion machine … if you learnt enough physics in high school to reject the perpetual motion machine, but have been lulled into perpetual 8% returns from the finance markets, there is a serious schizophrenia issue.
embed thermodynamic literacy in economic education so debates start from the same math.

Murphy ends his essay noting that growth is not a “good quantum number.” It is not conserved. Our job is to craft institutions, portfolios, and narratives that can thrive when net energy flattens, because physics already told us that day will arrive long before our spreadsheets hit overflow errors.

Darwin 2022 - Ruminations Compendium

Collected reflections from the July 2022 Darwin trip, gathered so that a narrative of adaptation, organisational change, and expansion can live in a single place.

July 19 – Lemmings And Launchpads

There is no exception to the rule that every organic being naturally increases at so high a rate, that if not destroyed the earth would soon be covered by the progeny of a single pair. Even slow breeding man has doubled in twenty five years, and at this rate in a few thousand years there would literally be no standing room for his progeny. – Charles Darwin

Like the lemming marching and diving into the ocean to self‑regulate, humanity plunges itself into vices of its own creation: alcohol, drugs, violence, and greed. Perhaps the next plunge is into the real ocean or into the vacuum of space, chasing more room in which to stand or float. Failure in harsh environments creates room by removing weaker individuals, or greater resilience by rewarding the most adaptable. Colonial Australia itself was founded on such selection—the most adaptable individuals and the strictest rule enforcers reshaped an unforgiving frontier.

July 20 – Organisational Evolution In Flight

Seeing that a few members of such water-breathing classes as the Crustacea and Mollusca are adapted to live on the land, and seeing that we have flying birds and mammals, flying insects of vast diversified types, and formerly had flying reptiles, it is conceivable that flying fish, which now glide far through air, slightly rising and falling by the aid of their fluttering fins, might have been modified into perfectly winged animals. – Charles Darwin

The ability to skim over water for a few metres comes from external tweaks, but the ability to cross the Pacific like a bar-tailed godwit comes from internal rewiring: hollow bones, high metabolism, and a brain with a built-in compass. Organisations face the same distinction. A brief digital-transformation spasm can bolt on an app or a website, yet sustaining that flight demands internal metamorphosis and a sense of direction from leadership. Caterpillars become butterflies through wholesale change, and so must companies that aspire to be more than flying fish.

July 23 – Questions For The Corporate Naturalist

Where are the transitional forms?
Organisations with no lines on the org chart operate as pure adhocracy. Hidden behind corporate veils, they are like pupae in cocoons, waiting to emerge in a more defined shape.
How can specialised organs evolve?
Marketing machines, technology muscle, sales teeth, enterprise-planning backbone, analyst frontal lobes—each department is an organ honed for a specific survival task.
Is behaviour or instinct inheritable?
Culture answers this. The rituals, stories, and incentives that survive layoffs and leadership changes become the genetic code of the firm.
Why are some species sterile when crossed, while others are fertile?
Some mergers and acquisitions thrive; others fail because the two organisational genomes cannot integrate and diverge instead of hybridising.

July 24 – Conquering New Lands

He who believes in the struggle for existence and in the principle of natural selection, will acknowledge that every organic being is constantly endeavouring to increase in numbers; and that if any one being vary ever so little, either in habits or structure, and thus gain an advantage over some other inhabitant of the country, it will seize on the place of that inhabitant, however different it may be from its own place. – Charles Darwin

International expansion is a contest for ecological niches. Bringing hard‑won optimisations from one country to another is a bid to displace incumbents. The organisations that vary—by process, by product, by mindset—claim new ground first.