pablo formoso FUTURE / DATA & AI
ES EN Streaming –:–:– UTC

A Hundred Children in 101 Milliseconds: the trick forkd stole from mitosis

forkd applies the old fork() call to whole virtual machines: a hundred isolated sandboxes in 101 ms by cloning a warm parent. The idea, explained calmly — with biology.

Imagine that every time you wanted to boil an egg, you had to build the entire kitchen first. Put up the walls, run the gas line, install the stove, wait for everything to be ready… and then, when you’re done, demolish all of it. Absurd, right? Well, that’s roughly what an AI agent does today every time it needs to run a snippet of code safely: it boots a whole computer from scratch, spins up an operating system, imports the heavy libraries, runs three lines… and throws it in the bin a second later. Multiply that by a hundred parallel requests and you’ll understand why the bill —in time and in money— goes through the roof. And here comes the red pill: there’s a project that has decided to stop building disposable kitchens and start doing what every cell in your body has done since you were an embryo. Divide.

That project is called forkd, and its thesis fits in a single line on its repo’s front page: “fork() for AI agent microVMs.” If that sentence means nothing to you yet, don’t worry: over the next few minutes we’re going to take it apart piece by piece, because underneath it lies one of the most elegant ideas I’ve seen in infrastructure all year. An idea that, like almost everything good in computing, biology had been using for millions of years before we got around to naming it.

The cold-start tax

Let’s start with the problem, because the solution isn’t impressive until you feel it.

Modern AI agents —the ones that write code, browse the web, solve tasks step by step— don’t sit around thinking. They go out into the world and execute. And when an agent runs code it generated itself, there’s a golden rule: that code must not touch your machine. It has to run inside a sealed box, a sandbox, a padded little room where, if things blow up, nothing else gets hurt. So far, common sense.

The problem is scale. A serious agent doesn’t open one box: it opens dozens or hundreds at once. Every conversational turn, every tool it calls, every branch of reasoning it wants to explore in parallel wants its own clean box. And each box, traditionally, is born the same way: booting an operating system from zero, importing numpy, importing torch, maybe loading a model into memory. That welcome ritual —what the jargon calls a cold start— costs hundreds of milliseconds per box. Sometimes seconds.

Think of it as a bridge toll. Paying it once doesn’t hurt. Paying it a hundred times in a row, for a hundred identical cars all going to the same place to do the same thing, is just plain dumb. And yet that’s how nearly all the agent infrastructure that exists today works: a hundred cold starts, a hundred copies of the same import torch, a hundred tolls.

fork(), or the most biological verb in computing

Here it’s worth a pause for history, because the solution isn’t new. It’s been with us since the seventies.

At the heart of any Unix system —and therefore Linux, and therefore most of the internet you use— lives a system call with a beautiful name: fork(). To branch. When a program calls fork(), it doesn’t restart or rebuild itself: it divides. From a parent process a child process is born that is, at the instant of birth, an exact copy. Same memory, same variables, same mental state, so to speak. Two processes where there was one, identical as twins just separated.

And how does it afford the luxury of copying all that memory at once without taking forever? With a trick that is, to me, one of the loveliest ideas in computing: copy-on-write. The system duplicates nothing up front. Parent and child share the same memory pages, physically the same ones, and only when one of them tries to modify a page does the kernel make —right then, and only for that page— a private copy. As long as nobody writes, everything is shared and free. The copy gets paid for, page by page, only when it’s actually needed.

If this sounds familiar, it’s because your body has been doing it forever. A cell doesn’t reproduce by building a daughter from scratch, atom by atom, mitochondrion by mitochondrion. It divides: it copies what it already has and splits the difference. Mitosis is, at bottom, a fork() with a membrane. And the clever laziness of copy-on-write —spending no energy until divergence demands it— is exactly the kind of stingy efficiency that evolution rewards. Nature never builds the whole kitchen to boil an egg. It copies the kitchen it already has and walks off.

forkd’s move: a stem cell that’s already warm

And now, finally, forkd’s play: take this good old fork() and apply it not to a tiny process, but to an entire virtual machine.

The plan has three beats. First, you boot a single virtual machine —the parent— and let it warm up to its heart’s content: import Python and all its heavy libraries, load whatever model you need, let the JIT compiler get up to speed. All that expensive work happens once. Second, when the parent is warm and ready, you pause it and take a photo of its memory: a snapshot. Third, when the requests arrive, forkd boots nothing cold. It launches N child processes, and each child maps the parent’s memory image in private mode —that MAP_PRIVATE you’ll see in the docs— so the kernel applies copy-on-write at the page level. The children share the parent’s warm memory until they start to diverge.

The result? The numbers the project itself reports are breathtaking: a hundred isolated children in about 101 milliseconds, each consuming 0.12 MiB of extra RAM —because almost all of their memory is borrowed from the parent. Compare that to cold-booting a hundred Firecracker machines: 759 milliseconds and 84 MiB a head. This isn’t an incremental improvement; it’s a different league.

And here’s the detail that makes an engineer raise an eyebrow. Normally, in this business, you pick one of two things: either fork()-style cloning speed, or real isolation. The latter means KVM: a hardware boundary, policed by the processor, not a flimsy software partition like the one separating Docker-style containers. forkd gives you both at once. Each child is a microVM with its own Linux kernel and its own wall of hardware —built on Firecracker, by the way, the same technology behind AWS Lambda— and yet it’s born almost as fast as a fork(). Having both at the same time is precisely the thing you normally can’t have.

What nature knew first

Let me go back to biology for a moment, because the analogy carries more weight than it seems, and because I think that’s where the real beauty of all this lives.

A warm stem cell —metabolically active, all its machinery running— doesn’t need its daughters to reinvent life. It hands them the complete apparatus, already assembled: ribosomes running, enzymes in place, the engine on. The daughter starts up already in motion. That’s exactly forkd’s warmed parent: a machine that has already done the import torch, already loaded the model, already got the interpreter lukewarm. The children don’t inherit a blueprint to build from; they inherit the building, already standing.

And copy-on-write is the unmistakable signature of how nature manages resources: don’t spend until there’s no other choice. Two daughters share the same machinery while they’re doing the same thing; only when one takes its own path —when it writes its difference— does it pay the cost of that divergence, and only for that part. It’s metabolic thrift raised to a design principle.

The project adds one more gesture that rounds the metaphor off: it’s called BRANCH, and it lets you pause a machine that’s already working, photograph its state mid-task, and fork it in ~150 milliseconds. You don’t clone the parent before it begins: you clone it mid-thought. A cell that divides not before acting, but at the very moment it’s doing something, passing on to its daughters not just the machinery but the half-finished task. If you don’t find that gorgeous, I don’t know what to tell you.

There’s one figure that condenses the whole value proposition into a single number. Running a trivial operation by reusing the parent’s already-warm Python takes 1 millisecond. Doing it by spinning up a cold process that has to re-import numpy takes 96. That 96× factor isn’t marketing: it is, literally, the difference between copying the kitchen and rebuilding it. The entire idea, distilled into a ratio.

Where the metaphor breaks (the honest counterpoint)

It would be dishonest to sell you this as cost-free magic, so let’s step down from the enthusiasm for a second.

The most fundamental limitation is beautiful precisely because it’s biological: copy-on-write requires parent and child to live on the same machine. You can’t stretch one cell’s membrane across two separate bodies. If the child lands on a different server, “copy-on-write” degenerates into “copy-everything-over-the-network,” and the magic evaporates. Today forkd is single-host, and jumping to multiple nodes is the big problem it has ahead of it.

The rest are the aches of youth. The project openly declares itself alpha: internal formats may change, it hasn’t passed a third-party security audit, and it lives off a single maintainer —what we call a bus factor of one, that uncomfortable number measuring how many people would have to be hit by a bus for the project to die with them. None of this invalidates the idea. But it marks the difference between “this is fascinating, let’s study it and test it on a bench” and “let’s build critical production on top of it.” The first, yes. The second, not yet.

The future being incubated

Let’s swallow the red pill whole and look a little further out.

If this primitive goes mainstream —and the economics push hard in that direction— the substrate AI runs on is going to start looking a lot less like a tidy data center and a lot more like a Petri dish. Picture an agent that, faced with a hard problem, doesn’t answer with an answer: it divides into a thousand versions of itself, each exploring a different branch of the reasoning, each in its own sealed microVM, all born from the same warm parent in the blink of an eye. Most will die in milliseconds, pruned the moment their branch stops looking promising. A few will survive to merge into the final answer.

Computation as a swarm. As a colony. Thousands of ephemeral selves that are born, diverge, get evaluated and discarded faster than it takes you to read this sentence. Up close it’s wildly efficient and, frankly, elegant. Take a step back and it has something vertiginous about it: a tide of disposable minds germinating and dying in silence inside the machine, with no one to mourn a single one. It’s not a dystopia of robots with red eyes. It’s something subtler and stranger: life —its grammar of copy, diverge, discard— reappearing inside silicon because it turns out it was, simply, the most efficient way to do things. Nature was right. It always was. It’s just that now we’re running it at a hundred per 101 milliseconds.

What I’m taking home

Three ideas, in case this is all you keep:

The best infrastructure ideas are usually old biology ideas. fork(), copy-on-write, snapshots of a warm parent… none of this is new under the sun. It’s mitosis by another name. When a technical solution reminds you of something a cell does, pay attention: you’re probably onto something.

The cold start was a tax we paid out of habit. forkd doesn’t invent new physics; it notices that rebuilding the kitchen a hundred times was a silliness we accepted without thinking. A lot of real innovation is exactly this: looking at a cost everyone treats as inevitable and asking, “but why?”

A brilliant idea is not the same as a ready foundation. forkd is the cleanest thing out there for cloning machines while they’re warm, and at the same time it’s alpha, single-author and single-host. Both are true at once. Treat it as what it is —a first-rate architectural inspiration and a promising technical pilot— not as the bedrock to build the cathedral on. Not yet.

The cell has spent a billion years proving that dividing while warm beats being reborn cold. That it took us until 2026 to copy the trick onto a virtual machine says less about how hard the problem is and more about how slowly we sometimes look at what nature already solved.

With the red pill well and truly swallowed.


Sources:

Pablo Formoso
author

Pablo Formoso

Field notes from the intersection of data, AI, and applied philosophy.

posts
38
from
2024

Leave a Reply

Your email address will not be published. Required fields are marked *