Virtualization, microVMs and Three-Headed Monkeys (Sergio Lopez's blog, https://slp.prose.sh)

Finding a Wasm Runtime Unikernel for libkrun (2022-09-27)
https://slp.prose.sh/finding-a-wasm-runtime-unikernel-for-libkrun

<p>There's this interesting idea of adding support for running Wasm/WASI
payloads in <a href="https://github.com/containers/libkrun" rel="nofollow">libkrun</a>, which is
something we could easily achieve by simply embedding a Wasm runtime,
statically built for Linux, into <code>initrd</code>.</p>
<p>Now, the problem with this approach is that, despite having a
payload (the Wasm runtime) with a well-known behavior, <strong>we would
still be using a (built with a minimal config, but otherwise complete)
Linux kernel, while only needing a small amount of its
functionality</strong>. In other words, the workload's
<a href="https://en.wikipedia.org/wiki/Trusted_computing_base" rel="nofollow">TCB</a> would not
be optimal.</p>
<p>But, <strong>what if the Wasm runtime was also the kernel</strong>?</p>
<hr>
<h1 id="wait-do-you-really-need-virtulization-for-running-a-wasm-workload"><a class="anchor" href="#wait-do-you-really-need-virtulization-for-running-a-wasm-workload" rel="nofollow">#</a> Wait, do you really need Virtualization for running a Wasm workload?</h1>
<p>Yes and no. In most cases, no. <strong>The isolation provided by the Wasm
runtime</strong>, <strong>combined with container isolation</strong> (namespaces, cgroups,
SELinux...) <strong>for the runtime itself, provides an excellent degree of
security</strong>.</p>
<p>But there's a scenario where Virtualization is not optional, and
that's <strong>when you want to protect the workload with Confidential
Computing technologies</strong> such as <a href="https://developer.amd.com/sev/" rel="nofollow">SEV</a>
or
<a href="https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html" rel="nofollow">TDX</a>,
as both of them are built on top of the existing Virtualization
capabilities provided by the hardware.</p>
<hr>
<h1 id="choosing-a-unikernel--wasm-runtime-combo"><a class="anchor" href="#choosing-a-unikernel--wasm-runtime-combo" rel="nofollow">#</a> Choosing a Unikernel + WASM Runtime combo</h1>
<p>The first idea that came to my mind was to use
<a href="https://github.com/hermitcore/rusty-hermit" rel="nofollow">RustyHermit</a>, which is
supported as a target by the Rust toolchain, to build a Rust-based
runtime, such as <a href="https://wasmer.io/" rel="nofollow">Wasmer</a> or
<a href="https://wasmtime.dev/" rel="nofollow">Wasmtime</a>.</p>
<p>After giving it a quick try, I noticed that, with both runtimes,
<strong>a number of dependencies include platform-dependent code</strong>
that would need to be ported to <code>RustyHermit</code>. Since I didn't
have much time to invest in this experiment, I decided to <strong>look for a
simpler solution</strong>.</p>
<p>The simplest option would probably have been
<a href="http://osv.io/" rel="nofollow">OSv</a>. <code>OSv</code> is able to run unmodified, dynamically
linked Linux binaries by playing a cool trick in which <strong>its
linker resolves the symbols of some well-known libraries to
kernel-provided functions</strong>, so it still behaves like a unikernel.</p>
<p>While this one was tempting, the main goal of this experiment was to
find out how small the TCB can get for this use case, so <code>OSv</code>'s approach
wasn't really a good fit, <strong>since you still end up with a potentially
larger kernel than you really need</strong>.</p>
<p>Finally, I came across <a href="https://unikraft.org/" rel="nofollow">Unikraft</a>, which really
does behave like a <a href="https://en.wikipedia.org/wiki/Operating_system#Library" rel="nofollow">library
OS</a>, exactly
what I was looking for in the context of this experiment. It's
also well documented, and the project <a href="https://github.com/unikraft/unikraft" rel="nofollow">looks pretty alive on
GitHub</a>.</p>
<p>Now we just need <strong>a Wasm runtime written in C to pair it up with</strong>
<code>Unikraft</code>.</p>
<hr>
<h1 id="the-low-hanging-fruit"><a class="anchor" href="#the-low-hanging-fruit" rel="nofollow">#</a> The low-hanging fruit</h1>
<p>There's certainly <a href="https://github.com/appcypher/awesome-wasm-runtimes" rel="nofollow">quite a number of Wasm runtimes out
there</a>, but the
one that first caught my attention was
<a href="https://github.com/wasm3/wasm3" rel="nofollow">Wasm3</a>, as it's small, simple, and
written in C. The build process is so simple that they even provide a
one-liner to build it manually with <code>gcc</code>, so identifying which
source files and headers needed to be built wasn't going to be a problem.</p>
<p>No sooner said than done: in very little time I had it running a
simple Wasm <code>hello world</code> program:</p>
<pre><code>[ 0.000000] Info: [libkvmplat] <setup.c @ 472> Entering from KVM (x86)...
[ 0.000000] Info: [libkvmplat] <setup.c @ 473> multiboot: 0
[ 0.000000] Info: [libkvmplat] <setup.c @ 407> HEAP area @ 400000000 - 63b56a000 (9585467392 bytes)
[ 0.000000] Info: [libkvmplat] <setup.c @ 499> initrd: 0x1b1000
[ 0.000000] Info: [libkvmplat] <setup.c @ 501> heap start: 0x400000000
[ 0.000000] Info: [libkvmplat] <setup.c @ 505> stack top: 0x63b56a000
[ 0.000000] Info: [libkvmplat] <setup.c @ 532> Switch from bootstrap stack to stack @0x63b57a000
[ 0.000000] Info: [libukboot] <boot.c @ 199> Unikraft constructor table at 0x167000 - 0x167028
[ 0.000000] Info: [libuklibparam] <param.c @ 113> libname: netdev, 72
[ 0.000000] Info: [libuklibparam] <param.c @ 113> libname: vfs, 96
[ 0.000000] Info: [libuklibparam] <param.c @ 594> No library arguments found
[ 0.000000] Info: [libukboot] <boot.c @ 213> Found 0 library args
[ 0.000000] Info: [libukboot] <boot.c @ 221> Initialize memory allocator...
[ 0.000000] Info: [libukallocregion] <region.c @ 202> Initialize allocregion allocator @ 0x400000000,2
[ 0.000000] Info: [libukboot] <boot.c @ 264> Initialize IRQ subsystem...
[ 0.000000] Info: [libukboot] <boot.c @ 271> Initialize platform time...
[ 0.000000] Info: [libkvmplat] <tscclock.c @ 253> Calibrating TSC clock against i8254 timer
[ 0.100001] Info: [libkvmplat] <tscclock.c @ 274> Clock source: TSC, frequency estimate is 2592089840z
[ 0.104876] Info: [libukschedcoop] <schedcoop.c @ 232> Initializing cooperative scheduler
[ 0.111274] Warn: [libpthread-embedded] <pte_osal.c @ 215> Thread 0x4000000d0 created without libpthr.
[ 0.123340] Info: [libuksched] <thread.c @ 180> Thread "Idle": pointer: 0x4000000d0, stack: 0x40001000
[ 0.131926] Warn: [libpthread-embedded] <pte_osal.c @ 215> Thread 0x4000203d8 created without libpthr.
[ 0.143563] Info: [libuksched] <thread.c @ 180> Thread "main": pointer: 0x4000203d8, stack: 0x40003000
[ 0.150728] Info: [libukboot] <boot.c @ 95> Init Table @ 0x167028 - 0x167058
[ 0.157502] Info: [libukswrand] <swrand.c @ 86> Initialize random number generator...
[ 0.164264] Info: [libukbus] <bus.c @ 134> Initialize bus handlers...
[ 0.168237] Info: [libukbus] <bus.c @ 136> Probe buses...
[ 0.171996] Info: [liblwip] <init.c @ 152> Initializing lwip
[ 0.175937] Info: [libuksched] <thread.c @ 180> Thread "lwip": pointer: 0x400040fc0, stack: 0x40005000
[ 0.193270] Info: [libvfscore] <rootfs.c @ 98> Mount ramfs to /...
[ 0.201977] Info: [libvfscore] <mount.c @ 122> VFS: mounting ramfs at /
[ 0.208725] Info: [libvfscore] <rootfs.c @ 106> Extracting initrd @ 0x1b1000 (136704 bytes) to /...
[ 0.217546] Info: [libukcpio] <cpio.c @ 233> Extracting /main.aot (136428 bytes)
Powered by
o. .o _ _ __ _
Oo Oo ___ (_) | __ __ __ _ ' _) :_
oO oO ' _ `| | |/ / _)' _` | |_| _)
oOo oOO| | | | | (| | | (_) | _) :_
OoOoO ._, ._:_:_,\_._, .__,_:_, \___)
Phoebe 0.10.0~9bf6e63-custom
[ 0.245898] Info: [libukboot] <boot.c @ 125> Pre-init table at 0x1763f0 - 0x1763f0
[ 0.251770] Info: [libukboot] <boot.c @ 136> Constructor table at 0x1763f0 - 0x1763f0
[ 0.257999] Info: [libukboot] <boot.c @ 146> Calling main(2, ['build/wasm3_kvm-x86_64', 'main.wasm'])
[ 0.264377] Warn: [libukmmap] <mmap.c @ 196> __uk_syscall_r_mprotect() stubbed
[ 0.270256] Warn: [libukmmap] <mmap.c @ 190> __uk_syscall_r_madvise() stubbed
Hello, Unikraft + Wasm3!
[ 0.278786] Info: [libukboot] <boot.c @ 155> main returned 0, halting system
[ 0.294673] Info: [libkvmplat] <shutdown.c @ 35> Unikraft halted
</code></pre>
<hr>
<h1 id="gotta-go-fast"><a class="anchor" href="#gotta-go-fast" rel="nofollow">#</a> Gotta go fast!</h1>
<p><code>Wasm3</code> is a neat piece of software and a good starting point, but it's
just an interpreter. It has neither a
<a href="https://en.wikipedia.org/wiki/Just-in-time_compilation" rel="nofollow">JIT</a> nor the
ability to run
<a href="https://en.wikipedia.org/wiki/Ahead-of-time_compilation" rel="nofollow">AOT</a>-compiled
code, so I started looking for another option.</p>
<p>Soon I came across
<a href="https://github.com/bytecodealliance/wasm-micro-runtime" rel="nofollow">WAMR</a>, which
is also small, written in C, and pretty portable. Its build process
is a bit more complex, but looking at a regular build log generated on
Linux I was able to figure out which source files, headers and flags I
needed to build it with <code>Unikraft</code>.</p>
<p>Another interesting aspect of <code>WAMR</code> is that it provides an <code>AOT</code>
compiler, based on <code>LLVM</code>, to <strong>compile the Wasm payload into native
code</strong>. And you can also <strong>tune the runtime build process to include just
the code to load and run <code>AOT</code> code</strong>, leaving out the interpreter and the
<code>JIT</code>, <strong>leading to a pretty small <code>TCB</code></strong>.</p>
<p>After a bit of tweaking, I was able to build the <code>Unikraft + WAMR</code>
bundle and run it with the Wasm <code>hello world</code> program:</p>
<pre><code>[ 0.000000] Info: [libkvmplat] <setup.c @ 472> Entering from KVM (x86)...
[ 0.000000] Info: [libkvmplat] <setup.c @ 473> multiboot: 0
[ 0.000000] Info: [libkvmplat] <setup.c @ 407> HEAP area @ 400000000 - 43f5dc000 (1063108608 bytes)
[ 0.000000] Info: [libkvmplat] <setup.c @ 499> initrd: 0x1b1000
[ 0.000000] Info: [libkvmplat] <setup.c @ 501> heap start: 0x400000000
[ 0.000000] Info: [libkvmplat] <setup.c @ 505> stack top: 0x43f5dc000
[ 0.000000] Info: [libkvmplat] <setup.c @ 532> Switch from bootstrap stack to stack @0x43f5ec000
[ 0.000000] Info: [libukboot] <boot.c @ 199> Unikraft constructor table at 0x167000 - 0x167028
[ 0.000000] Info: [libuklibparam] <param.c @ 113> libname: netdev, 72
[ 0.000000] Info: [libuklibparam] <param.c @ 113> libname: vfs, 96
[ 0.000000] Info: [libuklibparam] <param.c @ 594> No library arguments found
[ 0.000000] Info: [libukboot] <boot.c @ 213> Found 0 library args
[ 0.000000] Info: [libukboot] <boot.c @ 221> Initialize memory allocator...
[ 0.000000] Info: [libukallocregion] <region.c @ 202> Initialize allocregion allocator @ 0x400000000,8
[ 0.000000] Info: [libukboot] <boot.c @ 264> Initialize IRQ subsystem...
[ 0.000000] Info: [libukboot] <boot.c @ 271> Initialize platform time...
[ 0.000000] Info: [libkvmplat] <tscclock.c @ 253> Calibrating TSC clock against i8254 timer
[ 0.100001] Info: [libkvmplat] <tscclock.c @ 274> Clock source: TSC, frequency estimate is 2592141140z
[ 0.106867] Info: [libukschedcoop] <schedcoop.c @ 232> Initializing cooperative scheduler
[ 0.114889] Warn: [libpthread-embedded] <pte_osal.c @ 215> Thread 0x4000000d0 created without libpthr.
[ 0.125951] Info: [libuksched] <thread.c @ 180> Thread "Idle": pointer: 0x4000000d0, stack: 0x40001000
[ 0.132773] Warn: [libpthread-embedded] <pte_osal.c @ 215> Thread 0x4000203d8 created without libpthr.
[ 0.142529] Info: [libuksched] <thread.c @ 180> Thread "main": pointer: 0x4000203d8, stack: 0x40003000
[ 0.148912] Info: [libukboot] <boot.c @ 95> Init Table @ 0x167028 - 0x167058
[ 0.154532] Info: [libukswrand] <swrand.c @ 86> Initialize random number generator...
[ 0.160847] Info: [libukbus] <bus.c @ 134> Initialize bus handlers...
[ 0.164679] Info: [libukbus] <bus.c @ 136> Probe buses...
[ 0.168079] Info: [liblwip] <init.c @ 152> Initializing lwip
[ 0.171448] Info: [libuksched] <thread.c @ 180> Thread "lwip": pointer: 0x400040fc0, stack: 0x40005000
[ 0.178030] Info: [libvfscore] <rootfs.c @ 98> Mount ramfs to /...
[ 0.181563] Info: [libvfscore] <mount.c @ 122> VFS: mounting ramfs at /
[ 0.192104] Info: [libvfscore] <rootfs.c @ 106> Extracting initrd @ 0x1b1000 (136704 bytes) to /...
[ 0.204603] Info: [libukcpio] <cpio.c @ 233> Extracting /main.aot (136428 bytes)
Powered by
o. .o _ _ __ _
Oo Oo ___ (_) | __ __ __ _ ' _) :_
oO oO ' _ `| | |/ / _)' _` | |_| _)
oOo oOO| | | | | (| | | (_) | _) :_
OoOoO ._, ._:_:_,\_._, .__,_:_, \___)
Phoebe 0.10.0~9bf6e63-custom
[ 0.233750] Info: [libukboot] <boot.c @ 125> Pre-init table at 0x1763f0 - 0x1763f0
[ 0.239519] Info: [libukboot] <boot.c @ 136> Constructor table at 0x1763f0 - 0x1763f0
[ 0.245658] Info: [libukboot] <boot.c @ 146> Calling main(2, ['build/wamr_kvm-x86_64', 'main.aot'])
[ 0.252124] Warn: [libukmmap] <mmap.c @ 196> __uk_syscall_r_mprotect() stubbed
AOT module instantiate failed: mmap memory failed
[ 0.261049] Info: [libukboot] <boot.c @ 155> main returned 0, halting system
[ 0.266754] Info: [libkvmplat] <shutdown.c @ 35> Unikraft halted
</code></pre>
<p>... but it failed with <code>AOT module instantiate failed: mmap memory failed</code>. What's happening here?</p>
<p>Tracking down the error message in <code>WAMR</code>'s source code we get to this
section from <code>aot_runtime.c</code>:</p>
<pre><code> /* Totally 8G is mapped, the opcode load/store address range is 0 to 8G:
* ea = i + memarg.offset
* both i and memarg.offset are u32 in range 0 to 4G
* so the range of ea is 0 to 8G
*/
if (!(p = mapped_mem =
os_mmap(NULL, map_size, MMAP_PROT_NONE, MMAP_MAP_NONE))) {
set_error_buf(error_buf, error_buf_size, "mmap memory failed");
return NULL;
}
</code></pre>
<p>So <code>WAMR</code> needs to <code>mmap()</code> an 8GB chunk of anonymous memory, but
<code>Unikraft</code> <strong>does not yet support on-demand paging</strong> (though it looks like,
after merging <a href="https://github.com/unikraft/unikraft/pull/338" rel="nofollow">PR#338</a>,
they're pretty close to having it). So it seems like we've hit a wall,
haven't we?</p>
<p>Well, <strong>while <code>Unikraft</code> does not have on-demand paging, our host system
does!</strong> This means we can simply <strong>create a VM with more than 8GB
of RAM and comment out the <code>memset</code> in <code>ukmmap/mmap.c</code> to avoid
touching every page of that region in advance</strong>. (NOTE: this hack
wouldn't work in a SEV/TDX
<a href="https://en.wikipedia.org/wiki/Trusted_execution_environment" rel="nofollow">TEE</a>,
since the VM's memory is pre-allocated and pinned in those cases; I'm
just using it to be able to continue with the experiment.)</p>
<p>And, after doing so, it works:</p>
<pre><code>[ 0.000000] Info: [libkvmplat] <setup.c @ 472> Entering from KVM (x86)...
[ 0.000000] Info: [libkvmplat] <setup.c @ 473> multiboot: 0
[ 0.000000] Info: [libkvmplat] <setup.c @ 407> HEAP area @ 400000000 - 63b56a000 (9585467392 bytes)
[ 0.000000] Info: [libkvmplat] <setup.c @ 499> initrd: 0x1b1000
[ 0.000000] Info: [libkvmplat] <setup.c @ 501> heap start: 0x400000000
[ 0.000000] Info: [libkvmplat] <setup.c @ 505> stack top: 0x63b56a000
[ 0.000000] Info: [libkvmplat] <setup.c @ 532> Switch from bootstrap stack to stack @0x63b57a000
[ 0.000000] Info: [libukboot] <boot.c @ 199> Unikraft constructor table at 0x167000 - 0x167028
[ 0.000000] Info: [libuklibparam] <param.c @ 113> libname: netdev, 72
[ 0.000000] Info: [libuklibparam] <param.c @ 113> libname: vfs, 96
[ 0.000000] Info: [libuklibparam] <param.c @ 594> No library arguments found
[ 0.000000] Info: [libukboot] <boot.c @ 213> Found 0 library args
[ 0.000000] Info: [libukboot] <boot.c @ 221> Initialize memory allocator...
[ 0.000000] Info: [libukallocregion] <region.c @ 202> Initialize allocregion allocator @ 0x400000000,2
[ 0.000000] Info: [libukboot] <boot.c @ 264> Initialize IRQ subsystem...
[ 0.000000] Info: [libukboot] <boot.c @ 271> Initialize platform time...
[ 0.000000] Info: [libkvmplat] <tscclock.c @ 253> Calibrating TSC clock against i8254 timer
[ 0.100001] Info: [libkvmplat] <tscclock.c @ 274> Clock source: TSC, frequency estimate is 2592107600z
[ 0.105883] Info: [libukschedcoop] <schedcoop.c @ 232> Initializing cooperative scheduler
[ 0.113099] Warn: [libpthread-embedded] <pte_osal.c @ 215> Thread 0x4000000d0 created without libpthr.
[ 0.123215] Info: [libuksched] <thread.c @ 180> Thread "Idle": pointer: 0x4000000d0, stack: 0x40001000
[ 0.130177] Warn: [libpthread-embedded] <pte_osal.c @ 215> Thread 0x4000203d8 created without libpthr.
[ 0.140264] Info: [libuksched] <thread.c @ 180> Thread "main": pointer: 0x4000203d8, stack: 0x40003000
[ 0.146943] Info: [libukboot] <boot.c @ 95> Init Table @ 0x167028 - 0x167058
[ 0.152705] Info: [libukswrand] <swrand.c @ 86> Initialize random number generator...
[ 0.158711] Info: [libukbus] <bus.c @ 134> Initialize bus handlers...
[ 0.162330] Info: [libukbus] <bus.c @ 136> Probe buses...
[ 0.165668] Info: [liblwip] <init.c @ 152> Initializing lwip
[ 0.169172] Info: [libuksched] <thread.c @ 180> Thread "lwip": pointer: 0x400040fc0, stack: 0x40005000
[ 0.180469] Info: [libvfscore] <rootfs.c @ 98> Mount ramfs to /...
[ 0.189926] Info: [libvfscore] <mount.c @ 122> VFS: mounting ramfs at /
[ 0.196627] Info: [libvfscore] <rootfs.c @ 106> Extracting initrd @ 0x1b1000 (136704 bytes) to /...
[ 0.205720] Info: [libukcpio] <cpio.c @ 233> Extracting /main.aot (136428 bytes)
Powered by
o. .o _ _ __ _
Oo Oo ___ (_) | __ __ __ _ ' _) :_
oO oO ' _ `| | |/ / _)' _` | |_| _)
oOo oOO| | | | | (| | | (_) | _) :_
OoOoO ._, ._:_:_,\_._, .__,_:_, \___)
Phoebe 0.10.0~9bf6e63-custom
[ 0.232435] Info: [libukboot] <boot.c @ 125> Pre-init table at 0x1763f0 - 0x1763f0
[ 0.238147] Info: [libukboot] <boot.c @ 136> Constructor table at 0x1763f0 - 0x1763f0
[ 0.243980] Info: [libukboot] <boot.c @ 146> Calling main(2, ['build/wamr_kvm-x86_64', 'main.aot'])
[ 0.250338] Warn: [libukmmap] <mmap.c @ 196> __uk_syscall_r_mprotect() stubbed
[ 0.256167] Warn: [libukmmap] <mmap.c @ 190> __uk_syscall_r_madvise() stubbed
Hello, Unikraft + WAMR!
[ 0.264487] Info: [libukboot] <boot.c @ 155> main returned 0, halting system
[ 0.270064] Info: [libkvmplat] <shutdown.c @ 35> Unikraft halted
</code></pre>
<hr>
<h1 id="now-give-me-the-numbers"><a class="anchor" href="#now-give-me-the-numbers" rel="nofollow">#</a> Now give me the numbers</h1>
<p>With the option <code>Drop unused functions and data</code> enabled in
<code>Unikraft</code>'s config, the size of the stripped binary for the <code>WAMR</code>
unikernel is <code>642K</code>:</p>
<pre><code>[slopezpa@toolbox wamr]$ ls -l build/wamr_kvm-x86_64
-rwxr-xr-x. 1 slopezpa slopezpa 656856 Sep 27 17:05 build/wamr_kvm-x86_64
</code></pre>
<p>That's kind of nice, but it can be better. Right now, I'm building <code>WAMR</code>
with <code>Unikraft</code> <strong>using the <code>POSIX</code> compatibility layer</strong>, which means
including a number of external libraries (<code>newlib</code>,
<code>pthread-embedded</code>, <code>lwip</code>) in the build. If, instead, <strong>we ported
<code>WAMR</code> to use <code>Unikraft</code>'s libraries directly</strong>, we would significantly
reduce the size of the unikernel (and, with it, the <code>TCB</code>).</p>
<p>Now, let's take a look at the memory consumption of our unikernel
while running the example Wasm payload:</p>
<pre><code>[slopezpa@mhamilton libkrunfw.wamr]$ ps -axuww |grep chroot_vm
slopezpa 71854 0.9 0.0 9517320 13828 pts/8 Sl+ 17:04 0:00 ./chroot_vm
</code></pre>
<p>That's less than <code>14MB</code> of <code>RSS</code>, and that includes the <code>VMM</code>'s
(libkrun) internal structures and the guest's memory usage, without
discounting shared pages. Not bad, I guess... ;-)</p>
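<p>As a quick sanity check on that figure, note that <code>ps</code> reports
<code>RSS</code> in KiB:</p>

```shell
# The 13828 from the ps listing above is reported in KiB:
rss_kib=13828
rss_mib=$((rss_kib / 1024))
echo "${rss_mib} MiB"   # integer division: prints "13 MiB", just under 14MB
```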
<hr>
<h1 id="where-to-go-from-there"><a class="anchor" href="#where-to-go-from-there" rel="nofollow">#</a> Where to go from there</h1>
<p>I think we can conclude from this experiment that <strong>it is, indeed,
feasible to build a Wasm runtime in a unikernel form factor in a
reasonable amount of time</strong>, and that it would come with <strong>significant
benefits in <code>TCB</code> reduction</strong> and, perhaps, in performance (yet to be
tested).</p>
<p>Some things I'd like to do next (if I manage to find the time):</p>
<ul>
<li>
<p>Clean up both the <code>Wasm3</code> and <code>WAMR</code> build repositories and see if it's
worth getting them upstream (there's already a port of <code>WAMR</code>,
albeit quite an old one, so in that case it'd just be a PR updating it).</p>
</li>
<li>
<p>Evaluate the best way to integrate <code>Unikraft</code>-based unikernels into
<code>libkrunfw</code>. Right now, <code>Unikraft</code> only supports the <code>multiboot</code>
specification, while <code>libkrun</code> only provides a <code>Linux Zero Page</code>. For
this experiment I hacked the needed values manually into
Unikraft's <code>setup.c</code>, but of course we need a more reasonable
solution.</p>
</li>
<li>
<p>Implement support for <code>libkrun</code>'s <code>TSI</code> (Transparent Socket
Impersonation) in <code>Unikraft</code>. This would mean implementing support
for <code>virtio vsock</code> first, and then writing a library to provide socket
semantics using <code>TSI+vsock</code>. <strong>I think <code>TSI</code> is a very good fit for this
use case, as it would give us network support with very little
code (we won't even need a TCP/IP stack!) and good
performance</strong>.</p>
</li>
</ul>
<p>Sounds like fun! ;-)</p>
<p>Do you have a comment about this post? Let's chat: <a href="https://matrix.to/#/@slp:matrix.org" rel="nofollow">Matrix</a> | <a href="https://fosstodon.org/@slp" rel="nofollow">Mastodon</a> | <a href="https://twitter.com/slpnix" rel="nofollow">Twitter</a> | <a href="https://github.com/slp" rel="nofollow">GitHub</a></p>
Running Linux microVMs on macOS (M1/M2) (2022-07-28)
https://slp.prose.sh/running-microvms-on-m1

<p>Sometimes, while working on macOS, you may find the need to test
something quick on Linux, or to use some utility that's only available
on that OS. But, of course, you don't want to go through the whole
process of creating a VM from scratch.</p>
<p>The good news is, you don't need to! Using
<a href="https://github.com/containers/krunvm" rel="nofollow">krunvm</a> you can create and
start a microVM from a regular container image (that is, an OCI
image), in just two commands and a couple of seconds.</p>
<hr>
<h1 id="installing-krunvm-on-macos"><a class="anchor" href="#installing-krunvm-on-macos" rel="nofollow">#</a> Installing krunvm on macOS</h1>
<p>If you're already using <a href="https://brew.sh" rel="nofollow">Homebrew</a>, getting krunvm
installed is super easy:</p>
<pre><code>brew tap slp/krun
brew install -s buildah libkrunfw libkrun krunvm
</code></pre>
<hr>
<h1 id="creating-a-volume-for-krunvm"><a class="anchor" href="#creating-a-volume-for-krunvm" rel="nofollow">#</a> Creating a volume for krunvm</h1>
<p>The first time you execute krunvm, you'll get this message:</p>
<pre><code>On macOS, krunvm requires a dedicated, case-sensitive volume.
You can easily create such volume by executing something like
this on another terminal:
diskutil apfs addVolume disk3 "Case-sensitive APFS" krunvm
NOTE: APFS volume creation is a non-destructive action that
doesn't require a dedicated disk nor "sudo" privileges. The
new volume will share the disk space with the main container
volume.
Please enter the mountpoint for this volume [/Volumes/krunvm]:
</code></pre>
<p>krunvm uses <a href="https://virtio-fs.gitlab.io/" rel="nofollow">virtio-fs</a> to share files
between macOS and the Linux microVMs and, since Linux requires a
case-sensitive file system, we need to create one on our system to
host the expanded container images.</p>
<p>Luckily, this is as simple as running the command suggested in the
message:</p>
<pre><code>diskutil apfs addVolume disk3 "Case-sensitive APFS" krunvm
</code></pre>
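<p>If you ever want to verify that the new volume really is
case-sensitive, a quick probe (my own sketch, nothing krunvm-specific)
is to create two names differing only in case and count the resulting
entries:</p>

```shell
# On a case-sensitive file system "File" and "file" are two distinct
# entries; on a case-insensitive one (the macOS default) they collapse
# into a single file.
dir=$(mktemp -d)
touch "$dir/File" "$dir/file"
entries=$(ls "$dir" | wc -l)
rm -rf "$dir"
echo "$entries"
```

<p>Pointing <code>mktemp</code> at the new volume (e.g. <code>mktemp -d /Volumes/krunvm/probe.XXXXXX</code>) should yield <code>2</code>, while on the default APFS volume you'd get <code>1</code>.</p>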
<hr>
<h1 id="creating-and-starting-your-first-microvm"><a class="anchor" href="#creating-and-starting-your-first-microvm" rel="nofollow">#</a> Creating and starting your first microVM</h1>
<p>Let's suppose you want to run a Debian microVM. As there are official
Debian container images published on <a href="https://hub.docker.com" rel="nofollow">Docker
Hub</a>, this is as simple as running:</p>
<pre><code>krunvm create --name debian-microVM debian
</code></pre>
<p>In this example, <strong>debian</strong> is the name of the container image used as
the source for the microVM, and <strong>debian-microVM</strong> is the name given to
the microVM itself. Instead of <strong>debian</strong> you can use the name of any
container image, the same way you would when running a container with
<a href="https://www.docker.com/" rel="nofollow">docker</a> or <a href="https://podman.io/" rel="nofollow">podman</a>. In
this other example, the container image is specified including the
registry, the image name and the tag:</p>
<pre><code>krunvm create --name fedora-rawhide registry.fedoraproject.org/fedora:rawhide
</code></pre>
<p>Once created, you can start the microVM this way (it
should take less than a second to start up!):</p>
<pre><code>% krunvm start debian-microVM
# uname -a
Linux debian-microVM 5.15.52 #1 SMP Mon Jul 18 08:47:44 EDT 2022 aarch64 GNU/Linux
</code></pre>
<p>By default, krunvm will execute <strong>/bin/sh</strong> in the microVM, but you
can run any other command present in the image by specifying it on the
command line:</p>
<pre><code>% krunvm start debian-microVM /bin/bash
root@debian-microVM:~# echo $SHELL
/bin/bash
</code></pre>
<p>In fact, you can also run non-interactive commands and pass additional
arguments to them:</p>
<pre><code>% krunvm start debian-microVM /bin/ls -- -al /root
total 12
-rw------- 1 root root 12 Jul 28 20:49 .bash_history
-rw-r--r-- 1 root root 571 Apr 10 2021 .bashrc
-rw-r--r-- 1 root root 161 Jul 9 2019 .profile
</code></pre>
<p>And, without any kind of configuration, the network is completely
functional, so you can start installing packages right away:</p>
<pre><code>% krunvm start debian-microVM
# apt update
Get:1 http://deb.debian.org/debian bullseye InRelease [116 kB]
Get:2 http://deb.debian.org/debian-security bullseye-security InRelease [48.4 kB]
Get:3 http://deb.debian.org/debian bullseye-updates InRelease [44.1 kB]
Get:4 http://deb.debian.org/debian bullseye/main arm64 Packages [8069 kB]
Get:5 http://deb.debian.org/debian-security bullseye-security/main arm64 Packages [167 kB]
Get:6 http://deb.debian.org/debian bullseye-updates/main arm64 Packages [2600 B]
Fetched 8447 kB in 1s (5686 kB/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
All packages are up to date.
</code></pre>
<hr>
<h1 id="sharing-files-with-the-microvm"><a class="anchor" href="#sharing-files-with-the-microvm" rel="nofollow">#</a> Sharing files with the microVM</h1>
<p>krunvm allows you to share volumes from macOS with the Linux microVM
by using the <code>-v/--volume</code> argument, which can be specified with both
the <code>create</code> and the <code>changevm</code> commands.</p>
<p>Let's change the Debian microVM we've created before to give it access
to <code>$HOME/Public</code> (macOS) through <code>/Public</code> (Debian):</p>
<pre><code>% krunvm changevm debian-microVM --volume $HOME/Public:/Public
% ls ~/Public
Drop Box
% krunvm start debian-microVM
# ls /Public
'Drop Box'
</code></pre>
<hr>
<h1 id="exposing-ports-from-the-microvm-to-macos"><a class="anchor" href="#exposing-ports-from-the-microvm-to-macos" rel="nofollow">#</a> Exposing ports from the microVM to macOS</h1>
<p>It's also possible to run a service in the microVM and expose it to
the outside by using the <code>-p/--port</code> argument, also available with both the
<code>create</code> and <code>changevm</code> commands.</p>
<p>Once again, let's update our Debian microVM, this time to expose port
80 in the microVM as port 8080 on macOS:</p>
<pre><code>% krunvm changevm debian-microVM --port 8080:80
% krunvm start debian-microVM
# apt install -y python3
(long output omitted)
# echo "Hello!" > index.html
# python3 -m http.server 80
Serving HTTP on 0.0.0.0 port 80 (http://0.0.0.0:80/) ...
(from another terminal on macOS)
% curl http://127.0.0.1:8080
Hello!
</code></pre>
<hr>
<h1 id="final-words"><a class="anchor" href="#final-words" rel="nofollow">#</a> Final words</h1>
<p>krunvm is just a simple CLI utility. The heavy lifting here is done by
<a href="https://github.com/containers/buildah" rel="nofollow">buildah</a>,
<a href="https://github.com/containers/libkrun" rel="nofollow">libkrun</a> and the
<a href="https://developer.apple.com/documentation/hypervisor" rel="nofollow">Hypervisor.framework</a>.</p>
<p>If you find it useful, or have any suggestions, I'll be happy to
hear from you on <a href="https://twitter.com/slpnix" rel="nofollow">Twitter</a> or
<a href="https://fosstodon.org/@slp" rel="nofollow">Mastodon</a>. If you've found a bug,
please consider creating an <a href="https://github.com/containers/krunvm" rel="nofollow">issue on
GitHub</a>.</p>
<p>Do you have a comment about this post? Let's chat: <a href="https://matrix.to/#/@slp:matrix.org" rel="nofollow">Matrix</a> | <a href="https://fosstodon.org/@slp" rel="nofollow">Mastodon</a> | <a href="https://twitter.com/slpnix" rel="nofollow">Twitter</a> | <a href="https://github.com/slp" rel="nofollow">GitHub</a></p>