
Why you need a Raspberry Pi 5 in your homelab if you're building ARM software


I've been working on ARM images lately. These aren't application containers but distro-style images built from ubuntu:22.04: they install large package sets, run post-install scripts, enable services, and pull in NVIDIA repositories. It's much closer to OS assembly than to a typical app container.

I wasn't surprised that building these images natively on ARM would be faster than doing it from an x86 machine using emulation. That part was expected.

What surprised me was how much faster it was.

Seeing a Raspberry Pi 5 consistently outperform a 3-year-old high-end Intel laptop by almost 2x forced me to re-evaluate some assumptions I had about build performance, hardware specs, and what "powerful enough" actually means when you're building ARM software.

The setup

I ran the same Docker-based build targeting linux/arm64 on two machines.
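
Concretely, the invocation on both machines was roughly the following; the image tag here is a placeholder, not the real project name:

```bash
# Same command on both hosts; only the underlying architecture differs.
docker buildx build --platform linux/arm64 -t myimage:arm64 --load .
```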

| Spec | Raspberry Pi 5 (native ARM) | Intel laptop (ARM via emulation) |
| --- | --- | --- |
| Architecture | aarch64 | x86_64 |
| CPU | Cortex-A76 (4 cores @ 2.4 GHz) | Intel i7-1280P (14 cores / 20 threads, up to 4.8 GHz) |
| RAM | ~8 GB (no swap) | 62 GB (+ swap) |
| Disk | ext4 on /dev/sda2 (USB SSD, not microSD) | NVMe, btrfs |
| Kernel | 6.17.x (Ubuntu raspi kernel) | 6.17.x |
| Docker | Engine 28.x (linux/arm64) | building linux/arm64 via QEMU (binfmt_misc) |

On paper, the Intel laptop should dominate: it has about 3.5x more CPU cores (14 vs 4) with much higher boost clocks, roughly 8x the RAM (62 GB vs ~8 GB), and an NVMe drive that is typically much faster than a USB SSD in both throughput and latency.

On the Intel system, ARM execution is handled through Docker BuildKit with binfmt_misc enabled using tonistiigi/binfmt, which registers qemu-aarch64 at the kernel level. No static QEMU binary is copied into the image; emulation happens transparently during the build.
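
For completeness, the registration itself is a one-liner, as documented in the tonistiigi/binfmt README:

```bash
# Register QEMU handlers for arm64 in the kernel's binfmt_misc table
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Verify: the entry should report "enabled" and point at qemu-aarch64
cat /proc/sys/fs/binfmt_misc/qemu-aarch64
```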

The Dockerfile

The Dockerfile is not compiling application code. It performs a lot of OS-level work:

  • apt-get update
  • installing many packages
  • running post-install scripts
  • enabling system services
  • generating initramfs/dracut bits
  • pulling NVIDIA repositories

In other words: this is much closer to assembling a small Linux distribution than building a typical container image.
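
To make that concrete, here is a minimal sketch of the shape of such a Dockerfile. The package names, service names, and repository URL below are illustrative assumptions, not the actual build:

```dockerfile
# Illustrative sketch -- package names and the repo URL are placeholders.
FROM ubuntu:22.04

ENV DEBIAN_FRONTEND=noninteractive

# Large package sets: every dpkg postinst script runs inside the build,
# and under QEMU each of those short-lived processes is emulated.
RUN apt-get update && apt-get install -y \
        curl systemd dracut-core grub-efi-arm64 linux-image-generic \
    && rm -rf /var/lib/apt/lists/*

# Enable system services in the resulting OS image
RUN systemctl enable systemd-networkd systemd-resolved

# Pull an external repository key (NVIDIA in the real build) and
# regenerate the initramfs for the installed kernel
RUN curl -fsSL https://example.com/nvidia/gpgkey -o /etc/apt/trusted.gpg.d/nvidia.asc \
    && dracut --force --regenerate-all
```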

If you want to see the exact build steps, the Dockerfile is here.

Results

Using the exact same Dockerfile and target architecture:

  • Raspberry Pi 5: real 7m35s
  • Intel laptop: real 14m31s

Despite having dramatically more CPU power, more memory, and faster storage, the Intel machine still took almost twice as long. For transparency, I ran the build once on the Raspberry Pi 5 and a couple of times on the Intel machine; the Intel timings were consistent enough that cold vs warm effects did not change the overall outcome.

Why this happens

The decisive difference isn't raw performance; it's architecture alignment.

On the Raspberry Pi, everything runs natively. ARM binaries execute directly on an ARM CPU.

On the Intel machine, all ARM binaries run through QEMU user-mode emulation. That means every instruction has to be translated before it can execute.

This build workload is particularly unfriendly to emulation:

  • heavy apt / dpkg usage
  • many short-lived processes
  • lots of filesystem operations
  • syscall-heavy post-install scripts
  • very little meaningful parallelism

This isn't a compute-bound workload where faster clocks and more cores help. It's dominated by overhead, and emulation multiplies that cost.
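
You can feel this yourself on any x86 host with binfmt set up: spawning many short-lived processes under emulation is drastically slower than the same loop running natively. A rough, unscientific probe:

```bash
# Native on an x86 host: hundreds of short-lived processes
time docker run --rm --platform linux/amd64 ubuntu:22.04 \
    bash -c 'for i in $(seq 1 500); do /bin/true; done'

# Emulated: the same loop, but every process launch goes through qemu-aarch64
time docker run --rm --platform linux/arm64 ubuntu:22.04 \
    bash -c 'for i in $(seq 1 500); do /bin/true; done'
```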

As a result, even a much more powerful x86 system can lose badly to a modest ARM system when the workload is OS-heavy.

The part that really matters: cost vs time

A Raspberry Pi 5 costs roughly EUR 80-100, depending on the RAM configuration and availability.

That's not "cheap" in an absolute sense, but in practice it's a bargain.

If that machine saves me even a few minutes per build -- multiplied across many iterations while working on images -- it pays for itself very quickly. Not in hardware terms, but in focus, iteration speed, and reduced friction.

Instead of waiting 15 minutes for a build to finish on my laptop under emulation, I can let a small ARM box do the work in half the time, quietly, in the background.

For the type of work I'm doing, that trade-off is an easy decision.

This applies even more to CI pipelines

The same logic applies to pipelines, arguably even more so.

For Kairos, we run our ARM builds on native ARM runners, which GitHub provides for free. Moving away from emulated ARM builds has significantly reduced how long our pipelines run.

The impact is very noticeable:

  • faster feedback loops
  • less wasted CI time
  • fewer flaky or timing-sensitive failures
  • lower overall pipeline cost

When your builds are dominated by package installation, system initialization, and OS-level steps, native execution isn't an optimization; it's the correct architectural choice.

Side note: about Docker "cache" confusion

While running these experiments, I repeatedly saw output like:

```
CACHED FROM ubuntu:22.04@sha256:...
```

even when using --no-cache and --pull.

What's happening here is subtle:

  • --no-cache disables Docker build step caching
  • it does not clear BuildKit's internal content store
  • base image layers and remote ADD blobs can still be reused by digest

If you really want to start from a clean slate, you need to clear BuildKit's cache explicitly:

```bash
docker builder prune -a --force
```

This removes cached content stored by BuildKit itself, not just images visible via docker images.

It's a blunt tool, but useful when you're trying to reason about cold-build performance.
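
Putting it together, a reasonable cold-build sequence for this kind of timing experiment looks something like this (the tag is again a placeholder):

```bash
# Drop BuildKit's content store, then force a fresh pull and an uncached build
docker builder prune -a --force
docker buildx build --no-cache --pull --platform linux/arm64 -t myimage:arm64 .
```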

What I took away from this

I didn't learn that native ARM is faster -- I already knew that.

What I learned is that architecture alignment matters far more than raw hardware specs for certain workloads.

When your build process is dominated by package managers, system initialization, and distribution-level tooling, native execution can easily outperform much stronger hardware running under emulation.

The Raspberry Pi 5 didn't win because it's fast.

It won because it speaks the right language.