The upstream RISC-V experience: running RISC-V hardware with upstream distros
This post is part of the Upstream RISC-V series:
- The upstream RISC-V experience: running RISC-V hardware with upstream distros
- VisionFive 2
- BananaPi BPI-F3
A foreword on cfarm.net
In case you don't know about cfarm.net (formerly the GCC Compile Farm), it is the longest-running and most comprehensive compile farm for free software developers. We provide SSH access to machines with many different operating systems and many different CPU architectures. Currently, we have 6 Linux distributions, 5 other operating systems, and 7 architecture families (with several variants such as big-endian and little-endian ppc64). The goal is to help developers port, build and debug their projects on less common or hard-to-get hardware.
So, of course, we have been very interested in RISC-V: it's not every day that a whole new CPU architecture gets its place in the landscape, and this specific architecture holds very interesting promises of openness.
The many challenges of a new architecture
Adding RISC-V machines to cfarm.net turned out to be a big challenge: for many years, there was no hardware that was both capable of running Linux and stable enough to provide a compilation service to many concurrent users.
Like many projects, we resorted to QEMU emulation. It was working mostly fine (especially because user-mode emulation is reasonably fast on server-class x86 hardware), but it did not provide everything needed: debugging with gdb was a challenge, and microarchitectural details are obviously different in QEMU compared to real hardware.
When capable RISC-V hardware eventually became available, it came with very custom software, because the software ecosystem needed time and hardware to support this new architecture properly. In practice, it meant a custom bootloader, a custom kernel, and hacked-up Linux distros.
Now, the lowest levels of software support (libc, compiler, bootloader, kernel) are definitely ready for RISC-V. But support for specific hardware is another story.
Making the case for upstream software and distros
To properly integrate new RISC-V hardware in cfarm.net, we set ourselves a challenge: try to run them with upstream software and distros as much as possible. On our side, this makes the machines easier to manage and is more future-proof. In addition, this offers a more consistent environment to developers using the farm: if their code works on the farm, there is a good chance it will work on different RISC-V hardware and in a different software environment.
This may feel a bit theoretical, but if you have ever worked with vendor SDKs in the embedded world, you know the nightmare it quickly becomes: ancient kernels with very questionable hacks, old compilation toolchains, low-level code that is highly specific to one particular piece of hardware, poor security support, custom and hacky implementations of features that have since been upstreamed in a different way... Unfortunately, the RISC-V ecosystem tends to drift in this direction, although it is very far from the worst and we see RISC-V vendors making real efforts to upstream their code.
To give more substance to the argument, here is a list of issues we have encountered so far because of non-upstream code:
- Starfive only provides custom Debian images with a full graphical environment. This is something we can't use in the farm because it's a headless environment. In addition, they seem to be based on a Debian snapshot from 2022, and many packages are customized by Starfive. This is not a good base on which to build and debug free software projects, because there is no guarantee that another RISC-V system would work the same way.
- The u-boot shipped with the VisionFive boards is old and highly misconfigured, needing workarounds in the operating system. I reported the issues for the VisionFive v1 and even proposed fixes (one, two, three). They were eventually merged after one year, but by this time the VisionFive v1 was already more or less abandoned. The u-boot shipped with VisionFive v2 is better configured, but doesn't provide an easy way to boot an upstream distro from NVMe.
- The non-upstream kernel on the VisionFive v1 caused confusion when reporting an issue in a userspace program. I used the kernel of the VisionFive v1 as an example of correct behaviour, but that behaviour was actually due to a downstream change in Starfive's patched kernel: the upstream kernel code was still buggy. This wrong assumption caused much head-scratching for the third-party kernel developer working on a fix.
Of course, I understand that upstream code does not appear magically, and there is always a fuzzy phase during which code is only available in downstream vendor forks. The point I want to make is the following: we should not be satisfied until the code is available upstream, as it is the only way to ensure its quality and longevity.
Targeted hardware and current plans
Thanks to the awesome support from RISC-V International, we have already received several boards to work on:
- VisionFive v2
- Milk-V Pioneer
- Lichee Pi 4A
We plan to experiment with upstream distros on these boards, document the installation process, submit fixes upstream if required, and generally smooth out the path to get upstream software up and running on this hardware. Then, of course, we will make these boards and upstream software freely available to all cfarm.net developers for the foreseeable future.
Expect more blog articles and news as we progress through this plan.
So far, we mostly target Debian, Alpine Linux and Arch Linux, but we are also exploring other OSes such as OpenBSD.
It should be noted that Ubuntu generally has first-class upstream support for RISC-V boards. Of course, they still need to patch the Linux kernel to support the hardware, but this is currently unavoidable. In any case, it's very nice that a distro is able to offer official images with this level of quality. Given this, I believe that we have very little to contribute to Ubuntu to make it better.
Long-term plans
In the long term, we would like to work on support for vector and virtualisation instructions, but this is still a long way off given the lack of hardware supporting these instructions.
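If you want to check whether a given RISC-V Linux machine advertises these extensions, here is a minimal Python sketch. It assumes the kernel exposes an "isa" line in /proc/cpuinfo in the usual form (for example "rv64imafdc_zicsr_zifencei", without version numbers), and that the vector and hypervisor extensions show up as the single letters "v" and "h". The function names are made up for this post and are not part of any cfarm.net tooling.

```python
import re


def parse_isa(isa):
    """Split a RISC-V ISA string into a set of extension names.

    Assumes no version suffixes (e.g. "2p1") are present in the string.
    """
    isa = isa.strip().lower()
    match = re.match(r"rv(32|64|128)", isa)
    if match is None:
        raise ValueError(f"not a RISC-V ISA string: {isa!r}")
    # Single-letter extensions are packed right after the base width;
    # multi-letter ones (zicsr, zifencei, ...) are separated by underscores.
    single, *multi = isa[match.end():].split("_")
    return set(single) | {name for name in multi if name}


def cpuinfo_isa(path="/proc/cpuinfo"):
    """Return the first "isa" value found in cpuinfo, or None if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                key, _, value = line.partition(":")
                if key.strip() == "isa":
                    return value.strip()
    except OSError:
        pass
    return None


if __name__ == "__main__":
    isa = cpuinfo_isa()
    if isa is None:
        print("no RISC-V 'isa' line found in /proc/cpuinfo")
    else:
        exts = parse_isa(isa)
        print("vector (V):", "v" in exts)
        print("hypervisor (H):", "h" in exts)
```

Keep in mind that this only reports what the kernel chooses to advertise: a board with a pre-ratification vector implementation may not list "v" at all.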