As you may know, at Kalvad we manage thousands of servers running Ubuntu, Alma or Rocky, but we also run a big part of our production on Archlinux. But why? How? Is that stable? Do you have a backup plan?
How do you choose your Linux distribution? Mostly, you have 2 options:
- Simple and mainstream, like Ubuntu, Fedora, etc. The good point is that most sysadmins know Ubuntu, but we have already exposed some issues with it (compilation flags, .deb/.rpm packages that are hard to build)
- Rugged and configurable, like Arch, or Exherbo (used by the people at Clever Cloud)
Of course, we picked the second choice, but why?
We build our servers as a two-sided system:
- What is provided by the OS? For example, htop, vector, telegraf, etc...
- What is provided by the customer (where Kalvad could be the customer too)? For example PostgreSQL, some Java apps, ...
Of course, if you want your system to be secure, fast, and shiny :-), you want to be as close as possible to upstream, without having some maintainers changing some code because Valgrind said that a free was missing (true old story). Great, Archlinux is upstream-based!
For example, today (2022/04/24), HAProxy, which is our favorite load balancer, is at version 2.5.5 (the latest one) on Arch, but at 1.8.27 on Alma Linux and 2.4.14 on Ubuntu 22.04, which was released only a few days ago!
Furthermore, Archlinux, like most source-based/rolling-release distributions, tries to stick as closely as possible to the upstream package: if you check how HAProxy is compiled on Arch, you will see that there is only 1 distro-specific patch!
    # Alma Linux
    haproxy -vvv
    HA-Proxy version 1.8.27-493ce0b 2020/11/06
    Copyright 2000-2020 Willy Tarreau <email@example.com>

    Build options :
      TARGET = linux2628
      CPU = generic
      CC = gcc
      CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-null-dereference -Wno-unused-label -Wno-stringop-overflow
      OPTIONS = USE_LINUX_TPROXY=1 USE_CRYPT_H=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_SYSTEMD=1 USE_PCRE=1

    Default settings :
      maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

    Built with OpenSSL version : OpenSSL 1.1.1g FIPS 21 Apr 2020
    Running on OpenSSL version : OpenSSL 1.1.1k FIPS 25 Mar 2021
    OpenSSL library supports TLS extensions : yes
    OpenSSL library supports SNI : yes
    OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
    Built with Lua version : Lua 5.3.4
    Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
    Encrypted password support via crypt(3): yes
    Built with multi-threading support.
    Built with PCRE version : 8.42 2018-03-20
    Running on PCRE version : 8.42 2018-03-20
    PCRE library supports JIT : no (USE_PCRE_JIT not set)
    Built with zlib version : 1.2.11
    Running on zlib version : 1.2.11
    Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
    Built with network namespace support.

    Available polling systems :
      epoll : pref=300, test result OK
      poll : pref=200, test result OK
      select : pref=150, test result OK
    Total: 3 (3 usable), will use epoll.

    Available filters :
      [SPOE] spoe
      [COMP] compression
      [TRACE] trace

    # Arch
    haproxy -vv
    HAProxy version 2.5.5-384c5c5 2022/03/14 - https://haproxy.org/
    Status: stable branch - will stop receiving fixes around Q1 2023.
    Known bugs: http://www.haproxy.org/bugs/bugs-2.5.5.html
    Running on: Linux 5.17.4-arch1-1 #1 SMP PREEMPT Wed, 20 Apr 2022 18:29:28 +0000 x86_64

    Build options :
      TARGET = linux-glibc
      CPU = native
      CC = cc
      CFLAGS = -march=x86-64 -mtune=native -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fwrapv
      OPTIONS = USE_PCRE=1 USE_PCRE_JIT=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1 USE_SYSTEMD=1 USE_PROMEX=1
      DEBUG =

    Feature list : +EPOLL -KQUEUE +NETFILTER +PCRE +PCRE_JIT -PCRE2 -PCRE2_JIT +POLL +THREAD +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H +GETADDRINFO +OPENSSL +LUA +ACCEPT4 -CLOSEFROM +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL +SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT -QUIC +PROMEX -MEMORY_PROFILING

    Default settings :
      bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

    Built with multi-threading support (MAX_THREADS=64, default=12).
    Built with OpenSSL version : OpenSSL 1.1.1n 15 Mar 2022
    Running on OpenSSL version : OpenSSL 1.1.1n 15 Mar 2022
    OpenSSL library supports TLS extensions : yes
    OpenSSL library supports SNI : yes
    OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
    Built with Lua version : Lua 5.3.6
    Built with the Prometheus exporter as a service
    Built with network namespace support.
    Built with zlib version : 1.2.12
    Running on zlib version : 1.2.12
    Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
    Support for malloc_trim() is enabled.
    Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
    Built with PCRE version : 8.45 2021-06-15
    Running on PCRE version : 8.45 2021-06-15
    PCRE library supports JIT : yes
    Encrypted password support via crypt(3): yes
    Built with gcc compiler version 11.2.0

    Available polling systems :
      epoll : pref=300, test result OK
      poll : pref=200, test result OK
      select : pref=150, test result OK
    Total: 3 (3 usable), will use epoll.

    Available multiplexer protocols :
    (protocols marked as <default> cannot be specified using 'proto' keyword)
      h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|CLEAN_ABRT|HOL_RISK|NO_UPG
      fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
      <default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
      h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
      <default> : mode=TCP side=FE|BE mux=PASS flags=
      none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG

    Available services : prometheus-exporter

    Available filters :
      [SPOE] spoe
      [CACHE] cache
      [FCGI] fcgi-app
      [COMP] compression
      [TRACE] trace
To be completely transparent, there is one library where Archlinux is behind: OpenSSL 3. But that could be another article.
But you can see the difference: on Arch, HAProxy is compiled against and runs on the same library versions, and we are as close as possible to upstream (+ some optimizations for the CPU).
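The "Built with" / "Running on" comparison can also be automated. Here is a minimal sketch that scans the text printed by `haproxy -vv` and lists the libraries whose build-time and run-time versions differ. The function names and the two sample strings (copied from the outputs above) are ours for illustration; this is not a tool shipped with HAProxy.

```python
import re

# Matches lines such as "Built with OpenSSL version : OpenSSL 1.1.1g FIPS 21 Apr 2020"
# and "Running on zlib version : 1.2.11". Lines like "Built with gcc compiler
# version 11.2.0" or "Running on: Linux ..." deliberately do not match.
_VERSION_LINE = re.compile(r"^(Built with|Running on) (\S+) version\s*:\s*(.+)$")


def library_versions(output):
    """Return {library: {"Built with": version, "Running on": version}}."""
    found = {}
    for line in output.splitlines():
        match = _VERSION_LINE.match(line.strip())
        if match:
            kind, library, version = match.groups()
            # Collapse repeated spaces so cosmetic differences are ignored.
            found.setdefault(library, {})[kind] = " ".join(version.split())
    return found


def mismatches(output):
    """List libraries whose build-time and run-time versions differ."""
    result = []
    for library, versions in library_versions(output).items():
        built = versions.get("Built with")
        running = versions.get("Running on")
        if built is not None and running is not None and built != running:
            result.append(library)
    return sorted(result)


# Excerpts from the two outputs shown above.
ALMA_SAMPLE = """\
Built with OpenSSL version : OpenSSL 1.1.1g FIPS 21 Apr 2020
Running on OpenSSL version : OpenSSL 1.1.1k FIPS 25 Mar 2021
Built with Lua version : Lua 5.3.4
Built with PCRE version : 8.42 2018-03-20
Running on PCRE version : 8.42 2018-03-20
"""

ARCH_SAMPLE = """\
Built with OpenSSL version : OpenSSL 1.1.1n 15 Mar 2022
Running on OpenSSL version : OpenSSL 1.1.1n 15 Mar 2022
Built with gcc compiler version 11.2.0
Built with PCRE version : 8.45 2021-06-15
Running on PCRE version : 8.45 2021-06-15
"""

print(mismatches(ALMA_SAMPLE))
print(mismatches(ARCH_SAMPLE))
```

On the Alma excerpt only OpenSSL is reported (built with 1.1.1g, running on 1.1.1k), while the Arch excerpt reports nothing.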
Contrary to some source-based distributions (like Exherbo), we have precompiled packages, but it's so easy to compile your own packages and build your own repo, and to see a 10-15% performance improvement by making a small modification inside /etc/makepkg.conf!
What is the benefit? Distributions like Debian or Ubuntu want to run on everything from a supercomputer to a toaster (NetBSD is not the only one to run on one :-) !), whereas we know what our hardware is, so we can clearly optimize for it through our repos, or at compilation time on the servers!
If you check a PKGBUILD on Archlinux, you will see that it is very simple to understand and to hack your way around. For example, inside HAProxy's PKGBUILD, changing from generic to native was just a one-line change!
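To give an idea of what such a "small modification" can look like, here is a minimal sketch that rewrites the `CFLAGS` line of a makepkg.conf so that the compiler targets the local CPU. It follows the common approach of replacing the generic `-march`/`-mtune` values with `-march=native`; the exact flags Kalvad changes are not stated here, so treat the choice (and the function name `tune_for_local_cpu`) as an assumption.

```python
import re


def tune_for_local_cpu(makepkg_conf):
    """Return makepkg.conf text with CFLAGS retargeted to -march=native.

    Assumption: CFLAGS is a double-quoted assignment, possibly continued
    over several lines with a trailing backslash.
    """

    def rewrite(match):
        # Join backslash-continued lines, then split into individual flags.
        flags = match.group(2).replace("\\\n", " ").split()
        # Drop the generic CPU selection and keep every other flag.
        kept = [f for f in flags if not f.startswith(("-march=", "-mtune="))]
        return match.group(1) + " ".join(["-march=native"] + kept) + match.group(3)

    # Only a line that starts with CFLAGS=" is touched; CXXFLAGS and
    # MAKEFLAGS are left alone.
    return re.sub(r'^(CFLAGS=")([^"]*)(")', rewrite, makepkg_conf, flags=re.MULTILINE)


SAMPLE = 'CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fno-plt"\nMAKEFLAGS="-j2"\n'
print(tune_for_local_cpu(SAMPLE))
```

Packages built this way are only safe to install on machines whose CPU supports the same instruction set as the build machine, which is why this fits a fleet where the hardware is known.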
Furthermore, we can maintain our own packages (like warp10), at least for some time, before contributing to the AUR or the repos!
Finally, we are able to patch our own software, and implement a security patch without waiting for the distribution!
As most attacks are based on injecting some code inside a known binary, and we don't have the same binary as everybody else, we are far less exposed!
We explained why we chose Arch, but how do we manage it?
We have been building a piece of software (not yet open-source, unfortunately) called Konstruis, which was heavily inspired by a fabulous FreeBSD tool called Poudriere!
Long story short, it builds a VM on Xen (XCP-NG), installs the latest version of the packages from the main repo, and starts to build our own packages and upgraded ones, like the HAProxy shared earlier! Then it uploads the built packages to our central storage, and rebuilds our repo!
Furthermore, we have developed another tool: package tracking. All our servers have this tool, which runs periodically and sends:
- the installed packages
- their version/source
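Konstruis itself is not public, so the following is only a rough sketch of what one rebuild pass could boil down to once the VM exists, using the standard Arch tools (`pacman`, `makepkg`, `repo-add`). The function name, the directory layout and the repo name `custom` are hypothetical, and the VM creation and upload steps are left out because their details are not described here.

```python
import shlex


def rebuild_commands(package_dirs, repo_db, built_packages):
    """Return the shell commands for one rebuild pass, without running them.

    package_dirs:   directories that each contain a PKGBUILD
    repo_db:        path of the pacman repository database to update
    built_packages: package files produced by the builds
    """
    # Bring the build machine up to date with the main repos first.
    commands = ["pacman -Syu --noconfirm"]
    for directory in package_dirs:
        # makepkg reads the PKGBUILD found in the current directory.
        commands.append(
            "cd " + shlex.quote(directory) + " && makepkg --syncdeps --clean --noconfirm"
        )
    # Register the freshly built packages in the repository database.
    files = " ".join(shlex.quote(path) for path in built_packages)
    commands.append("repo-add " + shlex.quote(repo_db) + " " + files)
    return commands


for command in rebuild_commands(
    ["haproxy"],
    "/srv/repo/custom.db.tar.gz",
    ["/srv/repo/haproxy-2.5.5-1-x86_64.pkg.tar.zst"],
):
    print(command)
```

Returning the commands instead of executing them keeps the sketch side-effect free; a real tool would run them inside the build VM and check each exit status.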
We can then compare to our internal database and gain 2 advantages:
- we can detect an unauthorized package on our servers
- we can compare it with the RSS/Atom feed of security.archlinux.org
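The comparison side of package tracking can be sketched as follows, assuming each server reports the text printed by `pacman -Q` (one `name version` pair per line). The approved list, the set of package names extracted from the security feed, and all sample values are made up for illustration; fetching and parsing the feed is not shown, and a real check would also compare affected version ranges rather than names only.

```python
def parse_pacman_q(output):
    """Parse `pacman -Q` output ("name version" per line) into a dict."""
    packages = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            name, version = parts
            packages[name] = version
    return packages


def audit(installed, approved, advisory_packages):
    """Compare one server's packages with the internal database.

    installed:         {name: version} reported by the server
    approved:          {name: version} expected by the internal database
    advisory_packages: package names mentioned in security advisories
    """
    # Packages that are not known to the internal database at all.
    unauthorized = sorted(name for name in installed if name not in approved)
    # Known packages whose version differs from the expected one.
    drifted = sorted(
        name for name, version in installed.items()
        if name in approved and approved[name] != version
    )
    # Installed packages that appear in an advisory (name match only).
    flagged = sorted(name for name in installed if name in advisory_packages)
    return unauthorized, drifted, flagged


# Made-up sample data, not a real server report or a real advisory.
REPORT = "haproxy 2.5.5-1\nhtop 3.1.2-1\nnmap 7.92-1\n"
APPROVED = {"haproxy": "2.5.5-1", "htop": "3.1.1-1"}
ADVISORY_PACKAGES = {"haproxy"}

print(audit(parse_pacman_q(REPORT), APPROVED, ADVISORY_PACKAGES))
```

With this sample, `nmap` is reported as unauthorized, `htop` as drifted from the expected version, and `haproxy` as needing a look because it appears in the (fictional) advisory set.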
Then we add the repo to our servers, and it's done!
Is our solution stable? To be honest, yes! We have a lot of traffic coming in, we are upgrading continuously, and we rarely face issues! Of course, we have had some trouble, but we are used to managing it and improving our redundancy systems, especially for the reboots due to kernel upgrades! (We don't have any of those infamous servers with 10 years of uptime!)
We have more issues with not being able to patch some sensitive servers on Ubuntu/CentOS-like distributions than with Arch. But is Arch the perfect solution? For the moment, and for us, it is!
We know that there are some alternatives, especially on the server-side:
- FreeBSD: same advantages as Arch, but sometimes less up to date. We are using it in production for ZFS (a working btrfs).
- KissLinux: very minimalist OS, could be interesting for us.
- Exherbo: actively recommended by some people that we respect a lot!
But we are not yet ready to cross this line!
We hope to release all our tools on GitHub once the code is cleaned up, but we are also planning to add some new features, like geo-replicated storage through Garage and optimized build machines, and to take the next step: putting Kalvad's kernel configuration inside the build system.
If you have a problem and no one else can help, maybe you can hire the Kalvad-Team.