To Cloud or not to Cloud

Kalvad is very present in the Middle East, especially on certain topics:

  • Fixers: we are the people who can fix everything
  • High Performance Systems
  • Crazy constraints apps

Of course, you don't find this kind of setup in a small startup; you find it mostly in big corporations or government entities. And as you may know, we use a lot of open source software, for example:

We are very happy with those, but we are facing 2 small issues:

  • Some of our customers are "sensitive" customers, which means that we cannot put their data outside of Dubai.
  • One of our cloud providers (Scaleway) raised its prices because of the energy crisis in Europe.

So how can we fix this?

Of course, we could use some cloud infrastructure deployed inside Dubai (like Azure), but it is very expensive, not very secure, and we have some concerns regarding the CLOUD Act.

If you want something done right...

...you have to do it yourself!

That's how we started phase #1: can we host some software/data at our office? Are we going to be able to self-host? Is it going to be stable enough?

So we started with a simple prototype of the infrastructure, but we had one constraint: the prototype should not be too expensive to run!

So the first setup would be:

  1. An old workstation that we transformed into a server
  2. A Synology NAS that I had at my place
  3. A surge protector
  4. A 12-port 1G switch that was gathering dust at my place
  5. Our router, which is based on OpenWRT.

But here comes the first problem: a static IP for a company in the UAE costs at least 475 USD/month (just for the static IP, without the bandwidth), whereas it costs 2.5 USD/month for an individual. But we checked with some people from the Ops team: we never get a new IP! We still have the same one we had at the beginning (almost 2.5 years ago). So we took a bet: rely on Cloudflare to handle a DDNS-like setup, and so far it has worked.
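A DDNS-like setup on Cloudflare could look roughly like the sketch below. It is not our exact script: the record name, the cache file path, the 120-second TTL, the use of api.ipify.org to discover the public address, and the CF_API_TOKEN, CF_ZONE_ID and CF_RECORD_ID variables are all assumptions made for illustration. The idea is simply to compare the current public IP with the last one seen and, only when it changed, update the A record through the Cloudflare API.

```shell
#!/bin/bash
# Sketch of a DDNS-like updater for Cloudflare. All names are placeholders.
set -eu

CF_API="https://api.cloudflare.com/client/v4"

# Build the JSON body for an A record update.
build_payload() {
  local name="$1" ip="$2"
  printf '{"type":"A","name":"%s","content":"%s","ttl":120,"proxied":false}' "$name" "$ip"
}

# Succeed only when the new address is not empty and differs from the old one.
needs_update() {
  local old="$1" new="$2"
  [ -n "$new" ] && [ "$old" != "$new" ]
}

# Ask an external service for our public IPv4 address (assumed: api.ipify.org).
current_ip() {
  curl -4 -fsS https://api.ipify.org
}

# Push the new address to Cloudflare.
# Expects CF_API_TOKEN, CF_ZONE_ID and CF_RECORD_ID in the environment.
update_record() {
  local name="$1" ip="$2"
  curl -fsS -X PUT "$CF_API/zones/$CF_ZONE_ID/dns_records/$CF_RECORD_ID" \
    -H "Authorization: Bearer $CF_API_TOKEN" \
    -H "Content-Type: application/json" \
    --data "$(build_payload "$name" "$ip")" >/dev/null
}

main() {
  local name="$1" cache="${2:-/var/tmp/ddns-last-ip}"
  local old="" new
  [ -f "$cache" ] && old="$(cat "$cache")"
  new="$(current_ip)"
  if needs_update "$old" "$new"; then
    update_record "$name" "$new"
    printf '%s\n' "$new" > "$cache"
  fi
}

# Only touch the network when explicitly asked to,
# e.g. ./ddns.sh --run office.example.com
if [ "${1:-}" = "--run" ]; then
  shift
  main "$@"
fi
```

Run from cron every few minutes, a script like this stays silent as long as the address does not move, which matches what we observed over 2.5 years.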

Then comes the famous question: how should we deploy our software locally?

Pick a tech!

Obviously some people pushed for Kubernetes, but sorry, I like my systems simple and efficient, with real security boundaries between my different components, so containers were obviously out (some people are whispering about runx or kata, but they still seem to be a bit of a hassle to set up properly).

As you know, we are already using Nomad for some projects, and we like it, but this platform is where we are supposed to go if some Nomad clusters are down, so we don't want to rely on the same technology.

What was left? "Old-school" VMs. We evaluated 3 technologies:

  • KVM, which we have been using for a long time. It's performant and integrated into the kernel, but hard to manage (the number of crazy options!)
  • bhyve, the hypervisor/virtual machine manager for FreeBSD. We are using it on some servers. It's also performant and easy to understand, but the tooling does not match KVM's
  • XCP-ng, originally based on XenServer. It comes as an OS, ready to be installed as a host. Of course, for crazy Arch guys like us, it sounded a bit old school, but we needed to spend as little time as possible on this project, so this was our choice!

Furthermore, if you couple XCP-ng with Xen Orchestra, you have a very capable (and lightweight) platform to manage your VMs!

Some examples of Xen Orchestra

Installing our software

Yeah, we are ready to install... wait, does Xen Orchestra support Arch? Nope, it does not! So the first step is to create our own template!

How do we do it? By not starting with Arch! Yes, that's weird, but stay with me!

We are going to have around 10 VMs permanently up, plus around 10 for testing, and we have some people coming to the office from time to time, most of them running Arch! So let's start by setting up a local Arch mirror!

How do we do it? Very simple! I connect to the NAS and create a cron job with the following script:

#!/bin/bash
set -e
set -x
REPO=rsync://ftp.acc.umu.se/mirror/archlinux # You can choose your own!
DEST="<Your repo path>"

mkdir -p "$DEST"

# Common rsync options (kept as a plain string and expanded unquoted on purpose,
# so that each option becomes its own argument)
RSYNC_OPTS="-rtlH -4 --safe-links --no-motd --exclude=.*"

# Only be verbose on tty
if tty -s; then
  RSYNC_OPTS="$RSYNC_OPTS -v"
fi

# First get new package files (the pool) and don't delete anything
/usr/bin/rsync $RSYNC_OPTS "$REPO/pool/" "$DEST/pool/"

# ... and only then get the database, links and the structure
/usr/bin/rsync $RSYNC_OPTS --delete-after --delay-updates "$REPO/" "$DEST/" \
        --exclude iso/ --exclude other/ --exclude archive/ --exclude sources/

# --delete-before so that it frees disk space earlier
/usr/bin/rsync $RSYNC_OPTS --delete-before "$REPO/iso/" "$DEST/iso/" --exclude archboot

Once this is done, just serve the repo over HTTP, and you are good to go!
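On the client side, using the mirror comes down to one line in the pacman mirrorlist. The sketch below assumes the NAS serves the repo at http://nas.example.lan/archlinux (a made-up address) and puts that mirror first, keeping the existing entries as a fallback. The function names are illustrative only.

```shell
#!/bin/bash
# Hypothetical address of the NAS serving the mirror over HTTP.
MIRROR_URL="${MIRROR_URL:-http://nas.example.lan/archlinux}"

# Print a pacman mirrorlist entry for the given base URL.
# $repo and $arch are pacman variables, so they must stay literal.
mirrorlist_entry() {
  printf 'Server = %s/$repo/os/$arch\n' "${1%/}"
}

# Put the local mirror first in a mirrorlist and keep the other entries.
# Running it twice does not duplicate the local entry.
prepend_mirror() {
  local list="$1" url="$2" tmp
  tmp="$(mktemp)"
  mirrorlist_entry "$url" > "$tmp"
  grep -vxF "$(mirrorlist_entry "$url")" "$list" >> "$tmp" || true
  cat "$tmp" > "$list"
  rm -f "$tmp"
}

# Usage (as root, on a client):
#   prepend_mirror /etc/pacman.d/mirrorlist "$MIRROR_URL"
```

Keeping the public mirrors below the local one means a laptop that leaves the office still gets its updates.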

Our repo is ready, let's launch our first Arch VM!

Templates

Cool, I installed my first VM, I finished the setup with Ansible, everything is cool, but I don't want to have to do this task manually every time! So how do you automate it? By creating a template! (Just to be clear, we do this only for generic VMs, not job-specific VMs)

So once you have installed your tools, like htop, iotop, etc., you go to XO and shut down the VM, then click on Advanced -> Convert to template. Give it a name and a description, and you are done.

A few remarks about these steps:

  • Be careful when you use systemd-networkd: you could end up with some DHCP "cache" that creates issues
  • The disk space allocated to the template cannot be changed afterwards, so don't allocate too much; it's better to use additional volumes.
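One likely cause of the DHCP issue (this is our assumption, other causes are possible) is that every clone inherits the template's /etc/machine-id, which systemd-networkd uses by default to derive its DHCP client identifier, so all clones look like the same client to the DHCP server. A minimal cleanup to run inside the VM just before shutting it down could look like this. The function name and the optional root prefix are illustrative.

```shell
#!/bin/bash
# Reset per-machine identity inside a VM before converting it to a template.
# The root prefix is "/" on the VM itself; another prefix is only useful
# for trying the function out in a scratch directory.
prepare_template() {
  local root="${1:-/}"
  root="${root%/}"

  # Empty machine-id: systemd generates a fresh one on the next boot.
  : > "$root/etc/machine-id"

  # Drop SSH host keys so clones do not share them (assumes they are
  # regenerated on boot, which sshdgenkeys.service does on Arch).
  rm -f "$root"/etc/ssh/ssh_host_*
}

# Usage (as root, inside the VM, right before shutdown):
#   prepare_template /
```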

Terraform

Once the template is done, we can deploy all the VMs. I will not go into detail about this part, as there is already a really nice tutorial: https://xen-orchestra.com/blog/virtops1-xen-orchestra-terraform-provider/

Ansible

The last step before having a fully working infrastructure is Ansible, but we don't want to maintain an inventory by hand, so there are 2 ways:

  • NetBox: Xen Orchestra can sync with NetBox automatically, and Ansible will load the inventory from NetBox
  • The Xen Orchestra inventory plugin for Ansible, which can load the inventory directly from the websocket interface of Xen Orchestra

As we are bootstrapping the first version of this infrastructure, we chose the second option, even if we are going to choose the first one for the next version!
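For the second option, the inventory source is a small YAML file. The sketch below generates one for the community.general.xen_orchestra plugin. The host name and user are placeholders, and the option names are given as we understand them from the plugin documentation, so check them against your installed collection version. The password is deliberately left out of the file, on the assumption that it is supplied through the ANSIBLE_XO_PASSWORD environment variable.

```shell
#!/bin/bash
# Write an inventory source for the community.general.xen_orchestra plugin.
# The file name should end with xen_orchestra.yml (or .yaml) so that the
# plugin recognises it.
write_xo_inventory() {
  local file="$1" host="$2" user="$3"
  cat > "$file" <<EOF
plugin: community.general.xen_orchestra
api_host: ${host}
user: ${user}
use_ssl: true
validate_certs: true
EOF
}

# Usage:
#   write_xo_inventory xen_orchestra.yml xo.example.lan ansible
#   ANSIBLE_XO_PASSWORD=... ansible-inventory -i xen_orchestra.yml --graph
```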

So now, we deployed everything! What is missing?

Backups

As we give a lot of talks to a wide variety of people, we sometimes have to come up with metaphors that everybody can understand, and we came up with this one:

Running a business without backups is like driving a car without a seat belt: if you crash, it's going to be painful!

We had 2 choices for this one:

  • we use the automated backup system from Xen Orchestra
  • we do our own based on Duplicity

The backups from Xen Orchestra have one big advantage: they are very easy to deploy. But why would you need to back up everything? I don't think backing up ls or cat 10 times is worth it.

So we deployed a full Duplicity system, with encryption of the backup and backup to Microsoft Azure UAE.
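To give an idea of what a selective, encrypted backup looks like, here is a minimal sketch, not our production setup. The GPG key ID, the list of paths and the azure://backups container are placeholders, and the duplicity options should be checked against your installed version. It is also an assumption that the Azure backend picks up its credentials from the environment (AZURE_CONNECTION_STRING in recent releases).

```shell
#!/bin/bash
# Print (DRY_RUN=1) or run a duplicity backup of selected paths only.
# Arguments: GPG key ID, target URL, then the paths worth backing up.
run_backup() {
  local gpg_key="$1" target="$2" n path
  shift 2
  n=$#

  # Turn each path into an "--include path" pair, then drop the raw paths.
  for path in "$@"; do
    set -- "$@" --include "$path"
  done
  shift "$n"

  # Everything not explicitly included is excluded, so system binaries
  # such as ls or cat never end up in the backup.
  set -- duplicity --full-if-older-than 1M --encrypt-key "$gpg_key" \
    "$@" --exclude '**' / "$target"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage:
#   DRY_RUN=1 run_backup ABCD1234 azure://backups /etc /home
```

Starting with DRY_RUN=1 makes it easy to review the exact command line before letting cron run it for real.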

Conclusion

We have a first version of our infra: it's stable and backed up, but we are still missing some pieces. This article is the first in a series about self-hosting for companies, and in the next one we are going to cover monitoring, server scaling, and more!

If you have a problem and no one else can help, maybe you can hire the Kalvad-Team.