Scaling up the Continuous Integration infrastructure for Eclipse Foundation’s projects — Act 2

TL;DR

The infrastructure improvements and migration described in last year's post are finally happening, with some tweaks.

As of today, more than 250 Eclipse projects use the build infrastructure at the Eclipse Foundation. For a year now, we have been planning how to scale and expand the infrastructure to keep up with the high demand. The main goals of this effort are to improve the utilization and efficiency of our current hardware and to integrate new hardware and cloud resources.

About a year ago, we announced that we were planning to migrate the entire build infrastructure at the Eclipse Foundation to CloudBees Core (formerly known as CloudBees Jenkins Enterprise, or CJE) on top of Red Hat OpenShift Container Platform, a Kubernetes distribution. What has happened since then? CloudBees Core was set up last June, and by last October we had created all Jakarta EE Jenkins instances (and a couple more) on it. Although we hit a couple of issues in the beginning, the whole stack allowed us to provide just enough computing resources to the ~40 Jakarta EE projects, and allowed GlassFish 5.1 to be built on the Eclipse infrastructure and released a couple of weeks ago. Still, we are not entirely satisfied with the current state, and a couple of showstoppers prevent us from migrating all existing Jenkins instances (the ones on ci.eclipse.org) to CloudBees Core. Here are two examples of such showstoppers:

  • First, we cannot create CloudBees Core Jenkins instances in a way that would allow us to set up resource quota management per project. This means that the whole cluster would be available to all projects, and the issue we have on our current infrastructure (one project can starve others) would be even worse: it would happen at the cluster level, impacting all projects and not only the ones sharing a machine with the greedy project.
  • The second main issue is about secret management, more specifically, secrets that cannot be stored in Jenkins credentials. On the current infrastructure, secrets are kept on the file system and we use good old POSIX permissions to prevent projects from reading other projects’ secrets. With CloudBees Core, all Jenkins instances run as the same cluster user and thus can read each other's secrets (I’m oversimplifying here; if you want details, feel free to reach out to us on the [email protected] mailing list). This is highly undesirable.
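
To make the second point concrete, here is a minimal sketch of the "good old POSIX permissions" idea: a secret file that only its owning user can read or write. The file name, the secret value, and the helper names (`write_secret`, `permission_bits`) are hypothetical and only illustrate the mechanism; this is not the Foundation's actual tooling.

```python
import os
import stat
import tempfile


def write_secret(path, value):
    """Write a secret so that only the owning user can read or write it."""
    # Create the file with mode 0600 (rw for the owner, nothing for anyone else).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Enforce the mode explicitly in case the file already existed
        # with looser permissions.
        os.fchmod(fd, 0o600)
        os.write(fd, value.encode("utf-8"))
    finally:
        os.close(fd)


def permission_bits(path):
    """Return only the permission bits of the file, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)


# Hypothetical project home directory and secret, for illustration only.
with tempfile.TemporaryDirectory() as project_home:
    secret_path = os.path.join(project_home, "deploy-token")
    write_secret(secret_path, "not-a-real-token")
    print(oct(permission_bits(secret_path)))
```

This kind of isolation only works when each project runs under its own user. Once every Jenkins instance runs as the same user, as described above, the permission bits no longer separate one project from another.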

We’ve worked with CloudBees to find solutions and/or workarounds, but of course we’re not their only customer and our requirements are not necessarily top priorities for their products. It was a bit frustrating because we knew that most of the issues could be solved by interacting with Kubernetes directly (OpenShift more specifically in our case). Instead, we had to go through the CloudBees Core UI/CLI, which abstracted the cluster management away and did not provide all the options we needed. Eventually, we had to move on. We learned a lot about Kubernetes/OpenShift in the meantime, and we tested what would happen if we deployed the open source version of Jenkins in the cluster. Surprisingly, it worked very well, and the freedom we get from having a hand in every aspect of the deployment really felt liberating.
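
As an illustration of what direct access to Kubernetes makes possible, here is a minimal sketch of a per-project resource quota. It assumes each project gets its own namespace, which this post does not state. The namespace name, the numbers, and the `project_quota` helper are made up for the example. `ResourceQuota` itself is a standard Kubernetes object, and the cluster accepts manifests in JSON as well as YAML.

```python
import json


def project_quota(namespace, cpu, memory, pods):
    """Build a Kubernetes ResourceQuota manifest for one project namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "project-quota", "namespace": namespace},
        "spec": {
            "hard": {
                # Upper bounds on the sum of all pods' requests and limits
                # in this namespace.
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
                # Maximum number of pods allowed in this namespace.
                "pods": str(pods),
            }
        },
    }


# Hypothetical values: 4 CPUs, 16 GiB of memory and 10 pods for one project.
print(json.dumps(project_quota("example-project", "4", "16Gi", 10), indent=2))
```

With a quota like this in each namespace, a greedy project would run into its own ceiling instead of starving the rest of the cluster.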

So here we are. We now have a tailored, organic, home-grown solution to manage Jenkins instances in the new clustered infrastructure, and we’re about to start the migration. The first instance to be migrated will be the CBI one, and that will happen in the next couple of days. We will also start reaching out to projects to announce when their migration will happen. Don’t call us, we will call you, very soon :). We don’t expect much disruption, and most projects will only need to change minor things in their build settings. You can track the global effort with the top-level ticket #544221 on Bugzilla.

We have set up a Migration FAQ on the Eclipse wiki. Feel free to let us know if you have any concerns or questions in the meantime.

As part of this effort, we would like to thank both CloudBees and Red Hat for their generous donations in the form of software and support. We especially thank CloudBees for their help and support. Unfortunately, a custom-tailored solution to our very specific requirements (feature-wise and time-wise) will be a better fit than CloudBees Core.