
How Nexedi moved to HTTP2 with Caddy

After trying and failing to move to HTTP2 using Apache in 2017, Nexedi has adopted Caddy, an HTTP server written in Go. Initial results with HTTP2 show a significant speed improvement and good stability. The next step will be the adoption of QUIC. Migration was smooth and entirely automated thanks to a test-driven devops approach recently introduced in SlapOS.
  • Last Update:2018-07-18
  • Version:001
  • Language:en

Back in 2017, Nexedi tried to move to HTTP2 using Apache's HTTP2 module. It was a failure: named servers would show the content of another web site after switching to HTTP2. We also found that Apache has complex process leak bugs which, although supposedly fixed, are still present after 7 years. Since those bugs were too hard for us to fix considering our limited know-how in C/C++ concurrent memory management, we decided to consider alternatives.

We first thought of using NGINX, which is already available as part of our Open Source / Free CDN Software in SlapOS. But NGINX has its own share of issues, and they are no easier to fix than those of Apache.

Then, we discovered a recurring pattern: every 10 years, a new HTTP server is released. It becomes the most popular HTTP server within 10 years then declines. Apache was released 20 years ago, NGINX was released 10 years ago. Both Apache and NGINX brought significant progress to the Web. Apache is declining. NGINX became trendy 10 years ago and is still growing, but does not really solve our problems with Apache.

So, we considered what would be the successor to Apache and NGINX.

We believe that Caddy could be that successor because it brings to the Web two features that are absent in both Apache and NGINX:

  • a stable implementation of HTTP2/QUIC;
  • a programming language that drastically reduces the risk of memory leaks or process leaks when implementing extensions.

With HTTP servers nowadays able to process 400,000 requests per second on low-end hardware, memory leaks and process leaks have become a major risk. Caddy handles this risk better because its programming language, Go, handles it better. Go includes a garbage collector that solves the problems of memory allocation, and it is based on a clean model of concurrency whose scheduler relies on a work-stealing algorithm. All this comes at nearly the same raw speed as compiled C or C++.
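To make the point about garbage collection and concurrency concrete, here is a minimal sketch in Go. It is not taken from Caddy: the function name `handleAll` and the 4096-byte per-request buffer are assumptions made up for illustration. Each simulated request runs in its own goroutine and allocates memory that is never freed by hand.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// handleAll simulates n concurrent requests, one goroutine per request.
// It is a hypothetical helper for illustration, not part of Caddy.
func handleAll(n int) int64 {
	var handled int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Assumed per-request scratch buffer. There is no free():
			// the garbage collector reclaims it once it is unreachable.
			buf := make([]byte, 4096)
			buf[0] = byte(id)
			// Atomic increment avoids a data race on the shared counter.
			atomic.AddInt64(&handled, 1)
		}(i)
	}
	// Wait for every goroutine to finish, so none is left running.
	wg.Wait()
	return atomic.LoadInt64(&handled)
}

func main() {
	fmt.Println(handleAll(10000))
}
```

The Go runtime multiplexes these goroutines onto a small number of operating system threads, so starting ten thousand of them does not create ten thousand processes or threads.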

We thus considered that Caddy was the best option for us. Rather than fixing problems in Apache that had not been solved after years, we decided to adopt Caddy, benefit immediately from good HTTP2/QUIC, and then contribute to Caddy whichever extensions are still missing. Programming in Go without memory errors or process leaks is much easier for most Python programmers than programming in C or C++. By adopting Caddy, we actually adopt a high performance HTTP2 server to which anyone in Nexedi can contribute without much risk. This is invaluable in our opinion.

Since we use SlapOS everywhere in Nexedi for devops, we wanted to achieve an important goal: a seamless transition from Apache to Caddy, with maximum feature compatibility. In order to do so, we started by adding tests for our current Apache frontend, testing Apache compilation, instantiation and automatic configuration based on SlapOS Master input. Each test tries to cover one specific feature which we are using in our frontends: how files are cached, how named virtual hosts are created, etc.

Once we had covered our Apache frontend with enough test cases, we used a Test-Driven Development (TDD) approach to achieve the same functionality with Caddy as with Apache for a selected set of minimal viable features. Since we used the same test suite for both Apache and Caddy, we could be sure that the transition would be seamless and automatic at SlapOS level: nothing had to be changed in instance configuration, yet behaviour would be the same.
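The idea of one test suite shared by two frontends can be sketched as follows. This is an illustration only, not the SlapOS test suite: `perHostFrontend`, `brokenFrontend` and `checkVirtualHosts` are hypothetical names, and the in-process handlers merely stand in for real Apache or Caddy instances. The single check verifies that each named virtual host serves its own content, which is the behaviour that broke in our Apache HTTP2 attempt.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// perHostFrontend is a stand-in frontend serving one body per host name.
func perHostFrontend(sites map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, ok := sites[r.Host]
		if !ok {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, body)
	})
}

// brokenFrontend ignores the host name and always serves the same body,
// mimicking a frontend that shows the content of another web site.
func brokenFrontend(body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, body)
	})
}

// checkVirtualHosts is one feature test that knows nothing about the
// implementation behind the handler: every host must get its own content.
func checkVirtualHosts(frontend http.Handler, sites map[string]string) error {
	for host, want := range sites {
		req := httptest.NewRequest(http.MethodGet, "http://"+host+"/", nil)
		rec := httptest.NewRecorder()
		frontend.ServeHTTP(rec, req)
		if got := rec.Body.String(); got != want {
			return fmt.Errorf("host %s: got %q, want %q", host, got, want)
		}
	}
	return nil
}

func main() {
	sites := map[string]string{"a.example": "site A", "b.example": "site B"}
	candidates := []struct {
		name     string
		frontend http.Handler
	}{
		{"per-host frontend", perHostFrontend(sites)},
		{"broken frontend", brokenFrontend("site A")},
	}
	// The same check runs unchanged against every candidate frontend.
	for _, c := range candidates {
		if err := checkVirtualHosts(c.frontend, sites); err != nil {
			fmt.Printf("%s: FAIL (%v)\n", c.name, err)
			continue
		}
		fmt.Printf("%s: PASS\n", c.name)
	}
}
```

Because the check only talks to the frontend through HTTP semantics, swapping the implementation behind it requires no change to the test itself.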

Once all test cases passed with Caddy, we decided to start the automated migration process on about 3000 named virtual hosts. It completed after about 20 minutes. No incident happened. All sites immediately started to provide faster responses thanks to HTTP2 optimisations.

Our next step will be to extend Caddy and add QUIC support. 

And maybe some day, we will move from Caddy to a Cython-based HTTP server, if it can pass the same test suite.