perl-handler-utils + Systemd
We just had another run-in with Systemd and I thought I'd share our findings for posterity.
By now 5209R is really getting to a point where the initially somewhat bumpy ride gets a hell of a lot smoother. There was just another quirk with Apache restarts that affected 5207R, 5208R and especially 5209R.
Say you make a small change to a Vsite. Like creating an SSL certificate for it, modifying the used PHP method, or enabling/disabling "Web Server Alias redirects". In these cases we need to reload or restart Apache in order for the changed config to come into effect.
Ideally we would like to "reload" Apache and not "restart" it, because that would be less disruptive for people who are currently visiting hosted websites. But as it is, a "reload" isn't immediate: Apache children will continue to serve requests until a timeout or until the request is finished. So in a lot of cases a "reload" will give you the impression that your config change didn't work, although it'll eventually kick in.
So we opted for a "restart" instead.
But here is the bugger: We have about six GUI handlers which might (or might not) feel inclined to issue either a "reload" or a "restart" of Apache, depending on whether they actually changed the config or not. Typically this boils down to 2-4 requests to the init service to either "reload" or "restart" Apache. We don't mind that much if Apache gets restarted more than once, although we try to avoid it. But it has to get restarted reliably. And that it didn't. Especially not on 5209R, due to Systemd.
Apache is a special case for us. As said, it might see several "reload" or "restart" requests from handlers, so we pipe all of them to a client/daemon called Sauce::Service. It takes the requests and waits a few seconds to see if more than one comes in. Then it looks at them and decides what to do. We have one "reload" and one "restart" request? The "restart" wins and gets executed. We have 3x "restart"? We do all of them. It's not ideal, but architecturally we can't really avoid it.
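To make the "restart wins" rule a bit more tangible, here is a minimal shell sketch of just that precedence decision. It is not the actual Sauce::Service code (which is Perl); the function name winning_action is made up for illustration, and the sketch only answers which kind of request wins within one batch, not how many times it then gets executed.

```shell
# Hypothetical helper, not part of Sauce::Service.
# Takes the requests collected during the wait window as arguments
# and prints the kind of action that wins: "restart" beats "reload".
winning_action() {
    winner=""
    for action in "$@"; do
        case "$action" in
            restart)
                # A restart always takes precedence.
                winner="restart"
                ;;
            reload)
                # A reload only counts if nothing stronger was seen so far.
                if [ -z "$winner" ]; then
                    winner="reload"
                fi
                ;;
        esac
    done
    # Prints an empty line if there were no requests at all.
    printf '%s\n' "$winner"
}
```

Calling it as "winning_action reload restart" prints "restart", while a batch of only "reload" requests prints "reload".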
But Apache is a fickle case: It spawns a master process and several children. Sometimes the master process dies and the children remain in a state where they no longer do anything productive. If you run "/sbin/service httpd status" to check if Apache is up, it will then tell you that it is. But it's no longer working right due to the detached children.
On CentOS 7 with Systemd we'd use "systemctl status httpd" or "systemctl is-active httpd" (for example) and even that would tell us a lie if the children have detached and the master process is gone. The detached children even continue to send Systemd watchdog events, so bloody Systemd will not even try to restart Apache. Even if it does (because you tell it to!), it doesn't. Because it's still running and the "stop" command doesn't terminate the detached children. Great, isn't it?
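Since the status commands can't be trusted here, one alternative is to look at the master process directly. The following is only a sketch of that idea, not what Sauce::Service does: the function name master_alive is made up, and the PID file location mentioned in the comment is an assumption that depends on your PidFile setting. It also only proves that some process with that PID exists, so a recycled PID would fool it.

```shell
# Hypothetical check, not part of Sauce::Service.
# Succeeds only if the PID file exists, holds a plain number, and a
# process with that PID is alive. Needs to run as root (or as the owner
# of the process), because "kill -0" fails without permission to signal.
master_alive() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    # Take the first line and strip whitespace.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    # Signal 0 sends nothing, it only tests whether the process exists.
    kill -0 "$pid" 2>/dev/null
}

# Example call (the path is an assumption, check your PidFile directive):
#   master_alive /run/httpd/httpd.pid || echo "httpd master process is gone"
```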
So for a reliable restart we need to make sure that there are no detached children. So we have Sauce::Service check for this. We have it kill off all httpd processes (except those of AdmServ). And then we can issue a restart.
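As a rough illustration of how such a check could look, here is a small shell sketch. It is not the Sauce::Service implementation. It assumes that children which lost their master get re-parented to PID 1, and that anything named "httpd" with parent PID 1 that isn't the current master is therefore detached. The function name list_orphaned_httpd is made up, and the sketch does not know about AdmServ, so a real version would have to filter those processes out before killing anything.

```shell
# Hypothetical helper, not part of Sauce::Service.
# Reads "PID PPID COMMAND" lines on stdin and prints the PIDs of httpd
# processes whose parent is PID 1 and which are not the known master.
# $1 is the PID of the current master process (0 if there is none).
list_orphaned_httpd() {
    main_pid="${1:-0}"
    awk -v main="$main_pid" \
        '$3 == "httpd" && $2 == 1 && $1 != main { print $1 }'
}

# Example call on a Systemd box (only lists, doesn't kill anything):
#   main=$(systemctl show -p MainPID httpd.service | cut -d= -f2)
#   ps -eo pid=,ppid=,comm= | list_orphaned_httpd "$main"
```

Keeping the parsing separate from the "ps" call makes it easy to try the logic on canned input before letting it anywhere near "kill".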
SysV init takes multiple Apache restarts within the same 1-2 second timeframe lightly. Systemd? I thought you might ask. It doesn't if you use "systemctl restart httpd", because that puts each request into the Systemd job queue, and the queue seems to bugger out if requests for the same service come in too fast.
So we had the really great idea of not using Systemd for restarting Apache. How about the "apachectl" command? We use it rarely, but went looking at what it does. Forget it! It's a shell script that pipes start|stop|restart|condrestart and so on to Systemd. Which we wanted to avoid.
How about "/usr/sbin/httpd -k restart"? This restarts Apache just fine, if you already took care of detached children. But guess what? If our Sauce::Service implementation runs "/usr/sbin/httpd -k restart", we get a PolicyKit warning from bloody Systemd! And then the box is so fucked that you can't restart Apache in *ANY* way at all. Even as "root" from the command line none of the restart methods works anymore. Neither said "/usr/sbin/httpd -k restart" nor "systemctl restart httpd".
The *only* fix is a hard reboot. Jesus fucking Christ!
This is so marvelously ridiculous that I'm really at a loss for curse words. I can only reiterate what I said on other occasions about Systemd:
If I were in a room with Hitler, Stalin and Lennart Poettering and had a gun with only two bullets? I'd shoot Poettering twice. Just to be sure.
Eventually we got the Apache restart/reload problem sorted by switching Sauce::Service to use this command to restart Apache:
/usr/bin/systemctl --job-mode=flush restart httpd.service
This tells Systemd that the last transaction wins: when the new restart job is enqueued, the jobs already waiting in the Systemd queue are canceled.
But I'm still not over the fact that a PolicyKit violation fucks up the box in a fashion that you can only shake it loose again by doing a hard reboot. I mean ... seriously? What potential problem short of a kernel-"Ooops!!" really requires a reboot these days? You know what? Gimme a third bullet and I'll shoot the fucking horse that Poettering rode in on, too! PolicyKit and Systemd? My ass!
Sorry for the rant, but I'm so "slightly upset" about it I had to get this off my chest.
For what it's worth: The updated perl-handler-utils RPM for 5207R, 5208R and 5209R now available on YUM bears the fruits of this hard labor and should make sure that your Apache is now reliably restarted whenever it needs to be restarted.