如何用supervisor守护php-fpm主进程以实现php-fpm的自动重启
最近有同事有个针对php-fpm进程的监护需求,也即:如果php-fpm的master进程意外退出(可能是crash,也可能是被误kill),那么希望master进程能被自动拉起,以免中断服务。
我们知道,supervisor是一个非常强大的进程监控(monitor & control)工具,它理论上可以实现php-fpm master进程的守护需求。因此,我帮同事试验了如何用supervisor完成他的需求,结果表明,supervisor确实是神器,只需一个合理的配置文件,它就能解决问题。
下面是我的调研过程及最终实现php-fpm主进程守护功能的配置文件,在此做个记录,也希望能帮助到别人。
supervisor本身是python实现的,而且是调研阶段,故先创建一个新的virtualenv环境,然后用pip安装好supervisor包。
至此,基本的调研环境搭建完毕。当然,php-fpm和PHP环境以及前端的Nginx是早就ready的。
通常编译安装PHP后,php-fpm这个2进制的C程序也会被编译并安装好,典型路径在php_install_path/sbin/目录下。该目录下还有个名为php-fpm.sh的脚本用于控制php-fpm进程的start/stop/restart/reload等动作。
./sbin/php-fpm.sh脚本中,”start”操作启动了php-fpm主进程,其余的操作都是通过向php-fpm master进程发signal实现的。
<code class=" hljs bash"><span class="hljs-comment">## code segment in php-fpm.sh</span><span class="hljs-keyword">case</span> <span class="hljs-string">"<span class="hljs-variable">$1</span>"</span> <span class="hljs-keyword">in</span> start) <span class="hljs-built_in">echo</span> -n <span class="hljs-string">"Starting php-fpm "</span> <span class="hljs-comment">## 下面这行是关键命令</span> <span class="hljs-variable">$php_fpm_BIN</span> --daemonize <span class="hljs-variable">$php_opts</span> <span class="hljs-keyword">if</span> [ <span class="hljs-string">"$?"</span> != <span class="hljs-number">0</span> ] ; <span class="hljs-keyword">then</span> <span class="hljs-built_in">echo</span> <span class="hljs-string">" failed"</span> <span class="hljs-keyword">exit</span> <span class="hljs-number">1</span> <span class="hljs-keyword">fi</span> wait_<span class="hljs-keyword">for</span>_pid created <span class="hljs-variable">$php_fpm_PID</span> <span class="hljs-keyword">if</span> [ -n <span class="hljs-string">"<span class="hljs-variable">$try</span>"</span> ] ; <span class="hljs-keyword">then</span> <span class="hljs-built_in">echo</span> <span class="hljs-string">" failed"</span> <span class="hljs-keyword">exit</span> <span class="hljs-number">1</span> <span class="hljs-keyword">else</span> <span class="hljs-built_in">echo</span> <span class="hljs-string">" done"</span> <span class="hljs-keyword">fi</span> ;;</code>
从上面是终端输入”./sbin/php-fpm.sh start”时,实际执行的代码,可以看到,php-fpm进程的启动参数是–daemonize $php_opts,而$php_opts的值为”–fpm-config $php_fpm_CONF –pid $php_fpm_PID”。
注意: php-fpm.sh启动php-fpm master进程时,传入了daemonize参数,表明php-fpm master process以守护(daemon)方式启动,而根据supervisor文档的说明,当用supervisor监护进程时,被监护进程不能是守护进程,这是由于守护进程通常会在fork完子进程后就让父进程”结束生命”,也即由supervisor创建的父进程退出,此时,supervisor无法再监护已退出进程创建出来的子进程。关于daemon process的行为,可以参考Linux Daemon Writing HOWTO一文来理解。
根据上面的分析,我们知道,只要supervisor启动php-fpm进程时,不传入daemonize参数即可。
上面的分析已经告诉我们应该怎么解决问题了,下面直接上验证可用的配置文件。文件位于php-fpm.conf同级目录下(典型路径为php_install_path/etc/)。
<code class=" hljs vhdl">[inet_http_server] ; inet (TCP) server disabled by <span class="hljs-keyword">default</span><span class="hljs-keyword">port</span>=<span class="hljs-number">127.0</span><span class="hljs-number">.0</span><span class="hljs-number">.1</span>:<span class="hljs-number">9015</span> ; (ip_address:<span class="hljs-keyword">port</span> specifier, *:<span class="hljs-keyword">port</span> <span class="hljs-keyword">for</span> <span class="hljs-keyword">all</span> iface)[supervisord]logfile=./var/log/supervisord.log ; (main log <span class="hljs-keyword">file</span>;<span class="hljs-keyword">default</span> $CWD/supervisord.log)logfile_maxbytes=<span class="hljs-number">50</span>MB ; (max main logfile bytes b4 rotation;<span class="hljs-keyword">default</span> <span class="hljs-number">50</span>MB)logfile_backups=<span class="hljs-number">2</span> ; (num <span class="hljs-keyword">of</span> main logfile rotation backups;<span class="hljs-keyword">default</span> <span class="hljs-number">10</span>)loglevel=info ; (log level;<span class="hljs-keyword">default</span> info; <span class="hljs-keyword">others</span>: debug,warn,trace)pidfile=./var/run/supervisord.pid ; (supervisord pidfile;<span class="hljs-keyword">default</span> supervisord.pid)nodaemon=false ; (start <span class="hljs-keyword">in</span> foreground <span class="hljs-keyword">if</span> true;<span class="hljs-keyword">default</span> false)minfds=<span class="hljs-number">1024</span> ; (min. avail startup <span class="hljs-keyword">file</span> descriptors;<span class="hljs-keyword">default</span> <span class="hljs-number">1024</span>)minprocs=<span class="hljs-number">200</span> ; (min. avail <span class="hljs-keyword">process</span> descriptors;<span class="hljs-keyword">default</span> <span class="hljs-number">200</span>)identifier=sup.php-fpm ; (supervisord identifier, <span class="hljs-keyword">default</span> <span class="hljs-keyword">is</span> <span class="hljs-attribute">'supervisor</span>')[rpcinterface:supervisor]supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface[supervisorctl]serverurl=http://<span class="hljs-number">127.0</span><span class="hljs-number">.0</span><span class="hljs-number">.1</span>:<span class="hljs-number">9015</span> ; <span class="hljs-keyword">use</span> an http:// url <span class="hljs-keyword">to</span> specify an inet socket[program:php-fpm]command=bash -c <span class="hljs-string">"sleep 1 && /home/slvher/tools/php/5.6.11/sbin/php-fpm --fpm-config /home/slvher/tools/php/5.6.11/etc/php-fpm.conf --pid /home/slvher/tools/php/5.6.11/var/run/php-fpm.pid"</span> ; the program (relative uses PATH, can take args)process_name=%(program_name)s ; process_name expr (<span class="hljs-keyword">default</span> %(program_name)s)autostart=true ; start at supervisord start (<span class="hljs-keyword">default</span>: true)autorestart=true ; whether/<span class="hljs-keyword">when</span> <span class="hljs-keyword">to</span> restart (<span class="hljs-keyword">default</span>: unexpected)startretries=<span class="hljs-number">5</span> ; max # <span class="hljs-keyword">of</span> serial start failures (<span class="hljs-keyword">default</span> <span class="hljs-number">3</span>)exitcodes=<span class="hljs-number">0</span>,<span class="hljs-number">2</span>,<span class="hljs-number">70</span> ; <span class="hljs-attribute">'expected</span>' <span class="hljs-keyword">exit</span> codes <span class="hljs-keyword">for</span> <span class="hljs-keyword">process</span> (<span class="hljs-keyword">default</span> <span class="hljs-number">0</span>,<span class="hljs-number">2</span>)stopsignal=QUIT ; <span class="hljs-keyword">signal</span> used <span class="hljs-keyword">to</span> kill <span class="hljs-keyword">process</span> (<span class="hljs-keyword">default</span> TERM)stopwaitsecs=<span class="hljs-number">2</span> ; max num secs <span class="hljs-keyword">to</span> <span class="hljs-keyword">wait</span> b4 SIGKILL (<span class="hljs-keyword">default</span> <span class="hljs-number">10</span>)</code>
配置文件结构通过查看supervisor文档很容易就能掌握,有两个配置项需要特别注意:
它指定了supervisor要监控的进程的启动命令,可以看到,这里我们没有给php-fpm传入daemonize参数,其余参数只是展开了php-fpm.sh中的shell变量而已。
大家已经注意到,command也不是直接调起php-fpm,而是通过bash -c执行了两个命令,而第一个命令是sleep 1。这是由于php-fpm在stop后,其占用的端口通常不能立即释放,此时,supervisor以极快的速度试图重新拉起进程时,可能会由于报如下错误而导致几次retry均失败:
<code class=" hljs vbscript">## var/<span class="hljs-built_in">log</span>/php-fpm.<span class="hljs-keyword">error</span>.<span class="hljs-built_in">log</span>[<span class="hljs-number">18</span>-Jul-<span class="hljs-number">2015</span> <span class="hljs-number">21</span>:<span class="hljs-number">35</span>:<span class="hljs-number">28</span>] <span class="hljs-keyword">ERROR</span>: unable <span class="hljs-keyword">to</span> bind listening socket <span class="hljs-keyword">for</span> address <span class="hljs-comment">'127.0.0.1:9002': Address already in use (98)</span>[<span class="hljs-number">18</span>-Jul-<span class="hljs-number">2015</span> <span class="hljs-number">21</span>:<span class="hljs-number">35</span>:<span class="hljs-number">28</span>] <span class="hljs-keyword">ERROR</span>: FPM initialization failed</code>
而supervisor目前还不支持delay restart功能,因此,这里只能通过先sleep再启动的略显tricky的方法来解决问题,结果表明,疗效不错且无副作用。-_-
其文档描述如下:
<code class=" hljs livecodeserver">May be <span class="hljs-constant">one</span> <span class="hljs-operator">of</span> <span class="hljs-constant">false</span>, unexpected, <span class="hljs-operator">or</span> <span class="hljs-constant">true</span>. If <span class="hljs-constant">false</span>, <span class="hljs-operator">the</span> <span class="hljs-built_in">process</span> will never be autorestarted. If unexpected, <span class="hljs-operator">the</span> <span class="hljs-built_in">process</span> will be restart when <span class="hljs-operator">the</span> program exits <span class="hljs-operator">with</span> <span class="hljs-operator">an</span> exit code that is <span class="hljs-operator">not</span> <span class="hljs-constant">one</span> <span class="hljs-operator">of</span> <span class="hljs-operator">the</span> exit codes associated <span class="hljs-operator">with</span> this <span class="hljs-built_in">process</span>’ configuration (see exitcodes). If <span class="hljs-constant">true</span>, <span class="hljs-operator">the</span> <span class="hljs-built_in">process</span> will be unconditionally restarted when <span class="hljs-keyword">it</span> exits, <span class="hljs-keyword">without</span> regard <span class="hljs-built_in">to</span> its exit code.</code>
其默认值是unexpected,表示若被监护进程的exit code异常时,supervisor才会重新拉起进程。这里设置为true,表明任何时候进程退出均会被再次拉起。
这样配置好后,在本文第1步搭建好的virtualenv环境中,运行如下命令即可完成supervisor对php-fpm master进程的监护:
<code class=" hljs avrasm">shell> supervisord -c etc/sup<span class="hljs-preprocessor">.php</span>-fpm<span class="hljs-preprocessor">.conf</span></code>
然后,通过ps x | fgrep fpm可以看到,php-fpm主进程已经被拉起了。
然后,kill掉php-fpm主进程,再次ps x | fgrep fpm可以看到,一个新的php-fpm主进程会被supervisor创建出来。
至此,用supervisor守护php-fpm主进程以实现php-fpm的自动重启的需求已经解决了。
========================= EOF ====================
版权声明:本文为博主原创文章,未经博主允许不得转载。