PMM fails to run properly, reporting pmm-update-perform-init (exit status 1; not expected)
Symptom
After the OS was rebooted, PMM would not start, and the container log kept repeating the following errors:
    [root@mdw ~]# docker ps
    CONTAINER ID   IMAGE                           COMMAND                CREATED        STATUS                     PORTS                                             NAMES
    db294830a440   percona/pmm-server:2.40.1-el7   "/opt/entrypoint.sh"   3 months ago   Up 2 minutes (unhealthy)   80/tcp, 0.0.0.0:2443->443/tcp, :::2443->443/tcp   lhr-pmm-server

    [root@mdw ~]# docker logs -f db294830a440
    2024-02-08 01:50:20,006 INFO success: pmm-update-perform-init entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2024-02-08 01:50:20,006 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2024-02-08 01:50:23,861 INFO exited: postgresql (exit status 1; not expected)
    2024-02-08 01:50:24,014 INFO spawned: 'postgresql' with pid 6250
    2024-02-08 01:50:25,015 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2024-02-08 01:50:28,869 INFO exited: postgresql (exit status 1; not expected)
    2024-02-08 01:50:29,022 INFO spawned: 'postgresql' with pid 6370
    2024-02-08 01:50:30,021 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
    2024-02-08 01:50:30,387 INFO exited: pmm-update-perform-init (exit status 1; not expected)
    2024-02-08 01:50:31,024 INFO spawned: 'pmm-update-perform-init' with pid 6453
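The log above interleaves two crash-looping programs, so it can help to tally which one exits most often. The following is a minimal sketch, not part of the original troubleshooting session: the helper name `count_unexpected_exits` is hypothetical, and it assumes the supervisord line layout shown above (date, time, level, `exited:`, program name).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "not expected" exits per supervisord program.
# Reads log text on stdin and expects lines shaped like:
#   2024-02-08 01:50:23,861 INFO exited: postgresql (exit status 1; not expected)
count_unexpected_exits() {
    awk '/INFO exited: / && /not expected/ {
             count[$5]++          # field 5 is the program name
         }
         END {
             for (name in count) printf "%s %d\n", name, count[name]
         }' | sort
}
```

One possible use is `docker logs db294830a440 2>&1 | count_unexpected_exits`. In the excerpt above, `postgresql` exits more often than `pmm-update-perform-init`, which suggests looking at PostgreSQL first.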
Solution
    [root@mdw ~]# runlike -p db294830a440
    docker run --name=lhr-pmm-server \
        --hostname=lhr-pmm-server \
        --mac-address=02:42:c0:5c:00:0a \
        --volume=lhr-pmm-data:/srv \
        --workdir=/opt \
        -p 2443:443 \
        --expose=80 \
        --restart=always \
        --runtime=runc \
        --detach=true \
        percona/pmm-server:2.40.1-el7 \
        /opt/entrypoint.sh

    [root@mdw ~]# docker exec -it db294830a440 bash
    [root@lhr-pmm-server opt]# cat /etc/redhat-release
    CentOS Linux release 7.9.2009 (Core)

    [root@lhr-pmm-server opt]# cat /opt/entrypoint.sh
    #!/bin/bash
    set -o errexit

    # init /srv if empty
    DIST_FILE=/srv/pmm-distribution
    if [ ! -f $DIST_FILE ]; then
        echo "File $DIST_FILE doesn't exist. Initialize /srv..."
        echo docker > $DIST_FILE
        mkdir -p /srv/{clickhouse,grafana,logs,postgres14,prometheus,nginx,victoriametrics}
        echo "Copying plugins and VERSION file"
        cp /usr/share/percona-dashboards/VERSION /srv/grafana/PERCONA_DASHBOARDS_VERSION
        cp -r /usr/share/percona-dashboards/panels/ /srv/grafana/plugins
        chown -R grafana:grafana /srv/grafana
        chown pmm:pmm /srv/{victoriametrics,prometheus,logs}
        chown postgres:postgres /srv/postgres14
        echo "Generating self-signed certificates for nginx"
        bash /var/lib/cloud/scripts/per-boot/generate-ssl-certificate
        echo "Initializing Postgres"
        su postgres -c "/usr/pgsql-14/bin/initdb -D /srv/postgres14"
        echo "Enable pg_stat_statements extension"
        su postgres -c "/usr/pgsql-14/bin/pg_ctl start -D /srv/postgres14"
        su postgres -c "psql postgres postgres -c 'CREATE EXTENSION pg_stat_statements SCHEMA public'"
        su postgres -c "/usr/pgsql-14/bin/pg_ctl stop -D /srv/postgres14"
    fi

    # pmm-managed-init validates environment variables.
    pmm-managed-init

    # Start supervisor in foreground
    exec supervisord -n -c /etc/supervisord.conf

    [root@lhr-pmm-server opt]# ps -ef|grep post
    postgres 10278     1  0 02:03 ?        00:00:00 /usr/pgsql-14/bin/postgres -D /srv/postgres14 -c shared_preload_libraries=pg_stat_statements -c pg_stat_statements.max=10000 -c pg_stat_statements.track=all -c pg_stat_statements.save=off -c logging_collector=off
    postgres 10279 10278  0 02:03 ?        00:00:00 postgres: startup recovering 0000000100000000000000F9
    root     10281  7674  0 02:03 pts/0    00:00:00 grep --color=auto post
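To watch the startup process across restarts, the WAL segment name can be pulled out of the `ps` output. This is only a sketch: the function name `recovering_segment` is hypothetical, and it assumes the process title format `postgres: startup recovering <24 hex digits>` seen in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps -ef` output on stdin and print the WAL
# segment that the PostgreSQL startup process reports it is recovering.
# Returns non-zero when no such process title is found.
recovering_segment() {
    sed -n 's/.*postgres: startup recovering \([0-9A-F]\{24\}\).*/\1/p' \
        | head -n 1 \
        | grep .    # fails (non-zero status) on empty input
}
```

Inside the container it could be run as `ps -ef | recovering_segment`. A recovering startup process is normal right after an unclean shutdown; it becomes suspicious when the same segment keeps reappearing while supervisord repeatedly restarts `postgresql`.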
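The `ps` output shows that PostgreSQL is stuck in startup recovery (`postgres: startup recovering 0000000100000000000000F9`), so we reset the WAL directly: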