合 PG使用插件pg_squeeze收缩膨胀的表和索引
Tags: PGPostgreSQL插件表膨胀膨胀索引膨胀收缩pg_squeeze
简介
PostgreSQL中大量更新或者删除记录后,加上autovacuum参数未做优化或设置不当,会导致表及索引膨胀。生产环境除了手动使用vacuum之外,还有两个比较常用的工具:一个是pg_repack,另外一个是pg_squeeze。
使用pg_repack或pg_squeeze对表做重组时,比vacuum full对系统的影响小,且性能更高。
安装pg_squeeze
GitHub:https://github.com/cybertec-postgresql/pg_squeeze
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | -- yum安装(PG源就有) yum list pg_squeeze* yum install -y pg_squeeze_14 -- 编译安装 wget https://github.com/cybertec-postgresql/pg_squeeze/archive/refs/tags/REL1_5_0.tar.gz tar -zxvf REL1_5_0.tar.gz cd pg_squeeze-REL1_5_0/ make && make install cat >> /var/lib/pgsql/14/data/postgresql.conf <<"EOF" wal_level = logical max_replication_slots = 10 shared_preload_libraries = 'pg_squeeze' EOF pg_ctl restart CREATE EXTENSION pg_squeeze; |
过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | postgres=# create extension pg_squeeze ; CREATE EXTENSION postgres=# set search_path to 'squeeze'; SET postgres=# \dt List of relations Schema | Name | Type | Owner ---------+-----------------+-------+---------- squeeze | errors | table | postgres squeeze | log | table | postgres squeeze | tables | table | postgres squeeze | tables_internal | table | postgres squeeze | tasks | table | postgres (5 rows) postgres=# \dn List of schemas Name | Owner ---------+---------- public | postgres repack | postgres squeeze | postgres (3 rows) |
pg_repack使用例子
对test数据库下的foo和bar表做在线vacuum full
1 | $ pg_repack --no-order --table foo --table bar test |
转移foo表的索引到tbs表空间
1 | $ pg_repack -d test --table foo --only-indexes --tablespace tbs |
pg_squeeze使用例子
以注册表方式crontab定期运行“squeeze挤压”
1 2 | INSERT INTO squeeze.tables (tabschema, tabname, schedule) VALUES ('public', 'foo', ('{30}', '{22}', NULL, NULL, '{3, 5}')); |
手工对public模式下的pgbench_accounts表“squeeze挤压”
1 | SELECT squeeze.squeeze_table('public', 'pgbench_accounts', null, null, null); |
pg_squeeze测试
创建测试表
1 2 | create table public.test(id int primary key); insert into public.test select generate_series(1,2000000); |
查看表的大小
1 2 3 4 5 6 7 | postgres=# SELECT pg_size_pretty(pg_total_relation_size('test')); pg_size_pretty ---------------- 112 MB (1 row) postgres=# |
向squeeze.tables插入一条数据,定期清理test表,每天的0点、2点、6点的10、30和50分别执行1次,空闲空间超过10%就会对表进行重建.
1 2 3 4 5 6 7 8 9 10 11 12 | postgres=# insert into squeeze.tables (tabschema, tabname, schedule, free_space_extra) values ('public', 'test', ('{10,30,50}', '{00,02,06}', NULL, NULL, NULL), '10'); INSERT 0 1 postgres=# select * from squeeze.tables; id | tabschema | tabname | clustering_index | rel_tablespace | ind_tablespaces | free_space_extra | min_size | vacuum_max_age | max_retry | skip_analyze | schedule ----+-----------+---------+------------------+----------------+-----------------+------------------+----------+----------------+-----------+--------------+------------------------ 1 | public | test | | | | 10 | 8 | 01:00:00 | 0 | f | ("{0,59}","{0,23}",,,) (1 row) postgres=# -- schedule的几个列表示:分钟、小时、天、月、周 |
“minutes”(0到59)和“hours”(0到23)指定了一天内检查的时间,而“days_of_month”(1到31)、“months”(1到12)和“days_of_week”(0到7,其中0和7都代表星期日)确定了检查的日期。
如果“minute”、“hour”和“month”都与当前时间戳匹配,则进行检查,而NULL值分别表示任何分钟、小时和月份。至于“days_of_month”和“days_of_week”,至少有一个需要与当前时间戳匹配,或者两者都为NULL才会进行检查。
启动pg_squeeze的进程需要调用