An Introduction to the New Features in Greenplum 7.2.0

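Introduction

Greenplum 7.0.0 was released on 2023-09-28. Roughly four and a half months later, Greenplum 7.1.0 was released on 2024-02-09, and Greenplum 7.2.0 followed on 2024-06-20.

In this article, Teacher Mai briefly explains some of the more practical new features.

Greenplum 7.2.0 Environment Setup

This Docker environment includes 1 master, 1 standby master, 2 segments, and 2 mirror instances. It also includes gpcc 7.1.1.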

Translated Release Notes

VMware Greenplum 7.2.0 is a minor release that includes new and changed features and resolves several issues.

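New and Changed Features

Enhancements

  • GPORCA now supports several features that previously only the Postgres-based planner supported:
    • Prepared statements
    • Functions containing query parameters
    • DISTINCT-qualified window aggregates
    • Full hash joins
  • As of 7.2.0, mirrorless Greenplum architectures no longer use the HA service, which stood in for the FTS probe of mirrored architectures, to provide high availability of Greenplum primary segments. The HA service required Greenplum state to be controlled as root for multiple services, which caused contention between the HA service and normal cluster utilities such as gpstart/gpstop. This issue and other usability issues are resolved by the new Postmaster service, which entirely replaces the original HA service. For more information about using the Postmaster service, see Installing the Greenplum High Availability Service.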
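  • VMware Greenplum 7.2.0 generates less WAL for COPY FROM on heap tables when it is executed in the same transaction as CREATE TABLE.
  • VMware Greenplum 7.2.0 improves gpexpand performance during segment cleanup.
  • VMware Greenplum 7.2.0 improves pg_basebackup, pg_rewind, and rsync logging to stdout, and retains recovery progress files in the log directory after a successful recovery.
  • VMware Greenplum 7.2.0 makes simple progress tracking the default for gpexpand and adds a --detailed-progress option for detailed progress tracking.
  • The gpcheckcat utility now includes a new test, mix_distribution_policy, which checks for tables created with legacy and non-legacy hash operations.
  • The gpsupport gp_log_collector tool now supports gathering VMware Greenplum Disaster Recovery logs through the new -with-gpdr-primary and -with-gpdr-recovery options.
  • VMware Greenplum now supports plan hints for scans, row estimation, join order, and join types.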
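  • VMware Greenplum now supports index scans on append-optimized tables, where previously only in-memory bitmap scans were supported.
  • VMware Greenplum 7.2.0 introduces two new server configuration parameters related to index scans of append-optimized tables:
    • gp_cpu_decompress_cost: lets a user fine-tune the cost of decompression during index scans of append-optimized tables.
    • gp_enable_ao_indexscan: enables index scans on append-optimized tables.
  • VMware Greenplum 7.2.0 introduces a new server configuration parameter, gp_appendonly_compaction_segfile_limit. This parameter sets the minimum number of segment files required for inserts before the next compaction.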
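  • VMware Greenplum 7.2.0 re-introduces the following server configuration parameters:
    • gp_max_partition_level: caps the number of levels of a partition hierarchy that can be created with the classic syntax.
    • gp_resgroup_print_operator_memory_limits: allows printing (in EXPLAIN output) the memory limits that resource group memory management assigns to operators.
  • VMware Greenplum 7.2.0 now supports OFFSET/LIMIT pushdown for foreign tables whose data is distributed across multiple remote servers, when mpp_execute = 'all segments' is set.
  • The ADD COLUMN command for append-optimized column-oriented tables no longer needs to write default values for the full column.
    Caution: If your database contains such tables, you may not be able to downgrade from future releases to VMware Greenplum 7.1 or earlier.
  • VMware Greenplum 7.2.0 now supports pg_attribute_encoding catalog search using syscache.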
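
The append-optimized index scan parameters listed above could be tried out in a session roughly as follows. This is only a sketch: the table and index names (sales_ao, sales_ao_id_idx) are hypothetical, it assumes gp_enable_ao_indexscan can be set at session level, and whether an index scan is actually chosen depends on the cost estimates and on which optimizer is active.

```sql
-- Hypothetical append-optimized, row-oriented table, for illustration only.
CREATE TABLE sales_ao (id int, amount numeric)
USING ao_row
DISTRIBUTED BY (id);

CREATE INDEX sales_ao_id_idx ON sales_ao (id);

-- Assumption: the parameter is settable per session.
SET gp_enable_ao_indexscan = on;

-- Check the current decompression cost setting before tuning it.
SHOW gp_cpu_decompress_cost;

-- Inspect the plan to see whether an index scan is considered.
EXPLAIN SELECT * FROM sales_ao WHERE id = 42;
```

Comparing the EXPLAIN output with the parameter on and off is a simple way to see its effect on a given query.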

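New Extensions/Modules

  • VMware Greenplum 7.2.0 introduces the pg_cron module, which provides a cron-based job scheduler that runs inside the database.
  • VMware Greenplum 7.2.0 introduces the 3DCityDb module, which enables spatial data processing.
  • VMware Greenplum 7.2.0 introduces the H3 module, which provides hexagonal hierarchical geospatial indexing.
  • VMware Greenplum 7.2.0 supports run-length encoding (RLE) compression with the Zstandard algorithm (zstd) for column-oriented tables.
  • The gpstate -e command now displays an additional field called "Startup recovery remaining bytes". This field reports the number of bytes of startup WAL archive recovery remaining for a mirror segment that is undergoing recovery, before the segment is marked as "up" in the gp_segment_configuration table.
  • VMware Greenplum 7.2.0 introduces a new extension, orafce_ext, which provides Oracle-compatible SQL functions for manipulating RAW data types.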
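
As an illustration of the pg_cron module mentioned above, a job might be scheduled as sketched below. The job name and schedule are made up, and the sketch assumes that pg_cron has already been added to shared_preload_libraries and that the bundled version supports named jobs, as upstream pg_cron does from 1.3 onward.

```sql
-- Assumption: pg_cron is preloaded and the extension is created in the
-- database that pg_cron is configured to use.
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Run ANALYZE every day at 03:00 (standard five-field cron syntax).
SELECT cron.schedule('nightly-analyze', '0 3 * * *', 'ANALYZE');

-- List the registered jobs.
SELECT jobid, jobname, schedule, command FROM cron.job;

-- Remove the job again by name.
SELECT cron.unschedule('nightly-analyze');
```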

Full Original Text of the New Features

Release 7.2.0

Release Date: 2024-06-20

VMware Greenplum 7.2.0 is a minor release that includes new and changed features and resolves several issues.

New and Changed Features

Enhancements

  • GPORCA now supports a number of features previously only supported in the Postgres-based Planner:

    • Prepared statements
    • Functions containing query parameters
    • DISTINCT-qualified window aggregates
    • Full hash joins
  • With the 7.2.0 release, mirrorless Greenplum architectures no longer use the HA service to provide high availability of Greenplum primary segments in place of the FTS probe used in mirrored architectures. The HA service required that Greenplum state was controlled with root for multiple services, causing contention between normal cluster utilities such as gpstart/gpstop and the HA service. This issue and other usability issues are resolved by the new Postmaster service, which entirely replaces the original HA service. For more information about using the Postmaster service, see Installing the Greenplum High Availability Service.

  • VMware Greenplum 7.2.0 generates less WAL for COPY FROM on heap tables when executed in the same transaction as CREATE TABLE.

  • VMware Greenplum 7.2.0 enhances gpexpand performance in segment cleanup.

  • VMware Greenplum 7.2.0 improves pg_basebackup, pg_rewind, and rsync logging to stdout and retains recovery progress files in the log directory after successful recovery.

  • VMware Greenplum 7.2.0 sets simple progress tracking as the default for gpexpand and adds a --detailed-progress option for detailed progress tracking.

  • The gpcheckcat utility now includes a new test -- mix_distribution_policy -- which checks for tables created with legacy and non-legacy hash operations.

  • The gpsupport gp_log_collector tool now supports gathering logs for VMware Greenplum Disaster Recovery, via the new -with-gpdr-primary and -with-gpdr-recovery options.

  • VMware Greenplum now supports plan hints for: Scan, Row Estimation, Join Order and Join Types.

  • VMware Greenplum now supports index scans for append-optimized tables in comparison to previously only supporting in-memory bitmap scan.

  • VMware Greenplum 7.2.0 introduces two new server configuration parameters related to index scans of append-optimized tables:

    • gp_cpu_decompress_cost allows a user to fine-tune the cost of decompression during index scans of append-optimized tables.
    • gp_enable_ao_indexscan enables index scans on append-optimized tables.
  • VMware Greenplum 7.2.0 introduces a new server configuration parameter — gp_appendonly_compaction_segfile_limit. This parameter sets the minimum number of segment files required for inserts before the next compaction.

  • VMware Greenplum 7.2.0 re-introduces the following server configuration parameters:

    • gp_max_partition_level caps the number of levels of a partition hierarchy that can be created using classic syntax.
    • gp_resgroup_print_operator_memory_limits allows printing the memory limits for operators (in explain) assigned by the resource group's memory management.
  • VMware Greenplum 7.2.0 now supports OFFSET/LIMIT pushdown for foreign tables with data distributed across multiple remote servers when mpp_execute = 'all segments' is set.

  • The ADD COLUMN command for append-optimized column-oriented tables no longer needs to write default values for the full column.

    Caution

    If your database contains such tables, you may not be able to downgrade from future releases to VMware Greenplum 7.1 or earlier releases.

  • VMware Greenplum 7.2.0 now supports pg_attribute_encoding catalog search using syscache.
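
To make the plan hints item above more concrete, here is a minimal sketch. It assumes that the hints follow the pg_hint_plan comment syntax and that the module has to be loaded first. The tables (customers, orders) and the row count are hypothetical.

```sql
-- Assumption: hints are provided by the pg_hint_plan module.
LOAD 'pg_hint_plan';

/*+
    SeqScan(o)
    Leading((c o))
    HashJoin(c o)
    Rows(c o #1000)
 */
EXPLAIN
SELECT c.name, sum(o.amount)
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;
```

The four hints map to the four categories in the release notes: scan (SeqScan), join order (Leading), join type (HashJoin), and row estimation (Rows).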

New Extensions/Modules

  • VMware Greenplum 7.2.0 introduces the pg_cron module, which provides a cron-based job scheduler that runs inside the database.
  • VMware Greenplum 7.2.0 introduces the 3DCityDb module, which enables spatial data processing.
  • VMware Greenplum 7.2.0 introduces the H3 module, which provides hexagonal hierarchical geospatial indexing.
  • VMware Greenplum 7.2.0 supports run-length encoding (RLE) compression with the Zstandard algorithm, or zstd, for column-oriented tables.
  • The gpstate -e command now displays an additional field called "Startup recovery remaining bytes". This field reports the number of bytes of startup WAL archive recovery remaining for the mirror segment that is undergoing recovery before the segment is marked as "up" in the gp_segment_configuration table.
  • VMware Greenplum 7.2.0 introduces a new extension, orafce_ext, which provides Oracle Compatibility SQL functions for manipulating RAW datatypes.

Updated Libraries

  • The pgvector module has been updated to version 0.7.0. Refer to pgvector for module and upgrade information.
  • The Python version for PL/Container and PostgresML has been updated from 3.9 to 3.11.

Changes

  • The resource group parameter MEMORY_LIMIT has been renamed to MEMORY_QUOTA.
  • The log_checkpoints server configuration parameter is now set to on by default.
  • In order to use VMware Greenplum Text with VMware Greenplum v7.2.0 and higher, you must set the default Python 3 version to 3.9 or higher.
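
For the MEMORY_LIMIT to MEMORY_QUOTA rename listed above, existing resource group scripts would need to use the new attribute name. The sketch below shows what that might look like. The group name and all values are placeholders, and the other attributes are assumed to follow the Greenplum 7 CREATE RESOURCE GROUP syntax.

```sql
-- Hypothetical resource group; the values are placeholders, not recommendations.
CREATE RESOURCE GROUP rg_reporting WITH (
    CONCURRENCY = 10,
    CPU_MAX_PERCENT = 20,
    MEMORY_QUOTA = 250   -- formerly MEMORY_LIMIT
);

-- Adjust the quota later under the new attribute name.
ALTER RESOURCE GROUP rg_reporting SET MEMORY_QUOTA 500;
```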

Resolved Issues

Server

小麦苗

For study or certification questions, contact Teacher Mai on WeChat (db_bao) or QQ (646634621).
