Pivotal Greenplum 6.8 Release Notes

This document contains pertinent release information about Pivotal Greenplum Database 6.8 releases. For previous versions of the release notes for Greenplum Database, go to Pivotal Greenplum Database Documentation. For information about Greenplum Database end of life, see Pivotal Greenplum Database end of life policy.

Pivotal Greenplum 6 software is available for download from the Pivotal Greenplum page on Pivotal Network.

Pivotal Greenplum 6 is based on the open source Greenplum Database project code.

Important: Pivotal Support does not provide support for open source versions of Greenplum Database. Only Pivotal Greenplum Database is supported by Pivotal Support.

Release 6.8.1

Release Date: 2020-06-11

Pivotal Greenplum 6.8.1 is a maintenance release that contains a changed feature and resolves several issues.

Changed Feature

The Greenplum PostGIS extension package has been updated to postgis-2.5.4+pivotal.2. The release contains these changes:
  • Adds support for the PostGIS TIGER geocoder extension and the PostGIS address standardizer and address rules files extensions.
  • Removes PostGIS Raster function limitations.
  • Uses the CREATE EXTENSION and DROP EXTENSION commands to enable and disable support for the PostGIS extension and supported, optional PostGIS extensions.
    Note: The postgis_manager.sh script is deprecated and will be removed in a future release of Greenplum PostGIS. To enable or disable PostGIS support, use the CREATE EXTENSION or DROP EXTENSION command. See Enabling and Removing PostGIS Support.

Resolved Issues

Pivotal Greenplum 6.8.1 resolves these issues:

30664 - Query Optimizer
For a complex CTAS query that has implicit casts in the project list, GPORCA may generate a plan with duplicate eliminating motions, to ensure correctness. However, if a duplicate eliminating motion is performed under the hash operation of a Hash Join, an implicit cast operation creates an additional column that causes memtuple binding issues in the executor. To address this problem, GPORCA now generates a modified plan that prunes the output of any duplicate eliminating motions before sending the output to the hash operation.
30684 - Query Optimizer
GPORCA returned incorrect results for some queries when the query's select list contains a window function and the window function contains a correlated subquery or an outer reference. Now the query falls back to the Postgres planner.
30615 - Query Optimizer
GPORCA query performance degraded when compared with Greenplum 5 for some queries that perform joins using an equality predicate and the equality predicate contains a function, for example coalesce(tbl1.a, '999999') = coalesce(tbl2.a, '999999'). The performance issue was caused by inaccurate cardinality estimates. GPORCA cardinality estimation has been improved for the specified type of query.
172732495, 9953 - query execution
Greenplum Database generated a PANIC when executing a query that executes multiple user-defined functions and more than one of the functions is defined with the EXECUTE ON INITPLAN attribute. This issue is resolved.
172098556 - psql
Resolved a problem where the psql client \dm command did not display materialized views.
172094194, 9837 - gprecoverseg
In some cases when recovering segment instances using the gprecoverseg utility with the -i <recover_config_file> option to specify details about failed segments to recover, the utility changed some segment instance dbid values in the Greenplum system configuration. This issue is resolved.

Release 6.8.0

Release Date: 2020-06-05

Pivotal Greenplum 6.8.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.8.0 includes these new and changed features:

  • Greenplum Streaming Server (GPSS) version 1.3.6 is included, which introduces many new and changed features and bug fixes since the last GPSS version installed in Greenplum 6.x (1.3.1). Refer to the GPSS Release Notes for more information on release content and to access the GPSS documentation.
    Note: If you have previously used GPSS in your Greenplum 6.x installation, you are required to perform upgrade actions as described in Upgrading the Streaming Server.
  • The gpinitsystem input configuration file specified with the -I option supports an additional format to specify hosts. The QD_PRIMARY_ARRAY, PRIMARY_ARRAY, and MIRROR_ARRAY host parameters may now be specified using either of the following formats:
    host~port~data_directory/seg_prefix<segment_id>~dbid~content_id
    hostname~address~port~data_directory/seg_prefix<segment_id>~dbid~content_id

    The first format, which is the pre-existing format, sets both the hostname and address columns of the gp_segment_configuration catalog table to the value in the host field. The second format sets the hostname and address columns of the gp_segment_configuration catalog table to the values in the respective hostname and address fields. See

  • PXF version 5.12.0 is included, which introduces new and changed features and bug fixes. See PXF Version 5.12.0 below.
  • PL/Container version 2.1.2 is included, which introduces the following new features:
    • Support for R version 3.6.3.
    • A new --use_local_copy option to the plcontainer add-image command that you can use to install the specified image only on the local host.
  • Greenplum Database 6.8 adds support for Moving a Query to a Different Resource Group.
  • Greenplum Database 6.8 includes a new metrics collector extension that is compatible with Greenplum Command Center 6.2 and above. If you are using Command Center 6.0 or 6.1 you must download and install Command Center 6.2 after you install Greenplum Database 6.8.
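The two gpinitsystem host entry formats described above differ only in whether the hostname and address are given as one field or two. The following Python sketch illustrates how an entry in either format maps to the hostname and address columns of gp_segment_configuration. It is not the gpinitsystem implementation; the names SegmentEntry and parse_segment_entry and the sample host values are hypothetical and used for illustration only.

```python
from typing import NamedTuple


class SegmentEntry(NamedTuple):
    """Fields of one QD_PRIMARY_ARRAY, PRIMARY_ARRAY, or MIRROR_ARRAY entry."""
    hostname: str
    address: str
    port: int
    data_directory: str
    dbid: int
    content_id: int


def parse_segment_entry(entry: str) -> SegmentEntry:
    """Parse one '~'-delimited entry in either supported format (illustrative)."""
    fields = entry.split("~")
    if len(fields) == 5:
        # Pre-existing format: the single host field fills both columns.
        host, port, datadir, dbid, content = fields
        hostname = address = host
    elif len(fields) == 6:
        # New format: hostname and address are specified separately.
        hostname, address, port, datadir, dbid, content = fields
    else:
        raise ValueError(f"expected 5 or 6 '~'-separated fields, got {len(fields)}")
    return SegmentEntry(hostname, address, int(port), datadir, int(dbid), int(content))


# Hypothetical sample entries, one per format.
old_style = parse_segment_entry("mdw~5432~/data/master/gpseg-1~1~-1")
new_style = parse_segment_entry("sdw1~sdw1-1~6000~/data/primary/gpseg0~2~0")
print(old_style.hostname, old_style.address)  # mdw mdw
print(new_style.hostname, new_style.address)  # sdw1 sdw1-1
```

With the first format, hostname and address always match. The second format is useful when the interconnect address differs from the host name.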

PXF Version 5.12.0

PXF includes the following new and changed features:

  • PXF trims right-padded white space added by Greenplum before it writes Parquet data.
  • PXF bundles newer hive, jackson-databind, and supporting internal libraries.
  • A PXF server running on Java 11 can now read from Hive using an external table that specifies a Hive* profile.
  • PXF introduces the new custom option IGNORE_MISSING_PATH for external tables that you use to read file-based data. Setting this option may be useful when a PXF external table is a child partition of a partitioned Greenplum table. Refer to About PXF External Table Child Partitions for more information.
  • PXF bundles the jodd-core library to satisfy a missing transitive dependency that is required when PXF reads Parquet files that contain data in timestamp format.
  • PXF adds column projection support for the Hive and HiveRC profiles by changing the implementation to use column name-based, rather than column index-based, mapping.
    Note: If you have existing PXF external tables that specify a Hive* profile, you may be required to perform upgrade actions as described in Upgrading PXF.
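The change from column index-based to column name-based mapping matters most when the external table defines fewer columns than the Hive table. The following Python sketch illustrates the general difference between the two approaches. It is a conceptual illustration only, not PXF code; the function names, column names, and row values are hypothetical.

```python
def project_by_index(row, num_columns):
    """Index-based mapping: take the first num_columns values in source order."""
    return list(row[:num_columns])


def project_by_name(source_columns, row, wanted_columns):
    """Name-based mapping: look up each requested column by its name."""
    by_name = dict(zip(source_columns, row))
    return [by_name[name] for name in wanted_columns]


# Hypothetical Hive table with three columns; the external table defines two.
hive_columns = ["id", "region", "amount"]
hive_row = [7, "emea", 19.5]
external_columns = ["id", "amount"]

print(project_by_index(hive_row, len(external_columns)))          # [7, 'emea']
print(project_by_name(hive_columns, hive_row, external_columns))  # [7, 19.5]
```

In this sketch the index-based version returns the region value where the amount was expected, while the name-based version returns only the requested columns.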

Resolved Issues

Pivotal Greenplum 6.8.0 resolves these issues:

329, 30602 - PXF
PXF did not correctly read a partitioned Hive table when the external table specified a Hive* profile and the external table and Hive table had a differing number of columns. This issue is resolved. PXF now supports column projection for the Hive* profiles and correctly handles this situation.
30611 - Query Optimizer
When falling back to the Postgres planner, GPORCA incorrectly logged messages that were internal messages. This made the log file difficult to read and caused bloat in the file. This issue is resolved. GPORCA message logging has been improved and the internal messages are no longer sent to the log files.
30585 - Locking
Resolved a problem that could corrupt resource queue locks, and potentially other types of locks, in shared memory. This problem could cause errors such as lock lock_name on object object_identifier is already held.
30557 - DDL
When performing a data reorganization with the ALTER TABLE command on a leaf partition of a partitioned table that did not change the distribution policy, Greenplum Database returned the error ERROR: can't set the distribution policy. This type of redistribution is allowed in Greenplum 5. Now Greenplum allows data reorganization on a leaf partition if the distribution policy is not changed.
30289 - Query Optimizer
When GPORCA performed dynamic partition elimination for some queries against partitioned tables that perform joins, GPORCA was not using the correct statistics. This caused a performance degradation when compared with Greenplum 5. GPORCA has improved how statistics are computed for the specified type of query.
172854840 - Interconnect
In some cases, a query that executes a stable function that contains an SQL statement might hang because the query dispatcher (QD) did not correctly manage the execution of the function and the dispatching of the query plan. This issue is resolved.
172832212 - Interconnect
In some cases, communication between a query dispatcher (QD) and a query executor (QE) on different segments was slow when the Greenplum interconnect type was set to TCP. Now the communication between a QD and a QE is more efficient.
172615233 - Query Optimizer
For text data types, the GPORCA cardinality estimation algorithm has been improved for equality comparisons, for example, when a query contains an IN clause that contains text elements.
172576000 - COPY
If data format errors occurred while copying data into a partitioned table with a COPY FROM command in single row error isolation mode, Greenplum Database might crash when a query executor (QE) did not handle the data format error correctly. This issue is resolved.
30487 - Utility Commands
On a Greenplum Database 6 system with FIPS enabled, Greenplum utility commands such as gpinitsystem returned the error "ERROR:root:code for hash md5 was not found." This issue is resolved.
30484 - Utility Commands
When initializing a Greenplum Database system with gpinitsystem, the primary segments were erroneously named using DNS resolvable external hostnames instead of the internal interconnect interface hostnames. At the same time, the segment mirrors were correctly named. This issue is now resolved.

Upgrading from Greenplum 6.x to Greenplum 6.8

Note: Greenplum 6 does not support direct upgrades from Greenplum 4 or Greenplum 5 releases, or from earlier Greenplum 6 Beta releases.

See Upgrading from an Earlier Greenplum 6 Release to upgrade your existing Greenplum 6.x software to Greenplum 6.8.0.

Deprecated Features

Deprecated features will be removed in a future major release of Greenplum Database. Pivotal Greenplum 6.x deprecates:

  • The gpsys1 utility.
  • The analyzedb option --skip_root_stats (deprecated since 6.2).

    If the option is specified, a warning is issued stating that the option will be ignored.

  • The server configuration parameter gp_statistics_use_fkeys (deprecated since 6.2).
  • The following PXF configuration properties (deprecated since 6.2):
    • The PXF_USER_IMPERSONATION, PXF_PRINCIPAL, and PXF_KEYTAB settings in the pxf-env.sh file. You can use the pxf-site.xml file to configure Kerberos and impersonation settings for your new Hadoop server configurations.
    • The pxf.impersonation.jdbc property setting in the jdbc-site.xml file. You can use the pxf.service.user.impersonation property to configure user impersonation for a new JDBC server configuration.
  • The server configuration parameter gp_ignore_error_table (deprecated since 6.0).

    To avoid a Greenplum Database syntax error, set the value of this parameter to true when you run applications that execute CREATE EXTERNAL TABLE or COPY commands that include the now removed Greenplum Database 4.3.x INTO ERROR TABLE clause.

  • Specifying => as an operator name in the CREATE OPERATOR command (deprecated since 6.0).
  • The Greenplum external table C API (deprecated since 6.0).

    Any developers using this API are encouraged to use the new Foreign Data Wrapper API in its place.

  • Commas placed between a SUBPARTITION TEMPLATE clause and its corresponding SUBPARTITION BY clause, and between consecutive SUBPARTITION BY clauses in a CREATE TABLE command (deprecated since 6.0).

    Using this undocumented syntax will generate a deprecation warning message.

  • The timestamp format YYYYMMDDHH24MISS (deprecated since 6.0).

    This format could not be parsed unambiguously in previous Greenplum Database releases, and is not supported in PostgreSQL 9.4.

  • The createlang and droplang utilities (deprecated since 6.0).
  • The pg_resqueue_status system view (deprecated since 6.0).

    Use the gp_toolkit.gp_resqueue_status view instead.

  • The GLOBAL and LOCAL modifiers when creating a temporary table with the CREATE TABLE and CREATE TABLE AS commands (deprecated since 6.0).

    These keywords are present for SQL standard compatibility, but have no effect in Greenplum Database.

  • The Greenplum Platform Extension Framework (PXF) HDFS profile names for the Text, Avro, JSON, Parquet, and SequenceFile data formats (deprecated since 5.16).

    Refer to Connectors, Data Formats, and Profiles in the PXF Hadoop documentation for more information.

  • Using WITH OIDS or oids=TRUE to assign an OID system column when creating or altering a table (deprecated since 6.0).
  • Allowing superusers to specify the SQL_ASCII encoding regardless of the locale settings (deprecated since 6.0).

    This choice may result in misbehavior of character-string functions when data that is not encoding-compatible with the locale is stored in the database.

  • The @@@ text search operator (deprecated since 6.0).

    This operator is currently a synonym for the @@ operator.

  • The unparenthesized syntax for option lists in the VACUUM command (deprecated since 6.0).

    This syntax requires that the options to the command be specified in a specific order.

  • The plain pgbouncer authentication type (auth_type = plain) (deprecated since 4.x).
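
Applications that still produce timestamps in the deprecated YYYYMMDDHH24MISS format listed above can rewrite the values into a delimited form before loading them. The following Python sketch shows one possible conversion. The function name compact_to_iso and the sample value are hypothetical, and the target layout (YYYY-MM-DD HH:MM:SS) is an assumption about what the application needs.

```python
from datetime import datetime


def compact_to_iso(value: str) -> str:
    """Convert a 14-digit YYYYMMDDHH24MISS string to 'YYYY-MM-DD HH:MM:SS'."""
    # Require exactly 14 digits so that short or padded inputs are rejected
    # rather than being parsed ambiguously.
    if len(value) != 14 or not value.isdigit():
        raise ValueError(f"expected 14 digits, got {value!r}")
    parsed = datetime.strptime(value, "%Y%m%d%H%M%S")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(compact_to_iso("20200611134500"))  # 2020-06-11 13:45:00
```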

Migrating Data to Greenplum 6

Note: Greenplum 6 does not support direct upgrades from Greenplum 4 or Greenplum 5 releases, or from earlier Greenplum 6 Beta releases.

See Migrating Data from Greenplum 4.3 or 5 for guidelines and considerations for migrating existing Greenplum data to Greenplum 6, using standard backup and restore procedures.

Known Issues and Limitations

Pivotal Greenplum 6 has these limitations:

  • Upgrading a Greenplum Database 4 or 5 release, or Greenplum 6 Beta release, to Pivotal Greenplum 6 is not supported.
  • MADlib, GPText, and PostGIS are not yet provided for installation on Ubuntu systems.
  • Greenplum 6 is not supported for installation on DCA systems.
  • Greenplum for Kubernetes is not yet provided with this release.

The following table lists key known issues in Pivotal Greenplum 6.x.

Table 1. Key Known Issues in Pivotal Greenplum 6.x
Issue - Category
Description

10216 - ALTER TABLE, ALTER DOMAIN
In some cases, heap table data is lost when performing concurrent ALTER TABLE or ALTER DOMAIN commands where one command alters a table column and the other rewrites or redistributes the table data. For example, performing concurrent ALTER TABLE commands where one command changes a column data type from int to text might cause data loss. This issue might also occur when altering a table column during the data distribution phase of a Greenplum system expansion. Greenplum Database did not correctly capture the current state of the table during command execution.
This issue is resolved in Pivotal Greenplum 6.9.0.

N/A - Spark Connector
This version of Greenplum is not compatible with Greenplum-Spark Connector versions earlier than version 1.7.0, due to a change in how Greenplum handles distributed transaction IDs.

N/A - PXF
Starting in 6.x, Greenplum does not bundle cURL and instead loads the system-provided library. PXF requires cURL version 7.29.0 or newer. The officially-supported cURL for the CentOS 6.x and Red Hat Enterprise Linux 6.x operating systems is version 7.19.*. Greenplum Database 6 does not support running PXF on CentOS 6.x or RHEL 6.x due to this limitation.
Workaround: Upgrade the operating system of your Greenplum Database 6 hosts to CentOS 7+ or RHEL 7+, which provides a cURL version suitable to run PXF.

29703 - Loading Data from External Tables
Due to limitations in the Greenplum Database external table framework, Greenplum Database cannot log the following types of errors that it encounters while loading data:
  • data type parsing errors
  • unexpected value type errors
  • data type conversion errors
  • errors returned by native and user-defined functions
LOG ERRORS returns error information for data exceptions only. When it encounters a parsing error, Greenplum terminates the load job, but it cannot log and propagate the error back to the user via gp_read_error_log().
Workaround: Clean the input data before loading it into Greenplum Database.

30594 - Resource Management
Resource queue-related statistics may be inaccurate in certain cases. Pivotal recommends that you use the resource group resource management scheme that is available in Greenplum 6.

30522 - Logging
Greenplum Database may write a FATAL message to the standby master or mirror log stating that the database system is in recovery mode when the instance is synchronizing with the master and Greenplum attempts to contact it before the operation completes. Ignore these messages and use gpstate -f output to determine if the standby successfully synchronized with the Greenplum master; the command returns Sync state: sync if it is synchronized.

30537 - Postgres Planner
The Postgres Planner generates a very large query plan that causes out of memory issues for the following type of CTE (common table expression) query: the WITH clause of the CTE contains a partitioned table with a large number of partitions, and the WITH reference is used in a subquery that joins another partitioned table.
Workaround: If possible, use the GPORCA query optimizer. With the server configuration parameter optimizer=on, Greenplum Database attempts to use GPORCA for query planning and optimization when possible and falls back to the Postgres Planner when GPORCA cannot be used. Also, the specified type of query might require a long time to complete.

170824967 - gpfdists
For Greenplum Database 6.x, a command that accesses an external table that uses the gpfdists protocol fails if the external table does not use an IP address when specifying a host system in the LOCATION clause of the external table definition.

N/A - Materialized Views
By default, certain gp_toolkit views do not display data for materialized views. If you want to include this information in gp_toolkit view output, you must redefine a gp_toolkit internal view as described in Including Data for Materialized Views.

168957894 - PXF
The PXF Hive Connector does not support using the Hive* profiles to access Hive transactional tables.
Workaround: Use the PXF JDBC Connector to access Hive.

168548176 - gpbackup
When using gpbackup to back up a Greenplum Database 5.7.1 or earlier 5.x release with resource groups enabled, gpbackup returns a column not found error for t6.value AS memoryauditor.

164791118 - PL/R
PL/R cannot be installed using the deprecated createlang utility, and displays the error:
createlang: language installation failed: ERROR:
no schema has been selected to create in
Workaround: Use CREATE EXTENSION to install PL/R, as described in the documentation.

N/A - Greenplum Client/Load Tools on Windows
The Greenplum Database client and load tools on Windows have not been tested with Active Directory Kerberos authentication.
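
The PXF known issue in the table above depends on the system cURL version: PXF requires 7.29.0 or newer, and CentOS 6.x and RHEL 6.x provide 7.19.*. The following Python sketch shows one way a host's reported version string could be compared against that minimum. The function names and sample version strings are hypothetical, and the sketch assumes the version appears as three dot-separated numbers in the text it is given.

```python
import re

# Minimum cURL version that PXF requires, per the known issue above.
MIN_CURL = (7, 29, 0)


def parse_version(text: str) -> tuple:
    """Extract the first major.minor.patch version number found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def curl_supports_pxf(version_text: str) -> bool:
    """Return True if the reported cURL version meets the PXF minimum."""
    return parse_version(version_text) >= MIN_CURL


# Illustrative version strings, not captured from a real host.
print(curl_supports_pxf("curl 7.19.7"))  # False
print(curl_supports_pxf("curl 7.29.0"))  # True
```

Comparing integer tuples avoids the string-comparison pitfall where "7.9.0" would sort after "7.29.0".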

Differences Compared to Open Source Greenplum Database

Pivotal Greenplum 6.x includes all of the functionality in the open source Greenplum Database project and adds:
  • Product packaging and installation script
  • Support for QuickLZ compression. QuickLZ compression is not provided in the open source version of Greenplum Database due to licensing restrictions.
  • Support for data connectors:
    • Greenplum-Spark Connector
    • Greenplum-Informatica Connector
    • Greenplum-Kafka Integration
    • Greenplum Streaming Server
  • Data Direct ODBC/JDBC Drivers
  • gpcopy utility for copying or migrating objects between Greenplum systems
  • Support for managing Greenplum Database using Pivotal Greenplum Command Center
  • Support for full text search and text analysis using Pivotal GPText
  • Greenplum backup plugin for DD Boost
  • Backup/restore storage plugin API (Beta)