Pivotal Greenplum 6.2 Release Notes

This document contains pertinent release information about Pivotal Greenplum Database 6.2 releases. For previous versions of the release notes for Greenplum Database, go to Pivotal Greenplum Database Documentation. For information about Greenplum Database end of life, see Pivotal Greenplum Database end of life policy.

Pivotal Greenplum 6 software is available for download from the Pivotal Greenplum page on Pivotal Network.

Pivotal Greenplum 6 is based on the open source Greenplum Database project code.

Important: Pivotal Support does not provide support for open source versions of Greenplum Database. Only Pivotal Greenplum Database is supported by Pivotal Support.

Release 6.2.1

Release Date: 2019-12-12

Pivotal Greenplum 6.2.1 is a minor release that includes new features and resolves several issues.

New Features

Greenplum Database 6.2.1 includes these new features:

  • Greenplum Database supports materialized views. Materialized views are similar to views. A materialized view enables you to save a frequently used or complex query, then access the query results in a SELECT statement as if they were a table. Materialized views persist the query results in a table-like form. Materialized view data cannot be directly updated. To refresh the materialized view data, use the REFRESH MATERIALIZED VIEW command. See Creating and Managing Materialized Views.
    Note: Known Issues and Limitations describes a limitation of materialized view support in Greenplum 6.2.1.
  • The gpinitsystem utility supports the --ignore-warnings option. The option controls the value returned by gpinitsystem when warnings or errors occur. If you specify this option, gpinitsystem returns 0 if warnings occurred during system initialization, and returns a non-zero value if a fatal error occurs. If this option is not specified, gpinitsystem returns 1 if initialization completes with warnings, and returns a value of 2 or greater if a fatal error occurs.
  • PXF version 5.10.0 is included, which introduces several new and changed features and bug fixes. See PXF Version 5.10.0 below.
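
The gpinitsystem return-value rules described above can be folded into a wrapper script. The following Python sketch only interprets a return value according to those rules; the function name and the outcome labels are illustrative and are not part of Greenplum Database. It also assumes that, without --ignore-warnings, a return value of 0 means initialization completed with no warnings.

```python
def classify_gpinitsystem_exit(code: int, ignore_warnings: bool) -> str:
    """Map a gpinitsystem return value to a coarse outcome label."""
    if code < 0:
        raise ValueError("return value must be non-negative")
    if ignore_warnings:
        # With --ignore-warnings, warnings are reported as 0, so any
        # non-zero value indicates a fatal error.
        return "ok-or-warnings" if code == 0 else "fatal"
    if code == 0:
        return "ok"  # assumption: 0 means a clean initialization
    if code == 1:
        return "warnings"
    return "fatal"  # 2 or greater


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, classify_gpinitsystem_exit(code, ignore_warnings=False))
```

A caller that treats warnings as acceptable can either pass --ignore-warnings and test for 0, or omit the option and accept both 0 and 1.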

PXF Version 5.10.0

PXF 5.10.0 includes the following new and changed features:

  • PXF has improved its performance when reading a large number of files from HDFS or an object store.
  • PXF bundles newer Tomcat and Jackson libraries.
  • The PXF JDBC Connector now supports pushdown of OR and NOT logical filter operators when specified in a JDBC named query or in an external table query filter condition.
  • PXF supports writing Avro-format data to Hadoop and object stores. Refer to Reading and Writing HDFS Avro Data for more information about this feature.
  • PXF is now certified with Hadoop 2.x and 3.1.x and Hive Server 2.x and 3.1, and bundles new and upgraded Hadoop libraries to support these versions.
  • PXF supports Kerberos authentication to Hive Server 2.x and 3.1.x.
  • PXF supports per-server user impersonation configuration.
  • PXF supports concurrent access to multiple Kerberized Hadoop clusters. In previous releases of Greenplum Database, PXF supported accessing a single Hadoop cluster secured with Kerberos, and this Hadoop cluster must have been configured as the default PXF server.
  • PXF introduces a new template file, pxf-site.xml, to specify the Kerberos and impersonation property settings for a Hadoop or JDBC server configuration. Refer to About Kerberos and User Impersonation Configuration (pxf-site.xml) for more information about this file.
  • PXF now supports connecting to Hadoop with a configurable Hadoop user identity. PXF previously supported only proxy access to Hadoop via the gpadmin Greenplum user.
  • PXF version 5.10.0 deprecates the following configuration properties.
    Note: These property settings continue to work.
    • The PXF_USER_IMPERSONATION, PXF_PRINCIPAL, and PXF_KEYTAB settings in the pxf-env.sh file. You can use the pxf-site.xml file to configure Kerberos and impersonation settings for your new Hadoop server configurations.
    • The pxf.impersonation.jdbc property setting in the jdbc-site.xml file. You can use the pxf.service.user.impersonation property to configure user impersonation for a new JDBC server configuration.
Note: If you have previously configured a PXF JDBC server to access Kerberos-secured Hive, you must upgrade the server definition. See Upgrading PXF in Greenplum 6.x for more information.
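
Because the deprecated PXF settings continue to work, they are easy to overlook during an upgrade. The following Python sketch reports which of the deprecated settings named above appear in the text of a configuration file. The function name is hypothetical, the file is identified by its base name only, and commented-out lines in pxf-env.sh are assumed to start with "#".

```python
import re
from typing import List

# Deprecated settings named in these release notes, keyed by file name.
DEPRECATED = {
    "pxf-env.sh": ["PXF_USER_IMPERSONATION", "PXF_PRINCIPAL", "PXF_KEYTAB"],
    "jdbc-site.xml": ["pxf.impersonation.jdbc"],
}


def find_deprecated(filename: str, text: str) -> List[str]:
    """Return the deprecated setting names that occur in the file text."""
    names = DEPRECATED.get(filename, [])
    if filename.endswith(".sh"):
        # Ignore commented-out shell lines.
        kept = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
        text = "\n".join(kept)
    found = []
    for name in names:
        # Require a boundary on both sides so that a longer identifier
        # containing the name is not reported.
        pattern = r"(?<![\w.])" + re.escape(name) + r"(?![\w.])"
        if re.search(pattern, text):
            found.append(name)
    return found


if __name__ == "__main__":
    sample = "# export PXF_PRINCIPAL=x\nexport PXF_KEYTAB=/path/to/keytab\n"
    print(find_deprecated("pxf-env.sh", sample))  # ['PXF_KEYTAB']
```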

Changed Features

Greenplum Database 6.2.1 includes these changed features:

  • Greenplum Stream Server version 1.3.1 is included in the Greenplum distribution.

Resolved Issues

Pivotal Greenplum 6.2.1 is a minor release that resolves these issues:

29454 - gpstart
During Greenplum Database start up, the gpstart utility did not report when a segment instance failed to start. The utility always displayed 0 skipped segment starts. This issue has been resolved. gpstart output was also enhanced to provide additional warnings and summary information about the number of skipped segments. For example:
[WARNING]:-There are 1 segment(s) marked down in the database
[WARNING]:-To recover from this current state, review usage of the gprecoverseg
[WARNING]:-management utility which will recover failed segment instance databases.
30248, 9022 - DDL
Greenplum Database might generate a PANIC when an index is created on a column of an append-optimized, column-oriented table if the index definition contains a WHERE clause that references multiple columns. This issue has been resolved.
7545 - Postgres Planner
The Postgres Planner might return incorrect results for queries that contain a subquery in an EXISTS clause if the subquery includes a LIMIT [ 0 | ALL | NULL ] clause or an OFFSET NULL clause. This issue has been resolved.
8590 - Postgres Planner
A query that used the Postgres planner could return incorrect results if it specified a volatile function in a LIMIT clause (for example, LIMIT (random() * 10)). This occurred because Greenplum evaluated the LIMIT clause separately on each segment instance to obtain a preliminary limit, before evaluating it once again as the query was dispatched. The problem was fixed by ensuring that volatile functions in a LIMIT clause are not pushed to segment instances for evaluation.
30083 - Postgres Planner
Fixed a problem in the Postgres planner that could result in the error "variable not found in subplan target list". The issue applied to join queries where a table column had a user-specified CAST applied to it and appeared both in the select list and in a join condition, and where the column was also part of a motion operator in the query plan.
30200 - Metrics Collector
Greenplum Database 6 stores tablespaces with non-default names as symlinks in the $MASTER_DATA_DIRECTORY/pg_tblspc directory, and the metrics collector did not detect these tablespaces. The metrics collector now follows the symlinks to find the names of the tablespace directories and the data directories located in those tablespaces. After you enable the new metrics collector, the tablespaces might not be visible in Greenplum Command Center for up to four hours.
30203 - Query Optimizer
When updating a table's distribution column, Greenplum Database returned an error stating that an UPDATE statement cannot update distribution columns if a btree index was defined on the distribution column and the UPDATE command contained an IN clause. The error was returned when Greenplum Database fell back to the Postgres planner to attempt the UPDATE operation. This issue has been resolved. GPORCA now supports this type of update.
30206 - gpinitsystem
An example in the gpinitsystem help output used an invalid option for specifying the placement of mirror segment instances in a spread configuration. The correct option is --mirror_mode=spread. This issue has been resolved.
30227 - Server
Greenplum Database with resource groups enabled might generate a PANIC when using an extension with improper debug_query_string settings. The cause was a message context issue and it has been resolved.
30256 - analyzedb
When executing some queries against partitioned tables, GPORCA would fail because of missing root partition statistics. This was caused by the analyzedb utility not updating the root partition statistics when generating the partitioned table statistics. This issue has now been resolved.
30292 - External Table
When Greenplum Database attempted to access data from an external table, a PANIC was generated when Greenplum Database could not resolve the host name that is specified in the external table definition. This issue has been resolved. Now Greenplum Database returns an error in the specified situation.
168881383 - PXF
A regression in file and directory name pattern matching affected the *:text:multi profiles and S3 Select. This issue has been resolved. PXF now correctly handles wildcards specified in the LOCATION data path.
8918 - Postgres Planner
The Postgres Planner generated an incorrect result on a JOIN query when different data types were used in a table column or the query constraints included a constant, and the query required motion. This issue is resolved.
169694492 - Query Optimizer
For a table that has a column with a btree index defined on it, GPORCA fell back to the Postgres planner for queries that use an IN clause against the column, or an OR of simple comparisons on the column such as col = 5 OR col = 7. GPORCA now attempts to generate a query plan that uses the index.
169806983 - Greenplum Stream Server
In some cases, reading from Kafka using the default MINIMAL_INTERVAL (0 seconds) caused GPSS to consume a large amount of CPU resources, even when no new messages existed in the Kafka topic. This issue is resolved in GPSS 1.3.1.
169807372, 169831558 - Greenplum Stream Server
GPSS 1.3.0 did not recognize internal history tables that were created with GPSS 1.2.6 and earlier. In some cases, this caused GPSS to load duplicate messages into Greenplum Database. This issue is resolved in GPSS 1.3.1.
170041280 - PXF
PXF was unable to read data from an encrypted HDFS zone and returned an org.apache.hadoop.crypto.CryptoInputStream cannot be cast to org.apache.hadoop.hdfs.DFSInputStream error in this situation. This issue is resolved.
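
The volatile-function LIMIT issue (8590) listed above is easier to see with a toy model. The Python sketch below is not Greenplum's executor; it only mimics the described behavior, where each segment instance applied its own preliminary limit before a final limit was applied. Fixed numbers stand in for the separate evaluations of a volatile expression such as random() * 10, and both function names are illustrative.

```python
from typing import List, Sequence


def limit_evaluated_per_segment(
    segments: Sequence[List[int]],
    segment_limits: Sequence[int],
    final_limit: int,
) -> List[int]:
    """Old behavior (modeled): every segment uses its own limit value."""
    gathered: List[int] = []
    for rows, seg_limit in zip(segments, segment_limits):
        gathered.extend(rows[:seg_limit])
    return gathered[:final_limit]


def limit_not_pushed_down(segments: Sequence[List[int]], limit: int) -> List[int]:
    """Fixed behavior (modeled): rows are gathered, then limited once."""
    gathered = [row for rows in segments for row in rows]
    return gathered[:limit]


if __name__ == "__main__":
    segments = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
    # Suppose the volatile expression yielded 1 and 2 on the segments
    # but 6 when it was evaluated again for the final limit.
    print(limit_evaluated_per_segment(segments, [1, 2], 6))  # [1, 6, 7]
    print(limit_not_pushed_down(segments, 6))                # [1, 2, 3, 4, 5, 6]
```

In the first call the query asks for six rows and ten are available, yet only three are returned because the smaller per-segment values discarded rows too early.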

Upgrading to Greenplum 6.2.1

Note: Greenplum 6 supports direct upgrades, using gpupgrade, from Greenplum 5.x releases to Greenplum 6.x. For more information, see the gpupgrade documentation. See also Migrating Data from Greenplum 4.3 or 5 for guidelines and considerations for migrating existing Greenplum data to Greenplum 6, using standard backup and restore procedures.

See Upgrading from an Earlier Greenplum 6 Release to upgrade your existing Greenplum 6.x software to Greenplum 6.2.1.

Deprecated Features

Deprecated features will be removed in a future major release of Greenplum Database. Pivotal Greenplum 6.x deprecates:

  • The analyzedb option --skip_root_stats (deprecated since 6.2).

    If the option is specified, a warning is issued stating that the option will be ignored.

  • The server configuration parameter gp_statistics_use_fkeys (deprecated since 6.2).
  • The following PXF configuration properties (deprecated since 6.2):
    • The PXF_USER_IMPERSONATION, PXF_PRINCIPAL, and PXF_KEYTAB settings in the pxf-env.sh file. You can use the pxf-site.xml file to configure Kerberos and impersonation settings for your new Hadoop server configurations.
    • The pxf.impersonation.jdbc property setting in the jdbc-site.xml file. You can use the pxf.service.user.impersonation property to configure user impersonation for a new JDBC server configuration.
  • The server configuration parameter gp_ignore_error_table (deprecated since 6.0).

    To avoid a Greenplum Database syntax error, set the value of this parameter to true when you run applications that execute CREATE EXTERNAL TABLE or COPY commands that include the now removed Greenplum Database 4.3.x INTO ERROR TABLE clause.

  • Specifying => as an operator name in the CREATE OPERATOR command (deprecated since 6.0).
  • The Greenplum external table C API (deprecated since 6.0).

    Any developers using this API are encouraged to use the new Foreign Data Wrapper API in its place.

  • Commas placed between a SUBPARTITION TEMPLATE clause and its corresponding SUBPARTITION BY clause, and between consecutive SUBPARTITION BY clauses in a CREATE TABLE command (deprecated since 6.0).

    Using this undocumented syntax will generate a deprecation warning message.

  • The timestamp format YYYYMMDDHH24MISS (deprecated since 6.0).

    This format could not be parsed unambiguously in previous Greenplum Database releases, and is not supported in PostgreSQL 9.4.

  • The createlang and droplang utilities (deprecated since 6.0).
  • The pg_resqueue_status system view (deprecated since 6.0).

    Use the gp_toolkit.gp_resqueue_status view instead.

  • The GLOBAL and LOCAL modifiers when creating a temporary table with the CREATE TABLE and CREATE TABLE AS commands (deprecated since 6.0).

    These keywords are present for SQL standard compatibility, but have no effect in Greenplum Database.

  • The Greenplum Platform Extension Framework (PXF) HDFS profile names for the Text, Avro, JSON, Parquet, and SequenceFile data formats (deprecated since 5.16).

    Refer to Connectors, Data Formats, and Profiles in the PXF Hadoop documentation for more information.

  • Using WITH OIDS or oids=TRUE to assign an OID system column when creating or altering a table (deprecated since 6.0).
  • Allowing superusers to specify the SQL_ASCII encoding regardless of the locale settings (deprecated since 6.0).

    This choice may result in misbehavior of character-string functions when data that is not encoding-compatible with the locale is stored in the database.

  • The @@@ text search operator (deprecated since 6.0).

    This operator is currently a synonym for the @@ operator.

  • The unparenthesized syntax for option lists in the VACUUM command (deprecated since 6.0).

    This syntax requires that the options to the command be specified in a specific order.

  • The plain pgbouncer authentication type (auth_type = plain) (deprecated since 4.x).
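
One way to prepare for the removal of the YYYYMMDDHH24MISS timestamp format listed above is to rewrite such values into an unambiguous form before loading them. The following Python sketch assumes each value is exactly 14 digits; the function name is illustrative, and the chosen output format (YYYY-MM-DD HH24:MI:SS) is one reasonable option rather than a requirement stated in these notes.

```python
from datetime import datetime


def to_iso_timestamp(compact: str) -> str:
    """Convert a YYYYMMDDHH24MISS string to 'YYYY-MM-DD HH:MM:SS'."""
    if len(compact) != 14 or not compact.isdigit():
        raise ValueError(f"expected 14 digits, got {compact!r}")
    # strptime also rejects impossible values such as month 13.
    parsed = datetime.strptime(compact, "%Y%m%d%H%M%S")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(to_iso_timestamp("20191212093000"))  # 2019-12-12 09:30:00
```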

Migrating Data to Greenplum 6

Note: Greenplum 6 supports direct upgrades, using gpupgrade, from Greenplum 5.x releases to Greenplum 6.x. For more information, see the gpupgrade documentation.

See Migrating Data from Greenplum 4.3 or 5 for guidelines and considerations for migrating existing Greenplum data to Greenplum 6, using standard backup and restore procedures.

Known Issues and Limitations

Pivotal Greenplum 6 has these limitations:

  • Upgrading a Greenplum Database 4 or 5 release, or Greenplum 6 Beta release, to Pivotal Greenplum 6 is not supported.
  • MADlib, GPText, and PostGIS are not yet provided for installation on Ubuntu systems.
  • Greenplum 6 is not supported for installation on DCA systems.
  • Greenplum for Kubernetes is not yet provided with this release.

The following table lists key known issues in Pivotal Greenplum 6.x.

Table 1. Key Known Issues in Pivotal Greenplum 6.x
Each entry gives the issue number and category, followed by a description.
10216 - ALTER TABLE, ALTER DOMAIN
In some cases, heap table data is lost when performing concurrent ALTER TABLE or ALTER DOMAIN commands where one command alters a table column and the other rewrites or redistributes the table data. For example, performing concurrent ALTER TABLE commands where one command changes a column data type from int to text might cause data loss. This issue might also occur when altering a table column during the data distribution phase of a Greenplum system expansion. Greenplum Database did not correctly capture the current state of the table during command execution.
This issue is resolved in Pivotal Greenplum 6.9.0.
N/A - PXF
Starting in 6.x, Greenplum does not bundle cURL and instead loads the system-provided library. PXF requires cURL version 7.29.0 or newer. The officially supported cURL for the CentOS 6.x and Red Hat Enterprise Linux 6.x operating systems is version 7.19.*. Greenplum Database 6 does not support running PXF on CentOS 6.x or RHEL 6.x due to this limitation.
Workaround: Upgrade the operating system of your Greenplum Database 6 hosts to CentOS 7+ or RHEL 7+, which provides a cURL version suitable to run PXF.
29703 - Loading Data from External Tables
Due to limitations in the Greenplum Database external table framework, Greenplum Database cannot log the following types of errors that it encounters while loading data:
  • data type parsing errors
  • unexpected value type errors
  • data type conversion errors
  • errors returned by native and user-defined functions
LOG ERRORS returns error information for data exceptions only. When it encounters a parsing error, Greenplum terminates the load job, but it cannot log and propagate the error back to the user via gp_read_error_log().
Workaround: Clean the input data before loading it into Greenplum Database.
170824967 - gpfdists
For Greenplum Database 6.x, a command that accesses an external table that uses the gpfdists protocol fails if the external table does not use an IP address when specifying a host system in the LOCATION clause of the external table definition.
N/A - Materialized Views
By default, certain gp_toolkit views do not display data for materialized views. If you want to include this information in gp_toolkit view output, you must redefine a gp_toolkit internal view as described in Including Data for Materialized Views.
168689202 - PXF
When running on Java 11, PXF fails to run any query that specifies a Hive* profile due to this Hive known issue: ClassCastException when initializing HiveMetaStoreClient on JDK10 or newer.
Workaround: Run PXF on Java 8 or use the PXF JDBC Connector to access Hive.
168957894 - PXF
The PXF Hive Connector does not support using the Hive* profiles to access Hive transactional tables.
Workaround: Use the PXF JDBC Connector to access Hive.
169200795 - Greenplum Stream Server
When loading Kafka data into Greenplum Database in UPDATE and MERGE modes, GPSS requires that a MAPPING exist for each column name identified in the MATCH_COLUMNS and UPDATE_COLUMNS lists.
170202002 - Greenplum-Kafka Integration
Updating the METADATA:SCHEMA property and restarting a previously-run load job could cause gpkafka to re-read Kafka messages published to the topic, and load duplicate messages into Greenplum Database.
168548176 - gpbackup
When using gpbackup to back up a Greenplum Database 5.7.1 or earlier 5.x release with resource groups enabled, gpbackup returns a column not found error for t6.value AS memoryauditor.
164791118 - PL/R
PL/R cannot be installed using the deprecated createlang utility, and displays the error:
createlang: language installation failed: ERROR:  
no schema has been selected to create in
Workaround: Use CREATE EXTENSION to install PL/R, as described in the documentation.
30437 - JDBC Driver
JDBC cached query plans do not store the partition selector parameter that is required for performing partition elimination. If you create a JDBC prepared statement that operates against a partitioned table, partition elimination is performed the first 5 times the query is executed. After that point, the JDBC driver may choose to cache the query plan, in which case partition elimination is no longer performed and the query may suffer from extremely degraded performance.
Workaround: Use the DataDirect JDBC driver version (F000450.U000214) or later, introduced in Greenplum 6.1, and set the prepareThreshold connection parameter to a very large value. For example:
jdbc:Pivotal:greenplum://<ip>:<port>;DatabaseName=<name>;prepareThreshold=1000000000
See PrepareThreshold in the DataDirect documentation.
N/A - Greenplum Client/Load Tools on Windows
The Greenplum Database client and load tools on Windows have not been tested with Active Directory Kerberos authentication.
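
Several of the known issues in Table 1 can be caught with small preflight checks. The following Python sketch covers three of them: the minimum cURL version that PXF requires, the gpfdists requirement that the LOCATION host be an IP address, and the JDBC connection URL with a large prepareThreshold. All function names are hypothetical, and the example host, port, and database name are placeholders.

```python
import ipaddress
from urllib.parse import urlsplit

MIN_PXF_CURL = (7, 29, 0)  # minimum cURL version stated for PXF


def curl_ok_for_pxf(version: str) -> bool:
    """Return True if a dotted version such as '7.29.0' meets the minimum."""
    parts = tuple(int(p) for p in version.split("."))
    parts += (0,) * (3 - len(parts))  # pad '7.61' to (7, 61, 0)
    return parts >= MIN_PXF_CURL


def location_uses_ip(location: str) -> bool:
    """Return True if the host in a gpfdists:// URL is an IP address."""
    host = urlsplit(location).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def jdbc_url(host: str, port: int, database: str,
             prepare_threshold: int = 1_000_000_000) -> str:
    """Build a connection URL in the form shown in the JDBC workaround."""
    return (f"jdbc:Pivotal:greenplum://{host}:{port};"
            f"DatabaseName={database};prepareThreshold={prepare_threshold}")


if __name__ == "__main__":
    print(curl_ok_for_pxf("7.19.7"))                          # False
    print(location_uses_ip("gpfdists://etlhost:8081/a.txt"))  # False
    print(jdbc_url("10.0.0.5", 5432, "sales"))
```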

Differences Compared to Open Source Greenplum Database

Pivotal Greenplum 6.x includes all of the functionality in the open source Greenplum Database project and adds:
  • Product packaging and installation script
  • Support for QuickLZ compression. QuickLZ compression is not provided in the open source version of Greenplum Database due to licensing restrictions.
  • Support for data connectors:
    • Greenplum-Spark Connector
    • Greenplum-Informatica Connector
    • Greenplum-Kafka Integration
    • Greenplum Stream Server
  • DataDirect ODBC/JDBC Drivers
  • gpcopy utility for copying or migrating objects between Greenplum systems
  • Support for managing Greenplum Database using Pivotal Greenplum Command Center
  • Support for full text search and text analysis using Pivotal GPText
  • Greenplum backup plugin for DD Boost
  • Backup/restore storage plugin API (Beta)