About gpupgrade

Overview

The Greenplum Database gpupgrade utility allows in-place upgrades from Greenplum Database 5.x to a later 6.x major release. The cluster architecture and node count remain the same between version 5.x and 6.x. The upgrade does not require the time-consuming backup and restore typically used for major version upgrades. Greenplum Database administrators use the gpupgrade command-line interface to initiate the upgrade and then manage it through each of its phases.

Due to binary incompatibility between Greenplum Database major releases, upgrading to a new major release has in the past required creating a backup of the source database data and then restoring the backup into the newly installed target Greenplum Database version. Accommodating the source database, the backup files, and the target database often required additional expensive hardware and disk storage. Significant downtime could be required to back up and restore the data and to verify the target system. The gpupgrade utility replaces this lengthy process with an in-place upgrade.

gpupgrade is not used for Greenplum minor version upgrades (for example, from 6.9 to 6.10), because minor version upgrades do not change system tables and require only a software update.
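
The distinction between a major and a minor upgrade can be sketched in a few lines of Python. This is an illustrative helper only, not part of gpupgrade: the function names parse_version and upgrade_path and the returned labels are assumptions made for this example.

```python
def parse_version(version: str) -> tuple[int, int]:
    """Split a version string such as '6.10' into (major, minor) integers."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def upgrade_path(source: str, target: str) -> str:
    """Classify an upgrade using the rules described in this overview.

    Hypothetical helper: the labels returned here are illustrative.
    """
    src_major, src_minor = parse_version(source)
    tgt_major, tgt_minor = parse_version(target)
    if src_major == tgt_major:
        # Same major release: system tables do not change, so only a
        # software update is needed and gpupgrade is not used.
        return "minor" if tgt_minor > src_minor else "none"
    if (src_major, tgt_major) == (5, 6):
        # The in-place major upgrade that this documentation covers.
        return "gpupgrade"
    # Any other combination is outside the scope of this overview.
    return "not covered here"
```

Comparing the components as integers matters: 6.10 is later than 6.9, even though a plain string or decimal comparison would order them the other way.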

NOTE In this documentation, the existing Greenplum Database 5 cluster is the source and the new Greenplum Database 6 cluster is the target.

The gpupgrade process includes five phases. For a summary of the phases and the process, see gpupgrade Process.

gpupgrade Architecture

The gpupgrade utility architecture contains three types of processes:

  • Hub: Running on the master host, the hub process coordinates the agent processes and reports upgrade status information. It allows administrators to manage the upgrade process using the gpupgrade utility.

  • Agents: Running on each host in the Greenplum cluster, including the master, standby master, and segment hosts, the gpupgrade agent processes respond to instructions from the hub. They perform tasks such as running pre-upgrade checks, running pg_upgrade on each segment on the host, and returning status information to the hub.

  • CLI: Running on the master host, the CLI issues the commands for the upgrade and provides visibility into upgrade progress.
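
The relationship between the three process types can be illustrated with a small in-process Python model. This is a toy sketch under stated assumptions, not how gpupgrade is implemented: the class names, the run and dispatch methods, the status values, and the host names are all hypothetical, and the real processes run on separate hosts rather than as objects in a single program.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Stands in for the gpupgrade agent process on one host."""
    host: str

    def run(self, task: str) -> dict:
        # A real agent would perform the work (for example, a pre-upgrade
        # check); this sketch only reports that the task completed.
        return {"host": self.host, "task": task, "status": "complete"}


@dataclass
class Hub:
    """Stands in for the hub process on the master host."""
    agents: list = field(default_factory=list)

    def dispatch(self, task: str) -> list:
        # Send the same instruction to every agent and collect the replies.
        return [agent.run(task) for agent in self.agents]


def cli(hub: Hub, task: str) -> list:
    """Stands in for the CLI: issue a command and show progress."""
    results = hub.dispatch(task)
    for result in results:
        print(f"{result['host']}: {result['task']} {result['status']}")
    return results


# Hypothetical host names: master, standby master, and two segment hosts.
hosts = ["mdw", "smdw", "sdw1", "sdw2"]
hub = Hub(agents=[Agent(host) for host in hosts])
results = cli(hub, "pre-upgrade checks")
```

The model keeps the same division of responsibility as the list above: the CLI talks only to the hub, and the hub is the only component that talks to the agents.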

A typical architecture example: