Using the optional migration scripts speeds up the upgrade process before you start the
gpupgrade initialize phase. The
gpupgrade migration scripts are two types, bash and SQL.
The migration bash scripts,
gpupgrade-migration-sql-executor.bash, identify catalog inconsistencies between source and target cluster, and generate the SQL scripts to resolve them. The
gpupgrade-migration-sql-generator.bash can be run outside your upgrade downtime window.
The migration SQL scripts are run by
gpupgrade-migration-sql-executor.bash during the upgrade downtime window, and resolve a subset of the issues that are found when the
gpupgrade initialize command runs the
pg_upgrade check command.
gpupgrade-migration-sql-executor.bash output directory has three subdirectories:
pre-initialize- Run the SQL scripts in the
pre-initializesubdirectory just before you run the
gpupgrade initializecommand. Execute these scripts as the first step after you start the downtime scheduled to perform the upgrade.
post-finalize- The SQL scripts in the
post-finalizesubdirectory are run after
gpupgrade finalizehas finished. These scripts perform tasks such as creating indexes that were dropped by the
post-revert- The SQL scripts in the
post-revertsubdirectory reverse the changes made by the
pre-initializescripts and are to be executed only if you revert the upgrade. Run them after the
gpupgrade revertcommand has finished.
gpupgrade-migration-sql-executor.bash script runs all of the SQL scripts in the directory you specify:
post-revert. See Executing the SQL Migration Scripts.
If you have used
yum install to install
gpupgrade, the bash scripts are located in
/usr/local/bin and the SQL scripts in
/usr/local/bin/greenplum/gpupgrade/data-migration-scripts. If you used
rpm -prefix=<USER-DIRECTORY> during
gpupgrade installation, the bash scripts are in
/<USER-DIRECTORY> and the SQL scripts in
The `gpupgrade-migration-sql-executor.bash` script must be run during the upgrade downtime window.
Migration issues that are not handled with the generated SQL scripts will be caught by the
pg_ugprade check command when you run
gpupgrade initialize, and you may have to manually resolve them. See gpupgrade Initialize pg_upgrade Checks for information about the issues that
pg_upgrade check detects and potential workarounds.
Currently, the data migration scripts detect these issues:
external tables used in partitioning
indexes on partitioned tables
tsquery data type. Greenplum Database 5.x contains the
tsquerydata type. The internal representation of this data type changed between Greenplum Database 5.x and 6.x, so the script changes the
tsquerydata type to
textbefore proceeding with the upgrade.
gphdfs external tables. The
gphdfsexternal table protocol is removed in Greenplum Database 6. These tables must be converted to use PXF. The
post-revertSQL scripts. The
pre-initializescripts drop external gphdfs tables from the databases. The
post-revertscript recreates them.
gphdfs role. The
gpupgrade-migration-sql-generator.bashscript generates a
pre-initializescript that drops the
gphdfsrole and a
post-revertscript to restore the role.
nameuser columns. The
post-revertscripts. Because the internal alignment of the
namedata type changed between the source and target Greenplum versions, user tables with
namedata type columns cannot be upgraded. The generated
pre-initializeSQL script alters the data type of such columns to
varchar(63). However, the script cannot handle some cases, such as a
namecolumn used as a distribution key or partitioning key, or a
namecolumn used as an index key. The
post-revertscript alters the
varchar(63)columns back to
tintervaldata types. Columns of these data types may not be used in a table’s distribution key. The
post-revertscripts to alter the column type.
unique or primary key contraints The
post-revertSQL scripts. Unique and primary key constraints must be dropped before upgrading and recreated after upgrading. The
pre-initializescripts drop the constraints. The
post-finalizescripts recreate them on the source cluster after finalization. The
post-revertscripts recreate constraints on the source cluster. The scripts cannot handle the case where a constraint is found on a child of a heap table. In this case you must back up the heap table and recreate it after the upgrade is completed. To restore constraints on child tables as they were in the source cluster, you must create the constraints manually.
gpupgrade-migration-sql-generator.bash script, to generate the SQL scripts:
$ gpupgrade-migration-sql-generator.bash <GPHOME> <PGPORT> <OUTPUT_DIR> [--help | -h]
<GPHOME>is the path to the source Greenplum Database installation directory.
<PGPORT>is the source Greenplum system port number.
<OUTPUT_DIR>is the user-defined directory where the SQL scripts are created. If the directory does not exist, it is created. If the directory already exists, all existing
.shfiles are removed from the
--helpdisplay help information, for example:
$ gpupgrade-migration-sql-generator.bash -h Identifies catalog inconsistencies between the source and target Greenplum versions and generates SQL scripts to resolve them. This command should be run prior to "gpupgrade". Usage: gpupgrade-migration-sql-generator.bash <GPHOME> <PGPORT> <OUTPUT_DIR> <GPHOME> : the path to the source Greenplum installation directory <PGPORT> : the source Greenplum system port number <OUTPUT_DIR> : the user-defined directory where the SQL scripts are created The output directory structure is: <output directory> + pre-initialize drop and alter objects prior to "gpupgrade initialize" + post-finalize restore and recreate objects following "gpupgrade finalize" + post-revert restore objects following "gpupgrade revert" After running gpupgrade-migration-sql-generator.bash, run gpupgrade-migration-sql-executor.bash. Run gpupgrade-migration-sql-executor.bash -h for more information.
$ gpupgrade-migration-sql-generator.bash $GPHOME 5432 /home/gpadmin/migration-output
Output files are located in: /home/gpadmin/migration-output/pre-initialize Output files are located in: /home/gpadmin/migration-output/post-finalize Output files are located in: /home/gpadmin/migration-output/post-revert
You can review the SQL scripts in the output directory to see what changes will be applied when you execute the scripts.
gpupgrade-migration-sql-executor.bash to execute the generated
post-revert SQL scripts. The shell script runs all SQL scripts in the specified directory and its subdirectories, so it is important to specify the
$ gpupgrade-migration-sql-executor.bash <GPHOME> <PGPORT> <INPUT_DIR>
<GPHOME>is the path to the source Greenplum Database installation directory.
<PGPORT>is the source Greenplum System port number.
<INPUT_DIR>is the directory containing the scripts you want to execute. This is the path to the
<OUTPUT_DIR>. Do not specify the
<OUTPUT_DIR>without a subdirectory.
After you execute the generated SQL scripts using
gpupgrade-migration-sql-generator.bash you can regenerate SQL scripts into the same <OUTPUT_DIR>. Regenerating the scripts in the same output directory will archive the previous run scripts in <OUTPUT_DIR>/archive/. The scripts will have
pre-initializemigration scripts. Caution: Only run the
pre-initializescripts after you have started the
$ gpupgrade-migration-sql-executor.bash $GPHOME 5432 /home/gpadmin/migration-output/pre-initialize
If you choose to revert the upgrade: After the
gpupgrade revertcommand has completed, run the generated
$ gpupgrade-migration-sql-executor.bash $GPHOME 5432 /home/gpadmin/migration-output/post-revert
If you choose to finalize the upgrade: After the
gpupgrade finalizecommand has completed, run the
$ gpupgrade-migration-sql-executor.bash $GPHOME 5432 /home/gpadmin/migration-output/post-finalize