Upgrading PXF
If you have installed the PXF .rpm and have initialized, configured, and are using PXF in your current Greenplum Database 5.21.2+ or 6.x installation, you must perform some upgrade actions when you install a new version of the PXF .rpm.
The PXF upgrade procedure has two parts. You perform one procedure before you install the new version of PXF, and one procedure after:
- Step 1: Complete the PXF Pre-Upgrade Actions
- Install the new version of PXF
- Step 2: Upgrade PXF
Step 1: Complete the PXF Pre-Upgrade Actions
Perform this procedure before you upgrade to a new version of PXF:
Log in to the Greenplum Database master node. For example:
$ ssh gpadmin@<gpmaster>
Identify and note the version of PXF currently running in your Greenplum cluster:
gpadmin@gpmaster$ pxf version
If the $GPHOME/pxf directory exists, and you are running PXF version 5.12.x or older, back up the Greenplum PXF embedded installation. For example:
gpadmin@gpmaster$ mkdir $HOME/pxf_gp_backup
gpadmin@gpmaster$ cp -r $GPHOME/pxf $HOME/pxf_gp_backup/
gpadmin@gpmaster$ cp $GPHOME/share/postgresql/extension/pxf* $HOME/pxf_gp_backup/
gpadmin@gpmaster$ cp $GPHOME/lib/postgresql/pxf* $HOME/pxf_gp_backup/
Stop PXF on each segment host as described in Stopping PXF.
Install the new version of PXF, identify and note the new PXF version number, and then continue your PXF upgrade with Step 2: Upgrade PXF.
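The backup condition in Step 1 (the $GPHOME/pxf directory exists and the running PXF version is 5.12.x or older) can be checked with a small script. The following is a minimal sketch and not part of PXF: the function names version_le and needs_embedded_backup are hypothetical, the version string is the value you noted from pxf version and is passed in by hand, and GNU sort with its -V (version sort) option is assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are illustrative and are not PXF commands.

# version_le A B: succeeds when version A sorts at or before version B.
# Assumes GNU sort, which provides the -V (version sort) option.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# needs_embedded_backup PXF_VERSION GPHOME_DIR
# Succeeds when GPHOME_DIR/pxf exists and PXF_VERSION is 5.12.x or older.
needs_embedded_backup() {
    local major_minor
    major_minor=$(printf '%s' "$1" | cut -d. -f1,2)   # "5.12.3" -> "5.12"
    [ -d "$2/pxf" ] && version_le "$major_minor" "5.12"
}

# Example (commented out): pass the version you noted in Step 1.
# if needs_embedded_backup "5.10.1" "$GPHOME"; then
#     echo "Back up the embedded PXF installation before upgrading."
# fi
```

Only the major and minor components are compared, so any 5.12.x release counts as "5.12.x or older" regardless of its patch number.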
Step 2: Upgrade PXF
After you install the new version of PXF, perform the following procedure:
Log in to the Greenplum Database master node. For example:
$ ssh gpadmin@<gpmaster>
Initialize PXF on each segment host as described in Initializing PXF. You may choose to use your existing $PXF_CONF for the initialization.
If you are upgrading from PXF version 5.9.x or earlier and you have configured any JDBC servers that access Kerberos-secured Hive, you must now set the hadoop.security.authentication property in the jdbc-site.xml file to explicitly identify use of the Kerberos authentication method. Perform the following for each of these server configs:
- Navigate to the server configuration directory.
- Open the jdbc-site.xml file in the editor of your choice and uncomment or add the following property block to the file:
<property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
</property>
- Save the file and exit the editor.
If you are upgrading from PXF version 5.11.x or earlier: The PXF Hive and HiveRC profiles now support column projection using column name-based mapping. If you have any existing PXF external tables that specify one of these profiles, and the external table relied on column index-based mapping, you may be required to drop and recreate the tables:
- Identify all PXF external tables that you created that specify a Hive or HiveRC profile.
- For each external table that you identify in step 1, examine the definitions of both the PXF external table and the referenced Hive table. If the column names of the PXF external table do not match the column names of the Hive table:
Drop the existing PXF external table. For example:
DROP EXTERNAL TABLE pxf_hive_table1;
Recreate the PXF external table using the Hive column names. For example:
CREATE EXTERNAL TABLE pxf_hive_table1( hivecolname int, hivecolname2 text )
  LOCATION( 'pxf://default.hive_table_name?PROFILE=Hive')
FORMAT 'custom' (FORMATTER='pxfwritable_import');
Review any SQL scripts that you may have created that reference the PXF external table, and update column names if required.
Synchronize the PXF configuration from the master host to the standby master and each Greenplum Database segment host. For example:
gpadmin@gpmaster$ pxf cluster sync
Start PXF on each segment host as described in Starting PXF.
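Two of the actions in Step 2 apply only when you upgrade from an older release: the jdbc-site.xml change (PXF 5.9.x or earlier) and the Hive and HiveRC external table review (PXF 5.11.x or earlier). The sketch below prints which of these apply for a given previous version. It is an illustration under stated assumptions, not a PXF tool: upgrade_actions and version_le are hypothetical names, the previous version is supplied by hand from the value you noted in Step 1, and GNU sort with the -V option is assumed.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are illustrative and are not PXF commands.

# version_le A B: succeeds when version A sorts at or before version B.
# Assumes GNU sort, which provides the -V (version sort) option.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# upgrade_actions OLD_PXF_VERSION
# Prints one line per version-dependent Step 2 action that applies.
upgrade_actions() {
    local major_minor
    major_minor=$(printf '%s' "$1" | cut -d. -f1,2)   # "5.9.2" -> "5.9"
    if version_le "$major_minor" "5.9"; then
        echo "jdbc-site.xml: set hadoop.security.authentication for servers that access Kerberos-secured Hive"
    fi
    if version_le "$major_minor" "5.11"; then
        echo "Hive/HiveRC: compare external table column names with the Hive table column names"
    fi
}

# Example (commented out): list the actions for an upgrade from 5.10.1.
# upgrade_actions "5.10.1"
```

The output is only a reminder of which conditional actions to review. Whether a given server or external table actually needs changes still depends on the conditions described in each action.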