Installing the Streaming Server

The Greenplum Streaming Server (GPSS) components are included in the Greenplum Database 5 and 6 server distributions. To install a newer version of these components than the one bundled with your distribution, download a package from the Pivotal Greenplum Streaming Server tile on Pivotal Network.

GPSS packages available for download for Redhat/CentOS 6 and 7 systems include:

  • GPSS gppkg Installer - A .gppkg file that you install to upgrade GPSS on all hosts in your Greenplum Database cluster.
  • GPSS ETL Installer - An .rpm file that you install to install or upgrade GPSS on a dedicated ETL server host on which Greenplum Database is not installed.
  • GPSS tarball - A .tar.gz file that you install to upgrade GPSS on a single dedicated ETL server host that includes a Greenplum Database server installation.

About the Download Packages

The GPSS gppkg Installer and the GPSS tarball package files install the libraries, executables, and script files required to register and use the Greenplum Streaming Server client and server utilities. Both packages install these files directly into your Greenplum Database installation.

The GPSS ETL Installer package file installs the client-side executables and dependent libraries, and a script to set up the ETL runtime environment.

Table 1. Installed Programs

Name      Description
--------  -----------
gpkafka   Load Kafka data into Greenplum Database using a single command.
gpss      Start a Pivotal Greenplum Streaming Server instance.
gpsscli   Manage (submit, start, stop, and so forth) a Pivotal Greenplum Streaming Server data load job; currently supports Kafka and file data sources.
kafkacat  Kafka test and debug utility (https://github.com/edenhill/kafkacat).

Downloading a GPSS Installer

Download the appropriate GPSS installer package for your Greenplum Database version and operating system platform from Pivotal Network. For example, to download the Greenplum 6 .gppkg package for Redhat/CentOS 7, click to select the RHEL7->GPSS gppkg Installer for gpdb6 file.

The naming format of the GPSS installer files is:
    gpss-gpdb<major-version>-<gpss-version>-rhel<N>-x86_64.<filetype>
For example, the GPSS installer files for Greenplum Database 5 for Redhat/CentOS 6 are named:
    gpss-gpdb5-1.3.5-rhel6-x86_64.gppkg
    gpss-gpdb5-1.3.5-rhel6-x86_64.tar.gz
    gpss-gpdb5-1.3.5-rhel6-x86_64.rpm
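
Because the file name encodes the Greenplum Database major version, the GPSS version, and the platform, a script can check a downloaded file before installing it. The following is a minimal sketch in POSIX shell. parse_gpss_filename is a hypothetical helper, not part of GPSS, and it assumes that the file name follows the format above exactly:

```shell
# Hypothetical helper: split a GPSS installer file name into its parts.
# Prints one line of key=value pairs, or returns 1 if the name does not match.
parse_gpss_filename() {
    name="${1##*/}"                   # drop any leading directory
    case "$name" in
        gpss-gpdb*-*-rhel*-x86_64.gppkg)  filetype=gppkg ;;
        gpss-gpdb*-*-rhel*-x86_64.tar.gz) filetype=tar.gz ;;
        gpss-gpdb*-*-rhel*-x86_64.rpm)    filetype=rpm ;;
        *) echo "unrecognized GPSS installer name: $name" >&2; return 1 ;;
    esac
    rest="${name#gpss-gpdb}"          # e.g. 5-1.3.5-rhel6-x86_64.tar.gz
    gpdb_major="${rest%%-*}"          # text before the first hyphen
    rest="${rest#*-}"                 # e.g. 1.3.5-rhel6-x86_64.tar.gz
    gpss_version="${rest%%-*}"        # text before the next hyphen
    rest="${rest#*-rhel}"             # e.g. 6-x86_64.tar.gz
    rhel_major="${rest%%-*}"
    echo "gpdb=$gpdb_major gpss=$gpss_version rhel=$rhel_major type=$filetype"
}

parse_gpss_filename gpss-gpdb5-1.3.5-rhel6-x86_64.tar.gz
# prints: gpdb=5 gpss=1.3.5 rhel=6 type=tar.gz
```

The sketch uses only parameter expansion and case patterns, so it does not depend on bash-specific features.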

Note the name and the file system location of the downloaded file.

Prerequisites

Before you install a GPSS package, ensure that you have stopped all Greenplum Streaming Server load jobs and gpss server instances running in the Greenplum Database cluster and on the ETL host system.
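
One way to confirm that no gpss server instance is still running on a host is to look for the process by name. The sketch below assumes that the pgrep utility is available and that the server process is named gpss, as listed in Table 1. process_is_stopped is a hypothetical helper. It checks only the local host, so you would run it on each Greenplum Database host and on the ETL host:

```shell
# Hypothetical helper: succeed only if no process with exactly this name
# is running on the local host. Assumes pgrep is installed.
process_is_stopped() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "$1 is still running on $(uname -n); stop it before installing" >&2
        return 1
    fi
    return 0
}

# Example: warn if a gpss server instance is still running.
process_is_stopped gpss || echo "stop gpss, then rerun the installation"
```

This check does not stop anything and does not inspect individual load jobs; it only reports whether a gpss process exists.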

Installing the GPSS gppkg

The GPSS gppkg Installer updates the GPSS components on all hosts in the Greenplum Database cluster.

Note: The GPSS executables, libraries, and supporting files are installed directly into $GPHOME, overwriting the previous versions of the files.

Perform the following procedure to install the GPSS .gppkg:

  1. Locate the installer file that you downloaded from Pivotal Network and copy the file to the Greenplum Database master host.
  2. Log in to the Greenplum Database master host as the gpadmin administrative user and set up your environment. For example:
    $ ssh gpadmin@<gpmaster>
    gpadmin@gpmaster$ . /usr/local/greenplum-db/greenplum_path.sh
  3. Ensure that Greenplum Database is running.
  4. Run the gppkg command to install the GPSS .gppkg on all hosts in the Greenplum Database cluster. For example, to install the package on a Greenplum 6 cluster running on Redhat/CentOS 7:
    gpadmin@gpmaster$ gppkg -i gpss-gpdb6-1.3.5-rhel7-x86_64.gppkg
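
After the installation completes, you can confirm that the package is registered by querying the installed packages. The sketch below assumes that gppkg -q --all lists the installed packages and that the GPSS entry contains the string gpss; the exact output format is not shown here. verify_gpss_gppkg is a hypothetical helper:

```shell
# Hypothetical helper: list installed packages whose name mentions gpss.
# Returns 0 if a match is found, 1 if none is found, and 2 if gppkg
# is not on PATH (for example, greenplum_path.sh was not sourced).
verify_gpss_gppkg() {
    if ! command -v gppkg >/dev/null 2>&1; then
        echo "gppkg not found; source greenplum_path.sh first" >&2
        return 2
    fi
    # The pipeline's status is that of grep: 0 on a match, 1 otherwise.
    gppkg -q --all | grep -i gpss
}

# Intended use on the master host, as gpadmin:
#   verify_gpss_gppkg || echo "GPSS package not listed"
```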

Installing the GPSS Tarball

The GPSS tarball .tar.gz installer updates the GPSS components on a single Greenplum Database host.

Note: The GPSS executables, libraries, and supporting files are installed directly into $GPHOME, overwriting the previous versions of the files.

Perform the following procedure to install the GPSS .tar.gz:

  1. Locate the installer file that you downloaded from Pivotal Network and copy the file to the Greenplum Database host.
  2. Log in to the Greenplum Database host as the gpadmin administrative user and set up your environment. For example:
    $ ssh gpadmin@<gphost>
    gpadmin@gphost$ . /usr/local/greenplum-db/greenplum_path.sh
  3. Unpack the .tar.gz file. For example, to unpack the file for Greenplum 5 on Redhat/CentOS 6:
    gpadmin@gphost$ tar xzvf gpss-gpdb5-1.3.5-rhel6-x86_64.tar.gz

    Unpacking the file creates a directory named gpss-gpdb5-1.3.5-rhel6_x86_64/ in the current working directory. Its contents include bin/, lib/, and share/ directories, as well as an install script named install_gpdb_component.

  4. Navigate to the unpacked directory. For example:
    gpadmin@gphost$ cd gpss-gpdb5-1.3.5-rhel6_x86_64
  5. Run the install script to install the new GPSS components into $GPHOME. For example:
    gpadmin@gphost$ ./install_gpdb_component
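
Instead of typing the directory name in step 4, a script can read it from the archive itself. The following is a minimal sketch, assuming that the tarball contains a single top-level directory as described in step 3. tarball_top_dir is a hypothetical helper, not part of GPSS:

```shell
# Hypothetical helper: print the unique top-level entries of a .tar.gz file.
# For an archive with one top-level directory, this prints that directory name.
tarball_top_dir() {
    tar tzf "$1" |
        sed -e 's|^\./||' -e 's|/.*$||' -e '/^$/d' |   # keep the first path component
        sort -u
}

# Intended use after unpacking, with the file name from the example above:
#   cd "$(tarball_top_dir gpss-gpdb5-1.3.5-rhel6-x86_64.tar.gz)" && ./install_gpdb_component
```

Reading the name from the archive avoids depending on how the directory name is derived from the file name.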

Installing the GPSS ETL RPM

The GPSS ETL Installer installs the GPSS commands on a single ETL host.

Perform the following procedure to install the GPSS ETL RPM:

  1. Locate the installer file that you downloaded from Pivotal Network and copy the file to the ETL host.
  2. Log in to the ETL host. For example:
    $ ssh <etluser>@<etlhost>
  3. Install the RPM using your package management utility. You must be the superuser or have sudo access to install packages. For example, to install the ETL package for Greenplum 6 on Redhat/CentOS 7:
    etluser@etlhost$ sudo yum install gpss-gpdb6-1.3.5-rhel7-x86_64.rpm

    The GPSS ETL tools are installed into the /usr/local/gpss-<version> directory. The installation process creates a symbolic link from /usr/local/gpss to this install directory.

  4. Before you use the GPSS ETL tools, source the gpss_path.sh environment file:
    etluser@etlhost$ . /usr/local/gpss/gpss_path.sh
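
To check that sourcing the environment file had the intended effect, you can test whether the GPSS programs are now on your PATH. The sketch below assumes that the ETL package provides the four programs listed in Table 1; missing_commands is a hypothetical helper:

```shell
# Hypothetical helper: print each named command that is not on PATH.
# Returns 1 if any command is missing, 0 otherwise.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return "$status"
}

# Program names are taken from Table 1.
if missing_commands gpss gpsscli gpkafka kafkacat; then
    echo "all GPSS programs found on PATH"
fi
```

If any names are printed, source /usr/local/gpss/gpss_path.sh again in the current shell and rerun the check.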