Example of Managing Transaction IDs
Example of Managing Transaction IDs
- Calculations on XID values using modulo-232 arithmetic that allow
Greenplum Database to reuse XID values. The modulo calculations determine the order
of transactions, whether one transaction has occurred before or after another, based
on the XID.
Every XID value can have up to two billion (231) XID values that are considered previous transactions and two billion (231 -1 ) XID values that are considered newer transactions. The XID values can be considered a circular set of values with no endpoint similar to a 24 hour clock.
Using the Greenplum Database modulo calculations, as long as two XIDs are within 231 transactions of each other, comparing them yields the correct result.
- A frozen XID value that Greenplum Database uses as the XID for current (visible)
data rows. Setting a row's XID to the frozen XID performs two functions.
- When Greenplum Database compares XIDs using the modulo calculations, the frozen XID is always smaller, earlier, when compared to any other XID. If a row's XID is not set to the frozen XID and 231 new transactions are run, the row appears to be run in the future based on the modulo calculation.
- When the row's XID is set to the frozen XID, the original XID can be used, without duplicating the XID. This keeps the number of data rows on disk with assigned XIDs below (232).
Simple MVCC Example
- The table is a simple table with 2 columns and 4 rows of data.
- The valid transaction ID (XID) values are from 0 up to 9, after 9 the XID restarts at 0.
- The frozen XID is -2. This is different than the Greenplum Database frozen XID.
- Transactions are performed on a single row.
- Only insert and update operations are performed.
- All updated rows remain on disk, no operations are performed to remove obsolete rows.
The example only updates the amount values. No other changes to the table.
Managing Simultaneous Transactions
This table is the initial table data on disk with no updates. The table contains two database columns for transaction IDs, xmin (transaction that created the row) and xmax (transaction that updated the row). In the table, changes are added, in order, to the bottom of the table.
- xid = 4: update tbl set amount=208 where item = 'widget'
- xid = 5: update tbl set amount=133 where item = 'sprocket'
- xid = 6: update tbl set amount=16 where item = 'widget'
In the next table, the bold items are the current rows for the table. The other rows are obsolete rows, table data that on disk but is no longer current. Using the xmax value, you can determine the current rows of the table by selecting the rows with null value. Greenplum Database uses a slightly different method to determine current table rows.
- UPDATE command changes the sprocket amount value to 133 (xmin value 5)
- SELECT command returns the value of sprocket.
During the UPDATE transaction, the database returns the value of sprocket 300, until the UPDATE transaction completes.
Managing XIDs and the Frozen XID
- Greenplum Database issues a warning stating that the database is running out
WARNING: database "database_name" must be vacuumed within number_of_transactions transactions
- Before the last XID is assigned, Greenplum Database stops accepting
transactions to prevent assigning an XID value twice and issues this
FATAL: database is not accepting commands to avoid wraparound data loss in database "database_name"
- A VACUUM operation frees up XID values so that a table can have more than 10 rows by changing the xmin values to the frozen XID.
- A VACUUM operation manages obsolete or deleted table rows on disk. This database's VACUUM command changes the XID values obsolete to indicate obsolete rows. A Greenplum Database VACUUM operation, without the FULL option, deletes the data opportunistically to remove rows on disk with minimal impact to performance and data availability.
- For the widget and sprocket rows on disk that are no longer current, the rows have been marked as obsolete.
- For the giblet and gizmo rows that are current, the xmin has been changed to
the frozen XID.
The values are still current table values (the row's xmax value is null). However, the table row is visible to all transactions because the xmin value is frozen XID value that is older than all other XID values when modulo calculations are performed.
After the VACUUM operation, the XID values 0, 1, 2, and 3 available for use.
When a row disk with the xmin value of -2 is updated, the xmax value is replaced with the transaction XID as usual, and the row on disk is considered obsolete after any concurrent transactions that access the row have completed.
Obsolete rows can be deleted from disk. For Greenplum Database, the VACUUM command, with FULL option, does more extensive processing to reclaim disk space.
Example of XID Modulo Calculations
The next table shows the table data on disk after more UPDATE transactions. The XID values have rolled over and start over at 0. No additional VACUUM operations have been performed.
When performing the modulo calculations that compare XIDs, Greenplum Database, considers the XIDs of the rows and the current range of available XIDs to determine if XID wrapping has occurred between row XIDs.
For the example table XID wrapping has occurred. The XID 1 for giblet row is a later transaction than the XID 7 for widget row based on the modulo calculations for XID values even though the XID value 7 is larger than 1.
For the widget and sprocket rows, XID wrapping has not occurred and XID 7 is a later transaction than XID 5.