E.1. Postgres-XL Release 10r1beta1

E.1.1. Overview
E.1.2. Migration to Postgres-XL 10r1beta1
E.1.3. Major Enhancements in Postgres-XL 10r1beta1
E.1.4. Changes
E.1.5. Important Bug Fixes
E.1.6. Known Limitation
E.1.7. Contributing to Postgres-XL
E.1.8. Credits

Release Date

2018-09-17

Current as of 2018-09-17

E.1.1. Overview

This major release of Postgres-XL comes after Postgres-XL 9.5r1, which was based on PostgreSQL 9.5. This release includes most of the new features added in PostgreSQL 9.6 and 10 releases. This also includes almost all significant performance enhancements that were made to PostgreSQL in the last few years.

PostgreSQL 9.6 added query parallelism and PostgreSQL 10 added declarative partitioning. Both these features are now available in Postgres-XL 10r1beta1. Both these features come with certain limitations and hence users are advised to test these new features thoroughly before deploying applications.

A lot of work has gone into improving the code quality of the distributed query planner. The query planner now uses several features available in core PostgreSQL such as partial aggregates and upper planner pathification. This has resulted in fixing several corner bug cases, simplified the code and allowed us to enable some features such as GROUPING SETS, ROLLUP and CUBE.

Apart from this, a lot of work has gone into fixing outstanding bugs, improving stability of the product and significantly improving performance for OLTP workloads. The following sections describe the major enhancements in PostgreSQL that are now available in Postgres-XL. A short list of features that are currently not supported or have limited support is also mentioned in Section E.18.3.5.

E.1.2. Migration to Postgres-XL 10r1beta1

E.1.2.1. Migration to Postgres-XL 10r1beta1 from PostgreSQL 10

A dump/restore using pg_dumpall is required for those wishing to migrate data from PostgreSQL 10 release. We don't currently support pg_upgrade, as a mechanism to upgrade from PostgreSQL 10.

E.1.2.2. Migration to Postgres-XL 10r1beta1 from Postgres-XL 9.5r1.x

Version 10r1beta1 contains a number of changes that may affect compatibility with previous releases. Since Postgres-XL 10r1beta1 release includes all changes from PostgreSQL 9.6 and PostgreSQL 10 releases, it is recommended that you review respective release notes of those releases to find the exact incompatibilities.

E.1.3. Major Enhancements in Postgres-XL 10r1beta1

Postgres-XL 10r1beta1 is the first major release after Postgres-XL 9.5r1 release. So this release contains most of the major enhancements that went into PostgreSQL releases 9.6 and 10. This is a very short list of such enhancements, but all other enhancements also apply, unless otherwise stated.

E.1.3.1. Major Enhancements from PostgreSQL 10

  • Declarative table partitioning

  • Improved query parallelism

  • Significant general performance improvements

  • Improved monitoring and control

E.1.3.2. Major Enhancements from PostgreSQL 9.6

  • Parallel execution of sequential scans, joins and aggregates

  • Avoid scanning pages unnecessarily during vacuum freeze operations

  • Substantial performance improvements, especially in the area of scalability on multi-CPU-socket servers

  • Full-text search can now search for phrases (multiple adjacent words)

E.1.4. Changes

Below you will find a detailed account of the changes between Postgres-XL 10r1beta1 and the previous major release.

E.1.4.1. Performance

  • Allow minimum value of pgxl_remote_fetch_size to zero

    This is an experimental work. By setting pgxl_remote_fetch_size to 0, user can fetch all rows at once from the remote side (instead of fetching a small number at a time, thus giving the node to produce more rows as the previous set is consumed). While fetching all rows at once is not very useful, it allows us to circumvent PostgreSQL's limitation of not supporting parallel queries unless a Portal is run once and to the end.

  • Automatically trigger ANALYZE (COORDINATOR) on remote coordinators

    When a table is manually or automatically analyzed, the coordiantor will run ANALYZE (COORDINATOR) command to all other coordinators for them to update their planner stats. This ensures that the planner stats remain up to date on all coordinators in the cluster. The feature works when a table is analyzed outside a transaction block.

  • Collect index statistics during ANALYZE on coordinator.

  • Randomize the choice of the initial ROUNDROBIN node.

    With roundrobin node, the initial node was always set to the first node in the list. That works fine when inserting many rows at once (e.g. with INSERT SELECT), but with single-row inserts this puts all data on the first node, resulting in unbalanced distribution. This randomizes the choice of the initial node, so that with single-row inserts the ROUNDROBIN behaves a bit like RANDOM distribution.

E.1.4.2. Additional Features

  • pgxc_ctl is now a built-in command and gets automatically compiled and installed along with other binaries.

  • Enable Hot Standby on the replicas.

  • Add support for MODULO distribution on BIGINT.

  • Support an additional syntax ANALYZE (COORDINATOR) to allow users to rebuild coordinator side statistics without running ANALYZE again on the datanodes.

    One of the long standing problems in multi-coordinator setup is that when a table is either manually ANALYZEd or auto-analyzed on a coordinator, the other coordinators don't update their planner statistics automatically. This is even a bigger problem with auto-analyze because a coordinator which is not involved in any DMLs, may not be even aware about the changes to the table and hence it will not pick up the table for auto-analyze, thus often generating very poor query plans. We now fix that by automatically running ANALYZE (COORDINATOR) command on the remote coordinators when a table is either manually or automatically analyzed on one coordinator. ANALYZE (COORDINATOR) does not force a ANALYZE on the datanodes, but only rebuilds coordinator side stats using the current stats available on the datanodes.

  • Support GROUPING SETS, ROLLUP and CUBE.

E.1.5. Important Bug Fixes

  • Make improvements to cluster monitor process and global state reporting.

  • Correctly select the GTM proxy for a new node being added.

  • Fixes related to handling of schema qualification of temporary tables on remote nodes.

  • Fix handling of REFRESH MATERIALIZED VIEW CONCURRENTLY.

  • Improve locking semantics in GTM and GTM Proxy.

  • Ensure pooler process follows consistent model for SIGQUIT handling.

  • Fix broken implementation of recovery to barrier.

  • Track clearly whether to run a remote transaction in autocommit or a block.

  • Disable FQS for cursors defined with SCROLL.

  • Ensure that database objects are created consistently.

    We now create views/materialised views on all nodes, unless they are temporary objects in which case they are created only on the local coordinator and the datanodes. Similarly, temporary sequences are created on the local coordinator and the datanodes.

  • Prevent dropping distribution keys for MODULO.

  • Use thread-specific storage for computing a snapshot.

  • Add a user configurable parameter pgxl_remote_fetch_size to control the number of rows fetched from the remote side during RemoteSubplan execution. Switch connections after processing pgxl_remote_fetch_size rows.

  • Allow COPY (INSERT RETURNING).

  • Handle temp sequences so that duplicate values are not produced.

  • Ensure that the config parameters specified in the gtmPxyExtraConfig file take precedence over the defaults.

  • Handle sequence's transactional behaviour on GTM.

  • Handle multi-command queries correctly inside SQL as well as plpgsql functions.

  • Do not silently skip FK constraints if loose_constraints are ON.

  • Ensure that the statistics about all-visible pages in fetched from remote datanodes and updated at the coordinator for better planning.

E.1.6. Known Limitation

While Postgres-XL strives hard to stay compatible with PostgreSQL so that applications running on PostgreSQL can be easily ported to Postgres-XL, there are certain known limitations of the product. Many of these can be addressed as and when sufficient development funding and resources become available.

  • Distributed deadlock detection is not supported yet. Deadlock detection exists on each node, just not across nodes.

  • Materialised views are currently maintained only on the coordinator.

  • EXCLUSION CONSTRAINTS are enforced when both rows map to the same datanode.

  • User defined functions have several limitations.

  • There are restrictions on complex UPDATE/DELETE queries and updating distribution column values.

  • TRIGGERs are not supported.

  • EVENT TRIGGERs are not supported.

  • SERIALIZABLE TRANSACTIONs are not supported.

  • CREATE INDEX CONCURRENTLY is not supported.

  • SAVEPOINTs are not supported.

  • Large objects are not supported.

  • Recursive queries work only in certain conditions.

  • Foreign Data Wrappers are not supported.

  • INSENSITIVE/SCROLL/WITH HOLD cursors are not supported.

  • LISTEN/NOTIFY is not supported.

  • Remote Subplan nodes are marked parallel unsafe. This implies that parallel workers can only be used for bottom most Remote Subplan node(s) in a plan tree.

  • CREATE PUBLICATION/SUBSCRIPTION is not supported.

  • All partitions of a partitioned table must have consistent column ordering and column positioning.

E.1.7. Contributing to Postgres-XL

Contributions to Postgres-XL are welcome. The code will be accepted under the same open source license that governs this released version of Postgres-XL. The authors of the patches will be credited appropriately in the release notes of the future releases.

E.1.8. Credits

Postgres-XL has been evolving over many years, originating with the GridSQL project and Postgres-XC, later combined as TransLattice Storm (StormDB). TransLattice open sourced the project, resulting in Postgres-XL 9.2. More recently, the EU-funded Big Data project AXLE funded the main work to bring the code up to date as Postgres-XL 9.5, allowing Postgres-XL to take advantage of the rapidly increasing BI features plugged into the core of PostgreSQL.

2ndQuadrant has been actively developing and managing Postgres-XL since then. This new release of the product is a result of efforts by various 2ndQuadrant engineers, with collaboration of other members of the Postgres-XL community.