E.24. Postgres-XL Release 9.5r1.1

E.24.1. Migration to Version Postgres-XL 9.5r1.1
E.24.2. Changes

Release Date

2016-05-12

This release contains a variety of fixes from Postgres-XL 9.5r1. For information about new features in the Postgres-XL 9.5r1 major release, see Section E.25.

This release also incorporates all the bug and security fixes in the PostgreSQL 9.5.3 release.

E.24.1. Migration to Version Postgres-XL 9.5r1.1

A dump/restore is not required for those running Postgres-XL 9.5r1.

E.24.2. Changes

  • Fix a nasty bug that was zeroing out clog and subtrans pages, causing various kinds of data corruption.

    The bug dates back to the XC days, but became prominent in XL because of certain recent changes, especially the addition of the cluster monitor process. In Postgres-XL a node may not see all XIDs, so the clog and subtrans log must be extended whenever a new XID crosses the previously seen page boundary. We do this by comparing the page number that the new XID maps to with the latest_page_no stored in the shared SLRU data structure. To handle XID wrap-around, we added a check that the difference in page numbers be less than CLOG_WRAP_CHECK_DELTA, which was incorrectly defined as (2^30 / CLOG_XACTS_PER_PAGE). Note that "^" is the bitwise XOR operator in C, not exponentiation, so this returned the very small value 28, causing incorrect zeroing of pages whenever ExtendCLOG was called with an XID older than what 28 clog pages can hold. All such transactions would suddenly be marked as aborted, resulting in the removal of perfectly valid tuples. The snippet below illustrates the operator mix-up.

    This bug is now fixed.
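    For illustration, a minimal standalone C snippet showing the operator confusion (this is not the actual macro definition from the source, just the core mistake):

        #include <stdio.h>

        int
        main(void)
        {
            /* In C, "^" is the bitwise XOR operator, not exponentiation. */
            printf("2 ^ 30  = %d\n", 2 ^ 30);   /* prints 28, not 2 to the 30th */
            printf("1 << 30 = %d\n", 1 << 30);  /* prints 1073741824, the intended value */
            return 0;
        }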

  • Extend CLOG and Subtrans Log correctly when a new XID is received from the remote node.

    When a datanode assigns an XID for a running transaction, it sends that XID back to the coordinator. If the XID maps to a new CLOG page, the coordinator must extend the CLOG to cover that page, as in the sketch below.
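    A simplified, self-contained sketch of the page-boundary check (TransactionIdToPage mirrors the macro in PostgreSQL's clog.c; extend_clog, the page size, and the simulated shared state stand in for the real code):

        #include <stdint.h>
        #include <stdio.h>

        typedef uint32_t TransactionId;

        #define CLOG_XACTS_PER_PAGE 32768   /* assumed: 2 status bits per xact, 8 kB pages */
        #define TransactionIdToPage(xid) ((int) ((xid) / CLOG_XACTS_PER_PAGE))

        static int latest_page_no = 0;      /* simulated shared SLRU state */

        /* Stand-in for ExtendCLOG(): allocate (zero) pages up to the XID's page. */
        static void
        extend_clog(TransactionId xid)
        {
            int pageno = TransactionIdToPage(xid);

            if (pageno > latest_page_no)
            {
                printf("extending clog: pages %d..%d\n", latest_page_no + 1, pageno);
                latest_page_no = pageno;
            }
        }

        int
        main(void)
        {
            extend_clog((TransactionId) 100000);   /* page 3: extends the clog */
            extend_clog((TransactionId) 100001);   /* same page: no-op */
            return 0;
        }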

  • Correct shared memory size calculation for Shared Queue hashtable.

  • Add a reference count mechanism for Shared Queue management.

    The reference count is incremented when a process acquires or binds to a Shared Queue, and decremented when the process releases or unbinds from it. The new mechanism ensures that the queue is returned to the free list only once all users have finished their work; a sketch follows.
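    A simplified sketch of the counting scheme (the structure and function names are illustrative, not the actual Postgres-XL API; locking is omitted for brevity):

        #include <stdbool.h>
        #include <stddef.h>

        typedef struct SharedQueue
        {
            int                 refcount;     /* processes currently attached */
            bool                on_free_list;
            struct SharedQueue *next_free;
        } SharedQueue;

        static SharedQueue *free_list = NULL;

        /* Called when a process acquires or binds to the queue. */
        static void
        squeue_pin(SharedQueue *sq)
        {
            sq->refcount++;
        }

        /* Called when a process releases or unbinds from the queue. */
        static void
        squeue_unpin(SharedQueue *sq)
        {
            if (--sq->refcount == 0 && !sq->on_free_list)
            {
                /* Last user is gone: return the queue to the free list. */
                sq->next_free = free_list;
                sq->on_free_list = true;
                free_list = sq;
            }
        }

        int
        main(void)
        {
            SharedQueue sq = {0};

            squeue_pin(&sq);     /* producer binds */
            squeue_pin(&sq);     /* consumer binds */
            squeue_unpin(&sq);   /* consumer done */
            squeue_unpin(&sq);   /* producer done: queue goes to the free list */
            return free_list == &sq ? 0 : 1;
        }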

  • Interpret shared_queue_size as a per-datanode value.

    Earlier, each consumer of a shared queue would get shared_queue_size/num_consumers kilobytes of shared memory, so the amount available to each consumer depended heavily on the number of datanodes in the cluster, making it difficult to choose a sensible default. We now treat shared_queue_size as a per-datanode setting and compute the total size by multiplying it by the maximum number of datanodes configured via max_datanodes, as illustrated below.
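    The effect of the change, in a self-contained sketch (example values only; the actual computation lives in the shared queue sizing code):

        #include <stdio.h>

        int
        main(void)
        {
            int shared_queue_size = 64;   /* kB, example GUC setting */
            int max_datanodes = 16;       /* example cluster limit */
            int num_consumers = 16;

            /* Old behavior: each consumer's slice shrank as datanodes were added. */
            printf("old per-consumer share: %d kB\n",
                   shared_queue_size / num_consumers);    /* 4 kB */

            /* New behavior: shared_queue_size is per datanode, so the total
             * memory reserved per queue scales with max_datanodes. */
            printf("new total per queue: %d kB\n",
                   shared_queue_size * max_datanodes);    /* 1024 kB */
            return 0;
        }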

  • Set shared_queues to at least 1/4th of max_connections.

    This parameter depends heavily on the number of concurrent sessions, and in the worst case every session may use more than one shared queue. While the user should still set this value high enough for the expected number of concurrent distributed queries, we now automatically raise it to at least 1/4th of max_connections to avoid running with too small a value (see the sketch below).
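    The adjustment amounts to a clamp like this sketch (the Max macro mirrors the backend's; the call site and values are assumptions):

        #include <stdio.h>

        #define Max(a, b) ((a) > (b) ? (a) : (b))

        int
        main(void)
        {
            int max_connections = 400;   /* example GUC value */
            int shared_queues = 16;      /* configured too low */

            /* Raise shared_queues to at least a quarter of max_connections. */
            shared_queues = Max(shared_queues, max_connections / 4);
            printf("shared_queues = %d\n", shared_queues);   /* 100 */
            return 0;
        }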

  • Fix a memory leak in the GTM proxy.

  • Properly deallocate prepared statements on the remote node when the user requests it.

  • Avoid protocol breakage when the pooler fails to open a connection to one or more nodes.

  • Add a mechanism to selectively refresh pooler information when only connection options, such as the hostname or port of a node, change.

    This allows us to retain connections to all other nodes in the cluster and recreate connections only to the node whose connection information changed. This will be especially handy when dealing with datanode/coordinator failover; a sketch of the idea follows.
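    A sketch of the selective-refresh idea (the structure and function names here are illustrative, not the actual pooler API):

        #include <stdio.h>
        #include <string.h>

        /* Illustrative per-node connection info kept by the pooler. */
        typedef struct NodeConnInfo
        {
            char host[256];
            int  port;
        } NodeConnInfo;

        /* Hypothetical helper: rebuild pooled connections for one node only. */
        static void
        recreate_node_connections(int node_idx)
        {
            printf("recreating connections to node %d\n", node_idx);
        }

        static void
        refresh_pooler(NodeConnInfo *cached, const NodeConnInfo *current, int nnodes)
        {
            for (int i = 0; i < nnodes; i++)
            {
                /* Refresh only nodes whose host or port actually changed;
                 * connections to every other node are left untouched. */
                if (strcmp(cached[i].host, current[i].host) != 0 ||
                    cached[i].port != current[i].port)
                {
                    cached[i] = current[i];
                    recreate_node_connections(i);
                }
            }
        }

        int
        main(void)
        {
            NodeConnInfo cached[2]  = {{"node0", 5432}, {"node1", 5432}};
            NodeConnInfo current[2] = {{"node0", 5432}, {"node1", 5433}};

            refresh_pooler(cached, current, 2);   /* recreates connections to node 1 only */
            return 0;
        }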