18.3. Starting Postgres-XL Cluster

18.3.1. Creating Databases
18.3.2. Starting a GTM
18.3.3. Starting a GTM-Proxy
18.3.4. Configuring Datanodes
18.3.5. Configuring Coordinators
18.3.6. Starting Datanodes
18.3.7. Starting Coordinators
18.3.8. Server Start-up Failures
18.3.9. Client Connection Problems

Before anyone can access the database, you must start the database server. The database server program is called postgres. The postgres program must know where to find the data it is supposed to use. This is done with the -D option. Thus, the simplest way to start the server is:

$ postgres -D /usr/local/pgsql/data

which will leave the server running in the foreground. This must be done while logged into the PostgreSQL user account. Without -D, the server will try to use the data directory named by the environment variable PGDATA. If that variable is not provided either, it will fail.

Normally it is better to start postgres in the background. For this, use the usual Unix shell syntax:

$ postgres -D /usr/local/pgsql/data >logfile 2>&1 &

It is important to store the server's stdout and stderr output somewhere, as shown above. It will help for auditing purposes and to diagnose problems. (See Section 24.3 for a more thorough discussion of log file handling.)

The postgres program also takes a number of other command-line options. For more information, see the postgres reference page and Chapter 19 below.

This shell syntax can get tedious quickly. Therefore the wrapper program pg_ctl is provided to simplify some tasks. For example:

$ pg_ctl start -l logfile

will start the server in the background and put the output into the named log file. The -D option has the same meaning here as for postgres. pg_ctl is also capable of stopping the server.
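Stopping the server works the same way. A sketch, assuming the example data directory used above:

```shell
$ pg_ctl stop -D /usr/local/pgsql/data -m fast
```

The -m fast mode disconnects clients and rolls back their open transactions instead of waiting for them to finish.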

Normally, you will want to start the database server when the computer boots. Autostart scripts are operating-system-specific. There are a few distributed with PostgreSQL in the contrib/start-scripts directory. Installing one will require root privileges.

Different systems have different conventions for starting up daemons at boot time. Many systems have a file /etc/rc.local or /etc/rc.d/rc.local. Others use init.d or rc.d directories. Whatever you do, the server must be run by the PostgreSQL user account and not by root or any other user. Therefore you probably should form your commands using su postgres -c '...'. For example:

su postgres -c 'pg_ctl start -D /usr/local/pgsql/data -l serverlog'

Here are a few more operating-system-specific suggestions. (In each case be sure to use the proper installation directory and user name where we show generic values.)

While the server is running, its PID is stored in the file postmaster.pid in the data directory. This is used to prevent multiple server instances from running in the same data directory and can also be used for shutting down the server.

As described in the previous chapter, Postgres-XL consists of several components. The minimum set of components is a GTM, a GTM-Proxy, a Coordinator and a Datanode. You must configure and start each of them; the following sections explain how. pgxc_clean and GTM-Standby are described in the high-availability sections.

18.3.1. Creating Databases

You should initialize each database that composes the Postgres-XL database cluster. Both Coordinators and Datanodes have their own databases, and you should initialize each of them. A Coordinator holds just the database catalog and temporary data; Datanodes hold most of your data. First of all, you should determine how many Coordinators and Datanodes to run and where to run them. It is a good convention to run a Coordinator on each server where you run a Datanode, and to run a GTM-Proxy on the same server as well. This simplifies the Postgres-XL configuration and helps keep the workload of each server even.

Both Coordinator and Datanode databases are essentially PostgreSQL databases. They are separate, and you should initialize them separately.
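Initialization is done with initdb, as in vanilla PostgreSQL, plus a --nodename option that names the node. A sketch, assuming a Datanode and a Coordinator sharing one host (the paths and node names are examples):

```shell
$ initdb -D /usr/local/pgsql/data_datanode --nodename datanode1
$ initdb -D /usr/local/pgsql/data_coord --nodename coord1
```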

18.3.2. Starting a GTM

The GTM provides global transaction management to all the other components of a Postgres-XL database cluster. Because the GTM handles transaction requirements from all the Coordinators and Datanodes, it is highly advisable to run it on a separate server.

Before you start the GTM, you should decide the following:

Where to run the GTM

Because the GTM receives all requests to begin and end transactions and to obtain sequence values, you should run it on a separate server. If you run the GTM on the same server as a Datanode or Coordinator, it becomes harder to keep the workload reasonably balanced.

Then, determine the GTM's working directory. Create this directory before you run the GTM.

Listen address and port of GTM

Next, determine the listen address and port of the GTM. The listen address can be either an IP address or a host name on which the GTM receives requests from the other components, typically GTM-Proxies.

GTM id

You can run more than one GTM in a Postgres-XL cluster. For example, if you need a backup GTM in a high-availability environment, you run two GTMs. Give each such GTM a unique GTM ID; GTM ID values begin with one.

When this is determined, you can initialize the GTM with the command initgtm, for example:

$ initgtm -Z gtm -D /usr/local/pgsql/data_gtm

All GTM parameters can be modified in gtm.conf, located in the data directory initialized by initgtm.
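As a sketch, the commonly adjusted gtm.conf settings look like this (all values are examples):

```
nodename = 'gtm'            # unique name of this GTM
listen_addresses = '*'      # addresses to listen on for GTM-Proxy connections
port = 6666                 # GTM listen port (6666 is the conventional default)
```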

Then you can start the GTM as follows:

$ gtm -D /usr/local/pgsql/data_gtm

where the -D option specifies the working directory of the GTM.

Alternatively, the GTM can be started using gtm_ctl, for example:

$ gtm_ctl -Z gtm start -D /usr/local/pgsql/data_gtm

18.3.3. Starting a GTM-Proxy

A GTM-Proxy is not a mandatory component of a Postgres-XL cluster, but it can be used to group messages between the GTM and cluster nodes, reducing workload and the number of packets exchanged over the network.

As described in the previous section, a GTM-Proxy needs its own listen address, port, working directory and GTM-Proxy ID, which should be unique; IDs begin with one. In addition, you should determine how many worker threads to run. You also need the GTM's address and port to start the GTM-Proxy.

Then, you need first to initialize a GTM-Proxy with initgtm, for example:

$ initgtm -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy

All GTM-Proxy parameters can be modified in gtm_proxy.conf, located in the data directory initialized by initgtm.
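A sketch of gtm_proxy.conf, tying the proxy back to the GTM of the previous section (all values are examples):

```
nodename = 'gtm_proxy1'     # unique GTM-Proxy name
listen_addresses = '*'
port = 6667                 # port Coordinators and Datanodes connect to
gtm_host = 'gtm-server'     # host where the GTM runs
gtm_port = 6666             # the GTM's listen port
worker_threads = 1          # number of worker threads
```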

Then, you can start a GTM-Proxy like:

$ gtm_proxy -D /usr/local/pgsql/data_gtm_proxy

where -D specifies the GTM-Proxy's working directory.

Alternatively, you can start a GTM-Proxy using gtm_ctl as follows:

$ gtm_ctl start -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy

18.3.4. Configuring Datanodes

Before starting a Coordinator or Datanode, you must configure it. You do so by editing the postgresql.conf file located in the working directory you specified with the -D option of the initdb command.

A Datanode is almost a native PostgreSQL server with some extensions. Additional postgresql.conf options for a Datanode are as follows:

max_connections

This value is not just the number of connections you expect from each Coordinator. Each Coordinator backend may connect to every Datanode, so you should specify the total number of connections all the Coordinators together may accept. For example, if you have five Coordinators and each may accept forty connections, specify 200 for this parameter.
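The arithmetic above can be sketched in shell (the numbers are those of the example):

```shell
# 5 Coordinators, each accepting up to 40 client connections, means a
# Datanode must be prepared to accept 5 * 40 connections in total.
coordinators=5
conns_per_coordinator=40
echo $((coordinators * conns_per_coordinator))    # prints 200
```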

max_prepared_transactions

Even though your application does not issue PREPARE TRANSACTION itself, a Coordinator may issue it internally when more than one Datanode is involved in a transaction. Set this parameter to the same value as max_connections.

pgxc_node_name

The GTM needs to identify each Datanode; this parameter specifies its name. The value must be unique within the cluster.

port

Because a Coordinator and a Datanode may run on the same server, you may want to assign a separate port number to the Datanode.

gtm_port

Specify the port number of the GTM-Proxy, as given by the -p option of gtm_proxy or gtm_ctl.

gtm_host

Specify the host name or IP address of the GTM-Proxy, as given by the -h option of gtm_proxy or gtm_ctl.

shared_queues

For some joins that occur in queries, data from one Datanode may need to be joined with data from another Datanode. Postgres-XL uses shared queues for this purpose. During execution each Datanode knows if it needs to produce or consume tuples, or both.

Note that multiple shared_queues may be used even for a single query, so the value should be set taking into account the number of connections the node can accept and the expected number of such joins occurring simultaneously.

shared_queue_size

This parameter sets the size of each allocated shared queue.
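Putting the Datanode parameters together, a postgresql.conf sketch might read as follows (all values are examples):

```
pgxc_node_name = 'datanode1'      # unique node name
port = 15432                      # distinct from the Coordinator on this host
max_connections = 200             # 5 Coordinators x 40 connections each
max_prepared_transactions = 200   # same as max_connections
gtm_host = 'localhost'            # local GTM-Proxy
gtm_port = 6667                   # the GTM-Proxy's port
```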

18.3.5. Configuring Coordinators

Although Coordinators and Datanodes share the same binary, their configuration differs slightly because of their different roles.

max_connections

You don't have to take other Coordinators or Datanodes into account. Just specify the number of connections the Coordinator accepts from applications.

max_prepared_transactions

Specify at least the total number of Coordinators in the cluster.

pgxc_node_name

The GTM needs to identify each Coordinator; this parameter specifies its name, which must be unique within the cluster.

port

Because a Coordinator and a Datanode may run on the same server, you may want to assign a separate port number to the Coordinator. It may be convenient to use the default PostgreSQL listen port for it.

gtm_port

Specify the port number of the GTM-Proxy, as given by the -p option of gtm_proxy or gtm_ctl.

gtm_host

Specify the host name or IP address of the GTM-Proxy, as given by the -h option of gtm_proxy or gtm_ctl.

pooler_port

Specify the port number that the pooler should use. This must not conflict with any other server ports used on this host.

max_pool_size

A Coordinator maintains connections to Datanodes as a pool. This parameter specifies the maximum number of connections the Coordinator maintains. Specify the max_connections value of the remote nodes as this parameter's value.

min_pool_size

This is the minimum number of Coordinator-to-remote-node connections maintained by the pooler. Typically specify 1.

pool_conn_keepalive

This parameter specifies how long to keep an idle connection alive; the pooler discards connections older than this. This is useful in multi-tenant environments, where connections to many different databases are used, so that idle connections can be cleaned up. It is also useful for occasionally closing connections automatically in case there is some unknown memory leak, so that the memory can be freed.

pool_maintenance_timeout

This parameter specifies how long to wait before pooler maintenance is performed. During such maintenance, old idle connections are discarded. This is useful in multi-tenant environments, where connections to many different databases are used, so that idle connections can be cleaned up.

remote_query_cost

This parameter specifies the cost overhead of setting up a remote query to obtain remote data. It is used by the planner in costing queries.

network_byte_cost

This parameter is used in query cost planning to estimate the cost involved in row shipping and obtaining remote data, based on the expected data size. Row shipping is expensive and adds latency, so this setting helps favor plans that minimize row shipping.

sequence_range

This parameter makes the Coordinator fetch several sequence values at once from the GTM. This greatly speeds up COPY and INSERT SELECT operations when the target table uses sequences. Postgres-XL will not use this entire amount at once, but will increase the request size over time if many requests are made in a short time frame in the same session. After a short time without any sequence requests, the request size decreases back down to 1. Note that any setting here is overridden if a CACHE clause was used in CREATE SEQUENCE or ALTER SEQUENCE.

max_coordinators

This is the maximum number of Coordinators that can be configured in the cluster. Specify the exact number if you do not plan to add Coordinators while the cluster is running, or a greater number if you want to resize the cluster dynamically. Each slot costs about 140 bytes of shared memory.

max_datanodes

This is the maximum number of Datanodes that can be configured in the cluster. Specify the exact number if you do not plan to add Datanodes while the cluster is running, or a greater number if you want to resize the cluster dynamically. Each slot costs about 140 bytes of shared memory.

enforce_two_phase_commit

Enforce the use of two-phase commit in transactions involving ON COMMIT actions or temporary objects. Using autocommit instead of two-phase commit may break data consistency, so use at your own risk.
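Putting the Coordinator parameters together, a postgresql.conf sketch might read as follows (all values are examples):

```
pgxc_node_name = 'coord1'         # unique node name
port = 5432                       # the default PostgreSQL port is convenient here
pooler_port = 6668                # must not clash with other ports on this host
max_connections = 40              # connections accepted from applications
max_prepared_transactions = 5     # at least the number of Coordinators
max_pool_size = 200               # the max_connections of the remote nodes
min_pool_size = 1
gtm_host = 'localhost'            # local GTM-Proxy
gtm_port = 6667                   # the GTM-Proxy's port
```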

18.3.6. Starting Datanodes

Now you can start the central components of Postgres-XL: Datanodes and Coordinators. If you're familiar with starting a PostgreSQL database server, this step is very similar.

You can start a Datanode as follows:

$ postgres --datanode -D /usr/local/pgsql/data

--datanode specifies that postgres should run as a Datanode. You may need to specify the -i option so that postgres accepts TCP/IP connections, or edit pg_hba.conf, if the cluster's nodes are spread across several servers.

18.3.7. Starting Coordinators

You can start a Coordinator as follows:

$ postgres --coordinator -D /usr/local/pgsql/coordinator

--coordinator specifies that postgres should run as a Coordinator. You may need to specify the -i option so that postgres accepts TCP/IP connections, or edit pg_hba.conf, if the cluster's nodes are spread across several servers.
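Postgres-XL's pg_ctl can start both node types as well, selecting the role with its -Z option. A sketch, with example paths and log file names:

```shell
$ pg_ctl start -Z datanode -D /usr/local/pgsql/data -l datanode.log
$ pg_ctl start -Z coordinator -D /usr/local/pgsql/coordinator -l coordinator.log
```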

18.3.8. Server Start-up Failures

There are several common reasons the server might fail to start. Check the server's log file, or start it by hand (without redirecting standard output or standard error) and see what error messages appear. Below we explain some of the most common error messages in more detail.

LOG:  could not bind IPv4 address "127.0.0.1": Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
FATAL:  could not create any TCP/IP sockets

This usually means just what it suggests: you tried to start another server on the same port where one is already running. However, if the kernel error message is not Address already in use or some variant of that, there might be a different problem. For example, trying to start a server on a reserved port number might draw something like:

$ postgres -p 666
LOG:  could not bind IPv4 address "127.0.0.1": Permission denied
HINT:  Is another postmaster already running on port 666? If not, wait a few seconds and retry.
FATAL:  could not create any TCP/IP sockets

A message like:

FATAL:  could not create shared memory segment: Invalid argument
DETAIL:  Failed system call was shmget(key=5440001, size=4011376640, 03600).

probably means your kernel's limit on the size of shared memory is smaller than the work area PostgreSQL is trying to create (4011376640 bytes in this example). Or it could mean that you do not have System-V-style shared memory support configured into your kernel at all. As a temporary workaround, you can try starting the server with a smaller-than-normal number of buffers (shared_buffers). You will eventually want to reconfigure your kernel to increase the allowed shared memory size. You might also see this message when trying to start multiple servers on the same machine, if their total space requested exceeds the kernel limit.
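On Linux, the System V shared memory limits are typically raised via sysctl. A sketch (the sizes are examples; appropriate values depend on your configuration):

```
# /etc/sysctl.conf (Linux), applied with "sysctl -p" as root
kernel.shmmax = 4294967296    # max size of one shared memory segment, in bytes
kernel.shmall = 1048576       # total shared memory allowed, in pages
```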

An error like:

FATAL:  could not create semaphores: No space left on device
DETAIL:  Failed system call was semget(5440126, 17, 03600).

does not mean you've run out of disk space. It means your kernel's limit on the number of System V semaphores is smaller than the number PostgreSQL wants to create. As above, you might be able to work around the problem by starting the server with a reduced number of allowed connections (max_connections), but you'll eventually want to increase the kernel limit.
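On Linux, the semaphore limits can likewise be raised via sysctl. A sketch (the values are examples):

```
# /etc/sysctl.conf (Linux); fields are SEMMSL SEMMNS SEMOPM SEMMNI
kernel.sem = 250 32000 32 128
```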

If you get an illegal system call error, it is likely that shared memory or semaphores are not supported in your kernel at all. In that case your only option is to reconfigure the kernel to enable these features.

Details about configuring System V IPC facilities are given in Section 18.4.1.

18.3.9. Client Connection Problems

Although the error conditions possible on the client side are quite varied and application-dependent, a few of them might be directly related to how the server was started. Conditions other than those shown below should be documented with the respective client application.

psql: could not connect to server: Connection refused
        Is the server running on host "server.joe.com" and accepting
        TCP/IP connections on port 5432?

This is the generic "I couldn't find a server to talk to" failure. It looks like the above when TCP/IP communication is attempted. A common mistake is to forget to configure the server to allow TCP/IP connections.

Alternatively, you'll get this when attempting Unix-domain socket communication to a local server:

psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

The last line is useful in verifying that the client is trying to connect to the right place. If there is in fact no server running there, the kernel error message will typically be either Connection refused or No such file or directory, as illustrated. (It is important to realize that Connection refused in this context does not mean that the server got your connection request and rejected it. That case will produce a different message, as shown in Section 20.4.) Other error messages such as Connection timed out might indicate more fundamental problems, like lack of network connectivity.