= Verify Cluster Installation =

ifdef::pcs[]
== Start the Cluster ==

Now that corosync is configured, it is time to start the cluster.
The command below will start corosync and pacemaker on both nodes
in the cluster.  If you are issuing the start command from a different
node than the one you ran the 'pcs cluster auth' command on earlier, you
must authenticate on current node you are logged into before you will
be allowed to start the cluster.

[source,C]
----
# pcs cluster start --all
pcmk-1: Starting Cluster...
pcmk-2: Starting Cluster...
----

An alternative to using the 'pcs cluster startall' command
is to issue either of the below commands on each node in the
cluster by hand.

[source,C]
----
# pcs cluster start
Starting Cluster...
----

or

[source,C]
----
# systemctl start corosync.service
# systemctl start pacemaker.service
----

endif::[]

== Verify Corosync Installation ==

ifdef::crmsh[]
Start Corosync on the first node

[source,C]
----
# systemctl start corosync.service
----
endif::[]

The first thing to check is if cluster communication is happy, for
that we use `corosync-cfgtool`.

ifdef::crmsh[]
[source,C]
----
# corosync-cfgtool -s
Printing ring status.
Local node ID 1702537408
RING ID 0
	id	= 192.168.122.101
	status	= ring 0 active with no faults
----
endif::[]

ifdef::pcs[]
[source,C]
----
# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
	id	= 192.168.122.101
	status	= ring 0 active with no faults
----
endif::[]

We can see here that everything appears normal with our fixed IP
address, not a 127.0.0.x loopback address, listed as the +id+ and +no
faults+ for the status.

If you see something different, you might want to start by checking
the node's network, firewall and selinux configurations.

Next we check the membership and quorum APIs:

ifdef::pcs[]
[source,C]
----
# corosync-cmapctl  | grep members 
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.122.101) 
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.122.102) 
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined

# pcs status corosync 
Membership information
 --------------------------
    Nodeid      Votes Name
         1          1 pcmk-1
         2          1 pcmk-2
----

You should see both nodes have joined the cluster.
endif::[]

ifdef::crmsh[]
[source,C]
----
# corosync-cmapctl  | grep members
runtime.totem.pg.mrp.srp.members.1702537408.ip (str) = r(0) ip(192.168.122.101)
runtime.totem.pg.mrp.srp.members.1702537408.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1702537408.status (str) = joined

# corosync-quorumtool -l
Membership information
 --------------------------
    Nodeid      Votes Name
1702537408          1 pcmk-1
----

The node see's itself in both locations which is a good sign.

If the node list is empty when you call `corosync-quorumtool`, then
you've not correctly quorum in 'corosync.conf'.

With everything looking healthy, we start Corosync on the second node
and run the same communications check.

[source,C]
----
# ssh pcmk-2 -- systemctl start corosync.service
# ssh pcmk-2 -- corosync-cfgtool -s
Printing ring status.
Local node ID 1719314624
RING ID 0
	id	= 192.168.122.102
	status	= ring 0 active with no faults
----

Everything appears to look ok from +pcmk-2+, time to re-run the
membership and quorum checks to see if it shows up there too.

Again, if you see something different to the above, check for the
usual suspects: network, firewall and selinux.

[source,C]
----
# corosync-cmapctl  | grep members
runtime.totem.pg.mrp.srp.members.1702537408.ip (str) = r(0) ip(192.168.122.101)
runtime.totem.pg.mrp.srp.members.1702537408.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1702537408.status (str) = joined
runtime.totem.pg.mrp.srp.members.1719314624.ip (str) = r(0) ip(192.168.122.102)
runtime.totem.pg.mrp.srp.members.1719314624.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1719314624.status (str) = joined

# corosync-quorumtool -l

Membership information
 --------------------------
    Nodeid      Votes Name
1702537408          1 pcmk-1
1719314624          1 pcmk-2
----
endif::[]

All good!

== Verify Pacemaker Installation ==


ifdef::pcs[]
Now that we have confirmed that Corosync is functional we can check
the rest of the stack. Pacemaker has already been started, so verify
the necessary processes are running.

[source,C]
----
# ps axf
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
...lots of processes...
28019 ?        Ssl    0:03 /usr/sbin/corosync
28047 ?        Ss     0:00 /usr/sbin/pacemakerd -f
28048 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
28049 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
28050 ?        Ss     0:00  \_ /usr/lib64/heartbeat/lrmd
28051 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
28052 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
28053 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd
----

If that looks ok, check the pcs status output.

[source,C]
----
# pcs status
Last updated: Fri Sep 14 09:52:25 2012
Last change: Fri Sep 14 09:51:55 2012 via crmd on pcmk-2
Stack: corosync
Current DC: pcmk-2 (2) - partition with quorum
Version: 1.1.8-1.el7-60a19ed12fdb4d5c6a6b6767f52e5391e447fec0
2 Nodes configured, unknown expected votes
0 Resources configured.

Online: [ pcmk-1 pcmk-2 ]

Full list of resources:
----

Next, check for any ERRORs during startup - there shouldn't be any.

[source,C]
----
# grep -i error /var/log/messages
----

Repeat these checks on the other node. The results should be the same.

endif::[]

ifdef::crmsh[]
Now that we have confirmed that Corosync is functional we can check
the rest of the stack. Start Pacemaker and check the necessary
processes have been started.

[source,C]
----
# systemctl start pacemaker.service
# ps axf
  PID TTY      STAT   TIME COMMAND
    2 ?        S      0:00 [kthreadd]
...lots of processes...
28019 ?        Ssl    0:03 /usr/sbin/corosync
28047 ?        Ss     0:00 /usr/sbin/pacemakerd -f
28048 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
28049 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
28050 ?        Ss     0:00  \_ /usr/lib64/heartbeat/lrmd
28051 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
28052 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
28053 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd
----

If that looks ok, check the logs and crm_mon.

[source,C]
----
# grep pacemakerd /var/log/messages | grep -e get_cluster_type -e read_config
Apr  3 09:19:32 pcmk-1 pacemakerd[28047]:     info: get_cluster_type: Detected an active 'corosync' cluster
Apr  3 09:19:32 pcmk-1 pacemakerd[28047]:     info: read_config: Reading configure for stack: corosync
# crm_mon -1
============
Last updated: Tue Apr  3 09:21:37 2012
Last change: Tue Apr  3 09:19:54 2012 via crmd on pcmk-1
Stack: corosync
Current DC: pcmk-1 (1702537408) - partition with quorum
Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
1 Nodes configured, unknown expected votes
0 Resources configured.
============

Online: [ pcmk-1 ]
----

Next, check for any ERRORs during startup - there shouldn't be any.

[source,C]
----
# grep -i error /var/log/messages
----

Repeat on the other node and display the cluster's status.

[source,C]
----
# ssh pcmk-2 -- systemctl start pacemaker.service
# crm_mon -1
============
Last updated: Tue Apr  3 09:26:23 2012
Last change: Tue Apr  3 09:26:21 2012 via crmd on pcmk-1
Stack: corosync
Current DC: pcmk-1 (1702537408) - partition with quorum
Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, unknown expected votes
0 Resources configured.
============

Online: [ pcmk-1 pcmk-2 ]
----

endif::[]
