IBM’s GPFS filesystem expects that all members of the GPFS cluster, be they clients or servers, have a complete and consistent view of the cluster’s state. This causes some difficulty when re-adding a node to the cluster after it has lost its configuration state for some reason. I frequently hit this problem when reprovisioning a compute node. The newly provisioned node refuses to execute any GPFS commands because it knows it is not a member of a GPFS cluster, and the other nodes of the cluster refuse to execute commands regarding it because its state is inconsistent with their own perceived state of the cluster, or simply “unknown”.
In this example, the node that has been reprovisioned from scratch is “dec06”.
As far as the rest of the cluster is concerned, that node is in a nonsensical state:
[root@dec01 ~]# mmgetstate -N dec06

 Node number  Node name        GPFS state
------------------------------------------
       6      dec06            unknown
It cannot be deleted from the cluster:
[root@dec01 ~]# mmdelnode -N dec06
Verifying GPFS is stopped on all affected nodes ...
dec06.tuc.noao.edu:  mmremote: Unknown GPFS execution environment
mmdelnode: Command failed.  Examine previous error messages to determine cause.
Nor can GPFS be started on it:
[root@dec01 ~]# mmstartup -N dec06
Fri Mar 30 16:27:29 MST 2012: mmstartup: Starting GPFS ...
dec06.tuc.noao.edu:  mmremote: Unknown GPFS execution environment
mmstartup: Command failed.  Examine previous error messages to determine cause.
The solution lives in the manpage for mmdelnode:
A node cannot be deleted if any of the following are true:
…
3. If the GPFS state is unknown and the node is reachable on the
   network. You cannot delete a node if both of the following are true:
   – The node responds to a TCP/IP ping command from another node.
   – The status of the node shows unknown when you use the mmgetstate
     command from another node in the cluster.
Thus, we just have to make the node unpingable for a few moments.
[root@dec01 ~]# ssh dec06 reboot
Wait a few moments for the host to go down, and then we can delete it from the cluster.
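If you would rather not guess at the timing, a short loop like this blocks until the node stops answering pings (just a sketch; the one-second probe interval is arbitrary):

# Sketch: wait until dec06 no longer responds before running mmdelnode
while ping -c 1 -W 1 dec06 > /dev/null 2>&1; do
    sleep 1
done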
[root@dec01 ~]# mmdelnode -N dec06
Verifying GPFS is stopped on all affected nodes ...
mmdelnode: Command successfully completed
mmdelnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Wait another few moments for it to come back online…
[root@dec01 ~]# ping dec06
PING dec06.tuc.noao.edu (140.252.27.26) 56(84) bytes of data.
64 bytes from dec06.tuc.noao.edu (140.252.27.26): icmp_seq=1 ttl=64 time=0.810 ms
^C
--- dec06.tuc.noao.edu ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 655ms
rtt min/avg/max/mdev = 0.810/0.810/0.810/0.000 ms
Then add it back into the cluster:
[root@dec01 ~]# mmaddnode -N dec06
Fri Mar 30 16:32:42 MST 2012: mmaddnode: Processing node dec06.tuc.noao.edu
mmaddnode: Command successfully completed
mmaddnode: Warning: Not all nodes have proper GPFS license designations.
    Use the mmchlicense command to designate licenses as needed.
mmaddnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@dec01 ~]# mmchlicense client -N dec06

The following nodes will be designated as possessing GPFS client licenses:
        dec06.tuc.noao.edu
Please confirm that you accept the terms of the GPFS client
Licensing Agreement.  The full text can be found at
www.ibm.com/software/sla
Enter "yes" or "no": yes
mmchlicense: Command successfully completed
mmchlicense: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
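The interactive license prompt is the one step here that resists scripting. Depending on your GPFS release, mmchlicense takes an --accept flag that skips it; worth checking your own version’s manpage before relying on it:

# Designate the client license without the yes/no prompt (if --accept is supported)
[root@dec01 ~]# mmchlicense client --accept -N dec06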
And the node is now back in a rational state:
[root@dec01 ~]# mmgetstate -N dec06

 Node number  Node name        GPFS state
------------------------------------------
      18      dec06            down
2012-03-31 at 10:47
If you are just re-provisioning the node, you could always just copy the /var/mmfs/gen/mmsdrfs file from a node already in the GPFS cluster. Then GPFS would start up on the reinstalled node.
2012-03-31 at 23:13
@adamparker That’s a good point. It may also be easier to automate replacing the cluster config file with puppet instead of deleting and re-adding the node. My goal is to have it so puppet can handle both re-adding existing clients and adding new ones fully unattended.
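Something along these lines could sit behind a puppet exec resource or a first-boot script. It is only a sketch of the mmsdrfs-copy approach from the comment above: dec01 stands in for any node that is already a cluster member, passwordless ssh/scp from the new node is assumed, and /usr/lpp/mmfs/bin is the usual GPFS binary location.

#!/bin/bash
# Sketch: restore GPFS membership on a freshly reprovisioned client, unattended.
set -e

SDRFS=/var/mmfs/gen/mmsdrfs
GOOD_NODE=dec01

# Only pull the cluster configuration if this node has lost it
if [ ! -s "$SDRFS" ]; then
    scp "$GOOD_NODE:$SDRFS" "$SDRFS"
fi

# Start GPFS on the local node if it is not already active
if ! /usr/lpp/mmfs/bin/mmgetstate | grep -q active; then
    /usr/lpp/mmfs/bin/mmstartup
fi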