RTFM

[Read This Fine Material] from Joshua Hoblitt

How to fix Pacemaker pcs “Error: node ‘foo’ does not appear to exist in configuration”


# pcs cluster unstandby pollux1
Error: node 'pollux1' does not appear to exist in configuration

I hit this error message from the Pacemaker pcs utility while trying to bring nodes out of standby. I had put them into standby with the command pcs cluster standby foo, updated the OS from ~RHEL 6.3 to RHEL 6.5, and then rebooted the nodes to bring them up on the newer kernel.

It appears that this is a bug in pcs; some details are in this (subscriber-only) Red Hat KB article: ‘pcs cluster standby’ fails with “Error: node ‘nodename’ does not appear to exist in configuration” in RHEL 6 with pacemaker.

According to the KB article, this problem is fixed in pcs-0.9.90-1.el6_4.1 and pcs-0.9.90-2.el6_5.2, but it can be worked around by using the crm_standby utility in place of pcs for the standby/unstandby operations.

Example from the KB article:

### Standby node
# crm_standby -v on -N <nodename>
### Unstandby node
# crm_standby -D -N <nodename>
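
To verify that a node actually came out of standby, and to check whether a host already has a fixed pcs build, something along these lines should work (a sketch; crm_mon -1 is Pacemaker’s standard one-shot status query, and the package versions are the ones named in the KB article):

```shell
### Check the installed pcs build against the fixed versions above
# rpm -q pcs
### One-shot cluster status; nodes still in standby are reported as such
# crm_mon -1
```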


How to force a “sync” or “repair” check of mdadm arrays


It’s likely that your distribution already has a mechanism for periodically running mdadm array checks, but occasionally it’s useful or necessary to force an immediate verification pass. A check of a Linux mdadm (md) array is initiated by echoing a command into the corresponding sysfs attribute of the mdX device to be inspected.

There are two types of checks that can be run: a “sync” check that looks for and reports issues, and a “repair” check that also attempts to resolve the issues it finds.

To run a “sync” check:

echo check > /sys/block/mdX/md/sync_action

To run a “repair” check:

echo repair > /sys/block/mdX/md/sync_action

See the RAID Administration page of the Linux RAID wiki for more details.
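
Progress and results of a check can be watched through the same sysfs directory (a sketch based on the standard md sysfs attributes; substitute the real device for mdX):

```shell
### Overall progress of a running check
# cat /proc/mdstat
### Mismatches found by the last completed check (0 means consistent)
# cat /sys/block/mdX/md/mismatch_cnt
### Abort a check that is still running
# echo idle > /sys/block/mdX/md/sync_action
```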


How to fix “Provider must have features ‘manages_symlinks’ to set ‘ensure’ to ‘link’” exceptions


I’ve been scratching my head trying to figure out why the Puppet agent on a node stopped functioning, with an odd error that implies provider detection had gone awry.

Error: Failed to apply catalog: Parameter ensure failed on File[/root/.ssh/id_rsa]: Provider must have features 'manages_symlinks' to set 'ensure' to 'link' at /etc/puppet/env/production/modules/sdm/manifests/users/mss.pp:55
Wrapped exception:
Provider must have features 'manages_symlinks' to set 'ensure' to 'link'

After chasing internal Puppet issues and even suspecting network problems, I finally found this post to the puppet-users group. Sure enough, a bunch of local gems had been installed, almost certainly by me testing a module in --noop mode on a host I shouldn’t have been testing on.

# gem list

*** LOCAL GEMS ***

builder (3.2.2)
bundler (1.3.5)
diff-lcs (1.2.4)
facter (1.7.3)
hiera (1.2.1)
highline (1.6.19)
json (1.4.6)
json_pure (1.8.0)
kwalify (0.7.2)
metaclass (0.0.1)
mocha (0.14.0)
net-scp (1.1.2)
net-ssh (2.7.0)
nokogiri (1.5.10)
puppet (3.3.0)
puppet-lint (0.3.2)
puppet-syntax (1.1.0)
puppetlabs_spec_helper (0.4.1)
rake (10.1.0)
rbvmomi (1.6.0)
rgen (0.6.6)
rspec (2.14.1)
rspec-core (2.14.5)
rspec-expectations (2.14.3)
rspec-mocks (2.14.3)
rspec-puppet (0.1.6)
rspec-system (2.3.0)
rspec-system-puppet (2.2.0)
rspec-system-serverspec (1.0.0)
serverspec (0.6.3)
stomp (1.2.2)
systemu (2.5.2)
trollop (2.0)

The good news is that this is easy to fix with my recipe for how to remove all Ruby Gems except those installed by system packages.
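
The recipe amounts to walking gem list and uninstalling everything that did not arrive via a system package. A minimal sketch, assuming the RHEL convention that packaged gems ship as rubygem-<name> RPMs; the purge_local_gems name and the rpm -q heuristic are mine, not from the recipe:

```shell
#!/bin/sh
# Uninstall locally installed gems that no "rubygem-<name>" RPM owns.
purge_local_gems() {
    # Skip the "*** LOCAL GEMS ***" banner and blank lines in gem output
    for g in $(gem list --no-versions | grep -Ev '^\*|^$'); do
        if rpm -q "rubygem-$g" >/dev/null 2>&1; then
            # Came from a system package; leave it alone
            echo "keeping system gem: $g"
        else
            # -a all versions, -I ignore dependencies, -x remove executables
            gem uninstall -aIx "$g"
        fi
    done
}
```

Putting an echo in front of the gem uninstall line turns this into a dry run that just lists what would be removed, which is worth doing first.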