RTFM

[Read This Fine Material] from Joshua Hoblitt

How to create a new GPFS 3.5 filesystem


First we need to identify the block devices that will make up the new filesystem. In this example, we're creating the filesystem from disks that are directly attached to a single GPFS server. That server already serves existing filesystems whose disks are attached to multiple server nodes.

# lsscsi
[0:0:0:0]    disk    ATA      INTEL SSDSC2CW12 400i  /dev/sda 
[1:0:0:0]    disk    ATA      INTEL SSDSC2CW12 400i  /dev/sdb 
[6:0:8:0]    enclosu LSI CORP SAS2X36          0717  -       
[6:0:9:0]    enclosu LSI CORP SAS2X36          0717  -       
[6:1:12:0]   enclosu LSI      SAS2X36          0e0b  -       
[6:1:34:0]   enclosu LSI      SAS2X36          0e0b  -       
[6:2:0:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdc 
[6:2:1:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdd 
[6:2:2:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sde 
[6:2:3:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdf 
[6:2:4:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdg 
[6:2:5:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdh 
[6:2:6:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdi 
[6:2:7:0]    disk    LSI      MR9286CV-8e      3.23  /dev/sdj 
# mmlsnsd -m | grep `hostname`
 foo_nsd1 8CFC1C0B507CC600   /dev/sdc       foo.example.com     server node
 foo_nsd2 8CFC1C0B507CC602   /dev/sdd       foo.example.com     server node
 foo_nsd3 8CFC1C0B507CC605   /dev/sde       foo.example.com     server node
 foo_nsd4 8CFC1C0B5122CB3C   /dev/sdf       foo.example.com     server node

By manually comparing the two lists of block devices, we see that /dev/sd[ghij] are the new block devices that don’t have a GPFS NSD.
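
The manual comparison above could also be scripted. The following is only a sketch: `unclaimed_disks` is a hypothetical helper, and it assumes the device path is the last field of each lsscsi "disk" line and the third field of each `mmlsnsd -m` line, as in the listings above. The optional third argument is a pattern for narrowing the lsscsi lines (for example to the RAID controller model), because system disks such as /dev/sda and /dev/sdb have no NSD either.

```shell
# Print disks from lsscsi output that have no GPFS NSD yet.
#   $1: file holding `lsscsi` output
#   $2: file holding `mmlsnsd -m | grep <this host>` output
#   $3: optional regex that an lsscsi line must match (default: any line)
unclaimed_disks() {
    awk -v pat="${3:-.}" '
        # First file named on the command line: the NSD list.
        # Remember every device that already backs an NSD.
        FILENAME == ARGV[1] { claimed[$3] = 1; next }
        # Second file: lsscsi output. Keep disks that are not claimed.
        $2 == "disk" && $0 ~ pat && !($NF in claimed) { print $NF }
    ' "$2" "$1"
}

# Example usage (not run here):
#   lsscsi > lsscsi.txt
#   mmlsnsd -m | grep "$(hostname)" > nsd.txt
#   unclaimed_disks lsscsi.txt nsd.txt MR9286CV
```

The result still deserves a manual look before any disk is handed to GPFS, since nothing here checks whether a device is in use by something else.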

We need to create an NSD for each new block device.

# cat > stanzafile.txt << END
%nsd: device=sdg
  nsd=bar_nsd1
  servers=foo
  usage=dataAndMetadata
  failureGroup=1
  pool=system
%nsd: device=sdh
  nsd=bar_nsd2
  servers=foo
  usage=dataAndMetadata
  failureGroup=1
  pool=system
%nsd: device=sdi
  nsd=bar_nsd3
  servers=foo
  usage=dataAndMetadata
  failureGroup=1
  pool=system
%nsd: device=sdj
  nsd=bar_nsd4
  servers=foo
  usage=dataAndMetadata
  failureGroup=1
  pool=system
END
# mmcrnsd -F stanzafile.txt
mmcrnsd: Processing disk sdg
mmcrnsd: Processing disk sdh
mmcrnsd: Processing disk sdi
mmcrnsd: Processing disk sdj
mmcrnsd: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
# mmlsnsd -X | grep `hostname`
 bar_nsd1   8CFC1C0B519A7DF7   /dev/sdg       generic  foo.example.com     server node
 bar_nsd2   8CFC1C0B519A7DF8   /dev/sdh       generic  foo.example.com     server node
 bar_nsd3   8CFC1C0B519A7DF9   /dev/sdi       generic  foo.example.com     server node
 bar_nsd4   8CFC1C0B519A7DFA   /dev/sdj       generic  foo.example.com     server node
 foo_nsd1   8CFC1C0B507CC600   /dev/sdc       generic  foo.example.com     server node
 foo_nsd2   8CFC1C0B507CC602   /dev/sdd       generic  foo.example.com     server node
 foo_nsd3   8CFC1C0B507CC605   /dev/sde       generic  foo.example.com     server node
 foo_nsd4   8CFC1C0B5122CB3C   /dev/sdf       generic  foo.example.com     server node
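
The stanza file used above repeats the same six lines for every disk, so with more than a handful of devices it may be easier to generate it. Here is a minimal sketch; `make_stanzas` is a hypothetical helper, and it hard-codes the same usage, failure group, and pool values as the example.

```shell
# Emit one %nsd stanza per device, numbering the NSDs from 1.
#   $1: NSD name prefix (e.g. bar_nsd)
#   $2: NSD server node name
#   remaining arguments: device names (e.g. sdg sdh ...)
make_stanzas() {
    prefix=$1
    server=$2
    shift 2
    n=1
    for dev in "$@"; do
        printf '%%nsd: device=%s\n' "$dev"
        printf '  nsd=%s%d\n' "$prefix" "$n"
        printf '  servers=%s\n' "$server"
        printf '  usage=dataAndMetadata\n'
        printf '  failureGroup=1\n'
        printf '  pool=system\n'
        n=$((n + 1))
    done
}

# Example usage (not run here), with $server set to the NSD server's node name:
#   make_stanzas bar_nsd "$server" sdg sdh sdi sdj > stanzafile.txt
```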

We're now ready to create the filesystem. Note that GPFS must be running on the server in order to create a new filesystem, although it doesn't need to be running to define new NSDs.

# mmgetstate 

 Node number  Node name        GPFS state 
------------------------------------------
       3      foo          active
# mmcrfs bar3 -F ./stanzafile.txt -B 2M -E no -K no -L 16M -Q yes --perfileset-quota --filesetdf -v yes -S yes -T /net/bar3

The following disks of bar3 will be formatted on node foo.example.com:
    bar_nsd1: size 31251759104 KB
    bar_nsd2: size 31251759104 KB
    bar_nsd3: size 31251759104 KB
    bar_nsd4: size 31251759104 KB
Formatting file system ...
Disks up to size 256 TB can be added to storage pool system.
Creating Inode File
Creating Allocation Maps
Creating Log Files
Clearing Inode Allocation Map
Clearing Block Allocation Map
Formatting Allocation Map for storage pool system
  86 % complete on Mon May 20 12:53:36 2013
 100 % complete on Mon May 20 12:53:41 2013
Completed creation of file system /dev/bar3.
mmcrfs: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
# mmlsfs bar3
flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 65536                    Minimum fragment size in bytes
 -i                 512                      Inode size in bytes
 -I                 32768                    Indirect block size in bytes
 -m                 1                        Default number of metadata replicas
 -M                 2                        Maximum number of metadata replicas
 -r                 1                        Default number of data replicas
 -R                 2                        Maximum number of data replicas
 -j                 cluster                  Block allocation type
 -D                 nfs4                     File locking semantics in effect
 -k                 all                      ACL semantics in effect
 -n                 32                       Estimated number of nodes that will mount file system
 -B                 2097152                  Block size
 -Q                 user;group;fileset       Quotas enforced
                    none                     Default quotas enabled
 --filesetdf        Yes                      Fileset df enabled?
 -V                 13.01 (3.5.0.0)          File system version
 --create-time      Mon May 20 12:53:43 2013 File system creation time
 -u                 Yes                      Support for large LUNs?
 -z                 No                       Is DMAPI enabled?
 -L                 16777216                 Logfile size
 -E                 No                       Exact mtime mount option
 -S                 Yes                      Suppress atime mount option
 -K                 no                       Strict replica allocation option
 --fastea           Yes                      Fast external attributes enabled?
 --inode-limit      122081280                Maximum number of inodes
 -P                 system                   Disk storage pools in file system
 -d                 bar_nsd1;bar_nsd2;bar_nsd3;bar_nsd4  Disks in file system
 --perfileset-quota yes                      Per-fileset quota enforcement
 -A                 yes                      Automatic mount option
 -o                 none                     Additional mount options
 -T                 /net/bar3                Default mount point
 --mount-priority   0                        Mount priority
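
Since mmcrfs needs GPFS to be running, a script that wraps the steps above might want to check the state first. The sketch below is an assumption about how that could look, not part of the original procedure: `gpfs_is_active` is a hypothetical helper that reads `mmgetstate` output on stdin and relies on the layout shown earlier (node number first, state last).

```shell
# Succeed only if the mmgetstate output on stdin contains a node line
# (first field is a node number) whose last field is "active".
gpfs_is_active() {
    awk '$1 ~ /^[0-9]+$/ && $NF == "active" { found = 1 }
         END { exit !found }'
}

# Example usage (not run here):
#   mmgetstate | gpfs_is_active || { echo "GPFS is not active" >&2; exit 1; }
```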

The new filesystem needs to be mounted manually, even though we set it to automount on GPFS startup, because GPFS was already running when the filesystem was created.

# mmmount /net/bar3
Mon May 20 12:59:01 MST 2013: mmmount: Mounting file systems ...
# df -h /net/bar3
Filesystem            Size  Used Avail Use% Mounted on
/dev/bar3             117T  2.7G  117T   1% /net/bar3
# mmdf bar3
disk                disk size  failure holds    holds              free KB             free KB
name                    in KB    group metadata data        in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 233 TB)
bar_nsd1        31251759104        1 Yes      Yes     31251075072 (100%)          3968 ( 0%) 
bar_nsd2        31251759104        1 Yes      Yes     31251077120 (100%)          1984 ( 0%) 
bar_nsd3        31251759104        1 Yes      Yes     31251073024 (100%)          1984 ( 0%) 
bar_nsd4        31251759104        1 Yes      Yes     31251075072 (100%)          3904 ( 0%) 
                -------------                         -------------------- -------------------
(pool total)     125007036416                          125004300288 (100%)         11840 ( 0%)

                =============                         ==================== ===================
(total)          125007036416                          125004300288 (100%)         11840 ( 0%)

Inode Information
-----------------
Number of used inodes:            4041
Number of free inodes:          503863
Number of allocated inodes:     507904
Maximum number of inodes:    122081280
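
As a quick sanity check, the numbers reported by mmdf can be cross-checked with a little arithmetic. This sketch assumes that the "KB" in mmdf means 1024-byte units; `kb_to_tib` is a hypothetical helper.

```shell
# Convert a size in KB (1024-byte units, assumed) to TiB with two decimals.
kb_to_tib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / (1024 * 1024 * 1024) }'
}

disk_kb=31251759104           # size of each bar_nsd* disk, from mmdf
total_kb=$((disk_kb * 4))     # four identical disks in the system pool
echo "$total_kb"              # 125007036416, the (pool total) line
kb_to_tib "$total_kb"         # 116.42

# Used plus free inodes should equal the allocated inodes.
echo $((4041 + 503863))       # 507904
```

The 116.42 TiB figure is consistent with the 117T shown by df -h, which appears to round up.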

And we're done.
