Summary of Contents for IBM Storage Scale System 6000
Page 1
IBM Storage Scale System 6000 Service Guide SC28-3484-04...
Page 2
“How to submit your comments” on page xi. When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.
Page 3
Miscellaneous equipment specification (MES) instructions .... 26
NVMe storage drives concurrent MES upgrade for IBM Storage Scale System 6000 .... 26
IBM FlashCore Module concurrent MES upgrade for IBM Storage Scale System 6000 .... 35
Adapter concurrent MES upgrade for IBM Storage Scale System 6000 .... 44
Chapter 2. Part Listings .... 53
...
Page 5
Figures
1. Displaying PSUs - rear of the enclosure view .... 2
2. Locating the faulty PSU .... 2
3. Identifying a faulty PSU based on the PSU amber LED .... 3
4. Removing a faulty PSU .... 4
5. Installing a FRU PSU .... 4
6.
Page 6
28. Hybrid model: NVMe MES .... 29
29. Identifying drives and drive fillers .... 36
30. Performance model: FCM MES .... 37
31. Hybrid model: FCM MES .... 38
32. IBM Storage Scale System 6000 adapter slots .... 45
33. IBM Storage Scale System 6000 adapter slots .... 46
Page 9
About this information Who should read this information This information is intended for administrators of IBM Storage Scale System that includes IBM Storage Scale RAID. IBM Storage Scale System information units IBM Storage Scale System 6000 documentation consists of the following information units.
Page 10
• IBM Power9 servers, see 5105 22E: Reference information. • IBM Storage Scale System Utility Node servers, see IBM Storage Scale System Utility Node documentation. For the latest support information about IBM Storage Scale RAID, see the IBM Storage Scale RAID FAQ in IBM Documentation. Conventions used in this information Table 1 on page x describes the typographic conventions used in this information.
Page 11
In the left margin of the document, vertical lines indicate technical changes to the information.
How to submit your comments
To contact the IBM Storage Scale development organization, send your comments to the following email address: scale@us.ibm.com
Page 12
Storage Scale System 6000: Service Guide...
Page 13
Note: An IBM intranet connection is required to access this content.
Removing and replacing a power supply unit
Refer to the service procedure to remove and replace a faulty power supply unit (PSU) in an IBM Storage Scale System 6000 enclosure.
Page 14
No tools are needed to complete this task. Do not remove or loosen any screws. Use the following replacement parts to service a faulty PSU in an IBM Storage Scale System 6000 enclosure: •...
Page 15
Figure 3. Identifying a faulty PSU based on the PSU amber LED
The following is a sample output of the system reporting a faulty PSU.
# mmhealth node eventlog | grep power
2023-12-08 11:56:36.794146 CST power_supply_absent WARNING Power supply psu4_bottom_right_id3 is missing.
To check for an absent (missing power) PSU, use the mmlsenclosure all -L | grep power command.
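The sample event log line above can also be checked programmatically. The following Python sketch is an illustration only: it assumes that every event line follows the same layout as the single sample shown (timestamp, time zone, event name, severity, message), and the names HealthEvent, parse_event, and absent_psu are hypothetical helpers, not part of any IBM tool.

```python
import re
from typing import NamedTuple, Optional


class HealthEvent(NamedTuple):
    """One event log line, split into the fields visible in the sample."""
    timestamp: str
    timezone: str
    event: str
    severity: str
    message: str


# Assumed layout, taken from the one sample line in this guide:
# <date> <time> <tz> <event name> <SEVERITY> <free-text message>
_EVENT_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<timezone>\S+)\s+(?P<event>\S+)\s+(?P<severity>[A-Z]+)\s+"
    r"(?P<message>.*)$"
)


def parse_event(line: str) -> Optional[HealthEvent]:
    """Return the parsed event, or None if the line does not match."""
    match = _EVENT_RE.match(line.strip())
    return HealthEvent(**match.groupdict()) if match else None


def absent_psu(line: str) -> Optional[str]:
    """Return the PSU name from a power_supply_absent event, else None."""
    event = parse_event(line)
    if event is None or event.event != "power_supply_absent":
        return None
    match = re.search(r"Power supply (\S+) is missing", event.message)
    return match.group(1) if match else None


SAMPLE = (
    "2023-12-08 11:56:36.794146 CST power_supply_absent WARNING "
    "Power supply psu4_bottom_right_id3 is missing."
)
print(absent_psu(SAMPLE))  # prints psu4_bottom_right_id3
```

The PSU name that is returned can then be matched against the enclosure rear view to locate the unit to replace.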
Page 16
With a finger, press the metallic latch of the SAS cable connector to fasten it to the adapter. The metallic latch must be flush with the PCIe slot. See the next figure for reference.
Page 17
Figure 6. Inserting the SAS cable in its port b) To make sure that the metallic latch of the SAS cable connector is fastened, grab the cable connector with two fingers and gently pull the cable. Tip: When a SAS cable is properly installed, it can be disconnected only by pulling its blue plastic latch.
Page 18
Figure 8. Verifying PSU status based on the PSU green LEDs
Removing and replacing an NVMe drive
Refer to the service procedure to remove and replace a faulty NVMe drive in an IBM Storage Scale System 6000 enclosure. The following steps are the high-level flow of the procedure:
1.
Page 19
No tools are needed to complete this task. Do not remove or loosen any screws. Use one of the following replacement parts to service a faulty drive in an IBM Storage Scale System 6000 enclosure: Important: The FRU PN that you use to replace the faulty drive must be an exact match to the other drives that are installed in the same enclosure.
Page 20
This procedure requires access to either the management GUI or the CLI as a root user. Replacing a drive is a task that a customer can complete by following the service procedure. If IBM service personnel are running this task, it requires coordination with the customer for steps that require root user access as part of the procedure.
Page 21
A drive carrier assembly is composed of the drive or drive blank (drive slot filler) and a drive carrier, and it is used to provide for controlled insertion into, and extraction from, the storage enclosure. Drive carrier assemblies are installed from the front of the enclosure, which simplifies service access. Closing the drive carrier handle ensures complete seating of the connectors.
Page 22
2. Prepare the faulty drive for replacement. In an IBM Storage Scale System 6000 enclosure, a faulty drive causes the amber fault LED of its drive carrier to be on (nonflashing). At the end of this step, confirm that the amber LED for the drive to be removed is on.
Page 23
Important: If the faulty drive is an IBM FlashCore Module 4 (FCM 4), before you can proceed to replace and prepare all the NVMe drives, an additional subprocedure is required to trigger and collect the drive data export. First, confirm that the IBM Storage Scale System 6000 has FCM 4 drives by running the tslsenclslot_nvme | grep -i NVMe command.
Page 24
LED on to indicate that it is safe to remove the drive.
3. Remove the faulty drive, which has its amber LED on.
a) To unlock the drive and release latch, press the touchpoint (blue triangle), as shown in the next figure.
Page 25
Figure 13. Unlocking the drive and release latch b) Lift the handle and slide the drive out of the enclosure, as shown in the following figure. Chapter 1. Servicing (customer tasks) 13...
Page 26
Figure 14. Removing an NVMe drive
4. Insert the FRU drive into the slot.
a) Press the release catch on the drive carrier to open the handle. The carrier handle releases from the locked position.
Page 27
Figure 15. Displaying NVMe drive replacement part
b) Hold the frame beneath the touch point and gently push the drive carrier assembly into the drive bay until the carrier handle engages. See the next figure for reference.
Figure 16. Pushing the FRU drive carrier assembly
Page 28
After the drive is replaced, run the following command:
mmvdisk pdisk replace --rg <rg name> --pdisk <pdisk name>
Only for FCM: If the drive that you are replacing is an IBM FlashCore Module, run the mmvdisk pdisk replace command twice.
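The rule above (one run for an NVMe drive, two runs for an IBM FlashCore Module) can be expressed as a small Python sketch. It only builds the argument lists and does not run anything. The function name pdisk_replace_commands and the example recovery group and pdisk names are hypothetical; the command and its options are the ones shown in this guide.

```python
from typing import List


def pdisk_replace_commands(rg_name: str, pdisk_name: str,
                           is_fcm: bool = False) -> List[List[str]]:
    """Build the replace command once, or twice for a FlashCore Module."""
    if not rg_name or not pdisk_name:
        raise ValueError("recovery group and pdisk names are required")
    command = ["mmvdisk", "pdisk", "replace",
               "--rg", rg_name, "--pdisk", pdisk_name]
    runs = 2 if is_fcm else 1  # FCM drives need the command issued twice
    return [list(command) for _ in range(runs)]


# Hypothetical names, for illustration only.
for argv in pdisk_replace_commands("example_rg", "example_pdisk", is_fcm=True):
    print(" ".join(argv))
```

Keeping the commands as argument lists avoids shell quoting problems if they are later passed to a process launcher by a root user.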
Page 29
Preparing an FCM 4 drive for service
Before you replace an IBM FlashCore Module 4 (FCM 4) in an IBM Storage Scale System 6000 enclosure, you must manually trigger and collect the drive debug log from the powered off drive.
Page 30
Check the current value of the disk discovery interval and verify that it is set to 180 seconds. Issue the following command:
mmlsconfig nsdRAIDDiskDiscoveryInterval
In the following example, the value is confirmed as 180.
Page 31
d) Set the value for disk discovery interval to 0 seconds in the impacted node class. Issue the next command:
mmchconfig nsdRAIDDiskDiscoveryInterval=0 -i -N <node class>
In the following example, the value is set as 0 for the node class noted in step 2b.
e) Verify that the value for disk discovery interval changed to 0 seconds.
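The check, set, and verify sequence for the disk discovery interval can be collected into one ordered list, which makes it harder to skip the verification. The Python sketch below only assembles the command strings that this guide shows. It assumes that the verification in step e reuses the mmlsconfig query used for the initial check, because the verify command itself is not shown here. The function name and the node class value are hypothetical.

```python
from typing import List

ATTRIBUTE = "nsdRAIDDiskDiscoveryInterval"


def discovery_interval_steps(node_class: str) -> List[str]:
    """Return the commands to check, disable, and re-check the interval."""
    if not node_class or any(ch.isspace() for ch in node_class):
        raise ValueError("node class must be a single non-empty token")
    return [
        f"mmlsconfig {ATTRIBUTE}",                       # check: expect 180
        f"mmchconfig {ATTRIBUTE}=0 -i -N {node_class}",  # step d: set to 0
        f"mmlsconfig {ATTRIBUTE}",                       # step e (assumed): expect 0
    ]


# Hypothetical node class name, for illustration only.
for command in discovery_interval_steps("example_node_class"):
    print(command)
```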
Page 32
In the following example, the value is set as DEFAULT for the node class noted in step 2b. Then issue the following command to verify that the value is the same as initially seen in step 2c:
Page 33
Removing and replacing a drive slot filler
Refer to the service procedure to remove and replace a faulty or damaged drive slot filler in an IBM Storage Scale System 6000 enclosure. The following steps are the high-level flow of the procedure:
1.
Page 34
Refer to the following steps to remove and replace a drive slot filler.
1. Remove the drive slot filler.
a) Press both release latches to unlock the drive filler, as shown in the following figure.
Page 35
Figure 22. Pressing the release latches of a drive slot filler
b) Hold both release latches and pull the drive filler out of the enclosure, away from its slot. See the next figure for reference.
Page 36
Still holding the release latches, insert the drive slot filler into the slot. Keep holding the release latches until the release latches connect with the release catches and the drive carrier assembly locks into place.
Page 37
Figure 24. Inserting a drive slot filler
c) Make sure that the drive carrier assembly is properly installed, fully inside the drive bay so that it is flush with the face of the enclosure. Then, stop holding the release latches.
Page 38
System 6000
Prerequisites
• All new or existing building blocks must be at IBM Storage Scale System 6.2.1.0 or higher. If the setup has any protocol nodes, these nodes must also be upgraded to the IBM Storage Scale System 6.2.1.0 level.
Page 39
• If the canister servers are allocated as quorum servers, understand the implications of losing a quorum server on one canister server at a time during this operation. If you do not want to lose the quorum, move the quorum to different servers during this procedure. •...
Page 40
Figure 27. Performance model: NVMe MES
Page 41
1. Ensure that the technical delivery assessment (TDA) process is complete before you start the MES upgrade. This step is necessary if you did not purchase MES and attend the TDA with sales.
2. Ensure that the system is at IBM Storage Scale System 6.2.1.0 or higher for the storage MES.
Page 42
7. To check whether all 48 NVMe drives have the current firmware version, issue the following command from one of the canisters. For more information about the current firmware version, see the Drives section in IBM Storage Scale System 6000 documentation.
# mmlsfirmware --type drive
Example
# date;...
Page 44
For more information and an example of how to manually restart GPFS, see Manually restarting GPFS on the IBM Storage Scale System 3500 canisters example in IBM Storage Scale System 3500 documentation.
Page 45
Because the declustered array size is doubled after the 12 new drives are added, the set-size is set to 40% to keep the new VDisk size the same as the original VDisk size. For information about non-default configurations, see the IBM Storage Scale RAID: Administration Guide in IBM Storage Scale System Software Documentation.
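The 40% figure follows from simple proportion: if the capacity doubles, half the percentage yields the same absolute size. The Python sketch below shows that arithmetic. It assumes that the original set-size was 80%, which is an inference from the 40% value and the stated doubling and is not given in this guide; the function name is hypothetical.

```python
def rescaled_set_size(original_percent: float, capacity_growth: float) -> float:
    """Percentage that keeps the absolute size constant after capacity grows."""
    if not 0 < original_percent <= 100:
        raise ValueError("original_percent must be in the range (0, 100]")
    if capacity_growth <= 0:
        raise ValueError("capacity_growth must be positive")
    # size = percent * capacity, so percent must shrink as capacity grows
    return original_percent / capacity_growth


# Assumed original set-size of 80% and a capacity that doubles (factor 2).
print(rescaled_set_size(80, 2))  # prints 40.0
```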
Page 46
------------------------------ --------- -------- ----------------------
ess6000_x86_64_mmvdisk_78E400K 120 GiB 7080 MiB vs_fs6000_1 (1074 MiB), vs_fs6000_2 (1077 MiB)
Mon Aug 16 14:24:01 MST 2021
Page 47
6000
Prerequisites
• All new or existing building blocks must be at IBM Storage Scale System 6.2.2.0 or higher. If the setup has any protocol nodes, these nodes must also be upgraded to the IBM Storage Scale System 6.2.2.0 level.
Page 48
New data that is added to the file system is correctly striped. Restriping a large file system requires many insert operations and delete operations, which might affect the system performance. Restripe a large file system when the system demand is low.
Page 50
1. Ensure that the technical delivery assessment (TDA) process is complete before you start the MES upgrade. This step is necessary if you did not purchase MES and attend the TDA with sales.
2. Ensure that the system is at IBM Storage Scale System 6.2.2.0 or higher for the storage MES.
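The release minimums stated for the two drive MES procedures in this guide (6.2.1.0 for NVMe drives and 6.2.2.0 for IBM FlashCore Modules) can be compared numerically rather than as text, which avoids ordering mistakes such as treating 6.10 as lower than 6.2. The following Python sketch is an illustration only. The names MIN_RELEASE, parse_release, and meets_minimum are hypothetical, and the installed release string must be obtained separately by whatever method applies to your system.

```python
from typing import Tuple

# Minimum releases as stated in the prerequisites of each MES procedure.
MIN_RELEASE = {"nvme": "6.2.1.0", "fcm": "6.2.2.0"}


def parse_release(text: str) -> Tuple[int, ...]:
    """Turn '6.2.1.0' into (6, 2, 1, 0), padding short strings with zeros."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, mes_type: str) -> bool:
    """True if the installed release is at or above the MES minimum."""
    return parse_release(installed) >= parse_release(MIN_RELEASE[mes_type])


print(meets_minimum("6.2.1.0", "nvme"))  # prints True
print(meets_minimum("6.2.1.0", "fcm"))   # prints False
```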
Page 51
7. To check whether all 48 FCM storage drives have the current firmware version, issue the following command from one of the canisters. For more information about the current firmware version, see the Drives section in IBM Storage Scale System 6000 documentation.
# mmlsfirmware --type drive
Example
# date;...
Page 53
For more information and an example of how to manually restart GPFS, see Manually restarting GPFS on the IBM Storage Scale System 3500 canisters example in IBM Storage Scale System 3500 documentation.
Page 54
Because the declustered array size is doubled after the 12 new drives are added, the set-size is set to 40% to keep the new VDisk size the same as the original VDisk size. For information about non-default configurations, see the IBM Storage Scale RAID: Administration Guide in IBM Storage Scale System Software Documentation.
Page 55
file system (if needed), the restripe operation can be run. For more information, see IBM Storage Scale: Administration Guide. When you are adding a vdisk set to a file system, the default TRIM setting is "no" irrespective of the TRIM settings of the existing vdisk sets.
Page 56
Adapter concurrent MES upgrade for IBM Storage Scale System 6000
This procedure is intended to add an adapter to each of the server canisters of IBM Storage Scale System 6000 during the concurrent (online) adapter MES installation. For this concurrent upgrade, complete this operation during a period of low workload.
Page 57
Figure 32. IBM Storage Scale System 6000 adapter slots
Table 3. Adapter placement priority
Columns: Placement, Feature Code, Description, Paired adapters to be installed in the same slot of BOTH canisters (pair 2, pair 3, pair 4, pair 5...)
Page 58
• Steps 16 – 18 are expected to be customer tasks.
Summary
The goal of this procedure is primarily to add a third high-speed adapter into each IBM Storage Scale System 6000 canister. Customers can add supported InfiniBand, Ethernet, or SAS adapters in available slots.
Page 59
Display the state of the node class associated with the IBM Storage Scale System.
# mmgetstate -N <node class name>
Example:
[root@ems9 ~]# mmgetstate -N ess6000_mmvdisk_ess6000rw6a_hs_ess6000rw6b_hs
Node number Node name GPFS state
------------------------------------------
1 ess6000rw6a-hs active
2 ess6000rw6b-hs active
[root@ems9 ~]#
c.
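Before you continue with an adapter MES step, it can help to confirm that every node in the mmgetstate example output reports the active state. The following Python sketch parses output of the shape shown in the example. The column spacing in SAMPLE is an assumption because the original layout is not preserved here, and the names gpfs_states and all_active are hypothetical.

```python
import re
from typing import Dict

# A data row is: node number, node name, GPFS state.
_ROW_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s*$")


def gpfs_states(output: str) -> Dict[str, str]:
    """Map each node name to its GPFS state; header and rule lines are skipped."""
    states = {}
    for line in output.splitlines():
        match = _ROW_RE.match(line)
        if match:
            states[match.group(2)] = match.group(3)
    return states


def all_active(output: str) -> bool:
    """True only if at least one node is listed and every node is active."""
    states = gpfs_states(output)
    return bool(states) and all(state == "active" for state in states.values())


SAMPLE = """\
 Node number  Node name        GPFS state
------------------------------------------
           1  ess6000rw6a-hs   active
           2  ess6000rw6b-hs   active
"""
print(all_active(SAMPLE))  # prints True
```

An empty or unparseable output returns False, so a failed command is not mistaken for a healthy node class.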
Page 60
Move the quorum node back to the original state, if necessary.
10. Customer task: Prepare canister 2 (bottom) slot for an adapter MES.
a. Determine the node class of the IBM Storage Scale System 6000 enclosure where you want to perform the MES upgrade.
Page 61
Display the state of the node class associated with the IBM Storage Scale System 6000.
# mmgetstate -N <node class name>
c. If canister 2 (bottom) is currently designated as a quorum node, move the quorum to a different node.
Page 62
/opt/ibm/ess/tools/samples/essServerConfig.sh <node class name>
# mmlsconfig -Y | grep -i <verbsPort>
18. Customer task: Mount file systems on canister 2 (bottom) again.
a. Check mounted file systems and mount to canister 2 (bottom) again, if necessary.
Page 63
# mmlsmount <file system name> -L
# mmlsmount all
b. Move the quorum node back to the original state, if necessary.
Page 65
15.36 TB 2.5 inches PCIe Gen 4 NVMe drive    03JN946    03NK288
30.72 TB 2.5 inches PCIe Gen 4 NVMe drive    03JN947    03MT726
19.2 TBu 2.5 inches IBM FlashCore Module 4    03LG010
38.4 TBu 2.5 inches IBM FlashCore Module 4    03LG019
Drive slot filler    01ML137
FRU part numbers list
The FRU part numbers are listed in the table.
Page 66
PCIe adapter CX7 NDR 200 GB dual port VPI    01LL984
PCIe adapter 24 Gb SAS    03JN783
Intel Ethernet X550-T2    03JN786
Coin-cell battery    00RY543
Canister air duct    03JN719
Server canister    03JN713
Enclosure chassis    03JN745
Page 69
Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead.
Page 70
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
Page 71
IBM reserves the right to withdraw the permissions that are granted herein whenever, in its discretion, the use of the publications is detrimental to its interest or as determined by IBM, the above instructions are not being properly followed.
Page 73
• See refers you from a non-preferred term to the preferred term or from an abbreviation to the spelled-out form.
• See also refers you to a related or contrasting term.
For other terms and definitions, see the IBM Terminology website (opens in new window): http://www.ibm.com/software/globalization/terminology
building block
A pair of servers with shared disk enclosures attached.
Page 74
Elastic Storage System (ESS)
See IBM Storage Scale System.
EMS virtual machine (EMSVM)
Virtual machine of an IBM Storage Scale System management server. See MS management server and IBM Storage Scale System.
ESS management server (EMS)
See management server and IBM Storage Scale System.
Page 75
file encryption key (FEK)
A key used to encrypt sectors of an individual file. See also encryption key.
file system
The methods and data structures used to control how data is stored and retrieved.
file system descriptor
A data structure containing key information about a file system. This information includes the disks assigned to the file system (stripe group), the current state of the file system, and pointers to key files such as quota files and log files.
Page 76
For GPFS encryption, the ISKLM is used as an RKM server to store MEKs.
IBM Storage Scale System
A high-performance, IBM Storage Scale NSD solution made up of one or more building blocks. The IBM Storage Scale System software runs on IBM Storage Scale System nodes (management server nodes and I/O server nodes).
Page 77
It must be part of an IBM Storage Scale cluster. From a system management perspective, it is the central coordinator of the cluster. It also serves as a client node in an IBM Storage Scale System building block.
Page 78
This permits high-throughput, low-latency networking, which is especially useful in massively parallel computer clusters.
RGD
See recovery group data (RGD).
remote key management server (RKM server)
A server that is used to store master encryption keys.
RG
See recovery group (RG).
Page 79
recovery group data (RGD)
Data that is associated with a recovery group.
RKM server
See remote key management server (RKM server).
SAS
See Serial Attached SCSI (SAS).
secure shell (SSH)
A cryptographic (encrypted) network protocol for initiating text-based shell sessions securely on remote computers.
Page 81
Index accessibility features 55 audience ix comments xi documentation on web ix information overview ix license inquiries 57 notices 57 overview of information ix patent information 57 preface ix resources on web ix submitting xi trademarks 58 documentation ix resources ix Index 69...