AFF A-Series systems
Install and maintain
NetApp
June 30, 2023
This PDF was generated from https://docs.netapp.com/us-en/ontap-systems/a150/install-setup.html on
June 30, 2023. Always check docs.netapp.com for the latest.

Summary of Contents for NetApp AFF A Series

  • Page 3: Aff A-Series Systems

    Learn how to install your system, from racking and cabling through initial system bring-up. Use the AFF A150 System Installation and Setup Instructions if you are familiar with installing NetApp systems. Videos - AFF A150 Use the following videos to learn how to rack and cable your system and perform initial system configuration.
  • Page 4 Step 1: Prepare for installation To install your AFF A150 system, you create an account on the NetApp Support Site, register your system, and obtain your license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 5 Download and complete the Cluster Configuration Worksheet. Step 2: Install the hardware You install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
  • Page 6 4. Place the bezel on the front of the system. Step 3: Cable controllers to network You cable the controllers to your network by using either the two-node switchless cluster method or the switched cluster method. About this task The following table identifies the cable type with the call out number and cable color in the illustrations for both two-node switchless cluster network cabling and switched cluster network cabling.
  • Page 7 Option 1: Two-node switchless cluster Cable your two-node switchless cluster. About this task Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 8 UTA2 data network configurations Use one of the following cable types to cable the UTA2 data ports to your host network. ◦ For an FC host, use 0c and 0d or 0e and 0f. ◦ For a 10GbE system, use e0c and e0d or e0e and e0f.
  • Page 9 DO NOT plug in the power cords at this point. Option 2: Switched cluster Cable your switched cluster. About this task Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 10 UTA2 data network configurations Use one of the following cable types to cable the UTA2 data ports to your host network. ◦ For an FC host, use 0c and 0d or 0e and 0f. ◦ For a 10GbE system, use e0c and e0d or e0e and e0f.
  • Page 11 DO NOT plug in the power cords at this point. Step 4: Cable controllers to drive shelves Cable the controllers to your shelves using the onboard storage ports. NetApp recommends MP-HA cabling for systems with external storage. About this task •...
  • Page 12 3. Connect each node to IOM A in the stack. ◦ Controller 1 port 0b to IOM A port 3 on last drive shelf in the stack. ◦ Controller 2 port 0a to IOM A port 1 on the first drive shelf in the stack. mini-SAS HD to mini-SAS HD cables 4.
  • Page 13 Option 1: If network discovery is enabled If you have network discovery enabled on your laptop, you can complete system setup and configuration using automatic cluster discovery. Steps 1. Use the following animation to set one or more drive shelf IDs Animation - Set drive shelf IDs 2.
  • Page 14 a. Open File Explorer. b. Click network in the left pane. c. Right click and select refresh. d. Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens.
  • Page 15 c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Use the following animation to set one or more drive shelf IDs: Animation - Set drive shelf IDs 3.
  • Page 16 If the management network has DHCP… Then… Not configured a. Open a console session using PuTTY, a terminal server, or the equivalent for your environment. Check your laptop or console’s online help if you do not know how to configure PuTTY. b.
  • Page 17 ◦ If the impaired controller is in a standalone configuration and at LOADER prompt, contact mysupport.netapp.com. 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
  • Page 18 NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration. Steps 1. Connect the console cable to the impaired controller. 2. Check whether NVE is configured for any volumes in the cluster: volume show -is-encrypted true If any volumes are listed in the output, NVE is configured and you need to verify the NVE configuration.
  • Page 19 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 20 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 21 Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 22 Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column shows yes for all authentication keys: security key-manager key query c. Verify that the type shows onboard, and then manually back up the OKM information.
  • Page 23 If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c. You can safely shut down the controller. 4. If the type displays and the...
  • Page 24 If the impaired controller displays… Then… System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. b.
  • Page 25 If the impaired controller is displaying… Then… Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 26 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Replace the boot media You must locate the boot media in the controller and follow the directions to replace it.
  • Page 27 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 28 ◦ If NVE is not enabled, download the image without NetApp Volume Encryption, as indicated in the download button. • If your system is an HA pair, you must have a network connection. • If your system is a stand-alone system you do not need a network connection, but you must perform an additional reboot when restoring the var file system.
  • Page 29 If you use this optional parameter, you do not need a fully qualified domain name in the netboot server URL. You need only the server’s host name. Other parameters might be necessary for your interface. You can enter help ifconfig at the firmware prompt for details.
  • Page 30 Restore OKM, NSE, and NVE as needed - AFF A150 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 31 Option 1: Restore NVE or NSE when Onboard Key Manager is enabled Steps 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console Then…...
  • Page 32 --------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to the Waiting for giveback… prompt. 8. Move the console cable to the partner controller and log in as admin. 9.
  • Page 33 b. Enter the key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, contact Customer Support. c.
  • Page 34 This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 35 11. If the Onboard Key Management is enabled: a. Use the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager. b. Use the security key-manager key show -detail command and verify that the Restored column = yes for all authentication keys.
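The Restored check in this step can be scripted once the key listing has been captured. The following is a minimal sketch, not NetApp tooling: it assumes the rows have already been parsed into dictionaries, and the field names `key_id` and `restored` are hypothetical labels chosen for illustration.

```python
def keys_not_restored(rows):
    """Return the IDs of keys whose Restored value is anything other than "yes".

    `rows` is a list of dicts with hypothetical fields "key_id" and "restored".
    A missing "restored" field is treated as not restored.
    """
    return [
        row["key_id"]
        for row in rows
        if str(row.get("restored", "")).strip().lower() != "yes"
    ]


# Hypothetical, already-parsed rows (not real command output).
example_rows = [
    {"key_id": "key-a", "restored": "yes"},
    {"key_id": "key-b", "restored": "no"},
]
print(keys_not_restored(example_rows))
```

An empty result means every key reports yes; any ID in the result is a reason to contact support, as the procedure states.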
  • Page 36 -node * -type all -message MAINT=END Return the failed part to NetApp - AFF A150 Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 37 provider. Step 1: Shut down the impaired controller To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 38 Step 2: Remove controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. Steps 1. If you are not already grounded, properly ground yourself. 2.
  • Page 39 Step 3: Replace a caching module To replace a caching module referred to as the M.2 PCIe card on the label on your controller, locate the slot inside the controller and follow the specific sequence of steps. Your storage system must meet certain criteria depending on your situation: •...
  • Page 40 3. Gently pull the caching module straight out of the housing. 4. Align the edges of the caching module with the socket in the housing, and then gently push it into the socket. 5. Verify that the caching module is seated squarely and completely in the socket. If necessary, remove the caching module and reseat it into the socket.
  • Page 41 4. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis.
  • Page 42 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 43 appears. 3. Run diagnostics on the caching module: sldiag device run -dev fcache 4. Verify that no hardware problems resulted from the replacement of the caching module: sldiag device status -dev fcache -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
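The run-then-check pair of diagnostics commands above follows one pattern per device. As a hedged sketch, the helper below only builds the two command strings for a given device name; `sldiag_commands` is a hypothetical name, and the device name must be one that the procedure itself specifies (fcache here).

```python
def sldiag_commands(dev):
    """Build the run command and the failed-state status query for one device."""
    if not dev or any(ch.isspace() for ch in dev):
        raise ValueError("dev must be a single device name such as 'fcache'")
    return [
        f"sldiag device run -dev {dev}",
        f"sldiag device status -dev {dev} -long -state failed",
    ]


for command in sldiag_commands("fcache"):
    print(command)
```

The strings are intended to be typed or pasted at the Maintenance mode prompt; the sketch does not connect to a controller.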
  • Page 44 If the system-level Then… diagnostics tests… Resulted in some test Determine the cause of the problem: failures a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 45 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 46 • Suspend external backup jobs. • Necessary tools and equipment for the replacement. If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the Gracefully shutdown and power up your storage system Resolution Guide after performing this procedure.
  • Page 47 If using FlexArray array LUNs, follow the specific vendor storage array documentation for the shutdown procedure to perform for those systems after performing this procedure. If using SSDs, refer to SU490: (Impact: Critical) SSD Best Practices: Avoid risk of drive failure and data loss if powered off for more than two months As a best practice before shutdown, you should: •...
  • Page 48 10. Unplug the power cord from each PSU. 11. Verify that all controllers in the impaired chassis are powered down. Option 2: Controller is in a MetroCluster configuration Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 49 Step 1: Move a power supply Moving out a power supply when replacing a chassis involves turning off, disconnecting, and removing the power supply from the old chassis and installing and connecting it on the replacement chassis. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 50 3. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 4. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis.
  • Page 51 4. Gently push the drive into the chassis as far as it will go. The cam handle engages and begins to rotate upward. 5. Firmly push the drive the rest of the way into the chassis, and then lock the cam handle by pushing it up and against the drive holder.
  • Page 52 Restore and verify the configuration - AFF A150 You must verify the HA state of the chassis and run System-Level diagnostics, switch back aggregates, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 53 Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration. 1. In Maintenance mode, from either controller module, display the HA state of the local controller module and chassis: ha-config show The HA state should be the same for all components.
  • Page 54 During the boot process, you can safely respond to prompts: 2. Repeat the previous step on the second controller if you are in an HA configuration. Both controllers must be in Maintenance mode to run the interconnect test. 3. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond...
  • Page 55 If the system-level diagnostics Then… tests… Were completed without any a. Clear the status logs: sldiag device clearstatus failures b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
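The "verify that the log was cleared" step compares console text against a fixed message. A minimal sketch of that comparison, assuming the output of sldiag device status has been captured as a string (how it is captured is outside this sketch, and `status_log_is_clear` is a hypothetical name):

```python
CLEARED_MESSAGE = "SLDIAG: No log messages are present."


def status_log_is_clear(captured_output):
    """Return True if any line of the captured text is exactly the cleared-log message."""
    return any(
        line.strip() == CLEARED_MESSAGE
        for line in captured_output.splitlines()
    )


print(status_log_is_clear("SLDIAG: No log messages are present.\n"))
```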
  • Page 56 If your system is running Then… ONTAP… Resulted in some test failures Determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
  • Page 57 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 58 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 59 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false When you see Do you want to disable auto-giveback?, enter y. 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then…...
  • Page 60 module. 5. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 6. Turn the controller module over and place it on a flat, stable surface. 7.
  • Page 61 Step 2: Move the NVMEM battery To move the NVMEM battery from the old controller module to the new controller module, you must perform a specific sequence of steps. 1. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦...
  • Page 62 4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module. 5. Move the battery to the replacement controller module. 6. Loop the battery cable around the cable channel on the side of the battery holder. 7.
  • Page 63 Step 4: Move the DIMMs To move the DIMMs, you must follow the directions to locate and move them from the old controller module into the replacement controller module. You must have the new controller module ready so that you can move the DIMMs directly from the impaired controller module to the corresponding slots in the replacement controller module.
  • Page 64 8. Repeat these steps for the remaining DIMMs. 9. Locate the NVMEM battery plug socket, and then squeeze the clip on the face of the battery cable plug to insert it into the socket. Make sure that the plug locks down onto the controller module. Step 5: Move a caching module, if present If your AFF A220 or FAS2700 system has a caching module, you need to move the caching module from the old controller module to the replacement controller module.
  • Page 65 4. Verify that the caching module is seated squarely and completely in the socket. If necessary, remove the caching module and reseat it into the socket. 5. Reseat and push the heatsink down to engage the locking button on the caching module housing. 6.
  • Page 66 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 67 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 68 Restore and verify the system configuration - AFF A150 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary. Step 1: Set and verify system time after replacing the controller You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration.
  • Page 69 ▪ ▪ ▪ mcc-2n ▪ mccip ▪ non-ha b. Confirm that the setting has changed: ha-config show Step 3: Run system-level diagnostics You should run comprehensive or focused diagnostic tests for specific components and subsystems whenever you replace the controller. All commands in the diagnostic procedures are issued from the controller where the component is being replaced.
  • Page 70 If you want to run diagnostic Then… tests on… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
  • Page 71 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 72 Recable the system and reassign disks - AFF A150 To complete the replacement procedure and restore your system to full operation, you must recable the storage, confirm disk reassignment, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
  • Page 73 c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find. d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure.
  • Page 74 node1> `storage failover show`   Takeover Node Partner Possible State Description ------------ ------------ -------- ------------------------------------- node1 node2 false System ID changed on partner (Old:   151759755, New: 151759706), In takeover node2 node1 Waiting for giveback (HA mailboxes) 4. From the healthy controller, verify that any coredumps are saved: a.
  • Page 75 on partner message. 7. Verify that the disks were assigned correctly: storage disk show -ownership The disks belonging to the replacement controller should show the new system ID. In the following example, the disks owned by node1 now show the new system ID, 1873775277: node1>...
  • Page 76 *> disk show -a Local System ID: 118065481   DISK OWNER POOL SERIAL NUMBER HOME -------- ------------- ----- ------------- ------------- disk_name system-1 (118073209) Pool0 J8XJE9LC system-1 (118073209) disk_name system-1 (118073209) Pool0 J8Y478RC system-1 (118073209) 5. Reassign disk ownership by using the system ID information obtained from the disk show command: disk reassign -s old system ID disk reassign -s 118073209 6.
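The reassignment step takes the old system ID from the disk show output. The sketch below shows one way to derive the disk reassign command from captured text. It rests on assumptions: the column layout of `SAMPLE` is reconstructed from the example above, any parenthesized ID that differs from the Local System ID is treated as the old ID, and `plan_disk_reassign` is a hypothetical helper.

```python
import re


def plan_disk_reassign(disk_show_output):
    """Return one 'disk reassign -s <old ID>' command per old system ID found."""
    local = re.search(r"Local System ID:\s*(\d+)", disk_show_output)
    if local is None:
        raise ValueError("no 'Local System ID' line found")
    local_id = local.group(1)

    # Owner and home entries carry the system ID in parentheses,
    # for example "system-1 (118073209)".
    old_ids = []
    for system_id in re.findall(r"\((\d+)\)", disk_show_output):
        if system_id != local_id and system_id not in old_ids:
            old_ids.append(system_id)
    return [f"disk reassign -s {old_id}" for old_id in old_ids]


# Layout assumed for illustration; values taken from the example above.
SAMPLE = """\
Local System ID: 118065481

DISK       OWNER                 POOL   SERIAL NUMBER  HOME
disk_name  system-1 (118073209)  Pool0  J8XJE9LC       system-1 (118073209)
disk_name  system-1 (118073209)  Pool0  J8Y478RC       system-1 (118073209)
"""

print(plan_disk_reassign(SAMPLE))
```

Review the generated command against the console output before running it; the sketch does no validation beyond matching digits.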
  • Page 77 About this task This procedure applies only to systems in a two-node MetroCluster configuration running ONTAP. You must be sure to issue the commands in this procedure on the correct node: • The impaired node is the node on which you are performing maintenance. •...
  • Page 78 Verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the following example, the disks owned by system-1 now show the new system ID, 118065481: *> disk show -a Local System ID: 118065481  ...
  • Page 79 Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
  • Page 80 If any LIFs are listed as false, revert them to their home ports: network interface revert -vserver * -lif * 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
  • Page 81 MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync- source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools. This task only applies to two-node MetroCluster configurations. Steps 1.
  • Page 82 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 83 The following AutoSupport message suppresses automatic case creation for two hours: cluster1:> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false When you see Do you want to disable auto-giveback?, enter y.
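The AutoSupport message that opens the maintenance window (MAINT=2h) and the one that closes it (MAINT=END) differ only in the value after MAINT=. A small sketch of a helper that builds either string follows; `autosupport_maintenance_command` is a hypothetical name, and the helper only formats text, it does not contact the cluster.

```python
def autosupport_maintenance_command(window, node="*"):
    """Build the AutoSupport invocation for a maintenance window.

    `window` is the value placed after MAINT=, for example "2h" or "END".
    """
    if not window or any(ch.isspace() for ch in window):
        raise ValueError("window must be a single token such as '2h' or 'END'")
    return (
        f"system node autosupport invoke -node {node} "
        f"-type all -message MAINT={window}"
    )


print(autosupport_maintenance_command("2h"))
print(autosupport_maintenance_command("END"))
```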
  • Page 84 4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6.
  • Page 85 Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module.
  • Page 86 c. Reconnect the battery connector. 5. Return to Replace the DIMMs of this procedure to recheck the NVMEM LED. 6. Locate the DIMMs on your controller module. 7. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.
  • Page 87 12. Locate the NVMEM battery plug socket, and then squeeze the clip on the face of the battery cable plug to insert it into the socket. Make sure that the plug locks down onto the controller module. 13. Close the controller module cover. Step 4: Reinstall the controller module After you replace components in the controller module, reinstall it into the chassis.
  • Page 88 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 89 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 90 function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
  • Page 91 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 92 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 93 Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace SSD Drive or HDD Drive - AFF A150 You can replace a failed drive nondisruptively while I/O is in progress.
  • Page 94 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 95 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 96 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 97 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 98 If the impaired controller is displaying… Then… The LOADER prompt: Go to Remove controller module. Waiting for giveback…: Press Ctrl-C, and then respond when prompted. System prompt or password prompt: Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
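The three rows of the table above amount to a simple lookup from what the impaired controller displays to the next action. A minimal sketch of that lookup follows; the dictionary keys and the function name `next_action` are illustrative labels, not part of ONTAP, and the action strings are copied from the table as it appears here.

```python
# Hypothetical lookup mirroring the three table rows above.
# Keys are illustrative labels for what the impaired controller displays.
ACTIONS = {
    "LOADER prompt": "Go to Remove controller module.",
    "Waiting for giveback": "Press Ctrl-C, and then respond when prompted.",
    "System prompt or password prompt": (
        "Take over or halt the impaired controller from the healthy "
        "controller: storage failover takeover -ofnode impaired_node_name"
    ),
}


def next_action(console_state: str) -> str:
    """Return the documented action for the given console state."""
    try:
        return ACTIONS[console_state]
    except KeyError:
        # Anything outside the three documented states is rejected
        # rather than guessed at.
        raise ValueError(f"unrecognized console state: {console_state!r}") from None


if __name__ == "__main__":
    print(next_action("LOADER prompt"))
```

Rejecting unknown states instead of falling back to a default keeps the sketch from suggesting an action the procedure does not document.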
  • Page 99 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the NVMEM battery To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery.
  • Page 100 4. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 5. Remove the battery from the controller module and set it aside. 6.
  • Page 101 optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a.
  • Page 102 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 103 function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4. Verify that no hardware problems resulted from the replacement of the NVMEM battery: sldiag device status -dev nvmem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
  • Page 104 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 105 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 106 Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Swap out a power supply - AFF A150 Swapping out a power supply involves turning off, disconnecting, and removing the old power supply and installing, connecting, and turning on the replacement power supply.
  • Page 107 5. Use the cam handle to slide the power supply out of the system. When removing a power supply, always use two hands to support its weight. 6. Make sure that the on/off switch of the new power supply is in the Off position. 7.
  • Page 108 11. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace the real-time clock battery - AFF A150 You replace the real-time clock (RTC) battery in the controller module so that your system’s services and applications that depend on accurate time synchronization...
  • Page 109 If the impaired controller is displaying… Then… The LOADER prompt: Go to Remove controller module. Waiting for giveback…: Press Ctrl-C, and then respond when prompted. System prompt or password prompt: Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
  • Page 110 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
  • Page 111 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery. 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 112 module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables.
  • Page 113 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 114: Aff A220 System Documentation

    Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. AFF A220 System Documentation Install and setup Start here: Choose your installation and setup experience For most configurations, you can choose from different content formats.
  • Page 115 Step 1: Prepare for installation To install your FAS2700 or AFF A220 system, you need to create an account on the NetApp Support Site, register your system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 116 6. Download and complete the Cluster configuration worksheet. Cluster Configuration Worksheet Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed.
  • Page 117 You need to be aware of the safety concerns associated with the weight of the system. 3. Attach cable management devices (as shown). 4. Place the bezel on the front of the system. Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network.
  • Page 118 Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b...
  • Page 119 Step Perform on each controller Use one of the following cable types to cable the UTA2 data ports to your host network: An FC host • 0c and 0d • or 0e and 0f A 10GbE • e0c and e0d •...
  • Page 120 2. To cable your storage, see Step 4: Cable controllers to drive shelves Option 2: Cable a switched cluster, unified network configuration Management network, UTA2 data network, and management ports on the controllers are connected to switches. The cluster interconnect ports are cabled to the cluster interconnect switches. You must have contacted your network administrator for information about connecting the system to the switches.
  • Page 121 Step Perform on each controller module Use one of the following cable types to cable the UTA2 data ports to your host network: An FC host • 0c and 0d • or 0e and 0f A 10GbE • e0c and e0d •...
  • Page 122 Step Perform on each controller module DO NOT plug in the power cords at this point. 2. To cable your storage, see Step 4: Cable controllers to drive shelves Option 3: Cable a two-node switchless cluster, Ethernet network configuration Management network, Ethernet data network, and management ports on the controllers are connected to switches.
  • Page 123 Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network: Cable the e0M ports to the management network switches with the RJ45 cables:...
  • Page 124 2. To cable your storage, see Step 4: Cable controllers to drive shelves Option 4: Cable a switched cluster, Ethernet network configuration Management network, Ethernet data network, and management ports on the controllers are connected to switches. The cluster interconnect ports are cabled to the cluster interconnect switches. You must have contacted your network administrator for information about connecting the system to the switches.
  • Page 125 Step 4: Cable controllers to drive shelves You must cable the controllers to your shelves using the onboard storage ports. NetApp recommends MP-HA cabling for systems with external storage. If you have a SAS tape drive, you can use single-path cabling. If you have no external shelves, MP-HA cabling to internal drives is optional (not shown) if the SAS cables are ordered with the system.
  • Page 126 Steps 1. Cable the HA pair with external drive shelves: The example uses DS224C. Cabling is similar with other supported drive shelves. Step Perform on each controller Cable the shelf-to-shelf ports. • Port 3 on IOM A to port 1 on the IOM A on the shelf directly below. •...
  • Page 127 If you have more than one drive shelf stack, see the Installation and Cabling Guide for your drive shelf type. 2. To complete setting up your system, see Step 5: Complete system setup and configuration Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 128 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 7. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 8.
  • Page 129 c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Use the following animation to set one or more drive shelf IDs: Animation - Set drive shelf IDs 3.
  • Page 130 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 7. Verify the health of your system by running Config Advisor.
  • Page 131 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 132 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 133 Restored unavailable: a. Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 134 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 135 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 136 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 137 Restored a. Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows...
  • Page 138 to your log file. This command may not work if the boot device is corrupted or non-functional. Option 2: Controller is in a MetroCluster After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Do not use this procedure if your system is in a two-node MetroCluster configuration.
  • Page 139 Step 1: Remove the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 140 Step 2: Replace the boot media You must locate the boot media in the controller and follow the directions to replace it. Steps 1. If you are not already grounded, properly ground yourself. 2. Locate the boot media using the following illustration or the FRU map on the controller module:...
  • Page 141 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 142 Steps 1. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 2. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs) if they were removed. 3.
  • Page 143 Boot the recovery image - AFF A220 and FAS2700 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables. Steps 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive.
  • Page 144 Restore OKM, NSE, and NVE as needed - AFF A220 and FAS2700 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled.
  • Page 145 If the console displays… Then… The LOADER prompt: Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…: a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this controller rather than wait [y/n]?, enter: y c.
  • Page 146 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 147 If giveback is not complete after 20 minutes, contact Customer Support. 18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home controller and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int revert command.
  • Page 148 This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 149 If the console displays… Then… The login prompt: Go to Step 7. Waiting for giveback…: a. Log in to the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 150 -node * -type all -message MAINT=END Return the failed part to NetApp - AFF A220 and FAS2700 Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 151 b. Verify that the data has been erased from the caching module: system controller flash-cache secure-erase show 2. If the impaired controller is part of an HA pair, disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3.
  • Page 152 4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6.
  • Page 153 Step 3: Replace a caching module To replace a caching module referred to as the M.2 PCIe card on the label on your controller, locate the slot inside the controller and follow the specific sequence of steps. Your storage system must meet certain criteria depending on your situation: •...
  • Page 154 3. Gently pull the caching module straight out of the housing. 4. Align the edges of the caching module with the socket in the housing, and then gently push it into the socket. 5. Verify that the caching module is seated squarely and completely in the socket. If necessary, remove the caching module and reseat it into the socket.
  • Page 155 4. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis.
  • Page 156 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 157 appears. 3. Run diagnostics on the caching module: sldiag device run -dev fcache 4. Verify that no hardware problems resulted from the replacement of the caching module: sldiag device status -dev fcache -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
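The memory, NVMEM, and caching-module procedures in this document all use the same two-command pattern: run the test for one device, then list only its failures. A minimal sketch that generates the pair for a given device follows; the function name `sldiag_commands` and the `KNOWN_DEVICES` set are illustrative, and the set holds only the device names that appear in this document (`mem`, `nvmem`, `fcache`), not a complete list of what `sldiag` accepts.

```python
# Device names taken from the procedures in this document only.
KNOWN_DEVICES = {"mem", "nvmem", "fcache"}


def sldiag_commands(device: str) -> list[str]:
    """Return the run and verify commands for one device, in order."""
    if device not in KNOWN_DEVICES:
        raise ValueError(f"device not covered by this sketch: {device!r}")
    return [
        # Run diagnostics on the device.
        f"sldiag device run -dev {device}",
        # Show detailed status, filtered to failed tests only.
        f"sldiag device status -dev {device} -long -state failed",
    ]


if __name__ == "__main__":
    for command in sldiag_commands("fcache"):
        print(command)
```

The commands are only printed, not executed; they are meant to be typed at the Maintenance mode prompt (*>).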
  • Page 158 If the system-level Then… diagnostics tests… Resulted in some test Determine the cause of the problem: failures a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 159 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
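The check in step 1 can be scripted against captured console text. The sketch below assumes the output has been saved as a string and that each node row has the form "node, configuration state, enabled or disabled, mode", as in the example above; the exact column spacing in `SAMPLE` is an assumption, and `parse_nodes` and `all_enabled` are illustrative names.

```python
import re

# A node row: name, configuration state, enabled/disabled, free-text mode.
NODE_ROW = re.compile(
    r"^\s*(?P<node>\S+)\s+(?P<state>\S+)\s+"
    r"(?P<mirroring>enabled|disabled)\s+(?P<mode>\S.*?)\s*$"
)


def parse_nodes(output: str) -> list[dict]:
    """Extract node rows; header, separator, and cluster lines do not match."""
    rows = []
    for line in output.splitlines():
        match = NODE_ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def all_enabled(output: str) -> bool:
    """True only if at least one node row exists and every row shows enabled."""
    rows = parse_nodes(output)
    return bool(rows) and all(row["mirroring"] == "enabled" for row in rows)


# Layout assumed for illustration; values copied from the example above.
SAMPLE = """\
cluster_B::> metrocluster node show
Group Cluster Node State Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
cluster_A
  controller_A_1 configured enabled heal roots completed
cluster_B
  controller_B_1 configured enabled waiting for switchback recovery
2 entries were displayed.
"""

if __name__ == "__main__":
    for row in parse_nodes(SAMPLE):
        print(row["node"], row["mirroring"], row["mode"])
    print("all enabled:", all_enabled(SAMPLE))
```

Returning False when no rows are found avoids reporting success on empty or unparseable output.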
  • Page 160 Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Chassis Overview of chassis replacement - AFF A220 and FAS2700...
  • Page 161 If using SSDs, refer to SU490: (Impact: Critical) SSD Best Practices: Avoid risk of drive failure and data loss if powered off for more than two months. As a best practice before shutdown, you should: • Perform additional system health checks.
  • Page 162 Option 2: Controller is in a MetroCluster configuration Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 163 power supply from the old chassis and installing and connecting it on the replacement chassis. 1. If you are not already grounded, properly ground yourself. 2. Turn off the power supply and disconnect the power cables: a. Turn off the power switch on the power supply. b.
  • Page 164 3. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 4. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis.
  • Page 165 when it is secure. 6. Repeat the process for the remaining drives in the system. Step 4: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis.
  • Page 166 Restore and verify the configuration - AFF A220 and FAS2700 You must verify the HA state of the chassis and run System-Level diagnostics, switch back aggregates, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 167 Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration. 1. In Maintenance mode, from either controller module, display the HA state of the local controller module and chassis: ha-config show The HA state should be the same for all components.
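The requirement that the HA state be the same for all components can be checked mechanically once the `ha-config show` output is captured. The sketch below assumes the output consists of one "Chassis HA configuration: …" line and one "Controller HA configuration: …" line; that layout is an assumption made for illustration, and `ha_states` and `ha_state_consistent` are hypothetical helper names.

```python
import re

# Assumed line format: "<Component> HA configuration: <state>"
HA_LINE = re.compile(
    r"^\s*(Chassis|Controller) HA configuration:\s*(\S+)\s*$",
    re.MULTILINE,
)


def ha_states(output: str) -> dict:
    """Map each component (lowercased) to its reported HA state."""
    return {component.lower(): state for component, state in HA_LINE.findall(output)}


def ha_state_consistent(output: str) -> bool:
    """True if both components were found and report the same state."""
    states = ha_states(output)
    return len(states) == 2 and len(set(states.values())) == 1


if __name__ == "__main__":
    sample = "Chassis HA configuration: ha\nController HA configuration: ha\n"
    print(ha_states(sample), ha_state_consistent(sample))
```

If the states differ, the procedure's next step (updating the state to match your system configuration) applies; the sketch only reports the mismatch.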
  • Page 168 During the boot process, you can safely respond to prompts: 2. Repeat the previous step on the second controller if you are in an HA configuration. Both controllers must be in Maintenance mode to run the interconnect test. 3. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond...
  • Page 169 If the system-level diagnostics Then… tests… Were completed without any a. Clear the status logs: sldiag device clearstatus failures b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 170 If your system is running Then… ONTAP… Resulted in some test failures Determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
  • Page 171 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 172 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 173 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false When you see Do you want to disable auto-giveback?, enter y. 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then…
  • Page 174 module. 5. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 6. Turn the controller module over and place it on a flat, stable surface. 7.
  • Page 175 Step 2: Move the NVMEM battery To move the NVMEM battery from the old controller module to the new controller module, you must perform a specific sequence of steps. 1. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦...
  • Page 176 4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module. 5. Move the battery to the replacement controller module. 6. Loop the battery cable around the cable channel on the side of the battery holder. 7.
  • Page 177 Step 4: Move the DIMMs To move the DIMMs, you must follow the directions to locate and move them from the old controller module into the replacement controller module. You must have the new controller module ready so that you can move the DIMMs directly from the impaired controller module to the corresponding slots in the replacement controller module.
  • Page 178 8. Repeat these steps for the remaining DIMMs. 9. Locate the NVMEM battery plug socket, and then squeeze the clip on the face of the battery cable plug to insert it into the socket. Make sure that the plug locks down onto the controller module. Step 5: Move a caching module, if present If your AFF A220 or FAS2700 system has a caching module, you need to move the caching module from the old controller module to the replacement controller module.
  • Page 179 4. Verify that the caching module is seated squarely and completely in the socket. If necessary, remove the caching module and reseat it into the socket. 5. Reseat and push the heatsink down to engage the locking button on the caching module housing. 6.
  • Page 180 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 181 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 182 Restore and verify the system configuration - AFF A220 and FAS2700 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary. Step 1: Set and verify system time after replacing the controller You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration.
  • Page 183 ▪ ▪ ▪ mcc-2n ▪ mccip ▪ non-ha b. Confirm that the setting has changed: ha-config show Step 3: Run system-level diagnostics You should run comprehensive or focused diagnostic tests for specific components and subsystems whenever you replace the controller. All commands in the diagnostic procedures are issued from the controller where the component is being replaced.
  • Page 184 If you want to run diagnostic Then… tests on… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
  • Page 185 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 186 Recable the system and reassign disks - AFF A220 and FAS2700 To complete the replacement procedure and restore your system to full operation, you must recable the storage, confirm disk reassignment, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
  • Page 187 c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find. d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure.
  • Page 188 node1> `storage failover show`   Takeover Node Partner Possible State Description ------------ ------------ -------- ------------------------------------- node1 node2 false System ID changed on partner (Old:   151759755, New: 151759706), In takeover node2 node1 Waiting for giveback (HA mailboxes) 4. From the healthy controller, verify that any coredumps are saved: a.
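The old and new system IDs in the "System ID changed on partner" message can be pulled out of captured `storage failover show` text with a regular expression. This is a minimal sketch that assumes the message keeps the "(Old: …, New: …)" form shown in the example above; `system_id_change` is an illustrative helper name.

```python
import re

# Matches the message shown above; \s* tolerates the wrapped line and extra spaces.
SYSTEM_ID_CHANGE = re.compile(
    r"System ID changed on partner\s*\(Old:\s*(\d+),\s*New:\s*(\d+)\)"
)


def system_id_change(text: str):
    """Return (old_id, new_id) as strings, or None if the message is absent."""
    match = SYSTEM_ID_CHANGE.search(text)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    line = ("node1 node2 false System ID changed on partner (Old:   "
            "151759755, New: 151759706), In takeover")
    print(system_id_change(line))
```

The IDs are kept as strings because they are identifiers to compare and paste, not numbers to compute with.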
  • Page 189 on partner message. 7. Verify that the disks were assigned correctly: storage disk show -ownership The disks belonging to the replacement controller should show the new system ID. In the following example, the disks owned by node1 now show the new system ID, 1873775277: node1>...
  • Page 190 *> disk show -a Local System ID: 118065481   DISK OWNER POOL SERIAL NUMBER HOME -------- ------------- ----- ------------- ------------- disk_name system-1 (118073209) Pool0 J8XJE9LC system-1 (118073209) disk_name system-1 (118073209) Pool0 J8Y478RC system-1 (118073209) 5. Reassign disk ownership by using the system ID information obtained from the disk show command: disk reassign -s old system ID disk reassign -s 118073209 6.
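Step 5 takes the old system ID from the `disk show -a` output and passes it to `disk reassign -s`. The sketch below derives that command from captured output, assuming the "Local System ID:" line and the "(system ID)" owner notation shown in the example above; `reassign_command` is a hypothetical helper, and the sample layout is reconstructed for illustration.

```python
import re

LOCAL_ID = re.compile(r"Local System ID:\s*(\d+)")
OWNER_ID = re.compile(r"\((\d+)\)")  # system IDs appear in parentheses


def reassign_command(disk_show_output: str) -> str:
    """Build 'disk reassign -s <old system ID>' from disk show -a output."""
    local = LOCAL_ID.search(disk_show_output)
    if local is None:
        raise ValueError("no 'Local System ID' line found")
    # Any owner ID that differs from the local ID is treated as the old ID.
    old_ids = {sid for sid in OWNER_ID.findall(disk_show_output)
               if sid != local.group(1)}
    if len(old_ids) != 1:
        raise ValueError(f"expected exactly one old system ID, found {sorted(old_ids)}")
    return f"disk reassign -s {old_ids.pop()}"


# Values copied from the example above; line breaks are assumed.
SAMPLE = """\
*> disk show -a
Local System ID: 118065481
DISK OWNER POOL SERIAL NUMBER HOME
disk_name system-1 (118073209) Pool0 J8XJE9LC system-1 (118073209)
disk_name system-1 (118073209) Pool0 J8Y478RC system-1 (118073209)
"""

if __name__ == "__main__":
    print(reassign_command(SAMPLE))
```

The function refuses to guess when it finds zero or several candidate IDs, since reassigning disks from the wrong system ID is not something a script should decide on its own.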
  • Page 191 About this task This procedure applies only to systems in a two-node MetroCluster configuration running ONTAP. You must be sure to issue the commands in this procedure on the correct node: • The impaired node is the node on which you are performing maintenance. •...
  • Page 192 Verify that the disks belonging to the replacement node show the new system ID for the replacement node. In the following example, the disks owned by system-1 now show the new system ID, 118065481: *> disk show -a Local System ID: 118065481  ...
  • Page 193 Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
  • Page 194 If any LIFs are listed as false, revert them to their home ports: network interface revert -vserver * -lif * 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
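The LIF check above reduces to a simple rule: if any LIF is not on its home port, issue the revert command. A minimal sketch follows, assuming the is-home status of each LIF has already been collected into (name, is_home) pairs; that data structure and the LIF names are illustrative assumptions, not ONTAP output.

```python
from typing import Iterable, Optional, Tuple

# Command taken verbatim from the step above.
REVERT_COMMAND = "network interface revert -vserver * -lif *"


def revert_command_if_needed(lifs: Iterable[Tuple[str, bool]]) -> Optional[str]:
    """Return the revert command when at least one LIF is not home, else None."""
    if any(not is_home for _name, is_home in lifs):
        return REVERT_COMMAND
    return None


# Hypothetical status list: one LIF is away from its home port.
status = [("lif1", True), ("lif2", False)]
print(revert_command_if_needed(status))
```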
  • Page 195 MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync- source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools. This task only applies to two-node MetroCluster configurations. Steps 1.
  • Page 196 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 197 The following AutoSupport message suppresses automatic case creation for two hours: cluster1:> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false When you see Do you want to disable auto-giveback?, enter y.
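The maintenance window in the AutoSupport message is expressed in hours (MAINT=2h in the example). The sketch below builds the same command for a different window length; the function name is hypothetical, and the command text is the one shown above.

```python
def autosupport_maint_command(hours: int) -> str:
    """Build the AutoSupport invoke command for a maintenance window in hours."""
    if hours <= 0:
        raise ValueError("maintenance window must be a positive number of hours")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


print(autosupport_maint_command(2))
```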
  • Page 198 4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6.
  • Page 199 Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module.
  • Page 200 5. Return to Step 3: Replace the DIMMs in this procedure to recheck the NVMEM LED. 6. Locate the DIMMs on your controller module. 7. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.
  • Page 201 Make sure that the plug locks down onto the controller module. 13. Close the controller module cover. Step 4: Reinstall the controller module After you replace components in the controller module, reinstall it into the chassis. Steps 1. If you are not already grounded, properly ground yourself. 2.
  • Page 202 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 203 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 204 During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
  • Page 205 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 206 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 207 Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace SSD Drive or HDD Drive - AFF A220 and FAS2700 You can replace a failed drive nondisruptively while I/O is in progress.
  • Page 208 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 209 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 210 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 211 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 212 If the impaired controller is displaying… Then… ◦ The LOADER prompt: Go to Remove controller module. ◦ Waiting for giveback…: Press Ctrl-C, and then respond when prompted. ◦ System prompt or password prompt: Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
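The decision table above maps what the impaired controller's console displays to the next action. It can be sketched as a small lookup, shown below; the function name and the exact state strings are illustrative assumptions, and the actions are quoted from the table.

```python
def next_action(console_state: str, impaired_node_name: str) -> str:
    """Return the action from the table for a given console state."""
    if console_state == "LOADER prompt":
        return "Go to Remove controller module."
    if console_state.startswith("Waiting for giveback"):
        return "Press Ctrl-C, and then respond when prompted."
    if console_state in ("System prompt", "Password prompt"):
        # Run this from the healthy controller.
        return f"storage failover takeover -ofnode {impaired_node_name}"
    raise ValueError(f"unrecognized console state: {console_state!r}")


print(next_action("System prompt", "node1"))
```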
  • Page 213 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the NVMEM battery To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery.
  • Page 214 4. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 5. Remove the battery from the controller module and set it aside. 6.
  • Page 215 optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a.
  • Page 216 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 217 During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4. Verify that no hardware problems resulted from the replacement of the NVMEM battery: sldiag device status -dev nvmem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
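The diagnostic steps follow the same two-command pattern for each device: run the test, then list only the failures. A minimal sketch, assuming the device names mem and nvmem used in these procedures; the helper name is hypothetical.

```python
from typing import List


def sldiag_commands(device: str) -> List[str]:
    """Return the run and failed-status commands for one sldiag device."""
    if not device or " " in device:
        raise ValueError("device must be a single token such as 'mem' or 'nvmem'")
    return [
        f"sldiag device run -dev {device}",
        f"sldiag device status -dev {device} -long -state failed",
    ]


for command in sldiag_commands("nvmem"):
    print(command)
```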
  • Page 218 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 219 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 220 Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Swap out a power supply - AFF A220 and FAS2700 Swapping out a power supply involves turning off, disconnecting, and removing the old power supply and installing, connecting, and turning on the replacement power supply.
  • Page 221 5. Use the cam handle to slide the power supply out of the system. When removing a power supply, always use two hands to support its weight. 6. Make sure that the on/off switch of the new power supply is in the Off position. 7.
  • Page 222 11. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace the real-time clock battery - AFF A220 and FAS2700 You replace the real-time clock (RTC) battery in the controller module so that your system’s services and applications that depend on accurate time synchronization...
  • Page 223 If the impaired controller is displaying… Then… ◦ Waiting for giveback…: Press Ctrl-C, and then respond when prompted. ◦ System prompt or password prompt: Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 224 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
  • Page 225 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery. 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 226 Do not completely insert the controller module in the chassis until instructed to do so. 3. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 4.
  • Page 227 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 228: Aff A250 System Documentation

    Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. AFF A250 System Documentation Install and setup Start here: Choose your installation and setup experience For most configurations, you can choose from different content formats.
  • Page 229 Register your system. 4. Download and install NetApp Downloads: Config Advisor on your laptop. 5. Inventory and make a note of the number and types of cables you received. The following table identifies the types of cables you might receive. If you receive a cable not listed in the...
  • Page 230 ONTAP Configuration Guide and collect the required information listed in that guide. Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed.
  • Page 231 3. Identify and manage cables because this system does not have a cable management device. 4. Place the bezel on the front of the system. Step 3: Cable controllers to cluster Cable the controllers to a cluster by using the two-node switchless cluster method or by using the cluster interconnect network method.
  • Page 232 Option 1: Two-node switchless cluster The management, Fibre Channel, and data or host network ports on the controller modules are connected to switches. The cluster interconnect ports are cabled on both controller modules. Before you begin • Contact your network administrator for information about connecting the system to the switches. •...
  • Page 233 DO NOT plug in the power cords at this point. Option 2: Switched cluster All ports on the controllers are connected to switches: cluster interconnect, management, Fibre Channel, and data or host network switches. Before you begin • Contact your network administrator for information about connecting the system to the switches. •...
  • Page 234 Step 4: Cable to host network or storage (Optional) You have configuration-dependent optional cabling to the Fibre Channel or iSCSI host networks or direct- attached storage. This cabling is not exclusive; you can have cabling to a host network and storage.
  • Page 235 Option 1: Cable to Fibre Channel host network Fibre Channel ports on the controllers are connected to Fibre Channel host network switches. Before you begin • Contact your network administrator for information about connecting the system to the switches. • Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place;...
  • Page 236 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. About this task Perform the following step on each controller module. Steps 1. Cable ports e4a through e4d to the 10GbE host network switches. Option 3: Cable controllers to single drive shelf Cable each controller to the NSM modules on the NS224 drive shelf.
  • Page 237 2. Cable controller B to the shelf. Step 5: Complete system setup Complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 238 Option 1: If network discovery is enabled If you have network discovery enabled on your laptop, you can complete system setup and configuration using automatic cluster discovery. Steps 1. Plug the power cords into the controller power supplies, and then connect them to power sources on different circuits.
  • Page 239 Documentation Resources page for information about configuring additional features in ONTAP. Option 2: If network discovery is not enabled If network discovery is not enabled on your laptop, you must complete the configuration and setup using this task. Steps 1. Cable and configure your laptop or console: a.
  • Page 240 b. Configure the system using the data you collected in the ONTAP Configuration Guide. 5. Verify the health of your system by running Config Advisor. 6. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
  • Page 241 Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 242 If the Restored column displays anything other than yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support (mysupport.netapp.com). b. Verify that the Restored column equals yes for all authentication keys: ...
  • Page 243 If the Restored column displays anything other than yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support (mysupport.netapp.com). b. Verify that the Restored column equals yes for all authentication keys: ...
  • Page 244 Verify the Restored column shows yes for all authentication keys: security key-manager key query c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information. d. Go to advanced privilege mode and enter...
  • Page 245 Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. •...
  • Page 246 2. Unplug the controller module power supplies from the source. 3. Release the power cable retainers, and then unplug the cables from the power supplies. 4. Insert your forefinger into the latching mechanism on either side of the controller module, press the lever with your thumb, and gently pull the controller a few inches out of the chassis.
  • Page 247 Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 248 Step 2: Replace the boot media You locate the failed boot media in the controller module by removing the air duct on the controller module before you can replace the boot media. You need a #1 magnetic Phillips head screwdriver to remove the screw that holds the boot media in place. Due to the space constraints within the controller module, you should also have a magnet to transfer the screw on to so that you do not lose it.
  • Page 249 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download...
  • Page 250 ◦ If NVE is not enabled, download the image without NetApp Volume Encryption, as indicated in the download button. • If your system is an HA pair, you must have a network connection. • If your system is a stand-alone system you do not need a network connection, but you must perform an additional reboot when restoring the var file system.
  • Page 251 7. Close the controller module cover and tighten the thumbscrew.
  • Page 252 Controller module cover Thumbscrew 8. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 9. Plug the power cable into the power supply and reinstall the power cable retainer. 10.
  • Page 253 ▪ dns_addr is the IP address of a name server on your network. ▪ dns_domain is the Domain Name System (DNS) domain name. If you use this optional parameter, you do not need a fully qualified domain name in the netboot server URL.
  • Page 254 If your system has… Then… ◦ No network connection and is in a MetroCluster IP configuration: a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
  • Page 255 Restore OKM, NSE, and NVE as needed - AFF A250 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 256 If the console displays… Then… ◦ The LOADER prompt: Boot the controller to the boot menu: boot_ontap menu ◦ Waiting for giveback…: a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: c.
  • Page 257 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 258 Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console displays… Then…...
  • Page 259 • This procedure is written with the assumption that you are moving the bezel, NVMe drives, and controller modules to the new chassis, and that the replacement chassis is a new component from NetApp. • This procedure is disruptive. For a two-node cluster, you will have a complete service outage and a partial outage in a multi-node cluster.
  • Page 260 • Suspend external backup jobs. • Necessary tools and equipment for the replacement. If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the Gracefully shutdown and power up your storage system Resolution Guide after performing this procedure.
  • Page 261 For clusters using SnapMirror synchronous operating in StrictSync mode: system node halt -node * -skip-lif-migration-before-shutdown true -ignore-quorum-warnings true -inhibit-takeover true -ignore-strict-sync-warnings true 7. Enter y for each controller in the cluster when you see Warning: Are you sure you want to halt node "cluster name-controller number"? {y|n}: 8.
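Long halt commands are easy to mistype, for example by splitting a flag name across a line break. The sketch below assembles the StrictSync halt command from a flag list and rejects any flag name that contains whitespace. It assumes the quorum flag is the single token -ignore-quorum-warnings; the helper name and the validation rule are illustrative, not part of ONTAP.

```python
from typing import List, Tuple

# Flags and values as given for StrictSync clusters in the step above.
STRICT_SYNC_HALT_FLAGS: List[Tuple[str, str]] = [
    ("-node", "*"),
    ("-skip-lif-migration-before-shutdown", "true"),
    ("-ignore-quorum-warnings", "true"),
    ("-inhibit-takeover", "true"),
    ("-ignore-strict-sync-warnings", "true"),
]


def build_halt_command(flags: List[Tuple[str, str]]) -> str:
    """Join flag/value pairs into a single 'system node halt' command line."""
    parts = ["system node halt"]
    for name, value in flags:
        # A flag must be one token that starts with a hyphen.
        if not name.startswith("-") or any(ch.isspace() for ch in name):
            raise ValueError(f"malformed flag name: {name!r}")
        parts.append(f"{name} {value}")
    return " ".join(parts)


print(build_halt_command(STRICT_SYNC_HALT_FLAGS))
```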
  • Page 262 Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Set the controller module aside in a safe place, and repeat these steps for the other controller module in the chassis.
  • Page 263 and against the drive holder. Be sure to close the cam handle slowly so that it aligns correctly with the front of the drive carrier. It clicks when it is secure. 6. Repeat the process for the remaining drives in the system. Step 3: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis.
  • Page 264 Complete the restoration and replacement process - AFF A250 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 265 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the...
  • Page 266 Returning SEDs to unprotected mode. • If you have a SAN system, you must have checked event messages (cluster kernel-service show) for the impaired controller SCSI blade. The cluster kernel-service show command displays the node name, quorum status of that node, availability status of that node, and operational status of that node. Each SCSI-blade process should be in quorum with the other nodes in the cluster.
  • Page 267 Step 1: Remove the controller module You must remove the controller module from the chassis when you replace a component inside the controller module. Make sure that you label the cables so that you know where they came from. Use the following video or the tabulated steps to replace a controller module: Animation - Replace a controller module 1.
  • Page 268 Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 269 Step 2: Move the power supply You must move the power supply from the impaired controller module to the replacement controller module when you replace a controller module. 1. Disconnect the power supply. 2. Open the power cable retainer, and then unplug the power cable from the power supply. 3.
  • Page 270 Blue power supply locking tab Power supply 5. Move the power supply to the new controller module, and then install it. 6. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
  • Page 271 Fan module 2. Move the fan module to the replacement controller module, and align the edges of the fan module with the opening in the controller module, and then slide the fan module in. 3. Repeat these steps for the remaining fan modules. Step 4: Move the boot media There is one boot media device in the AFF A250 under the air duct in the controller module.
  • Page 272 Remove the screw securing the boot media to the motherboard in the impaired controller module. Lift the boot media out of the impaired controller module. a. Using the #1 magnetic screwdriver, remove the screw from the boot media, and set it aside safely on the magnet.
  • Page 273 Install each DIMM into the same slot it occupied in the impaired controller module. 1. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot. Hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  • Page 274 Remove screws on the face of the controller module. Loosen the screw in the controller module. Move the mezzanine card. 2. Unplug any cabling associated with the mezzanine card. Make sure that you label the cables so that you know where they came from. a.
  • Page 275 4. Insert the SFP or QSFP modules that were removed onto the mezzanine card. Step 7: Move the NV battery When replacing the controller module, you must move the NV battery from the impaired controller module to the replacement controller module. 1.
  • Page 276 4. Locate the corresponding NV battery holder on the replacement controller module and align the NV battery to the battery holder. 5. Insert the NV battery plug into the socket. 6. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side wall.
  • Page 277 Controller module cover Thumbscrew 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
  • Page 278 mechanisms snap into place. The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. The controller module should be fully inserted and flush with the edges of the chassis. Restore and verify the system configuration - AFF A250 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system...
  • Page 279 The HA state should be the same for all components. 2. If the displayed system state of the controller module does not match your system configuration, set the state for the controller module: ha-config modify controller ha-state The value for HA-state can be one of the following: ◦...
  • Page 280 Step 1: Recable the system Recable the controller module’s storage and network connections. Steps 1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c.
  • Page 281 You can respond when prompted to continue into advanced mode. The advanced mode prompt appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for the savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
  • Page 282 node1> storage disk show -ownership Disk Aggregate Home Owner DR Home Home ID Owner ID DR Home ID Reserver Pool ----- ------ ----- ------ -------- ------- ------- ------- --------- 1.0.0 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0 1.0.1 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0 8.
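Checking that every disk row carries the new system ID can be sketched as follows. This assumes each disk row of the `storage disk show -ownership` output is available as one line of text and that system IDs are the only runs of six or more digits in a row; both are assumptions for illustration, and the function name is hypothetical.

```python
import re
from typing import List


def rows_with_unexpected_ids(rows: List[str], expected_id: str) -> List[str]:
    """Return the rows whose system IDs are missing or differ from expected_id."""
    bad = []
    for row in rows:
        ids = re.findall(r"\b\d{6,}\b", row)
        if not ids or any(system_id != expected_id for system_id in ids):
            bad.append(row)
    return bad


# Rows copied from the example output above.
rows = [
    "1.0.0 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0",
    "1.0.1 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0",
]
print(rows_with_unexpected_ids(rows, "1873775277"))
```

An empty list means that every row shows only the expected system ID.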
  • Page 283 -node replacement-node-name -onreboot true Complete system restoration - AFF A250 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 284 -node local -auto-giveback true Step 3: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 285 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 286 Make sure that you label the cables so that you know where they came from. 1. If you are not already grounded, properly ground yourself. 2. Unplug the controller module power supplies from the source. 3. Release the power cable retainers, and then unplug the cables from the power supplies. 4.
  • Page 287 Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 288 Step 3: Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct and then replace it following the specific sequence of steps. Use the following video or the tabulated steps to replace a DIMM: Animation - Replace a DIMM 1.
  • Page 289 2. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation. 3. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot. 4.
  • Page 290 2. Close the controller module cover and tighten the thumbscrew.
  • Page 291 Controller module cover Thumbscrew 3. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
  • Page 292 Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace SSD Drive or HDD Drive - AFF A250 You can replace a failed drive nondisruptively while I/O is in progress.
  • Page 293 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 294 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 295 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 296 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 297 If the impaired controller is displaying the system prompt or password prompt, then take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Step 2: Remove the controller module You must remove the controller module from the chassis when you replace a component inside the controller module.
  • Page 298 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover. Thumbscrew Controller module cover Step 3: Replace a fan To replace a fan, remove the failed fan module and replace it with a new fan module. Use the following video or the tabulated steps to replace a fan: Animation - Replace a fan 1.
  • Page 299 Fan module 3. Align the edges of the replacement fan module with the opening in the controller module, and then slide the replacement fan module into the controller module. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
  • Page 300 Controller module cover Thumbscrew 2. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
  • Page 301 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 302 3. Take the impaired controller to the LOADER prompt: If the impaired controller is displaying the LOADER prompt, go to Remove controller module. If it is displaying Waiting for giveback…, press Ctrl-C, and then respond y when prompted. If it is displaying the system prompt or password prompt, take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
  • Page 303 Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
  • Page 304 Remove screws on the face of the controller module. Loosen the screw in the controller module. Remove the mezzanine card. a. Unplug any cabling associated with the impaired mezzanine card. Make sure that you label the cables so that you know where they came from. b.
  • Page 305 Do not apply force when tightening the screw on the mezzanine card; you might crack it. i. Insert any SFP or QSFP modules that were removed from the impaired mezzanine card to the replacement mezzanine card. 3. To install a mezzanine card: 4.
  • Page 306 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 307 name, quorum status of that node, availability status of that node, and operational status of that node. Each SCSI-blade process should be in quorum with the other nodes in the cluster. Any issues must be resolved before you proceed with the replacement. •...
  • Page 308 If you have difficulty removing the controller module, place your index fingers through the finger holes from the inside (by crossing your arms). Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface.
  • Page 309 Thumbscrew Controller module cover. Step 3: Replace the NVMEM battery To replace the NVMEM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module. Use the following video or the tabulated steps to replace the NVMEM battery: Animation - Replace the NVMEM battery 1.
  • Page 310 Squeeze the clip on the face of the battery plug. Unplug the battery cable from the socket. Grasp the battery and press the blue locking tab marked PUSH. Lift the battery out of the holder and controller module. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket.
  • Page 311 Step 4: Install the controller module After you have replaced the component in the controller module, you must reinstall the controller module into the chassis, and then boot it to Maintenance mode. You can use the following illustration or the written steps to install the replacement controller module in the chassis.
  • Page 312 ◦ If the scan reported no failures, select Reboot from the menu to reboot the system. Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 313 • Power supplies are auto-ranging. Do not mix PSUs with different efficiency ratings. Always replace like for like. Use the appropriate procedure for your type of PSU; AC or DC.
  • Page 314 Option 1: Replace an AC PSU Use the following video or the tabulated steps to replace the PSU: Animation - Replace the AC PSU 1. If you are not already grounded, properly ground yourself. 2. Identify the PSU you want to replace, based on console error messages or through the red Fault LED on the PSU.
  • Page 315 Secure the power cable to the PSU using the power cable retainer. Once power is restored to the PSU, the status LED should be green. 7. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 316 Secure the power cable to the PSU with the thumbscrews. Once power is restored to the PSU, the status LED should be green. 7. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 317 name, quorum status of that node, availability status of that node, and operational status of that node. Each SCSI-blade process should be in quorum with the other nodes in the cluster. Any issues must be resolved before you proceed with the replacement. •...
  • Page 318 If you have difficulty removing the controller module, place your index fingers through the finger holes from the inside (by crossing your arms). Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface.
  • Page 319 Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 320 Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. Use the following video or the tabulated steps to replace the RTC battery: Animation - Replace the RTC battery 1.
  • Page 321 Gently pull the tab away from the battery housing. Attention: Pulling it away aggressively might displace the tab. Lift the battery up. Note: Make a note of the polarity of the battery. The battery should eject out. 2.
  • Page 322 With positive polarity face up, slide the battery under the tab of the battery housing. Push the battery gently into place and make sure the tab secures it to the housing. Pushing it in aggressively might cause the battery to eject out again. 4.
  • Page 323: Aff A400 System Documentation

    -node local -auto-giveback true Step 5: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 324 The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. Detailed guide - AFF A400 This guide gives detailed step-by-step instructions for installing a typical NetApp system. Use this guide if you want more detailed installation instructions.
  • Page 325 You might also want to have access to the Release Notes for your version of ONTAP for more information about this system. NetApp Hardware Universe Find the Release Notes for your version of ONTAP 9 You need to provide the following at your site: •...
  • Page 326 Power cables Not applicable Powering up the system 4. Review the NetApp ONTAP Configuration Guide and collect the required information listed in that guide. ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable.
  • Page 327 A400 and FAS8300/8700), and then look for the card, by part number, in the NetApp Hardware Universe for a graphic of the bezel which will show the port labels. The card part number can be found using the sysconfig -a command or on the system packing list.
  • Page 328 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Animation - Two-node switchless cluster cabling 2.
  • Page 329 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Animation - Switched cluster cabling 2.
  • Page 330 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following animation or illustration to cable your controllers to a single drive shelf. Animation - Cable the controllers to one NS224 drive shelf 2.
  • Page 331 Animation - Cable the controllers to one NS224 drive shelf 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Option 3: Cable the controllers to SAS drive shelves You must cable each controller to the IOM modules on both SAS drive shelves. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation.
  • Page 332 Steps 1. Use the following illustration to cable your controllers to two drive shelves. Animation - Cable the controllers to SAS drive shelves 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 333 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 6. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide.
  • Page 334 Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 8. Verify the health of your system by running Config Advisor. 9. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
  • Page 335 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 6. Set up your account and download Active IQ Config Advisor: a.
  • Page 336 ◦ If the impaired controller is in a standalone configuration and at LOADER prompt, contact mysupport.netapp.com. 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
  • Page 337 Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 338 Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 339 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 340 impaired controller. Shut down or take over the impaired controller using the appropriate procedure for your configuration. Option 1: Most configurations After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Steps a. Take the impaired controller to the LOADER prompt: If the impaired controller Then…...
  • Page 341 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 342 command from the surviving cluster. controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
  • Page 343 mcc1A::> metrocluster operation show   Operation: heal-root-aggregates   State: successful  Start Time: 7/29/2016 20:54:41   End Time: 7/29/2016 20:54:42   Errors: - 8. On the impaired controller module, disconnect the power supplies. Replace the boot media - AFF A400 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
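The heal-phase check shown in the `metrocluster operation show` output can also be done on captured console text. The following is a minimal Python sketch, assuming the output has been saved as text with one `Field: value` pair per line. The function names `parse_operation_show` and `heal_succeeded` are hypothetical helpers for illustration and are not part of ONTAP.

```python
def parse_operation_show(text):
    """Parse 'Field: value' lines from captured console text into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the echoed prompt line (for example "mcc1A::> ...").
        if not line or "::>" in line:
            continue
        # Split on the first colon only, so time values keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(text, phase="heal-root-aggregates"):
    """Return True only if the named operation reports State: successful."""
    fields = parse_operation_show(text)
    return fields.get("Operation") == phase and fields.get("State") == "successful"


# Sample text copied from the example output in this procedure.
sample = """mcc1A::> metrocluster operation show
  Operation: heal-root-aggregates
      State: successful
 Start Time: 7/29/2016 20:54:41
   End Time: 7/29/2016 20:54:42
     Errors: -
"""

if __name__ == "__main__":
    print(heal_succeeded(sample))
```

A check like this only reads text you have already collected; it does not replace reviewing the Errors field yourself before disconnecting the power supplies.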
  • Page 344 NetApp Support Site. You must log into the NetApp Support Site to display the Statement of Volatility for your system. You can use the following animation, illustration, or the written steps to replace the boot media.
  • Page 345 Locking tabs Slide air duct toward back of controller Rotate air duct up a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
  • Page 346 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 347 Steps 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive. a. Download the service image to your work space on your laptop. b. Unzip the service image.
  • Page 348 c. Rotate the locking latches upward, tilting them so that they clear the locking pins, and then lower them into the locked position. d. If you have not already done so, reinstall the cable management device. 8. Interrupt the boot process by pressing Ctrl-C to stop at the LOADER prompt. If you miss this message, press Ctrl-C, select the option to boot to Maintenance mode, and then halt the controller to boot to LOADER.
  • Page 349 If your system has a network connection, then: a. Press y when prompted to restore the backup configuration. b. Set the healthy controller to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
  • Page 350 8. Give back the controller using the storage failover giveback -fromnode local command. 9. At the cluster prompt, check the logical interfaces with the net int -is-home false command. If any interfaces are listed as "false", revert those interfaces back to their home port using the net int revert command.
  • Page 351 This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 352 Restore OKM, NSE, and NVE as needed - AFF A400 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 353 --------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to Waiting for giveback… prompt. 8. Move the console cable to the partner controller and login as "admin". 9.
  • Page 354 c. Enter the security key-manager key query command to see a detailed view of all keys stored in the onboard key manager and verify that the Restored column = yes/true for all authentication keys. If the Restored column = anything other than yes/true, contact Customer Support. d.
  • Page 355 -node local -auto-giveback true command. Return the failed part to NetApp - AFF A400 Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
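The Restored-column check can be expressed as a small filter. The following is a minimal Python sketch, assuming the rows of the key query output have already been transcribed into dictionaries. The field names `key_id` and `restored`, and the function name `keys_not_restored`, are hypothetical and do not reflect the actual ONTAP output format.

```python
def keys_not_restored(rows):
    """Return the key IDs whose Restored value is anything other than yes/true."""
    accepted = {"yes", "true"}
    flagged = []
    for row in rows:
        # Normalize case and whitespace so "Yes " and "TRUE" both count as restored.
        value = str(row.get("restored", "")).strip().lower()
        if value not in accepted:
            flagged.append(row["key_id"])
    return flagged


if __name__ == "__main__":
    # Hypothetical transcribed rows; real key IDs are long hexadecimal strings.
    rows = [
        {"key_id": "key-1", "restored": "yes"},
        {"key_id": "key-2", "restored": "true"},
        {"key_id": "key-3", "restored": "no"},
    ]
    print(keys_not_restored(rows))
```

An empty result corresponds to the case where every authentication key is restored; any ID returned is a reason to contact Customer Support, as the step states.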
  • Page 356 • Suspend external backup jobs. • Necessary tools and equipment for the replacement. If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the Gracefully shutdown and power up your storage system Resolution Guide after performing this procedure.
  • Page 357 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 358 Steps 1. Check the MetroCluster status to determine whether the impaired controller has automatically switched over to the healthy controller: metrocluster show 2. Depending on whether an automatic switchover has occurred, proceed according to the following table: If the impaired controller… Then…...
  • Page 359 controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::>...
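The aggregate state check can be applied to a captured row of `storage aggregate show` output. The following is a minimal Python sketch, assuming each aggregate appears on a single line with the eight columns shown in the header above (real console output may wrap the RAID Status column across lines). The function name `parse_aggregate_row` is a hypothetical helper.

```python
def parse_aggregate_row(row):
    """Split one single-line aggregate row into its eight named columns."""
    # Split on whitespace at most 7 times so the RAID Status text,
    # which itself contains spaces, stays together as the last column.
    parts = row.split(None, 7)
    if len(parts) < 8:
        raise ValueError("expected 8 columns, got %d" % len(parts))
    name, size, available, used, state, vols, node, raid_status = parts
    return {
        "aggregate": name,
        "size": size,
        "available": available,
        "used": used,
        "state": state,
        "vols": int(vols),
        "node": node,
        "raid_status": raid_status.strip(),
    }


if __name__ == "__main__":
    # Row copied from the example output in this procedure.
    row = "aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal"
    info = parse_aggregate_row(row)
    print(info["state"], "|", info["raid_status"])
```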
  • Page 360 Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized. 4. Remove and set aside the cable management devices from the left and right sides of the controller module. 5.
  • Page 361 2. With two people, slide the old chassis off the rack rails in a system cabinet or equipment rack, and then set it aside. 3. If you are not already grounded, properly ground yourself. 4. Using two people, install the replacement chassis into the equipment rack or system cabinet by guiding the chassis onto the rack rails in a system cabinet or equipment rack.
  • Page 362 Complete the restoration and replacement process - AFF A400 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 363 4. Select Test system from the displayed menu to run diagnostics tests. 5. Select the test or series of tests from the various sub-menus. 6. Proceed based on the result of the preceding step: ◦ If the test failed, correct the failure, and then rerun the test. ◦...
  • Page 364 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 365 • It is important that you apply the commands in these steps on the correct systems: ◦ The impaired controller is the controller that is being replaced. ◦ The replacement node is the new controller that is replacing the impaired controller. ◦...
  • Page 366 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 367 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 368 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 369 Replace the controller module hardware - AFF A400 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode. Step 1: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis.
  • Page 370 7. Place the controller module on a stable, flat surface. 8. On the replacement controller module, open the air duct and remove the empty risers from the controller module using the animation, illustration, or the written steps: Animation - Remove the empty risers from the replacement controller module a.
  • Page 371 a. Rotate the cam handle so that it can be used to pull the power supply out of the chassis. b. Press the blue locking tab to release the power supply from the chassis. c. Using both hands, pull the power supply out of the chassis, and then set it aside. 2.
  • Page 372 1. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
  • Page 373 1. Locate and remove the boot media from the controller module: a. Press the blue button at the end of the boot media until the lip on the boot media clears the blue button. b. Rotate the boot media up and gently pull the boot media out of the socket. 2.
  • Page 374 1. Move PCIe risers one and two from the impaired controller module to the replacement controller module: a. Remove any SFP or QSFP modules that might be in the PCIe cards. b. Rotate the riser locking latch on the left side of the riser up and toward air duct. The riser raises up slightly from the controller module.
  • Page 375 You can use the following animation, illustration, or the written steps to move the DIMMs from the impaired controller module to the replacement controller module. Animation - Move the DIMMs 1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
  • Page 376 d. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the socket. e. Repeat these substeps for the remaining DIMMs. 5. Plug the NVDIMM battery into the motherboard. Make sure that the plug locks down onto the controller module. Step 7: Install the controller module After all of the components have been moved from the impaired controller module to the replacement controller module, you must install the replacement controller module into the chassis, and then boot it to Maintenance...
  • Page 377 b. Using the locking latches, firmly push the controller module into the chassis until the locking latches begin to rise. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors. c. Fully seat the controller module in the chassis by rotating the locking latches upward, tilting them so that they clear the locking pins, gently push the controller all the way in, and then lower the locking latches into the locked position.
  • Page 378 The date and time are given in GMT. 4. If necessary, set the date in GMT on the replacement node: set date mm/dd/yyyy 5. If necessary, set the time in GMT on the replacement node: set time hh:mm:ss 6. At the LOADER prompt, confirm the date and time on the replacement node: date The date and time are given in GMT.
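The `set date` and `set time` commands expect GMT values in the `mm/dd/yyyy` and `hh:mm:ss` formats given in the steps above. The following is a minimal Python sketch that formats those two command strings from a timezone-aware timestamp. The function name `loader_clock_commands` is hypothetical, and the sketch only builds text to type at the LOADER prompt; it does not talk to the controller.

```python
from datetime import datetime, timedelta, timezone


def loader_clock_commands(now=None):
    """Return the 'set date' and 'set time' command strings in GMT.

    Pass a timezone-aware datetime; if omitted, the current UTC time is used.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    else:
        # Convert any aware timestamp to GMT (UTC) before formatting.
        now = now.astimezone(timezone.utc)
    return [
        now.strftime("set date %m/%d/%Y"),
        now.strftime("set time %H:%M:%S"),
    ]


if __name__ == "__main__":
    # A reading taken five hours behind GMT rolls over to the next GMT day.
    local = datetime(2023, 6, 30, 20, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
    for command in loader_clock_commands(local):
        print(command)
```

Converting to GMT before formatting matters near midnight, when the local date and the GMT date differ.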
  • Page 379 3. Select Scan System from the displayed menu to enable running the diagnostics tests. 4. Select Test system from the displayed menu to run diagnostics tests. 5. Select the test or series of tests from the various sub-menus. 6. Proceed based on the result of the preceding step: ◦...
  • Page 380 In the command output, you should see a message that the system ID has changed on the impaired controller, showing the correct old and new IDs. In the following example, node2 has undergone replacement and has a new system ID of 151759706. node1>...
  • Page 381 b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is possible: storage failover show The output from the storage failover show command should not include the System ID changed on partner message. 7.
  • Page 382 -node replacement-node-name -onreboot true Complete system restoration - AFF A400 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 383 If any LIFs are listed as false, revert them to their home ports: network interface revert -vserver * -lif * 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
  • Page 384 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 385 Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace a DIMM - AFF A400 You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC);...
  • Page 386 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 387 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 388 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 389 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animation, illustration, or the written steps to remove the controller module from the chassis.
  • Page 390 Step 3: Replace system DIMMs Replacing a system DIMM involves identifying the target DIMM through the associated error message, locating the target DIMM using the FRU map on the air duct, and then replacing the DIMM. You can use the following animation, illustration, or the written steps to replace a system DIMM. The animation and illustration show empty slots for sockets without DIMMs.
  • Page 391 b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position. 2. Locate the DIMMs on your controller module. 3. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.
  • Page 392 1. If you have not already done so, close the air duct. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 393 f. At the LOADER prompt, enter bye to reinitialize the PCIe cards and other components. g. Interrupt the boot process and boot to the LOADER prompt by pressing Ctrl-C. If your system stops at the boot menu, select the option to boot to LOADER. Step 5: Run diagnostics After you have replaced a system DIMM in your system, you should run diagnostic tests on that component.
  • Page 394 This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 395 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 396 The Attention LED should not be lit after the fan is seated and has spun up to operational speed. 10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 11. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 397 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 398 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 399 controller_A_1::> metrocluster operation show Operation: heal-aggregates State: successful Start Time: 7/25/2016 18:45:55 End Time: 7/25/2016 18:45:56 Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
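The healing check above comes down to reading the State field of the metrocluster operation show output. The following is a minimal Python sketch of that check, assuming the console prints one "Key: value" pair per line (the page summary flattens them onto one line). The function names and the SAMPLE text layout are illustrative assumptions, not part of any NetApp tool.

```python
def parse_operation_show(output: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dict, splitting on the first colon only."""
    fields = {}
    for line in output.splitlines():
        if "::>" in line:  # skip the echoed CLI prompt line, if captured
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(output: str, phase: str = "heal-aggregates") -> bool:
    """Return True when the named operation is reported as successful."""
    fields = parse_operation_show(output)
    return fields.get("Operation") == phase and fields.get("State") == "successful"


# Assumed line layout; the values are the ones shown in the excerpt above.
SAMPLE = """\
controller_A_1::> metrocluster operation show
  Operation: heal-aggregates
      State: successful
 Start Time: 7/25/2016 18:45:55
   End Time: 7/25/2016 18:45:56
     Errors: -
"""

if __name__ == "__main__":
    print(heal_succeeded(SAMPLE))
```

Splitting on the first colon only keeps timestamps such as 18:45:55 intact in the value.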
  • Page 400 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animations, illustration, or the written steps to remove the controller module from the chassis.
  • Page 401 Step 3: Replace the NVDIMM battery To replace the NVDIMM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module. See the FRU map inside the controller module to locate the NVDIMM battery.
  • Page 402 1. If you have not already done so, close the air duct. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 403 f. At the LOADER prompt, enter bye to reinitialize the PCIe cards and other components. g. Interrupt the boot process and boot to the LOADER prompt by pressing Ctrl-C. If your system stops at the boot menu, select the option to boot to LOADER. Step 5: Run diagnostics After you have replaced a component in your system, you should run diagnostic tests on that component.
  • Page 404 Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
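The node check above can be sketched in Python as a small table parser. This is an illustration only, under several assumptions: that the console output keeps its column layout (the summary flattens it), that node rows are indented to the start of the Node column under the dashed rule, and that "enabled" refers to the Mirroring column, which is where that word appears in the sample row. The controller_B_1 row and its mode text are hypothetical, because the excerpt truncates the second row.

```python
import re


def parse_node_rows(output: str) -> list[dict[str, str]]:
    """Extract node rows from column-aligned 'metrocluster node show' text."""
    lines = output.splitlines()
    # The dashed rule under the header marks where each column starts.
    rule = next(l for l in lines if l.strip() and set(l.strip()) <= set("- "))
    node_col = [m.start() for m in re.finditer(r"-+", rule)][2]
    rows = []
    for line in lines[lines.index(rule) + 1:]:
        indent = len(line) - len(line.lstrip())
        if not line.strip() or indent != node_col:
            continue  # group/cluster lines and footers are not node rows
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        rows.append({
            "node": parts[0],
            "state": parts[1],
            "mirroring": parts[2],
            "mode": parts[3].strip() if len(parts) > 3 else "",
        })
    return rows


def all_mirroring_enabled(rows: list[dict[str, str]]) -> bool:
    """True only if at least one node was parsed and every node shows 'enabled'."""
    return bool(rows) and all(r["mirroring"] == "enabled" for r in rows)


# Assumed layout; the second node row is hypothetical.
SAMPLE = "\n".join([
    "Group Cluster Node           State          Mirroring Mode",
    "----- ------- -------------- -------------- --------- --------------------",
    "1     cluster_A",
    " " * 14 + "controller_A_1 configured     enabled   heal roots completed",
    "      cluster_B",
    " " * 14 + "controller_B_1 configured     enabled   waiting for switchback recovery",
])

if __name__ == "__main__":
    for row in parse_node_rows(SAMPLE):
        print(row["node"], row["mirroring"])
```

Returning False for an empty row list avoids reporting success when the output could not be parsed at all.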
  • Page 405 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 406 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 407 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 408 controller_A_1::> metrocluster operation show Operation: heal-aggregates State: successful Start Time: 7/25/2016 18:45:55 End Time: 7/25/2016 18:45:56 Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 409 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animations, illustration, or the written steps to remove the controller module from the chassis.
  • Page 410 Statement of Volatility on the NetApp Support Site. You must log into the NetApp Support Site to display the Statement of Volatility for your system. You can use the following animation, illustration, or the written steps to replace the NVDIMM.
  • Page 411 Carefully hold the NVDIMM by the edges to avoid pressure on the components on the NVDIMM circuit board. 3. Remove the replacement NVDIMM from the antistatic shipping bag, hold the NVDIMM by the corners, and then align it to the slot. The notch among the pins on the NVDIMM should line up with the tab in the socket.
  • Page 412 Do not completely insert the controller module in the chassis until instructed to do so. 3. Cable the management and console ports only, so that you can access the system to perform the tasks in the following sections. You will connect the rest of the cables to the controller module later in this procedure. 4.
  • Page 413 3. Select Scan System from the displayed menu to enable running the diagnostics tests. 4. Select Test Memory from the displayed menu. 5. Select NVDIMM Test from the displayed menu. 6. Proceed based on the result of the preceding step: ◦...
  • Page 414 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 415 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 416 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 417 controller_A_1::> metrocluster operation show Operation: heal-aggregates State: successful Start Time: 7/25/2016 18:45:55 End Time: 7/25/2016 18:45:56 Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 418 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animations, illustration, or the written steps to remove the controller module from the chassis.
  • Page 419 Step 3: Replace a PCIe card To replace a PCIe card, you must locate the failed PCIe card, remove the riser that contains the card from the controller module, replace the card, and then reinstall the PCIe riser in the controller module. You can use the following animation, illustration, or the written steps to replace a PCIe card.
  • Page 420 If you are installing a card in the bottom slot and cannot see the card socket well, remove the top card so that you can see the card socket, install the card, and then reinstall the card you removed from the top slot. 4.
  • Page 421 the socket. d. Tighten the thumbscrews on the mezzanine card. 3. Reinstall the riser: a. Align the riser with the pins to the side of the riser socket, and then lower the riser down on the pins. b. Push the riser squarely into the socket on the motherboard. c.
  • Page 422 b. Using the locking latches, firmly push the controller module into the chassis until it meets the midplane and is fully seated. The locking latches rise when the controller module is fully seated. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 423 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 424 Step 8: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replacing a power supply - AFF A400...
  • Page 425 Secure the power cable to the power supply using the power cable retainer. Once power is restored to the power supply, the status LED should be green. 8. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 426 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 427 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 428 controller_A_1::> metrocluster operation show Operation: heal-aggregates State: successful Start Time: 7/25/2016 18:45:55 End Time: 7/25/2016 18:45:56 Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 429 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animations, illustration, or the written steps to remove the controller module from the chassis.
  • Page 430 Step 3: Replace the RTC battery You need to locate the RTC battery inside the controller module, and then follow the specific sequence of steps. See the FRU map inside the controller module for the location of the RTC battery. You can use the following animation, illustration, or the written steps to replace the RTC battery.
  • Page 431 You can use the following animation, illustration, or the written steps to install the controller module in the chassis. Animation - Install the controller module 1. If you have not already done so, close the air duct or controller module cover. 2.
  • Page 432 d. Interrupt the normal boot process and boot to LOADER by pressing Ctrl-C. If your system stops at the boot menu, select the option to boot to LOADER. 6. Reset the time and date on the controller: a. Check the date and time on the healthy controller with the command.
  • Page 433: Aff A700 System Documentation

    6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 434 Step 1: Prepare for installation To install your system, you need to create an account on the NetApp Support Site, register your system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 435 You might also want to have access to the Release Notes for your version of ONTAP for more information about this system. NetApp Hardware Universe Find the Release Notes for your version of ONTAP 9 You need to provide the following at your site: •...
  • Page 436 Power cables Not applicable Powering up the system 4. Review the NetApp ONTAP Configuration Guide and collect the required information listed in that guide. ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable.
  • Page 437 1. Attach cable management devices (as shown). 2. Place the bezel on the front of the system. Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network.
  • Page 438 1. Go to Step 4: Cable controllers to drive shelves for drive shelf cabling instructions. Option 2: Switched cluster Management network, data network, and management ports on the controllers are connected to switches. The cluster interconnect and HA ports are cabled to the cluster/HA switch. You must have contacted your network administrator for information about connecting the system to the switches.
  • Page 439 1. Go to Step 4: Cable controllers to drive shelves for drive shelf cabling instructions. Step 4: Cable controllers to drive shelves You can cable your new system to DS212C, DS224C, or NS224 shelves, depending on whether it is an AFF or FAS system.
  • Page 440 The examples use DS224C shelves. Cabling is similar with other supported SAS drive shelves. ◦ Cabling SAS shelves in FAS9000, AFF A700, and ASA AFF A700, ONTAP 9.7 and earlier: Animation - Cable SAS storage - ONTAP 9.7 and earlier ◦...
  • Page 441 If you have more than one drive shelf stack, see the Installation and Cabling Guide for your drive shelf type. Install and cable shelves for a new system installation - shelves with IOM12 modules...
  • Page 442 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Option 2: Cable the controllers to a single NS224 drive shelf in AFF A700 and ASA AFF A700 systems running ONTAP 9.8 and later only You must cable each controller to the NSM modules on the NS224 drive shelf on an AFF A700 or ASA AFF A700 system running ONTAP 9.8 or later.
  • Page 443 • The systems must have at least one X91148A module installed in slots 3 and/or 7 for each controller. The animation or illustrations show this module installed in both slots 3 and 7. • Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. The cable pull-tabs for the storage modules are up, while the pull-tabs on the shelves are down.
  • Page 444 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Option 3: Cable the controllers to two NS224 drive shelves in AFF A700 and ASA AFF A700 systems running ONTAP 9.8 and later only You must cable each controller to the NSM modules on the NS224 drive shelves on an AFF A700 or ASA AFF A700 system running ONTAP 9.8 or later.
  • Page 445 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following animation or illustrations to cable your controllers to two NS224 drive shelves. Animation - Cable two NS224 shelves - ONTAP 9.8 and later...
  • Page 446 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 447 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 7. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide.
  • Page 448 Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 9. Verify the health of your system by running Config Advisor. 10. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
  • Page 449 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 7. Set up your account and download Active IQ Config Advisor: a.
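As a small illustration of the https://x.x.x.x address format mentioned above, the following Python sketch validates a dotted-quad IPv4 address and builds the URL to open in a browser. The function name and the example address (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration, not values from this guide.

```python
import ipaddress


def management_url(address: str) -> str:
    """Build the https://x.x.x.x URL for a node management IPv4 address.

    Raises ValueError if the text is not a valid IPv4 address.
    """
    ip = ipaddress.IPv4Address(address.strip())
    return f"https://{ip}"


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder from the documentation address range.
    print(management_url("192.0.2.10"))
```

Validating the address first catches typing mistakes, such as a missing octet, before you try to reach the node.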
  • Page 450 ◦ If the impaired controller is in a standalone configuration and at LOADER prompt, contact mysupport.netapp.com. 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
  • Page 451 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 452 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 453 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 454 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 455 Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 456 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 457 Option 1: Most systems After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Steps a. Take the impaired controller to the LOADER prompt: If the impaired controller displays… Then… The LOADER prompt: Go to Remove controller module.
  • Page 458 3. Take the impaired controller to the LOADER prompt: If the impaired controller is displaying… Then… The LOADER prompt: Go to Remove controller module. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
  • Page 459 If the impaired controller is displaying… Then… Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 460 Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
  • Page 461 Controller module cover locking button Step 2: Replace the boot media Locate the boot media using the following illustration or the FRU map on the controller module:...
  • Page 462 • A copy of the same image version of ONTAP that the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site. ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 463 The node begins to boot as soon as it is completely installed into the chassis. 5. Interrupt the boot process to stop at the LOADER prompt by pressing Ctrl-C when you see Starting AUTOBOOT press Ctrl-C to abort…. If you miss this message, press Ctrl-C, select the option to boot to Maintenance mode, and then halt the node to boot to LOADER.
  • Page 464 This procedure applies to systems that are not in a two-node MetroCluster configuration. Steps 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive. 2. When prompted, either enter the name of the image or accept the default image displayed inside the brackets on your screen.
  • Page 465 If your system has… Then… No network connection and is in a MetroCluster IP configuration: a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
  • Page 466 ◦ If your system does not have onboard keymanager, NSE or NVE configured, complete the steps in this section. 6. From the LOADER prompt, enter the boot_ontap command. If you see… Then… The login prompt: Go to the next step. Waiting for giveback…...
  • Page 467 b. Check the environment variable settings with the printenv command. c. If an environment variable is not set as expected, modify it with the setenv environment-variable-name changed-value command. d. Save your changes using the savenv command. e. Reboot the node. Switch back aggregates in a two-node MetroCluster configuration - AFF A700 and FAS9000 After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation.
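The printenv, setenv, and savenv sequence above can be planned ahead of time by comparing the values you expect against the values the controller reports. The following Python sketch only builds the command strings to type at the LOADER prompt; it does not talk to a controller. The variable names in the example are hypothetical placeholders, and the helper name is an assumption for illustration.

```python
def plan_env_fix(expected: dict[str, str], actual: dict[str, str]) -> list[str]:
    """List the LOADER commands needed to bring 'actual' in line with 'expected'.

    Emits one 'setenv name value' per mismatched or missing variable,
    followed by 'savenv' only when at least one change is required.
    """
    commands = [
        f"setenv {name} {value}"
        for name, value in expected.items()
        if actual.get(name) != value
    ]
    if commands:
        commands.append("savenv")
    return commands


if __name__ == "__main__":
    # Hypothetical variable names and values, for illustration only.
    expected = {"example-var-a": "true", "example-var-b": "1"}
    actual = {"example-var-a": "true", "example-var-b": "0"}
    for command in plan_env_fix(expected, actual):
        print(command)
```

Emitting savenv only when something changed mirrors the written steps, where saving follows a modification.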
  • Page 468 Restore OKM, NSE, and NVE as needed - AFF A700 and FAS9000 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled.
  • Page 469 If the console displays… Then… The LOADER prompt: Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…: a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this controller rather than wait [y/n]?, enter: y c.
  • Page 470 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 471 If giveback is not complete after 20 minutes, contact Customer Support. 18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home controller and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int command.
  • Page 472 This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 473 If the console displays… Then… The login prompt: Go to Step 7. Waiting for giveback…: a. Log into the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 474 -node * -type all -message MAINT=END Return the failed part to NetApp - AFF A700 and FAS9000 Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 475 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 476 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 477 controller_A_1::> metrocluster operation show Operation: heal-aggregates State: successful Start Time: 7/25/2016 18:45:55 End Time: 7/25/2016 18:45:56 Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 478 Step 2: Replace or add a caching module The NVMe SSD Flash Cache modules (FlashCache or caching modules) are separate modules. They are located in the front of the NVRAM module. To replace or add a caching module, locate it on the rear of the system on slot 6, and then follow the specific sequence of steps to replace it.
  • Page 479 Orange release button. Caching module cam handle. a. Press the orange release button on the front of the caching module. Do not use the numbered and lettered I/O cam latch to eject the caching module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the caching module.
  • Page 480 Orange release button. Core dump module cam handle. a. Locate the failed module by the amber Attention LED on the front of the module. b. Press the orange release button on the front of the core dump module. Do not use the numbered and lettered I/O cam latch to eject the core dump module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the core dump module.
  • Page 481 Step 4: Reboot the controller after FRU replacement After you replace the FRU, you must reboot the controller module. Step 1. To boot ONTAP from the LOADER prompt, enter bye. Step 5: Switch back aggregates in a two-node MetroCluster configuration After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation.
  • Page 482 6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 483 3. Prepare the caching module slot for replacement as follows: a. For ONTAP 9.7 and earlier: i. Record the caching module capacity, part number, and serial number on the target node: system node run local sysconfig -av 6 ii. In admin privilege level, prepare the target NVMe slot for replacement, responding y when prompted whether to continue: system controller slot module replace -node...
  • Page 484 The NVMe slot status displays powered-off in the screen output for the caching module that needs replacing. See the Command man pages for your version of ONTAP for more details. 4. Remove the caching module: Orange release button. Caching module cam handle. a.
  • Page 485 If you replace the caching module with a caching module from a different vendor, the new vendor name is displayed in the command output. 9. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 486 • Suspend external backup jobs. • Necessary tools and equipment for the replacement. If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the Gracefully shutdown and power up your storage system Resolution Guide after performing this procedure.
  • Page 487 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 488 command from the surviving cluster. controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
  • Page 489 mcc1A::> metrocluster operation show   Operation: heal-root-aggregates   State: successful  Start Time: 7/29/2016 20:54:41   End Time: 7/29/2016 20:54:42   Errors: - 8. On the impaired controller module, disconnect the power supplies. Move and replace hardware - AFF A700 and FAS9000 Move the fans, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 490 Locking button 4. Repeat the preceding steps for any remaining power supplies. Step 2: Remove the fans To remove the fan modules when replacing the chassis, you must perform a specific sequence of tasks. Steps 1. Remove the bezel (if necessary) with two hands, by grasping the openings on each side of the bezel, and then pulling it toward you until the bezel releases from the ball studs on the chassis frame.
  • Page 491 Orange release button 3. Set the fan module aside. 4. Repeat the preceding steps for any remaining fan modules. Step 3: Remove the controller module To replace the chassis, you must remove the controller module or modules from the old chassis. Steps 1.
  • Page 492 Cam handle release button Cam handle 3. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 4.
  • Page 493 1. Unplug any cabling associated with the target I/O module. Make sure that you label the cables so that you know where they came from. 2. Remove the target I/O module from the chassis: a. Depress the lettered and numbered cam button. The cam button moves away from the chassis.
  • Page 494 Step 5: Remove the De-stage Controller Power Module Steps You must remove the de-stage controller power modules from the old chassis in preparation for installing the replacement chassis. 1. Press the orange locking button on the module handle, and then slide the DCPM module out of the chassis.
  • Page 495 5. Slide the chassis all the way into the equipment rack or system cabinet. 6. Secure the front of the chassis to the equipment rack or system cabinet, using the screws you removed from the old chassis. 7. Secure the rear of the chassis to the equipment rack or system cabinet. 8.
  • Page 496 Step 10: Install I/O modules Steps To install I/O modules, including the NVRAM/FlashCache modules from the old chassis, follow the specific sequence of steps. You must have the chassis installed so that you can install the I/O modules into the corresponding slots in the new chassis.
  • Page 497 Complete the restoration and replacement process - AFF A700 and FAS9000 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 498 b. Confirm that the setting has changed: ha-config show 3. If you have not already done so, recable the rest of your system. 4. Exit Maintenance mode: halt The LOADER prompt appears. Step 2: Running system-level diagnostics After installing a new chassis, you should run interconnect diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics.
  • Page 499 If the system-level diagnostics tests… Then… Were completed without any failures: a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
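The verification in step b above is a comparison against one known response line. The following Python sketch shows that comparison, assuming you have captured only the text that sldiag device status printed, without the prompt or the echoed command. The function name is an illustrative assumption.

```python
# Default response quoted in the step above.
CLEARED_RESPONSE = "SLDIAG: No log messages are present."


def log_is_cleared(status_output: str) -> bool:
    """True when the captured output consists solely of the default response."""
    lines = [line.strip() for line in status_output.splitlines() if line.strip()]
    return lines == [CLEARED_RESPONSE]


if __name__ == "__main__":
    print(log_is_cleared("SLDIAG: No log messages are present.\n"))
```

The comparison is deliberately strict: any extra non-empty line is treated as a sign that log messages may still be present.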
  • Page 500 If the system-level diagnostics Then… tests… Resulted in some test failures Determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
  • Page 501 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 502 are required because the failure is restricted to an HA pair and storage failover commands can be used to provide nondisruptive operation during the replacement. • You must replace the failed component with a replacement FRU component you received from your provider.
  • Page 503 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 504 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 505 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
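The `metrocluster operation show` output above can also be checked programmatically. The following is a minimal Python sketch, not part of the NetApp procedure: it assumes the command output has already been captured as text, with one `Key: value` field per line as in the sample. The function names `parse_operation_show` and `heal_succeeded` are hypothetical.

```python
# Sample text modeled on the output shown in this procedure (line layout assumed).
SAMPLE = """controller_A_1::> metrocluster operation show
  Operation: heal-aggregates
      State: successful
 Start Time: 7/25/2016 18:45:55
   End Time: 7/25/2016 18:45:56
     Errors: -
"""


def parse_operation_show(output):
    """Return the 'Key: value' fields of the captured output as a dict."""
    fields = {}
    for line in output.splitlines():
        if "::>" in line:
            continue  # skip the CLI prompt line that echoes the command
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(output):
    """True when the reported operation is heal-aggregates and it was successful."""
    fields = parse_operation_show(output)
    return (fields.get("Operation") == "heal-aggregates"
            and fields.get("State") == "successful")


if __name__ == "__main__":
    print(parse_operation_show(SAMPLE))
    print(heal_succeeded(SAMPLE))
```

Splitting on the first colon only keeps timestamps such as `18:45:55` intact in the value.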
  • Page 506 Replace the controller module hardware - AFF A700 and FAS9000 To replace the controller module hardware, you must remove the impaired node, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode. Step 1: Remove the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module.
  • Page 507 Cam handle release button Cam handle 1. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 2.
  • Page 508 Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller and insert it in the new controller. Steps 1. Lift the black air duct at the back of the controller module and then locate the boot media using the following illustration or the FRU map on the controller module: Press release tab Boot media...
  • Page 509 Step 3: Move the system DIMMs To move the DIMMs, locate and move them from the old controller into the replacement controller and follow the specific sequence of steps. Steps 1. If you are not already grounded, properly ground yourself. 2.
  • Page 510 DIMM 5. Locate the slot where you are installing the DIMM. 6. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot. The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
  • Page 511 b. Firmly push the controller module into the chassis until it meets the midplane and is fully seated. The locking latches rise when the controller module is fully seated. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 512 Step 2: Verify and set the HA state of the controller module You must verify the state of the controller module and, if necessary, update the state to match your system configuration. Steps 1. In Maintenance mode from the new controller module, verify that all components display the same state: ha-config show The value for HA-state can be one of the following:...
  • Page 513 ◦ nvmem is a hybrid of NVRAM and system memory. ◦ is a Serial Attached SCSI device not connected to a disk shelf. 4. Run diagnostics as desired. If you want to run diagnostic Then… tests on… Individual components a. Clear the status logs: sldiag device clearstatus b.
  • Page 514 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 515 If the system-level diagnostics Then… tests… Were completed without any a. Clear the status logs: sldiag device clearstatus failures b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 516 If the system-level diagnostics Then… tests… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 517 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find.
  • Page 518 c. Wait for the `savecore` command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d. Return to the admin privilege level: set -privilege admin 5.
  • Page 519 Complete system restoration - AFF A700 and FAS9000 To complete the replacement procedure and restore your system to full operation, you must recable the storage, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
  • Page 520 If the node is in a MetroCluster configuration and all nodes at a site have been replaced, license keys must be installed on the replacement node or nodes prior to switchback. 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses.
  • Page 521 4. If automatic giveback was disabled, reenable it: storage failover modify -node local -auto-giveback true Step 3: (MetroCluster only): Switching back aggregates in a two-node MetroCluster configuration After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation.
  • Page 522 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 523 Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 524 All other components in the system must be functioning properly; if not, you must contact technical support. You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
  • Page 525 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 526 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 527 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 528 Step 2: Remove the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. Steps 1. If you are not already grounded, properly ground yourself. 2.
  • Page 529 Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5. Place the controller module lid-side up on a stable, flat surface, press the blue button on the cover, slide the cover to the back of the controller module, and then swing the cover up and lift it off of the controller module.
  • Page 530 1. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  • Page 531 DIMM ejector tabs DIMM 2. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 3.
  • Page 532 Step 4: Install the controller After you install the components into the controller module, you must install the controller module back into the system chassis and boot the operating system. For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
  • Page 533 b. After the node boots to Maintenance mode, halt the node: halt After you issue the command, you should wait until the system stops at the LOADER prompt. During the boot process, you can safely respond to prompts: ▪ A prompt warning that when entering Maintenance mode in an HA configuration, you must ensure that the healthy node remains down.
  • Page 534 If the system-level diagnostics Then… tests… A two-node MetroCluster Proceed to the next step. configuration The MetroCluster switchback procedure is done in the next task in the replacement process. A stand-alone configuration Proceed to the next step. No action is required. You have completed system-level diagnostics.
  • Page 535 Step 6: Switch back aggregates in a two-node MetroCluster configuration After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
  • Page 536 6. Reestablish any SnapMirror or SnapVault configurations. Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 537 7. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 8. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 538 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 539 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 540 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 541 Step 2: Replace I/O modules To replace an I/O module, locate it within the chassis and follow the specific sequence of steps. Steps 1. If you are not already grounded, properly ground yourself. 2. Unplug any cabling associated with the target I/O module. Make sure that you label the cables so that you know where they came from.
  • Page 542 4. Set the I/O module aside. 5. Install the replacement I/O module into the chassis by gently sliding the I/O module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin, and then push the I/O cam latch all the way up to lock the module in place.
  • Page 543 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 544 Step 5: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace an LED USB module - AFF A700 and FAS9000 You can replace an LED USB module without interrupting service.
  • Page 545 There is an audible click when the module is secure and connected to the midplane. Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 546 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 547 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 548 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 549 Step 2: Replace the NVRAM module To replace the NVRAM module, locate it in slot 6 in the chassis and follow the specific sequence of steps. Steps 1. If you are not already grounded, properly ground yourself. 2. Move the FlashCache module from the old NVRAM module to the new NVRAM module: Orange release button (gray on empty FlashCache modules) FlashCache cam handle...
  • Page 550 The cam button moves away from the chassis. b. Rotate the cam latch down until it is in a horizontal position. The NVRAM module disengages from the chassis and moves out a few inches. c. Remove the NVRAM module from the chassis by pulling on the pull tabs on the sides of the module face.
  • Page 551 Cover locking button DIMM and DIMM ejector tabs 5. Remove the DIMMs, one at a time, from the old NVRAM module and install them in the replacement NVRAM module. 6. Close the cover on the module. 7. Install the replacement NVRAM module into the chassis: a.
  • Page 552 face. Lettered and numbered I/O cam latch I/O latch completely unlocked 3. Set the NVRAM module on a stable surface and remove the cover from the NVRAM module by pushing down on the blue locking button on the cover, and then, while holding down the blue button, slide the lid off the NVRAM module.
  • Page 553 Cover locking button DIMM and DIMM ejector tabs 4. Locate the DIMM to be replaced inside the NVRAM module, and then remove it by pressing down on the DIMM locking tabs and lifting the DIMM out of the socket. 5. Install the replacement DIMM by aligning the DIMM with the socket and gently pushing the DIMM into the socket until the locking tabs lock in place.
  • Page 554 Option 1: Verify ID (HA pair) Verify the system ID change on an HA system You must confirm the system ID change when you boot the replacement node and then verify that the change was implemented. This procedure applies only to systems running ONTAP in an HA pair. Steps 1.
  • Page 555 node run -node local-node-name partner savecore -s d. Return to the admin privilege level: set -privilege admin 5. Give back the node: a. From the healthy node, give back the replaced node’s storage: storage failover giveback -ofnode replacement_node_name The replacement node takes back its storage and completes booting. If you are prompted to override the system ID due to a system ID mismatch, you should enter y.
  • Page 556 8. If the node is in a MetroCluster configuration, depending on the MetroCluster state, verify that the DR home ID field shows the original owner of the disk if the original owner is a node on the disaster site. This is required if both of the following are true: ◦...
  • Page 557 You must enter y when prompted to override the system ID due to a system ID mismatch. 2. View the old system IDs from the healthy node: `metrocluster node show -fields node-systemid,dr-partner-systemid` In this example, the Node_B_1 is the old node, with the old system ID of 118073209: dr-group-id cluster node node-systemid dr-...
  • Page 558 *> disk show -a Local System ID: 118065481   DISK OWNER POOL SERIAL NUMBER HOME ------- ------------- ----- ------------- ------------- disk_name system-1 (118065481) Pool0 J8Y0TDZC system-1 (118065481) disk_name system-1 (118065481) Pool0 J8Y09DXC system-1 (118065481) 6. From the healthy node, verify that any coredumps are saved: a.
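The `disk show -a` listing above lends itself to an automated ownership check. The sketch below is an illustration only: it assumes the output was captured as text, that each disk row carries the owner and home system IDs in parentheses as in the sample, and that the goal is to confirm every disk reports the local system ID. The name `mismatched_disks` and the exact column spacing are hypothetical.

```python
import re

# Sample text modeled on the listing in this procedure (column spacing assumed).
SAMPLE = """*> disk show -a
Local System ID: 118065481

  DISK     OWNER                 POOL   SERIAL NUMBER  HOME
-------    -------------         -----  -------------  -------------
disk_name  system-1 (118065481)  Pool0  J8Y0TDZC  system-1 (118065481)
disk_name  system-1 (118065481)  Pool0  J8Y09DXC  system-1 (118065481)
"""

LOCAL_ID = re.compile(r"Local System ID:[ \t]*(\d+)")

# One disk row: name, owner (id), pool, serial number, home (id).
ROW = re.compile(
    r"^[ \t]*(?P<disk>\S+)[ \t]+(?P<owner>\S+)[ \t]+\((?P<owner_id>\d+)\)"
    r"[ \t]+(?P<pool>\S+)[ \t]+(?P<serial>\S+)"
    r"[ \t]+(?P<home>\S+)[ \t]+\((?P<home_id>\d+)\)[ \t]*$",
    re.MULTILINE,
)


def mismatched_disks(output):
    """Return serial numbers of disks whose owner or home ID differs from the local ID."""
    match = LOCAL_ID.search(output)
    if match is None:
        raise ValueError("Local System ID line not found")
    local_id = match.group(1)
    return [
        row.group("serial")
        for row in ROW.finditer(output)
        if row.group("owner_id") != local_id or row.group("home_id") != local_id
    ]


if __name__ == "__main__":
    print(mismatched_disks(SAMPLE))
```

An empty result means every parsed disk row matches the local system ID; any serial numbers returned identify disks to inspect manually.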
  • Page 559 Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
  • Page 560 ◦ 2. Reset the SED MSID Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 561 The green power LED lights when the PSU is fully inserted into the chassis and the amber attention LED flashes initially, but turns off after a few moments. 9. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 562 • You can use this procedure with all versions of ONTAP supported by your system • All other components in the system must be functioning properly; if not, you must contact technical support. Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
  • Page 563 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 564 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with CLI.
  • Page 565 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command: controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 566 Step 2: Remove the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. Steps 1. If you are not already grounded, properly ground yourself. 2.
  • Page 567 Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5. Place the controller module lid-side up on a stable, flat surface, press the blue button on the cover, slide the cover to the back of the controller module, and then swing the cover up and lift it off of the controller module.
  • Page 568 RTC battery RTC battery housing 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 569 1. If you have not already done so, close the air duct or controller module cover. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so.
  • Page 570 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
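The check that all nodes are in the enabled state can be sketched in Python against captured `metrocluster node show` output. This is an illustration under stated assumptions: the multi-line layout in `SAMPLE` is a reconstruction of the flattened table shown in this document, node rows are assumed to be the only rows with four or more fields, and the names `parse_node_rows` and `all_mirroring_enabled` are hypothetical.

```python
# Reconstructed sample; the real column layout may differ (assumption).
SAMPLE = """cluster_B::> metrocluster node show
                             Configuration
Group Cluster Node           State          Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
      cluster_A
              controller_A_1 configured     enabled   heal roots completed
      cluster_B
              controller_B_1 configured     enabled   waiting for switchback recovery
2 entries were displayed.
"""


def parse_node_rows(output):
    """Return one dict per node row found after the dashed ruler line."""
    rows = []
    seen_ruler = False
    for line in output.splitlines():
        stripped = line.strip()
        if not seen_ruler:
            # Everything up to and including the ruler is header text.
            seen_ruler = stripped.startswith("-----")
            continue
        if not stripped or stripped.endswith("displayed."):
            continue  # blank line or the trailing summary line
        parts = stripped.split(None, 3)
        if len(parts) < 4:
            continue  # cluster-name rows carry fewer fields
        node, state, mirroring, mode = parts
        rows.append({"node": node, "state": state,
                     "mirroring": mirroring, "mode": mode})
    return rows


def all_mirroring_enabled(rows):
    """True when at least one node row exists and every row shows 'enabled'."""
    return bool(rows) and all(row["mirroring"] == "enabled" for row in rows)


if __name__ == "__main__":
    for row in parse_node_rows(SAMPLE):
        print(row)
    print(all_mirroring_enabled(parse_node_rows(SAMPLE)))
```

Because the parser depends on an assumed layout, treat a result of no rows as "could not parse" and fall back to reading the command output directly.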
  • Page 571 Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. X91148A module Overview of adding an X91148A module - AFF A9000...
  • Page 572 Option 1: Add an X91148A module as a NIC module in a system with open slots To add an X91148A module as a NIC module in a system with open slots, you must follow the specific sequence of steps. Steps 1.
  • Page 573 -port port name -mode network for other slots that can be used by the X91148A module for networking. NetApp Hardware Universe • All other components in the system must be functioning properly; if not, you must contact technical support.
  • Page 574 Steps 1. If you are adding an X91148A module into a slot that contains a NIC module with the same number of ports as the X91148A module, the LIFs will automatically migrate when its controller module is shut down. If the NIC module being replaced has more ports than the X91148A module, you must permanently reassign the affected LIFs to a different home port.
  • Page 575 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 6. Install the X91148A module into the target slot: a. Align the X91148A module with the edges of the slot. b. Slide the X91148A module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
  • Page 576 Option 2: Adding an X91148A module as a storage module in a system with no open slots You must remove one or more existing NIC or storage modules in your system in order to install one or more X91148A storage modules into your fully populated system. •...
  • Page 577 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 6. Install the X91148A module into slot 3: a. Align the X91148A module with the edges of the slot. b. Slide the X91148A module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
  • Page 578: AFF A800 System Documentation

    Use the AFF A800 Installation and Setup Instructions if you are familiar with installing NetApp systems. Videos - AFF A800 There are two videos - one showing how to rack and cable your system and one showing an example of using the System Manager Guided Setup to perform initial system configuration.
  • Page 579 (NetApp Product Registration) your system. 2. Download and install NetApp Downloads: Config Advisor on your laptop. 3. Inventory and make a note of the number and types of cables you received. The following table identifies the types of cables you might receive. If you receive a cable not listed in the...
  • Page 580 4. Download and complete the Cluster Configuration Worksheet. Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed. Installing SuperRail into a four-post rack 2.
  • Page 581 3. Attach cable management devices (as shown). 4. Place the bezel on the front of the system. Step 3: Cable controllers There is required cabling for your platform’s cluster using the two-node switchless cluster method or the cluster interconnect network method. There is optional cabling to the Fibre Channel or iSCSI host networks or direct-attached storage.
  • Page 582 Steps 1. Use the animation or the tabulated steps to complete the cabling between the controllers and the switches: Animation - Cable a two-node switchless cluster Step Perform on each controller module Cable the HA interconnect ports: • e0b to e0b •...
  • Page 583 Step Perform on each controller module Cable the management ports to the management network switches DO NOT plug in the power cords at this point. 2. To perform optional cabling, see: [Option 1: Connect to a Fibre Channel host] ◦ [Option 2: Connect to a 10GbE host] ◦...
  • Page 584 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or the tabulated steps to complete the cabling between the controllers and the switches: Animation - Cable a switched cluster Step Perform on each controller module...
  • Page 585 Step Perform on each controller module Cable the cluster interconnect ports to the 100 GbE cluster interconnect switches. Cable the management ports to the management network switches DO NOT plug in the power cords at this point.
  • Page 586 2. To perform optional cabling, see: [Option 1: Connect to a Fibre Channel host] ◦ [Option 2: Connect to a 10GbE host] ◦ [Option 3: Connect to a single direct-attached NS224 drive shelf] ◦ [Option 4: Connect to two direct-attached NS224 drive shelves] ◦...
  • Page 587 Step Perform on each controller module Cable ports 2a through 2d to the FC host switches. To perform other optional cabling, choose from: • [Option 3: Connect to a single direct-attached NS224 drive shelf] • [Option 4: Connect to two direct-attached NS224 drive shelves] To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 588 Step Perform on each controller module Cable ports e4a through e4d to the 10GbE host network switches. To perform other optional cabling, choose from: • [Option 3: Connect to a single direct-attached NS224 drive shelf] • [Option 4: Connect to two direct-attached NS224 drive shelves] To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 589 Animation - Cable the controllers to a single drive shelf Step Perform on each controller module Cable controller A to the shelf: Cable controller B to the shelf: To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 590 Option 4: Cable the controllers to two drive shelves You must cable each controller to the NSM modules on both NS224 drive shelves. Before you begin Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place;...
  • Page 591 Step Perform on each controller module Cable controller B to the shelves: To complete setting up your system, see Step 4: Complete system setup and configuration. Step 4: Complete system setup and configuration Complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 592 Animation - Connect your laptop to the Management switch 4. Select an ONTAP icon listed to discover: a. Open File Explorer. b. Click Network in the left pane. c. Right-click and select refresh. d. Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node.
  • Page 593 c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Plug the power cords into the controller power supplies, and then connect them to power sources on different circuits.
  • Page 594 ◦ If the impaired controller is in a standalone configuration and at LOADER prompt, contact mysupport.netapp.com. 2. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message...
  • Page 595 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 596 Verify that the Restored column displays yes for all authentication keys and that all key managers display available: security key-manager query c. Shut down the impaired controller. 3. If you saw the message This command is not supported when onboard key management is enabled,...
  • Page 597 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 598 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 599 If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key query c. Shut down the impaired controller. 4. If the type displays and the column displays anything other than yes:...
  • Page 600 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 601 Option 1: Most systems After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Steps a. Take the impaired controller to the LOADER prompt: If the impaired controller Then… displays… The LOADER prompt Go to Remove controller module.
  • Page 602 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name...
  • Page 603 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 604 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 2: Replace the boot media You locate the failed boot media in the controller module by removing Riser 3 on the controller module before you can replace the boot media.
  • Page 605 Air duct Riser 3 Phillips #1 screwdriver Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
  • Page 606 Steps 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive. a. Download the service image to your work space on your laptop. b. Unzip the service image.
  • Page 607 Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs or QSFPs) if they were removed.
  • Page 608 Boot the recovery image - AFF A800 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables. 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive.
  • Page 609 If your system has no network connection and is in a MetroCluster IP configuration: a. Press when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
  • Page 610 Restore OKM, NSE, and NVE as needed - AFF A800 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 611 3. Check the console output: If the console displays the LOADER prompt, boot the controller to the boot menu: boot_ontap menu If the console displays Waiting for giveback…: a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this controller rather than wait [y/n]?, enter: y c....
  • Page 612 8. Move the console cable to the partner controller and login as admin. 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 613 If giveback is not complete after 20 minutes, contact Customer Support. 18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home controller and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int revert command.
  • Page 614 This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 615 If the console displays the login prompt, go to Step 7. If the console displays Waiting for giveback…: a. Log into the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 616 • This procedure is written with the assumption that you are moving the bezel, NVMe drives, and controller modules to the new chassis, and that the replacement chassis is a new component from NetApp. • This procedure is disruptive. For a two-node cluster, you will have a complete service outage and a partial outage in a multi-node cluster.
  • Page 617 • Suspend external backup jobs. • Necessary tools and equipment for the replacement. If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the Gracefully shutdown and power up your storage system Resolution Guide after performing this procedure.
  • Page 618 7. Enter y for each controller in the cluster when you see Warning: Are you sure you want to halt node "cluster name-controller number"? {y|n}: 8. Wait for each controller to halt and display the LOADER prompt. 9. Turn off each PSU or unplug them if there is no PSU on/off switch. 10.
  • Page 619 Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Set the controller module aside in a safe place, and repeat these steps for the other controller module in the chassis.
  • Page 620 3. Align the drive from the old chassis with the same bay opening in the new chassis. 4. Gently push the drive into the chassis as far as it will go. The cam handle engages and begins to rotate upward. 5.
  • Page 621 Complete the restoration and replacement process - AFF A800 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 622 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 623 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 624 If the impaired controller is displaying the system prompt or password prompt, take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the controller module hardware - AFF A800 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement...
  • Page 625 Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized. 6. Remove the cable management device from the controller module and set it aside. 7. Press down on both of the locking latches, and then rotate both latches downward at the same time. The controller module moves slightly out of the chassis.
  • Page 626 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 2: Move the power supplies You must move the power supplies from the impaired controller module to the replacement controller module when you replace a controller module. 1.
  • Page 627 Blue power supply locking tab Power supply 2. Move the power supply to the new controller module, and then install it. 3. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
  • Page 628 Fan locking tabs Fan module 2. Move the fan module to the replacement controller module, and then install the fan module by aligning its edges with the opening in the controller module, and then sliding the fan module into the controller module until the locking latches click into place.
  • Page 629 Air duct riser NVDIMM battery plug NVDIMM battery pack Attention: The NVDIMM battery control board LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
  • Page 630 Step 5: Remove the PCIe risers As part of the controller replacement process, you must remove the PCIe modules from the impaired controller module. You must install them into the same location in the replacement controller module once the NVDIMMs and DIMMs have moved to the replacement controller module.
  • Page 631 2. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  • Page 632 NVDIMMs 2. Note the orientation of the NVDIMM in the socket so that you can insert the NVDIMM in the replacement controller module in the proper orientation. 3. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
  • Page 633 Air duct Riser 3 Phillips #1 screwdriver Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
  • Page 634 Step 9: Install the PCIe risers You install the PCIe risers in the replacement controller module after moving the DIMMs, NVDIMMs, and boot media. 1. Install the riser into the replacement controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
  • Page 635 Locking tabs Slide plunger 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 636 About this task It is important that you apply the commands in the steps on the correct systems: • The replacement node is the new node that replaced the impaired node as part of this procedure. • The healthy node is the HA partner of the replacement node. Steps 1.
  • Page 637 Step 3: Run diagnostics After you have replaced a component in your system, you should run diagnostic tests on that component. Your system must be at the LOADER prompt to start diagnostics. All commands in the diagnostic procedures are issued from the controller where the component is being replaced.
  • Page 638 Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure. You must confirm the system ID change when you boot the replacement controller and then verify that the change was implemented.
  • Page 639 Restore onboard key management encryption keys ◦ Restore external key management encryption keys ◦ 6. Give back the controller: a. From the healthy controller, give back the replaced controller’s storage: storage failover giveback -ofnode replacement_node_name The replacement controller takes back its storage and completes booting. If you are prompted to override the system ID due to a system ID mismatch, you should enter y.
  • Page 640 -node replacement-node-name -onreboot true Complete system restoration - AFF A800 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 641 -node local -auto-giveback true Step 3: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 642 All other components in the system must be functioning properly; if not, you must contact technical support. You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller Recable the controller module’s storage and network connections.
  • Page 643 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 644 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct and then replace it following the specific sequence of steps.
  • Page 645 Air duct cover; Riser 1 and DIMM bank 1, and 3-6; Riser 2 and DIMM bank 7-10, 12-13, and 15-18; Riser 3 and DIMM 19-22 and 24. Note: Slots 2 and 14 are left empty. Do not attempt to install DIMMs into these slots. 2.
  • Page 646 Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. 4. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket.
  • Page 647 Locking tabs Slide plunger 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 648 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 649 Procedure Replace the failed drive by selecting the option appropriate to the drives that your platform supports.
  • Page 650 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 651 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 652 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 653 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 654 If the impaired controller is displaying the system prompt or password prompt, take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Step 2: Remove the controller module You must remove the controller module from the chassis when you replace a fan module.
  • Page 655 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Set the controller module aside in a safe place. Step 3: Replace a fan To replace a fan, remove the failed fan module and replace it with a new fan module.
  • Page 656 -controller local -auto-giveback true Step 5: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 657 • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode. • If you have a SAN system, you must have checked event messages (cluster kernel-service show) for impaired controller SCSI blade.
  • Page 658 3. Release the power cable retainers, and then unplug the cables from the power supplies. 4. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFP and QSFP modules (if needed) from the controller module, keeping track of where the cables were connected.
  • Page 659 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace the NVDIMM To replace the NVDIMM, you must locate it in the controller module using the NVDIMM map label on top of the air duct, and then replace it following the specific sequence of steps.
  • Page 660 Air duct cover Riser 2 and NVDIMM 11 2. Note the orientation of the NVDIMM in the socket so that you can insert the NVDIMM in the replacement controller module in the proper orientation. 3. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
  • Page 661 5. Locate the slot where you are installing the NVDIMM. 6. Insert the NVDIMM squarely into the slot. The NVDIMM fits tightly in the slot, but should go in easily. If not, realign the NVDIMM with the slot and reinsert it. Visually inspect the NVDIMM to verify that it is evenly aligned and fully inserted into the slot.
  • Page 662 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 663 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 664 If the impaired controller is displaying the LOADER prompt, go to Remove controller module. If it is displaying Waiting for giveback…, press Ctrl-C, and then respond y when prompted. If it is displaying the system prompt or password prompt, take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
  • Page 665 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Set the controller module aside in a safe place. Step 3: Replace the NVDIMM battery To replace the NVDIMM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module.
  • Page 666 Air duct riser NVDIMM battery plug NVDIMM battery pack Attention: The NVDIMM battery control board LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
  • Page 667 b. Plug the battery plug into the riser socket and make sure that the plug locks into place. 6. Close the NVDIMM air duct. Make sure that the plug locks into the socket. Step 4: Reinstall the controller module and boot the system After you replace a FRU in the controller module, you must reinstall the controller module and reboot it.
  • Page 668 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the Returning SEDs to unprotected mode.
  • Page 669 -node local -auto-giveback false When you see Do you want to disable auto-giveback?, enter y. 3. Take the impaired controller to the LOADER prompt: If the impaired controller is displaying the LOADER prompt, go to Remove controller module. If it is displaying Waiting for giveback…, press Ctrl-C, and then respond y when prompted.
  • Page 670 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 671 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace a PCIe card To replace a PCIe card, you must remove the cabling and any QSFPs and SFPs from the ports on the PCIe cards in the target riser, remove the riser from the controller module, remove and replace the PCIe card, reinstall the riser and any QSFPs and SFPs onto the ports, and cable the ports.
  • Page 672 Air duct Riser locking latch Card locking bracket Riser 1 (left riser) with 100GbE PCIe card in slot 1. 3. Remove the PCIe card from Riser 1: a. Turn the riser so that you can access the PCIe card. b. Press the locking bracket on the side of the PCIe riser, and then rotate it to the open position. c.
  • Page 673 Air duct Riser 2 (middle riser) or 3 (right riser) locking latch Card locking bracket Side panel on riser 2 or 3 PCIe cards in riser 2 or 3 5. Remove the PCIe card from the riser: a. Turn the riser so that you can access the PCIe cards. b.
  • Page 674 controller module. d. Reinsert any SFP modules that were removed from the PCIe cards. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it. 1.
  • Page 675 -node local -auto-giveback true Step 5: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 676 Option 1: Replace an AC PSU To replace an AC PSU, complete the following steps. 1. If you are not already grounded, properly ground yourself. 2. Identify the PSU you want to replace, based on console error messages or through the red Fault LED on the PSU.
  • Page 677 Secure the power cable to the PSU using the power cable retainer. Once power is restored to the PSU, the status LED should be green. 7. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 678 Secure the power cable to the PSU with the thumbscrews. Once power is restored to the PSU, the status LED should be green. 7. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 679 • If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 680 6. Press down on both of the locking latches, and then rotate both latches downward at the same time. The controller module moves slightly out of the chassis. Locking latch Locking pin 1. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis.
  • Page 681 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Remove the PCIe risers You must remove one or more PCIe risers when replacing specific hardware components in the controller module. 1. Remove the PCIe riser from the controller module: a.
  • Page 682 Air duct Riser 2 (middle riser) locking latch Step 4: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. 1. Locate the RTC battery under Riser 2.
  • Page 683 Air duct Riser 2 RTC battery and housing 2. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 684: AFF A900 Systems

    -node local -auto-giveback true Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 685 Step 1: Prepare for installation To install your system, you need to create an account on the NetApp Support Site, register your system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 686 The following table identifies the types of cables you might receive. If you receive a cable not listed in the table, see the Hardware Universe to locate the cable and identify its use. NetApp Hardware Universe Type of Part number and length Connector type For…...
  • Page 687 Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
  • Page 688 4. Place the bezel on the front of the system. The following diagram shows a representation of what a typical system looks like and where the major components are located at the rear of the system: Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network.
  • Page 689 Option 1: Two-node switchless cluster Management network, data network, and management ports on the controllers are connected to switches. The cluster interconnect ports are cabled on both controllers. Before you begin You must have contacted your network administrator for information about connecting the system to the switches.
  • Page 690 Step Perform on each controller Cable controller management (wrench) ports. Cable 25 GbE network switches: Ports in slot A3 and B3 (e3a and e3c) and slot A9 and B9 (e9a and e9c) to the 25 GbE network switches. 40GbE host network switches: Cable host‐side b ports in slot A4 and B4 (e4b) and slot A8 and B8 (e8b) to the host switch.
  • Page 691 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it over and try again. 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Animation - Cable a switched cluster To cluster switches Controller A...
  • Page 692 Step Perform on each controller Cable 25GbE network switches: Ports in slot A3 and B3 (e3a and e3c) and slot A9 and B9 (e9a and e9c) to the 25 GbE network switches. 40GbE host network switches: Cable host‐side b ports in slot A4 and B4 (e4b) and slot A8 and B8 (e8b) to the host switch.
  • Page 693 Option 1: Cable the controllers to a single NS224 drive shelf You must cable each controller to the NSM modules on the NS224 drive shelf on an AFF A900 system. Before you begin • Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. The cable pull-tabs for the storage modules are up, while the pull-tabs on the shelves are down.
  • Page 694 Step Perform on each controller • Connect controller A port e2a to port e0a on NSM A on the shelf. • Connect controller A port e10b to port e0b on NSM B on the shelf. 100 GbE cable • Connect controller B port e2a to port e0a on NSM B on the shelf.
  • Page 696 Step Perform on each controller • Connect controller A port e2a to NSM A e0a on shelf 1. • Connect controller A port e10b to NSM B e0b on shelf 1. • Connect controller A port e2b to NSM B e0b on shelf 2.
  • Page 697 Option 1: If network discovery is enabled If you have network discovery enabled on your laptop, you can complete system setup and configuration using automatic cluster discovery. 1. Use the following animation or drawing to set one or more drive shelf IDs: The NS224 shelves are pre-set to shelf ID 00 and 01.
  • Page 698 Initial booting may take up to eight minutes. 3. Make sure that your laptop has network discovery enabled. See your laptop’s online help for more information. 4. Use the following animation to connect your laptop to the Management switch. Animation - Connect your laptop to the Management switch 5.
  • Page 699 Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 8. Verify the health of your system by running Config Advisor. 9. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
  • Page 700 for detailed instructions. Animation - Set NVMe drive shelf IDs Shelf end cap Shelf faceplate Shelf ID LED Shelf ID setting button 3. Turn on the power switches on the power supplies to both nodes. Animation - Turn on the power to the controllers...
  • Page 701 Log in to your existing account or create an account. NetApp Support Registration b. Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 7. Verify the health of your system by running Config Advisor.
  • Page 702 Before you begin • Check the NetApp Hardware Universe to make sure that the new I/O module is compatible with your system and version of ONTAP you’re running. • If multiple slots are available, check the slot priorities in...
  • Page 703 Option 1: Most systems To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 704 Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show). Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
  • Page 705 Make sure that any unused I/O slots have blanks installed to prevent possible thermal issues. 5. Reboot the controller from the LOADER prompt: bye This reinitializes the PCIe cards and other components and reboots the node. 6. Give back the controller from the partner controller. storage failover giveback -ofnode target_node_name 7.
  • Page 706 Option 1: Most systems To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 707 Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show). Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
  • Page 708 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 4. Install the I/O module into the target slot: a. Align the I/O module with the edges of the slot. b. Slide the I/O module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
  • Page 709 If you encounter an issue during reboot, see BURT 1494308 - Environment shutdown might be triggered during I/O module replacement. 8. Give back the controller from the partner controller. storage failover giveback -ofnode target_node_name 9. Enable automatic giveback if it was disabled: storage failover modify -node local -auto-giveback true 10.
  • Page 710 NVE, proceed to shut down the controller. ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 711 Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 712 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 713 Restored a. Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows...
  • Page 714 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 715 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy prompt controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Controller is in a MetroCluster After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 716 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy prompt (enter system password) controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 717 Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis.
  • Page 718 5. Place the controller module lid-side up on a stable, flat surface, press the blue button on the cover, slide the cover to the back of the controller module, and then swing the cover up and lift it off of the controller module.
  • Page 719 Press release tab Boot media 2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
  • Page 720 • A copy of the same image version of ONTAP that the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site. ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 721 ◦ If you are configuring manual connections: ifconfig e0a -addr=filer_addr -mask=netmask -gw=gateway -dns=dns_addr -domain=dns_domain ▪ filer_addr is the IP address of the storage system. ▪ netmask is the network mask of the management network that is connected to the HA partner. ▪ gateway is the gateway for the network. ▪...
  • Page 722 If your system has… Then… A network connection a. Press when prompted to restore the backup configuration. b. Press when prompted to overwrite /etc/ssh/ssh_host_ecdsa_key. c. Press when prompted to confirm if the restore backup was successful. d. Press when prompted to the restored configuration copy. e.
  • Page 723 If your system has… Then… No network connection and is in a a. Press when prompted to restore the backup configuration. MetroCluster IP configuration b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node- name:iscsi.session.stateChanged:notice]:...
  • Page 724 Post boot media replacement steps for OKM, NSE, and NVE - AFF A900 Once environment variables are checked, you must complete steps specific to restore Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) and NetApp Volume Encryption (NVE). 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 725 If the console displays… Then… The LOADER prompt Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…. a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: c.
  • Page 726 8. Move the console cable to the partner controller and log in as admin. 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 727 Restore NSE/NVE on systems running ONTAP 9.6 and later 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console displays… Then…...
  • Page 728 -node local -auto-giveback true command. Return the failed part to NetApp - AFF A900 Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 729 • Necessary tools and equipment for the replacement. If the system is a NetApp StorageGRID or ONTAP S3 used as FabricPool cloud tier, refer to the Gracefully shutdown and power up your storage system Resolution Guide after performing this procedure.
  • Page 730 7. Enter y for each controller in the cluster when you see Warning: Are you sure you want to halt node "cluster name-controller number"? {y|n}: 8. Wait for each controller to halt and display the LOADER prompt. 9. Turn off each PSU or unplug them if there is no PSU on/off switch. 10.
  • Page 731 Locking button 4. Repeat the preceding steps for any remaining power supplies. Step 2: Remove the fans You must remove the six fan modules, located on the front of the chassis, when replacing the chassis. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 732 Terra cotta locking button Slide fan in/out of chassis 4. Set the fan module aside. 5. Repeat the preceding steps for any remaining fan modules. Step 3: Remove the controller module To replace the chassis, you must remove the controller module or modules from the impaired chassis. 1.
  • Page 733 Cam handle locking button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis.
  • Page 734 5. Set the controller module aside in a safe place and keep track of which chassis slot it came from, so that it can be installed into the same slot in the replacement chassis. 6. Repeat these steps if you have another controller module in the chassis. Step 4: Remove the I/O modules To remove I/O modules from the impaired chassis, including the NVRAM modules, follow the specific sequence of steps.
  • Page 735 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 4. Set the I/O module aside. 5. Repeat the preceding step for the remaining I/O modules in the impaired chassis. Step 5: Remove the De-stage Controller Power Module Remove the two de-stage controller power modules from the front of the impaired chassis. 1.
  • Page 736 DCPM terra cotta locking button 3. Set the DCPM aside in a safe place and repeat this step for the remaining DCPM. Step 6: Remove the USB LED module Remove the USB LED modules. Animation - Remove/install USB...
  • Page 737 Eject the module. Slide out of chassis. 1. Locate the USB LED module on the front of the impaired chassis, directly under the DCPM bays. 2. Press the black locking button on the right side of the module to release the module from the chassis, and then slide it out of the impaired chassis.
  • Page 738 4. Using two or three people, install the replacement chassis into the equipment rack or system cabinet by guiding the chassis onto the rack rails in a system cabinet or L brackets in an equipment rack. 5. Slide the chassis all the way into the equipment rack or system cabinet. 6.
  • Page 739 3. Recable the I/O module, as needed. 4. Repeat the preceding step for the remaining I/O modules that you set aside. If the impaired chassis has blank I/O panels, move them to the replacement chassis at this time. Step 11: Install the power supplies Installing the power supplies when replacing a chassis involves installing the power supplies into the replacement chassis, and connecting to the power source.
  • Page 740 5. With the cam handle in the open position, slide the controller module into the chassis and firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle until it clicks into the locked position.
  • Page 741 As a best practice, you should do the following: • Resolve any Active IQ Wellness Alerts and Risks (Active IQ will take time to process post-power up AutoSupports - expect a delay in results) • Run Active IQ Config Advisor •...
  • Page 742 failures resulting from testing the component. 7. Proceed based on the result of the preceding step. If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed:...
  • Page 743 If the system-level diagnostics test fails again, contact mysupport.netapp.com. Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 744 • You must replace the failed component with a replacement FRU component you received from your provider. • You must be replacing a controller module with a controller module of the same model type. You cannot upgrade your system by just replacing the controller module. •...
  • Page 745 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 746 Option 2: Controller is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 747 controller. Animation - Move components to replacement controller Step 1: Remove the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 748 Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis.
  • Page 749 5. Place the controller module lid-side up on a stable, flat surface, press the blue button on the cover, slide the cover to the back of the controller module, and then swing the cover up and lift it off of the controller module.
  • Page 750 Press release tab Boot media 2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
  • Page 751 If necessary, remove the boot media and reseat it into the socket. 5. Push the boot media down to engage the locking button on the boot media housing. Step 3: Move the system DIMMs To move the DIMMs, locate and move them from the old controller into the replacement controller and follow the specific sequence of steps.
  • Page 752 DIMM ejector tabs DIMM 5. Locate the slot where you are installing the DIMM. 6. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot. The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
  • Page 753 Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot. 8. Push carefully, but firmly, on the top edge of the DIMM until the ejector tabs snap into place over the notches at the ends of the DIMM. 9.
  • Page 754 Cam handle release button Cam handle Do not completely insert the controller module in the chassis until instructed to do so. 4. Cable the management and console ports only, so that you can access the system to perform the tasks in...
  • Page 755 the following sections. You will connect the rest of the cables to the controller module later in this procedure. 5. Complete the reinstallation of the controller module: a. If you have not already done so, reinstall the cable management device. b.
  • Page 756 6. At the LOADER prompt, confirm the date and time on the replacement node: date The date and time are given in GMT. Step 2: Verify and set the HA state of the controller module You must verify the state of the controller module and, if necessary, update the state to match your system configuration.
  • Page 757 ◦ fcal is a Fibre Channel-Arbitrated Loop device not connected to a Fibre Channel network. ◦ env is motherboard environmentals. ◦ mem is system memory. ◦ nic is a network interface card. ◦ nvram is nonvolatile RAM. ◦ nvmem is a hybrid of NVRAM and system memory. ◦...
  • Page 758 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 759 If the system-level diagnostics Then… tests… Were completed without any a. Clear the status logs: sldiag device clearstatus failures b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 760 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis.
  • Page 761 The system ID and disk assignment information reside in the NVRAM module, which is in a module separate from the controller module and not impacted by the controller module replacement. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure.
  • Page 762 d. Return to the admin privilege level: set -privilege admin 5. If your storage system has Storage or Volume Encryption configured, you must restore Storage or Volume Encryption functionality by using one of the following procedures, depending on whether you are using onboard or external key management: Restore onboard key management encryption keys ◦...
  • Page 763 Complete system restoration - AFF A900 To complete the replacement procedure and restore your system to full operation, you must recable the storage, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
  • Page 764 If the node is in a MetroCluster configuration and all nodes at a site have been replaced, license keys must be installed on the replacement node or nodes prior to switchback. 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses.
  • Page 765 Step 3: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replace a DIMM - AFF A900 You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC);...
  • Page 766 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 767 Option 2: Controller is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 768 3. Slide the terra cotta button on the cam handle downward until it unlocks. Animation - Remove the controller Cam handle release button Cam handle...
  • Page 769 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
  • Page 770 1. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  • Page 771 DIMM ejector tabs DIMM 2. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 3.
  • Page 772 4. Push carefully, but firmly, on the top edge of the DIMM until the ejector tabs snap into place over the notches at the ends of the DIMM. 5. Close the controller module cover. Step 4: Install the controller After you install the components into the controller module, you must install the controller module back into the system chassis and boot the operating system.
  • Page 773 Animation - Install controller Cam handle release button Cam handle Do not completely insert the controller module in the chassis until instructed to do so.
  • Page 774 4. Cable the management and console ports only, so that you can access the system to perform the tasks in the following sections. You will connect the rest of the cables to the controller module later in this procedure. 5. Complete the reinstallation of the controller module: a.
  • Page 775 -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component. 5. Proceed based on the result of the preceding step: If the system-level diagnostics Then…...
  • Page 776 LOADER prompt. f. Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 777 You must dispose of batteries according to the local regulations regarding battery recycling or disposal. If you cannot properly dispose of batteries, you must return the batteries to NetApp, as described in the RMA instructions that are shipped with the kit.
  • Page 778 Step 3: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Swap out a fan - AFF A900 To swap out a fan module without interrupting service, you must perform a specific sequence of tasks.
  • Page 779 7. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 8. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 780 Option 1: Most systems To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 781 Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show). Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
  • Page 782 Make sure that you keep track of which slot the I/O module was in. Animation - Remove/install I/O module Lettered and numbered I/O cam latch I/O cam latch completely unlocked 4. Set the I/O module aside. 5. Install the replacement I/O module into the chassis by gently sliding the I/O module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin, and then push the I/O cam latch all the way up to lock the module in place.
  • Page 783 -node local -auto-giveback true Step 4: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 784 There is an audible click when the module is secure and connected to the midplane. Step 2: Return the failed component 1. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return &...
  • Page 785 NVRAM module or the DIMMs inside the NVRAM module. To replace a failed NVRAM module, you must remove it from the chassis, move the DIMMs to the replacement module, and install the replacement NVRAM module into the chassis. To replace an NVRAM DIMM, you must remove the NVRAM module from the chassis, replace the failed DIMM in the module, and then reinstall the NVRAM module.
  • Page 786 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 787 Option 2: Controller is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 788 a. Depress the lettered and numbered cam button. The cam button moves away from the chassis. b. Rotate the cam latch down until it is in a horizontal position. The NVRAM module disengages from the chassis and moves out a few inches. c.
  • Page 789 the NVRAM module. Cover locking button DIMM and DIMM ejector tabs 4. Remove the DIMMs, one at a time, from the old NVRAM module and install them in the replacement NVRAM module. 5. Close the cover on the module. 6. Install the replacement NVRAM module into the chassis: a.
  • Page 790 The cam button moves away from the chassis. b. Rotate the cam latch down until it is in a horizontal position. The NVRAM module disengages from the chassis and moves out a few inches. c. Remove the NVRAM module from the chassis by pulling on the pull tabs on the sides of the module face.
  • Page 791 Cover locking button DIMM and DIMM ejector tabs 4. Locate the DIMM to be replaced inside the NVRAM module, and then remove it by pressing down on the DIMM locking tabs and lifting the DIMM out of the socket. 5. Install the replacement DIMM by aligning the DIMM with the socket and gently pushing the DIMM into the socket until the locking tabs lock in place.
  • Page 792 Step 5: Reassigning disks You must confirm the system ID change when you boot the replacement controller and then verify that the change was implemented. This procedure applies only to systems running ONTAP in an HA pair. Disk reassignment is only needed when replacing the NVRAM module. Steps 1.
  • Page 793 The output from the storage failover show command should not include the System ID changed on partner message. 5. Verify that the disks were assigned correctly: storage disk show -ownership The disks belonging to the replacement controller should show the new system ID. In the following example, the disks owned by node1 now show the new system ID, 151759706: node1:>...
  • Page 794 node1_siteA::> metrocluster node show -fields configuration-state dr-group-id cluster node configuration-state ----------- ---------------------- -------------- ------------------- 1 node1_siteA node1mcc-001 configured 1 node1_siteA node1mcc-002 configured 1 node1_siteB node1mcc-003 configured 1 node1_siteB node1mcc-004 configured 4 entries were displayed. 9. Verify that the expected volumes are present for each controller: vol show -node node-name 10.
  • Page 795 Option 1: Using Onboard Key Manager Steps 1. Boot the node to the boot menu. 2. Select option 10, Set onboard key management recovery secrets. 3. Enter the passphrase for the onboard key manager you obtained from the customer. 4. At the prompt, paste the backup key data from the output of security key-manager backup command.
  • Page 796 7. Once the giveback completes, check the failover and giveback status with the storage failover show and storage failover show-giveback commands. Only the CFO aggregates (root aggregate and CFO style data aggregates) will be shown. 8. Run the security key-manager onboard sync: a.
  • Page 797 3. Enter the management certificate information at the prompts. The controller returns to the boot menu after the management certificate information is completed. 4. Select option 1, Normal Boot. 5. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 798 Step 7: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Swap out a power supply - AFF A900 Swapping out a power supply involves turning off, disconnecting, and removing the power supply and installing, connecting, and turning on the replacement power supply.
  • Page 799 b. Open the power cable retainer, and then unplug the power cable from the power supply. 4. Press and hold the terra cotta button on the power supply handle, and then pull the power supply out of the chassis. CAUTION: When removing a power supply, always use two hands to support its weight.
  • Page 800 9. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information. Replacing the real-time clock battery - AFF A900 You replace the real-time clock (RTC) battery in the controller module so that your system’s services and applications that depend on accurate time synchronization...
  • Page 801 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in Returning SEDs to unprotected mode.
  • Page 802 Option 2: Controller is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 803 3. Slide the terra cotta button on the cam handle downward until it unlocks. Animation - Remove the controller Cam handle release button Cam handle...
  • Page 804 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
  • Page 805 Animation - Replace RTC battery RTC battery RTC battery housing 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 806 6. Note the polarity of the RTC battery, and then insert it into the holder by tilting the battery at an angle and pushing down. 7. Visually inspect the battery to make sure that it is completely installed into the holder and that the polarity is correct.
  • Page 807 -node local -auto-giveback true Step 5: Return the failed part to NetApp Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.
  • Page 808 NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.
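Several of the procedures summarized above repeat the same two ONTAP commands: the AutoSupport message that suppresses automatic case creation for a maintenance window, and the command that re-enables automatic giveback afterwards. The following is a minimal Python sketch that only assembles those command strings from the templates quoted in the page excerpts; the function names and the input check are hypothetical conveniences, not part of any NetApp tooling, and the strings still have to be entered at the cluster shell by an operator.

```python
def autosupport_maint_command(hours: int) -> str:
    """Build the AutoSupport message that suppresses case creation.

    Follows the template quoted in the procedure:
    MAINT=number_of_hours_downh (for example, MAINT=2h for two hours).
    """
    if hours <= 0:
        # Hypothetical guard: the source only shows positive hour counts.
        raise ValueError("hours must be a positive integer")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


def auto_giveback_command(node: str = "local", enabled: bool = True) -> str:
    """Build the command that switches automatic giveback on or off."""
    flag = "true" if enabled else "false"
    return f"storage failover modify -node {node} -auto-giveback {flag}"


if __name__ == "__main__":
    # Print the two commands for a two-hour maintenance window.
    print(autosupport_maint_command(2))
    print(auto_giveback_command())
```

Generating the strings rather than typing them by hand mainly guards against spacing slips in the flags, such as splitting -auto-giveback into two tokens.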