AFF systems
ONTAP Systems
NetApp
April 04, 2022
This PDF was generated from https://docs.netapp.com/us-en/ontap-systems/c190/install-setup.html on
April 04, 2022. Always check docs.netapp.com for the latest.

Summary of Contents for NetApp AFF

  • Page 1 AFF systems ONTAP Systems NetApp April 04, 2022 This PDF was generated from https://docs.netapp.com/us-en/ontap-systems/c190/install-setup.html on April 04, 2022. Always check docs.netapp.com for the latest.
  • Page 2: Table Of Contents

      AFF A700 and FAS9000 System Documentation
  • Page 3: AFF Systems

    Video two of two: Performing end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed steps - AFF C190...
  • Page 4 Step 1: Prepare for installation To install your AFF C190 system, you need to create an account and register the system. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 5 Cluster Configuration Worksheet. Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
  • Page 6 Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network. Option 1: Cable a two-node switchless cluster, unified configuration UTA2 ports and management ports on the controller modules are connected to switches. The cluster interconnect ports are cabled on both controller modules.
  • Page 7 Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b Use one of the following cable types to cable the e0c/0c and e0d/0d or e0e/0e and e0f/0f data ports to your host network:...
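The switchless interconnect rule on this page (e0a to e0a, e0b to e0b) can be written down as a small pre-flight check. This is a minimal sketch and not a NetApp tool: the function name `check_interconnect` and the input format (a list of port pairs) are assumptions made for illustration.

```python
# Expected cluster interconnect cabling for a two-node switchless cluster,
# as stated in the step above: each port connects to the same-named port
# on the partner controller.
EXPECTED_INTERCONNECT = {"e0a": "e0a", "e0b": "e0b"}


def check_interconnect(cables):
    """Return a list of problems found in a planned cabling list.

    `cables` is a list of (port_on_controller_a, port_on_controller_b)
    tuples. This input format is hypothetical, chosen for illustration.
    """
    problems = []
    planned = dict(cables)
    for local, remote in EXPECTED_INTERCONNECT.items():
        actual = planned.get(local)
        if actual is None:
            problems.append(f"{local} is not cabled")
        elif actual != remote:
            problems.append(f"{local} is cabled to {actual}, expected {remote}")
    return problems
```

An empty result means the planned interconnect cabling matches the rule; it says nothing about the data or management ports.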
  • Page 8 Step Perform on each controller Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 9 Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use one of the following cable types to cable the e0c/0c and e0d/0d or e0e/0e and e0f/0f data ports to your host network:...
  • Page 10 Step Perform on each controller module Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 11 Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable • e0a to e0a • e0b to e0b Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
  • Page 12 Step Perform on each controller Cable the e0M ports to the management network switches with the RJ45 cables DO NOT plug in the power cords at this point. 2. To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 13 Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
  • Page 14 Step Perform on each controller module Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 15 4. Use the animation (Connecting your laptop to the Management switch) to connect your laptop to the Management switch. 5. Select an ONTAP icon listed to discover: a. Open File Explorer. b. Click Network in the left pane. c. Right-click and select refresh. d.
  • Page 16 b. Connect the console cable to the laptop or console, and connect the console port on the controller using the console cable that came with your system. c. Connect the laptop or console to the switch on the management subnet. d.
  • Page 17 Maintain Boot media Overview of boot media replacement - AFF C190 The boot media stores a primary and secondary set of system (boot image) files that the system uses when it boots. Depending on your network configuration, you can perform either a nondisruptive or disruptive replacement.
  • Page 18 Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 19 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 20 Enter the onboard security key-manager sync command: security key-manager onboard sync Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows yes for all authentication keys:...
  • Page 21 Return to admin mode: set -priv admin h. You can safely shut down the controller. Shut down the controller - AFF C190 After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 22 This command may not work if the boot device is corrupted or non-functional. Replace the boot media - AFF C190 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 23 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Replace the boot media You must locate the boot media in the controller module, and then follow the directions to replace it.
  • Page 24 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 25 5. Interrupt the boot process to stop at the LOADER prompt by pressing Ctrl-C when you see Starting AUTOBOOT press Ctrl-C to abort…. If you miss this message, press Ctrl-C, select the option to boot to Maintenance mode, and then halt controller to boot to LOADER.
  • Page 26 Restore automatic giveback if you disabled it by using the storage failover modify command. Boot the recovery image - AFF C190 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
  • Page 27 Steps 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive. 2. When prompted, either enter the name of the image or accept the default image displayed inside the brackets on your screen.
  • Page 28 Restore OKM, NSE, and NVE as needed - AFF C190 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 29 If the console displays… Then… The LOADER prompt Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…. a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: c.
  • Page 30 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 31 Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console displays… Then…...
  • Page 32 -auto-giveback true Return the failed part to NetApp - AFF C190 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 33 a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller; see the Administration overview with the CLI. • If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours:...
  • Page 34 Move and replace hardware - AFF C190 Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 35 4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis.
  • Page 36 4. Gently push the drive into the chassis as far as it will go. The cam handle engages and begins to rotate upward. 5. Firmly push the drive the rest of the way into the chassis, and then lock the cam handle by pushing it up and against the drive holder.
  • Page 37 From the boot menu, select the option for Maintenance mode. Restore and verify the configuration - AFF C190 You must verify the HA state of the chassis and run system-level diagnostics. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 38 If your system is in… Then… A stand-alone configuration: a. Exit Maintenance mode: halt b. Go to "Completing the replacement process". An HA pair with a second controller module: Exit Maintenance mode: halt The LOADER prompt appears. Step 2: Run system-level diagnostics After installing a new chassis, you should run interconnect diagnostics.
  • Page 39 5. Run the interconnect diagnostics test from the Maintenance mode prompt: sldiag device run -dev interconnect You only need to run the interconnect test from one controller. 6. Verify that no hardware problems resulted from the replacement of the chassis: sldiag device status -dev interconnect -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
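The run-then-check pattern above uses the same two sldiag commands for each device, with only the -dev value changing (interconnect here, mem in the DIMM procedure). A minimal sketch of that pattern follows; the helper names are hypothetical and the functions only build the command text, they do not talk to a controller.

```python
def sldiag_run(device):
    """Build the command that runs system-level diagnostics on one device."""
    return f"sldiag device run -dev {device}"


def sldiag_failed_status(device):
    """Build the command that lists only failed test results for one device."""
    return f"sldiag device status -dev {device} -long -state failed"


# Device names taken from this document: "interconnect" and "mem".
for dev in ("interconnect", "mem"):
    print(sldiag_run(dev))
    print(sldiag_failed_status(dev))
```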
  • Page 40 Rerun the system-level diagnostics test. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 41 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the controller - AFF C190 To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 42 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the controller module hardware - AFF C190 To replace the controller module, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 43 6. Turn the controller module over and place it on a flat, stable surface. 7. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller module and insert it in the new controller module.
  • Page 44 1. Locate the boot media using the following illustration or the FRU map on the controller module: 2. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
  • Page 45 The NVRAM LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. ▪ If power is lost without a clean shutdown, the NVMEM LED flashes until the destage is complete, and then the LED turns off.
  • Page 46 You must have the new controller module ready so that you can move the DIMMs directly from the impaired controller module to the corresponding slots in the replacement controller module. 1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
  • Page 47 Make sure that the plug locks down onto the controller module. Step 5: Install the controller module After you install the components from the old controller module into the new controller module, you must install the new controller module into the system chassis and boot the operating system. For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
  • Page 48 Select the option to boot to Maintenance mode from the displayed menu. Restore and verify the system configuration - AFF C190 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 49 Step 2: Verify and set the HA state of the controller module You must verify the state of the controller module and, if necessary, update the state to match your system configuration. 1. In Maintenance mode from the new controller module, verify that all components display the same state: ha-config show The HA state should be the same for all components.
  • Page 50 ◦ bootmedia is the system booting device. ◦ is a Converged Network Adapter or interface not connected to a network or storage device. ◦ fcal is a Fibre Channel-Arbitrated Loop device not connected to a Fibre Channel network. ◦ is motherboard environmentals. ◦...
  • Page 51 If you want to run diagnostic tests on… Then… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
  • Page 52 If you want to run diagnostic tests on… Then… Multiple components at the same time a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 53 Reconnect the power supplies, and then power on the storage system. e. Rerun the system-level diagnostics test. Recable the system and reassign disks - AFF C190 Continue the replacement procedure by recabling the storage and confirming disk reassignment. Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections.
  • Page 54 1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find.
  • Page 55 set -privilege advanced You can respond when prompted to continue into advanced mode. The advanced mode prompt appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for the savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
  • Page 56 -node replacement-node-name -onreboot true Complete system restoration - AFF C190 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 57 -node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 58 638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure. Replace a DIMM - AFF C190 You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC); failure to do so causes a system panic.
  • Page 59 Step 2: Remove controller module To access components inside the controller module, you must first remove the controller module from the system, and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 60 Step 3: Replace the DIMMs To replace the DIMMs, you need to locate them inside the controller module, and then follow the specific sequence of steps. If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module.
  • Page 61 b. Confirm that the NVMEM LED is no longer lit. c. Reconnect the battery connector. 4. Return to Step 3: Replace the DIMMs of this procedure to recheck the NVMEM LED. 5. Locate the DIMMs on your controller module. Each system memory DIMM has an LED located on the board next to each DIMM slot. The LED for the faulty DIMM blinks every two seconds.
  • Page 62 8. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 9. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot.
  • Page 63 3. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 4. Complete the reinstallation of the controller module. The controller module begins to boot as soon as it is fully seated in the chassis.
  • Page 64 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
  • Page 65 Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 66 Replace SSD Drive or HDD Drive - AFF C190 You can replace a failed drive nondisruptively while I/O is in progress. The procedure for replacing an SSD is meant for non-spinning drives and the procedure for replacing an HDD is meant for spinning drives.
  • Page 67 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 68 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 69 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 70 13. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 71 Step 2: Remove controller module To access components inside the controller module, you must first remove the controller module from the system, and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 72 Step 3: Replace the NVMEM battery To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery. 1. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦...
  • Page 73 2. Locate the NVMEM battery in the controller module. 3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 4.
  • Page 74 4. Complete the reinstallation of the controller module. The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 75 System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component. 5. Proceed based on the result of the preceding step: If the system-level diagnostics Then… tests…...
  • Page 76 Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 77 All other components in the system must be functioning properly; if not, you must contact technical support. • The power supplies are redundant and hot-swappable. • This procedure is written for replacing one power supply at a time. Cooling is integrated with the power supply, so you must replace the power supply within two minutes of removal to prevent overheating due to reduced airflow.
  • Page 78 The power supply LEDs are lit when the power supply comes online. 2. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 79 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2.
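The maintenance-window message above follows a fixed template in which only the number of hours changes. As a minimal sketch, the helper below fills in that template; the function name `maint_window_command` is hypothetical, and it only produces the command text shown in this procedure.

```python
def maint_window_command(hours):
    """Build the AutoSupport command that suppresses automatic case creation.

    `hours` is the expected length of the maintenance window in whole hours.
    """
    if not isinstance(hours, int) or hours <= 0:
        raise ValueError("hours must be a positive whole number")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


# Reproduces the two-hour example from the procedure.
print(maint_window_command(2))
```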
  • Page 80 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the RTC battery To replace the RTC battery, you need to locate it inside the controller module, and then follow the specific sequence of steps.
  • Page 81 1. Locate the RTC battery. 2. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 82: AFF A200 System Documentation

    -node local -auto-giveback true Step 5: Complete the replacement process After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 83 Maintain Boot media Overview of boot media replacement - AFF A200 The boot media stores a primary and secondary set of system (boot image) files that the system uses when it boots. Depending on your network configuration, you can perform either a nondisruptive or disruptive replacement.
  • Page 84 Option 1: Checking NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 85 If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers display available: security key-manager query c. Shut down the impaired controller. 3. If you saw the message This command is not supported when onboard key management is enabled,...
  • Page 86 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the column displays for all authentication keys and that all key managers...
  • Page 87 Option 2: Checking NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 88 If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key-query c. Shut down the impaired controller. 3. If the type displays and the column displays anything other than...
  • Page 89 Restored column displays anything other than yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key-query c.
  • Page 90 Shut down the impaired controller - AFF A200 After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Steps a. Take the impaired controller to the LOADER prompt: If the impaired controller Then…...
  • Page 91 4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6.
  • Page 92 Step 2: Replace the boot media You must locate the boot media in the controller and follow the directions to replace it. Steps 1. If you are not already grounded, properly ground yourself. 2. Locate the boot media using the following illustration or the FRU map on the controller module: 3.
  • Page 93 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 94 Other parameters might be necessary for your interface. You can enter help ifconfig at the firmware prompt for details. Boot the recovery image - AFF A200 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
  • Page 95 If your system has… Then… No network connection a. Press when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Select the Update flash from backup config (sync flash) option from the displayed menu. If you are prompted to continue with the update, press y.
  • Page 96 Restore OKM, NSE, and NVE as needed - AFF A200 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 97 --------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to prompt. Waiting for giveback… 8. Move the console cable to the partner controller and login as admin. 9.
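The backup data on this page is delimited by BEGIN BACKUP and END BACKUP marker lines. If you keep a captured console log, a short script can pull out just the text between those markers for your records. The sketch below is an illustration only: the function name `extract_backup_chunks` is hypothetical, the sample text is a dummy placeholder (not a real key backup), and the function returns the whitespace-separated chunks without interpreting or validating them.

```python
import re

# Matches the text between the dashed BEGIN BACKUP / END BACKUP marker lines.
_BACKUP_RE = re.compile(
    r"-+BEGIN BACKUP-+\s*(.*?)\s*-+END BACKUP-+",
    re.DOTALL,
)


def extract_backup_chunks(console_text):
    """Return the chunks found between the backup markers, or None.

    The result is a list of whitespace-separated pieces exactly as they
    appeared in the captured text; nothing is decoded or checked.
    """
    match = _BACKUP_RE.search(console_text)
    if match is None:
        return None
    return match.group(1).split()


# Dummy placeholder data for illustration only.
sample = """some console output
-----BEGIN BACKUP-----
QUJD
REVG
-----END BACKUP-----
more console output
"""
print(extract_backup_chunks(sample))
```

Treat any real backup data as sensitive and store it according to your site's key management policy.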
  • Page 98 b. Enter the key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, contact Customer Support. c.
  • Page 99 This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 100 11. If the Onboard Key Management is enabled: a. Use the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager. b. Use the security key-manager key show -detail command and verify that the Restored column = yes for all authentication keys.
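The Restored-column check described above can be sketched as a small helper. This is an illustration only: it assumes the per-key values have already been collected into dictionaries, and the field names `key_id` and `restored` are hypothetical, not part of the ONTAP output format.

```python
def keys_not_restored(rows):
    """Return the key IDs whose Restored value is anything other than 'yes'.

    rows is a list of dicts with the hypothetical fields 'key_id' and 'restored'.
    """
    return [
        row["key_id"]
        for row in rows
        if row["restored"].strip().lower() != "yes"
    ]
```

An empty result means every authentication key reports yes; any returned key ID is a reason to contact Customer Support, as the step states.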
  • Page 101 -auto-giveback true Return the failed part to NetApp - AFF A200 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 102 • This procedure is written with the assumption that you are moving all drives and controller module or modules to the new chassis, and that the chassis is a new component from NetApp. • This procedure is disruptive. For a two-node cluster, you will have a complete service outage and a partial outage in a multi-node cluster.
  • Page 103 -skip-lif-migration-before-shutdown true Answer y when prompted. Move and replace hardware - AFF A200 Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 104 5. Repeat the preceding steps for any remaining power supplies. 6. Using both hands, support and align the edges of the power supply with the opening in the system chassis, and then gently push the power supply into the chassis using the cam handle. The power supplies are keyed and can only be installed one way.
  • Page 105 4. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis. Step 3: Move drives to the new chassis Move the drives from each bay opening in the old chassis to the same bay opening in the new chassis. Steps 1.
  • Page 106 Step 4: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis. Steps 1. Remove the screws from the chassis mount points. 2.
  • Page 107 From the boot menu, select the option for Maintenance mode. Restore and verify the configuration - AFF A200 Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match...
  • Page 108 Steps 1. In Maintenance mode, from either controller module, display the HA state of the local controller module and chassis: ha-config show The HA state should be the same for all components. 2. If the displayed system state for the chassis does not match your system configuration: a.
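The rule that the HA state should be the same for all components can be expressed as a one-line comparison. This is a minimal sketch under the assumption that the states shown by ha-config show have been copied into a dictionary by hand; the function name and the dictionary keys are hypothetical.

```python
def ha_states_match(states: dict) -> bool:
    """Return True when every component reports the same HA state.

    states maps a component name (for example 'controller', 'chassis')
    to the state string displayed for it.
    """
    return len(set(states.values())) <= 1
```

If the function returns False, the displayed state for at least one component needs to be updated to match your system configuration, as described in the next step.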
  • Page 109 The interconnect tests are disabled by default and must be enabled to run separately. 5. Run the interconnect diagnostics test from the Maintenance mode prompt: sldiag device run -dev interconnect You only need to run the interconnect test from one controller. 6.
  • Page 110 Rerun the system-level diagnostics test. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 111 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A200 To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 112 move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode. Step 1: Remove controller module To replace the controller module, you must first remove the old controller module from the chassis. Steps 1.
  • Page 113 7. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller module and insert it in the new controller module.
  • Page 114 specific sequence of steps. Steps 1. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then check the NVRAM LED identified by the NV icon.
  • Page 115 5. Move the battery to the replacement controller module. 6. Loop the battery cable around the cable channel on the side of the battery holder. 7. Position the battery pack by aligning the battery holder key ribs to the “V” notches on the sheet metal side wall.
  • Page 116 4. Repeat these steps to remove additional DIMMs as needed. 5. Verify that the NVMEM battery is not plugged into the new controller module. 6. Locate the slot where you are installing the DIMM. 7. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot.
  • Page 117 Steps 1. If you are not already grounded, properly ground yourself. 2. If you have not already done so, replace the cover on the controller module. 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system.
  • Page 118 You can safely respond to these prompts. Restore and verify the system configuration - AFF A200 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 119 • The replacement node is the new node that replaced the impaired node as part of this procedure. • The healthy node is the HA partner of the replacement node. Steps 1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt. 2.
  • Page 120 2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Display and note the available devices on the controller module: sldiag device show -dev mb The controller module devices and ports displayed can be any one or more of the following: ◦...
  • Page 121 If you want to run diagnostic Then… tests on… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
  • Page 122 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 123 Reconnect the power supplies, and then power on the storage system. e. Rerun the system-level diagnostics test. Recable the system and reassign disks - AFF A200 Continue the replacement procedure by re-cabling the storage and confirming disk reassignment. Step 1: Re-cable the system After running diagnostics, you must recable the controller module’s storage and network connections.
  • Page 124 d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure.
  • Page 125 c. Wait for the `savecore` command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d. Return to the admin privilege level: set -privilege admin 5.
  • Page 126 Option 2: Manually reassign the system ID on a stand-alone system in ONTAP In a stand-alone system, you must manually reassign disks to the new controller’s system ID before you return the system to normal operating condition. About this task This procedure applies only to systems that are in a stand-alone configuration.
  • Page 127 Complete system restoration - AFF A200 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 128 -node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 129 system panic. About this task • All other components in the system must be functioning properly; if not, you must contact technical support. • You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 130 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
  • Page 131 Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. About this task If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module.
  • Page 132 socket, and then unplug the battery cable from the socket. b. Confirm that the NVMEM LED is no longer lit. c. Reconnect the battery connector. 5. Return to step 2 of this procedure to recheck the NVMEM LED. 6. Locate the DIMMs on your controller module. Each system memory DIMM has an LED located on the board next to each DIMM slot.
  • Page 133 9. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 10. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot.
  • Page 134 4. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis.
  • Page 135 If your system is in… Then perform these steps… A stand-alone a. With the cam handle in the open position, firmly push the controller module in configuration until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 136 appears. 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
  • Page 137 If the system-level Then… diagnostics tests… Resulted in some test Determine the cause of the problem: failures a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 138 If the system-level Then… diagnostics tests… Were completed without a. Clear the status logs: sldiag device clearstatus any failures b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
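The log-cleared verification above compares the command output with a fixed default response. A minimal sketch of that comparison follows; the function name is hypothetical, and the expected string is the one quoted in the step.

```python
# Default response quoted in the step above.
EXPECTED_CLEARED = "SLDIAG: No log messages are present."


def log_is_cleared(output: str) -> bool:
    """Return True when the sldiag status output is exactly the default response."""
    return output.strip() == EXPECTED_CLEARED
```

Any other output means status entries are still present and the clearstatus step should be repeated before continuing.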
  • Page 139 Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 140 How you replace the disk depends on how the disk drive is being used. If SED authentication is enabled, you must use the SED replacement instructions in the ONTAP 9 NetApp Encryption Power Guide. These instructions describe additional steps you must perform before and after replacing an SED.
  • Page 141 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 142 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 143 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 144 13. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 145 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 146 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the NVMEM battery To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery.
  • Page 147 Steps 1. If you are not already grounded, properly ground yourself. 2. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then check the NVRAM LED identified by the NV icon.
  • Page 148 7. Loop the battery cable around the cable channel on the side of the battery holder. 8. Position the battery pack by aligning the battery holder key ribs to the “V” notches on the sheet metal side wall. 9. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side wall.
  • Page 149 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 150 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 151 2. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4.
  • Page 152 Rerun the system-level diagnostic test. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 153 4. Squeeze the latch on the power supply cam handle, and then open the cam handle to fully release the power supply from the mid plane. If you have an AFF A200 system, a plastic flap within the now empty slot is released to cover the opening and maintain air flow and cooling.
  • Page 154 The power supply LEDs are lit when the power supply comes online. 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 155 If the impaired controller is Then… displaying… Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 156 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
  • Page 157 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery. 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 158: FAS2700 System Documentation

    -node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 159 Video two of two: Performing end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed guide - AFF A220 and FAS2700 This guide gives detailed step-by-step instructions for installing a typical NetApp system.
  • Page 160 Step 1: Prepare for installation To install your FAS2700 or AFF A220 system, you need to create an account on the NetApp Support Site, register your system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 161 6. Download and complete the Cluster Configuration Worksheet. Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed.
  • Page 162 You need to be aware of the safety concerns associated with the weight of the system. 3. Attach cable management devices (as shown). 4. Place the bezel on the front of the system. Step 3: Cable controllers to your network You can cable the controllers to your network by using the two-node switchless cluster method or by using the cluster interconnect network.
  • Page 163 Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b...
  • Page 164 Step Perform on each controller Use one of the following cable types to cable the UTA2 data ports to your host network: An FC host • 0c and 0d • or 0e and 0f A 10GbE • e0c and e0d •...
  • Page 165 2. To cable your storage, see Cabling controllers to drive shelves Option 2: Cable a switched cluster, unified network configuration Management network, UTA2 data network, and management ports on the controllers are connected to switches. The cluster interconnect ports are cabled to the cluster interconnect switches.
  • Page 166 Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use one of the following cable types to cable the UTA2 data ports to your host network: An FC host •...
  • Page 167 Step Perform on each controller module Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To cable your storage, see Cabling controllers to drive shelves Option 3: Cable a two-node switchless cluster, Ethernet network configuration Management network, Ethernet data network, and management ports on the controllers are connected to switches.
  • Page 168 Step Perform on each controller Cable the cluster interconnect ports to each other with the cluster interconnect cable: • e0a to e0a • e0b to e0b Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
  • Page 169 Step Perform on each controller Cable the e0M ports to the management network switches with the RJ45 cables: DO NOT plug in the power cords at this point. 2. To cable your storage, see Cabling controllers to drive shelves Option 4: Cable a switched cluster, Ethernet network configuration Management network, Ethernet data network, and management ports on the controllers are connected to switches.
  • Page 170 Step Perform on each controller module Cable e0a and e0b to the cluster interconnect switches with the cluster interconnect cable: Use the Cat 6 RJ45 cable to cable the e0c through e0f ports to your host network:...
  • Page 171 Cabling controllers to drive shelves Step 4: Cable controllers to drive shelves You must cable the controllers to your shelves using the onboard storage ports. NetApp recommends MP-HA cabling for systems with external storage. If you have a SAS tape drive, you can use single-path cabling.
  • Page 172 Step Perform on each controller Cable the shelf-to-shelf ports. • Port 3 on IOM A to port 1 on the IOM A on the shelf directly below. • Port 3 on IOM B to port 1 on the IOM B on the shelf directly below. mini-SAS HD to mini-SAS HD cables Connect each node to IOM A in the stack.
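The shelf-to-shelf pattern above (port 3 on each IOM to port 1 on the same IOM of the shelf directly below) can be listed for a stack of any size. This is a sketch for planning purposes only: the function name is hypothetical, and numbering shelves from 1 at the top of the stack is an assumption, not something the guide specifies.

```python
def shelf_to_shelf_cables(shelf_count: int):
    """List (from, to) port pairs for shelf-to-shelf cabling in one stack.

    Shelves are numbered 1..shelf_count from top to bottom (an assumption).
    """
    cables = []
    for upper in range(1, shelf_count):
        lower = upper + 1  # the shelf directly below
        for iom in ("A", "B"):
            cables.append(
                (f"shelf {upper} IOM {iom} port 3", f"shelf {lower} IOM {iom} port 1")
            )
    return cables
```

A single shelf yields no shelf-to-shelf cables; each additional shelf adds two, one per IOM.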
  • Page 173 Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch. Option 1: Complete system setup if network discovery is enabled If you have network discovery enabled on your laptop, you can complete system setup and configuration using automatic cluster discovery.
  • Page 174 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 7. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. 8.
  • Page 175 c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Use the following animation to set one or more drive shelf IDs: Setting drive shelf IDs 3.
  • Page 176 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration Guide. 7. Verify the health of your system by running Config Advisor.
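The https://x.x.x.x address format can be built and validated with the Python standard library. This is a minimal sketch; the function name is hypothetical, and it assumes an IPv4 node management address, as the x.x.x.x pattern suggests.

```python
import ipaddress


def management_url(address: str) -> str:
    """Return the https:// URL for a node management IPv4 address.

    Raises ValueError if the string is not a valid IPv4 address.
    """
    ip = ipaddress.IPv4Address(address)
    return f"https://{ip}"
```

Validating the address first catches typos such as a missing octet before the URL is entered in the browser.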
  • Page 177 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 178 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 179 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 180 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 181 Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 182 b. Verify the Restored column shows yes for all authentication keys: security key-manager key-query c. Verify that the type shows onboard, and then manually back up the OKM Key Manager information. d. Go to advanced privilege mode and enter y when prompted to continue: set -priv advanced e.
  • Page 183 Return to admin mode: set -priv admin h. You can safely shut down the controller. Shut down the impaired controller - AFF A220 and FAS2700 Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 184 If the impaired controller Then… displays… Press Ctrl-C, and then respond y when prompted. Waiting for giveback… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
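The two table rows shown above can be read as a simple lookup from what the impaired controller displays to the next action. The sketch below covers only those two rows; the function name and the state labels are hypothetical, and the command string is the one quoted in the table.

```python
def next_action(display: str, impaired_node_name: str) -> str:
    """Map the impaired controller's display to the action given in the table.

    display is one of the hypothetical labels 'waiting_for_giveback',
    'system_prompt', or 'password_prompt'.
    """
    if display == "waiting_for_giveback":
        return "Press Ctrl-C, and then respond y when prompted."
    if display in ("system_prompt", "password_prompt"):
        # Run this from the healthy controller.
        return f"storage failover takeover -ofnode {impaired_node_name}"
    raise ValueError(f"display state not covered by this sketch: {display!r}")
```

States not listed in the excerpt raise an error rather than guessing an action.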
  • Page 185 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the boot media - AFF A220 and FAS2700 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 186 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Replace the boot media You must locate the boot media in the controller and follow the directions to replace it.
  • Page 187 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 188 ◦ If NVE is not enabled, download the image without NetApp Volume Encryption, as indicated in the download button. • If your system is an HA pair, you must have a network connection. • If your system is a stand-alone system you do not need a network connection, but you must perform an additional reboot when restoring the var file system.
  • Page 189 Other parameters might be necessary for your interface. You can enter help ifconfig at the firmware prompt for details. Boot the recovery image - AFF A220 and FAS2700 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
  • Page 190 -node local -auto-giveback true command. Restore OKM, NSE, and NVE as needed - AFF A220 and FAS2700 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled.
  • Page 191 Option 1: Restore NVE or NSE when Onboard Key Manager is enabled Steps 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console Then…...
  • Page 192 --------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to prompt. Waiting for giveback… 8. Move the console cable to the partner controller and login as admin. 9.
  • Page 193 b. Enter the key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager and verify that the Restored column = yes for all authentication keys. If the Restored column = anything other than yes, contact Customer Support. c.
  • Page 194 This command does not work if NVE (NetApp Volume Encryption) is configured. 10. Use the security key-manager query command to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 195 11. If the Onboard Key Management is enabled: a. Use the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager. b. Use the security key-manager key show -detail command and verify that the Restored column = yes for all authentication keys.
  • Page 196 -auto-giveback true Return the failed part to NetApp - AFF A220 and FAS2700 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 197 • You must replace the failed component with a replacement FRU component you received from your provider. Step 1: Shut down the impaired controller Shut down or take over the impaired controller using the appropriate procedure for your configuration. Option 1: Most configurations To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 198 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller: prompt (enter system password) • For an HA pair, take over the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 199 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 200 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace a caching module To replace a caching module referred to as the M.2 PCIe card on the label on your controller, locate the slot inside the controller and follow the specific sequence of steps.
  • Page 201 Your storage system must meet certain criteria depending on your situation: • It must have the appropriate operating system for the caching module you are installing. • It must support the caching capacity. • All other components in the storage system must be functioning properly; if not, you must contact technical support.
  • Page 202 7. Close the controller module cover, as needed. Step 4: Reinstall the controller module After you replace components in the controller module, reinstall it into the chassis. Steps 1. If you are not already grounded, properly ground yourself. 2. If you have not already done so, replace the cover on the controller module. 3.
  • Page 203 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 204 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 205 appears. 3. Run diagnostics on the caching module: sldiag device run -dev fcache 4. Verify that no hardware problems resulted from the replacement of the caching module: sldiag device status -dev fcache -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
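The run and status commands above follow the same pattern for each device name used in this guide (fcache, mem, nvmem). A minimal Python sketch of that pattern follows. The helper name `sldiag_commands` is hypothetical; it only builds the command strings for entry at the Maintenance mode prompt and does not contact a controller.

```python
def sldiag_commands(dev_name: str) -> list[str]:
    """Return the run command and the failed-status command for one device."""
    # A device name is a single token such as "fcache", "mem", or "nvmem".
    if not dev_name or any(ch.isspace() for ch in dev_name):
        raise ValueError("dev_name must be a single token, e.g. 'fcache'")
    return [
        f"sldiag device run -dev {dev_name}",
        f"sldiag device status -dev {dev_name} -long -state failed",
    ]


if __name__ == "__main__":
    for command in sldiag_commands("fcache"):
        print(command)
```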
  • Page 206 If the system-level Then… diagnostics tests… Resulted in some test Determine the cause of the problem: failures a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 207 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
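When captured console output has to be checked by script, the node rows of the metrocluster node show output above can be picked out with a regular expression. This is a minimal Python sketch under stated assumptions: each node row has the form node name, configuration state, mirroring value, mode; the mirroring value is either enabled or disabled; and the function names are hypothetical.

```python
import re

# node name, configuration state, mirroring (assumed enabled/disabled), mode
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(enabled|disabled)\s+(.+?)\s*$")


def parse_node_rows(output: str) -> list[dict]:
    """Extract one dict per node row; other lines are skipped."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            node, state, mirroring, mode = match.groups()
            rows.append(
                {"node": node, "state": state, "mirroring": mirroring, "mode": mode}
            )
    return rows


def all_nodes_enabled(rows: list[dict]) -> bool:
    """True only if at least one row exists and every row is configured and enabled."""
    return bool(rows) and all(
        row["state"] == "configured" and row["mirroring"] == "enabled" for row in rows
    )
```

Header, separator, and cluster-name lines do not match the pattern, so only the node rows are returned.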
  • Page 208 Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 209 If your system is running Then… clustered ONTAP with… Two controllers in the cluster cluster ha modify -configured false storage failover modify -node node0 -enabled false More than two controllers in the storage failover modify -node node0 -enabled false cluster 2.
  • Page 210 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Move and replace hardware - AFF A220 and FAS2700 Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 211 4. Use the cam handle to slide the power supply out of the system. When removing a power supply, always use two hands to support its weight. 5. Repeat the preceding steps for any remaining power supplies. 6. Using both hands, support and align the edges of the power supply with the opening in the system chassis, and then gently push the power supply into the chassis using the cam handle.
  • Page 212 4. Set the controller module aside in a safe place, and repeat these steps if you have another controller module in the chassis. Step 3: Move drives to the new chassis You need to move the drives from each bay opening in the old chassis to the same bay opening in the new chassis.
  • Page 213 Step 4: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis. 1. Remove the screws from the chassis mount points. 2.
  • Page 214 Restore and verify the configuration - AFF A220 and FAS2700 You must verify the HA state of the chassis and run System-Level diagnostics, switch back aggregates, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 215 Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration. 1. In Maintenance mode, from either controller module, display the HA state of the local controller module and chassis: ha-config show The HA state should be the same for all components.
  • Page 216 During the boot process, you can safely respond to prompts: 2. Repeat the previous step on the second controller if you are in an HA configuration. Both controllers must be in Maintenance mode to run the interconnect test. 3. At the LOADER prompt, access the special drivers specifically designed for system-level diagnostics to function properly: boot_diags During the boot process, you can safely respond...
  • Page 217 If the system-level diagnostics Then… tests… Were completed without any a. Clear the status logs: sldiag device clearstatus failures b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 218 If your system is running Then… ONTAP… Resulted in some test failures Determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
  • Page 219 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 220 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A220 and FAS2700 Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 221 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond y. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name...
  • Page 222 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the controller module hardware - AFF A220 and FAS2700 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 223 6. Turn the controller module over and place it on a flat, stable surface. 7. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 2: Move the NVMEM battery To move the NVMEM battery from the old controller module to the new controller module, you must perform a specific sequence of steps.
  • Page 224 1. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then check the NVRAM LED identified by the NV icon. The NVRAM LED blinks while destaging contents to the flash memory when you halt the system.
  • Page 225 7. Position the battery pack by aligning the battery holder key ribs to the “V” notches on the sheet metal side wall. 8. Slide the battery pack down along the sheet metal side wall until the support tabs on the side wall hook into the slots on the battery pack, and the battery pack latch engages and clicks into the opening on the side wall.
  • Page 226 controller module to the corresponding slots in the replacement controller module. 1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation. 3.
  • Page 227 Step 5: Move a caching module, if present If your AFF A220 or FAS2700 system has a caching module, you need to move the caching module from the old controller module to the replacement controller module. The caching module is referred to as the “M.2 PCIe card”...
  • Page 228 5. Reseat and push the heatsink down to engage the locking button on the caching module housing. 6. Close the controller module cover, as needed. Step 6: Install the controller After you install the components from the old controller module into the new controller module, you must install the new controller module into the system chassis and boot the operating system.
  • Page 229 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 230 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 231 Restore and verify the system configuration - AFF A220 and FAS2700 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary. Step 1: Set and verify system time after replacing the controller You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration.
  • Page 232 ▪ ▪ ▪ mcc-2n ▪ mccip ▪ non-ha b. Confirm that the setting has changed: ha-config show Step 3: Run system-level diagnostics You should run comprehensive or focused diagnostic tests for specific components and subsystems whenever you replace the controller. All commands in the diagnostic procedures are issued from the controller where the component is being replaced.
  • Page 233 If you want to run diagnostic Then… tests on… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
  • Page 234 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 235 Rerun the system-level diagnostics test. Recable the system and reassign disks - AFF A220 and FAS2700 To complete the replacement procedure and restore your system to full operation, you must recable the storage, confirm disk reassignment, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller.
  • Page 236 c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find. d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure.
  • Page 237 node1> `storage failover show`   Takeover Node Partner Possible State Description ------------ ------------ -------- ------------------------------------- node1 node2 false System ID changed on partner (Old:   151759755, New: 151759706), In takeover node2 node1 Waiting for giveback (HA mailboxes) 4. From the healthy controller, verify that any coredumps are saved: a.
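The old and new system IDs in the storage failover show output above can be extracted from a saved copy of the output, which avoids transcription mistakes in later steps. A minimal Python sketch, assuming the message keeps the wording shown in the example; the function name `system_id_change` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the message shown in the example output; whitespace (including a
# wrapped line) is allowed between the labels and the numbers.
PATTERN = re.compile(
    r"System ID changed on partner\s*\(Old:\s*(\d+),\s*New:\s*(\d+)\)"
)


def system_id_change(output: str) -> Optional[Tuple[str, str]]:
    """Return (old_id, new_id) if the message is present, otherwise None."""
    match = PATTERN.search(output)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The IDs are returned as strings so that they can be compared or pasted without any numeric reformatting.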
  • Page 238 node1> `storage disk show -ownership` Disk Aggregate Home Owner DR Home Home ID Owner ID DR Home ID Reserver Pool ----- ------ ----- ------ -------- ------- ------- ------- --------- 1.0.0 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0 1.0.1 aggr0_1 node1 node1 1873775277 1873775277 1873775277 Pool0 Option 2: Manually reassign the system ID on a stand-alone system in ONTAP...
  • Page 239 5. Reassign disk ownership by using the system ID information obtained from the disk show command: disk reassign -s old system ID disk reassign -s 118073209 6. Verify that the disks were assigned correctly: disk show -a The disks belonging to the replacement node should show the new system ID. The following example now shows the disks owned by system-1 with the new system ID, 118065481: *>...
  • Page 240 dr-group-id cluster node node-systemid dr- partner-systemid  ----------- --------------------- -------------------- ------------- -------------------  1 Cluster_A Node_A_1 536872914 118073209  1 Cluster_B Node_B_1 118073209 536872914  2 entries were displayed. 3. View the new system ID at the Maintenance mode prompt on the impaired node: disk show In this example, the new system ID is 118065481: Local System ID: 118065481...
  • Page 241 Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
  • Page 242 -privilege admin Complete system restoration - AFF A220 and FAS2700 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 243 If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
  • Page 244 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 245 Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 246 If the impaired controller is Then… displaying… Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 247 • If you have a MetroCluster configuration, you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show). Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours:...
  • Page 248 4. Squeeze the latch on the cam handle until it releases, open the cam handle fully to release the controller module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6.
  • Page 249 Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. If you are replacing a DIMM, you need to remove it after you have unplugged the NVMEM battery from the controller module.
  • Page 250 a. Locate the battery, press the clip on the face of the battery plug to release the lock clip from the plug socket, and then unplug the battery cable from the socket. b. Confirm that the NVMEM LED is no longer lit. c.
  • Page 251 9. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 10. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot.
  • Page 252 Do not completely insert the controller module in the chassis until instructed to do so. 4. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5.
  • Page 253 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 254 function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the system memory: sldiag device run -dev mem 4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
  • Page 255 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 256 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 257 Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 258 You may also choose to watch the Replace failed drive video that shows an overview of the embedded drive replacement procedure.
  • Page 259 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 260 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 261 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 262 13. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 263 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 264 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2.
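The AutoSupport command above differs only in the number of hours, so it is easy to generate for a planned maintenance window. A minimal Python sketch; the helper name `maintenance_message_command` is hypothetical, and it only formats the command string shown in the example without sending anything to a cluster.

```python
def maintenance_message_command(hours: int) -> str:
    """Build the AutoSupport invoke command that suppresses case creation."""
    # The example in the text uses a whole number of hours followed by "h".
    if not isinstance(hours, int) or hours < 1:
        raise ValueError("hours must be a positive whole number")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


if __name__ == "__main__":
    print(maintenance_message_command(2))
```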
  • Page 265 module from the midplane, and then, using two hands, pull the controller module out of the chassis. 5. Turn the controller module over and place it on a flat, stable surface. 6. Open the cover by sliding in the blue tabs to release the cover, and then swing the cover up and open. Step 3: Replace the NVMEM battery To replace the NVMEM battery in your system, you must remove the failed NVMEM battery from the system and replace it with a new NVMEM battery.
  • Page 266 3. Locate the NVMEM battery in the controller module. 4. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 5.
  • Page 267 If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 5. Complete the reinstallation of the controller module: If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis.
  • Page 268 If your system is in… Then perform these steps… A stand-alone configuration a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors.
  • Page 269 function properly: boot_diags During the boot process, you can safely respond to the prompts until the Maintenance mode prompt (*>) appears. 3. Run diagnostics on the NVMEM memory: sldiag device run -dev nvmem 4. Verify that no hardware problems resulted from the replacement of the NVMEM battery: sldiag device status -dev nvmem -long -state failed System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of...
  • Page 270 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 271 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 272 Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 273 5. Use the cam handle to slide the power supply out of the system. When removing a power supply, always use two hands to support its weight. 6. Make sure that the on/off switch of the new power supply is in the Off position. 7.
  • Page 274 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 275 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 276 Step 2: Remove controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 277 Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery.
  • Page 278 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 279 3. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 4. If the power supplies were unplugged, plug them back in and reinstall the power cable retainers. 5.
  • Page 280 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 281: Aff A250 System Documentation

    Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 282 Step 1: Prepare for installation To install your AFF A250 system, you need to create an account and register the system. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 283 ONTAP Configuration Guide and collect the required information listed in that guide. Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed.
  • Page 284 3. Identify and manage cables because this system does not have a cable management device. 4. Place the bezel on the front of the system. Step 3: Cable controllers There is required cabling for your platform’s cluster using the two-node switchless cluster method or the cluster interconnect network method.
  • Page 285 Step Perform on each controller Cable the cluster interconnect ports to each other with the 25GbE cluster interconnect cable • e0c to e0c • e0d to e0d Cable the wrench ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point.
  • Page 286 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation (Cabling a switched cluster) or the step-by-step instructions to complete the cabling between the controllers and to the switches: Step Perform on each controller...
  • Page 287 Option 1: Cable to a Fibre Channel host network Fibre Channel ports on the controllers are connected to Fibre Channel host network switches. Before you begin Contact your network administrator for information about connecting the system to the switches. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place;...
  • Page 288 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Step Perform on each controller module Cable ports e4a through e4d to the 10GbE host network switches.
  • Page 289 1. Use the animation (Cabling the controllers to a single NS224) or the step-by-step instructions to cable your controller modules to a single shelf. Step Perform on each controller module Cable controller A to the shelf: Cable controller B to the shelf: 2.
  • Page 290 Steps 1. Plug the power cords into the controller power supplies, and then connect them to power sources on different circuits. The system begins to boot. Initial booting may take up to eight minutes. 2. Make sure that your laptop has network discovery enabled. See your laptop’s online help for more information.
  • Page 291 Steps 1. Cable and configure your laptop or console: a. Set the console port on the laptop or console to 115,200 baud with N-8-1. See your laptop or console’s online help for how to configure the console port. b. Connect the laptop or console to the switch on the management subnet. c.
  • Page 292 Maintain Boot media Overview of boot media replacement - AFF A250 The boot media stores a primary and secondary set of system (boot image) files that the system uses when it boots. Before you begin You must have a USB flash drive, formatted to MBR/FAT32, with the appropriate amount of storage to hold the file.
  • Page 293 Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 294 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 295 Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key-query c.
  • Page 296 Return to admin mode: set -priv admin h. You can safely shut down the controller. Shut down the controller - AFF A250 Option 1: Most systems After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 297 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive. Step 1: Remove the controller module - AFF A250 To access components inside the controller module, you must first remove the controller module from the system, and then remove the cover on the controller module.
  • Page 298 2. Unplug the controller module power supplies from the source. 3. Release the power cable retainers, and then unplug the cables from the power supplies. 4. Insert your forefinger into the latching mechanism on either side of the controller module, press the lever with your thumb, and gently pull the controller a few inches out of the chassis.
  • Page 299 Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 300 Step 2: Replace the boot media Before you can replace the boot media, you must remove the air duct on the controller module to locate the failed boot media. You need a #1 magnetic Phillips head screwdriver to remove the screw that holds the boot media in place. Due to the space constraints within the controller module, you should also have a magnet to transfer the screw onto so that you do not lose it.
  • Page 301 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download...
  • Page 302 ◦ If NVE is not enabled, download the image without NetApp Volume Encryption, as indicated in the download button. • If your system is an HA pair, you must have a network connection. • If your system is a stand-alone system you do not need a network connection, but you must perform an additional reboot when restoring the var file system.
  • Page 303 7. Close the controller module cover and tighten the thumbscrew.
  • Page 304 Controller module cover Thumbscrew 8. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 9. Plug the power cable into the power supply and reinstall the power cable retainer. 10.
  • Page 305 Other parameters might be necessary for your interface. You can enter help ifconfig at the firmware prompt for details. Boot the recovery image - AFF A250 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
  • Page 306 If your system has… Then… No network connection and is in a MetroCluster IP configuration a. Press n when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node- name:iscsi.session.stateChanged:notice]:...
  • Page 307 Restore OKM, NSE, and NVE as needed - AFF A250 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 308 If the console displays… Then… The LOADER prompt Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…. a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: y c.
  • Page 309 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 310 Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console displays… Then…...
  • Page 311 -auto-giveback true Return the failed part to NetApp - AFF A250 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 312 • If you have a cluster with more than two controllers, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 313 Answer y when prompted. Replace hardware - AFF A250 Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 314 6. Set the controller module aside in a safe place, and repeat these steps for the other controller module in the chassis. Step 2: Move drives to the new chassis You need to move the drives from each bay opening in the old chassis to the same bay opening in the new chassis.
  • Page 315 Complete the restoration and replacement process - AFF A250 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 316 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 317 Shut down the impaired controller module - AFF A250 • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the “Returning SEDs to unprotected mode” section of the ONTAP 9 NetApp Encryption Power Guide.
  • Page 318 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback…...
  • Page 319 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the controller module hardware - AFF A250 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 320 Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
  • Page 321 Thumbscrew Controller module cover. 7. Lift out the air duct cover. Step 2: Move the power supply You must move the power supply from the impaired controller module to the replacement controller module when you replace a controller module. 1. Disconnect the power supply. 2.
  • Page 322 Blue power supply locking tab Power supply 5. Move the power supply to the new controller module, and then install it. 6. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
  • Page 323 3. Repeat these steps for the remaining fan modules. Step 4: Move the boot media There is one boot media device in the AFF A250 under the air duct in the controller module. You must move it from the impaired controller module to the replacement controller module.
  • Page 324 Remove the screw securing the boot media to the motherboard in the impaired controller module. Lift the boot media out of the impaired controller module. a. Using the #1 magnetic screwdriver, remove the screw from the boot media, and set it aside safely on the magnet.
  • Page 325 1. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot. Hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. 2. Locate the corresponding DIMM slot on the replacement controller module. 3.
  • Page 326 Loosen the screw in the controller module. Move the mezzanine card. 2. Unplug any cabling associated with the mezzanine card. Make sure that you label the cables so that you know where they came from. a. Remove any SFP or QSFP modules that might be in the mezzanine card and set them aside. b.
  • Page 327 Squeeze the clip on the face of the battery plug. Unplug the battery cable from the socket. Grasp the battery and press the blue locking tab marked PUSH. Lift the battery out of the holder and controller module. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket.
  • Page 328 Step 8: Install the controller module After all of the components have been moved from the impaired controller module to the replacement controller module, you must install the replacement controller module into the chassis, and then boot it to Maintenance mode. You can use the following illustrations or the written steps to install the replacement controller module in the chassis.
  • Page 329 Controller module cover Thumbscrew 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
  • Page 330 The controller module should be fully inserted and flush with the edges of the chassis. Restore and verify the system configuration - AFF A250 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 331 ▪ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy controller remains down. You can safely respond to these prompts. Recable the system and reassign disks - AFF A250 Continue the replacement procedure by recabling the storage and confirming disk reassignment.
  • Page 332 Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections. Steps 1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b.
  • Page 333 You can respond y when prompted to continue into advanced mode. The advanced mode prompt appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for the savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
  • Page 334 -node replacement-node-name -onreboot true Complete system restoration - AFF A250 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 335 Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
  • Page 336 -node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 337 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name...
  • Page 338 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 339 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover. Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 340 Step 3: Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct or locating it using the LED next to the DIMM, and then replace it following the specific sequence of steps. Use the following video or the tabulated steps to replace a DIMM: Replacing a DIMM 1.
  • Page 341 2. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation. 3. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot. 4.
  • Page 342 2. Close the controller module cover and tighten the thumbscrew.
  • Page 343 Controller module cover Thumbscrew 3. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
  • Page 344 Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 345 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 346 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 347 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 348 13. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 349 Option 2: System is in a MetroCluster Do not use this procedure if your system is in a two-node MetroCluster configuration. To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. •...
  • Page 350 4. Insert your forefinger into the latching mechanism on either side of the controller module, press the lever with your thumb, and gently pull the controller a few inches out of the chassis. If you have difficulty removing the controller module, place your index fingers through the finger holes from the inside (by crossing your arms).
  • Page 351 Thumbscrew Controller module cover Step 3: Replace a fan To replace a fan, remove the failed fan module and replace it with a new fan module. Use the following video or the tabulated steps to replace a fan: Replacing a fan 1.
  • Page 352 Fan module 3. Align the edges of the replacement fan module with the opening in the controller module, and then slide the replacement fan module into the controller module. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
  • Page 353 Controller module cover Thumbscrew 2. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
  • Page 354 -node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 355 If the impaired controller is Then… displaying… Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 356 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 357 cover. Thumbscrew Controller module cover. Step 3: Replace or install a mezzanine card To replace a mezzanine card, you must remove the impaired card and install the replacement card; to install a mezzanine card, you must remove the faceplate and install the new card. Use the following video or the tabulated steps to replace a mezzanine card: Replacing a mezzanine card 1.
  • Page 358 Remove screws on the face of the controller module. Loosen the screw in the controller module. Remove the mezzanine card. a. Unplug any cabling associated with the impaired mezzanine card. Make sure that you label the cables so that you know where they came from. b.
  • Page 359 Do not apply force when tightening the screw on the mezzanine card; you might crack it. i. Insert any SFP or QSFP modules that were removed from the impaired mezzanine card to the replacement mezzanine card. 3. To install a mezzanine card: 4.
  • Page 360 -node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 361 About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 362 The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3.
  • Page 363 Lever Latching mechanism 5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface. 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.
  • Page 364 Thumbscrew Controller module cover. Step 3: Replace the NVMEM battery To replace the NVMEM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module. Use the following video or the tabulated steps to replace the NVMEM battery: Replacing the NVMEM battery 1.
  • Page 365 Grasp the battery and press the blue locking tab marked PUSH. Lift the battery out of the holder and controller module. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket.
  • Page 366 Controller module cover Thumbscrew 2. Insert the controller module into the chassis: a. Ensure the latching mechanism arms are locked in the fully extended position. b. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.
  • Page 367 ◦ If the scan reported no failures, select Reboot from the menu to reboot the system. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 368 1. If you are not already grounded, properly ground yourself. 2. Identify the power supply you want to replace, based on console error messages or through the red Fault LED on the power supply. 3. Disconnect the power supply: a. Open the power cable retainer, and then unplug the power cable from the power supply. b.
  • Page 369 Once power is restored to the power supply, the status LED should be green. 7. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 370 If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name...
  • Page 371 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 372 6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover. Thumbscrew Controller module cover. 7. Lift out the air duct cover.
  • Page 373 Step 3: Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. Use the following video or the tabulated steps to replace the RTC battery: Replacing the RTC battery 1.
  • Page 374 Gently pull the tab away from the battery housing. Attention: Pulling it away aggressively might displace the tab. Lift the battery up. Note: Make a note of the polarity of the battery. The battery should eject. 2.
  • Page 375 With positive polarity face up, slide the battery under the tab of the battery housing. Push the battery gently into place and make sure the tab secures it to the housing. CAUTION: Pushing it in aggressively might cause the battery to eject again.
  • Page 376: Aff A300 System Documentation

    -node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 377 Install MetroCluster IP configuration • Install MetroCluster Fabric-Attached configuration Installation and setup PDF poster - AFF A300 You can use the PDF poster to install and set up your new system. The PDF poster provides step-by-step instructions with live links to additional content.
  • Page 378 ◦ The impaired node is the node on which you are performing maintenance. ◦ The healthy node is the HA partner of the impaired node. Check onboard encryption keys - AFF A300 Prior to shutting down the impaired controller and checking the status of the onboard encryption keys, you must check the status of the impaired controller, disable automatic giveback, and check the version of ONTAP that is running.
  • Page 379 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 380 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 381 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 382 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 383 Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows yes for all authentication keys: security key-manager key-query c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information.
  • Page 384 Return to admin mode: set -priv admin h. You can safely shut down the controller. Shut down the impaired controller - AFF A300 Shut down or take over the impaired controller using the appropriate procedure for your configuration. Option 1: Most configurations After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 385 If the impaired controller Then… displays… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press...
  • Page 386 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 387 controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
  • Page 388 Errors: - 8. On the impaired controller module, disconnect the power supplies. Replace the boot media - AFF A300 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 389 Make sure that you support the bottom of the controller module as you slide it out of the chassis. Step 2: Replace the boot media - AFF A300 You must locate the boot media in the controller and follow the directions to replace it.
  • Page 390 3. Press the blue button on the boot media housing to release the boot media from its housing, and then gently pull it straight out of the boot media socket. Do not twist or pull the boot media straight up, because this could damage the socket or the boot media.
  • Page 391 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 392 The changes will be implemented when the system is booted. Boot the recovery image - AFF A300 The procedure for booting the impaired controller from the recovery image depends on whether the system is in a two-controller MetroCluster configuration.
  • Page 393 If your system has… Then… A network connection a. Press y when prompted to restore the backup configuration. b. Set the healthy controller to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
  • Page 394 Reboot the node. Switch back aggregates in a two-node MetroCluster configuration - AFF A300 After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
  • Page 395 This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 396 Restore OKM, NSE, and NVE as needed - AFF A300 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 397 The data is output from either the security key-manager backup show or security key-manager onboard show-backup command. Example of backup data: --------------------------BEGIN BACKUP-------------------------- TmV0QXBwIEtleSBCbG9iAAEAAAAEAAAAcAEAAAAAAADuD+byAAAAACEAAAAAAAAA QAAAAAAAAABvOlH0AAAAAMh7qDLRyH1DBz12piVdy9ATSFMT0C0TlYFss4PDjTaV dzRYkLd1PhQLxAWJwOIyqSr8qY1SEBgm1IWgE5DLRqkiAAAAAAAAACgAAAAAAAAA 3WTh7gAAAAAAAAAAAAAAAAIAAAAAAAgAZJEIWvdeHr5RCAvHGclo+wAAAAAAAAAA IgAAAAAAAAAoAAAAAAAAAEOTcR0AAAAAAAAAAAAAAAACAAAAAAAJAGr3tJA/ LRzUQRHwv+1aWvAAAAAAAAAAACQAAAAAAAAAgAAAAAAAAACdhTcvAAAAAJ1PXeBf ml4NBsSyV1B4jc4A7cvWEFY6lLG6hc6tbKLAHZuvfQ4rIbYAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA H4nPQM0nrDRYRa9SCv8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAA ---------------------------END BACKUP--------------------------- 7. At the Boot Menu select the option for Normal Boot. The system boots to prompt.
  • Page 398 12. Move the console cable to the target controller. 13. If you are running ONTAP 9.5 and earlier, run the key-manager setup wizard: a. Start the wizard using the security key-manager setup -node nodename command, and then enter the passphrase for onboard key management when prompted. b.
  • Page 399 This command does not work if NVE (NetApp Volume Encryption) is configured 10. Use the security key-manager query to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 400 Check the output of the security key-manager query again to ensure that the column = Restored and all key managers report in an available state. 11. If the Onboard Key Management is enabled: a. Use the security key-manager key show -detail command to see a detailed view of all keys stored in the onboard key manager.
  • Page 401 -auto-giveback true command. Return the failed part to NetApp - AFF A300 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 402 • This procedure is written with the assumption that you are moving the controller module or modules to the new chassis, and that the chassis is a new component from NetApp. • This procedure is disruptive. In a two-node cluster, you will have a complete service outage; in a multi-node cluster, you will have a partial outage.
  • Page 403 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 404 If the impaired controller has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed, then: Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support...
  • Page 405 Errors: - 8. On the impaired controller module, disconnect the power supplies. Replace hardware - AFF A300 Move the power supplies, fans, and controller modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 406 Power supply Cam handle release latch Power and Fault LEDs Cam handle...
  • Page 407 Power cable locking mechanism 4. Use the cam handle to slide the power supply out of the system. When removing a power supply, always use two hands to support its weight. 5. Repeat the preceding steps for any remaining power supplies. 6.
  • Page 408 Cam handle Fan module Cam handle release latch Fan module Attention LED 3. Pull the fan module straight out from the chassis, making sure that you support it with your free hand so that it does not swing out of the chassis. The fan modules are short.
  • Page 409 10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. Step 3: Remove the controller module To replace the chassis, you must remove the controller module or modules from the old chassis. 1.
  • Page 410 Step 4: Replace a chassis from within the equipment rack or system cabinet You must remove the existing chassis from the equipment rack or system cabinet before you can install the replacement chassis. 1. Remove the screws from the chassis mount points. If the system is in a system cabinet, you might need to remove the rear tie-down bracket.
  • Page 411 From the boot menu, select the option for Maintenance mode. Restore and verify the configuration - AFF A300 You must verify the HA state of the chassis and run System-Level diagnostics, switch back aggregates, and return the failed part to NetApp, as described in the RMA...
  • Page 412 Then… A stand-alone configuration a. Exit Maintenance mode: halt b. Go to Step 4: Return the failed part to NetApp. An HA pair with a second controller module Exit Maintenance mode: halt The LOADER prompt appears. Step 2: Run system-level diagnostics After installing a new chassis, you should run interconnect diagnostics.
  • Page 413 After you issue the command, you should wait until the system stops at the LOADER prompt. During the boot process, you can safely respond to prompts: 2. Repeat the previous step on the second controller if you are in an HA configuration. Both controllers must be in Maintenance mode to run the interconnect test.
  • Page 414 If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
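The log-cleared check in the step above can be scripted when console output is captured to a text file. The following is a minimal Python sketch, not a NetApp tool: the function name status_log_is_clear is hypothetical, and it assumes the captured text contains the default response on a line of its own, exactly as quoted in the step.

```python
# Hypothetical helper: checks captured `sldiag device status` console text
# for the default "log is empty" response quoted in the procedure.
EXPECTED_RESPONSE = "SLDIAG: No log messages are present."


def status_log_is_clear(console_text: str) -> bool:
    """Return True if any captured line is exactly the default response."""
    lines = (line.strip() for line in console_text.splitlines())
    return any(line == EXPECTED_RESPONSE for line in lines)


if __name__ == "__main__":
    # Assumed capture format: the echoed command followed by the response.
    captured = "*> sldiag device status\nSLDIAG: No log messages are present.\n"
    print(status_log_is_clear(captured))
```

Matching the whole line, rather than a substring, keeps the check from passing on a longer message that merely contains the same words.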
  • Page 415 If your system is running ONTAP… Then… Resulted in some test failures Determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
  • Page 416 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 417 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A300 Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 418 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 419 3. Resynchronize the data aggregates by running the metrocluster heal -phase aggregates command from the surviving cluster. controller_A_1::> metrocluster heal -phase aggregates [Job 130] Job succeeded: Heal Aggregates is successful. If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter.
  • Page 420 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 421 that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command. controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5.
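The verification in step 4 above can be automated when the command output is captured as text. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the captured output is assumed to have one "Field: value" pair per line, as in the example in the step.

```python
# Hypothetical helpers for captured `metrocluster operation show` output.
# Assumption: one "Field: value" pair per line, as in the example above.
def parse_operation_show(text: str) -> dict[str, str]:
    """Map each field name to its value, skipping CLI prompt lines."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        if "::>" in line:  # echoed prompt and command, not a field
            continue
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(text: str, operation: str = "heal-aggregates") -> bool:
    """True if the named operation is reported with State: successful."""
    fields = parse_operation_show(text)
    return fields.get("Operation") == operation and fields.get("State") == "successful"


SAMPLE = """\
controller_A_1::> metrocluster operation show
  Operation: heal-aggregates
      State: successful
 Start Time: 7/25/2016 18:45:55
   End Time: 7/25/2016 18:45:56
     Errors: -
"""

if __name__ == "__main__":
    print(heal_succeeded(SAMPLE))
```

Splitting on the first colon only keeps timestamps such as 18:45:55 intact in the value.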
  • Page 422 Replace the controller module - AFF A300 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 423 6. Pull the cam handle downward and begin to slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. Step 2: Move the boot device You must locate the boot media and follow the directions to remove it from the old controller and insert it in the new controller.
  • Page 424 If necessary, remove the boot media and reseat it into the socket. 5. Push the boot media down to engage the locking button on the boot media housing. Step 3: Move the NVMEM battery To move the NVMEM battery from the old controller module to the new controller module, you must perform a specific sequence of steps.
  • Page 425 Battery lock tab NVMEM battery pack 3. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module. 4. Remove the battery from the controller module and set it aside. Step 4: Move the DIMMs To move the DIMMs, locate and move them from the old controller into the replacement controller and follow the specific sequence of steps.
  • Page 426 Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. The number and placement of system DIMMs depends on the model of your system. The following illustration shows the location of system DIMMs: 4.
  • Page 427 Step 5: Move a PCIe card To move PCIe cards, locate and move them from the old controller into the replacement controller and follow the specific sequence of steps. You must have the new controller module ready so that you can move the PCIe cards directly from the old controller module to the corresponding slots in the new one.
  • Page 428 5. Open the new controller module side panel, if necessary, slide off the PCIe card filler plate, as needed, and carefully install the PCIe card. Be sure that you properly align the card in the slot and exert even pressure on the card when seating it in the socket.
  • Page 429 If your system is in… Then perform these steps… An HA pair The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. a. With the cam handle in the open position, firmly push the controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position.
  • Page 430 You can safely respond to these prompts. Restore and verify the system configuration - AFF A300 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 431 • The replacement node is the new node that replaced the impaired node as part of this procedure. • The healthy node is the HA partner of the replacement node. Steps 1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt. 2.
  • Page 432 All commands in the diagnostic procedures are issued from the controller where the component is being replaced. 1. If the controller to be serviced is not at the LOADER prompt, reboot the controller: halt After you issue the command, you should wait until the system stops at the LOADER prompt. 2.
  • Page 433 If you want to run diagnostic Then… tests on… Individual components a. Clear the status logs: sldiag device clearstatus b. Display the available tests for the selected devices: sldiag device show -dev dev_name dev_name can be any one of the ports and devices identified in the preceding step.
  • Page 434 If you want to run diagnostic Then… tests on… Multiple components at the same a. Review the enabled and disabled devices in the output from the time preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 435 Reconnect the power supplies, and then power on the storage system. e. Rerun the system-level diagnostics test. Recable the system and reassign disks - AFF A300 Continue the replacement procedure by recabling the storage and confirming disk reassignment. Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections.
  • Page 436 d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor. Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure.
  • Page 437 You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d. Return to the admin privilege level: set -privilege admin 5. Give back the node: a. From the healthy node, give back the replaced node’s storage: storage failover giveback -ofnode replacement_node_name The replacement node takes back its storage and completes booting.
  • Page 438 You must be sure to issue the commands in this procedure on the correct node: • The impaired node is the node on which you are performing maintenance. • The replacement node is the new node that replaced the impaired node as part of this procedure. •...
  • Page 439 *> disk show -a Local System ID: 118065481   DISK OWNER POOL SERIAL NUMBER HOME ------- ------------- ----- ------------- ------------- disk_name system-1 (118065481) Pool0 J8Y0TDZC system-1 (118065481) disk_name system-1 (118065481) Pool0 J8Y09DXC system-1 (118065481) 6. From the healthy node, verify that any coredumps are saved: a.
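The disk show -a listing above can be checked programmatically when it is captured as text. The following is a minimal Python sketch, not part of the NetApp procedure: the function names are hypothetical, the output is assumed to have one disk per line, and the rule applied here (owner and home system IDs both equal an expected ID) is an illustrative assumption, so follow the criteria in the procedure itself.

```python
import re

# Hypothetical helpers for captured `disk show -a` output.
# Assumption: one disk per line, with the owner and home system IDs
# shown in parentheses, as in the example listing above.
def parse_disk_show(text: str) -> tuple[str, list[tuple[str, str, str]]]:
    """Return the local system ID and (disk, owner_id, home_id) rows."""
    match = re.search(r"Local System ID:\s*(\d+)", text)
    if match is None:
        raise ValueError("no 'Local System ID' line found")
    rows = []
    for line in text.splitlines():
        ids = re.findall(r"\((\d+)\)", line)
        if len(ids) == 2:  # disk rows carry exactly two IDs: owner, home
            rows.append((line.split()[0], ids[0], ids[1]))
    return match.group(1), rows


def disks_not_owned_by(text: str, expected_id: str) -> list[str]:
    """List disks whose owner or home ID differs from expected_id."""
    _, rows = parse_disk_show(text)
    return [name for name, owner, home in rows
            if owner != expected_id or home != expected_id]


SAMPLE = """\
*> disk show -a
Local System ID: 118065481

  DISK      OWNER                 POOL   SERIAL NUMBER  HOME
--------  -------------         -----  -------------  -------------
disk_name  system-1 (118065481)  Pool0  J8Y0TDZC       system-1 (118065481)
disk_name  system-1 (118065481)  Pool0  J8Y09DXC       system-1 (118065481)
"""

if __name__ == "__main__":
    local_id, _ = parse_disk_show(SAMPLE)
    print(disks_not_owned_by(SAMPLE, local_id))
```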
  • Page 440 Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
  • Page 441 If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
  • Page 442 This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 443 6. Reestablish any SnapMirror or SnapVault configurations. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 444 If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name...
  • Page 445 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 446 -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command. controller_A_1::> metrocluster operation show   Operation: heal-aggregates  ...
  • Page 447 Step 2: Open the controller module To access components inside the controller, you must first remove the controller module from the system and then remove the cover on the controller module. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 448 Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. 1. If you are not already grounded, properly ground yourself. 2. Check the NVMEM LED on the controller module. You must perform a clean system shutdown before replacing system components to avoid losing unwritten data in the nonvolatile memory (NVMEM).
  • Page 449 NVMEM battery lock tab NVMEM battery b. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. c. Wait a few seconds, and then plug the battery back into the socket. 5.
  • Page 450 Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. The number and placement of system DIMMs depends on the model of your system. The following illustration shows the location of system DIMMs: 9.
  • Page 451 12. Locate the NVMEM battery plug socket, and then squeeze the clip on the face of the battery cable plug to insert it into the socket. Make sure that the plug locks down onto the controller module. 13. Close the controller module cover. Step 4: Reinstall the controller After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it to a state where you can run...
  • Page 452 1. If the controller to be serviced is not at the LOADER prompt, perform the following steps: a. Select the Maintenance mode option from the displayed menu. b. After the controller boots to Maintenance mode, halt the controller: halt After you issue the command, you should wait until the system stops at the LOADER prompt. During the boot process, you can safely respond to prompts: ▪...
  • Page 453 If your controller is in… Then… An HA pair Perform a give back: storage failover giveback -ofnode replacement_node_name If you disabled automatic giveback, re-enable it with the storage failover modify command. A two-node MetroCluster Proceed to the next step. The MetroCluster switchback procedure is configuration done in the next task in the replacement process.
  • Page 454 Step 6 (Two-node MetroCluster only): Switch back aggregates After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
  • Page 455 6. Reestablish any SnapMirror or SnapVault configurations. Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 456 10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 457 44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure. Replace the NVMEM battery - AFF A300 To replace an NVMEM battery in the system, you must remove the controller module from the system, open it, replace the battery, and close and replace the controller module.
  • Page 458 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 459 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 460 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 461 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
  • Page 462 2. Check the NVMEM LED: ◦ If your system is in an HA configuration, go to the next step. ◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then check the NVRAM LED identified by the NV icon. The NVRAM LED blinks while destaging contents to the flash memory when you halt the system.
  • Page 463 Battery lock tab NVMEM battery pack 4. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder and controller module. 5. Remove the replacement battery from its package. 6. Align the tab or tabs on the battery holder with the notches in the controller module side, and then gently push down on the battery housing until the battery housing clicks into place.
  • Page 464 f. Select the option to boot to Maintenance mode from the displayed menu. Step 5: Run system-level diagnostics After installing a new NVMEM battery, you should run diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics. All commands in the diagnostic procedures are issued from the controller where the component is being replaced.
  • Page 465 If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 466 If your controller is in… Then… Resulted in some test failures Determine the cause of the problem: a. Exit Maintenance mode: halt After you issue the command, wait until the system stops at the LOADER prompt. b. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis: ◦...
  • Page 467 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 468 Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 469 If the impaired controller is Then… displaying… System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 470 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 471 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 472 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
  • Page 473 2. Loosen the thumbscrew on the controller module side panel. 3. Swing the side panel off the controller module. Side panel PCIe card 4. Remove the PCIe card from the controller module and set it aside. 5. Install the replacement PCIe card. Be sure that you properly align the card in the slot and exert even pressure on the card when seating it in the socket.
  • Page 474 1. If you are not already grounded, properly ground yourself. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 475 If your system is in… Then perform these steps… A two-node MetroCluster a. With the cam handle in the open position, firmly push the configuration controller module in until it meets the midplane and is fully seated, and then close the cam handle to the locked position. Tighten the thumbscrew on the cam handle on back of the controller module.
  • Page 476 This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 477 6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 478 Power supply Cam handle release latch Power and Fault LEDs Cam handle...
  • Page 479 The power supply LEDs are lit when the power supply comes online. 2. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 480 Option 1: Most configurations To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 481 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 482 If the impaired controller… Then… Has not automatically switched over Perform a planned switchover operation from the healthy controller: metrocluster switchover Has not automatically switched over, you attempted switchover Review the veto messages and, if possible, resolve the issue and try again.
  • Page 483 mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal Root Aggregates is successful If the healing is vetoed, you have the option of reissuing the command with the metrocluster heal -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
  • Page 484 Thumbscrew Cam handle 5. Pull the cam handle downward and begin to slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. Step 3: Replace the RTC Battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps.
  • Page 485 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 486 1. If you have not already done so, close the air duct or controller module cover. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so.
  • Page 487 pools. This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 488: Aff A320 System Documentation

    6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 489 Use this guide if you want more detailed installation instructions. Prepare for installation To install your AFF A320 system, you need to create an account, register the system, and get license keys. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 490 The following table identifies the types of cables you might receive. If you receive a cable not listed in the table, see the Hardware Universe to locate the cable and identify its use. NetApp Hardware Universe Type of Part number and length Connector For…...
  • Page 491 Cluster Configuration Worksheet Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit.
  • Page 492 You must have contacted your network administrator for information about connecting the system to the switches. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 493 Step Perform on each controller module Cable the cluster/HA ports to each other with the 100 GbE (QSFP28) cable: • e0a to e0a • e0d to e0d If you are using your onboard ports for a data network connection, connect the 100GbE or 40GbE cables to the appropriate data network switches: •...
  • Page 494 Step Perform on each controller module If you are using your NIC cards for Ethernet or FC connections, connect the NIC card(s) to the appropriate switches: Cable the e0M ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point. 2.
  • Page 495 switches. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. 1.
  • Page 496 Step Perform on each controller module Cable the cluster/HA ports to the cluster/HA switch with the 100 GbE (QSFP28) cable: • e0a on both controllers to the cluster/HA switch • e0d on both controllers to the cluster/HA switch If you are using your onboard ports for a data network connection, connect the 100GbE or 40GbE cables to the appropriate data network switches: •...
  • Page 497 Step Perform on each controller module If you are using your NIC cards for Ethernet or FC connections, connect the NIC card(s) to the appropriate switches: Cable the e0M ports to the management network switches with the RJ45 cables. DO NOT plug in the power cords at this point. 2.
  • Page 498 Option 1: Cable the controllers to a single drive shelf You must cable each controller to the NSM modules on the NS224 drive shelf. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 499 Step Perform on each controller module Cable controller A to the shelf Cable controller B to the shelf: 2. To complete setting up your system, see Completing system setup and configuration.
  • Page 500 Option 2: Cable the controllers to two drive shelves You must cable each controller to the NSM modules on both NS224 drive shelves. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 501 Step Perform on each controller module Cable controller A to the shelves: Cable controller B to the shelves: 2. To complete setting up your system, see Completing system setup and configuration. Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 502 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 5. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide. ONTAP Configuration Guide 6.
  • Page 503 Option 2: Completing system setup and configuration if network discovery is not enabled If network discovery is not enabled on your laptop, you must complete the configuration and setup using this task. 1. Cable and configure your laptop or console: a.
  • Page 504 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 6. Verify the health of your system by running Config Advisor.
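The address format above (https://x.x.x.x) can be checked before you open a browser. The following is a minimal Python sketch, not part of the NetApp procedure; the helper name `node_management_url` and the sample address are hypothetical and used only for illustration.

```python
import ipaddress


def node_management_url(address: str) -> str:
    """Return the https://x.x.x.x URL for a node management IPv4 address.

    Hypothetical helper: it only validates the dotted-quad format and
    prepends the scheme; it does not contact the node.
    """
    # IPv4Address raises AddressValueError (a ValueError subclass) on bad input.
    ip = ipaddress.IPv4Address(address)
    return f"https://{ip}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, not a real node.
    print(node_management_url("192.0.2.10"))
```

Validating the address first catches typing errors such as a missing octet before they show up as a browser timeout.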
  • Page 505 Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 506 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 507 Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column shows yes for all authentication keys: security key-manager key-query c. Verify that the Key Manager type shows onboard, and then manually back up the OKM information.
  • Page 508 -priv admin h. You can safely shut down the controller. Shut down the node - AFF A320 After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired node. Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 509 If the impaired controller displays… Then… The LOADER prompt: Go to Remove controller module. Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press...
  • Page 510 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the boot media - AFF A320 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 511 a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. c.
  • Page 512 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 513 • If your system is a stand-alone system you do not need a network connection, but you must perform an additional reboot when restoring the var file system. 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive.
  • Page 514 e. Release the latches to lock the controller module into place. f. If you have not already done so, reinstall the cable management device. 8. Interrupt the boot process by pressing Ctrl-C to stop at the LOADER prompt. If you miss this message, press Ctrl-C, select the option to boot to Maintenance mode, and then halt the node to boot to LOADER.
  • Page 515 Restore automatic giveback if you disabled it using the storage failover modify command. 17. Exit advanced privilege level on the healthy node. Boot the recovery image - AFF A320 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables.
  • Page 516 If your system has… Then… A network connection: a. Press y when prompted to restore the backup configuration. b. Set the healthy node to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
  • Page 517 If your system has… Then… No network connection and is in a MetroCluster IP configuration: a. Press y when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
  • Page 518 Restore OKM, NSE, and NVE as needed - AFF A320 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 519 If the console displays… Then… The LOADER prompt: Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…: a. Enter Ctrl-C at the prompt. b. At the message: Do you wish to halt this node rather than wait [y/n]?, enter: c.
  • Page 520 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 521 Restore NSE/NVE on systems running ONTAP 9.6 and later Steps 1. Connect the console cable to the target controller. 2. Use the boot_ontap command at the LOADER prompt to boot the controller. 3. Check the console output: If the console displays… Then…...
  • Page 522 -auto-giveback true Return the failed part to NetApp - AFF A320 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 523 • If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h Steps 1.
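The AutoSupport message above takes the expected downtime in hours, followed by `h`. As a minimal Python sketch (the helper name `autosupport_maintenance_command` is hypothetical and not part of ONTAP), the command string from the procedure can be built for any whole number of hours:

```python
def autosupport_maintenance_command(hours: int) -> str:
    """Build the AutoSupport command that suppresses automatic case creation.

    Hypothetical helper: it only formats the command shown in the
    procedure; it does not run anything on the cluster.
    """
    if hours < 1:
        raise ValueError("hours must be a positive whole number")
    # MAINT=<n>h is the message format used in the procedure (for example MAINT=2h).
    return f"system node autosupport invoke -node * -type all -message MAINT={hours}h"


if __name__ == "__main__":
    # Reproduces the two-hour example from the procedure.
    print(autosupport_maintenance_command(2))
```

Paste the resulting string at the cluster prompt; the sketch deliberately leaves execution to the operator.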
  • Page 524 Replace hardware - AFF A320 Move the fans, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 525 5. Set the fan module aside. 6. Repeat the preceding steps for any remaining fan modules. 7. Insert the fan module into the replacement chassis by aligning it with the opening, and then sliding it into the chassis. 8. Push firmly on the fan module cam handle so that it is seated all the way into the chassis. The cam handle raises slightly when the fan module is completely seated.
  • Page 526 Complete the restoration and replacement process - AFF A320 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 527 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 528 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A320 Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 529 If the impaired controller is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 530 Replace the controller module hardware - AFF A320 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 531 a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. c.
  • Page 532 1. Locate the NVDIMM battery in the controller module. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 3.
  • Page 533 1. Open the air duct and locate the boot media using the following illustration or the FRU map on the controller module: 2. Locate and remove the boot media from the controller module: a. Press the blue button at the end of the boot media until the lip on the boot media clears the blue button. b.
  • Page 534 1. Locate the DIMMs on your controller module.
  • Page 535 Air duct • System DIMM slots: 2, 4, 7, 9, 13, 15, 18, and • NVDIMM slot: 11 The NVDIMM looks significantly different than system DIMMs. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
  • Page 536 1. Remove the cover over the PCIe risers by unscrewing the blue thumbscrew on the cover, slide the cover toward you, rotate the cover upward, lift it off the controller module, and then set it aside. 2. Remove the empty risers from the replacement controller module. a.
  • Page 537 If you have not already done so, reinstall the cable management device. h. Interrupt the normal boot process by pressing Ctrl-C. Restore and verify the system configuration - AFF A320 After completing the hardware replacement and booting to Maintenance mode, you verify...
  • Page 538 settings as necessary. Step 1: Set and verify the system time after replacing the controller module You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration. If the time and date do not match, you must reset them on the replacement controller module to prevent possible outages on clients due to time differences.
  • Page 539 ▪ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy controller remains down. You can safely respond to these prompts. Recable the system and reassign disks - AFF A320 Continue the replacement procedure by recabling the storage and confirming disk reassignment. Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections.
  • Page 540 a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find. d.
  • Page 541 8. If you disabled automatic takeover on reboot, enable it from the healthy controller: storage failover modify -node replacement-node-name -onreboot true Complete system restoration - AFF A320 To restore your system to full operation, you must restore the NetApp Storage Encryption...
  • Page 542 (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Install licenses for the replacement controller in ONTAP You must install new licenses for the replacement node if the impaired node was using ONTAP features that require a standard (node-locked) license.
  • Page 543 -node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 544 controller; see the Administration overview with the CLI. Steps 1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*>...
  • Page 545 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3. Take the impaired controller to the LOADER prompt: If the impaired controller is displaying… Then… The LOADER prompt: Go to Remove controller module. Waiting for giveback…
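Automatic giveback is disabled here and restored with `-auto-giveback true` at the end of the procedure. The following minimal Python sketch (the helper name `auto_giveback_command` is hypothetical) formats both forms of the command so the two steps stay symmetric in a runbook:

```python
def auto_giveback_command(enabled: bool) -> str:
    """Format the storage failover command that toggles automatic giveback.

    Hypothetical helper: it only builds the command string used in the
    procedure, to be run from the console of the healthy controller.
    """
    value = "true" if enabled else "false"
    return f"storage failover modify -node local -auto-giveback {value}"


if __name__ == "__main__":
    print(auto_giveback_command(False))  # before working on the impaired controller
    print(auto_giveback_command(True))   # after the replacement is complete
```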
  • Page 546 a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. c.
  • Page 547 Air duct • System DIMM slots: 2, 4, 7, 9, 13, 15, 18, and • NVDIMM slot: 11 The NVDIMM looks significantly different than system DIMMs. 3. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.
  • Page 548 The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot. 7.
  • Page 549 -node local -auto-giveback true Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 550 Hot-swap a fan module - AFF A320 To swap out a fan module without interrupting service, you must perform a specific sequence of tasks. You must replace the fan module within two minutes of removing it from the chassis. System airflow is disrupted and the controller module or modules shut down after two minutes to avoid overheating.
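The two-minute limit above is easy to lose track of while handling hardware. As a minimal Python sketch (the constant and function names are hypothetical, and timestamps are plain seconds supplied by the caller), the remaining time in the swap window can be computed like this:

```python
FAN_SWAP_LIMIT_SECONDS = 120  # two minutes, as stated in the procedure


def seconds_left(removed_at: float, now: float) -> float:
    """Return the seconds remaining before the two-minute window expires.

    Both arguments are timestamps in seconds (for example from
    time.monotonic()). The result is clamped at 0 once the window has passed.
    """
    elapsed = now - removed_at
    return max(0.0, FAN_SWAP_LIMIT_SECONDS - elapsed)


if __name__ == "__main__":
    # Fan removed at t=0 s, checked again at t=30 s.
    print(seconds_left(0.0, 30.0))
```

This is only a planning aid; the controller enforces the real limit by shutting down to avoid overheating.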
  • Page 551 The Attention LED should not be lit after the fan is seated and has spun up to operational speed. 10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. Replace an NVDIMM - AFF A320 You must replace the NVDIMM in the controller module when your system registers that the flash lifetime is almost at an end or that the identified NVDIMM is not healthy in general;...
  • Page 552 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3. Take the impaired controller to the LOADER prompt: If the impaired controller is displaying… Then… The LOADER prompt: Go to Remove controller module. Waiting for giveback…
  • Page 553 If the impaired controller is displaying… Then… Waiting for giveback…: Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
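The console-state table for taking the impaired controller to the LOADER prompt can be read as a simple lookup. The following minimal Python sketch restates the three rows from the procedure; the dictionary keys and the function name `next_action` are hypothetical labels, not ONTAP output.

```python
TAKEOVER_COMMAND = "storage failover takeover -ofnode impaired_node_name"

# Rows of the table, keyed by what the impaired controller is displaying.
ACTIONS = {
    "LOADER prompt": "Go to Remove controller module.",
    "Waiting for giveback": "Press Ctrl-C, and then respond y when prompted.",
    "System prompt or password prompt": (
        "Take over or halt the impaired controller from the healthy controller: "
        + TAKEOVER_COMMAND
        + " When the impaired controller shows Waiting for giveback, "
        "press Ctrl-C, and then respond y."
    ),
}


def next_action(console_state: str) -> str:
    """Return the action for a console state; raises KeyError if unknown."""
    return ACTIONS[console_state]


if __name__ == "__main__":
    for state in ACTIONS:
        print(f"{state}: {next_action(state)}")
```

An unknown state raises an error on purpose: anything outside these three rows should be handled by the operator, not guessed.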
  • Page 554 a. Insert your forefinger into the latching mechanism on either side of the controller module. b. Press down on the orange tab on top of the latching mechanism until it clears the latching pin on the chassis. The latching mechanism hook should be nearly vertical and should be clear of the chassis pin. a.
  • Page 555 2. Note the orientation of the NVDIMM in the socket so that you can insert the NVDIMM in the replacement controller module in the proper orientation. 3. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
  • Page 556 The latching mechanism arms slide into the chassis. The controller module begins to boot as soon as it is fully seated in the chassis. e. Release the latches to lock the controller module into place. f. Recable the power supply. g.
  • Page 557 Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 558 If the impaired controller is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 559 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. 1. If you are not already grounded, properly ground yourself. 2. Unplug the controller module power supply from the power source. 3.
  • Page 560 Step 3: Replace the NVDIMM battery To replace the NVDIMM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module. 1. Open the air duct and locate the NVDIMM battery. 2.
  • Page 561 Do not completely insert the controller module in the chassis until instructed to do so. 3. Cable the management and console ports only, so that you can access the system to perform the tasks in the following sections. You will connect the rest of the cables to the controller module later in this procedure. 4.
  • Page 562 -node local -auto-giveback true Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 563 if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 564 system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3.
  • Page 565 4. Remove and set aside the cable management devices from the left and right sides of the controller module. 5. Remove the controller module from the chassis: a. Insert your forefinger into the latching mechanism on either side of the controller module. b.
  • Page 566 1. Remove the cover over the PCIe risers by unscrewing the blue thumbscrew on the cover, slide the cover toward you, rotate the cover upward, lift it off the controller module, and then set it aside. 2. Remove the riser with the failed PCIe card: a.
  • Page 567 d. Reinstall the PCIe riser cover on the controller module. Step 4: Install the controller module After you have replaced the component in the controller module, you must reinstall the controller module into the chassis, and then boot it to Maintenance mode. 1.
  • Page 568 -node local -auto-giveback true Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 569 Once power is restored to the power supply, the status LED should be green. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 570 Replace the real-time clock battery - AFF A320 You replace the real-time clock (RTC) battery in the controller module so that your system’s services and applications that depend on accurate time synchronization continue to function. • You can use this procedure with all versions of ONTAP supported by your system •...
  • Page 571 If the impaired controller is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 572 Step 2: Replace the RTC battery You need to locate the RTC battery inside the controller module, and then follow the specific sequence of steps. Step 3: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis.
  • Page 573 a. Gently pull the controller module a few inches toward you so that you can grasp the controller module sides. b. Using both hands, gently pull the controller module out of the chassis and set it on a flat, stable surface.
  • Page 574 d. Note the polarity of the RTC battery, and then insert it into the holder by tilting the battery at an angle and pushing down. 3. Visually inspect the battery to make sure that it is completely installed into the holder and that the polarity is correct.
  • Page 575: AFF A400 System Documentation

    -node local -auto-giveback true Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 576 Video two of two: Perform end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed guide - AFF A400 This guide gives detailed step-by-step instructions for installing a typical NetApp system.
  • Page 577 The following table identifies the types of cables you might receive. If you receive a cable not listed in the table, see the Hardware Universe to locate the cable and identify its use. NetApp Hardware Universe Type of cable… Part number and length Connector type For…...
  • Page 578 Power cables Not applicable Powering up the system 4. Review the NetApp ONTAP Configuration Guide and collect the required information listed in that guide. ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable.
  • Page 579 switches. Be sure to check the direction of the cable pull-tabs when inserting the cables in the ports. Cable pull-tabs are up for all onboard ports and down for expansion (NIC) cards. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 580 switches. Be sure to check the direction of the cable pull-tabs when inserting the cables in the ports. Cable pull-tabs are up for all onboard ports and down for expansion (NIC) cards. As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again.
  • Page 581 Step 4: Cable controllers to drive shelves You can cable either NS224 or SAS shelves to your system. Option 1: Cable the controllers to a single drive shelf You must cable each controller to the NSM modules on the NS224 drive shelf. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation.
  • Page 582 Option 2: Cable the controllers to two drive shelves You must cable each controller to the NSM modules on both NS224 drive shelves. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. The cable pull-tabs for the NS224 are up.
  • Page 583 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Option 3: Cable the controllers to SAS drive shelves You must cable each controller to the IOM modules on both SAS drive shelves. Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. The cable pull-tabs for the DS224-C are down.
  • Page 584 1. Use the following illustration to cable your controllers to two drive shelves. Cabling the controllers to SAS drive shelves 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 585 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 6. Use System Manager guided setup to configure your system using the data you collected in the NetApp ONTAP Configuration Guide.
  • Page 586 Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 8. Verify the health of your system by running Config Advisor. 9. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
  • Page 587 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 4. Set up your account and download Active IQ Config Advisor: a.
  • Page 588 ◦ The impaired node is the node on which you are performing maintenance. ◦ The healthy node is the HA partner of the impaired node. Check onboard encryption - AFF A400 Prior to shutting down the impaired controller and checking the status of the onboard encryption keys, you must check the status of the impaired controller, disable automatic giveback, and check the version of ONTAP that is running.
  • Page 589 Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 590 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 591 Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key-query c.
  • Page 592 -priv admin h. You can safely shut down the controller. Shut down the impaired controller - AFF A400 After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 593 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 594 If the impaired controller… Then… Has automatically switched over Proceed to the next step. Has not automatically switched Perform a planned switchover operation from the healthy controller: over metrocluster switchover Has not automatically switched Review the veto messages and, if possible, resolve the issue and try over, you attempted switchover again.
  • Page 595 Errors: - 8. On the impaired controller module, disconnect the power supplies. Replace the boot media - AFF A400 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 596 NetApp Support Site. You must log into the NetApp Support Site to display the Statement of Volatility for your system. You can use the following animation, illustration, or the written steps to replace the boot media.
  • Page 597 Locking tabs Slide air duct toward back of controller Rotate air duct up a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
  • Page 598 Press blue button Rotate boot media up and remove from socket a. Press the blue button at the end of the boot media until the lip on the boot media clears the blue button. b. Rotate the boot media up and gently pull the boot media out of the socket. 3.
  • Page 599 Steps 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive. a. Download the service image to your work space on your laptop. b. Unzip the service image.
  • Page 600 The changes will be implemented when the system is booted. Boot the recovery image - AFF A400 The procedure for booting the impaired controller from the recovery image depends on whether the system is in a two-node MetroCluster configuration.
  • Page 601 3. Restore the file system: If your system has a network connection, then: a. Press y when prompted to restore the backup configuration. b. Set the healthy controller to advanced privilege level: set -privilege advanced c. Run the restore backup command: system node restore-backup -node local -target-address impaired_node_IP_address d.
  • Page 602 Save your changes using the savenv command. e. Reboot the node. Switch back aggregates in a two-node MetroCluster configuration - AFF A400 After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the...
  • Page 603 configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools. This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show...
  • Page 604 Restore OKM, NSE, and NVE as needed - AFF A400 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. 1. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 605 6. When prompted to enter the backup data, paste the backup data you captured at the beginning of this procedure. Paste the output of the security key-manager backup show or security key-manager onboard show-backup command. The data is output from either the security key-manager backup show or the security key-manager onboard show-backup command.
  • Page 606 `storage failover show-giveback` commands. Only the CFO aggregates (root aggregate and CFO style data aggregates) will be shown. 12. Move the console cable to the target controller. a. If you are running ONTAP 9.6 or later, run the security key-manager onboard sync: b.
  • Page 607 Waiting for giveback… a. Log into the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true local command.
  • Page 608 -auto-giveback true command. Return the failed part to NetApp - AFF A400 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 609 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the...
  • Page 610 • You must leave the power supplies turned on at the end of this procedure to provide power to the healthy controller. Steps 1. Check the MetroCluster status to determine whether the impaired controller has automatically switched over to the healthy controller: metrocluster show 2.
  • Page 611 Errors: - 8. On the impaired controller module, disconnect the power supplies. Replace hardware - AFF A400 Move the fans, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 612 Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized. 4. Remove and set aside the cable management devices from the left and right sides of the controller module. 5.
  • Page 613 1. Remove the screws from the chassis mount points. 2. With two people, slide the old chassis off the rack rails in a system cabinet or equipment rack, and then set it aside. 3. If you are not already grounded, properly ground yourself. 4.
  • Page 614 Complete the restoration and replacement process - AFF A400 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 615 function properly: boot_diags 3. Select Scan System from the displayed menu to enable running the diagnostics tests. 4. Select Test system from the displayed menu to run diagnostics tests. 5. Select the test or series of tests from the various sub-menus. 6.
  • Page 616 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 617 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A400 Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 618 If the impaired controller is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 619 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 620 that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command. controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5.
  • Page 621 Replace the controller module hardware - AFF A400 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 622 7. Place the controller module on a stable, flat surface. 8. On the replacement controller module, open the air duct and remove the empty risers from the controller module using the animation, illustration, or the written steps: Removing the empty risers from the replacement controller module a.
  • Page 623 a. Rotate the cam handle so that it can be used to pull the power supply out of the chassis. b. Press the blue locking tab to release the power supply from the chassis. c. Using both hands, pull the power supply out of the chassis, and then set it aside. 2.
  • Page 624 1. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
  • Page 625 You can use the following animation, illustration, or the written steps to move the boot media from the impaired controller module to the replacement controller module. Moving the boot media 1. Locate and remove the boot media from the controller module: a.
  • Page 626 Step 5: Move the PCIe risers and mezzanine card As part of the controller replacement process, you must move the PCIe risers and mezzanine card from the impaired controller module to the replacement controller module. You can use the following animations, illustrations, or the written steps to move the PCIe risers and mezzanine card from the impaired controller module to the replacement controller module.
  • Page 627 module: a. Remove any SFP or QSFP modules that might be in the PCIe cards. b. Rotate the riser locking latch on the left side of the riser up and toward air duct. The riser raises up slightly from the controller module. c.
  • Page 628 1. Locate the DIMMs on your controller module. 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation. 3. Verify that the NVDIMM battery is not plugged into the new controller module. 4.
  • Page 629 b. Locate the corresponding DIMM slot on the replacement controller module. c. Make sure that the DIMM ejector tabs on the DIMM socket are in the open position, and then insert the DIMM squarely into the socket. The DIMMs fit tightly in the socket, but should go in easily. If not, realign the DIMM with the socket and reinsert it.
  • Page 630 Interrupt the boot process and boot to the LOADER prompt by pressing Ctrl-C. If your system stops at the boot menu, select the option to boot to LOADER. Restore and verify the system configuration - AFF A400 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 631 2. On the healthy node, check the system time: show date The date and time are given in GMT. 3. At the LOADER prompt, check the date and time on the replacement node: show date The date and time are given in GMT. 4.
  • Page 632 ▪ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy controller remains down. You can safely respond to these prompts. Recable the system and reassign disks - AFF A400 Continue the replacement procedure by recabling the storage and confirming disk reassignment. Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections.
  • Page 633 2. From the LOADER prompt on the replacement controller, boot the controller, entering y if you are prompted to override the system ID due to a system ID mismatch: boot_ontap 3. Wait until the Waiting for giveback… message is displayed on the replacement controller console and...
  • Page 634 b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is possible: storage failover show The output from the storage failover show command should not include the System ID changed on partner message. 6.
  • Page 635 -node replacement-node-name -onreboot true Complete system restoration - AFF A400 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 636 If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
  • Page 637 This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A  ...
  • Page 638 6. Reestablish any SnapMirror or SnapVault configurations. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 639 If the impaired controller is displaying the LOADER prompt, go to Remove controller module. If it is displaying Waiting for giveback…, press Ctrl-C, and then respond y when prompted. If it is displaying the system prompt or password prompt (enter system password), take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
  • Page 640 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 641 -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command. controller_A_1::> metrocluster operation show   Operation: heal-aggregates  ...
  • Page 642 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animation, illustration, or the written steps to remove the controller module from the chassis.
  • Page 643 Step 3: Replace system DIMMs Replacing a system DIMM involves identifying the target DIMM through the associated error message, locating the target DIMM using the FRU map on the air duct or the lit LED on the motherboard, and then replacing the DIMM. You can use the following animation, illustration, or the written steps to replace a system DIMM.
  • Page 644 a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b. Slide the air duct toward the back of the controller module, and then rotate it upward to its completely open position.
  • Page 645 1. If you have not already done so, close the air duct. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 3.
  • Page 646 f. At the LOADER prompt, enter bye to reinitialize the PCIe cards and other components. g. Interrupt the boot process and boot to the LOADER prompt by pressing Ctrl-C. If your system stops at the boot menu, select the option to boot to LOADER. Step 5: Run diagnostics After you have replaced a system DIMM in your system, you should run diagnostic tests on that component.
  • Page 647 configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools. This task only applies to two-node MetroCluster configurations. Steps 1. Verify that all nodes are in the enabled state: metrocluster node show...
  • Page 648 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 649 1. If you are not already grounded, properly ground yourself. 2. Remove the bezel (if necessary) with two hands, by grasping the openings on each side of the bezel, and then pulling it toward you until the bezel releases from the ball studs on the chassis frame. 3.
  • Page 650 10. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 11. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 651 If the impaired controller is displaying Waiting for giveback…, press Ctrl-C, and then respond y when prompted. If it is displaying the system prompt or password prompt (enter system password), take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 652 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 653 -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command. controller_A_1::> metrocluster operation show   Operation: heal-aggregates  ...
  • Page 654 Step 2: Remove the controller module To access components inside the controller module, you must remove the controller module from the chassis. You can use the following animations, illustration, or the written steps to remove the controller module from the chassis.
  • Page 655 Step 3: Replace the NVDIMM battery To replace the NVDIMM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module. See the FRU map inside the controller module to locate the NVDIMM battery. The NVDIMM LED blinks while destaging contents when you halt the system.
  • Page 656 5. Remove the replacement battery from its package. 6. Align the battery module with the opening for the battery, and then gently push the battery into the slot until it locks into place. 7. Plug the battery plug back into the controller module, and then close the air duct. Step 4: Install the controller module After you have replaced the component in the controller module, you must reinstall the controller module into the chassis, and then boot it to Maintenance mode.
  • Page 657 Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors. c. Fully seat the controller module in the chassis by rotating the locking latches upward, tilting them so that they clear the locking pins, gently push the controller all the way in, and then lower the locking latches into the locked position.
  • Page 658 1. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 2. Return the controller to normal operation by giving back its storage: storage failover giveback -ofnode impaired_node_name 3.
  • Page 659 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 660 from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 661 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 662 If the impaired controller has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed, then review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support...
  • Page 663 If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:...
  • Page 664 Statement of Volatility on the NetApp Support Site. You must log into the NetApp Support Site to display the Statement of Volatility for your system. You can use the following animation, illustration, or the written steps to replace the NVDIMM.
  • Page 665 1. Open the air duct and then locate the NVDIMM in slot 11 on your controller module. The NVDIMM looks significantly different than system DIMMs. 2. Eject the NVDIMM from its slot by slowly pushing apart the two NVDIMM ejector tabs on either side of the NVDIMM, and then slide the NVDIMM out of the socket and set it aside.
  • Page 666 5. Insert the NVDIMM squarely into the slot. The NVDIMM fits tightly in the slot, but should go in easily. If not, realign the NVDIMM with the slot and reinsert it. Visually inspect the NVDIMM to verify that it is evenly aligned and fully inserted into the slot. 6.
  • Page 667 4. Complete the installation of the controller module: a. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source. b. Using the locking latches, firmly push the controller module into the chassis until the locking latches begin to rise.
  • Page 668 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 6: Restore the controller module to operation after running diagnostics After completing diagnostics, you must recable the system, give back the controller module, and then reenable automatic giveback. 1.
  • Page 669 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 670 Option 1: Most configurations To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 671 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 672 If the impaired controller has not automatically switched over, perform a planned switchover operation from the healthy controller: metrocluster switchover If it has not automatically switched over and you attempted switchover, review the veto messages and, if possible, resolve the issue and try again.
  • Page 673 mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal Root Aggregates is successful If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
  • Page 674 1. If you are not already grounded, properly ground yourself. 2. Release the power cable retainers, and then unplug the cables from the power supplies. 3. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
  • Page 675 1. Remove the riser containing the card to be replaced: a. Open the air duct by pressing the locking tabs on the sides of the air duct, slide it toward the back of the controller module, and then rotate it to its completely open position. b.
  • Page 676 4. Reinstall the riser: a. Align the riser with the pins to the side of the riser socket, and then lower the riser down on the pins. b. Push the riser squarely into the socket on the motherboard. c. Rotate the latch down flush with the sheet metal on the riser. Step 4: Replace the mezzanine card The mezzanine card is located under riser number 3 (slots 4 and 5).
  • Page 677 The riser raises up slightly from the controller module. d. Lift the riser up, and then set it aside on a stable, flat surface. 2. Replace the mezzanine card: a. Remove any QSFP or SFP modules from the card. b. Loosen the thumbscrews on the mezzanine card, and gently lift the card directly out of the socket and set it aside.
  • Page 678 3. Recable the system, as needed. If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables. 4. Complete the installation of the controller module: a. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source.
  • Page 679 Step 7: Switch back aggregates in a two-node MetroCluster configuration After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
  • Page 680 6. Reestablish any SnapMirror or SnapVault configurations. Step 8: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 681 1. If you are not already grounded, properly ground yourself. 2. Identify the power supply you want to replace, based on console error messages or through the LEDs on the power supplies. 3. Disconnect the power supply: a. Open the power cable retainer, and then unplug the power cable from the power supply. b.
  • Page 682 Once power is restored to the power supply, the status LED should be green. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 683 If the impaired controller is displaying the system prompt or password prompt (enter system password), then take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 684 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 685 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
  • Page 686 You can use the following animations, illustration, or the written steps to remove the controller module from the chassis. Removing the controller module 1. If you are not already grounded, properly ground yourself. 2. Release the power cable retainers, and then unplug the cables from the power supplies. 3.
  • Page 687 You can use the following animation, illustration, or the written steps to replace the RTC battery. Replacing the RTC battery 1. If you are not already grounded, properly ground yourself. 2. Open the air duct: a. Press the locking tabs on the sides of the air duct in toward the middle of the controller module. b.
  • Page 688 4. Visually inspect the battery to make sure that it is completely installed into the holder and that the polarity is correct. 5. Close the air duct. Step 4: Reinstall the controller module and setting time/date after RTC battery replacement After you replace a component within the controller module, you must reinstall the controller module in the system chassis, reset the time and date on the controller, and then boot it.
  • Page 689 Do not use excessive force when sliding the controller module into the chassis to avoid damaging the connectors. The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process. b.
  • Page 690 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 691: Aff A700 And Fas9000 System Documentation

    Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888- 463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
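Several of the AFF A400 procedures above verify a MetroCluster healing operation by reading the output of metrocluster operation show. The following is a minimal Python sketch of how that check could be automated, assuming the command output has already been captured as text in the "Field: value" layout shown in those examples. The function names parse_operation_show and heal_succeeded are hypothetical and are not part of any NetApp tool; the sample text reuses the values printed in the examples.

```python
# Sketch only: parses captured text from "metrocluster operation show".
# Assumes the "Field: value" layout shown in the manual's examples.

SAMPLE_OUTPUT = """\
controller_A_1::> metrocluster operation show
  Operation: heal-aggregates
      State: successful
 Start Time: 7/25/2016 18:45:55
   End Time: 7/25/2016 18:45:56
     Errors: -
"""


def parse_operation_show(text: str) -> dict:
    """Return the 'Field: value' pairs from the captured output."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the echoed CLI prompt line.
        if not line or "::>" in line:
            continue
        # Split on the first colon only, so time values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(text: str) -> bool:
    """True when State is 'successful' and no errors are listed ('-')."""
    fields = parse_operation_show(text)
    return fields.get("State") == "successful" and fields.get("Errors") == "-"


if __name__ == "__main__":
    print(parse_operation_show(SAMPLE_OUTPUT))
    print("heal succeeded:", heal_succeeded(SAMPLE_OUTPUT))
```

The sketch only inspects text; it does not connect to a cluster or run any ONTAP command, and the written verification steps remain the authoritative procedure.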
  • Page 692 Video two of two: Performing end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed guide - AFF A700 and FAS9000 This guide gives detailed step-by-step instructions for installing a typical NetApp system.
  • Page 693 Power cables Not applicable Powering up the system 4. Review the NetApp ONTAP Configuration Guide and collect the required information listed in that guide. ONTAP Configuration Guide Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable.
  • Page 694 Steps 1. Install the rail kits, as needed. 2. Install and secure your system using the instructions included with the rail kit. You need to be aware of the safety concerns associated with the weight of the system. The label on the left indicates an empty chassis, while the label on the right indicates a fully- populated system.
  • Page 695 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation or illustration to complete the cabling between the controllers and to the switches: Cabling a two-node switchless cluster 1.
  • Page 696 Step 4: Cable controllers to drive shelves You can cable your new system to DS212C, DS224C, or NS224 shelves, depending on whether it is an AFF or FAS system. Option 1: Cable the controllers to DS212C or DS224C drive shelves You must cable the shelf-to-shelf connections, and then cable both controllers to the DS212C or DS224C drive shelves.
  • Page 697 1. Use the following animations or illustrations to cable your drive shelves to your controllers. The examples use DS224C shelves. Cabling is similar with other supported SAS drive shelves. ◦ Cabling SAS shelves in FAS9000, AFF A700, and ASA AFF A700, ONTAP 9.7 and earlier: Cabling SAS storage - ONTAP 9.7 and earlier...
  • Page 698 ◦ Cabling SAS shelves in FAS9000, AFF A700, and ASA AFF A700, ONTAP 9.8 and later: Cabling SAS storage - ONTAP 9.8 and later...
  • Page 699 If you have more than one drive shelf stack, see the Installation and Cabling Guide for your drive shelf type. Install and cable shelves for a new system installation - shelves with IOM12 modules...
  • Page 700 Step 5: Complete system setup and configuration to complete system setup and configuration. Option 2: Cable the controllers to a single NS224 drive shelf in AFF A700 and ASA AFF A700 systems running ONTAP 9.8 and later only You must cable each controller to the NSM modules on the NS224 drive shelf on an AFF A700 or ASA AFF A700 system running ONTAP 9.8 or later.
  • Page 701 • The systems must have at least one X91148A module installed in slots 3 and/or 7 for each controller. The animation or illustrations show this module installed in both slots 3 and 7. • Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. The cable pull-tabs for the storage modules are up, while the pull-tabs on the shelves are down.
  • Page 702 Step 5: Complete system setup and configuration to complete system setup and configuration. Option 3: Cable the controllers to two NS224 drive shelves in AFF A700 and ASA AFF A700 systems running ONTAP 9.8 and later only You must cable each controller to the NSM modules on the NS224 drive shelves on an AFF A700 or ASA AFF A700 system running ONTAP 9.8 or later.
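The slot prerequisite above can be expressed as a simple check. The following is a minimal Python sketch, assuming a hypothetical mapping of slot number to installed module part number for one controller; the function name and the input format are illustrative and are not part of any NetApp tool.

```python
# Sketch only: the slot-to-part-number mapping is a hypothetical input format.
REQUIRED_MODULE = "X91148A"
ELIGIBLE_SLOTS = (3, 7)


def has_required_storage_module(slots: dict) -> bool:
    """Return True if at least one X91148A module sits in slot 3 and/or slot 7."""
    return any(slots.get(slot) == REQUIRED_MODULE for slot in ELIGIBLE_SLOTS)


# Example: one controller with the module in slot 7 only.
controller_a = {3: None, 7: "X91148A"}
print(has_required_storage_module(controller_a))
```

Run the check once per controller, because the requirement applies to each controller individually.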
  • Page 703 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the following animation or illustrations to cable your controllers to two NS224 drive shelves. Cabling two NS224 shelves - ONTAP 9.8 and later...
  • Page 704 2. Go to Step 5: Complete system setup and configuration to complete system setup and configuration. Step 5: Complete system setup and configuration You can complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 705 Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node. System Manager opens. 7. Use System Manager guided setup to configure your system using the data you collected in the NetApp...
  • Page 706 Register your system. NetApp Product Registration c. Download Active IQ Config Advisor. NetApp Downloads: Config Advisor 9. Verify the health of your system by running Config Advisor. 10. After you have completed the initial configuration, go to the ONTAP & ONTAP System Manager Documentation Resources page for information about configuring additional features in ONTAP.
  • Page 707 Point your browser to the node management IP address. The format for the address is https://x.x.x.x. b. Configure the system using the data you collected in the NetApp ONTAP Configuration guide. ONTAP Configuration Guide 7. Set up your account and download Active IQ Config Advisor: a.
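As an illustration of the https://x.x.x.x address format, the following Python sketch validates an IPv4 node management address and builds the URL to open in a browser. The function name and the sample address (from the documentation-only range 192.0.2.0/24) are assumptions made for this example.

```python
import ipaddress


def management_url(node_mgmt_ip: str) -> str:
    """Build the https://x.x.x.x URL for a node management IPv4 address.

    ipaddress.IPv4Address raises ValueError if the string is not a valid
    dotted-quad IPv4 address.
    """
    address = ipaddress.IPv4Address(node_mgmt_ip)
    return f"https://{address}"


# Hypothetical address used only for illustration.
print(management_url("192.0.2.10"))
```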
  • Page 708 Maintain Boot media Overview of boot media replacement - AFF A700 and FAS9000 The boot media stores a primary and secondary set of system (boot image) files that the system uses when it boots. Depending on your network configuration, you can perform either a nondisruptive or disruptive replacement.
  • Page 709 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 710 Restored unavailable: a. Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays for all authentication keys and that all key managers...
  • Page 711 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the column displays for all authentication keys and that all key managers...
  • Page 712 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 713 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 714 Key Manager external Restored yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key-query c.
  • Page 715 -priv admin h. You can safely shut down the controller. Shut down the impaired controller - AFF A700 and FAS9000 Option 1: Most systems After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 716 • If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
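The quorum precondition can be sketched as a filter over per-node status. The following Python example assumes the health and eligibility values have already been collected into a hypothetical NodeStatus record; it does not query a cluster, and the node names are invented.

```python
from typing import List, NamedTuple


class NodeStatus(NamedTuple):
    """Hypothetical record of one node's health and eligibility values."""
    name: str
    health: bool
    eligibility: bool


def nodes_needing_attention(nodes: List[NodeStatus]) -> List[str]:
    """Return the nodes that show false for health or eligibility."""
    return [n.name for n in nodes if not (n.health and n.eligibility)]


cluster = [
    NodeStatus("node1", health=True, eligibility=True),
    NodeStatus("node2", health=True, eligibility=False),
]
# Any name returned here must be corrected before the impaired controller is shut down.
print(nodes_needing_attention(cluster))
```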
  • Page 717 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the boot media - AFF A700 and FAS9000 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 718 Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
  • Page 719 Controller module cover locking button Step 2: Replace the boot media Locate the boot media using the following illustration or the FRU map on the controller module:...
  • Page 720 • A copy of the same ONTAP image version that the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site. ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 721 The changes will be implemented when the system is booted. Boot the recovery image - AFF A700 and FAS9000 The procedure for booting the impaired node from the recovery image depends on whether the system is in a two-node MetroCluster configuration.
  • Page 722 the environmental variables. This procedure applies to systems that are not in a two-node MetroCluster configuration. Steps 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive. 2.
  • Page 723 If your system has… Then… No network connection and is in a MetroCluster IP configuration a. Press when prompted to restore the backup configuration. b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node-name:iscsi.session.stateChanged:notice]:...
  • Page 724 ◦ If your system does not have Onboard Key Manager, NSE, or NVE configured, complete the steps in this section. 6. From the LOADER prompt, enter the boot_ontap command. If you see… Then… The login prompt Go to the next step. Waiting for giveback…...
  • Page 725 Reboot the node. Switch back aggregates in a two-node MetroCluster configuration - AFF A700 and FAS9000 After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the...
  • Page 726 6. Reestablish any SnapMirror or SnapVault configurations. Restore OKM, NSE, and NVE as needed - AFF A700 and FAS9000 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled.
  • Page 727 If the console displays… Then… The LOADER prompt Boot the controller to the boot menu: boot_ontap menu Waiting for giveback… a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this controller rather than wait [y/n]?, enter: c.
  • Page 728 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 729 If giveback is not complete after 20 minutes, contact Customer Support. 18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home controller and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int command.
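The 20-minute giveback limit can be pictured as a polling loop with a deadline. The Python sketch below is illustrative only: is_complete is a hypothetical callable supplied by the operator (for example, one that inspects console output), and the polling interval is an assumption, not a NetApp recommendation.

```python
import time

GIVEBACK_TIMEOUT_S = 20 * 60  # the 20-minute limit stated in the procedure


def wait_for_giveback(is_complete, timeout_s=GIVEBACK_TIMEOUT_S, poll_s=30,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll is_complete() until it returns True or the timeout elapses.

    Returns True if giveback completed in time, or False if the deadline
    passed, which is the point at which the procedure says to contact
    Customer Support.
    """
    deadline = clock() + timeout_s
    while True:
        if is_complete():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

The clock and sleep parameters exist only so the loop can be exercised without waiting in real time.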
  • Page 730 This command does not work if NVE (NetApp Volume Encryption) is configured 10. Use the security key-manager query to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 731 If the console displays… Then… The login prompt Go to Step 7. Waiting for giveback… a. Log in to the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true local command.
  • Page 732 • When removing, replacing, or adding caching or core dump modules, the target node must be halted to the LOADER. • AFF A700 supports the 1TB core dump module, X9170A, which is required if you are adding NS224 drive shelves.
  • Page 733 About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 734 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 735 If the impaired controller… Then… Has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support...
  • Page 736 If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:...
  • Page 737 Step 3: Add or replace an X9170A core dump module The 1TB cache core dump, X9170A, is only used in the AFF A700 systems. The core dump module cannot be hot-swapped. The core dump module typically is located in the...
  • Page 738 front of the NVRAM module in slot 6-1 in the rear of the system. To replace or add the core dump module, locate slot 6-1, and then follow the specific sequence of steps to add or replace it. Before you begin •...
  • Page 739 Do not use the numbered and lettered I/O cam latch to eject the core dump module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the core dump module. c. Rotate the cam handle until the core dump module begins to slide out of the NVRAM10 module. d.
  • Page 740 cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B   controller_B_1 configured enabled waiting for switchback recovery 2 entries were displayed. 2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show 3.
  • Page 741 Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 742 ::> system controller slot module replace -node node1 -slot 6-2 Warning: NVMe module in slot 6-2 of the node node1 will be powered off for replacement. Do you want to continue? (y|n): `y` The module has been successfully powered off. It can now be safely replaced.
  • Page 743 Orange release button. Caching module cam handle. a. Press the orange release button on the front of the caching module. Do not use the numbered and lettered I/O cam latch to eject the caching module. The numbered and lettered I/O cam latch ejects the entire NVRAM10 module and not the caching module.
  • Page 744 If you replace the caching module with a caching module from a different vendor, the new vendor name is displayed in the command output. 9. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 745 • If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h Steps 1.
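The MAINT= message format can be illustrated with a small helper that fills in the number of hours. This is a Python sketch that only builds the command string shown above; the function name is an assumption, and nothing here connects to a cluster.

```python
def autosupport_maint_command(hours_down: int) -> str:
    """Build the AutoSupport invocation that suppresses automatic case
    creation for the given number of hours."""
    if hours_down <= 0:
        raise ValueError("hours_down must be a positive number of hours")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours_down}h"
    )


# Reproduces the two-hour example from the procedure.
print(autosupport_maint_command(2))
```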
  • Page 746 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 747 Errors: - 8. On the impaired controller module, disconnect the power supplies. Move and replace hardware - AFF A700 and FAS9000 Move the fans, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the impaired chassis.
  • Page 748 2. Turn off the power supply and disconnect the power cables: a. Turn off the power switch on the power supply. b. Open the power cable retainer, and then unplug the power cable from the power supply. c. Unplug the power cable from the power source. 3.
  • Page 749 The fan modules are short. Always support the bottom of the fan module with your free hand so that it does not suddenly drop free from the chassis and injure you. Orange release button 3. Set the fan module aside. 4.
  • Page 750 Cam handle release button Cam handle 3. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 4.
  • Page 751 1. Unplug any cabling associated with the target I/O module. Make sure that you label the cables so that you know where they came from. 2. Remove the target I/O module from the chassis: a. Depress the lettered and numbered cam button. The cam button moves away from the chassis.
  • Page 752 Step 5: Remove the De-stage Controller Power Module Steps You must remove the de-stage controller power modules from the old chassis in preparation for installing the replacement chassis. 1. Press the orange locking button on the module handle, and then slide the DCPM module out of the chassis.
  • Page 753 5. Slide the chassis all the way into the equipment rack or system cabinet. 6. Secure the front of the chassis to the equipment rack or system cabinet, using the screws you removed from the old chassis. 7. Secure the rear of the chassis to the equipment rack or system cabinet. 8.
  • Page 754 Step 10: Install I/O modules Steps To install I/O modules, including the NVRAM/FlashCache modules from the old chassis, follow the specific sequence of steps. You must have the chassis installed so that you can install the I/O modules into the corresponding slots in the new chassis.
  • Page 755 From the boot menu, select the option for Maintenance mode. Complete the restoration and replacement process - AFF A700 and FAS9000 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 756 b. Confirm that the setting has changed: ha-config show 3. If you have not already done so, recable the rest of your system. 4. Exit Maintenance mode: halt The LOADER prompt appears. Step 2: Running system-level diagnostics After installing a new chassis, you should run interconnect diagnostics. Your system must be at the LOADER prompt to start System Level Diagnostics.
  • Page 757 If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 758 If the system-level diagnostics tests… Then… Resulted in some test failures Determine the cause of the problem. a. Exit Maintenance mode: halt b. Perform a clean shutdown, and then disconnect the power supplies. c. Verify that you have observed all of the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
  • Page 759 6. Reestablish any SnapMirror or SnapVault configurations. Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 760 If this is the procedure you should use, note that the controller replacement procedure for a node in a four or eight node MetroCluster configuration is the same as that in an HA pair. No MetroCluster-specific steps are required because the failure is restricted to an HA pair and storage failover commands can be used to provide nondisruptive operation during the replacement.
  • Page 761 If the impaired controller is displaying… Then… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond y when prompted. System prompt or password prompt (enter system password) Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name...
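The decision table for the impaired controller can be read as a lookup from console state to next action. The Python sketch below encodes the three rows of that table; the dictionary keys and the function name are illustrative choices, and the y response follows the wording used elsewhere in this procedure.

```python
# Sketch of the "If the impaired controller is displaying… / Then…" table.
IMPAIRED_CONTROLLER_ACTIONS = {
    "LOADER prompt": "Go to Remove controller module.",
    "Waiting for giveback": "Press Ctrl-C, and then respond y when prompted.",
    "System prompt or password prompt": (
        "From the healthy controller, run: "
        "storage failover takeover -ofnode {node}"
    ),
}


def next_action(display_state: str, impaired_node_name: str) -> str:
    """Return the documented action for the state shown on the console."""
    try:
        template = IMPAIRED_CONTROLLER_ACTIONS[display_state]
    except KeyError:
        raise ValueError(f"unrecognized console state: {display_state!r}") from None
    return template.format(node=impaired_node_name)


# "node1" is a placeholder node name.
print(next_action("System prompt or password prompt", "node1"))
```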
  • Page 762 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 763 -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 4. Verify that the operation has been completed by using the metrocluster operation show command. controller_A_1::> metrocluster operation show   Operation: heal-aggregates  ...
  • Page 764 Replace the controller module hardware - AFF A700 and FAS9000 To replace the controller module hardware, you must remove the impaired node, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 765 Cam handle release button Cam handle 1. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 2.
  • Page 766 Step 2: Move the boot media You must locate the boot media and follow the directions to remove it from the old controller and insert it in the new controller. Steps 1. Lift the black air duct at the back of the controller module and then locate the boot media using the following illustration or the FRU map on the controller module: Press release tab Boot media...
  • Page 767 5. Push the boot media down to engage the locking button on the boot media housing. Step 3: Move the system DIMMs To move the DIMMs, locate and move them from the old controller into the replacement controller and follow the specific sequence of steps. Steps 1.
  • Page 768 DIMM 5. Locate the slot where you are installing the DIMM. 6. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot. The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
  • Page 769 Menu. e. Select the option to boot to Maintenance mode from the displayed menu. Restore and verify the system configuration - AFF A700 and FAS9000 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 770 The date and time are given in GMT. Step 2: Verify and set the HA state of the controller module You must verify the state of the controller module and, if necessary, update the state to match your system configuration. Steps 1.
  • Page 771 ◦ is a network interface card. ◦ nvram is nonvolatile RAM. ◦ nvmem is a hybrid of NVRAM and system memory. ◦ is a Serial Attached SCSI device not connected to a disk shelf. 4. Run diagnostics as desired. If you want to run diagnostic Then…...
  • Page 772 If you want to run diagnostic tests on… Then… Multiple components at the same time a. Review the enabled and disabled devices in the output from the preceding procedure and determine which ones you want to run concurrently. b. List the individual tests for the device: sldiag device show -dev dev_name c.
  • Page 773 If the system-level diagnostics tests… Then… Were completed without any failures a. Clear the status logs: sldiag device clearstatus b. Verify that the log was cleared: sldiag device status The following default response is displayed: SLDIAG: No log messages are present. c.
  • Page 774 After you issue the command, wait until the system stops at the LOADER prompt. g. Rerun the system-level diagnostic test. Recable the system and reassign disks - AFF A700 and FAS9000 Continue the replacement procedure by recabling the storage and confirming disk reassignment.
  • Page 775 1. Recable the system. 2. Verify that the cabling is correct by using Active IQ Config Advisor. a. Download and install Config Advisor. b. Enter the information for the target system, and then click Collect Data. c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and all disks appear in the output, correcting any cabling issues you find.
  • Page 776 appears (*>). b. Save any coredumps: system node run -node local-node-name partner savecore c. Wait for the savecore command to complete before issuing the giveback. You can enter the following command to monitor the progress of the savecore command: system node run -node local-node-name partner savecore -s d.
  • Page 777 Complete system restoration - AFF A700 and FAS9000 To complete the replacement procedure and restore your system to full operation, you must recable the storage, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
  • Page 778 If the node is in a MetroCluster configuration and all nodes at a site have been replaced, license keys must be installed on the replacement node or nodes prior to switchback. 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses.
  • Page 779 If any LIFs are listed as false, revert them to their home ports: network interface revert 2. Register the system serial number with NetApp Support. ◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number. ◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
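The LIF revert step amounts to selecting every interface whose is-home value is false. The Python sketch below assumes a hypothetical list of (vserver, lif, is_home) tuples gathered by the operator and prints one revert command per affected LIF; the -vserver and -lif arguments are added here as an assumption to target individual interfaces and do not appear in the text above.

```python
from typing import Iterable, List, Tuple


def revert_commands(lifs: Iterable[Tuple[str, str, bool]]) -> List[str]:
    """Return a 'network interface revert' command for each LIF not at home."""
    return [
        f"network interface revert -vserver {vserver} -lif {lif}"
        for vserver, lif, is_home in lifs
        if not is_home
    ]


# Hypothetical inventory: only data_lif2 is away from its home port.
inventory = [
    ("svm1", "data_lif1", True),
    ("svm1", "data_lif2", False),
]
for command in revert_commands(inventory):
    print(command)
```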
  • Page 780 6. Reestablish any SnapMirror or SnapVault configurations. Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 781 Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 782 Replace a DIMM - AFF A700 and FAS9000 You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC); failure to do so causes a system panic.
  • Page 783 If the impaired controller is displaying… Then… System prompt or password prompt (enter system password) Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 784 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 785 controller_A_1::> metrocluster operation show   Operation: heal-aggregates   State: successful Start Time: 7/25/2016 18:45:55   End Time: 7/25/2016 18:45:56   Errors: - 5. Check the state of the aggregates by using the storage aggregate show command. controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols...
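Checking the metrocluster operation show result can be sketched as parsing the field: value pairs and testing the State field. The Python example below assumes the command output has been captured as text with one field per line, as in the sample above; the function names are illustrative.

```python
def parse_operation_show(output: str) -> dict:
    """Parse 'Field: value' lines into a dictionary, splitting on the first colon."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(output: str) -> bool:
    """Return True only when the State field reads 'successful'."""
    return parse_operation_show(output).get("State") == "successful"


sample = """\
  Operation: heal-aggregates
      State: successful
 Start Time: 7/25/2016 18:45:55
   End Time: 7/25/2016 18:45:56
     Errors: -
"""
print(heal_succeeded(sample))
```

Splitting on the first colon only keeps timestamps such as 18:45:55 intact in the parsed value.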
  • Page 786 Steps 1. If you are not already grounded, properly ground yourself. 2. Unplug the cables from the impaired controller module, and keep track of where the cables were connected. 3. Slide the orange button on the cam handle downward until it unlocks. Cam handle release button Cam handle 4.
  • Page 787 module. Controller module cover locking button Step 3: Replace the DIMMs To replace the DIMMs, locate them inside the controller and follow the specific sequence of steps. Steps 1. If you are not already grounded, properly ground yourself. 2. Locate the DIMMs on your controller module. Each system memory DIMM has an LED located on the board next to each DIMM slot.
  • Page 788 3. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  • Page 789 DIMM ejector tabs DIMM 4. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket. 5.
  • Page 790 Step 4: Install the controller After you install the components into the controller module, you must install the controller module back into the system chassis and boot the operating system. For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.
  • Page 791 b. After the node boots to Maintenance mode, halt the node: halt After you issue the command, you should wait until the system stops at the LOADER prompt. During the boot process, you can safely respond to prompts: ▪ A prompt warning that when entering Maintenance mode in an HA configuration, you must ensure that the healthy node remains down.
  • Page 792 If the system-level diagnostics tests… Then… A two-node MetroCluster configuration Proceed to the next step. The MetroCluster switchback procedure is done in the next task in the replacement process. A stand-alone configuration Proceed to the next step. No action is required. You have completed system-level diagnostics.
  • Page 793 Step 6: Switch back aggregates in a two-node MetroCluster configuration After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the MetroCluster switchback operation. This returns the configuration to its normal operating state, with the sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from the local disk pools.
  • Page 794 6. Reestablish any SnapMirror or SnapVault configurations. Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 795 7. Align the bezel with the ball studs, and then gently push the bezel onto the ball studs. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 796 from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller;...
  • Page 797 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 798 If the impaired controller… Then… Has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support...
  • Page 799 If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation. 7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:...
  • Page 800 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 4. Set the I/O module aside. 5. Install the replacement I/O module into the chassis by gently sliding the I/O module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin, and then push the I/O cam latch all the way up to lock the module in place.
  • Page 801 If your system is in… Issue this command from the partner’s console… An HA pair: storage failover giveback -ofnode impaired_node_name A two-node MetroCluster configuration: Proceed to the next step. The MetroCluster switchback procedure is done in the next task in the replacement process.
  • Page 802 6. Reestablish any SnapMirror or SnapVault configurations. Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 803 There is an audible click when the module is secure and connected to the midplane. Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 804 Before you begin • All disk shelves must be working properly. • If your system is in an HA pair, the partner node must be able to take over the node associated with the NVRAM module that is being replaced. •...
  • Page 805 Option 1: Most configs To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage. About this task If you have a cluster with more than two nodes, it must be in quorum.
  • Page 806 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the CLI.
  • Page 807 If the impaired controller… Then… Has automatically switched over: Proceed to the next step. Has not automatically switched over: Perform a planned switchover operation from the healthy controller: metrocluster switchover Has not automatically switched over, you attempted switchover…: Review the veto messages and, if possible, resolve the issue and try again.
  • Page 808 mcc1A::> metrocluster heal -phase root-aggregates [Job 137] Job succeeded: Heal Root Aggregates is successful If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
  • Page 809 Orange release button (gray on empty FlashCache modules) FlashCache cam handle a. Press the orange button on the front of the FlashCache module. The release button on empty FlashCache modules is gray. b. Swing the cam handle out until the module begins to slide out of the old NVRAM module. c.
  • Page 810 Lettered and numbered I/O cam latch I/O latch completely unlocked 4. Set the NVRAM module on a stable surface and remove the cover from the NVRAM module by pushing down on the blue locking button on the cover, and then, while holding down the blue button, slide the lid off the NVRAM module.
  • Page 811 Cover locking button DIMM and DIMM ejector tabs 5. Remove the DIMMs, one at a time, from the old NVRAM module and install them in the replacement NVRAM module. 6. Close the cover on the module. 7. Install the replacement NVRAM module into the chassis: a.
  • Page 812 c. Remove the NVRAM module from the chassis by pulling on the pull tabs on the sides of the module face. Lettered and numbered I/O cam latch I/O latch completely unlocked 3. Set the NVRAM module on a stable surface and remove the cover from the NVRAM module by pushing down on the blue locking button on the cover, and then, while holding down the blue button, slide the lid off the NVRAM module.
  • Page 813 Cover locking button DIMM and DIMM ejector tabs 4. Locate the DIMM to be replaced inside the NVRAM module, and then remove it by pressing down on the DIMM locking tabs and lifting the DIMM out of the socket. Each DIMM has an LED next to it that flashes when the DIMM has failed. 5.
  • Page 814 Select one of the following options for instructions on how to reassign disks to the new controller.
  • Page 815 Option 1: Verify ID (HA pair) Verify the system ID change on an HA system You must confirm the system ID change when you boot the replacement node and then verify that the change was implemented. This procedure applies only to systems running ONTAP in an HA pair. Steps 1.
  • Page 816 node run -node local-node-name partner savecore -s d. Return to the admin privilege level: set -privilege admin 5. Give back the node: a. From the healthy node, give back the replaced node’s storage: storage failover giveback -ofnode replacement_node_name The replacement node takes back its storage and completes booting. If you are prompted to override the system ID due to a system ID mismatch, you should enter y.
  • Page 817 8. If the node is in a MetroCluster configuration, depending on the MetroCluster state, verify that the DR home ID field shows the original owner of the disk if the original owner is a node on the disaster site. This is required if both of the following are true: ◦...
  • Page 818 Ctrl-C, and then select the option to boot to Maintenance mode from the displayed menu. You must enter y when prompted to override the system ID due to a system ID mismatch. 2. View the old system IDs from the healthy node: `metrocluster node show -fields node-systemid,dr-partner-systemid` In this example, the Node_B_1 is the old node, with the old system ID of 118073209:...
  • Page 819 *> disk show -a Local System ID: 118065481   DISK OWNER POOL SERIAL NUMBER HOME ------- ------------- ----- ------------- ------------- disk_name system-1 (118065481) Pool0 J8Y0TDZC system-1 (118065481) disk_name system-1 (118065481) Pool0 J8Y09DXC system-1 (118065481) 6. From the healthy node, verify that any coredumps are saved: a.
  • Page 820 Display the results of the MetroCluster check: metrocluster check show e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at support.netapp.com/NOW/download/tools/config_advisor/. After running Config Advisor, review the tool’s output and follow the recommendations in the output to address any issues discovered.
  • Page 821 Restore external key management encryption keys ◦ Step 7: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 822 The green power LED lights when the PSU is fully inserted into the chassis and the amber attention LED flashes initially, but turns off after a few moments. 9. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 823 continue to function. • You can use this procedure with all versions of ONTAP supported by your system. • All other components in the system must be functioning properly; if not, you must contact technical support. Step 1: Shut down the impaired controller You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.
  • Page 824 About this task • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Return a FIPS drive or SED to unprotected mode" section of NetApp Encryption overview with the...
  • Page 825 • You must leave the power supplies turned on at the end of this procedure to provide power to the healthy controller. Steps 1. Check the MetroCluster status to determine whether the impaired controller has automatically switched over to the healthy controller: metrocluster show 2.
  • Page 826 controller_A_1::> storage aggregate show Aggregate Size Available Used% State #Vols Nodes RAID Status --------- -------- --------- ----- ------- ------ ---------------- ------------ aggr_b2 227.1GB 227.1GB 0% online 0 mcc1-a2 raid_dp, mirrored, normal... 6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command. mcc1A::>...
  • Page 827 Cam handle release button Cam handle 4. Rotate the cam handle so that it completely disengages the controller module from the chassis, and then slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 5.
  • Page 828 Controller module cover locking button Step 3: Replace the RTC battery To replace the RTC battery, you must locate the failed battery in the controller module, remove it from the holder, and then install the replacement battery in the holder. Steps 1.
  • Page 829 RTC battery RTC battery housing 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 830 Steps 1. If you have not already done so, close the air duct or controller module cover. 2. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so.
  • Page 831 Steps 1. Verify that all nodes are in the enabled state: metrocluster node show cluster_B::> metrocluster node show Configuration Group Cluster Node State Mirroring Mode ----- ------- -------------- -------------- --------- -------------------- cluster_A   controller_A_1 configured enabled heal roots completed   cluster_B  ...
  • Page 832 6. Reestablish any SnapMirror or SnapVault configurations. Step 6: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 833 Option 1: Add an X91148A module as a NIC module in a system with open slots To add an X91148A module as a NIC module in a system with open slots, you must follow the specific sequence of steps. Steps 1.
  • Page 834 Hot-add - NS224 shelves. Add an X91148A storage module in a system with no open slots - AFF A700 and FAS9000 You must remove one or more existing NIC or storage modules in your system in order to install one or more X91148A storage modules into your fully-populated system.
  • Page 835 install one or more X91148A NIC modules into your fully-populated system. Steps 1. If you are adding an X91148A module into a slot that contains a NIC module with the same number of ports as the X91148A module, the LIFs will automatically migrate when its controller module is shut down. If the NIC module being replaced has more ports than the X91148A module, you must permanently reassign the affected LIFs to a different home port.
  • Page 836 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 6. Install the X91148A module into the target slot: a. Align the X91148A module with the edges of the slot. b. Slide the X91148A module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
  • Page 837 Option 2: Adding an X91148A module as a storage module in a system with no open slots You must remove one or more existing NIC or storage modules in your system in order to install one or more X91148A storage modules into your fully-populated system. •...
  • Page 838 Lettered and numbered I/O cam latch I/O cam latch completely unlocked 6. Install the X91148A module into slot 3: a. Align the X91148A module with the edges of the slot. b. Slide the X91148A module into the slot until the lettered and numbered I/O cam latch begins to engage with the I/O cam pin.
  • Page 839: Aff A700S System Documentation

    • Video steps Video step-by-step instructions. Installation and setup PDF poster - AFF A700s You can use the PDF poster to install and set up your new system. The PDF poster provides step-by-step instructions with live links to additional content.
  • Page 840 -node local -auto-giveback-after-panic false Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If...
  • Page 841 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 842 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column displays yes for all authentication keys and that all key managers...
  • Page 843 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 844 Restored yes: a. Restore the external key management authentication keys to all nodes in the cluster: security key-manager external restore If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys:...
  • Page 845 Enter the customer’s onboard key management passphrase at the prompt. If the passphrase cannot be provided, contact NetApp Support. mysupport.netapp.com b. Verify the Restored column shows yes for all authentication keys: security key-manager key-query c. Verify that the type shows onboard, and then manually back up the OKM Key Manager information.
  • Page 846 Return to admin mode: set -priv admin h. You can safely shut down the controller. Shut down the controller - AFF A700s After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 847 This command may not work if the boot device is corrupted or non-functional. Replace the boot media - AFF A700s You must remove the controller module from the chassis, open it, and then replace the failed boot media.
  • Page 848 Locking latch Locking pin 1. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 2. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 849 Risers Air duct Step 2: Replace the boot media - AFF A700s You must locate the failed boot media in the controller module by removing the middle PCIe module on the controller module, locate the failed boot media by the lit LED near the boot media, and then replace the boot media.
  • Page 850 Air duct Riser 2 (middle PCIe module) Boot media screw Boot media 3. Locate the failed boot media by the lit LED on the controller module motherboard. 4. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
  • Page 851 Slide the air duct toward the risers until it clicks into place. Transfer the boot image to the boot media - AFF A700s You can install the system image to the replacement boot media by using either the...
  • Page 852 Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs) if they were removed.
  • Page 853 • A copy of the same image version of ONTAP as what the impaired controller was running. You can download the appropriate image from the Downloads section on the NetApp Support Site ◦ If NVE is enabled, download the image with NetApp Volume Encryption, as indicated in the download button.
  • Page 854 Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs) if they were removed.
  • Page 855 <value> a. Check the boot environment variables: ▪ bootarg.init.boot_clustered ▪ partner-sysid ▪ bootarg.init.flash_optimized for AFF C190/AFF A220 (All Flash FAS) ▪ bootarg.init.san_optimized for AFF A220 and All SAN Array ▪ bootarg.init.switchless_cluster.enable b. If External Key Manager is enabled, check the bootarg values, listed in the...
  • Page 856 Restore automatic giveback if you disabled it using the storage failover modify command. 19. Exit advanced privilege level on the healthy controller. Boot the recovery image - AFF A700s You must boot the ONTAP image from the USB drive, restore the file system, and verify...
  • Page 857 the environmental variables. 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive. 2. When prompted, either enter the name of the image or accept the default image displayed inside the brackets on your screen.
  • Page 858 Restore OKM, NSE, and NVE as needed - AFF A700s Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 859 If the console displays… Then… The LOADER prompt: Boot the controller to the boot menu: boot_ontap menu Waiting for giveback…: a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this controller rather than wait [y/n]? , enter: c....
  • Page 860 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command. ◦ If the command fails because of a failed disk, physically disengage the failed disk, but leave the disk in the slot until a replacement is received.
  • Page 861 If giveback is not complete after 20 minutes, contact Customer Support. 18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home controller and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int command.
  • Page 862 This command does not work if NVE (NetApp Volume Encryption) is configured 10. Use the security key-manager query to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 863 If the console displays… Then… The login prompt: Go to Step 7. Waiting for giveback…: a. Log into the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true local command.
  • Page 864 -auto-giveback true Return the failed part to NetApp - AFF A700s After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 865 -skip-lif-migration-before-shutdown true Answer when prompted. Replace hardware - AFF A700s Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the equipment rack or system cabinet with the new chassis of the same model as the...
  • Page 866 Step 1: Remove the controller modules To replace the chassis, you must remove the controller modules from the old chassis. 1. If you are not already grounded, properly ground yourself. 2. Unplug the controller module power supply from the source, and then unplug the cable from the power supply.
  • Page 867 Step 2: Move drives to the new chassis You need to move the drives from each bay opening in the old chassis to the same bay opening in the new chassis. 1. Gently remove the bezel from the front of the system. 2.
  • Page 868 Complete the restoration and replacement process - AFF A700s You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 869 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 870 This provides you a record of the procedure so that you can troubleshoot any issues that you might encounter during the replacement process. Shut down the impaired controller - AFF A700s To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.
  • Page 871 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the controller module hardware - AFF A700s To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 872 Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 873 Air duct locking tabs Risers Air duct Step 2: Move the NVRAM card As part of the controller replacement process, you must remove the NVRAM card from Riser 1 in the impaired controller module and install the card into Riser 1 of the replacement controller module. You should only reinstall Riser 1 into the replacement controller module after you have moved the DIMMs from the impaired controller module to the replacement controller module.
  • Page 874 Air duct Riser 1 locking latch NVRAM battery cable plug connecting to the NVRAM card Card locking bracket NVRAM card 2. Remove the NVRAM card from the riser module: a. Turn the riser module so that you can access the NVRAM card. b.
  • Page 875 c. Connect the battery cable to the socket on the NVRAM card. d. Swing the locking latch into the locked position and make sure that it locks in place. Step 3: Move PCIe cards As part of the controller replacement process, you must remove both PCIe riser modules, Riser 2 (the middle riser) and Riser 3 (riser on the far right) from the impaired controller module, remove the PCIe cards from the riser modules, and install them in the same riser modules in the replacement controller module.
  • Page 876 Step 4: Move the boot media There are two boot media devices in the AFF A700s, a primary and a secondary or backup boot media. You must move them from the impaired controller to the replacement controller and install them into their respective slots in the replacement controller.
  • Page 877 Air duct Riser 2 (middle PCIe module) Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
  • Page 878 b. Rotate the boot media down toward the motherboard. c. Secure the boot media to the motherboard using the boot media screw. Do not over-tighten the screw or you might damage the boot media. Step 5: Move the fans You must move the fans from the impaired controller module to the replacement module when replacing a failed controller module.
  • Page 879 Air duct Riser 1 and DIMM bank 1-4 Riser 2 and DIMM banks 5-8 and 9-12 Riser 3 and DIMM bank 13-16 2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement controller module in the proper orientation.
  • Page 880 Step 7: Install the NVRAM module To install the NVRAM module, you must follow the specific sequence of steps. 1. Install the riser into the controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
  • Page 881 into the slots on the battery pack, and the battery pack latch engages and locks into place. b. Press firmly down on the battery pack to make sure that it is locked into place. c. Plug the battery plug into the riser socket and make sure that the plug locks into place. Step 9: Install a PCIe riser To install a PCIe riser, you must follow a specific sequence of steps.
  • Page 882 Power supply 3. Move the power supply to the new controller module, and then install it. 4. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
  • Page 883 Maintenance mode. Be sure to exit Maintenance mode after completing the conversion. Restore and verify the system configuration - AFF A700s After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system...
  • Page 884 Step 1: Set and verify system time after replacing the controller You should check the time and date on the replacement controller module against the healthy controller module in an HA pair, or against a reliable time server in a stand-alone configuration. If the time and date do not match, you must reset them on the replacement controller module to prevent possible outages on clients due to time differences.
  • Page 885 Recable the system and reassign disks - AFF A700s To complete the replacement procedure and restore your system to full operation, you must recable the storage, restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller. You must complete a series of tasks before restoring your system to full operation.
  • Page 886 Step 2: Reassign disks If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to the disks when the giveback occurs at the end of the procedure. You must confirm the system ID change when you boot the replacement controller and then verify that the change was implemented.
  • Page 887 1873775277 Pool0 Complete system restoration - AFF A700s To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 888 Steps 1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My Support section under Software licenses. The new license keys that you require are automatically generated and sent to the email address on file.
  • Page 889 -node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 890 If the impaired controller is displaying… Then… System prompt or password prompt (enter system password): Take over or halt the impaired controller: storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Remove the controller module You must remove the controller module from the chassis when you replace the controller module or replace a component inside the controller module.
  • Page 891 Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 892 2. Remove the applicable riser. Air duct cover Riser 1 and DIMM bank 1-4 Riser 2 and DIMM bank 5-8 and 9-12 Riser 3 and DIMM bank 13-16 ◦ If you are removing or moving a DIMM in bank 1-4, unplug the NVRAM battery, unlock the locking latch on Riser 1, and then remove the riser.
  • Page 893 6. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM squarely into the slot. The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.
  • Page 894 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 895 How you replace the disk depends on how the disk drive is being used. If SED authentication is enabled, you must use the SED replacement instructions in the ONTAP 9 NetApp Encryption Power Guide. These instructions describe additional steps you must perform before and after replacing an SED.
  • Page 896 Option 1: Replace SSD 1. If you want to manually assign drive ownership for the replacement drive, you need to disable automatic drive assignment, if it is enabled. You manually assign drive ownership and then reenable automatic drive assignment later in this procedure.
  • Page 897 -node node_name -autoassign on You must reenable automatic drive assignment on both controller modules. 10. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 898 Depending on the storage system, the disk drives have the release button located at the top or on the left of the disk drive face. For example, the following illustration shows a disk drive with the release button located on the top of the disk drive face: The cam handle on the disk drive springs open partially and the disk drive releases from the midplane.
  • Page 899 13. Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.
  • Page 900 module or replace a component inside the controller module. 1. If you are not already grounded, properly ground yourself. 2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the system cables and SFPs (if needed) from the controller module, keeping track of where the cables were connected.
  • Page 901 Air duct locking tabs Risers Air duct Replacing a fan - AFF A700s To replace a fan, remove the failed fan module and replace it with a new fan module. 1. If you are not already grounded, properly ground yourself.
  • Page 902 4. Align the edges of the replacement fan module with the opening in the controller module, and then slide the replacement fan module into the controller module until the locking latches click into place. Reinstall the controller module - AFF A700s After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
  • Page 903 Locking tabs Slide plunger 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
  • Page 904 -node local -auto-giveback true Return the failed part to NetApp - AFF A700s After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 905 -node local -auto-giveback false 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode...
  • Page 906 Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Set the controller module aside in a safe place. Replace the NVRAM battery To replace the NVRAM battery, you must remove the failed NVRAM battery from the controller module and install the replacement NVRAM battery into the controller module.
  • Page 907 NVRAM battery plug Blue NVRAM battery locking tab 3. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket. 4.
  • Page 908 Locking tabs Slide plunger 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. Do not completely insert the controller module in the chassis until instructed to do so. 4.
  • Page 909 -node local -auto-giveback true Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 910 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. ◦ If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the “Returning SEDs to unprotected mode” section of the ONTAP 9 NetApp Encryption Power Guide.
  • Page 911 Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized. 3. Unplug the controller module power supply from the source, and then unplug the cable from the power supply. 4.
  • Page 912 Air duct locking tabs Risers Air duct Remove the NVRAM card Replacing the NVRAM consists of removing the NVRAM riser, Riser 1, from the controller module, disconnecting the NVRAM battery from the NVRAM card, removing the failed NVRAM card and installing the replacement NVRAM card, and then reinstalling the NVRAM riser back into the controller module.
  • Page 913 Air duct Riser 1 locking latch NVRAM battery cable plug connecting to the NVRAM card Card locking bracket NVRAM card 3. Remove the NVRAM card from the riser module: a. Turn the riser module so that you can access the NVRAM card. b.
  • Page 914 d. Swing the locking latch into the locked position and make sure that it locks in place. 5. Install the riser into the controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
  • Page 915 Verify the system ID change on an HA system You must confirm the system ID change when you boot the replacement controller and then verify that the change was implemented. This procedure applies only to systems running ONTAP in an HA pair. 1.
  • Page 916 Encryption functionality. You can skip this task on storage systems that do not have Storage or Volume Encryption enabled. Step 1. Restore Storage or Volume Encryption functionality by using the appropriate procedure in NetApp Encryption overview with the CLI.
  • Page 917 Restore external key management encryption keys ◦ Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 918 If the impaired controller is Then… displaying… Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
  • Page 919 Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 920 Air duct locking tabs Risers Air duct Step 3: Replace a PCIe card To replace a PCIe card, you must remove the cabling and any SFPs from the ports on the PCIe cards in the target riser, remove the riser from the controller module, remove and replace the PCIe card, reinstall the riser, and recable it.
  • Page 921 Air duct Riser locking latch Card locking bracket Riser 2 (middle riser) and PCI cards in riser slots 2 and 3. 3. Remove the PCIe card from the riser: a. Turn the riser so that you can access the PCIe card. b.
  • Page 922 module. c. Swing the locking latch down and click it into the locked position. When locked, the locking latch is flush with the top of the riser and the riser sits squarely in the controller module. d. Reinsert any SFP modules that were removed from the PCIe cards. Step 4: Reinstall the controller module After you replace a component within the controller module, you must reinstall the controller module in the system chassis and boot it.
  • Page 923 -node local -auto-giveback true Step 5: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 924 • This procedure is written for replacing one power supply at a time. It is a best practice to replace the power supply within two minutes of removing it from the chassis. The system continues to function, but ONTAP sends messages to the console about the degraded power supply until the power supply is replaced.
  • Page 925 Once power is restored to the power supply, the status LED should be green. 8. After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 926 -node local -auto-giveback false 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode...
  • Page 927 Locking latch Locking pin 6. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 7. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 928 Air duct locking tabs Risers Air duct Replace the RTC battery To replace the RTC battery, locate it inside the controller and follow the specific sequence of steps. 1. If you are not already grounded, properly ground yourself. 2. Locate the RTC battery.
  • Page 929 Air duct RTC battery and housing 3. Gently push the battery away from the holder, rotate it away from the holder, and then lift it out of the holder. Note the polarity of the battery as you remove it from the holder. The battery is marked with a plus sign and must be positioned in the holder correctly.
  • Page 930 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure. System-Level Diagnostics for AFF A700s System-Level Diagnostics for AFF A700s is available outside this library. You will be prompted to log in using your NetApp Support Site credentials. AFF A700s System-Level Diagnostics...
  • Page 931: Aff A800 System Documentation

    Video two of two: Perform end-to-end software configuration The following video shows end-to-end software configuration for systems running ONTAP 9.2 and later. NetApp video: Software configuration for vSphere NAS datastores for FAS/AFF systems running ONTAP 9.2 Detailed steps - AFF A800...
  • Page 932 Step 1: Prepare for installation To install your AFF A800 system, you need to create an account and register the system. You also need to inventory the appropriate number and type of cables for your system and collect specific network information.
  • Page 933 4. Download and complete the Cluster Configuration Worksheet. Step 2: Install the hardware You need to install your system in a 4-post rack or NetApp system cabinet, as applicable. Steps 1. Install the rail kits, as needed. Installing SuperRail into a four-post rack 2.
  • Page 934 3. Attach cable management devices (as shown). 4. Place the bezel on the front of the system. Step 3: Cable controllers There is required cabling for your platform’s cluster using the two-node switchless cluster method or the cluster interconnect network method. There is optional cabling to the Fibre Channel or iSCSI host networks or direct-attached storage.
  • Page 935 Steps 1. Use the animation (Cable a two-node switchless cluster) or the step-by-step instructions to complete the cabling between the controllers and to the switches: Step Perform on each controller module Cable the HA interconnect ports: • e0b to e0b •...
  • Page 936 Step Perform on each controller module Cable the management ports to the management network switches DO NOT plug in the power cords at this point. 2. To perform optional cabling, see: [Option 1: Connect to a Fibre Channel host] ◦ [Option 2: Connect to a 10GbE host] ◦...
  • Page 937 As you insert the connector, you should feel it click into place; if you do not feel it click, remove it, turn it around and try again. Steps 1. Use the animation (Cabling a switched cluster) or the step-by-step instructions to complete the cabling between the controllers and to the switches: Step Perform on each controller module...
  • Page 938 Step Perform on each controller module Cable the cluster interconnect ports to the 100 GbE cluster interconnect switches. Cable the management ports to the management network switches DO NOT plug in the power cords at this point.
  • Page 939 2. To perform optional cabling, see: [Option 1: Connect to a Fibre Channel host] ◦ [Option 2: Connect to a 10GbE host] ◦ [Option 3: Connect to a single direct-attached NS224 drive shelf] ◦ [Option 4: Connect to two direct-attached NS224 drive shelves] ◦...
  • Page 940 Step Perform on each controller module Cable ports 2a through 2d to the FC host switches. To perform other optional cabling, choose from: • [Option 3: Connect to a single direct-attached NS224 drive shelf] • [Option 4: Connect to two direct-attached NS224 drive shelves] To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 941 Step Perform on each controller module Cable ports e4a through e4d to the 10GbE host network switches. To perform other optional cabling, choose from: • [Option 3: Connect to a single direct-attached NS224 drive shelf] • [Option 4: Connect to two direct-attached NS224 drive shelves] To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 942 your controller modules to a single shelf. Step Perform on each controller module Cable controller A to the shelf: Cable controller B to the shelf: 2. To complete setting up your system, see Step 4: Complete system setup and configuration.
  • Page 943 Option 4: Cable the controllers to two drive shelves You must cable each controller to the NSM modules on both NS224 drive shelves. Before you begin Be sure to check the illustration arrow for the proper cable connector pull-tab orientation. As you insert the connector, you should feel it click into place;...
  • Page 944 Step Perform on each controller module Cable controller B to the shelves: 2. To complete setting up your system, see Step 4: Complete system setup and configuration. Step 4: Complete system setup and configuration Complete the system setup and configuration using cluster discovery with only a connection to the switch and laptop, or by connecting directly to a controller in the system and then connecting to the management switch.
  • Page 945 the Management switch. 4. Select an ONTAP icon listed to discover: a. Open File Explorer. b. Click Network in the left pane. c. Right-click and select refresh. d. Double-click either ONTAP icon and accept any certificates displayed on your screen. XXXXX is the system serial number for the target node.
  • Page 946 c. Connect the laptop or console to the switch on the management subnet. d. Assign a TCP/IP address to the laptop or console, using one that is on the management subnet. 2. Plug the power cords into the controller power supplies, and then connect them to power sources on different circuits.
  • Page 947 ONTAP. Maintain Boot media Overview of boot media replacement - AFF A800 • You must replace the failed component with a replacement FRU component you received from your provider. • It is important that you apply the commands in these steps on the correct controller: ◦...
  • Page 948 Option 1: Check NVE or NSE on systems running ONTAP 9.5 and earlier Before shutting down the impaired controller, you need to check whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 949 Verify that the Restored column displays for all authentication keys and that all key managers display available: security key-manager query c. Shut down the impaired controller. 3. If you saw the message This command is not supported when onboard key management is enabled,...
  • Page 950 Retrieve and restore all authentication keys and associated key IDs: security key-manager restore -address * If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the column displays for all authentication keys and that all key managers...
  • Page 951 Option 2: Check NVE or NSE on systems running ONTAP 9.6 and later Before shutting down the impaired controller, you need to verify whether the system has either NetApp Volume Encryption (NVE) or NetApp Storage Encryption (NSE) enabled. If so, you need to verify the configuration.
  • Page 952 If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals for all authentication keys: security key-manager key-query c. Shut down the impaired controller. 3. If the type displays and the column displays anything other than...
  • Page 953 Restored column displays anything other than yes: a. Enter the onboard security key-manager sync command: security key-manager external sync If the command fails, contact NetApp Support. mysupport.netapp.com b. Verify that the Restored column equals yes for all authentication keys: security key-manager key-query c.
  • Page 954 Shut down the controller - AFF A800 After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller. Shut down or take over the impaired controller using the appropriate procedure for your configuration. Option 1: Most systems After completing the NVE or NSE tasks, you need to complete the shutdown of the impaired controller.
  • Page 955 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the boot media - AFF A800 To replace the boot media, you must remove the impaired controller module, install the replacement boot media, and transfer the boot image to a USB flash drive.
  • Page 956 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 957 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 2: Replace the boot media You locate the failed boot media in the controller module by removing Riser 3 on the controller module before you can replace the boot media.
  • Page 958 Air duct Riser 3 Phillips #1 screwdriver Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
  • Page 959 Steps 1. Download and copy the appropriate service image from the NetApp Support Site to the USB flash drive. a. Download the service image to your work space on your laptop. b. Unzip the service image.
  • Page 960 Air duct Risers 3. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system. 4. Reinstall the cable management device and recable the system, as needed. When recabling, remember to reinstall the media converters (SFPs or QSFPs) if they were removed.
  • Page 961 Boot the recovery image - AFF A800 You must boot the ONTAP image from the USB drive, restore the file system, and verify the environmental variables. 1. From the LOADER prompt, boot the recovery image from the USB flash drive: boot_recovery The image is downloaded from the USB flash drive.
  • Page 962 If your system has… Then… No network connection and is in a a. Press when prompted to restore the backup configuration. MetroCluster IP configuration b. Reboot the system when prompted by the system. c. Wait for the iSCSI storage connections to connect. You can proceed after you see the following messages: date-and-time [node- name:iscsi.session.stateChanged:notice]:...
  • Page 963 Restore OKM, NSE, and NVE as needed - AFF A800 Once environment variables are checked, you must complete steps specific to systems that have Onboard Key Manager (OKM), NetApp Storage Encryption (NSE) or NetApp Volume Encryption (NVE) enabled. Determine which section you should use to restore your OKM, NSE, or NVE configurations: If NSE or NVE are enabled along with Onboard Key Manager you must restore settings you captured at the beginning of this procedure.
  • Page 964 3. Check the console output: If the console Then… displays… The LOADER prompt Boot the controller to the boot menu: boot_ontap menu Waiting for giveback… a. Enter Ctrl-C at the prompt b. At the message: Do you wish to halt this controller rather than wait [y/n]? , enter: c.
  • Page 965 8. Move the console cable to the partner controller and log in as admin. 9. Confirm the target controller is ready for giveback with the storage failover show command. 10. Give back only the CFO aggregates with the storage failover giveback -fromnode local -only-cfo-aggregates true command.
  • Page 966 If giveback is not complete after 20 minutes, contact Customer Support. 18. At the clustershell prompt, enter the net int show -is-home false command to list the logical interfaces that are not on their home controller and port. If any interfaces are listed as false, revert those interfaces back to their home port using the net int command.
  • Page 967 This command does not work if NVE (NetApp Volume Encryption) is configured 10. Use the security key-manager query to display the key IDs of the authentication keys that are stored on the key management servers.
  • Page 968 If the console Then… displays… The login prompt Go to Step 7. Waiting for giveback… a. Log into the partner controller. b. Confirm the target controller is ready for giveback with the storage failover show command. 4. Move the console cable to the partner controller and give back the target controller storage using the storage failover giveback -fromnode local -only-cfo-aggregates true local command.
  • Page 969 -auto-giveback true Return the failed part to NetApp - AFF A800 After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 970 -node second_node_name -ignore-quorum-warnings true -skip-lif-migration-before-shutdown true Answer when prompted. Move and replace hardware - AFF A800 Move the power supplies, hard drives, and controller module or modules from the impaired chassis to the new chassis, and swap out the impaired chassis from the...
  • Page 971 equipment rack or system cabinet with the new chassis of the same model as the impaired chassis. Step 1: Remove the controller modules To replace the chassis, you must remove the controller modules from the old chassis. 1. If you are not already grounded, properly ground yourself. 2.
  • Page 972 7. Set the controller module aside in a safe place, and repeat these steps for the other controller module in the chassis. Step 2: Move drives to the new chassis You need to move the drives from each bay opening in the old chassis to the same bay opening in the new chassis.
  • Page 973 Complete the restoration and replacement process - AFF A800 You must verify the HA state of the chassis, run diagnostics, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Step 1: Verify and set the HA state of the chassis You must verify the HA state of the chassis, and, if necessary, update the state to match your system configuration.
  • Page 974 ◦ If the test reported no failures, select Reboot from the menu to reboot the system. Step 3: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at...
  • Page 975 Do not downgrade the BIOS version of the replacement controller to match the partner controller or the old controller module. Shut down the impaired controller - AFF A800 Shut down or take over the impaired controller using the appropriate procedure for your configuration.
  • Page 976 The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h 2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false 3.
  • Page 977 When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y. Replace the controller module hardware - AFF A800 To replace the controller module hardware, you must remove the impaired controller, move FRU components to the replacement controller module, install the replacement controller module in the chassis, and then boot the system to Maintenance mode.
  • Page 978 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 979 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 2: Move the power supplies You must move the power supplies from the impaired controller module to the replacement controller module when you replace a controller module. 1.
  • Page 980 Blue power supply locking tab Power supply 2. Move the power supply to the new controller module, and then install it. 3. Using both hands, support and align the edges of the power supply with the opening in the controller module, and then gently push the power supply into the controller module until the locking tab clicks into place.
  • Page 981 Fan locking tabs Fan module 2. Move the fan module to the replacement controller module, and then install the fan module by aligning its edges with the opening in the controller module, and then sliding the fan module into the controller module until the locking latches click into place.
  • Page 982 Air duct riser NVDIMM battery plug NVDIMM battery pack Attention: The NVDIMM battery control board LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off. 2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.
  • Page 983 Step 5: Remove the PCIe risers As part of the controller replacement process, you must remove the PCIe modules from the impaired controller module. You must install them into the same location in the replacement controller module once the NVDIMMs and DIMMs have moved to the replacement controller module.
  • Page 984 2. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM, and then slide the DIMM out of the slot. Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  • Page 985 7. Repeat the preceding steps to move the other NVDIMM. Step 8: Move the boot media There is one boot media device in the AFF A800. You must move it from the impaired controller and install it in the replacement controller.
  • Page 986 Air duct Riser 3 Phillips #1 screwdriver Boot media screw Boot media 2. Remove the boot media from the controller module: a. Using a #1 Phillips head screwdriver, remove the screw holding down the boot media and set the screw aside in a safe place.
  • Page 987 Step 9: Install the PCIe risers You install the PCIe risers in the replacement controller module after moving the DIMMs, NVDIMMs, and boot media. 1. Install the riser into the replacement controller module: a. Align the lip of the riser with the underside of the controller module sheet metal. b.
  • Page 988 If you have not already done so, reinstall the cable management device. d. Interrupt the normal boot process by pressing Ctrl-C. Restore and verify the system configuration - AFF A800 After completing the hardware replacement and booting to Maintenance mode, you verify the low-level system configuration of the replacement controller and reconfigure system settings as necessary.
  • Page 989 • The replacement node is the new node that replaced the impaired node as part of this procedure. • The healthy node is the HA partner of the replacement node. Steps 1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt. 2.
  • Page 990 ▪ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that the healthy controller remains down. You can safely respond to these prompts. Recable the system and reassign disks - AFF A800 Continue the replacement procedure by recabling the storage and confirming disk reassignment. Step 1: Recable the system After running diagnostics, you must recable the controller module’s storage and network connections.
  • Page 991 1. If the replacement controller is in Maintenance mode (showing the *> prompt), exit Maintenance mode and go to the LOADER prompt: halt 2. From the LOADER prompt on the replacement controller, boot the controller, entering if you are prompted to override the system ID due to a system ID mismatch: boot_ontap 3.
  • Page 992 Find the High-Availability Configuration content for your version of ONTAP 9 b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is possible: storage failover show The output from the storage failover show command should not include the System ID changed on partner message.
  • Page 993 -node replacement-node-name -onreboot true Complete system restoration - AFF A800 To restore your system to full operation, you must restore the NetApp Storage Encryption configuration (if necessary), and install licenses for the new controller, and return the failed part to NetApp, as described in the RMA instructions shipped with the kit.
  • Page 994 -node local -auto-giveback true Step 4: Return the failed part to NetApp After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp...
  • Page 995 Replace a DIMM - AFF A800 You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction codes (ECC); failure to do so causes a system panic. All other components in the system must be functioning properly; if not, you must contact technical support.
  • Page 996 3. Take the impaired controller to the LOADER prompt: If the impaired controller is Then… displaying… The LOADER prompt Go to Remove controller module. Waiting for giveback… Press Ctrl-C, and then respond when prompted. System prompt or password Take over or halt the impaired controller from the healthy controller: prompt (enter system password) storage failover takeover -ofnode impaired_node_name...
  • Page 997 Locking latch Locking pin 7. Slide the controller module out of the chassis. Make sure that you support the bottom of the controller module as you slide it out of the chassis. 8. Place the controller module on a stable, flat surface, and then open the air duct: a.
  • Page 998 Air duct locking tabs Slide air duct towards fan modules Rotate air duct towards fan modules Step 3: Replace a DIMM To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct or locating it using the LED next to the DIMM, and then replace it following the specific sequence of steps.
  • Page 999 Air duct cover Riser 1 and DIMM bank 1, and 3-6 Riser 2 and DIMM bank 7-10, 12-13, and 15-18 Riser 3 and DIMM 19-22 and 24 Note: Slot 2 and 14 are left empty. Do not attempt to install DIMMs into these slots. 2.
  • Page 1000 Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board. 4. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot. The notch among the pins on the DIMM should line up with the tab in the socket.
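
The shutdown pages summarized above (for example, Page 996) repeat the same small decision table: what the impaired controller console displays determines the next action. As an illustration only, that table can be sketched as a lookup in Python. The names CONSOLE_ACTIONS, next_action, and takeover_command are hypothetical and are not part of ONTAP or any NetApp tool; the action text and the takeover command string are copied from the page summaries, and impaired_node_name stays a placeholder to be filled in by the operator.

```python
# Illustrative sketch only: CONSOLE_ACTIONS, next_action, and takeover_command
# are hypothetical names, not part of ONTAP or any NetApp tool.

# Console state of the impaired controller -> documented next action.
CONSOLE_ACTIONS = {
    "LOADER prompt": "Go to Remove controller module.",
    "Waiting for giveback": "Press Ctrl-C, and then respond when prompted.",
    "System prompt or password prompt": (
        "Take over or halt the impaired controller from the healthy controller."
    ),
}


def next_action(console_state: str) -> str:
    """Return the documented action for what the impaired controller displays."""
    if console_state not in CONSOLE_ACTIONS:
        raise ValueError(f"unrecognized console state: {console_state!r}")
    return CONSOLE_ACTIONS[console_state]


def takeover_command(impaired_node_name: str) -> str:
    """Build the takeover command string quoted in the page summaries."""
    return f"storage failover takeover -ofnode {impaired_node_name}"


if __name__ == "__main__":
    print(next_action("Waiting for giveback"))
    print(takeover_command("impaired_node_name"))
```

The sketch only formats text; it does not connect to a controller or run any command, and an unrecognized console state raises an error rather than guessing an action.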
