Sun StorEdge™ 3900 and 6900 Series 2.0 Troubleshooting Guide
Sun Microsystems, Inc.
4150 Network Circle
Santa Clara, CA 95054 U.S.A.
650-960-1300
Part No. 816-5255-12
March 2003, Revision A
Send comments about this document to: docfeedback@sun.com
Sun, Sun Microsystems, the Sun logo, AnswerBook2, Sun StorEdge, StorTools, docs.sun.com, Sun Enterprise, Sun Fire, SunOS, Netra, SunSolve, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries.
Contents
Preface XV
How This Book Is Organized XV
Using UNIX Commands XVI
Typographic Conventions XVII
Shell Prompts XVII
Related Documentation XVIII
Accessing Sun Documentation Online XX
Sun Welcomes Your Comments XX
Introduction 1
Predictive Failure Analysis (PFA) Capabilities 2
General Troubleshooting Procedures 3
High-Level Troubleshooting Tasks 3
Host-Side Troubleshooting 6...
Sun StorEdge 6900 Series Multipathing Example 11
Multipathing Options in the Sun StorEdge 6900 Series 16
Manually Halting the I/O 17
To Quiesce the I/O 17
To Unconfigure the c2 Path 17
Suspending the I/O 18
To Put the c2 Path Back into Production 19
To View the Dynamic Multi-Pathing (DMP) Properties 20
To Put the DMP-Enabled Paths Back into Production 22
Troubleshooting Tools 23...
Suspending the I/O on the A3 to B3 Link 59
Troubleshooting the A4 or B4 FC Link 60
Verifying the Data Host 62
Sun StorEdge 3900 Series 62
Sun StorEdge 6900 Series 62
FRU Tests Available for the A4 or B4 FC Link Segment 64...
Service and Diagnostic Codes 108
Retrieving Service Information 108
CLI Interface 108
Error Log Analysis Commands 109
To Display the Log Files and Retrieve SRNs 109
Sun StorEdge 3900 and 6900 Series 2.0 Troubleshooting Guide • March 2003 Sun Proprietary/Confidential: Internal Use Only
To Clear the Log 110
Virtualization Engine LEDs 110
Power LED Codes 111
Interpreting LED Service and Diagnostic Codes 111
Back Panel Features 112
Ethernet Port LEDs 112
FC Link Error Status Report 113
To Check the FC Link Error Status Manually 113
Translating Host-Device Names 115
Displaying the VLUN Serial Number 116
To Display Devices That are Not Sun StorEdge Traffic Manager (MPxIO)-...
Virtualization Engine Error Messages 164
Switch Error Messages 168
Sun StorEdge T3+ Array Partner Group Error Messages 171
Other Error Messages 175
Interface (CLI) 142...
FIGURE 3-3 Qlogic SANblade Manager HBA Driver and Firmware Versions 33
FIGURE 3-4 QLogic SANblade Manager Diagnostics 34
FIGURE 5-1 Sun StorEdge 3900 Series FC Link Diagram 39
FIGURE 5-2 Sun StorEdge 6900 Series FC Link Diagram 41
Data Host Notification of Intermittent Problems 43...
FIGURE 10-5 Multipath Configurator LUN Properties Detail 142 FIGURE 10-6 Sun StorEdge T3+ Array Failover Driver CLI Output for the Sun StorEdge 3900 Series 143 FIGURE 10-7 Sun StorEdge T3+ Array Failover Driver CLI Example Output for the Sun StorEdge 6900...
FIGURE 11-8 Successful Switch Test Results 153
FIGURE 11-9 Multipath Recovery using the Sun StorEdge T3+ Array Multipath Configurator 154
FIGURE 11-10 Recovered Paths 154
List of Tables
TABLE 1-1 Sun StorEdge 3900 and 6900 Series Configurations 1
TABLE 3-1 Event Grid Sorting Criteria 25
TABLE 5-1 FC Links 38
TABLE 5-2 Ax to Bx FC Links. 40
Storage Automated Diagnostic Environment Event Grid for the Host 69...
TABLE B-2 Sun StorEdge Network FC Switch Error Messages 168
TABLE B-3 Sun StorEdge T3+ Array Error Messages 171
TABLE B-4 Other SUNWsecfg Error Messages 175
The scope of this troubleshooting guide is limited to information pertaining to the components of the Sun StorEdge 3900 and 6900 series, including the Storage Service Processor, Sun StorEdge 1 Gbit and 2 Gbit switches, Sun StorEdge T3+ arrays, and the virtualization engines in the Sun StorEdge 6900 series.
Solaris Handbook for Sun Peripherals
AnswerBook2™ online documentation for the Solaris™ operating environment
Other software documentation that you received with your system
Typographic Conventions
Typeface    Meaning
AaBbCc123   The names of commands, files, and directories; on-screen computer output
AaBbCc123   What you type, when contrasted with on-screen computer output
AaBbCc123   Book titles, new words or terms, words to be emphasized
            Command-line variable; replace with a real name or value
Shell Prompts
Shell C shell...
Sun StorEdge 3900 and 6900 series information
• Sun StorEdge 3900 and 6900 Series 2.0 Installation Guide
• Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide
• Sun StorEdge 3900 and 6900 Series 2.0 Regulatory and Safety Compliance Manual
•...
Product    Title
SANbox-8/16 Segmented Loop FC Switch
• SANbox-8/16 Segmented Loop Fibre Channel Switch Management User’s Manual
• SANbox-8 Segmented Loop Fibre Channel Switch Installer’s/User’s Manual
• SANbox-16 Segmented Loop Fibre Channel Switch Installer’s/User’s Manual
Expansion cabinet
• Sun StorEdge Expansion Cabinet Installation and Service Manual
•...
You can email your comments to Sun at: docfeedback@sun.com
Please include the part number (816-5255) of your document in the subject line of your email.
CHAPTER 1
Introduction
The Sun StorEdge 3900 and 6900 series storage subsystems are complete preconfigured storage solutions. The configurations for each of the storage subsystems are shown in TABLE 1-1.
TABLE 1-1 Sun StorEdge 3900 and 6900 Series Configurations...
Predictive Failure Analysis (PFA) Capabilities
The Storage Automated Diagnostic Environment software provides the health and monitoring functions for the Sun StorEdge 3900 and 6900 series systems. This software provides the following predictive failure analysis (PFA) capabilities:
FC links: Fibre Channel (FC) links are monitored at all end points using the Fibre Channel-Extended Link Service (FC-ELS) link counters....
High-Level Troubleshooting Tasks This section lists the high-level steps you can take to isolate and troubleshoot problems in the Sun StorEdge 3900 and 6900 series. It offers a methodical approach, and lists the tools and resources available at each step.
Review the LED status on the Sun StorEdge T3+ array.
Review the Explorer Data Collection Utility output, which is located on the Storage Service Processor.
4. Check the status of the Sun StorEdge network FC switch-8 and switch-16 switches using the following tools:
Review the Storage Automated Diagnostic Environment device monitoring reports.
Run the checkswitch(1M) and showswitch(1M) commands, which check and display the Sun StorEdge FC switch configurations.
Review the online and offline LED status codes and POST error codes, which can be found in the Sun StorEdge SAN 4.0 and SAN 4.1 Release Installation Guide....
These tests isolate the problem to a FRU that must be replaced. Follow the instructions in the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide and the Sun StorEdge 3900 and 6900 Series 2.0 Installation Guide for proper FRU replacement procedures.
Verifying the Configuration Settings
During the course of troubleshooting, you might need to verify configuration settings on the various components in the Sun StorEdge 3900 or 6900 series.
To Verify Configuration Settings
1. Run one of the following scripts:
Run the runsecfg(1M) script and select the various Verify menu selections for the Sun StorEdge T3+ arrays, the Sun StorEdge network FC switch-8 and switch-16 switches, and the virtualization engine components....
checkdefaultconfig(1M) Output
Checking Virtualization Engine Pair Parameters: v1b
v1b configuration check passed
Checking Virtualization Engine Pair Configuration: v1
checkvemap: virtualization engine map v1 verification complete: PASS.
2. If anything is marked FAIL, check the /var/adm/log/SEcfglog file for the details of the failure.
Mon Jan 7 18:07:51 PST 2002 checkt3config: t3b0 INFO : ---------- -SAVED CONFIGURATION--------------.
Mon Jan 7 18:07:51 PST 2002 checkt3config: t3b0 INFO : blocksize : 16k.
Mon Jan 7 18:07:51 PST 2002 checkt3config: t3b0 INFO :
Mon Jan 7 18:07:51 PST 2002 checkt3config: t3b0 INFO :
Mon Jan 7 18:07:51 PST 2002 checkt3config: t3b0 INFO :...
savevemap(1M) Output
Tue Jan 29 16:14:01 MST 2002 savevemap: v1 EXIT.
When savevemap: ve-pair EXIT is displayed, the savevemap(1M) process has successfully exited.
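The two checks described above (looking for anything marked FAIL in verification output, and confirming that savevemap has exited) can be sketched in Python. This is a minimal illustration that assumes the output has already been captured as a list of text lines; the function names failed_checks and savevemap_exited are hypothetical and are not part of the SUNWsecfg tools.

```python
import re


def failed_checks(output_lines):
    """Return the verification output lines that contain the word FAIL."""
    return [line for line in output_lines if re.search(r"\bFAIL\b", line)]


def savevemap_exited(log_lines, ve_pair):
    """Return True if a 'savevemap: <ve_pair> EXIT' entry is present."""
    pattern = re.compile(r"\bsavevemap: " + re.escape(ve_pair) + r" EXIT\b")
    return any(pattern.search(line) for line in log_lines)


# Sample lines copied from the examples in this section.
output = [
    "v1b configuration check passed",
    "checkvemap: virtualization engine map v1 verification complete: PASS.",
]
log = ["Tue Jan 29 16:14:01 MST 2002 savevemap: v1 EXIT."]

print(failed_checks(output))
print(savevemap_exited(log, "v1"))
```

If failed_checks returns any lines, the /var/adm/log/SEcfglog file is the place to look for details.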
Sun StorEdge 6900 Series Multipathing Example
This Sun StorEdge 6900 series multipathing example contains the following elements:
One Sun StorEdge T3+ array partner group
Two total LUNs
One 500-Gbyte RAID5 LUN per partner group
Refer to FIGURE 2-1 for a logical view of the Sun StorEdge 6900 series.
FIGURE 2-1 (figure labels: LUN0-10G Active-MPDrive...)
FIGURE 2-2 Primary Data Paths to the Alternate Master (figure labels: Host with HBA-0 and HBA-1, Alternate Master...)
FIGURE 2-4 Active - Alternate Master Path Failure: Before the Second Tier of Switches (figure labels: Host with HBA-0 and HBA-1, LUN0 - 10G Active-MPDrive 0...)
The virtualization engine recognizes the primary (active) and secondary (passive) pathing for the LUNs, and routes the I/O to the primary controller—unless there is a path failure to the primary path. In that case, the virtualization engine initiates a LUN failover and routes the I/O through the secondary path (which, in turn, goes through the interconnect cables).
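The routing rule described above can be modeled in a few lines of Python. This is an illustrative sketch only, not virtualization engine code; the function name select_path and its boolean inputs are assumptions made for the example.

```python
def select_path(primary_ok, secondary_ok):
    """Model of the routing rule: prefer the primary (active) path.

    Returns "primary" while the primary path is healthy, "secondary"
    when a primary path failure forces a LUN failover, and None when
    neither path is available.
    """
    if primary_ok:
        return "primary"
    if secondary_ok:
        return "secondary"  # LUN failover through the secondary path
    return None


print(select_path(True, True))
print(select_path(False, True))
```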
Path(s):
/dev/rdsk/c6t29000060220041F96257354230303052d0s2
/devices/scsi_vhci/ssd@g29000060220041f96257354230303052:c,raw
Controller Device Address Class State
Controller Device Address Class State
O.K. O.K. SESS01 2a000060220041f4 2b000060220041f4 2b000060220041f9 080C Unsupported Enabled Enabled...
Note that in the Class and State fields, the virtualization engines are presented as two primary ONLINE devices. The current Sun StorEdge Traffic Manager software design does not enable you to manually halt the I/O (that is, you cannot perform a failover to the secondary path) when only primary devices are present.
After the testing and any FRU replacement are finished, return the Controller state to the default by using virtualization engine failback. Refer to “To Failback the Virtualization Engine” on page 120.
Note – To confirm that a failover is occurring, open a Telnet session to the Sun StorEdge T3+ array and check the output of port listmap. Another, but slower, method is to run the runsecfg script and verify the virtualization engine maps by polling them against a live system. Caution –...
The vxdisk output includes two physical paths to the LUN:
c20t2B000060220041F4d0s2 state=enabled
c23t2B000060220041F9d0s2 state=enabled
Both of these paths are currently enabled with DMP.
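When many LUNs are involved, the path states can be pulled out of captured vxdisk output with a short script. The sketch below assumes each path appears on its own line as a device name followed by state=<value>, as in the two paths listed above; the function name parse_path_states is hypothetical, and real vxdisk output contains other lines that this sketch simply ignores.

```python
def parse_path_states(lines):
    """Map each path device name to its DMP state (for example, 'enabled')."""
    states = {}
    for line in lines:
        fields = line.split()
        # Keep only lines of the form "<device> state=<value>".
        if len(fields) == 2 and fields[1].startswith("state="):
            states[fields[0]] = fields[1].split("=", 1)[1]
    return states


sample = [
    "c20t2B000060220041F4d0s2 state=enabled",
    "c23t2B000060220041F9d0s2 state=enabled",
]
states = parse_path_states(sample)
print(states)
print(all(state == "enabled" for state in states.values()))
```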
2. Use the luxadm(1M) command to display further information about the underlying LUN.
# /usr/sbin/luxadm display /dev/rdsk/c20t2B000060220041F4d0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c20t2B000060220041F4d0s2
Status(Port A): O.K.
Vendor:
Product ID: SESS01
WWN(Node): 2a000060220041f4
WWN(Port A): 2b000060220041f4
Revision: 080C
Serial Num: Unsupported
Unformatted capacity: 102400.000 MBytes
Write Cache: Enabled
Read Cache:...
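The label: value layout of the luxadm display output lends itself to simple parsing when you need to compare several LUNs. The following Python sketch assumes the output has been captured as text with one property per line; parse_device_properties is a hypothetical helper, not a luxadm feature.

```python
def parse_device_properties(text):
    """Parse 'Label: value' lines from captured luxadm display output."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the command line itself and the DEVICE PROPERTIES banner.
        if line.startswith("#") or line.startswith("DEVICE PROPERTIES"):
            continue
        label, sep, value = line.partition(":")
        if sep and label.strip():
            props[label.strip()] = value.strip()
    return props


sample = """# /usr/sbin/luxadm display /dev/rdsk/c20t2B000060220041F4d0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c20t2B000060220041F4d0s2
Status(Port A): O.K.
Product ID: SESS01
WWN(Node): 2a000060220041f4
WWN(Port A): 2b000060220041f4
Revision: 080C
Unformatted capacity: 102400.000 MBytes
Write Cache: Enabled
"""
props = parse_device_properties(sample)
print(props["Status(Port A)"], props["WWN(Port A)"])
```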
To Put the DMP-Enabled Paths Back into Production
1. Type:
# vxdmpadm enable ctlr=<cn>
2. Verify that the path has been reenabled by typing:
# vxdmpadm listctlr all
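If you script this procedure, it helps to build the command lines in one place so that the change and the verification always travel together. The sketch below only constructs and prints the argument lists; it does not run anything. The helper name dmp_commands and the controller name c20 are assumptions for illustration, and the vxdmpadm subcommands are limited to those shown in this guide.

```python
def dmp_commands(controller, action):
    """Return the vxdmpadm command lines for a path change plus its check.

    action must be "enable" or "disable"; controller replaces <cn>.
    """
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    return [
        ["vxdmpadm", action, "ctlr=" + controller],
        ["vxdmpadm", "listctlr", "all"],  # verification step
    ]


# "c20" is a hypothetical controller name used only for this example.
for argv in dmp_commands("c20", "enable"):
    print("# " + " ".join(argv))
```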
Storage Automated Diagnostic Environment 2.2
Check the internal status of the Sun StorEdge 3900 or 6900 series systems using the Storage Automated Diagnostic Environment utility, version 2.2. The Storage Automated Diagnostic Environment is installed on every Storage Service Processor that ships with the unit. All that is needed is web browser access to the Storage Service Processor....
Solaris host (diag221) and the Storage Service Processor (diag156) in the view. What is missing is the Microsoft Windows 2000 host, which is also connected.
FIGURE 3-1 Storage Automated Diagnostic Environment Example Topology
Generating Component-Specific Event Grids
The Storage Automated Diagnostic Environment generates component-specific event grids that describe the severity of an event, tell whether action is required, provide a description of the event, and give the recommended action. Refer to Chapters 5 through 9 of this troubleshooting guide for component-specific event grids....
You should also look for other events such as any HBA driver-related events (qla2200, for example) or disk-related events.
FIGURE 3-2 Microsoft Windows 2000 Event Properties System Log
Command Line Test Examples To run a single Sun StorEdge diagnostic test from the command line rather than through the Storage Automated Diagnostic Environment interface, you must log in to the appropriate host or slave for testing the components. The following two tests, qlctest(1M) and switchtest(1M), are provided as examples.
/opt/SUNWstade/Diags/bin. Refer to the Storage Automated Diagnostic Environment User’s Guide for a complete list of tests, subtests, options, and restrictions.
switchtest(1M) called with options: dev=2:192.168.0.30:0x0|xfersize=200"...
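The option string that switchtest echoes back has the form dev=port:ip_addr:fcaddr, optionally followed by further options. A small Python helper can assemble the command line consistently. This is a sketch under stated assumptions: the -v -o invocation form is the one used in the switchtest examples in this guide, the "|" separator is inferred from the echoed option string above, and the function name switchtest_argv is hypothetical. The helper only builds the argument list and does not run the test.

```python
SWITCHTEST = "/opt/SUNWstade/Diags/bin/switchtest"


def switchtest_argv(port, ip_addr, fcaddr="0", xfersize=None):
    """Build an argument list such as: switchtest -v -o "dev=2:192.168.0.30:0"."""
    options = "dev={}:{}:{}".format(port, ip_addr, fcaddr)
    if xfersize is not None:
        # Assumption: additional options are joined with "|".
        options += "|xfersize={}".format(xfersize)
    return [SWITCHTEST, "-v", "-o", options]


print(switchtest_argv(2, "192.168.0.30"))
print(switchtest_argv(2, "192.168.0.30", fcaddr="0x0", xfersize=200))
```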
Monitoring Sun StorEdge T3 and T3+ Arrays Using the Explorer Data Collection Utility The Explorer Data Collection Utility script is included on the Storage Service Processor in the /export/packages directory. The Explorer Data Collection Utility is not installed by default, but can be installed during rack setup.
:wq!
Note – xxxx represents Sun StorEdge T3+ array passwords.
Editing Switch Information Using vi
Editing Sun StorEdge T3+ Array Information Using vi...
You can now run /opt/SUNWexplo/bin/explorer to collect information about the Storage Service Processor operating system, the Sun StorEdge network FC switch-8 or switch-16 switch, and the Sun StorEdge T3+ array that you can use for troubleshooting purposes. A tar/gzip file is put in the /opt/SUNWexplo/output/tar/gzip file directory....
Use the Qlogic SANblade Manager to extract information about:
HBA Driver versions
Firmware versions
A primitive topology view
A LUN listing
Diagnostics on the HBA
FIGURE 3-3 Qlogic SANblade Manager HBA Driver and Firmware Versions
Different HBA manufacturers might bundle different features with their tools. The information in this guide assumes the use of Qlogic software.
CHAPTER 4
Troubleshooting Ethernet Hubs
The Sun StorEdge 3900 and 6900 series uses an Ethernet hub as the backbone for the internal service network. The allocation of Ethernet ports is as follows:
One for the Storage Service Processor (per subsystem)...
CHAPTER 5
Troubleshooting the Fibre Channel (FC) Links
FC links diagnose Sun StorEdge network FC components in a SAN or a direct attached storage (DAS) environment. linktest(1M), which tests the health of the FC links, is available only from the Test from Topology view of the Storage Automated Diagnostic Environment GUI....
The following diagrams provide troubleshooting information for the basic components and FC links specific to the Sun StorEdge 3900 1.1 series (shown in FIGURE 5-1), and the Sun StorEdge 6900 1.1 series (shown in FIGURE 5-2).
Note –...
FC Link Diagrams
FIGURE 5-1 shows the basic components and the FC links for a Sun StorEdge 3900 series system:
A1 to B1: HBA to Sun StorEdge network FC switch-8 and switch-16 switch link
A4 to B4: Sun StorEdge network FC switch-8 and switch-16 switch to Sun...
shows the basic components and the FC links for a Sun...
Provides FC Link Between These Components
A3 to B3
A4 to B4
T1 to T2...
FIGURE 5-2 Sun StorEdge 6900 Series FC Link Diagram (components shown: HOST, HBA-A, HBA-B, sw1a, sw1b, sw2a, sw2b, T3+ Master, T3+ alternate master)
What happens when an FC link fails depends on the system. If a problem occurs with the A1 or B1 FC link:
In a Sun StorEdge 3900 series system, the Sun StorEdge T3+ array will fail over.
In a Sun StorEdge 6900 series system, no Sun StorEdge T3+ array will fail over, but an error with the FC link can cause a path to go offline....
FIGURE 5-3, FIGURE 5-4, and FIGURE 5-5 show example events.
Site : FSDE LAB Broomfield CO
Source : diag.xxxxx.xxx.com
Severity : Normal
Category : Message Key: message:diag.xxxxx.xxx.com
EventType: LogEvent.driver.LOOP_OFFLINE
EventTime: 01/08/2002 14:34:45
Found 1 ’driver.LOOP_OFFLINE’ error(s) in logfile: /var/adm/messages on diag.xxxxx.xxx.com (id=80fee746):
info: Loop Offline
Jan 8 14:34:25 WWN:...
FIGURE 5-5
Note – An A1 or B1 FC link error can cause a port in sw1a or sw1b to change state.
Key: switch:100000c0dd0057bd...
Verifying the Data Host
The following example shows an error in the A1 or B1 FC link, which can cause a path to go offline in the multipathing software.
CODE EXAMPLE 5-1 luxadm(1M) Display
# /usr/sbin/luxadm display /dev/rdsk/c6t29000060220041F96257354230303052d0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c6t29000060220041F96257354230303052d0s2
Status(Port A): O.K....
200 bytes or less. This is a limitation in the HBA application-specific integrated circuit (ASIC).
CODE EXAMPLE 5-2...
CODE EXAMPLE 5-3 switchtest(1M) Called With Options
# /opt/SUNWstade/Diags/bin/switchtest -v -o "dev=2:192.168.0.30:0"
"switchtest: called with options: dev=2:192.168.0.30:0"
"switchtest: Started."
"Testing port: 2"
"Using ip_addr: 192.168.0.30, fcaddr: 0x0 to access this port."
"Chassis Status for Device: Switch Power: OK Temp: OK 23.0c Fan 1: OK Fan 2: OK "...
If the qlctest test passes, replace the cable.
8. Recable the entire link.
9. Run switchtest or qlctest to validate the fix.
10. Put the path back into production.
Troubleshooting the A2 or B2 FC Link The A2 or B2 link is the FC link from the first switch to the virtualization engine. This link exists in the Sun StorEdge 6900 Series only. An error with the FC link can cause a path to go offline.
To isolate further please run the Storage Automated Diagnostic Environment tests associated with this link segment.
FIGURE 5-7 A2 or B2 FC Link Storage Service Processor-Side Event
Key: switch:100000c0dd0061bb Key: switch:100000c0dd0061bb:1...
2b000060220041f4,0 Class primary State OFFLINE
Note – You can find procedures for restoring virtualization engine settings in the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide.
Type Receptacle Occupant scsi-bus...
4. Run switchtest:
a. If the test fails, replace the GBIC and rerun switchtest.
b. If the test fails again, replace the switch.
Note – The procedures for restoring virtualization engine settings are in the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide.
6. Return the path to production.
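The switchtest branch of this procedure (a first failure points at the GBIC, a second failure points at the switch) can be written as a small decision function. The Python sketch below is an illustration of that logic only; the function name next_action and the list-of-results input are assumptions, and a passing result simply hands control back to the remaining steps of the written procedure.

```python
def next_action(run_results):
    """Suggest the next step from the switchtest results so far.

    run_results is a list of booleans, one per switchtest run,
    where True means the run passed.
    """
    if not run_results:
        return "run switchtest"
    if run_results[-1]:
        return "continue with the next step of the procedure"
    if len(run_results) == 1:
        return "replace the GBIC and rerun switchtest"
    return "replace the switch"


print(next_action([False]))
print(next_action([False, False]))
```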
WARNING: fp(1): N_x Port with D_ID=104000, PWWN=2b000060220041f9 disappeared from fabric
FIGURE 5-8 A3 or B3 FC Link Host-Side Event
FIGURE 5-8, FIGURE 5-9, and FIGURE 5-10 are examples of A3 or B3 link notification events.
Key: message:diag.xxxxx.xxx.com...
Site : FSDE LAB Broomfield CO
Source : diag.xxxxx.xxx.com
Severity : Normal
Category : Switch Key: switch:100000c0dd0057bd
EventType: StateChangeEvent.M.port.1
EventTime: 01/08/2002 18:28:38
’port.1’ in SWITCH diag-sw1a (ip=192.168.0.30) is now Not-Available (status-state changed from ’Online’ to ’Offline’):
Info: A port on the switch has logged out of the fabric and gone offline
Action: 1....
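When many notifications arrive, it can be useful to extract the port, switch, and state transition from the state-change sentence automatically. The following Python sketch assumes the sentence format shown in the event above; the pattern and the function name parse_state_change are illustrative assumptions and are not part of the Storage Automated Diagnostic Environment.

```python
import re

# Quote characters are optional because event examples in this guide vary.
STATE_CHANGE = re.compile(
    r"['\u2019]?(?P<port>port\.\d+)['\u2019]? in SWITCH (?P<switch>\S+) "
    r"\(ip=\s*(?P<ip>[^)\s]+)\) is now (?P<availability>\S+) "
    r"\(status-state changed from ['\u2019]?(?P<old>\w+)['\u2019]? "
    r"to ['\u2019]?(?P<new>\w+)['\u2019]?\)"
)


def parse_state_change(text):
    """Return the fields of a port state-change message, or None."""
    match = STATE_CHANGE.search(text)
    return match.groupdict() if match else None


event = ("'port.1' in SWITCH diag-sw1a (ip=192.168.0.30) is now Not-Available "
         "(status-state changed from 'Online' to 'Offline'):")
print(parse_state_change(event))
```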
/devices/scsi_vhci/ssd@g29000060220041f96257354230303052:c,raw
Controller Device Address Class State
Controller Device Address Class State
Devices in the “Connected” State
Type Receptacle scsi-bus connected disk connected disk...
CODE EXAMPLE 5-6 DMP Error Message
Jul 8 18:26:38 diag.xxxxx.xxx.com vxdmp: [ID 619769 kern.notice] NOTICE: dmp: Path failure on 118/0x1f8
Jul 8 18:26:38 diag.xxxxx.xxx.com vxdmp: [ID 997040 kern.notice] NOTICE: vxvm:vxdmp: disabled path 118/0x1f8 belonging to the dmpnode 231/0xd0
Verifying the Storage Service Processor-Side
You can check the A3 or B3 FC link using the Storage Automated Diagnostic Environment’s Test from Topology functionality....
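For the DMP error messages in CODE EXAMPLE 5-6, a short script can list which paths failed and which dmpnode each disabled path belongs to. The Python sketch below assumes the two message formats shown in that example; the function name summarize_dmp_messages is hypothetical.

```python
import re

PATH_ID = r"(\d+/0x[0-9a-fA-F]+)"  # major/minor pair such as 118/0x1f8
FAILURE = re.compile(r"dmp: Path failure on " + PATH_ID)
DISABLED = re.compile(
    r"disabled path " + PATH_ID + r" belonging to the dmpnode " + PATH_ID
)


def summarize_dmp_messages(lines):
    """Return (failed_paths, disabled_path_to_dmpnode) from vxdmp messages."""
    failed, disabled = [], {}
    for line in lines:
        match = FAILURE.search(line)
        if match:
            failed.append(match.group(1))
        match = DISABLED.search(line)
        if match:
            disabled[match.group(1)] = match.group(2)
    return failed, disabled


messages = [
    "Jul 8 18:26:38 diag.xxxxx.xxx.com vxdmp: [ID 619769 kern.notice] "
    "NOTICE: dmp: Path failure on 118/0x1f8",
    "Jul 8 18:26:38 diag.xxxxx.xxx.com vxdmp: [ID 997040 kern.notice] "
    "NOTICE: vxvm:vxdmp: disabled path 118/0x1f8 belonging to the dmpnode 231/0xd0",
]
failed, disabled = summarize_dmp_messages(messages)
print(failed, disabled)
```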
The procedures for restoring virtualization engine settings are in the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide.
6. Return the path to production.
Quiescing the I/O on the A3 or B3 Link
1. Determine the path you want to disable.
2. Disable the path by typing the following:
# /usr/bin/vxdmpadm disable ctlr=<cn>
3. Verify that the path is disabled:
# /usr/bin/vxdmpadm listctlr all
Steps 1 and 2 halt I/O only up to the A3 to B3 link....
If a problem occurs with the A4 or B4 FC link: In a Sun StorEdge 3900 series system, the Sun StorEdge T3+ array will fail over. In a Sun StorEdge 6900 series system, no Sun StorEdge T3+ array will fail over, but an error with the FC link can cause a path to go offline.
Site : FSDE LAB Broomfield CO
Source : diag
Severity : Warning
Category : Switch DeviceId : switch:100000c0dd0061bb
EventType: LogEvent.MessageLog
EventTime: 01/29/2002 14:25:05
Change in Port Statistics on switch diag-sw1b (ip=192.168.0.31):
Port-1: Received 16289 ’InvalidTxWds’ in 0 mins (value=365972 )
----------------------------------------------------------------------
Site : FSDE LAB Broomfield CO...
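The port statistics line in the event above carries the port number, the error counter name, the number of errors received in the interval, and the running counter value. The Python sketch below extracts those fields so that counters can be compared across events; the pattern is an assumption based on the single example line shown, and parse_port_statistics is a hypothetical helper name.

```python
import re

PORT_STAT = re.compile(
    r"Port-(?P<port>\d+): Received (?P<count>\d+) "
    r"['\u2019]?(?P<counter>\w+)['\u2019]? in (?P<minutes>\d+) mins "
    r"\(value=(?P<value>\d+)\s*\)"
)


def parse_port_statistics(text):
    """Return one dictionary per 'Port-N: Received ...' entry in the text."""
    results = []
    for match in PORT_STAT.finditer(text):
        results.append({
            "port": int(match.group("port")),
            "counter": match.group("counter"),
            "received": int(match.group("count")),
            "minutes": int(match.group("minutes")),
            "value": int(match.group("value")),
        })
    return results


line = "Port-1: Received 16289 'InvalidTxWds' in 0 mins (value=365972 )"
stats = parse_port_statistics(line)
print(stats)
```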
Verifying the Data Host A problem in the A4 or B4 FC Link appears differently on the data host, depending on whether the array is a Sun StorEdge 3900 series or a Sun StorEdge 6900 series device. Sun StorEdge 3900 Series...
To verify the failover, use luxadm display. The failed path is marked “offline,” as shown in CODE EXAMPLE 5-7.
CODE EXAMPLE 5-7 Failed Path Marked Offline
# /usr/sbin/luxadm display /dev/rdsk/c26t60020F200000644>
DEVICE PROPERTIES for disk: /dev/rdsk/c26t60020F20000064433C3352A60003E82Fd0s2
Status(Port A): O.K....
Failed Path Marked Unusable...
2. Run switchtest(1M) to test the entire link (re-create the problem).
3. Break the connection by uncabling the link.
4. Insert the loopback connector into the switch port.
5. Rerun switchtest.
a. If switchtest fails, replace the GBIC and rerun switchtest.
b. If the test fails again, replace the switch.
6. If switchtest passes, assume that the suspect components are the cable and the Sun StorEdge T3+ array controller.
a. ...
CHAPTER 6
Troubleshooting Host Devices
This chapter describes how to troubleshoot components associated with a Sun StorEdge 3900 or 6900 series host.
This chapter contains the following sections:
“To Access the Host Event Grid” on page 67
“To Replace the Master Host”...
FIGURE 6-1 Sample Host Event Grid
TABLE 6-1 lists all the host events in the Storage Automated Diagnostic Environment.
TABLE 6-1 Storage Automated Diagnostic Environment Event Grid for the Host
Alarm+ Alarm- LUN. Alarm- t300 LUN. Alarm-
Yellow The status of hba /devices/sbus@9,0/SUNW,qlc@0,30000/fp@0,0:devctl on...
Alarm capacity disk_ Alarm capacity_ okay
ifptest (diag240) on the host failed.
qlctest (diag240) on the host failed.
socaltest (diag240) on the host failed....
Replacing the Master, Alternate Master, and Slave Monitoring Host The following procedures are a high-level overview of the procedures that are detailed in the Storage Automated Diagnostic Environment User’s Guide. Follow these procedures when replacing a master, alternate master, or slave monitoring host. Note –...
This is especially important when the Storage Service Processor is replaced as a FRU, whether the Storage Service Processor is the master or the slave.
The switches are paired to provide redundancy. Two switches are used in each Sun StorEdge 3900 series, and four switches are used in each Sun StorEdge 6900 series. Each Sun StorEdge network FC switch-8 and switch-16 switch is connected by Ethernet to the service network for management and service from the Storage Service Processor....
The Sun StorEdge network FC switches in a Sun StorEdge 3900 or 6900 configuration now support the Sun StorEdge SAN 4.1 Release. You can upgrade the switches to support the 402xx 2 Gbit-compatible firmware. Caution – Use caution when upgrading back-end switches to the 2 Gbit-compatible firmware.
StorEdge SAN 4.1 Release firmware. For a list of the supported switches, visit the http://www.sun.com web site. Direct attachment to the StorEdge 3900 and 6900 Series arrays with 1 Gbit or 2 Gbit HBAs requires no changes. Before making any changes to the Sun StorEdge 3900 or 6900 series, you must have a Sun StorEdge SAN 4.1 infrastructure already in place and functional....
(HBA, GBIC, and cables) on either side of the switch. The Sun StorEdge SAN 4.1 Release Field Troubleshooting Guide also includes an appendix on Brocade Silkworm switch troubleshooting.
Using the Switch Event Grid The Storage Automated Diagnostic Environment Switch Event Grid enables you to sort switch events by component, category, or event type. The Storage Automated Diagnostic Environment GUI displays an event grid that describes the severity of the event, tells whether action is required, provides a description of the event, and gives the recommended action.
Yellow power chassis. Alarm Yellow temp
“Change in port statistics on switch diag156-sw1b (ip=192.168.0.31)”
The switch has reported a change in an error counter....
TABLE 7-1 Storage Automated Diagnostic Environment Event Grid for 1 Gbit Switches (Continued)
chassis. Alarm Yellow zone enclosure Audit Comm_ Established Comm_ Down Lost Diagnostic switch Test- test
“Switch sw1a was rezoned”
This event reports changes in the zoning of a switch....
TABLE 7-1 Storage Automated Diagnostic Environment Event Grid for 1 Gbit Switches (Continued)
enclosure Discovery enclosure Location Change
“Discovered a new switch called ras d2-swb1 (ip=xxx.0.0.41) 10002000007a609”
Discovery events occur the very first time the agent probes a storage device....
TABLE 7-1 Storage Automated Diagnostic Environment Event Grid for 1 Gbit Switches (Continued)
port State Change+ port State Change- enclosure Statistics
“port.1 in SWITCH diag185 (ip= xxx.20.67.185) is now Available (status-state changed from offline to online)”...
Yellow reboot enclosure Audit Comm_ Established
“chassis.fan.1 status changed from OK”
The uptime of the switch was less than the previous uptime of the switch. This...
TABLE 7-2 Storage Automated Diagnostic Environment Event Grid for 2 GBit Switches (Continued)
Comm_ Down Lost Diagnostic switch2 Test- test enclosure Discovery enclosure Location Change
“Lost communication with sw1a (ip=xxx.20.67.213)”
Ethernet connectivity to the switch has been lost.
“Discovered a new switch called ras d2-swb1 (ip=xxx.0.0.41)...
State Change+ port State Change- enclosure Statistics
“port.1 in SWITCH diag185 (ip= xxx.20.67.185) is now Available (status-state changed from offline to online)”...
View and verify this nonstandard configuration setup as required, using the showswitch command. Refer to the Sun StorEdge 3900 and 6900 Series Version 1.1 Reference and Service Guide for detailed configuration information. • The chassis ID on the switch is not set to the default value. This could be caused by unique ID settings or by conflicts in a SAN environment.
Page 106
Note – If multiple systems are connected to a switch, the switch settings might not match the default settings.
RAID controller and disk drives with FC connectivity to the data host. In the Sun StorEdge 3900 and 6900 series, the Sun StorEdge T3+ array is used as a building block, configured in various ways to provide a storage solution optimized to the host application.
StorEdge T3+ array LUNs fail over, as all I/O is routed to the controlling virtualization engine. The host detects a pathing failure in its multipathing software.
Notification Events
FIGURE 8-1 shows a typical port failure event.

Site : Lab 3286 - DSQA1 Broomfield
Source : diag.xxxxx.xxx.com
Severity : Error (Actionable)
Category : Switch
DeviceId : switch:100000c0dd00b682
EventType: StateChangeEvent.M.port.8
EventTime: 01/30/2002 11:17:22
’port.8’ in SWITCH diag209-sw2a (ip=192.168.0.32) is now Not-Available (status-state changed from ’Online’...
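When many notifications of this kind have to be triaged, the header fields can be extracted mechanically. The following Python sketch assumes that each header field arrives on its own line in "Name : value" form and that EventTime uses month/day/year order. The function name parse_event_header is hypothetical and is not part of the Storage Automated Diagnostic Environment.

```python
from datetime import datetime

# Header field names as they appear in the notification shown above.
HEADER_FIELDS = ("Site", "Source", "Severity", "Category",
                 "DeviceId", "EventType", "EventTime")


def parse_event_header(text):
    """Return a dict of header fields (hypothetical helper, not a Sun tool)."""
    event = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as
        # "switch:100000c0dd00b682" keep their own colons.
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in HEADER_FIELDS:
            event[name] = value.strip()
    if "EventTime" in event:
        # Assumption: month/day/year followed by a 24-hour time.
        event["EventTime"] = datetime.strptime(
            event["EventTime"], "%m/%d/%Y %H:%M:%S")
    return event


SAMPLE = """\
Site     : Lab 3286 - DSQA1 Broomfield
Source   : diag.xxxxx.xxx.com
Severity : Error (Actionable)
Category : Switch
DeviceId : switch:100000c0dd00b682
EventType: StateChangeEvent.M.port.8
EventTime: 01/30/2002 11:17:22
"""

event = parse_event_header(SAMPLE)
print(event["Severity"], event["DeviceId"], event["EventTime"])
```

Sorting the resulting dictionaries by Severity and DeviceId is one way to group repeated alerts for the same switch port.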
Page 110
(ssd56): ’ ...continued on next page... Virtualization Engine Alert FIGURE 8-2 Received 7 ’SSD Warning’ message(s) on ’ssd56’ Last-Message: ’diag.xxxxx.xxx.com scsi: in 8...
Page 111
...continued from previous page... ---------------------------------------------------------------------- Site : Lab 3286 - DSQA1 Broomfield Source : diag.xxxxx.xxx.com Severity : Warning Category : Message DeviceId : message:diag.xxxxx.xxx.com EventType: LogEvent.driver.Fabric_Warning EventTime: 01/30/2002 11:50:07 Found 1 ’driver.Fabric_Warning’ warning(s) in logfile: /var/adm/messages on diag.xxxxx.xxx.com (id=809f76b4): INFORMATION: Fabric warning Jan 30 11:46:37 WWN:2b00006022004186 kern.warning] WARNING: fp(2): N_x Port with D_ID=108000,...
> I00002 46d45 < Undefined checkvemap: virtualization engine map v1 verification complete: FAIL. Manage Configuration Files Menu FIGURE 8-3 Sun StorEdge 3900 and 6900 Series 2.0 Troubleshooting Guide • March 2003 Sun Proprietary/Confidential: Internal Use Only addr_type volume hard vol1...
FRU Tests Available for the T1 or T2 Data Path FRU Running the tests from the Storage Automated Diagnostic Environment GUI guides you in discovering the failed FRU. Refer to Chapter 5 of the Storage Automated Diagnostic Environment User’s Guide for instructions on how to run tests. Run the switchtest to test the switches.
1. Run linktest from the Storage Automated Diagnostic Environment for a guided isolation procedure.
2. After replacing the failed FRU, run failbackt3path, if needed.
Sun StorEdge T3+ Array Event Grid The Storage Automated Diagnostic Environment Event Grid enables you to sort Sun StorEdge T3+ array events by component, category, or event type. The Storage Automated Diagnostic Environment GUI displays an event grid that describes an event and its severity, and tells what, if any, action should be taken.
TABLE 8-1 power.temp Alarm+ sysvolslice Alarm disk.port Alarm- The power temperature is normal. Yellow The vol slice feature is possible in Sun StorEdge T3+ array firmware version 2.1 and above.
Page 117
Storage Automated Diagnostic Environment Event Grid for the Sun StorEdge T3+ Array TABLE 8-1 interface. Alarm- loopcard.cable power.battery Alarm- The Sun StorEdge T3+ array has reported that a loopcard is in a failed state. Possible Drive Status Messages: Value Description 0 Drive mounted...
Page 118
Alarm- Alarm time_diff Alarm The state of a fan on the Sun StorEdge T3+ array is not optimal. The state of the power in...
Page 119
Storage Automated Diagnostic Environment Event Grid for the Sun StorEdge T3+ Array TABLE 8-1 enclosure Audit Comm_ Established Comm_ Established Comm_Lost Auditing a new Sun StorEdge T3+ array Audits occur every week. The Storage Automated Diagnostic Environment sends a detailed description of the...
Page 120
Diagnostic t3test Test- Diagnostic t3volverify Test- Down OutOfBand (oob) means that the Sun StorEdge T3+ array failed to answer a ping or failed to return its tokens.
Page 121
Storage Automated Diagnostic Environment Event Grid for the Sun StorEdge T3+ Array TABLE 8-1 enclosure Discovery controller Topology disk Topology interface. Topology loopcard power Topology enclosure Location Change enclosure QuiesceEnd The Storage Automated Diagnostic Environment discovered a new Sun StorEdge T3+ array Discovery events occur...
Page 122
Change+ volume State Change+ Quiesce has started on a Sun StorEdge T3+ array. ’The Sun StorEdge T3+ array has reported that a controller was removed from the chassis.
Page 123
Storage Automated Diagnostic Environment Event Grid for the Sun StorEdge T3+ Array TABLE 8-1 power State Change+ controller State Change+ disk State Change+ interface. State loopcard Change+ volume State Change+ power State Change+ controller State Change- The status of the PCU has changed from ready-disable to ready-enable.
Page 124
State Change- interface. State loopcard Change- The Sun StorEdge T3+ array has reported that a disk has failed. The Sun StorEdge T3+ array has indicated that the loopcard is no longer in an optimal state.
Page 125
Storage Automated Diagnostic Environment Event Grid for the Sun StorEdge T3+ Array TABLE 8-1 volume State Change- power State Change- enclosure Statistics The Sun StorEdge T3+ array has reported that a power cooling unit has been disabled.
Page 126
CHAPTER
Troubleshooting Virtualization Engine Devices
This chapter describes how to troubleshoot the virtualization engine component of a Sun StorEdge 6900 series system. This chapter contains the following sections:
“About the Virtualization Engine” on page 107
“Virtualization Engine Diagnostics”...
It then passes this information, in the form of an SRN, to the Error Log file.
Error Log Analysis Commands
To Display the Log Files and Retrieve SRNs
Type:
# /opt/svengine/sduc/sreadlog
Errors that need action are returned in the following format:
TimeStamp:nnn:Txxxxx.uuuuuuuu SRN=mmmmm
A description of the errors follows.
Item Description
The time and date when the error occurred...
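Because every actionable entry follows the same TimeStamp/T/SRN layout, the SRN can be pulled out of saved sreadlog output with a short script. The following Python sketch is only an illustration: the names ENTRY and parse_sreadlog and the neutral field labels stamp, t_major, and t_minor are hypothetical, and the pattern is written to tolerate optional spaces around each placeholder. It is exercised here with the format template itself, not with real log data.

```python
import re

# One entry: TimeStamp:nnn:Txxxxx.uuuuuuuu SRN=mmmmm
# Optional whitespace is allowed around each placeholder.
ENTRY = re.compile(
    r"TimeStamp:\s*(?P<stamp>.+?)\s*:T\s*"
    r"(?P<t_major>\S+?)\.(?P<t_minor>\S+)"
    r"\s+SRN=\s*(?P<srn>\w+)"
)


def parse_sreadlog(text):
    """Return one dict per entry found in saved sreadlog output."""
    return [match.groupdict() for match in ENTRY.finditer(text)]


# The format template from this section, used as stand-in input.
template = "TimeStamp: nnn :T xxxxx.uuuuuuuu SRN= mmmmm"
entries = parse_sreadlog(template)
print(entries)
```

The srn value of each entry can then be looked up in the SRN reference table in the appendix.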
TABLE 9-1 Power Status Fault 1 The Status LED blinks a service code when the Fault LED is solid on. Color State Green...
Power LED Codes
The virtualization engine LEDs are shown in FIGURE 9-1.
[Figure labels: VIRTUALIZATION ENGINE, STATUS LED, POWER LED, FAULT LED]
FIGURE 9-1 Virtualization Engine Front Panel LEDs
Interpreting LED Service and Diagnostic Codes
The Status LED communicates the status of the virtualization engine in decimal numbers.
TABLE 9-3 Speed, Activity, and Validity of the Link TABLE 9-3 Speed Link Activity Power Switch Status Port LED Rear Fault LED Color...
FC Link Error Status Report
The virtualization engine’s host-side and device-side interfaces provide statistical data for the counts listed in TABLE 9-4.
TABLE 9-4 Virtualization Engine Statistical Data
Count Type Description
Link failure count The number of times the virtualization engine’s frame manager detects a nonoperational state or other failure of N port initialization protocol.
Page 134
Protocol Error Count Invalid Word Count Invalid CRC Count diag.xxxxx.xxx.com: root# Note – v1 represents the first virtualization engine pair. FC Link Error Status Example...
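A single reading of these counts says little by itself; comparing two readings taken some time apart shows which links are still accumulating errors. The following Python sketch assumes the counts are cumulative, which this section does not state explicitly. The function name growing_counters and the sample values are hypothetical; only the count names come from the table and example above.

```python
def growing_counters(before, after):
    """Return {count name: increase} for counts that rose between readings."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# Hypothetical readings for one interface, taken some minutes apart.
first_reading = {
    "Link failure count": 1,
    "Protocol Error Count": 0,
    "Invalid Word Count": 12,
    "Invalid CRC Count": 3,
}
second_reading = {
    "Link failure count": 1,
    "Protocol Error Count": 0,
    "Invalid Word Count": 40,
    "Invalid CRC Count": 10,
}

print(growing_counters(first_reading, second_reading))
```

Counts that keep rising between readings point to the cable, GBIC, or port on that link segment as candidates for the FRU tests.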
Note – The Serial Loop IntraConnect (SLIC) daemon must be running for the svstat(1M) -d v1 command to work.
Translating Host-Device Names
You can translate host-device names to VLUN, disk pool, and physical Sun StorEdge T3+ array LUNs. The luxadm output for a host device shows the unique VLUN serial number that is needed to identify this LUN.
From this screen, note that the VLUN number is 62 57 33 4b 30 30 31 48, beginning with the fifth pair of numbers on the third line, up to and including the twelfth pair. ...+...SUN...
To Display Sun StorEdge Traffic Manager (MPxIO)-Enabled Devices If the devices support the Sun StorEdge Traffic Manager software, you can use this shortcut. Type: # /usr/sbin/luxadm display /dev/rdsk/c6t29000060220041956257334B30303148d0s2 DEVICE PROPERTIES for disk: /dev/rdsk/ c6t29000060220041956257334B30303148d0s2 Status(Port A): O.K. Status(Port B): O.K. Vendor: Product ID: SESS01...
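In this example, the last 16 hexadecimal digits of the target portion of the MPxIO device name (6257334B30303148) are the same eight byte pairs as the VLUN serial number quoted earlier (62 57 33 4b 30 30 31 48). The following Python sketch extracts those digits from a device path. It assumes that this correspondence holds for other MPxIO-enabled devices, which is inferred from this one example only; the function name vlun_serial_from_device is hypothetical.

```python
import re


def vlun_serial_from_device(path):
    """Return (hex byte pairs, ASCII text) for the VLUN serial in an MPxIO path.

    Assumption: the target portion of the name is 32 hex digits and the
    serial number is its last 16 digits, as in the example in this section.
    """
    match = re.search(r"c\d+t([0-9A-Fa-f]{32})d\d+", path)
    if match is None:
        raise ValueError("not an MPxIO-style device name: " + path)
    serial_hex = match.group(1)[16:]
    pairs = [serial_hex[i:i + 2].lower() for i in range(0, 16, 2)]
    ascii_text = bytes.fromhex(serial_hex).decode("ascii")
    return " ".join(pairs), ascii_text


device = "/dev/rdsk/c6t29000060220041956257334B30303148d0s2"
hex_pairs, text = vlun_serial_from_device(device)
print(hex_pairs)
print(text)
```

The byte pairs are the form that appears in the luxadm inquiry screen, so either form can be used when searching saved output for the LUN.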
-------------------------------------------------------------------------------- Undefined Undefined Note – This example uses the virtualization engine map file, which could include old information. MP Drive VLUN VLUN Target...
Page 139
2. Optionally open a Telnet session to the virtualization engine and run the runsecfg utility to poll a live snapshot of the virtualization engine map. Refer to “To Failback the Virtualization Engine” on page 120 for instructions about how to open a Telnet session. Determining the virtualization engine pairs on the system ...
For example, the LUNs in disk pools t3b00 and t3b01 are named t3b0 on the Sun StorEdge T3+ array device.
CODE EXAMPLE 9-4
# /opt/SUNWsecfg/bin/failbackt3path -n t3b0
CODE EXAMPLE 9-3 Multipath Drive Summary...
Page 141
The virtualization engine should be plugged in to port 1 (on a 1 Gbit switch) of the same two switches (port 0 on a 2 Gbit switch). Refer to the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide to determine which switch ports are used for each component.
Page 142
Name Server ************ Port Address ---- ------- ------ 10C000 10C1EF Error-Free Online Switch Ports Admin State Oper State ----------- ---------- online online online...
6. If either port 1 or port 2 is offline, check the GBICs and cables.
7. If a Sun StorEdge T3+ array switch port is offline, log in to the Sun StorEdge T3+ array and look at the status of the controllers and the port list, as shown in CODE EXAMPLE 9-7.
CODE EXAMPLE 9-7 Status of Sun StorEdge T3+ Array Controllers and Port List...
# /opt/SUNWsecfg/flib/setveport -n vehostname -e
6. To reset the virtualization engine and force it to synchronize with its partner virtualization engine, type:
# resetve -n vehostname
To Reset the SAN Database on a Single Virtualization Engine 1. To disconnect the virtualization engine’s device-side FC cables, type: # setveport -v virtualization-engine-name -d 2. Open a Telnet session to the virtualization engine specified in Step 1. 3. Enter the password. The User Service Utility Menu is displayed.
# ipcrm -m 301 -m 302 -m 303 -s 196608 -s 196609 -s 196610
Refer to the ipcrm(1) man page for details. The message queues, shared memory, and semaphores have been removed.
MODE...
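The ipcrm arguments in this step are assembled from the identifiers that ipcs reports. The following Python sketch builds the argument list from saved ipcs output without running anything. It assumes the usual column order T ID KEY MODE OWNER GROUP, where T is q, m, or s. The function name build_ipcrm_command is hypothetical, and the keys and modes in the sample are invented for illustration; only the identifiers come from the command shown above.

```python
# ipcrm flag for each ipcs facility type letter.
FLAG_FOR_TYPE = {"q": "-q", "m": "-m", "s": "-s"}


def build_ipcrm_command(ipcs_text, owner):
    """Return an ipcrm argument list for every facility owned by `owner`."""
    args = ["ipcrm"]
    for line in ipcs_text.splitlines():
        fields = line.split()
        # Assumed columns: T ID KEY MODE OWNER GROUP
        if (len(fields) >= 6 and fields[0] in FLAG_FOR_TYPE
                and fields[1].isdigit() and fields[4] == owner):
            args += [FLAG_FOR_TYPE[fields[0]], fields[1]]
    return args


# Hypothetical ipcs listing; the keys and modes are invented.
SAMPLE_IPCS = """\
T         ID      KEY        MODE        OWNER    GROUP
Shared Memory:
m        301   0x00001001 --rw-rw-rw-     root     root
m        302   0x00001002 --rw-rw-rw-     root     root
m        303   0x00001003 --rw-rw-rw-     root     root
Semaphores:
s     196608   0x00002001 --ra-ra-ra-     root     root
s     196609   0x00002002 --ra-ra-ra-     root     root
s     196610   0x00002003 --ra-ra-ra-     root     root
"""

command = build_ipcrm_command(SAMPLE_IPCS, "root")
print(" ".join(command))
```

Review the generated command before running it, because other software on the same host can own IPC facilities under the same user.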
Page 147
4. To restart the slicd for the v1 virtualization engine, type # /opt/SUNWsecfg/bin/startslicd -n v1 (or v2, depending on configuration) 5. Confirm that the slicd daemon is running: # ps -ef | grep slicd root 16132 16130 0 11:45:00 ? root 16135 16130 0 11:45:00 ? root 16130...
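The confirmation in Step 5 can be scripted when it has to be repeated across several checks. The following Python sketch lists the slicd process IDs found in saved ps -ef output and ignores the grep process itself. It assumes the standard eight ps -ef columns with a single-token STIME. The function name slicd_pids is hypothetical, and the TIME and CMD columns in the sample are invented; the process IDs are the ones shown in this step.

```python
def slicd_pids(ps_output):
    """Return the PIDs of slicd processes found in `ps -ef` output."""
    pids = []
    for line in ps_output.splitlines():
        # Assumed columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header line or unexpected layout
        command = fields[7]
        if "slicd" in command and "grep" not in command:
            pids.append(int(fields[1]))
    return pids


# Hypothetical listing; TIME and CMD values are invented.
SAMPLE_PS = """\
     UID   PID  PPID  C    STIME TTY      TIME CMD
    root 16132 16130  0 11:45:00 ?        0:00 ./slicd
    root 16135 16130  0 11:45:00 ?        0:00 ./slicd
    root 16130     1  0 11:45:00 ?        0:00 ./slicd
    root 16200 16100  0 11:46:10 pts/1    0:00 grep slicd
"""

running = slicd_pids(SAMPLE_PS)
print(running if running else "slicd is not running")
```

An empty list means the daemon did not start, in which case the restart step should be repeated.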
Page 148
Unit Serial Number : 00250339 PCB Number MAC address DIP SW1 = 00000000 76543210 Error: None : 2.02.42 : 00000060-2200418A : 00166425 : 0.60.22.3.D1.E3...
Diagnosing a creatediskpools(1M) Failure When modifying the Sun StorEdge T3+ array configuration on a Sun StorEdge 6900 series, the system should automatically create disk pools. If the virtualization engine cannot find two paths to all Sun StorEdge T3+ array LUNs, however, the multipath drives cannot be created.
Page 150
2. Run the showswitch(1M) command for sw2a and sw2b. Refer to the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Manual to see which switch ports the Sun StorEdge T3+ array and virtualization engine should be attached to.
Page 151
3. After corrective action has been successfully completed, run the following command: # creatediskpools -n t3b0 The SEcfglog file should display the following message: Thu May 30 17:40:23 MDT 2002 creatediskpools: t3b0 ENTER: /opt/SUNWsecfg/ bin/creatediskpools -n t3b0. Thu May 30 17:40:24 MDT 2002 checkslicd: v1 ENTER /opt/SUNWsecfg/bin/ checkslicd -n v1.
Virtualization Engine Event Grid, from which you can select related criteria for the event you are troubleshooting.
FIGURE 9-3 Virtualization Engine Event Grid
Page 154
Down Lost oob. Comm_ Down command Lost Required Information Action The virtualization engine 1. Check the status of the slicd failed to execute slicd command.
Page 155
Storage Automated Diagnostic Environment Event Grid for Virtualization Engine (Continued) TABLE 9-5 Component EventType Severity ve_diag Diagnostic Test- veluntest Diagnostic Test- enclosure Discovery Required Information Action The ve_diag test on ve-1 failed The veluntest failed The discovery device found a new virtualization engine called v1a.
Page 156
CHAPTER
Troubleshooting Using Microsoft Windows 2000
General Notes
Use the manufacturer’s HBA utilities to monitor and diagnose the HBAs. The examples in this chapter use QLogic’s SANblade Manager utility. The Storage Automated Diagnostic Environment running on the Storage Service Processor is not able to monitor the host-to-switch link.
From the Microsoft Windows 2000 Advanced Server GUI, click Programs -> T3 StorEdge Configurator -> Configurator.
FIGURE 10-1 Launching the Sun StorEdge T3+ Array Failover Driver
Checking the Version of the Sun StorEdge T3+ Array Failover Driver From the Microsoft Windows 2000 Advanced Server GUI, click Help -> About. The About Multipath Configurator window is displayed. Sun StorEdge T3+ Array Failover Driver Versions 2.0.0.123 and 2.1.0.104 FIGURE 10-2 Note –...
Note – The Sun StorEdge T3+ Array Failover Driver GUI is limited to the Sun StorEdge 3900 series systems. You must use the CLI for the Sun StorEdge 6900 series systems. 1. Make sure the Sun StorEdge T3+ Array failover driver is loaded.
4. Compare the healthy Sun StorEdge 3900 series system to a system that has experienced a LUN failover. A system that has experienced a LUN failover has a broken line connecting the HBA to the storage, as shown in FIGURE 10-4...
Although the Sun StorEdge T3+ Array Failover Driver GUI is limited to the Sun StorEdge 3900 series systems, you can use the CLI for both the Sun StorEdge 3900 series systems and the Sun StorEdge 6900 series systems.
CONTROLLER ID:0 DESC:Sun Microsystems 69XX Array Controller
STATE:up_active(2) STATE:up_active(2)
FIGURE 10-8 Sun StorEdge T3+ Array Failover Driver CLI Example Output for the Sun StorEdge 6900 Series
TABLE 10-1 lists some of the codes and descriptions for CLI output for a Sun StorEdge 6910 series system.
TABLE 10-1 Tips for Interpreting Sun StorEdge 6910 Series CLI Output
Component   Output Code   Description
Device      FW_REV        Firmware revision level of the virtualization engine
            NAME          The worldwide name of the Master virtualization engine of...
            PATH
            TYPE
Page 166
CHAPTER
Example of Fault Isolation
In the following example, a fault was injected into a running Sun StorEdge 3900 series system to show a troubleshooting flow.
1. Discover the Error
One of the best ways to discover errors is by using the Storage Automated Diagnostic Environment monitoring system.
FIGURE 11-2 Drilling Down for Sun StorEdge T3+ Array Failover Driver Fault Detail
Array 1: A solid line connecting the HBA to the storage represents a healthy system.
The primary path to Drive F: failed. The alternate path is currently handling all of the I/O.
3. Check the HBA
Using the HBA utility (QLogic SANblade in this example), confirm the fault.
FIGURE 11-3 Fault Confirmation Using QLogic SANblade
4.
The first run will test the switch-side GBIC as well as the Sun StorEdge network FC switch.
In the examples shown in FIGURE 11-5, FIGURE 11-6, and FIGURE 11-7, Port 2 on Switch diag156-sw1a was marked with a "Red" icon, indicating a problem.
Note – All tests were run with the default values.
FIGURE 11-5 Storage Automated Diagnostic Environment Test from Topology
FIGURE 11-6 Storage Automated Diagnostic Environment Test from Topology Pull-Down Menu
FIGURE 11-7 Storage Automated Diagnostic Environment Test from Topology Test Detail
Page 173
6. Recover the problem with the GBIC or the switch. a. Recable the link between the HBA and switch. b. Use the Sun StorEdge T3+ Array Failover Driver GUI for the Sun StorEdge 3900 series system, or the CLI for the 6900 series, to recover the multipathing.
Port has gone back online. The Multipath Configurator GUI should show both paths online and handling I/O, as illustrated in FIGURE 11-10.
FIGURE 11-10 Recovered Paths
APPENDIX
Virtualization Engine References
This appendix contains the following information:
“SRN Reference” on page 155
“SRN/SNMP Single Point-of-Failure Descriptions” on page 159
“Port Communication Numbers” on page 160
“Virtualization Engine Service Codes” on page 160
SRN Reference
TABLE A-1 provides an explanation of SRNs for the virtualization engine.
70010 The CleanUp configuration table is completed. 70020 The SAN physical configuration has changed. Corrective Action If too many check conditions are returned, check the link status.
Page 177
TABLE A-1 SRN Reference
SRN Description
70021 The drive is offline.
70022 The virtualization engine is offline.
70023 The drive is unresponsive.
70024 For the Sun StorEdge T3+ array pack, the master virtualization engine has detected the partner virtualization engine’s IP Address.
70025 For Sun StorEdge T3+ array pack: The master virtualization engine is unable to detect the...
Page 178
The virtualization engine failed to read the SAN event log. 72007 The SLIC daemon connection is down. Corrective Action 1. Check the condition of the virtualization engine.
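An SRN retrieved from the error log can be matched against the descriptions in the SRN reference table with a small lookup. The following Python sketch contains only the SRNs whose descriptions are legible in this excerpt, so it is deliberately incomplete. The names SRN_DESCRIPTIONS and describe_srn are hypothetical.

```python
# Only the SRNs legible in this excerpt; the full table has more entries.
SRN_DESCRIPTIONS = {
    "70010": "The CleanUp configuration table is completed.",
    "70020": "The SAN physical configuration has changed.",
    "70021": "The drive is offline.",
    "70022": "The virtualization engine is offline.",
    "70023": "The drive is unresponsive.",
    "72007": "The SLIC daemon connection is down.",
}


def describe_srn(srn):
    """Return the description for an SRN, or a pointer back to the table."""
    return SRN_DESCRIPTIONS.get(
        str(srn), "not in this excerpt; see the SRN reference table")


for code in (70021, "72007", 99999):
    print(code, describe_srn(code))
```

Accepting both integers and strings keeps the lookup usable with SRNs parsed directly from log text.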
SRN/SNMP Single Point-of-Failure Descriptions
TABLE A-2 provides Simple Network Management Protocol (SNMP) descriptions, associated Service Request Numbers (SRNs), and recommendations for corrective action.
TABLE A-2 SRN/SNMP Single Point-of-Failure Table
SNMP Description 70020 • The SAN topology has changed. 70021 • The Global SAN configuration has 70030 changed.
TABLE A-4 Virtualization Engine Service Codes: 0-399 Host-Side Interface Driver Errors
Service Code Number
Port Management programs...
Page 181
TABLE A-4 Virtualization Engine Service Codes (Continued): 0-399 Host-Side Interface Driver Errors
An attempt to write a value into nonvolatile storage failed, perhaps because of a hardware failure, or one of the databases stored in Flash memory could not accept the entry being added.
Page 182
TABLE A-5 Virtualization Engine Service Codes: 400-599 Device-Side Interface Driver Errors
Service Code Number Cause of Error
The FC device-side type code is invalid.
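The two service code tables divide the codes by numeric range, so a code read from the Status LED can be assigned to the right table before looking up its cause. The following Python sketch covers only the two ranges named in this excerpt; the function name service_code_group is hypothetical, and codes outside these ranges are reported as unknown because no other ranges are shown here.

```python
def service_code_group(code):
    """Return the table title for a service code, or None if out of range."""
    if 0 <= code <= 399:
        return "Host-Side Interface Driver Errors"
    if 400 <= code <= 599:
        return "Device-Side Interface Driver Errors"
    return None  # ranges above 599 are not covered in this excerpt


for code in (0, 399, 400, 599, 600):
    print(code, service_code_group(code))
```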
Configuration Utility Error Messages The Sun StorEdge 3900 and 6900 Series Reference Manual lists and defines the command utilities that configure the various components of the Sun StorEdge 3900 and 6900 series storage systems. If you encounter errors with the command line utilities, refer to the recommendations for corrective action in this appendix.
The environment variable VEPASSWD might be set to an incorrect value. Suggested Corrective Action Run ps -ef | grep savevemap...
Page 185
Virtualization Engine Error Messages (Continued) TABLE B-1 Source of Error Message Cause of Error Message Common to After resetting the virtualization virtualization engine engine, the $VENAME is unreachable. The hardware might be faulty. Common to • The device-side operating mode is virtualization engine not set properly.
Page 186
Sun StorEdge T3+ array physical LUN ${t3lun} for disk pool ${diskpool} might not be mounted. Suggested Corrective Action 1. Run the checkvemap command again.
Page 187
Virtualization Engine Error Messages (Continued) TABLE B-1 Source of Error Message Cause of Error Message • The import zone data failed. restorevemap • The restore physical and logical data failed. • The restore zone data failed. • The virtualization engine is unable setdefaultconfig to properly configure the virtualization engine host...
The user has set a login ID and password on the 2 Gbit switch. Suggested Corrective Action Either call the command with the...
Page 189
Sun StorEdge Network FC Switch Error Messages (Continued) TABLE B-2 Source of Error Message Cause of Error Message • The current configuration on checkswitch $switch does not match the defined configuration. • One of the predefined static switch configuration parameters that can be overridden for special configurations (such as NT connect or cascaded switches) is set incorrectly.
Page 190
${switch} to ${cid}. This occurs only in a SAN environment with cascaded switches. Suggested Corrective Action You might be attempting to download a flash file for an 8-port switch to a 16-port switch.
Sun StorEdge T3+ Array Partner Group Error Messages
Caution – Running restoret3config(1M) or modifyt3config(1M) destroys all data on the Sun StorEdge T3+ array.
TABLE B-3 Sun StorEdge T3+ Array Error Messages
Source of Error Message Cause of Error Message Common to Sun •...
Page 192
checkt3config: Snapshot configuration files are not present. Unable to check configuration. Suggested Corrective Action 1. Refer to the T3 default/custom configuration table in the Sun StorEdge 3900 and 6900 Series 2.0...
Sun StorEdge T3+ Array Error Messages (Continued) TABLE B-3 Source of Error Message Cause of Error Message • The $lun status reported a bad or checkt3mount nonexistent LUN. • While checking the configuration using the showt3 -n command, operations abort. User-specified LUN $lun does not createt3group exist on the Sun StorEdge T3+ array.
Page 194
sett3lunperm: LUN $lun does not exist on the Sun StorEdge T3+ array. Suggested Corrective Action 1. Check the Sun StorEdge T3+ configuration with the showt3 -n t3_name command.
Other SUNWsecfg Error Messages TABLE B-4 Source of Error Message Cause of Error Message Common to all If the Sun StorEdge 3900 or 6900 components series has more than two failures (for example, both virtualization engines and two switches are...
Page 196
Abbreviations and Acronyms
This list contains definitions for acronyms used in this troubleshooting guide.
ASIC application-specific integrated circuit
CLI command-line interface
CRC cyclic redundancy code
DAS direct attached storage
EOF end of file
FC Fibre Channel
FC-ELS Fibre Channel Extended Link Service
FRU field replaceable unit
GBIC gigabit interface converter
GUI graphical user interface...
Page 198
SRN Service Request Number
Sun Remote Services
Storage Service Processor
storage virtualization engine
TCP/IP Transmission Control Protocol/Internet Protocol
VLUN virtual LUN
WWN worldwide name
Page 199
FRU tests available, 64 isolation of, 64 Storage Service Processor-side notification, 61 troubleshooting, 60 verifying data host, 62 verifying Sun StorEdge 3900 series, 62 verifying Sun StorEdge 6900 series, 62 c2 path returning to production, 19 unconfiguring, 17...
Page 200
147 Fibre Channel link A1 or B1 data host verification, 45 A2 to B2 host side verification, 51 A3 or B3 host-side verification, 56...
Page 201
4 Microsoft Windows 2000 troubleshooting, 137 viewing system errors, 26 Microsoft Windows NT configurations, 7 monitoring functions for Sun StorEdge 3900 and 6900 Series, 2 multipath configurator array properties, 141 healthy configuration, 140 with LUN failover, 141...
Page 202
12 primary data paths to Sun StorEdge T3+ array, 13 Sun StorEdge Network FC Switch-8 and Switch-16 switch...
Page 203
test examples command line, 27 qlctest(1M), 27 switchtest(1M), 28 testing FRUs, 5 tests how to run, 5 Sun StorEdge T3+ arrays, 5 thresholds used in PFA, 2 tools troubleshooting, 23 troubleshooting broad steps, 3 check status of Sun StorEdge T3+ array, 4 check status of the Sun StorEdge network FC switch-8 and switch-16 switch, 5 check status of the virtualization engine, 5...
Page 204
74 Index 184 Sun StorEdge 3900 and 6900 Series 2.0 Troubleshooting Guide • March 2003 Sun Proprietary/Confidential: Internal Use Only...