Sun StorEdge™ SAN 4.0 Release Field Troubleshooting Guide Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. 650-960-1300 Part No. 816-6580-11 October 2002, Revision A Send comments about this document to: docfeedback@sun.com...
Sun, Sun Microsystems, the Sun logo, AnswerBook2, docs.sun.com, Sun StorEdge network FC switch-8, and Solaris are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc.
Contents Preface xi Introduction 1 Document Scope 2 New Features of the Sun StorEdge SAN 4.0 Release 3 Cascading Switches (E_Ports) 7 Configurations 9 Supported Hardware 10 Supported Configurations 12 Operating Environments 12 Hosts Host/Operating Environment Rules 14 Storage Arrays 14...
Software Packages and Patches 16 To generate the most recent patch list for a Sun Solaris Release 16 To generate the most recent patch list for a specific Sun StorEdge SAN 4.0 Release Configuration 16 Unbundled Software 17 Switches 18 Switch Port Types 19 New Sun StorEdge SAN 4.0 Release Port Types 19...
Diagnostics 31 Diagnostic Tools 32 Storage Automated Diagnostic Environment Version 2.1 32 Storage Automated Diagnostic Environment Version 2.1 Functions 33 To Access the Diagnostic Tests 35 Sun Explorer Data Collector (SUNWexplo) and T3Extractor 40 Explorer 40 T3Extractor 40 Diagnosing and Troubleshooting the Sun Switch 41 Using Switch Counter Information 41 qlctest Test 42 Troubleshooting Example 43...
2.1 Package 79 Brocade Communications Systems Switch Troubleshooting 81 Related Documentation 82 Supported Configurations 83 QuickLoop 87 Current Issues with the Storage Automated Diagnostic Environment Version 2.1 and Brocade Switches 87 Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Storage Automated Diagnostic Environment Version 2.1 and Brocade Switches 87 brocadetest(1M) 88 Other Diagnostic Tools 89 Sun StorEdge and Brocade Communications Systems Port Descriptions and Differences 95 Accessing the Brocade Silkworm Switch 96 Power On Self Test (POST) 98 Removing Power 99...
FIGURE 2-2 Two Hosts Connected to Four Sun StorEdge T3 Array Enterprise Configurations 28 FIGURE 2-3 Two Hosts Connected to Sun StorEdge T3 Array Partner Group—Each Host with Separate FIGURE 2-4 Non-shared Storage 29 Storage Automated Diagnostic Environment Version 2.1 Home Window 32 FIGURE 3-1 Storage Automated Diagnostic Environment—Diagnose Tab Selected 35...
Continued Link Test Example Results 115 FIGURE B-7 Continued Link Test Example Results 116 FIGURE B-8 Storage Automated Diagnostic Environment Version 2.1—Test from Topology Window 119 FIGURE B-9 viii Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Comparison of the SAN 3.0 and SAN 4.0 Releases 3 TABLE 1-1 Supported Hardware 10 TABLE 2-1 Sun StorEdge SAN 4.0 Release Sun Operating Environment Compatibility Matrix 12 TABLE 2-2 Sun StorEdge SAN 4.0 Release Server Compatibility Matrix 13 TABLE 2-3 Sun StorEdge SAN 4.0 Release Storage Array Compatibility Matrix 14...
TABLE B-6 Nomenclature 95 Probable Failure Actions 123 TABLE C-1 Error Message Codes Defined 124 TABLE C-2 Diagnostic Error Messages 128 TABLE C-3 ASIC and Port Values 142 TABLE D-1 Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Preface This Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide describes how to diagnose and troubleshoot the Sun StorEdge SAN 4.0 hardware. It provides information and pointers to additional documentation you may need for installing, configuring, and using the configuration. The book is intended for use by Sun Service Engineers who have a good understanding of the product.
Shell Prompts
Shell                                    Prompt
C shell                                  machine_name%
C shell superuser                        machine_name#
Bourne shell and Korn shell              $
Bourne shell and Korn shell superuser    #
Switch-16 Release Notes Installer/user Sun StorEdge Network 2Gb Switch-8/16 875-3264 information—2 Gbyte (SANbox2) Management Manual switch Sun StorEdge Network 2 Gb FC Switch-16 FRU 816-5285 Installation Sun StorEdge Network 2Gb Switch-16 875-3263 (SANbox2) Installer’s/User’s Manual Reference Brocade Fabric OS Reference Manual Version 3.0...
Sun Cluster 3.0 Installation Guide 806-1419 Solaris Volume VERITAS Volume Manager 3.2 Installation 875-3165 Manager installation Guide RAID RAID Manager 6.22 User’s Guide 806-0478 Storage Rackmount Rackmount Placement Matrix 805-4748 Cabinet information xiv Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Sun StorEdge SAN 4.0 Release Related Documentation (Continued) TABLE P-1 man pages cfgadm utility cfgadm_fp (1M) format utility format (1M) luxadm utility luxadm (1M) * Find these documents at: http://www.sun.com/products-n-solutions/hardware/docs/Network_Storage_Solutions/SAN/index.html → Other Documentation. Accessing Documentation Online The docs.sun.com web site enables you to access select Sun technical documentation on the Web.
This manual only addresses troubleshooting. No repair or corrective action procedures are contained herein. This chapter contains the following sections: “Document Scope” on page 2 “New Features of the Sun StorEdge SAN 4.0 Release” on page 3...
Additional information and resources are available at: http://www.sun.com/storage/san/, or at: http://sunsolve.Sun.COM → Product Patches → PatchPro. These websites contain information on software versions and provide necessary patches. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
New Features of the Sun StorEdge SAN 4.0 Release
The Sun StorEdge SAN 4.0 release supports many new features, which are summarized in TABLE 1-1. Several features of the SAN 3.x release are not included in the SAN 4.0 release, and many features were carried forward. For an explanation of the new features, see the Sun StorEdge SAN 4.0 Release Configuration Guide.
Pluggable (SFP) 2- Gbit transceivers replace GBICs. Long-wave only SC- Long-wave and Long-wave and SC cables supported. short-wave SC short-wave SC-SC, cables supported. SC-LC, and LC-LC cables supported. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Multipathing and Multipathing and load balancing load balancing supported with the through the Sun Sun StorEdge Traffic StorEdge Traffic Manager Manager application application. with SunCluster 3.0 or VERITAS Cluster Server. Chapter 1 Introduction...
T3+ array firmware supported. supported. is supported. The Sun StorEdge 39x0, 69x0 and 99x0 series are also supported. Third-party Interoperability Compatibility capability with FC- SW2 mode on the new switches. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Sun StorEdge and Brocade Communications Systems, Inc. In the Sun StorEdge SAN 4.0 release, switches can be cascaded by using E_Ports. Cascading is supported with either a shortwave or a longwave Small Form Factor Pluggable (SFP) 2-gigabit transceiver. The use of shortwave SFPs allows a higher port count in a local configuration.
C H A P T E R Configurations This chapter contains information and instructions for configuring your Sun StorEdge Network Fibre Channel Switch-16 with one or more hosts and storage. This chapter contains the following sections: “Supported Hardware” on page 10 “Supported Configurations”...
In a single switch configuration, the switch is connected to the host through a fiber optic cable to a Sun StorEdge PCI Fibre Channel Network Adapter. The other ports of the switch are connected to storage devices through a fiber optic cable.
TABLE 2-1 Supported Hardware (Continued)
Model, Part Number, or System Code    Description
X9721A                                0.4-meter fiber cable (LC-SC)
X9722A                                2-meter fiber cable (LC-SC)
X9723A                                5-meter fiber cable (LC-SC)
X9724A                                15-meter fiber cable (LC-SC)
X9732a                                2-meter fiber cable (LC-LC)
X9733a                                5-meter fiber cable (LC-LC)
X9734a                                15-meter fiber cable (LC-LC)
1 You must use a long-wave SFP and corresponding long-wave fiber cable if you cascade more than 500...
Sun StorEdge SAN 4.0 Release Sun Operating Environment Compatibility TABLE 2-2 Matrix Operating Environment Version Notes Sun Solaris 2.6 Not supported Sun Solaris 7 Not supported Sun Solaris 8 02/02 (Update 7) or later Sun Solaris 9 Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
TL/fabric switch mode Sun StorEdge 69x0 array Requires switch hardware or firmware upgrade to use SAN 4.0 capabilities. Sun StorEdge 9960 & 9910 arrays Sun StorEdge 9980 & 9970 arrays Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Maximum initiators per LUN Maximum initiators per zone 1 The host must be connected to the F_Port on the switch; a Sun StorEdge T3 array must be connected to the TL port of the switch. 2 This implies 2 initiators (2 hosts) for simple arrays (T3WG), but 4 initiators (2 hosts) for a partner pair (T3ES).
The PATCHPRO Interactive menu is displayed. 5. Select all the appropriate features of your system in the following areas of the menu: OS Release Platform Disk Array Tape Libraries Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Sun StorEdge Instant Image See SunSolve for the latest patches. “On Demand Node Creation” SUNWcfpl:VERSION=11.8.0, REV=2001.07.14.21.42, SUNWcfplx:VERSION=11.8.0, REV=2001.07.14.21.42 Switches For high availability, configure the Sun StorEdge Network FC Switch-16 switch in parallel. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
F_, FL_, or E_Ports upon device detection. Private loop devices that require SL ports cannot connect to the new switches. The 2-Gbit Sun StorEdge network adapters in this release recognize private loop arrays as fabric devices when they are connected with TL_Ports or L_Ports.
E_Port, F_Port, or FL_Port. A port is defined as a U_Port when it is not yet fully connected or has not yet assumed a specific function in the fabric. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
For example, you can use port zoning to make all the disks of a Sun StorEdge T3 array belong to the same zone in a SAN. Alternately, you can share the resources of the array among several NS zones.
For high-availability applications, configure two sets of switches in parallel. Zones and Arrays Sun StorEdge T3 arrays support name server zones (or zones in which a host has made a point-to-point Fabric connection to a switch and the Sun StorEdge T3 array is attached to a TL port).
Zones and Storage You can dynamically add storage to a port-based or WWN-based zone, using cfgadm procedures for the Sun StorEdge T3 arrays. This requires the Sun StorEdge T3 and T3+ arrays to be connected as TL or Fabric devices.
Configuration Examples
Single Host Connected to One Storage Array
FIGURE 2-1 shows one host connected through fiber-optic cables to a Sun StorEdge T3 array enterprise configuration.
[FIGURE 2-1 diagram labels: Host, Host Adapter (two), Switches, Sun StorEdge T3 array partner pair, fiber-optic cables]
Single Host Connected to Multiple Storage Arrays
FIGURE 2-2 shows a single host connected to multiple Sun StorEdge T3 array partner pairs.
Note – You can attach different types of storage devices to the same switch, as long as the storage devices are in different zones.
[Diagram labels: Host, Host Adapter (two), Switches, Sun StorEdge T3 array partner pairs]
FIGURE 2-2 Single Host Connected to Multiple Sun StorEdge T3 Array Enterprise Configurations
Multihost
FIGURE 2-3 shows two hosts connected to four Sun StorEdge T3 array partner pairs. FIGURE 2-4 shows two hosts connected to a Sun StorEdge T3 array partner group in which each host maintains separate, non-shared storage.
Note – You can attach different storage types to the same switch as long as the storage devices are in different zones.
[Diagram labels: two Hosts, each with two Host Adapters; Switches; Sun StorEdge T3 partner pairs]
FIGURE 2-3 Two Hosts Connected to Four Sun StorEdge T3 Array Enterprise Configurations
Note – You must enable Sun StorEdge Traffic Manager software for failover across multiple hosts to function. The mp_support on the Sun StorEdge T3 array should be set to mpxio (Sun StorEdge Traffic Manager Software). Sun StorEdge L180 or L700 FC Tape Library...
Diagnostics This chapter provides an overview of the tools you can use to monitor, diagnose, troubleshoot, and gather information on the Sun StorEdge SAN 4.0 Release and on the Sun StorEdge Network Fibre Channel Switch-16. Detailed installation and configuration information can be found in the respective documentation of the tools.
(DAS) devices. It can be configured to monitor on a 24-hour basis, collecting information that enhances the reliability, availability, and serviceability (RAS) of the storage devices. Storage Automated Diagnostic Environment Version 2.1 Home Window FIGURE 3-1 Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
2. Reads the proper /var/adm/messages files, finds relevant entries, and reports them as events through the local email notification mechanism, if configured. 3. Connects to Sun StorEdge T3 and T3+ array storage devices directly through in- band data paths and out-of-band management paths.
The Storage Automated Diagnostic Environment can monitor host message files for errors, or connect directly through the “in-band” data path or “out-of-band” management path of Sun StorEdge devices, in order to obtain status information about each device being monitored.
To Access the Diagnostic Tests
1. Click the Diagnose tab in the Storage Automated Diagnostic Environment home window. Three links are then displayed below the tab, as shown in FIGURE 3-2.
FIGURE 3-2 Storage Automated Diagnostic Environment—Diagnose Tab Selected
2. Click the Diagnostic Tests link. Five tests are displayed, as shown in FIGURE 3-3.
Using the Topology view, you can select specific subtests and test options. The monitoring status of devices and links appears both in the test topology view and in the list view. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
FIGURE 3-5 Storage Automated Diagnostic Environment—Test from Topology Window with Background Reduced to 66%
FIGURE 3-6 Storage Automated Diagnostic Environment—Test from Topology Window with Background Reduced to 66% and Components Arranged for Viewing
Note – By querying the Storage Automated Diagnostic Environment version 2.1, you can gather the same information that you can gather using the sanbox API. Querying through the Storage Automated Diagnostic Environment is fully supported, unlike command-line sanbox API usage.
Using Switch Counter Information
Switch counter information can be helpful when troubleshooting the Sun StorEdge Network Fibre Channel Switch-16. Keep the following general point in mind when viewing switch counter information:
Quickly increasing counter values or abnormally high counter values may indicate a problem.
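The point about quickly increasing counters can be sketched as a comparison of two counter snapshots. This is a minimal illustration only: the counter names, the sample values, and the rate threshold are all hypothetical, and the guide does not define which growth rate counts as abnormal.

```python
from typing import Dict, List


def flag_suspect_counters(
    earlier: Dict[str, int],
    later: Dict[str, int],
    interval_seconds: float,
    max_rate_per_minute: float = 10.0,  # hypothetical threshold, tune per site
) -> List[str]:
    """Return the names of counters that grew faster than the threshold."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    suspects = []
    for name, new_value in later.items():
        # A counter missing from the earlier snapshot is treated as starting at 0.
        delta = new_value - earlier.get(name, 0)
        rate_per_minute = delta * 60.0 / interval_seconds
        if rate_per_minute > max_rate_per_minute:
            suspects.append(name)
    return sorted(suspects)


# Illustrative snapshots taken five minutes apart (names and values are made up).
before = {"crc_errors": 2, "link_failures": 0, "loss_of_sync": 1}
after = {"crc_errors": 412, "link_failures": 0, "loss_of_sync": 3}
print(flag_suspect_counters(before, after, interval_seconds=300))
```

Comparing rates rather than absolute values matters because a counter that has been high since the last reset is less informative than one that is climbing during the observation window.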
Sun StorEdge Network Fibre Channel Switch-16 counter information can be called up by using the SANbox Manager application. See the Sun StorEdge Network 2Gb Switch-16 (SANbox2) Management Manual. This manual can be found with the following steps. 1. Access the SAN Solutions web site.
C H A P T E R Troubleshooting Example In this section, a troubleshooting example is shown with a SAN 4.0 configured with Sun StorEdge 2 Gbyte FC switches and two Sun StorEdge T3+ arrays in an enterprise configuration. This chapter contains the following sections: “Example Configuration”...
The troubleshooting example has the following configuration:
- One Enterprise 450 Workgroup Server
- Solaris 9 update 1 with all relevant Sun StorEdge SAN 4.0 Release patches and packages
- Two Sun StorEdge T3+ arrays in an enterprise configuration (1 LUN per array)
- The two switches are zoned such that they present two isolated paths from the HBAs through the ISL links to the Sun StorEdge T3+ arrays
- Each HBA has physical connectivity to only one Sun StorEdge T3+ array
- The Storage Automated Diagnostic Environment version 2.1 is configured to...
6. Verify the fix.
- Storage Automated Diagnostic Environment version 2.1 monitoring status
- Storage Automated Diagnostic Environment version 2.1 diagnostic tests
- /var/adm/messages log information
- Multipathing status returns to normal condition
- LED status
Troubleshooting Example of a Host–to–Switch Error Determine the Error The first indication of a problem can come from a Storage Automated Diagnostic Environment version 2.1 email alert: Chapter 4 Troubleshooting Example...
1. Run the appropriate disk test Diagnostic to isloate the failing drive 2. The messages report the device that is posting the errors and the full path continued ... ( Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
continuation ... ( DETAILS: Sep 13 13:04:57 WWN: Received 6 ’SSD Warning’ message(s) on ’ssd2’ in 14 mins [threshold is 5 in 24hours] Last-Message: ’diag226.Central.Sun.COM scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g60020f20000003d53d3493930006a222 (ssd2): ’ ------------------------------------------------------------ Site : FSDE LAB Broomfield CO Source : diag226.central.sun.com Severity : Warning...
3. Check for other alerts that may indicate an underlying problem. (ex. Switch Ports offline) 4. The outputs of ’cfgadm -al’ and ’luxadm -e port’ may uncover other fabric problems. continued ... ( Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
The u2ctlr took control of LUN 0 on t3b2 SSD and SCSI warnings were seen on host diag226 Sun StorEdge Traffic Manager Software has degraded the paths to a device with WWN 50020f23000003d5 One HBA went from CONNECTED to NOT CONNECTED Port 0 on a Sun StorEdge 2 Gb FC switch (ip=172.20.67.84) went offline...
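The email alert quoted earlier reports 6 'SSD Warning' messages in 14 minutes against a threshold of 5 in 24 hours. A sliding-window count of that kind can be sketched as follows. This is an illustration of the idea, not the Storage Automated Diagnostic Environment's implementation; in particular, treating "more than 5" as the trigger is an assumption, and the individual timestamps are made up to end at the time shown in the alert.

```python
from datetime import datetime, timedelta


def threshold_exceeded(timestamps, limit=5, window=timedelta(hours=24)):
    """Return True if more than `limit` events fall within any span of `window`."""
    ordered = sorted(timestamps)
    start = 0
    for end, current in enumerate(ordered):
        # Slide the left edge forward until the span fits inside the window.
        while current - ordered[start] > window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False


# Six hypothetical warning times spread over 14 minutes, ending at 13:04:57.
first = datetime(2002, 9, 13, 12, 50, 57)
events = [first + timedelta(minutes=m) for m in (0, 3, 6, 9, 12, 14)]
print(threshold_exceeded(events))
```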
Determine the Extent of the Problem
Use the topology display of the Storage Automated Diagnostic Environment version 2.1 to see if any problems are shown. An example is shown in FIGURE 4-2.
FIGURE 4-2 Troubleshooting Example View 2
FIGURE 4-2 shows that the error affects only a single path. This can be confirmed by using the cfgadm command.
The luxadm -e port output shows that one of the HBAs has been affected. This leads to the conclusion that we have a single path problem, most likely affecting the HBA-to-switch link between /devices/pci@1f,2000/SUNW,qlc@1/fp@0,0 and port 0 of one switch. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
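Picking out the affected HBA from luxadm -e port output can be sketched as a small filter. The sketch assumes each port line is a device path followed by CONNECTED or NOT CONNECTED. The sample text is illustrative: the first path is the one named above, and the second path is hypothetical.

```python
def not_connected_ports(luxadm_output):
    """Return the device paths that luxadm -e port reports as NOT CONNECTED."""
    marker = "NOT CONNECTED"
    ports = []
    for line in luxadm_output.splitlines():
        line = line.strip()
        # A healthy port line ends with plain "CONNECTED", so it is skipped here.
        if line.endswith(marker):
            ports.append(line[: -len(marker)].strip())
    return ports


# Illustrative output; the second device path is a made-up example.
SAMPLE_LUXADM = """\
Found path to 2 HBA ports

/devices/pci@1f,2000/SUNW,qlc@1/fp@0,0:devctl    NOT CONNECTED
/devices/pci@1f,4000/SUNW,qlc@4/fp@0,0:devctl    CONNECTED
"""

print(not_connected_ports(SAMPLE_LUXADM))
```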
These command outputs indicate that both controllers are active, u2 owns all the LUNs, and WWN 50020f23000003d5 corresponds to the WWN of the master controller. This confirms that the problem is most likely not with the Sun StorEdge T3+ arrays. Thus, there is probably an upstream path problem.
Port 0 has gone offline. FIGURE 4-3 also shows that the only other affected device is the host. This indicates a host-to-switch connection problem.
Test the FRUs
The following FRUs exist in the host-to-switch link:
- Switch or switch port
- Switch-side SFP
- Cable
- Host HBA
To isolate the cause, perform one of the following options with the Storage Automated Diagnostics Environment:
- The switchtest in combination with the qlctest
- The linktest
Storage Automated Diagnostics Environment switchtest and qlctest Tests...
"QLC Subsystem ID = 0x106" "QLC Adapter Chip Revision = 1, Risc Revision = 4, Frame Buffer Revision = 1287, Riscrom Revision = 1, Driver Revision = 6.0-2-1.17 " continued ... ( Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
continuation ... ( "Running external loopback test" "Performing Loop Back Frame Test. Pattern: 0x7e7e7e7e" "Performing Loop Back Frame Test. Pattern: 0x7e7e7e7e" "Performing Loop Back Frame Test. Pattern: 0x1e1e1e1e" "Performing Loop Back Frame Test. Pattern: 0xf1f1f1f1" "Performing Loop Back Frame Test. Pattern: 0xb5b5b5b5" "Performing Loop Back Frame Test.
Restore ORIGINAL FC Cable into switch2: 100000c0dd00bfda (sw-67-84), port: 0 Suspect ORIGINAL FC GBIC or SFP in switch2: 100000c0dd00bfda (sw-67-84), port: 0 Retest to verify FRU replacement. linktest completed on FC interconnect: hba to switch2 Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Verify the Fix The Storage Automated Diagnostics Environment has identified the SFP as the most likely suspect. It suggests reconnecting the link and re-running the linktest to verify the results. You could also run the switchtest to stress the link with the number of test Fibre Channel frames.
ONLINE The luxadm display command output indicates that both paths to the Sun StorEdge T3+ array LUN are seen again. However, the array is still using the secondary paths for the I/O data stream (secondary path is ONLINE; primary path is STANDBY).
A P P E N D I X Brocade Communications Systems Upgrades and Installations This appendix contains topics that describe how to install a new SAN system using Brocade Communications Systems, Inc. Silkworm™ switch. “Installing a New SAN” on page 66 “Downloading Patches and Packages”...
Sun StorEdge Traffic Manager This is available as a patch which can be installed on Solaris 8 release 02/02 (Update 7) or later. It should be installed with the latest revision of Sun StorEdge Network Foundation Software. Sun StorEdge Network Foundation Software This software is included with the Solaris upgrades for the FC switch product.
Storage Automated Diagnostic Environment version 2.1 The Storage Automated Diagnostic Environment version 2.1 is a separately installed software product. It is a lightweight, remote, monitoring agent designed to track storage product reliability, availability and serviceability. The Storage Automated Diagnostic Environment version 2.1 also provides revision and patch level checking, log file monitoring, and diagnostic testing.
2. Compare the checksum value that is displayed to the patch checksum value given at the checksum File link: http://sunsolve.Sun.com If the values are identical, the patches were properly downloaded. Note – The checksum file at http://sunsolve.Sun.com is approximately 614 Kbytes. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
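The comparison in step 2 can be sketched in a few lines. The guide does not say here which checksum algorithm the published value uses, so MD5 is an assumption made only for illustration. The function names and the throwaway demonstration file are hypothetical.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Compute the MD5 digest of a file (MD5 is assumed, not confirmed by the guide)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large patch archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, published_value):
    """Compare the local digest with the published one, ignoring case and whitespace."""
    return file_md5(path) == published_value.strip().lower()


# Demonstration with a throwaway file standing in for a downloaded patch.
with tempfile.NamedTemporaryFile(delete=False) as demo:
    demo.write(b"example patch contents")
    demo_path = demo.name
published = hashlib.md5(b"example patch contents").hexdigest()
print(checksum_matches(demo_path, published))
os.remove(demo_path)
```

If the two values differ, the download is suspect and the patch should be fetched again before installation.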
Patch or Package Software Solaris 8 Solaris 8 02/02 (Update 7) or later 8_Recommended Solaris 8 Recommended and Security patch cluster SUNWsan Sun StorEdge SAN Foundation Kit SUNWcfpl cfgadm plug-in 32-bit package SUNWcfplx cfgadm plug-in 64-bit package 111412-07 Sun StorEdge Traffic Manager...
To Install the Software Note – These instructions are to install the Sun StorEdge Network Foundation Software 6 patch. 1. Install Solaris 8 02/02 (Update 7) or later. 2. Install the latest Solaris 8 Recommended Security patch cluster. See the README file for patch installation instructions and notes.
For each of the storage devices, upgrade the software, firmware, or configuration. After the above steps, you can leverage additional features provided by Brocade Silkworm 2400 (8-port), 2800 (16-port), 3800 (16-port), and 12000 (32/64/128 port) for: Sun StorEdge Traffic Manager functionality additional fabric zones additional initiators per zone host fabric connectivity...
1. From the Brocade web site, retrieve the switch firmware (for example, v2.6.x).
2. Download the firmware into your root (/) directory.
Note – Because UNIX systems include the rshd daemon and the cat command, you do not need to retrieve the rsh.ZIP file.
3. Log into the UNIX system as root and edit the following files:
a. Type the IP address and the switch name into the /etc/hosts file.
# vi /etc/hosts
<IP_address> <switch_name>
The output is displayed, as in CODE EXAMPLE 4-1.
CODE EXAMPLE 4-1 /etc/hosts file
# cat /etc/hosts
# Internet host table...
Note – With version 2.1 and higher, commands are not case-sensitive.
3. Check the syntax by typing firmwaredownload and following the screen prompts. See CODE EXAMPLE A-3 for an FTP example.
CODE EXAMPLE A-3 FTP Example
oem240:admin>firmwareDownload
Server Name or IP Address [host]: 10.32.99.29
User Name [user]: root
File Name [/usr/switch/firmware]: /var/tmp/v2.6.x
Protocol (RSHD or FTP) [rshd]: ftp
Password:
84776+3832+130980, csum 2ef6
loading to ram .......
writing flash 0 ......
writing flash 1 ......
download complete
oem240:admin>fastboot
4....
The order in which the SAN components should be upgraded is as follows: 1. Familiarize yourself with the required software components, versions and patches. Refer to Appendix B for the supportability matrix. 2. Back up all data. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
# pkginfo -l 1. Upgrade your SUNWsan package to Sun StorEdge SAN 4.0 Release. Before you start, check your system to see if it has been installed, and if it is already up to date. Use the pkginfo command to see if it has been installed.
To Upgrade the Storage Automated Diagnostic Environment Version 2.1 Package For all upgrades, you must first install the most recent Sun StorEdge Network Foundation Software patches. Refer to “To Install the Software” on page 70 for installation instructions before installing the SUNWstade package and the Brocade Communications Systems patch.
5. Check your SAN Management host to verify the version of the Storage Automated Diagnostic Environment version 2.1 installed.
# pkginfo -l SUNWstade
This appendix highlights the differences between troubleshooting a Brocade Silkworm configuration and troubleshooting a configuration that contains the current Sun StorEdge Network Fibre Channel family of switches. Current support is limited to diagnosing failures down to the FRU level. In Sun’s support model, the entire Silkworm switch is considered a FRU.
The Sun StorEdge switch documents are referenced for overall configuration guidelines. Sun StorEdge SAN 4.0 Release Installation Guide Sun StorEdge SAN 4.0 Release Configuration Guide Sun StorEdge SAN 4.0 Release Notes Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Supported Configurations The Brocade Communications Systems Silkworm switch configurations and the Sun StorEdge switch configurations follow the same rules for maximum number of initiators, supported number of arrays per zone, and other hardware-specific information. Refer to Chapter 2, “Configurations” of this guide for supported hardware configurations.
Disk Array Supportability Matrix with Solaris 8 02/02 (Update 7) or Later TABLE B-2 Dynamic addition of target to a zone. Disk Arrays Disk Firmware Add First/Additional T3A WG/ES 1.18 Yes/Yes T3B WG/ES Yes/Yes Fibre Channel Switch Supportability Matrix with Solaris 8 02/02 (Update 7) TABLE B-3 or Later FC Switches...
5. Select all the appropriate features of your system in the following areas of the menu: OS Release Platform Disk Array Tape Libraries Disk Drives Tape Drives Switches and HBAs SAN Products | Brocade SAN Release Software 6. Click Generate Patch List. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
QuickLoop is a separately licensed product. Note – For the Brocade Sun StorEdge SAN 4.0 Release phase, Sun StorEdge T3 and T3+ arrays do not need Quickloop, nor do host bus adapters. Sun StorEdge T3 and T3+ arrays will auto-configure as L_Ports and HBAs will auto-configure as F_Ports if the switch is in the fabric mode.
Fan #5 is OK, speed is 8820 RPM
Fan #6 is OK, speed is 8820 RPM
**********************************
Detected possible bad Power supply
Power Supply #1 is absent
**********************************
Power Supply #2 is OK
Close 172.20.67.167
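The fan and power supply lines in the output above follow a regular pattern, so a saved copy of the output can be summarized with a short script. This is a sketch for working with captured text only. The function name is hypothetical, and the patterns are derived from the lines shown above rather than from a documented output format.

```python
import re

# Patterns derived from the sample lines above; other output variants are not covered.
FAN_RE = re.compile(r"Fan #(\d+) is (\w+), speed is (\d+) RPM")
PSU_RE = re.compile(r"Power Supply #(\d+) is (\w+)")


def summarize_environment(output):
    """Return (fans, supplies, problems) parsed from captured test output."""
    fans = {int(n): (status, int(rpm)) for n, status, rpm in FAN_RE.findall(output)}
    supplies = {int(n): status for n, status in PSU_RE.findall(output)}
    problems = [f"fan {n}" for n, (status, _) in sorted(fans.items()) if status != "OK"]
    problems += [f"power supply {n}" for n, status in sorted(supplies.items()) if status != "OK"]
    return fans, supplies, problems


SAMPLE_ENV = """\
Fan #5 is OK, speed is 8820 RPM
Fan #6 is OK, speed is 8820 RPM
**********************************
Detected possible bad Power supply
Power Supply #1 is absent
**********************************
Power Supply #2 is OK
"""

fans, supplies, problems = summarize_environment(SAMPLE_ENV)
print(problems)
```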
Other Diagnostic Tools Brocade Silkworm switches also support a wide range of CLI tests that can be invoked while connected directly to the switch via a serial connection to the Silkworm 2400, by opening a telnet session, or by way of the front panel of the Silkworm 2800.
Made on: Tue Jan 15 15:10:28 PST 2002 Flash: Tue Jan 15 15:12:04 PST 2002 BootProm: Thu Jun 17 15:20:39 PDT 1999 Centigrade Fahrenheit Power Supply #1 is absent Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
CODE EXAMPLE B-2 switchshow Example Output
diag167:admin> switchshow
switchName: diag167
switchType:
switchState: Online
switchMode: Native
switchRole: Subordinate
switchDomain:
switchId: fffc01
switchWwn: 10:00:00:60:69:20:1e:fc
switchBeacon:
Zoning: ON (Main)
port 0: sw Online E-Port 10:00:00:60:69:10:71:25 "diag164" (upstream)
port 1: -- No_Module
port 2: sw Online F-Port 21:01:00:e0:8b:23:61:f9...
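When switchshow output has been captured to a file, the per-port lines can be tabulated with a small parser. The sketch below is based only on the three port lines shown above. The field names are hypothetical, and "media" is an assumed label for the column that shows sw or --.

```python
import re

# Matches lines such as: port 0: sw Online E-Port 10:00:00:60:69:10:71:25
PORT_RE = re.compile(
    r"^port\s+(\d+):\s+(\S+)\s+(\S+)"
    r"(?:\s+(\S+-Port)\s+((?:[0-9a-f]{2}:){7}[0-9a-f]{2}))?"
)


def parse_ports(output):
    """Return one dict per port line found in captured switchshow output."""
    ports = []
    for line in output.splitlines():
        match = PORT_RE.match(line.strip())
        if match:
            number, media, state, port_type, wwn = match.groups()
            ports.append({
                "port": int(number),
                "media": media,      # assumed label for the sw / -- column
                "state": state,
                "type": port_type,   # None when no port type is listed
                "wwn": wwn,          # None when no WWN is listed
            })
    return ports


SAMPLE_SWITCHSHOW = """\
port 0: sw Online E-Port 10:00:00:60:69:10:71:25 "diag164" (upstream)
port 1: -- No_Module
port 2: sw Online F-Port 21:01:00:e0:8b:23:61:f9
"""

for entry in parse_ports(SAMPLE_SWITCHSHOW):
    print(entry["port"], entry["state"], entry["type"])
```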
The "1000" is the number of passes, the "1" denotes singlePortAlso mode, which allows the test to be run on a single port with a loopback connector plug inserted Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
CODE EXAMPLE B-6 loopPortTest Example Output
diag164:admin> loopporttest 100,2,0x7e7e7e7e,4
Configuring L-port 2 to Cable Loopback Port..done.
Will use pattern: 7e7e7e7e 7e7e7e7e 7e7e7e7e 7e7e7e7e
Running Loop Port Test ..passed.
Configuring Loopback L-port(s) back to normal L-port(s)..done.
Note – Notes on loopPortTest
Syntax is loopporttest <num_passes>,<port>,<user_pattern>,<pattern_width>...
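The comma-separated syntax quoted above is easy to mistype at the switch prompt, so a helper that assembles the command string can be useful in notes or scripts. This is a sketch: the function name is hypothetical, the validation rules are assumptions, and the function only builds the text without contacting a switch.

```python
def loopporttest_command(num_passes, port, user_pattern, pattern_width):
    """Build a loopporttest command line from its four arguments."""
    if num_passes < 1:
        raise ValueError("num_passes must be at least 1")
    if port < 0:
        raise ValueError("port must not be negative")
    # The pattern is written in hexadecimal, as in the example output above.
    return f"loopporttest {num_passes},{port},{user_pattern:#x},{pattern_width}"


print(loopporttest_command(100, 2, 0x7E7E7E7E, 4))
```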
T300 0118] Fabric Port Name: 20:0e:00:60:69:10:71:25 The Local Name Server has 2 entries } Note – nsShow is a listing of WWNs of the devices connected to the switch. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Sun StorEdge and Brocade Communications Systems Port Descriptions and Differences Sun StorEdge and Brocade Communications Systems Port Descriptions TABLE B-5 Port Nomenclature Function E_Port Expansion or inter-switch port. A type of switch port that can be connected to an E_Port of another switch to, in effect, create a cascading interswitch link (ISL).
Note – The Java Plug-in that is supplied with Solaris 8 02/02 (Update 7) is required.
To verify the Web license, type the following:
admin> licenseshow
SeRdQeQSbzTfSqSY:
Web license
Zoning license
Quickloop license
Brocade Webtools GUI FIGURE B-1 See the Brocade Web Tools User’s Guide for more information on WebTools usage. Note – The rest of this guide will assume telnet usage. Appendix B Brocade Communications Systems Switch Troubleshooting...
This indicates that the switch failed one of the initial stages of POST and that the CPU is not able to bring up the operating system. Should this occur, replace the switch. Sun StorEdge SAN 4.0 Release Field Troubleshooting Guide • October 2002...
Removing Power
Caution – Error messages are stored in RAM and are lost when power is removed from the switch. Capture and view the error log output and note any error messages before removing power.
Status and Activity Indicators
Front Panel LED Port Indicators
Front Panel LEDs    Definition
No light showing...
Page 118
9. Routing table construction: after addresses are assigned, the unicast routing tables are constructed.
10. Enable normal port operation.
Note – If any of the steps listed above fails, replace the entire switch as a single FRU.
3. Check Array Status.
• Open a telnet session to the Sun StorEdge T3 array
• Refer to the luxadm display output for Sun StorEdge A5200 arrays
• Raid Manager Healthcheck for the Sun StorEdge A3500FC arrays
• Storage Automated Diagnostic Environment version 2.1 instrumentation reports...
Page 120
6. Verify the fix.
• /var/adm/messages (path online, multipath informational messages)
• Storage Automated Diagnostic Environment version 2.1 status
• Sun StorEdge Traffic Manager or VxDMP, to return the path to its normal state
Configuration
• Sun Fire V880
• Solaris 8 02/02 (Update 7) with all recommended and latest Sun StorEdge Network Foundation Software patches
• Sun StorEdge T3 array Partner Pair with FW 1.18
• Brocade Silkworm 2400 and 2800 switches with v2.6.0 firmware
• Storage Automated Diagnostic Environment version 2.1 with the latest patches...
FIGURE B-2 Storage Automated Diagnostic Environment Version 2.1 Topology
In FIGURE B-2, a Sun StorEdge T3 array enterprise configuration is connected to a cascaded switch. In another possible configuration, two separate switches can be used to eliminate a single point of failure.
Page 123
1. Discover the Error using Storage Automated Diagnostic Environment Alerts as shown in FIGURE B-3
Site : Lab Broom
Source : diag229.central.sun.com
Severity : Error (Actionable)
Category : BROCADE
DeviceId : brocade:1000006069201efc
EventType: StateChangeEvent.M.port.2
EventCode: 5.26.35
EventTime: 2002/07/11 10:32:33
’port.2’ in BROCADE br-67-167 (ip=172.20.67.167) is now Not-Available (state changed from ’online’...
Page 124
Serial Num: Unsupported
Unformatted capacity: 241724.000 MBytes
Write Cache: Enabled
Read Cache: Enabled
Minimum prefetch:
Maximum prefetch:
Device Type: Disk device
Path(s):
/dev/rdsk/c7t2B00006022004188d0s2
/devices/sbus@8,0/SUNW,qlc@1,30000/fp@0,0/ssd@w2b00006022004188,0:c,raw
----------------------------------------------------------------------
[continued on next page]
/devices/sbus@8,0/SUNW,qlc@1,30000/fp@0,0/ssd@w2b00006022004188,0:c,raw
----------------------------------------------------------------------
FIGURE B-3 Storage Automated Diagnostic Environment Alert
This alert shows:
• An error on port two of switch 172.20.67.167 occurred
• A Sun StorEdge Traffic Manager offline event occurred
• The HBA is offline
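When many alerts arrive by email, the header fields of an alert like the one in FIGURE B-3 can be pulled into a dictionary for filtering. The following Python sketch is an illustration only: it assumes each header field sits on its own line in "Name : Value" form, and the function names parse_alert_header and affected_port are hypothetical, not part of the Storage Automated Diagnostic Environment.

```python
# Header fields copied from the alert in FIGURE B-3, one field per line
# (the per-line layout is an assumption about the original alert text).
ALERT_HEADER = """\
Site     : Lab Broom
Source   : diag229.central.sun.com
Severity : Error (Actionable)
Category : BROCADE
DeviceId : brocade:1000006069201efc
EventType: StateChangeEvent.M.port.2
EventCode: 5.26.35
EventTime: 2002/07/11 10:32:33
"""


def parse_alert_header(text: str) -> dict:
    """Return the 'Name : Value' header fields of an alert as a dict."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only: values such as the DeviceId and
        # the EventTime contain colons of their own.
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


def affected_port(fields: dict):
    """Return the port number from an EventType ending in '.port.<n>', else None."""
    event_type = fields.get("EventType", "")
    marker = ".port."
    if marker not in event_type:
        return None
    tail = event_type.rsplit(marker, 1)[1]
    return int(tail) if tail.isdigit() else None


fields = parse_alert_header(ALERT_HEADER)
print(fields["Category"], fields["Severity"], "port", affected_port(fields))
```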
Page 126
The device on c3 has disappeared. In addition, the luxadm output of the Sun StorEdge T3 arrays shows the following.
# luxadm display /dev/rdsk/c6t60020F2000003EE53AAF7A09000DA257d0s2
/: luxadm display 50020f23000068cc
Error: Invalid pathname (50020f23000068cc)
Page 127
Ports highlighted in red are circled. From the topology, notice that the HBA and port 2 of the first switch have errors.
Note – From this Topology view, concentrate on the link between the HBA and switch port 2.
Page 128
7: sw Online F-Port 21:00:00:e0:8b:03:61:f9
This switchshow output from the first switch confirms that port 2 has gone offline. No other ports seem to be affected at this point.
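Scanning switchshow output for ports that are not online can be scripted once the output has been captured to a file. The Python sketch below is a hedged illustration: the port 7 line is taken from the output above, while the port 2 line and its No_Light state are a hypothetical sample, and the regular expression assumes a "<port>: <media> <state> [<type> <WWN>]" layout with an optional leading "port" keyword.

```python
import re

# Assumed line layout: "<port>: <media> <state> [<type> <WWN>]".
PORT_LINE = re.compile(
    r"^\s*(?:port\s+)?(?P<port>\d+):\s+(?P<media>\S+)\s+(?P<state>\S+)"
    r"(?:\s+(?P<type>\S+)\s+(?P<wwn>(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}))?"
)

# The port 2 line is hypothetical; the port 7 line comes from the example.
SAMPLE = """\
2: sw No_Light
7: sw Online F-Port 21:00:00:e0:8b:03:61:f9
"""


def parse_switchshow_ports(text: str) -> list:
    """Return one dict per port line found in captured switchshow output."""
    ports = []
    for line in text.splitlines():
        match = PORT_LINE.match(line)
        if match:
            ports.append({
                "port": int(match.group("port")),
                "media": match.group("media"),
                "state": match.group("state"),
                "type": match.group("type"),   # None when no device is logged in
                "wwn": match.group("wwn"),     # None when no device is logged in
            })
    return ports


def ports_not_online(text: str) -> list:
    """Return the port numbers whose state is anything other than Online."""
    return [p["port"] for p in parse_switchshow_ports(text)
            if p["state"] != "Online"]


print(ports_not_online(SAMPLE))
```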
Page 129
5. Use the Link Test to check the FRUs.
In the Switch-to-HBA link there are potentially four FRUs:
• Cable
• Switch SFP
• Switch chassis
Note – Before starting the Link Test, you must enter the password for the Brocade switch in the configuration menu.
a.
The Link Test starts by running the HBA Test. In this example, the HBA Test fails. The Link Test then prompts you to insert a loopback cable into the HBA. See FIGURE B-5.
The Link Test then runs the HBA Test again. This time the HBA Test succeeds and you are requested to reconnect the loopback cable into the HBA, as shown in FIGURE B-6.
FIGURE B-6 Test Result Details Showing a Successful Test
The Link Test now runs the Switch Port Test. In this example, the Switch Port Test passes. The Link Test then requests you to insert a new fiber cable between the HBA and the Brocade switch port, as shown in FIGURE B-7.
FIGURE B-7 Continued Link Test Example Results...
The Link Test then reruns the HBA Test. This time the HBA Test passes and the Link Test indicates that the fiber cable is the suspected failure cause.
FIGURE B-8 Continued Link Test Example Results
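The sequence of tests in this example amounts to a small decision procedure. The Python sketch below restates it for clarity; it is not the Link Test implementation. Only the path that ends in "fiber cable" is taken from the example above. The other return values are assumptions about what each failure would suggest, and the function name suspected_fru is hypothetical.

```python
def suspected_fru(hba_test_ok: bool,
                  hba_loopback_test_ok: bool,
                  switch_port_test_ok: bool,
                  hba_test_new_cable_ok: bool) -> str:
    """Walk the test results in the order the Link Test runs them."""
    if hba_test_ok:
        # Assumption: a passing first HBA Test means the link is healthy.
        return "no fault found"
    if not hba_loopback_test_ok:
        # Assumption: the HBA fails even with a loopback cable inserted.
        return "HBA"
    if not switch_port_test_ok:
        # Assumption: the switch side of the link is at fault.
        return "switch SFP or switch chassis"
    if hba_test_new_cable_ok:
        # From the example: replacing the cable clears the failure.
        return "fiber cable"
    # Assumption: every isolated test passed but the link still fails.
    return "undetermined"


# Results from the worked example: the first HBA Test fails, the loopback
# HBA Test and the Switch Port Test pass, and the HBA Test passes again
# once a new fiber cable is installed.
print(suspected_fru(hba_test_ok=False,
                    hba_loopback_test_ok=True,
                    switch_port_test_ok=True,
                    hba_test_new_cable_ok=True))
```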
Page 135
6. Verify the fix.
a. Check the cfgadm output to see if the device appears back in the fabric.
CODE EXAMPLE B-10 cfgadm Output
# cfgadm -al
fc-fabric connected configured unknown
c3::50020f23000068cc disk connected configured unusable
fc-private connected unconfigured unknown
fc-fabric connected configured...
Page 136
As a final check, look at the Storage Automated Diagnostic Environment version 2.1 topology. The ports that were in error are now green and the [mpx] error is green as well, as shown in FIGURE B-9.
FIGURE B-9 Storage Automated Diagnostic Environment Version 2.1: Test from Topology Window
Page 138
APPENDIX C
Brocade Communications Systems Error Messages
This appendix explains the error message format and possible errors and contains the following topics:
• “Error Message Formats” on page 122
• “Diagnostic Error Message Formats” on page 123...
(up to 999), and the error number is not incremented (that is, this error, though it may occur 999 times, occupies one message in the 32-message buffer).
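The behavior described here, a 32-message buffer in which a repeated error is counted rather than stored again, can be modeled in a few lines. The Python sketch below is a toy model under stated assumptions, not the switch firmware: it assumes that only consecutive identical messages are coalesced and that the oldest entry is dropped when the buffer is full. The class name ErrorLog is hypothetical; the message names come from TABLE C-2.

```python
from collections import deque

MAX_MESSAGES = 32   # buffer size stated in the text
MAX_REPEAT = 999    # repeat count ceiling stated in the text


class ErrorLog:
    """Toy model of a bounded error log that counts repeated errors."""

    def __init__(self):
        # Assumption: when the buffer is full, the oldest entry is dropped.
        self._entries = deque(maxlen=MAX_MESSAGES)
        self._next_number = 1

    def record(self, message: str) -> None:
        # Assumption: only a message identical to the most recent entry
        # is coalesced; it bumps the count without using a new slot or
        # a new error number.
        if self._entries and self._entries[-1]["message"] == message:
            last = self._entries[-1]
            last["count"] = min(last["count"] + 1, MAX_REPEAT)
            return
        self._entries.append(
            {"number": self._next_number, "message": message, "count": 1})
        self._next_number += 1

    def entries(self) -> list:
        return list(self._entries)


log = ErrorLog()
for _ in range(1500):
    log.record("DIAG-PORTDIED")   # repeated error: one slot, count capped
log.record("DIAG-INIT")           # different error: next error number
for entry in log.entries():
    print(entry)
```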
To Display Error Messages from the Front Panel
1. From the front panel, select the Status menu.
2. Select Error Log.
3. Scroll through the error log. If no errors are encountered, the panel displays No Error.
Diagnostic Error Message Formats
If any port fails during a diagnostic test, it is marked BAD in the status display.
Page 145
TABLE C-2 Error Message Codes Defined (Continued)
Error Number    Test Name        Error Name
3040            crossPortTest    DIAG-ERRSTAT(ENCIN)
3041                             DIAG-ERRSTAT(CTL)
3042                             DIAG-ERRSTAT(TRUNC)
3043                             DIAG-ERRSTAT(2LONG)
3044                             DIAG-ERRSTAT(BADEOF)
3045                             DIAG-ERRSTAT(ENCOUT)
3046                             DIAG-ERRSTAT(BADORD)
3047                             DIAG-ERRSTAT(DISC3)
304F                             DIAG-INIT
305F                             DIAG-PORTDIED
3060                             DIAG-STATS(FTX)
3061                             DIAG-STATS(FRX)
3062                             DIAG-STATS(C3FRX)
306E                             DIAG-DATA
306F...
Page 146
• Switch not disabled
[camTest] failure
• Diagnostic queue absent
• Malloc failed
• Chip is not present
• Port is not in loopback mode
• Port is not active
Page 147
TABLE C-3 Diagnostic Error Messages (Continued)
Columns: Message, Description, Probable Cause, Action

DIAG-CAMSID, Err#223C [camTest]
  Description: ASIC failed SID NO translation test
  Probable Cause: ASIC failure
  Action: Replace mainboard assembly

DIAG-CLEAR_ERR, Err#0001
  Description: Port’s diag error flag (OK or BAD) is cleared
  Probable Cause: Information only
  Action: None required

DIAG-CMBISRF ASIC’s Central Memory ASIC failure...
Page 148
Err#102C [centralMemoryTest]

DIAG-LCMEM, Err#1027 [centralMemoryTest, cmemRetentionTest]
  Description: Data read from the Central Memory location did not match data previously written into the same location
  Probable Cause: ASIC failure
  Action: Replace mainboard assembly
Page 149
TABLE C-3 Diagnostic Error Messages (Continued)
Columns: Message, Description, Probable Cause, Action

DIAG-LCMEMTX, Err#1F27, 1028 [centralMemoryTest]
  Description: Central Memory transmit path failure: ASIC 1 failed to read ASIC 2 via the transmit path
  Probable Cause: Mainboard failure
  Action: Replace mainboard assembly

DIAG-LCMRS Central Memory Read Short: ASIC failure Replace mainboard M bytes requested but got...
Page 150
Err#2271, 2671, 3071, 3871 [portLoopbackTest, crossPortTest, spinSilk, camTest]

CONFIG CORRUPT
  Description: The switch configuration information has become irrevocably corrupted.
  Probable Cause: OS error
  Action: The system automatically resorts to the default configuration settings.
Page 151
TABLE C-3 Diagnostic Error Messages (Continued)
Columns: Message, Description, Probable Cause, Action

CONFIG OVERFLOW
  Description: The switch configuration information has grown too large to be saved or has an invalid size.
  Probable Cause: OS error
  Action: Contact customer support

CONFIG VERSION The switch has encountered OS error The system an unrecognized version of...
Page 152
OS error Contact customer support

SEMA, SEMFLUSH, L, M
  Description: Unable to flush a semaphore
  Probable Cause: OS error
  Action: Contact customer support

PANIC, TASKSPAWN, LOG_PANIC
  Description: Task creation failed
  Probable Cause: OS error
  Action: Contact customer support
Page 153
TABLE C-3 Diagnostic Error Messages (Continued)
Columns: Message, Description, Probable Cause, Action

PANIC, SEMCREATE, LOG_PANIC
  Description: Semaphore creation failed
  Probable Cause: OS error
  Action: Contact customer support

PANIC, SEMDELETE, LOG_PANIC
  Description: Semaphore
  Probable Cause: OS error
  Action: Contact customer support

PANIC, QCREATE, LOG_PANIC
  Description: Message queuer failed
  Probable Cause: OS error
  Action: Contact customer support

PANIC, QDELETE,...
Page 154
FSPF, NBRCHANGE, LOG_WARNING
  Description: Wrong neighbor ID in Hello message from port
  Probable Cause: OS error
  Action: Contact customer support

FSPF, INPORT, LOG_ERROR
  Description: Input port out of range
  Probable Cause: OS error
  Action: Contact customer support
TABLE C-3 Diagnostic Error Messages (Continued)
Columns: Message, Description, Probable Cause, Action

FSPF, VERSION, LOG_ERROR
  Description: FSPF version not supported
  Probable Cause: OS error
  Action: Contact customer support

FSPF, SECTION, LOG_ERROR
  Description: Wrong section ID
  Probable Cause: OS error
  Action: Contact customer support

FSPF, REMDOMAIN, Remote Domain ID out of OS error Contact customer LOG_ERROR...
Page 156
APPENDIX D
Converting Sun FC Switches Fibre Channel Addresses
This appendix explains how the Sun FC switch encodes Fibre Channel addresses.
Note – This information only applies to the Sun FC switches.
This appendix contains the following topics:
• “Converting a Fabric Address into Fabric ID, Chassis ID, ASIC, Port, and AL_PA”...
Or, you may see a luxadm -e dump_map output like the following:
# luxadm -e dump_map /devices/pci@8,700000/pci@3/SUNW,qlc@4/fp@0,0:devctl
Port_ID Hard_Addr Port WWN Node WWN Type
1084e4 1000e4 50020f2300009697 50020f2000009697 (Disk device)
108000 210100e08b2366f9 200100e08b2366f9 0x1f (Unknown Type,Host Bus Adapter)
The AL_PA is zero if the device is a full fabric device; otherwise, it is the AL_PA of the loop device. Sun StorEdge Network Fibre Channel Switches have 2 or 4 ASICs (2 on the 8-port switch, 4 on the 16-port switch). These ASICs are numbered from 0 to 3.
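The AL_PA rule can be checked against the Port_ID values in the dump_map output. The Python sketch below covers only that one field and rests on the standard Fibre Channel convention that the AL_PA is the low-order byte of the 24-bit address. It does not decode the Fabric ID, Chassis ID, ASIC, or Port fields, because their bit positions are not reproduced in this section. The function names al_pa and is_full_fabric_device are hypothetical.

```python
def al_pa(port_id_hex: str) -> int:
    """Return the AL_PA, taken as the low-order byte of a 24-bit Port_ID."""
    return int(port_id_hex, 16) & 0xFF


def is_full_fabric_device(port_id_hex: str) -> bool:
    """A zero AL_PA indicates a full fabric device, per the rule above."""
    return al_pa(port_id_hex) == 0


# Port_ID values taken from the luxadm -e dump_map example.
for port_id in ("1084e4", "108000"):
    kind = "full fabric device" if is_full_fabric_device(port_id) else "loop device"
    print(f"{port_id}: AL_PA=0x{al_pa(port_id):02x} ({kind})")
```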
Page 160
Knowing this information, you can easily determine where this device is located in the SAN. See TABLE D-1.
TABLE D-1 ASIC and Port Values
Switch Port    ASIC ID    Port ID
Page 161
TABLE D-1 ASIC and Port Values (Continued)
Page 162
Page 163
SNDR       Sun StorEdge Network Data Replicator (formerly “Sun StorEdge Remote Dual Copy”)
TL_Port    A Translated Loop Port on the Sun StorEdge T3 array that enables private devices to communicate with fabric or public devices
U_Port     A universal port that can operate as an E_Port, F_Port, or FL_Port.
...
Page 164
There are several types of zones and a port may be defined in any. No port can be in all zone types simultaneously.
Page 165
Index typographic, xii count adapters, 6 cascade limit, 3 arrays hop limit, 3 configuration guidelines, 22 ISL limit, 3 ISL link limit, 3 long-wave transceiver limit, 3 maximum switches, 3 backward compatibility, 5 diagnostic tool T3Extractor, 40 cable, 4 disaster tolerant configuration, 3 LC-LC, 4 SC-LC, 4 document...
Page 166
4 AnswerBook, xi Solaris Handbook for Sun Peripherals, xi rules adding and removing devices, 23 cascading, 23 maximum switch count, 3 zoning, 22 mesh configuration, 3
Page 167
17 websites for additional information, 2 storage device attachment, 3 WWN-based zones, 4 storage devices supported, 6 StorEdge Traffic Manager tool, 5 Sun StorEdge T3+ arrays, 6 SunCluster 3.0, 5 zone supported configurations, 3 name server, 21 switch zones configuration guidelines, 22...
Page 168