Sun Microsystems Sun StorEdge Availability Suite 3.2 Troubleshooting Manual

Sun StorEdge
Availability Suite 3.2 Software

Troubleshooting Guide

Sun Microsystems, Inc.
www.sun.com
Part No. 817-3752-10
December 2003, Revision 51
Submit comments about this document at: http://www.sun.com/hwdocs/feedback

Summary of Contents for Sun Microsystems Sun StorEdge Availability Suite 3.2

  • Page 1: Troubleshooting Guide

    Sun StorEdge ™ Availability Suite 3.2 Software Troubleshooting Guide Sun Microsystems, Inc. www.sun.com Part No. 817-3752-10 December 2003, Revision 51 Submit comments about this document at: http://www.sun.com/hwdocs/feedback...
  • Page 2 Copyright © 2003 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, USA. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to the technology that is described in this document. In particular, and without limitation, these property rights...
  • Page 3: Table Of Contents

    Contents
    Preface v
    Point-in-Time Copy Software Troubleshooting Tips 1
    Troubleshooting Checklist 1
    Checking Log Files 2
    Improving Performance 2
    Safeguarding the VTOC Information 3
    Remote Mirror Software Troubleshooting Tips 5
    Troubleshooting Checklist 6
    Troubleshooting Log Files and Services 6
    Checking Log Files 7
    Checking the /etc/nsswitch.conf File 8
    Checking That the rdc Service Is Running 8
    If the /dev/rdc Link Is Not Created 9...
  • Page 4
    Correcting Common User Errors 13
    Enabled Software on Only One Host 13
    Volumes Are Inaccessible 13
    Wrong Volume Set Name Specified 14
    Accommodating Memory Requirements 16
    Error Messages 19
    Sun StorEdge Availability Suite 3.2 Software Troubleshooting Guide • December 2003...
  • Page 5: Preface

    Preface Sun StorEdge Availability Suite 3.2 Software Troubleshooting Guide helps users solve common problems that might arise when using the Sun StorEdge™ Availability Suite 3.2 software. Before You Read This Book To use the information in this document, you must have thorough knowledge of the topics discussed in these books: Sun StorEdge Availability Suite 3.2 Point-in-Time Copy Software Administration and...
  • Page 6: How This Book Is Organized

    See the following for this information:
    Software documentation that you received with your system
    Solaris™ operating environment documentation, which is at http://docs.sun.com
  • Page 7: Typographic Conventions

    Shell Prompts
    Shell: Prompt
    C shell: machine-name%
    C shell superuser: machine-name#
    Bourne shell and Korn shell: $
    Bourne shell and Korn shell superuser: #
    Typographic Conventions
    Typeface: AaBbCc123. Meaning: The names of commands, files, and directories; on-screen computer output. Examples: Edit your .login file. Use ls -a to list all files. % You have mail.
  • Page 8: Related Documentation

    Sun StorEdge Availability Suite 3.2 Point-In-Time Copy Software Administration and Operations Guide (817-2781)
    Cluster: Sun Cluster 3.0 and Sun StorEdge Software Integration Guide (816-5127)
    Configuration: Sun Enterprise 10000 InterDomain Network Configuration Guide (806-5230)
  • Page 9 Sun is interested in improving its documentation and welcomes your comments and suggestions. You can submit your comments by going to: http://www.sun.com/hwdocs/feedback Please include the title and part number of your document with your feedback: Sun StorEdge Availability Suite 3.2 Software Troubleshooting Guide, part number 817-3752-10
  • Page 11: Point-In-Time Copy Software Troubleshooting Tips

    This table shows the troubleshooting checklist and related sections.
    TABLE 1-1 Troubleshooting Checklist
    Step 1. Check for installation errors. For instructions: Sun StorEdge Availability Suite 3.2 Software Installation Guide
    Step 2. Check that /dev/ii is created after reboot. For instructions: Sun StorEdge Availability Suite 3.2 Software Installation Guide
    Step 3.
  • Page 12: Checking Log Files

    The sv_threads value is in the /usr/drv/conf/sv.conf file. Because the file is read when a module loads, changes to the sv_threads value do not take effect until you reboot the system.
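    Before planning a reboot it can help to confirm what value the file currently holds. The sketch below is only an illustration: the function name sv_threads_value is hypothetical, and it assumes the setting appears as sv_threads=<number> on a line that is not commented out with #. Check the actual file layout on your system before relying on it.

```shell
#!/bin/sh
# Print the sv_threads value found in a driver configuration file.
# Assumption: the property is written as "sv_threads=<number>" and
# lines beginning with "#" are comments. The default path is the one
# quoted in the text.
sv_threads_value() {
    conf=${1:-/usr/drv/conf/sv.conf}
    # Drop comment lines, then extract the digits after "sv_threads=".
    grep -v '^[[:space:]]*#' "$conf" |
        sed -n 's/.*sv_threads[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

    Calling sv_threads_value with no argument reads the default path; pass another path to inspect a copy of the file. An empty result means no uncommented sv_threads setting was found.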
  • Page 13: Safeguarding The Vtoc Information

    Safeguarding the VTOC Information
    Caution – When creating shadow volume sets, do not create shadow or bitmap volumes using partitions that include cylinder 0. Data loss might occur.
    The Solaris system administrator must be knowledgeable about the virtual table of contents (VTOC) that is created on raw devices by the Solaris operating system.
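    One way to spot candidate partitions to avoid is to scan a partition map for slices that start at sector 0, since such a slice begins at cylinder 0. The sketch below assumes input in the usual prtvtoc layout, where comment lines begin with * and the fourth column of each data row is the first sector. The function name slices_at_sector_zero is hypothetical.

```shell
#!/bin/sh
# Read a prtvtoc-style partition map on stdin and print the number of
# every slice whose first sector is 0.
# Assumption: data rows have at least six columns
# (partition, tag, flags, first sector, sector count, last sector).
slices_at_sector_zero() {
    awk '!/^\*/ && NF >= 6 && $4 == 0 { print $1 }'
}

# Example use (device name is a placeholder):
#   prtvtoc /dev/rdsk/c0t0d0s2 | slices_at_sector_zero
```

    Any slice number printed by this check is one to review before using that partition as a shadow or bitmap volume.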
  • Page 15: Remote Mirror Software Troubleshooting Tips

    “Checking the Integrity of the Link” on page 10 “Correcting Common User Errors” on page 13 Note – The Sun StorEdge Availability Suite 3.2 Remote Mirror Software Administration and Operations Guide describes the dsstat and scmadm commands. These commands are useful for displaying information about remote mirror and point-in-...
  • Page 16: Troubleshooting Checklist

    Troubleshooting Log Files and Services
    The remote mirror software is client-server software that is bidirectional. The primary and secondary hosts each act as a client and server in the protocol.
  • Page 17: Checking Log Files

    Checking Log Files
    Check the following files to troubleshoot problems:
    /var/opt/SUNWesm/ds.log
    The /var/opt/SUNWesm/ds.log file contains timestamped messages about the software. For example:
    Aug 20 19:13:55 scm: scmadm cache enable succeeded
    Aug 20 19:13:55 ii: iiboot resume cluster tag <none>
    Aug 20 19:13:58 sndr: sndrboot -r first.atm /dev/vx/rdsk/rootdg/vol5 /dev/vx/rdsk/rootdg/bm6 second.atm /dev/vx/rdsk/rootdg/vol7 /dev/vx/rdsk/rootdg/bm7 Successful...
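    Because each log entry carries a component tag (scm, ii, or sndr in the example) after the timestamp, the log can be narrowed to one component at a time. The sketch below assumes every entry follows the layout in the example, with the tag as the fourth whitespace-separated field. The function name ds_log_for is hypothetical.

```shell
#!/bin/sh
# Print only the ds.log entries written by one component.
# Assumption: entries look like "Mon DD HH:MM:SS tag: message",
# so the tag (with its trailing colon) is field 4.
ds_log_for() {
    component=$1
    log=${2:-/var/opt/SUNWesm/ds.log}
    awk -v tag="$component:" '$4 == tag' "$log"
}
```

    For example, ds_log_for sndr prints only the remote mirror entries from the default log file, and a second argument selects a different file.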
  • Page 18: Checking The /Etc/Nsswitch.conf File

    121/tcp # SNDR server daemon
    Use the rpcinfo and netstat commands to check the service:
    rpcinfo
    # rpcinfo -T tcp hostname 100143
    program 100143 version 6 ready and waiting
  • Page 19: If The /Dev/Rdc Link Is Not Created

    where:
    -T tcp specifies the transport that the service uses.
    hostname is the name of the machine where the service is running.
    If the service is not running, this message is displayed:
    rpcinfo: RPC: Program not registered
    If you see this message, it is possible that the /etc/nsswitch.conf services: entry is incorrectly configured.
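    The two outcomes described here can be told apart in a script by matching the text that rpcinfo prints. The sketch below only classifies output that is piped into it and relies on the two message strings quoted in this guide; the function name classify_rpcinfo and the result labels are hypothetical.

```shell
#!/bin/sh
# Classify the output of "rpcinfo -T tcp hostname 100143" read from stdin.
# Prints one of: running, not-registered, unknown.
classify_rpcinfo() {
    out=$(cat)
    case "$out" in
        *"ready and waiting"*)      echo "running" ;;
        *"Program not registered"*) echo "not-registered" ;;
        *)                          echo "unknown" ;;
    esac
}

# Example use (hostname is a placeholder; stderr is merged because the
# error message may be written there):
#   rpcinfo -T tcp hostname 100143 2>&1 | classify_rpcinfo
```

    A result of not-registered is the cue to review the services: entry in /etc/nsswitch.conf, as the text suggests.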
  • Page 20: Checking The Integrity Of The Link

    Use the snoop or atmsnoop commands to make sure the software is copying data.
    Note – The dsstat command displays volume information. The sndradm -H command displays link I/O statistics.
  • Page 21: Testing With Ifconfig

    Testing with ifconfig
    Use the ifconfig command to make sure that the network interface is configured and running correctly. This example output shows all the interfaces that are configured and running:
    # ifconfig -a
    ba0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 9180 index 1
            inet 192.9.201.10 netmask ffffff00 broadcast 192.9.201.255
            ether 8:0:20:af:8e:d0
    lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4>...
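    A simple consistency check on ifconfig output is to recompute the broadcast address from the inet and netmask fields and compare it with the value shown. The sketch below does the arithmetic for a dotted-quad address and a hexadecimal netmask as printed by ifconfig; the function name broadcast_addr is hypothetical. For inet 192.9.201.10 with netmask ffffff00 the result is 192.9.201.255.

```shell
#!/bin/sh
# Compute the broadcast address for an IPv4 address and a hexadecimal
# netmask (for example ffffff00), using shell integer arithmetic.
broadcast_addr() {
    ip=$1
    mask_hex=$2
    # Split the dotted quad into four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $ip
    IFS=$old_ifs
    ip_num=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask_num=$(( 0x$mask_hex ))
    # Set every host bit to 1 and keep the low 32 bits.
    bc=$(( (ip_num | ~mask_num) & 0xffffffff ))
    echo "$(( (bc >> 24) & 255 )).$(( (bc >> 16) & 255 )).$(( (bc >> 8) & 255 )).$(( bc & 255 ))"
}
```

    If the computed value differs from the broadcast field in the ifconfig output, recheck the interface configuration or the transcription of the output.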
  • Page 22
    RECEIVE : VC=32 TCP D=1011 S=121 Ack=2333980449 Seq=2878301022 Len=0 Win=36450
    _____________________________________________________________________________
    RECEIVE : VC=32 RPC R (#4) XID=1930565346 Success
    _____________________________________________________________________________
    TRANSMIT : VC=32 TCP D=121 S=1011 Ack=2878301054 Seq=2333980449 Len=0 Win=41076
  • Page 23: Correcting Common User Errors

    Correcting Common User Errors
    This section describes user errors encountered often when using the software.
    “Enabled Software on Only One Host” on page 13
    “Volumes Are Inaccessible” on page 13
    “Wrong Volume Set Name Specified” on page 14
    Enabled Software on Only One Host
    New users sometimes forget to issue the sndradm -e enable command on both the primary host and the secondary host.
  • Page 24: Wrong Volume Set Name Specified

    If you issue an sndradm command without specifying a volume set name, the software executes the command on all configured volume sets. Make sure that you specify the correct volume set on the command line.
  • Page 25 For example, this command updates the volume on the secondary host calamari from the primary host volume:
    # sndradm -un calamari:/dev/vx/rdsk/rootdg/tony1
    To correctly display the volume set name, use the sndradm -p command on the primary host. See “To Find the Volume Set Name” on page
    Using the dsstat Command Incorrectly
    An administrator might use the dsstat(1M) command instead of sndradm -p to find the volume set name.
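    Since an empty or malformed set name can make a command apply more widely than intended, a wrapper script might validate the name first. The sketch below assumes the host:/path form used in the example above; the function name split_set_name is hypothetical and the check does not confirm that the set is actually configured.

```shell
#!/bin/sh
# Validate a volume set name of the form host:/path/to/volume and
# print its two parts. Returns non-zero for an empty or malformed name.
split_set_name() {
    case "$1" in
        ?*:/?*) ;;
        *) echo "error: expected host:/path/to/volume, got '$1'" >&2
           return 1 ;;
    esac
    # Host is everything before the first colon; volume is the rest.
    echo "host=${1%%:*} volume=${1#*:}"
}
```

    A script could call split_set_name on its argument and stop if the call fails, so that the set name passed on to sndradm is never empty.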
  • Page 26: Accommodating Memory Requirements

    Accommodating Memory Requirements In releases prior to the Sun StorEdge Availability Suite 3.2 software, a single asynchronous thread was created for each group of volume sets on the primary host. Asynchronous I/O requests were placed on an in-memory queue and serviced by this single thread.
  • Page 27 The order of write operations must be maintained within a group. Therefore, these out of order requests must be stored in memory on the secondary host until the missing request comes in and completes. The secondary host can store up to the hard-coded limit of 64 requests per group. Exceeding 64 stored requests stalls the primary host from issuing any more requests.
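    The effect of out-of-order arrival can be explored with a small model. The sketch below is not the product's implementation: it replays a list of write sequence numbers (one per line, numbered from 1) in arrival order, holds each request until all earlier ones have arrived, and reports the largest number held at once against the limit of 64 quoted in the text. The function name max_held and the output format are hypothetical.

```shell
#!/bin/sh
# Model of in-order completion on the secondary host.
# Input: sequence numbers in arrival order, one per line, starting at 1.
# Output: the peak number of requests held while waiting for a missing
# earlier request, and whether that peak exceeds the limit.
LIMIT=64

max_held() {
    awk -v limit="$LIMIT" '
        BEGIN { next_seq = 1; held = 0; max = 0 }
        {
            pending[$1] = 1
            held++
            # Complete every request that is now in order.
            while (next_seq in pending) {
                delete pending[next_seq]
                held--
                next_seq++
            }
            if (held > max) max = held
        }
        END {
            printf "max_held=%d over_limit=%s\n", max, (max > limit ? "yes" : "no")
        }'
}
```

    In this model a result of over_limit=yes corresponds to the situation the text describes, where the primary host is stalled from issuing more requests.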
  • Page 29: Error Messages

    Availability Suite 3.2 Software Installation Guide. Solaris error messages related to the Sun StorEdge Availability Suite software are described in ... TABLE 3-1 lists Sun StorEdge Availability Suite 3.2 error messages in alphabetical order. The error messages come from the following sources: PITC: From the point-in-time copy software.
  • Page 30 TABLE 3-1 Error Messages for the Sun StorEdge Availability Suite 3.2 Software (Continued)
    Error message: Abort failed. From: PITC. Meaning: iiadm could not abort a copy or update operation on a set. Possible errors: EFAULT: The kernel module tried to read out-of-bounds....
  • Page 31 TABLE 3-1 (Continued)
    Error message: Bitmap reconfig failed %s:%s. From: Kernel. Meaning: A request to reconfigure the bitmap on the local host has failed. This can happen for two reasons: •...
  • Page 32 TABLE 3-1 (Continued)
    Error message: cannot find SNDR set <shost>:<svol> in config. Meaning: Remote mirror set cannot be found in the configuration database. The set is not configured. Check the entry for errors....
  • Page 33 TABLE 3-1 (Continued)
    Error message: Cannot enable %s:%s ==> %s:%s, secondary in use in another set. From: Kernel. Meaning: A set being enabled or resumed has a secondary volume that is already in use as a secondary volume for another remote mirror set....
  • Page 34 TABLE 3-1 (Continued)
    Error message: Copy failed. From: PITC. Meaning: A copy or update operation could not be initiated. Possible errors: EFAULT: The kernel module tried to read out-of-bounds. File a bug against iiadm....
  • Page 35 TABLE 3-1 (Continued)
    Error message: Create overflow failed. From: PITC. Meaning: An overflow volume couldn’t be initialized. Possible errors: EFAULT: The kernel module tried to read out-of-bounds. File a bug against iiadm....
  • Page 36 TABLE 3-1 (Continued)
    Error message: Disable pending on diskq %s, try again later. From: Kernel. Meaning: A request to disable the disk queue is already in progress. Verify that the previous request has completed successfully....
  • Page 37 TABLE 3-1 (Continued)
    Error message: disk queue volume <vol> must not match any primary SNDR volume or... Meaning: The disk queue volume specified for the reconfiguration operation is already in use by the...
  • Page 38 TABLE 3-1 (Continued)
    Error message: Enable failed. From: PITC. Meaning: Could not enable volume. Possible errors: EFAULT: The kernel module tried to read out-of-bounds. File a bug against iiadm....
  • Page 39 TABLE 3-1 (Continued)
    Error message: Failed to detach overflow volume. From: PITC. Meaning: iiadm had a problem detaching the overflow volume from a set. Possible errors: EFAULT: The kernel module tried to read out-of-bounds....
  • Page 40 TABLE 3-1 (Continued)
    Error message: hostname tag exceeds CFG_MAX_BUF. From: PITC. Meaning: Because CFG_MAX_BUF is 1k, this message is not expected to be reported.
    From: PITC. Meaning: Could not import shadow volume. Possible errors:...
  • Page 41 TABLE 3-1 (Continued)
    Error message: Join failed. From: PITC. Meaning: Could not join shadow volume back to the set. Possible errors: EFAULT: The kernel module tried to read out-of-bounds. File a bug against iiadm....
  • Page 42 TABLE 3-1 (Continued)
    Error message: Memory allocation failure. From: PITC. Meaning: iiadm ran out of memory.
    Error message: Must be super-user to execute. From: Kernel. Meaning: The user issued a remote mirror command but does not have superuser privileges....
  • Page 43 TABLE 3-1 (Continued)
    Error message: Overflow list access failure. From: PITC. Meaning: iiadm could not get a list of overflow volumes from the kernel. Possible errors: EFAULT: The kernel module tried to read out-of-bound....
  • Page 44 TABLE 3-1 (Continued)
    Error message: Reverse sync needed, cannot sync %s:%s ==> %s:%s. From: Kernel. Meaning: The user requested a forward sync operation for a remote mirror set which needs a reverse sync. This...
  • Page 45 TABLE 3-1 (Continued)
    Error message: Shadow group %s is suspended. From: PITC. Meaning: The user attempted to perform a copy or update operation on a group with one or more suspended sets....
  • Page 46 TABLE 3-1 (Continued)
    Error message: SNDR set does not have a disk queue. Meaning: Set does not have a disk queue attached when attempting either a queue remove operation or a queue replace operation....
  • Page 47 TABLE 3-1 (Continued)
    Error message: The volume %s is already in use. From: Kernel. Meaning: The data volume in the remote mirror set is already in use as a bitmap volume or a disk queue volume. Use a different data volume....
  • Page 48 TABLE 3-1 (Continued)
    Error message: unable to determine hostname: <host>. Meaning: Could not determine the host name of the system.
    Error message: unable to determine IP addresses for either host <phost>... Meaning: The IP address for either the primary host or the...
  • Page 49 TABLE 3-1 (Continued)
    Error message: unable to obtain unique set id for <shost>:<svol>. Meaning: Lookup of the set ID in the configuration database for this set has failed.
    Error message: Unable to open bitmap file <vol>... Meaning: The volume specified for the bitmap could not be...
  • Page 50 TABLE 3-1 (Continued)
    Error message: Update failed. From: PITC. Meaning: One or more volumes in a group copy or update command failed. Possible errors: EFAULT: The kernel module tried to read out-of-bounds....
  • Page 51 TABLE 3-1 (Continued)
    Error message: Volumes are not in same disk group. From: PITC. Meaning: iiadm detected that the master, shadow, and bitmap volumes are not all in the same cluster device group, as required by the point-in-time copy software....
  • Page 52 /etc/init.d/scm stop
    2. Issue the cfgadm command.
    3. Start I/O to the sets, using the following series of commands:
    /etc/init.d/scm start
    /etc/init.d/sv start
    /etc/init.d/ii start
    /etc/init.d/rdc start
    /etc/init.d/rdcfinish start
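    The start sequence in step 3 is order-sensitive, so it can be convenient to keep it in one small script. The sketch below is an illustration only: the function name start_services and the DRY_RUN variable are hypothetical, and by default the script only prints the commands in order so they can be reviewed. Setting DRY_RUN=0 runs them and stops at the first failure.

```shell
#!/bin/sh
# Print (or run) the service start commands in the order listed in the text.
# DRY_RUN defaults to 1, which prints each command instead of running it.
start_services() {
    for svc in scm sv ii rdc rdcfinish; do
        cmd="/etc/init.d/$svc start"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$cmd"
        else
            # Stop at the first script that reports failure.
            $cmd || { echo "failed: $cmd" >&2; return 1; }
        fi
    done
}
```

    Stopping at the first failure is a design choice for this sketch, made so that later services are not started on top of one that did not come up.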