Human Errors - Red Hat ENTERPRISE LINUX 4 - INTRODUCTION TO SYSTEM ADMINISTRATION Administration Manual

The point here is that your organization must determine at what point an extended outage will just
have to be tolerated. Or, if that is not an option, your organization must reconsider its ability to function
completely independently of utility power for extended periods, meaning that very large generators
will be needed to power the entire building.
Of course, even this level of planning cannot take place in a vacuum. It is very likely that whatever
caused the extended outage is also affecting the world outside your organization, and that the outside
world will start having an effect on your organization's ability to continue operations, even given
unlimited power generation capacity.
8.1.3.3. Heating, Ventilation, and Air Conditioning
The Heating, Ventilation, and Air Conditioning (HVAC) systems used in today's office buildings are
incredibly sophisticated. Often computer controlled, the HVAC system is vital to providing a
comfortable work environment.
Data centers usually have additional air handling equipment, primarily to remove the heat generated
by the many computers and associated equipment. Failures in an HVAC system can be devastating to
the continued operation of a data center. And given their complexity and electro-mechanical nature,
the possibilities for failure are many and varied. Here are a few examples:
The air handling units (essentially large fans driven by large electric motors) can fail due to
electrical overload, bearing failure, belt/pulley failure, etc.
The cooling units (often called chillers) can lose their refrigerant due to leaks, or they can have their
compressors and/or motors seize.
HVAC repair and maintenance is a very specialized field, one that the average system administrator
should leave to the experts. If anything, a system administrator should make sure that the
HVAC equipment serving the data center is checked for normal operation on a daily basis (if not more
frequently) and is maintained according to the manufacturer's guidelines.
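As an illustration of what such a daily check might look like, the following is a minimal sketch in Python. It assumes that someone records one data center temperature reading per day; the Reading type, the check_readings function, and both threshold values are hypothetical and are not part of any Red Hat tool. Real limits should come from the equipment manufacturer's guidelines.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only; take real values from the
# equipment manufacturer's guidelines.
MAX_TEMP_C = 27.0   # highest acceptable room temperature
MAX_RISE_C = 3.0    # largest acceptable rise since the previous reading


@dataclass
class Reading:
    day: str        # label for the reading, such as a weekday or a date
    temp_c: float   # recorded temperature in degrees Celsius


def check_readings(readings, max_temp_c=MAX_TEMP_C, max_rise_c=MAX_RISE_C):
    """Return a list of warning strings for a sequence of daily readings."""
    warnings = []
    previous = None
    for r in readings:
        # Flag any reading above the absolute limit.
        if r.temp_c > max_temp_c:
            warnings.append(
                f"{r.day}: {r.temp_c:.1f} C exceeds limit of {max_temp_c:.1f} C"
            )
        # Flag a sharp rise, which may hint at a failing chiller or fan.
        if previous is not None and r.temp_c - previous.temp_c > max_rise_c:
            rise = r.temp_c - previous.temp_c
            warnings.append(f"{r.day}: rose {rise:.1f} C since {previous.day}")
        previous = r
    return warnings


if __name__ == "__main__":
    log = [
        Reading("Mon", 21.5),
        Reading("Tue", 22.0),
        Reading("Wed", 26.0),
        Reading("Thu", 28.5),
    ]
    for line in check_readings(log):
        print(line)
```

A check like this does not replace the HVAC experts. It only helps the system administrator notice early that something is drifting away from normal operation, so that the experts can be called in before the data center overheats.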
8.1.3.4. Weather and the Outside World
There are some types of weather that can cause problems for a system administrator:
Heavy snow and ice can prevent personnel from getting to the data center, and can even clog air
conditioning condensers, resulting in elevated data center temperatures just when no one is able to
get to the data center to take corrective action.
High winds can disrupt power and communications, with extremely high winds actually doing
damage to the building itself.
There are other types of weather that can still cause problems, even if they are not as well known. For
example, exceedingly high temperatures can result in overburdened cooling systems, and brownouts
or blackouts as the local power grid becomes overloaded.
Although there is little that can be done about the weather, knowing the way that it can affect your
data center operations can help you to keep things running even when the weather turns bad.

8.1.4. Human Errors

It has been said that computers really are perfect. The reasoning behind this statement is that if you
dig deeply enough, behind every computer error you will find the human error that caused it. In this
section, the more common types of human errors and their impacts are explored.
Chapter 8. Planning for Disaster
