Memory Protection - IBM BladeCenter PS703 Technical Overview And Introduction

If inactive processor cores or capacity-on-demand (CoD) processor cores are available, the system puts a CoD processor into operation after it determines that an activated processor is no longer operational. In this way, the server retains its total processor capacity.

If no CoD processor cores are available, system-wide total processor capacity falls below the licensed number of cores.
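The sparing decision described above can be sketched as a small state update. This is only an illustrative model: the `CorePool` class, its field names, and `handle_core_failure` are hypothetical and do not correspond to any IBM firmware interface.

```python
from dataclasses import dataclass


@dataclass
class CorePool:
    """Hypothetical model of a server's processor cores (illustrative names)."""
    licensed: int    # number of cores the system is licensed to run
    active: int      # cores currently in operation
    spare_cod: int   # inactive or CoD cores available as replacements


def handle_core_failure(pool: CorePool) -> CorePool:
    """Return the pool state after one active core is found non-operational."""
    active = pool.active - 1   # the failed core is taken out of service
    spare = pool.spare_cod
    if spare > 0:
        # A spare (CoD) core is put into operation, restoring total capacity.
        active += 1
        spare -= 1
    # With no spare available, capacity stays below the licensed core count.
    return CorePool(pool.licensed, active, spare)
```

With one spare core, a first failure leaves the active count unchanged; a second failure with no spares left reduces it below the licensed count.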

Single processor checkstop

As in POWER6, POWER7 provides single processor checkstopping for certain processor logic, command, or control errors that cannot be handled by the availability enhancements mentioned previously.

This reduces the probability of any one processor affecting total system availability by containing most processor checkstops to the partition that was using the processor at the time the full checkstop goes into effect.

Even with all these availability enhancements to prevent processor errors from affecting system-wide availability, errors might still result in a system-wide outage.

4.3.3 Memory protection

A memory protection architecture that provides good error resilience for a relatively small L1 cache might be inadequate for protecting the much larger system main store. Therefore, a variety of protection methods are used in POWER processor-based systems to avoid uncorrectable errors in memory.

Memory protection plans must take into account many factors, including:

Size

Desired performance

Memory array manufacturing characteristics

POWER7 processor-based systems have a number of protection schemes designed to prevent, protect, or limit the effect of errors in main memory. These include the following capabilities:

64-byte ECC code

This innovative ECC algorithm from IBM Research allows a full 8-bit device kill to be corrected dynamically. This ECC code mechanism works across DIMM pairs on a rank basis. (Depending on the size, a DIMM might have one, two, or four ranks.) With this ECC code, an entirely bad DRAM chip can be marked as bad (chip mark). After marking the DRAM as bad, the code corrects all the errors in the bad DRAM. The code can additionally mark a 2-bit symbol as bad and correct it, providing double-error detect or single-error correct ECC, or a better level of protection, in addition to the detection or correction of a chipkill event.
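The idea behind a chip mark can be illustrated with a deliberately simple sketch. Once the location of a bad device is known, its contents can be rebuilt from redundancy held on the other devices. The example below uses plain XOR parity across chips, which is an assumption made for illustration only; it is not the 64-byte ECC code described above, whose construction is not given in this text. The function names are hypothetical.

```python
from functools import reduce
from typing import List


def parity_byte(chips: List[int]) -> int:
    """XOR of one byte from each chip; assumed to be stored on an extra chip."""
    return reduce(lambda a, b: a ^ b, chips, 0)


def read_with_chip_mark(chips: List[int], parity: int, marked: int) -> List[int]:
    """Rebuild the byte of the chip at index `marked`, ignoring what it returned."""
    others = [b for i, b in enumerate(chips) if i != marked]
    # XOR of the stored parity with all good chips yields the missing byte.
    rebuilt = parity ^ parity_byte(others)
    repaired = list(chips)
    repaired[marked] = rebuilt
    return repaired
```

Because the marked chip's output is never trusted, all 8 bits from that device can be wrong and the read still returns the original data, which is the property the chip mark relies on.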

Hardware-assisted memory scrubbing

Memory scrubbing is a method for dealing with intermittent errors. IBM POWER processor-based systems periodically address all memory locations. Any memory locations with a correctable error are rewritten with the correct data.

CRC

The bus transferring data between the processor and the memory uses CRC error detection with a failed operation retry mechanism and the ability to retune bus parameters
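The CRC error detection with failed-operation retry described in this section can be sketched as follows. This is a minimal model under stated assumptions: CRC-32 from Python's `zlib` stands in for whatever CRC the memory bus actually uses, the `channel` callable is a hypothetical stand-in for the bus, and the retuning of bus parameters is not modeled.

```python
import zlib
from typing import Callable, Tuple

Frame = Tuple[bytes, int]  # (payload, CRC-32 of the payload)


def make_frame(payload: bytes) -> Frame:
    """Attach a CRC to the payload before it is sent over the channel."""
    return payload, zlib.crc32(payload)


def transfer_with_retry(payload: bytes,
                        channel: Callable[[Frame], Frame],
                        max_retries: int = 3) -> bytes:
    """Send the payload, retrying whenever the received CRC does not match."""
    for _attempt in range(max_retries + 1):
        data, crc = channel(make_frame(payload))
        if zlib.crc32(data) == crc:
            return data  # CRC matches: accept the transfer
        # CRC mismatch: treat the operation as failed and retry it.
    raise IOError("transfer failed after retries")
```

A channel that corrupts only its first transmission is recovered transparently on the second attempt; a persistently failing channel raises an error after the retry budget is spent.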
