IBM System/370 145 Manual page 197


that alter data as they execute, so that instructions can be retried from the point of correct execution. When the CPU is enabled for machine check interruptions, an interruption takes place after a CPU error occurs and is retried. If microinstruction retry was successful, the failure need only be recorded; if unsuccessful, programmed recovery procedures are required.
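The retry-then-record policy described above can be illustrated with a small software analogy. This is only a sketch: the real retry is performed by microcode, and the function name `run_with_retry`, the use of `RuntimeError` to stand for a detected CPU error, and the `max_retries` limit are all hypothetical choices made for the example, not details taken from the guide.

```python
def run_with_retry(operation, max_retries=3):
    """Run `operation`, retrying it when it signals a (simulated) error.

    Returns (result, outcome) where outcome is:
      "clean"       - no error occurred
      "recovered"   - an error occurred but a retry succeeded,
                      so the failure need only be recorded
      "unrecovered" - every retry failed, so programmed recovery
                      procedures would be required
    max_retries is an assumed limit, not a Model 145 value.
    """
    failures = 0
    for _ in range(1 + max_retries):
        try:
            result = operation()
        except RuntimeError:
            # Simulated CPU error: count it and retry the operation.
            failures += 1
            continue
        outcome = "clean" if failures == 0 else "recovered"
        return result, outcome
    return None, "unrecovered"
```

An intermittent failure ends as "recovered", while a failure that persists through every retry ends as "unrecovered" and is left to a higher-level recovery routine.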

The microinstruction retry feature provides the Model 145 with the ability to recover from intermittent CPU failures that would otherwise cause a system halt and necessitate a re-IPL or cause an executing program to be terminated. Corrected errors are logged by recovery routines for later diagnosis during scheduled maintenance periods, thereby increasing system availability.

Retry of failing CPU operations on the Model 40 is not provided by either system hardware or programming support.

ECC VALIDITY CHECKING ON PROCESSOR AND CONTROL STORAGE

The ECC method of validity checking on both processor and control storage provides automatic single-bit error detection and correction. It also detects all double-bit and some multiple-bit processor and control storage errors but does not correct them. Checking is handled on an eight-byte basis, using an eight-bit modified Hamming code, rather than on a single-byte basis, using a single parity bit. However, parity checking is still used to verify other data in a Model 145 system that is not contained in processor or control storage. Models 30 and 40 use parity checking for main storage data verification.
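To make the single-error-correcting, double-error-detecting behavior concrete, here is a minimal sketch of a textbook extended Hamming (72,64) code. It is an assumption for illustration only: the guide says the Model 145 uses a "modified" Hamming code but does not give its check-bit assignment, so the bit layout, the names `encode` and `decode`, and the status strings below are not the machine's actual scheme.

```python
# Textbook extended Hamming (72,64): 64 data bits, 7 Hamming check bits,
# 1 overall parity bit. Layout is an assumption, not the Model 145 code.
CHECK_POSITIONS = [1, 2, 4, 8, 16, 32, 64]
DATA_POSITIONS = [p for p in range(1, 72) if p not in CHECK_POSITIONS]

def encode(data):
    """Return a 72-element bit list; index 0 holds the overall parity bit."""
    if not 0 <= data < 1 << 64:
        raise ValueError("data must fit in 64 bits")
    word = [0] * 72
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (data >> i) & 1
    for c in CHECK_POSITIONS:
        # Each check bit covers every position whose index has bit c set.
        parity = 0
        for pos in range(1, 72):
            if pos & c and pos != c:
                parity ^= word[pos]
        word[c] = parity
    word[0] = sum(word[1:]) & 1  # makes the whole word even parity
    return word

def decode(word):
    """Return (status, data); data is None when the error is uncorrectable."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 72):
        if word[pos]:
            syndrome ^= pos
    overall_odd = sum(word) & 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd and syndrome < 72:
        # Odd number of flips: treat as a single-bit error at `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity (a double-bit error),
        # or a syndrome that points outside the word.
        return "uncorrectable", None
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        data |= word[pos] << i
    return status, data
```

Flipping any one of the 72 bits of an encoded word yields "corrected" with the original data, while flipping two bits yields "uncorrectable", which mirrors the detect-and-correct versus detect-only distinction in the text.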

Data enters and leaves storage in the CPU through the storage adapter unit, which performs ECC validity checking on each doubleword. Another storage adapter is contained in the processor storage frame. When a doubleword (72 bits, as shown in Figure 50.10.1) is fetched from processor or control storage, the appropriate storage adapter unit checks the eight-bit ECC code to validate the 64 data bits. If the data is correct, the adapter unit generates the appropriate parity bit for each of the eight data bytes and reformats the doubleword to look as shown in Figure 50.10.2. If a single-bit error is detected, the identified data bit in error is corrected automatically by the corrector unit in the storage adapter and sent to the CPU. A corrected doubleword is sent back to control storage but not back to processor storage. When a doubleword is to be placed in processor storage by a program or in control storage during microprogram loading, the storage adapter unit strips the eight parity bits, constructs the necessary eight-bit ECC code, and appends the code to the 64 data bits. The 72 bits are then stored as shown in Figure 50.10.1. Additional CPU time is required to correct a single-bit error that occurs for a fetch to control storage.
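The parity half of this format conversion can be sketched as follows. The example covers only the per-byte parity step (generating eight parity bits on a fetch, checking and stripping them on a store) and leaves ECC generation out. It assumes odd parity, which the passage does not state, and the function names are hypothetical.

```python
def odd_parity_bit(byte):
    """Return the bit that gives the 9-bit group an odd number of ones.

    Odd parity is an assumption here; the passage does not say which
    parity sense the adapter generates.
    """
    return 1 - (bin(byte).count("1") & 1)

def to_cpu_format(data):
    """Fetch direction: pair each of the 8 data bytes with a parity bit."""
    if len(data) != 8:
        raise ValueError("a doubleword holds exactly 8 data bytes")
    return [(b, odd_parity_bit(b)) for b in data]

def from_cpu_format(pairs):
    """Store direction: verify and strip the 8 parity bits.

    The returned 8 bytes are what an ECC generator would then encode.
    """
    if len(pairs) != 8:
        raise ValueError("a doubleword holds exactly 8 data bytes")
    for b, p in pairs:
        if p != odd_parity_bit(b):
            raise ValueError("parity error in byte 0x%02X" % b)
    return bytes(b for b, _ in pairs)
```

Both layouts occupy 72 bits: 64 data bits plus either eight parity bits (one per byte) or one eight-bit ECC code for the whole doubleword.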

When a single-bit storage error occurs, the hardware also determines whether the error is intermittent or solid by retrying the storage operation to see whether the error occurs again. With one exception, only intermittent single-bit storage errors can cause a machine check. When an intermittent single-bit storage error is detected and corrected during the execution of an instruction or I/O operation, a machine check pending latch is set on and the operation continues. At the completion of the CPU operation, a machine check interruption occurs to allow error recording to be done unless the CPU has been disabled for ECC correction interruptions.

The occurrence of a machine check interruption after an intermittent single-bit processor or control storage correction is dependent on the setting of three ECC mode bits in a special mode register. The mode register bits can be set by using the DIAGNOSE instruction.
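The sequence of classifying a single-bit error, setting the pending latch, and deciding at operation completion whether to interrupt can be sketched as a small state model. This is a simplification under stated assumptions: the three ECC mode bits are collapsed into one hypothetical flag, `ecc_interrupts_enabled`, because the passage does not define the individual bits; the unspecified "one exception" is not modeled; and clearing the latch at completion is an assumption.

```python
from dataclasses import dataclass

@dataclass
class EccState:
    # Hypothetical stand-in for the three ECC mode bits, whose individual
    # meanings are not given in the passage.
    ecc_interrupts_enabled: bool = True
    machine_check_pending: bool = False

def on_single_bit_error(state, recurs_on_retry):
    """Classify a corrected single-bit error by retrying the operation.

    An error that recurs on retry is "solid"; otherwise it is
    "intermittent" and sets the machine check pending latch.
    """
    if recurs_on_retry:
        return "solid"
    state.machine_check_pending = True
    return "intermittent"

def on_operation_complete(state):
    """Return True if a machine check interruption is taken now."""
    take_interrupt = state.machine_check_pending and state.ecc_interrupts_enabled
    # Assumption: the latch is cleared at completion in either case.
    state.machine_check_pending = False
    return take_interrupt
```

With the flag enabled, an intermittent error leads to an interruption at the end of the operation so that the error can be recorded; with it disabled, the correction happens silently.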

A Guide to the IBM System/370 Model 145

187
