Intel Pentium II Developer's Manual page 29

During every clock cycle, up to three Intel Architecture macro-instructions can be decoded in the ID1 pipestage. However, if the instructions are complex or are longer than seven bytes, the decoder is limited to decoding fewer instructions.

The decoders can decode:

1. Up to three macro-instructions per clock cycle.
2. Up to six µops per clock cycle.
3. Macro-instructions up to seven bytes in length.
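As a rough illustration, the three limits listed above can be written as a single check. This is only a sketch: the function name `within_decode_limits` and the `(length_bytes, uops)` tuple layout are hypothetical and not part of any Intel tool. Passing the check is necessary but not sufficient for single-cycle decode, because the per-decoder µop restrictions also apply.

```python
def within_decode_limits(instrs):
    """Check a group of instructions against the three per-cycle limits.

    instrs is a list of (length_bytes, uops) tuples, one per macro-instruction.
    This layout is an assumption made for illustration only.
    """
    return (
        len(instrs) <= 3                                 # at most three macro-instructions
        and sum(uops for _, uops in instrs) <= 6         # at most six µops in total
        and all(length <= 7 for length, _ in instrs)     # each at most seven bytes long
    )


if __name__ == "__main__":
    print(within_decode_limits([(2, 4), (3, 1), (5, 1)]))
    print(within_decode_limits([(8, 1)]))
```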

Pentium II processors have three decoders in the ID1 pipestage. The first decoder can decode one Intel Architecture macro-instruction of four or fewer µops in each clock cycle. The other two decoders can each decode one Intel Architecture instruction of one µop in each clock cycle. Instructions composed of more than four µops take multiple cycles to decode. When programming in assembly language, scheduling the instructions in a 4-1-1 µop sequence increases the number of instructions that can be decoded each clock cycle. In general:

• Simple instructions of the register-register form are only one µop.
• Load instructions are only one µop.
• Store instructions have two µops.
• Simple read-modify instructions are two µops.
• Simple instructions of the register-memory form have two to three µops.
• Simple read-modify-write instructions are four µops.
• Complex instructions generally have more than four µops, so they take multiple cycles to decode.
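The effect of 4-1-1 ordering can be estimated with a small model. The sketch below is a simplification under stated assumptions, not a description of the actual hardware: instructions are decoded strictly in program order, the first decoder accepts up to four µops, the other two accept only single-µop instructions, and an instruction with more than four µops is assumed to occupy the decoders alone for `ceil(uops / 4)` cycles. Instruction length limits and fetch alignment are ignored. The name `decode_cycles` is hypothetical.

```python
from math import ceil


def decode_cycles(uop_counts):
    """Estimate decode cycles for a list of per-instruction µop counts.

    Simplified model (assumptions, not hardware facts): in-order decode,
    decoder 0 handles <= 4 µops, decoders 1 and 2 handle exactly 1 µop each.
    """
    if any(count < 1 for count in uop_counts):
        raise ValueError("every instruction must have at least one µop")

    cycles = 0
    i = 0
    n = len(uop_counts)
    while i < n:
        first = uop_counts[i]
        i += 1
        if first > 4:
            # Complex instruction: assumed to decode alone over several cycles.
            cycles += ceil(first / 4)
            continue
        # Decoders 1 and 2 take the next instructions only if each is one µop.
        taken = 0
        while taken < 2 and i < n and uop_counts[i] == 1:
            i += 1
            taken += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(decode_cycles([4, 1, 1, 4, 1, 1]))  # 4-1-1 ordering: 2 cycles
    print(decode_cycles([1, 1, 4, 1, 1, 4]))  # same mix, poorer order: 3 cycles
```

In this model a multi-µop instruction that arrives at the second or third decoder ends the current cycle and waits for the first decoder, which is why placing it at the head of each group of three helps.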

For the purpose of counting µops, MMX technology instructions are simple instructions. See Appendix D in AP-526, Optimizations for Intel's 32-bit Processors (Order Number 242816) for a table that specifies the number of µops for each instruction in the Intel Architecture instruction set.

Once the µops are decoded, they are issued from the In-Order Front-End into the Reservation Station (RS), which is the beginning pipestage of the Out-of-Order core. In the RS, the µops wait until their data operands are available. Once a µop has all data sources available, it is dispatched from the RS to an execution unit. If a µop enters the RS in a data-ready state (that is, all data is available), then the µop is immediately dispatched to an appropriate execution unit, if one is available. In this case, the µop spends very few clock cycles in the RS. All of the execution units are clustered on ports coming out of the RS. Once the µop has been executed, it returns to the reorder buffer (ROB) and waits for retirement. In the retirement pipestage, all data values are written back to memory and all µops are retired in order, three at a time. The figure below provides details about the Out-of-Order core and the In-Order retirement pipestages.
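The dispatch and retirement behavior described above can be sketched in a few lines. This is a toy model under several assumptions that are not taken from the manual: every µop has a one-cycle latency, an execution unit is always free, and operand names are unique values (as if renaming had already been done). The names `dispatch_cycles` and `retire` and the tuple layouts are hypothetical.

```python
from collections import deque


def dispatch_cycles(uops, ready_values):
    """Return {name: cycle} showing when each µop leaves the RS.

    uops: list of (name, sources, dest) in program order.
    ready_values: operand names that are already available.
    Assumes one-cycle latency and unlimited execution units.
    """
    ready = set(ready_values)
    waiting = list(uops)
    cycle_of = {}
    cycle = 0
    while waiting:
        # A µop is dispatched as soon as all of its sources are available.
        issued = [u for u in waiting if set(u[1]) <= ready]
        if not issued:
            raise ValueError("a source operand is never produced")
        for name, _sources, _dest in issued:
            cycle_of[name] = cycle
        # Results become visible to waiting µops in the next cycle.
        ready.update(dest for _name, _sources, dest in issued)
        waiting = [u for u in waiting if u[0] not in cycle_of]
        cycle += 1
    return cycle_of


def retire(rob, width=3):
    """Retire up to `width` completed µops from the head of the ROB, in order.

    rob: deque of (name, done) tuples in program order; it is modified in place.
    """
    retired = []
    while rob and len(retired) < width and rob[0][1]:
        name, _done = rob.popleft()
        retired.append(name)
    return retired


if __name__ == "__main__":
    uops = [
        ("load", ["addr"], "t0"),
        ("add", ["t0", "ebx"], "t1"),  # must wait for the load result
        ("mov", ["ecx"], "t2"),        # data-ready on entry
    ]
    print(dispatch_cycles(uops, {"addr", "ebx", "ecx"}))
    rob = deque([("load", True), ("add", False), ("mov", True)])
    print(retire(rob))  # the finished mov cannot pass the unfinished add
```

The example shows the two halves of the design: a later data-ready µop may be dispatched ahead of an earlier one that is still waiting for an operand, but retirement stops at the first unfinished µop so that results are committed in program order.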

MICRO-ARCHITECTURE OVERVIEW
