IA-32 Intel® Architecture Optimization
Example 7-4
Spin-wait Loop and PAUSE Instructions
(a) An unoptimized spin-wait loop incurs a performance penalty when it exits
the loop. While spinning, it consumes execution resources without contributing
computational work.
do {
// This loop can run faster than the speed of memory access.
// Other worker threads cannot finish modifying sync_var until
// outstanding loads from the spinning loop are resolved.
} while( sync_var != constant_value);
(b) Inserting the PAUSE instruction in a fast spin-wait loop prevents a
performance penalty for both the spinning thread and the worker thread.
do {
_asm pause
// Ensure this loop is de-pipelined, i.e., prevent more than one
// load request to sync_var from being outstanding. This avoids
// a performance penalty when the worker thread updates sync_var
// and the spinning thread exits the loop.
} while( sync_var != constant_value);
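The PAUSE loop in (b) can also be written without inline assembly. The following is a minimal sketch, not the manual's own code: it assumes a C11 compiler, uses the `_mm_pause()` intrinsic to emit PAUSE on x86 targets, and replaces the plain `sync_var` with an `atomic_int` so that the load is well defined across threads. The helper name `spin_wait_equal`, the `cpu_relax` macro, and the `max_spins` bound are hypothetical additions for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: no-op fallback on other targets */
#endif

/* Hypothetical helper: spin until *sync_var equals constant_value,
 * giving up after max_spins checks. Returns true if the value was seen. */
static bool spin_wait_equal(atomic_int *sync_var, int constant_value,
                            unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; ++i) {
        if (atomic_load_explicit(sync_var, memory_order_acquire) == constant_value)
            return true;
        cpu_relax();              /* short delay between successive loads */
    }
    return false;
}
```

The spin bound is a design choice of this sketch only; the loop in (b) spins until the condition holds.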
(c) A spin-wait loop using a "test, test-and-set" technique to determine the
availability of the synchronization variable. This technique is recommended when
writing spin-wait loops to run on IA-32 architecture processors.
Spin_Lock:
    CMP lockvar, 0;        // Check if lock is free.
    JE Get_Lock
    PAUSE;                 // Short delay.
    JMP Spin_Lock;
Get_Lock:
    MOV EAX, 1;
    XCHG EAX, lockvar;     // Try to get lock.
    CMP EAX, 0;            // Test if successful.
    JNE Spin_Lock;
Critical_Section:
    <critical section code>
    MOV lockvar, 0;        // Release lock.
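The "test, test-and-set" sequence in (c) can be sketched in C as well. This is an illustrative version under stated assumptions, not the manual's code: it assumes C11 atomics, uses `atomic_exchange_explicit` in the role of XCHG, and uses `_mm_pause()` for the PAUSE delay. The type `spin_lock_t` and the functions `spin_try_lock`, `spin_lock`, and `spin_unlock` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: no-op fallback on other targets */
#endif

typedef struct { atomic_int lockvar; } spin_lock_t;   /* 0 = free, 1 = held */

/* One attempt: read-only test first, then the atomic exchange. */
static bool spin_try_lock(spin_lock_t *l)
{
    if (atomic_load_explicit(&l->lockvar, memory_order_relaxed) != 0)
        return false;                                  /* lock is held */
    return atomic_exchange_explicit(&l->lockvar, 1,
                                    memory_order_acquire) == 0;
}

static void spin_lock(spin_lock_t *l)
{
    for (;;) {
        /* Test: spin on plain loads until the lock looks free. */
        while (atomic_load_explicit(&l->lockvar, memory_order_relaxed) != 0)
            cpu_relax();                               /* short delay */
        /* Test-and-set: try to get the lock, retry if another thread won. */
        if (atomic_exchange_explicit(&l->lockvar, 1,
                                     memory_order_acquire) == 0)
            return;
    }
}

static void spin_unlock(spin_lock_t *l)
{
    atomic_store_explicit(&l->lockvar, 0, memory_order_release);  /* release lock */
}
```

Spinning on a plain load and issuing the exchange only when the lock appears free keeps waiting threads from repeatedly executing the locked exchange while the lock is held.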