Provide an __asm version of the assembly code for atomic operations,
for better compatibility.
(This is a temporary workaround so that this does not block other tests.
We'll revisit this change once we understand the performance implications
of the __asm version.)