16

I have read the assembly line

OR EAX, 0xFFFFFFFF

and the program has stored a string in the EAX register. I have trouble understanding how a string can be compared with a value like that. After that instruction executes, EAX holds the value 0xFFFFFFFF.

Can someone tell me what purpose that operation serves? Is it a line that appears frequently in assembly code? (For example, XOR EAX, EAX is an efficient way to make EAX = 0. Is it something like that?)

perror
user3097712

3 Answers

23

To understand why the compiler does this, compare the encodings in the following disassembly:

B8 FF FF FF FF                          mov     eax, 0FFFFFFFFh
83 C8 FF                                or      eax, 0FFFFFFFFh

What the compiler is probably trying to accomplish is to set the eax register to -1 using as few bytes as possible (3 bytes instead of 5) in order to be cache-friendly. OR also has about twice the throughput of the MOV instruction as long as you don't mind messing up the flags.

This is probably a variable being initialized to -1.
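
As a rough illustration of the two points above, here is a minimal C sketch. The function names are made up for this example, and the byte arrays are simply copied from the disassembly. It models only the register value (not the flags or timing): OR with all bits set gives 0xFFFFFFFF whatever EAX held before, and the OR encoding is 2 bytes shorter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Machine code bytes copied from the disassembly in this answer. */
static const uint8_t MOV_EAX_M1[] = {0xB8, 0xFF, 0xFF, 0xFF, 0xFF}; /* mov eax, 0FFFFFFFFh */
static const uint8_t OR_EAX_M1[]  = {0x83, 0xC8, 0xFF};             /* or  eax, 0FFFFFFFFh */

/* Hypothetical helper: models the value left in EAX by `or eax, 0FFFFFFFFh`. */
uint32_t or_all_ones(uint32_t eax) {
    return eax | 0xFFFFFFFFu; /* every bit is forced to 1, old contents are irrelevant */
}

/* Hypothetical helper: bytes saved by the OR encoding over the MOV encoding. */
size_t bytes_saved(void) {
    return sizeof MOV_EAX_M1 - sizeof OR_EAX_M1;
}

/* Sanity checks for the sketch. */
void check_or_all_ones(void) {
    assert(or_all_ones(0u) == 0xFFFFFFFFu);
    assert(or_all_ones(0x12345678u) == 0xFFFFFFFFu);
    /* On the usual two's complement targets the same bit pattern reads as -1. */
    assert((int32_t)or_all_ones(0x12345678u) == -1);
    assert(bytes_saved() == 2);
}
```

This also answers the "string" part of the question: whatever EAX contained before (a pointer to a string, for instance) is simply overwritten, so no comparison takes place.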

Peter Andersson
  • It's quite common to use -1 to indicate an error or some other special reserved value if 0 is considered a valid value in the range. I'm speculating of course. – Peter Andersson Jun 13 '14 at 20:03
  • Since the OP mentioned strings, it is also feasible that strcmp or its brethren is involved, and -1 could be an ordinary return value in that case. But I agree, it's impossible to give a non-speculative answer to "why" without seeing what input generates this result, and how that result is used later in the code. – DCoder Jun 13 '14 at 20:17
  • @DCoder very true. It could be a completely normal return value from a comparison function with -1 (<0) as less, 0 as equal and 1 (>0) as greater which wouldn't be uncommon either. – Peter Andersson Jun 13 '14 at 20:22
  • @user3097712 -1 is expressed as 0xFFFFFFFF (all bits set) in two's complement. – glglgl Jun 14 '14 at 12:06
  • "OR also has about twice the throughput of the MOV instruction as long as you don't mind messing up the flags." Well, no, that's not true. As far as the CPU is concerned, or eax, -1 depends on the previous value of the eax register, which lengthens the code's dependency chain and will significantly decrease performance compared to if you had used a mov. There is a code size reduction, as you demonstrated, but there is a very significant speed reduction. It is almost never worth the 2 bytes. (Yes, chips could conceivably special-case an OR with all bits set, but they don't.) – Cody Gray - on strike May 29 '17 at 11:49
  • @PeterAndersson you should really add that note from glglgl in the answer. For instance, FORTH uses -1 to refer to true. It helps for me to think of this optimization as a quick way to set all the bits to 1. – Evan Carroll Feb 28 '18 at 02:59
6

Sorry, I can't post this as a comment but a couple of quick (and non-exhaustive) tests show the following:

  • gcc (4.6.3) uses or instead of mov when optimising for size (-Os)
  • msvc (13) uses or instead of mov whatever the optimisation setting (including disabled)
  • clang (3.0) uses mov whatever the optimisation setting

gcc's behaviour, in particular, supports Peter Andersson's answer.
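
For anyone who wants to repeat this kind of test, a small input along the following lines might do. This is only a sketch: `index_of` is a hypothetical function written for this post, not the code used in the tests above. Compiling it for 32-bit x86 with something like `gcc -Os -S` and reading the generated assembly shows how the compiler materialises the -1; the result will depend on compiler version and target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test input: returns the index of the first occurrence of c
   in s, or -1 if it is absent. The `return -1` path is where a compiler
   has to load 0xFFFFFFFF into the return register (eax on x86). */
int index_of(const char *s, char c) {
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == c) {
            return (int)i;
        }
    }
    return -1; /* sentinel: 0 is a valid index, so -1 marks "not found" */
}

/* Sanity checks for the sketch. */
void check_index_of(void) {
    assert(index_of("abc", 'a') == 0);
    assert(index_of("abc", 'z') == -1);
}
```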

phuclv
Ian Cook
5

This will always result in setting the EAX register equal to 0xFFFFFFFF and will also have the side effect of setting the flags appropriately (that is N=1, Z=0, etc.). It is not a common idiom.
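
To make the flag side effect concrete, here is a small C model of the status flags after a 32-bit OR. The struct and function names are invented for this sketch. It relies on the documented behaviour that OR clears CF and OF and sets SF, ZF and PF from the result (AF is left undefined, so it is not modelled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the flags that a 32-bit OR defines. */
struct or_flags {
    bool cf; /* carry: always cleared by OR */
    bool of; /* overflow: always cleared by OR */
    bool sf; /* sign: copy of bit 31 of the result */
    bool zf; /* zero: set when the result is 0 */
    bool pf; /* parity: set when the low byte has an even number of 1 bits */
};

/* Hypothetical helper: flags as they would be after `or eax, operand`. */
struct or_flags flags_after_or(uint32_t eax, uint32_t operand) {
    uint32_t result = eax | operand;

    /* Fold the low byte down to one bit holding its parity. */
    uint8_t low = (uint8_t)result;
    low ^= (uint8_t)(low >> 4);
    low ^= (uint8_t)(low >> 2);
    low ^= (uint8_t)(low >> 1);

    struct or_flags f;
    f.cf = false;
    f.of = false;
    f.sf = ((result >> 31) & 1u) != 0;
    f.zf = (result == 0);
    f.pf = (low & 1u) == 0;
    return f;
}

/* Sanity check: OR with all ones gives SF=1, ZF=0, CF=0, OF=0, PF=1. */
void check_flags_after_or(void) {
    struct or_flags f = flags_after_or(0x12345678u, 0xFFFFFFFFu);
    assert(f.sf && !f.zf && !f.cf && !f.of && f.pf);
}
```

Since the result is always 0xFFFFFFFF, the flags come out the same regardless of what EAX held beforehand.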

Edward
  • What is the N flag (Carry, Overflow, Zero, etc are expected, but what does N represent)? – jww Jun 15 '14 at 00:12
  • "... it is not a common idiom" - it's actually quite common, especially in older software. The OR instruction is smaller than the MOV instruction. A machine with 640K or 1MB of memory needed the savings (yep, it dates back that far). And XOR was (and still is) used to zero a register for the same reason. – jww Jun 15 '14 at 00:15
  • @jww: Sorry, I work with a lot of different processors. I meant the SF (sign flag) which is what Intel calls it; other manufacturers call it N for Negative. It might be useful to the OP if you could specify compilers which generate that sequence. None of the ones I have handy do so. – Edward Jun 16 '14 at 11:42
  • @jww: thanks for your explanation. I did not know that OR is smaller than MOV. Now I know. Thanks! I have learned a lot from this post. – user3097712 Jun 18 '14 at 23:02