Where can I find x86 instruction execution times? How can I find out which instruction is faster or smaller?
2 Answers
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html - you want the optimization manual for the CPU you are interested in; AMD publishes an optimization manual for their CPUs.
Keep in mind that there is no single "time" for each instruction these days. You have to take out-of-order execution, memory and register stalls, and instruction-level parallelism into account.
Different instructions do still have different latencies and throughputs, decode to different numbers of uops (or m-ops), and have different execution ports that those uops can run on. The best source for these numbers is Agner Fog's instruction tables, together with his microarchitecture PDF, which explains how these numbers matter. See also the optimization section in Stack Overflow's x86 tag wiki.
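To make the latency versus throughput distinction concrete, here is a minimal sketch of the usual back-of-the-envelope model. The function names and the example figures (a latency of 3 cycles and a reciprocal throughput of 1 cycle) are illustrative assumptions, not values taken from any particular table, and the model ignores everything else the CPU is doing.

```python
def chain_cycles(n, latency):
    """Estimated cycles for n instructions where each one needs the
    result of the previous one: every instruction waits the full latency."""
    return n * latency


def independent_cycles(n, latency, recip_throughput):
    """Estimated cycles for n independent instructions of the same kind:
    the first result appears after `latency` cycles, then one more
    result completes every `recip_throughput` cycles."""
    if n == 0:
        return 0
    return latency + (n - 1) * recip_throughput


if __name__ == "__main__":
    # Hypothetical instruction: latency 3 cycles, reciprocal throughput 1 cycle.
    n, latency, recip_throughput = 1000, 3, 1
    print("dependent chain:", chain_cycles(n, latency))
    print("independent ops:", independent_cycles(n, latency, recip_throughput))
```

Under these assumed numbers the same instruction costs roughly three times as much per operation in a dependency chain as it does when the operations are independent, which is why a single "time per instruction" figure is misleading.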

- It also varies wildly between chips: an instruction that takes ages on an Atom could run ten times faster on a Xeon. – James Anderson Feb 09 '12 at 01:20
- Don't forget about page misses. The exact same instruction on the same processor (same physical hardware) can take times that differ by orders of magnitude. – mattnz Feb 09 '12 at 01:28
- Not completely, although it is less deterministic than might be nice. You can certainly, for example, get value out of selecting appropriate assembly-level representations of operations, or understanding the relative cost of operations at a statistical level, or even at a "based on queried cache size" level. – Daniel Pittman Feb 09 '12 at 04:53
- "The raw numbers are in those manuals, though." No, they aren't. Intel doesn't publish latency / uop counts for all their CPUs. I think they did for a while, and Intel's optimization manual still has the tables, but IIRC they only have entries for P4. See Agner Fog's instruction tables for experimental numbers for all modern x86 CPUs (Pentium to Skylake, plus AMD and VIA), with throughput, latency, and fused/unfused uop count (or AMD m-op, or whatever), and which execution ports those instructions/uops can run on. Also see his uarch PDF to grok these numbers. – Peter Cordes Aug 08 '16 at 21:05
Daniel already summed up the answer, +1 to that. Bottom line: modern CPUs with over 2 billion transistors do such crazy things that you can't look at assembly instructions and expect to guess the timing. The only thing you can really do is write code and measure its performance.
On that note, if you are curious about learning more, take a look at http://www.flounder.com/exceptions.htm. The guy who wrote that article is a PhD and actually has a lot of cool things to say about many things. I've spent some time going through all his articles. The one I'm linking talks about measuring performance of exception handling and he goes right down to assembly instruction level.
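As a rough illustration of the "write code and measure" approach, here is a minimal timing harness sketch. The names `measure`, `shift_version`, and `multiply_version` are hypothetical, chosen only for this example. In Python the interpreter overhead dominates, so the numbers say nothing about the underlying x86 instructions; the point is the shape of the measurement (many repetitions, then the best and median sample), which carries over to compiled code.

```python
import statistics
import time


def measure(fn, repeats=30, inner=100_000):
    """Return (best, median) nanoseconds per call of fn.

    Each of the `repeats` samples times `inner` back-to-back calls, so
    timer overhead is amortized and run-to-run noise becomes visible.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        for _ in range(inner):
            fn()
        elapsed = time.perf_counter_ns() - start
        samples.append(elapsed / inner)
    return min(samples), statistics.median(samples)


# Two hypothetical candidates that compute the same result.
def shift_version(x=12345):
    return x << 1


def multiply_version(x=12345):
    return x * 2


if __name__ == "__main__":
    for candidate in (shift_version, multiply_version):
        best, median = measure(candidate)
        print(f"{candidate.__name__}: best {best:.1f} ns, "
              f"median {median:.1f} ns per call")
```

Reporting both the best and the median sample is a design choice: the best run approximates the cost with the least interference, while a large gap between the two suggests caches, scheduling, or other activity is affecting the result.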

- Actually, the 68020 manual (ca. 1986) had comments along the lines of "it is very difficult to predict the execution time of some code, even if you completely understand the underlying architecture". – gnasher729 Aug 08 '16 at 23:00