How not to use a chess engine?

Question

How not to use a chess engine? What are the bad practices when using a chess engine, and the good ones? How trustworthy are their suggestions and evaluations?

ferit · Accepted Answer · 2018-09-15T11:38:04.440

What's the purpose of this question & answer?

I see a lot of engine misuse in this community. I see topics where people do opening "analysis" by copy-pasting engine output. Even worse, I have seen opening "analysis" done by pasting engine output for the very first move!

Many beginners in this community believe that engines give the best possible move in every position, because they beat human players.

I tried to explain briefly under several topics why this is wrong, but short explanations are not enough: people don't want to believe a stranger who says that using an engine in such ways is wrong without explaining every detail. So I, as a computer science student and hobby player (I used to play OTB but quit), wrote this very long, tiring answer in the hope of changing these wrong beliefs.

Notes: This is a very long answer. It probably has a lot of grammar mistakes and typos. It may also be possible to explain these things more fluently or more briefly. So, if you think you can improve this answer, please suggest edits; I would highly appreciate it.

How does an engine work? What are these numbers in the analysis window?

Engines evaluate every position using a set of metrics (the quality of which is very closely related to the engine's playing strength).

For example: engines have predefined values for material, like 1 point for a pawn, 3 points for a bishop, etc.

But that's not all, of course; they use more advanced metrics too. For example: 0.2 points for a passed pawn, 0.1 points for the bishop pair, etc.
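
To make the idea concrete, here is a minimal sketch of such an evaluation as a weighted sum of features. It is only an illustration, not how any particular engine is implemented: the pawn, bishop, passed-pawn and bishop-pair weights are the numbers mentioned above, the knight, rook and queen weights are conventional values assumed here, and the feature counts for the two sides are made up by hand. Scores are kept in centipawns (1 pawn = 100) to avoid rounding noise.

```python
# Weights in centipawns (1 pawn = 100). Pawn, bishop, passed pawn and bishop
# pair follow the numbers in the text; knight, rook and queen are assumed
# conventional values, not taken from any real engine.
WEIGHTS = {
    "pawn": 100,
    "knight": 300,
    "bishop": 300,
    "rook": 500,
    "queen": 900,
    "passed_pawn": 20,
    "bishop_pair": 10,
}


def evaluate(white_features, black_features):
    """Return the score in centipawns from White's point of view."""
    score = 0
    for feature, weight in WEIGHTS.items():
        diff = white_features.get(feature, 0) - black_features.get(feature, 0)
        score += weight * diff
    return score


# Hypothetical feature counts for an imaginary position.
white = {"pawn": 6, "bishop": 2, "rook": 1, "passed_pawn": 1, "bishop_pair": 1}
black = {"pawn": 5, "bishop": 1, "knight": 1, "rook": 1}

score_cp = evaluate(white, black)
print(f"{score_cp / 100:+.2f}")
```

Change any weight and the "opinion" of this toy evaluator changes with it, which is exactly why such numbers are open to question.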

So, the engine evaluates positions like this and inserts them into a tree (if you want to learn more about trees, please take a look at this paper).

But memory is not infinite, and if the computer runs out of memory, the engine stops, right? So the engine has to use memory carefully. How? It deletes nodes (positions) from the tree that are not promising, i.e. that have a bad evaluation score (though it may not be easy to decide which ones are bad).

The longest path in the tree is the depth of the analysis. Depth is counted in plies (half-moves): every node represents a position after one side has moved, so a depth of 2 equals 1 full move.
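
As a rough sketch of a search tree with pruning, the toy example below uses alpha-beta pruning. This is one standard way of skipping branches that cannot change the result; the text above does not name the technique, so treat it as an assumption chosen for illustration. The tree is a hand-made nested list whose leaves are made-up evaluation scores, and depth is measured in plies (half-moves).

```python
import math

# Toy tree: the root is "our" move, each inner list is the opponent's reply,
# and the numbers are made-up evaluation scores of the resulting positions.
TREE = [[3, 5], [6, 9], [1, 2]]


def depth_in_plies(node):
    """Length of the longest path from this node down to a leaf."""
    if isinstance(node, (int, float)):
        return 0
    return 1 + max(depth_in_plies(child) for child in node)


def alphabeta(node, alpha, beta, maximizing, counter):
    """Minimax value of `node`; counter[0] counts the leaves actually evaluated."""
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best


counter = [0]
value = alphabeta(TREE, -math.inf, math.inf, True, counter)
plies = depth_in_plies(TREE)
print(value, counter[0], plies, plies // 2)
```

In this tiny tree the third branch is abandoned after its first leaf, because the opponent's reply there is already worse for us than what the second branch guarantees.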

What are the strengths and weaknesses of engines compared to humans?

Engines are absolutely better at tactics than humans, because:

tactics are convertible to concrete (material, including checkmate) advantages in the short term (generally in fewer than 10 moves)
computers have overwhelmingly more memory than humans (Can any chess player memorize and evaluate 10^3 different continuations in any position in 1 minute? No. But computers today can do it for more than 10^12.)
computers (practically) don't make calculation mistakes, but humans do.

Humans are absolutely better at strategy and positional evaluation, because:

after all, engines evaluate using metrics that are absolutely open to question; for example, there is no single correct value for the bishop pair, or even for a pawn.
a master chess player is far better at assessing these factors (machines are stupid)
humans are overwhelmingly stronger at pruning; our search tree is far smaller (we can eliminate a lot of unnecessary positions, and computers fall short at this)

Let me demonstrate this with the following position, which a 1200 Elo player can (rightfully) assess as a draw in 3 seconds.

 [Title "Komodo-9.3 depth=48, -17.61"]
 [FEN "rr2k3/6b1/8/p1p1p1p1/PpPpPpPp/1P1P1P1P/8/4K3 b - - 0 1"]

The engine says it's -17.61 after seeing 24 moves (48 plies) ahead. Why? Is it that stupid? Can't it understand that Black can't make progress?

Yes! It is that stupid! And you are the one who needs to understand that it's a draw, not the engine; it's a tool for you!

-17.61 is the result of the evaluation metrics. Is the evaluation wrong, then? No, the evaluation is correct if you interpret it correctly. The engine sums up its metrics, which gives -17.61 because Black has a large material advantage. The interesting thing is that even this much material isn't enough to win in this position, and that's the part where you should assist the engine with your own intelligence.
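
A plain material count of the position above already shows where most of that number comes from. The sketch below reads the FEN from the diagram and adds up conventional piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9; these values are an assumption here, not Komodo's actual parameters). Whatever separates this raw count from -17.61 presumably comes from other evaluation terms that are not visible in the output.

```python
# Conventional piece values in pawns (assumed, not taken from Komodo).
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

# The fortress position from the diagram above.
FORTRESS = "rr2k3/6b1/8/p1p1p1p1/PpPpPpPp/1P1P1P1P/8/4K3 b - - 0 1"


def material(fen):
    """Return (white_material, black_material) for the board field of a FEN."""
    white = black = 0
    for ch in fen.split()[0]:
        value = PIECE_VALUES.get(ch.upper())
        if value is None:
            continue  # kings, digits and '/' carry no material value
        if ch.isupper():
            white += value
        else:
            black += value
    return white, black


white, black = material(FORTRESS)
print(white, black, white - black)
```

The count says Black is far ahead, and it is blind to the fact that none of Black's pieces can ever cross the pawn chain.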

Evaluation scores are not position assessments!!!

There are only two things in an engine's output that can be considered an assessment:

Mate in x
Draw by 0.00

Other than these two, all evaluation scores are opinion-based. As in the position above, a -17 evaluation can turn out to be a draw; it's very unlikely, but possible.
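
Many engines report their results over the UCI protocol, where the score arrives either as "score mate N" or as "score cp N" (centipawns, from the engine's point of view). The sketch below labels a score line according to the rule of thumb in this answer. The helper name and the wording of the labels are made up for illustration, and the sample line is hypothetical, built only from the depth and score in the diagram title above.

```python
def interpret_score(info_line):
    """Label the score in a UCI 'info' line following this answer's rule of thumb."""
    tokens = info_line.split()
    if "score" not in tokens:
        return "no score reported"
    i = tokens.index("score")
    kind, value = tokens[i + 1], int(tokens[i + 2])
    if kind == "mate":
        # A negative value means the engine itself is getting mated.
        who = "for the engine" if value > 0 else "against the engine"
        return f"forced mate in {abs(value)} {who}: an assessment"
    if value == 0:
        return "0.00: the engine expects a draw, treated here as an assessment"
    return f"{value / 100:+.2f}: a heuristic score that needs human interpretation"


# Hypothetical line; only the depth and score come from the diagram title.
sample = "info depth 48 score cp -1761"
print(interpret_score(sample))
print(interpret_score("info depth 20 score mate 3"))
```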

So, engines are really far from perfect; that's why we have tablebases. Tablebases are sources of the absolute truth about a given position. A tablebase tells you whether the position is a forced mate (in x moves) or a draw. State-of-the-art tablebases cover all positions with at most seven pieces.
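
Whether a position can be looked up at all depends only on how many pieces are left. As a small sketch (the function names are made up, and the seven-piece limit is the one stated above), the check can be done straight from a FEN string:

```python
def piece_count(fen):
    """Count all pieces, kings included, in the board field of a FEN string."""
    board = fen.split()[0]
    return sum(ch.isalpha() for ch in board)


def in_tablebase_range(fen, max_pieces=7):
    """True if the position is small enough to be covered by a tablebase."""
    return piece_count(fen) <= max_pieces


king_and_pawn = "8/8/8/8/8/4k3/4p3/4K3 w - - 0 1"
fortress = "rr2k3/6b1/8/p1p1p1p1/PpPpPpPp/1P1P1P1P/8/4K3 b - - 0 1"

print(piece_count(king_and_pawn), in_tablebase_range(king_and_pawn))
print(piece_count(fortress), in_tablebase_range(fortress))
```

The fortress position shown earlier has far too many pieces, so no tablebase can settle it and the engine is left with its heuristic score.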

How to interpret evaluation scores?

As I mentioned earlier, engines do their best with limited memory and computational power. Because these are limited, engines tell the user how deep they searched.

Imagine that you are the commander (engine) of an army on open terrain. Your army has 10k soldiers. You have a 10 km line of sight (depth of search), and you see enemy forces 9 km ahead. You estimate (evaluate) their number as roughly 100 soldiers, and you decide to attack. You and your army start to run towards the enemy. But after getting 3 km closer (gaining 3 km more line of sight), you suddenly see 100k more enemy soldiers, and you realize that this battle is lost.

This was an analogy for the horizon effect. There are times when the key move lies beyond the engine's horizon, and the engine evaluates the position incorrectly. This problem occurs much more often in the endgame because, compared to the middlegame and the opening, endgame positions evolve slowly: more moves are needed to make the same amount of progress. Remember the long manoeuvres of endgames. That's why engines use tablebases (precalculated, sure-as-death assessments of positions with fewer than 8 pieces on the board) to fight the horizon effect.
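
The same story can be played out on a toy search tree. The sketch below is a plain depth-limited minimax over a hand-made tree. The node names and scores are invented to mirror the army analogy and do not come from any real engine. With a shallow search the attack looks best; one ply deeper, the hidden reinforcements come into view and the choice flips.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    static_eval: float  # heuristic guess, from the commander's point of view
    children: list = field(default_factory=list)


def search(node, depth, maximizing):
    """Depth-limited minimax: at the horizon, fall back to the static guess."""
    if depth == 0 or not node.children:
        return node.static_eval
    values = [search(child, depth - 1, not maximizing) for child in node.children]
    return max(values) if maximizing else min(values)


def best_choice(root, depth):
    """Name of the root option with the highest searched value."""
    return max(root.children, key=lambda c: search(c, depth - 1, False)).name


# Invented tree: attacking looks fine until the reinforcements become visible.
hidden_army = Node("reinforcements appear", -9.0)
enemy_waits = Node("enemy holds position", +1.0, [hidden_army])
attack = Node("attack", +1.0, [enemy_waits])
hold = Node("hold", 0.0)
root = Node("start", 0.0, [attack, hold])

shallow = best_choice(root, depth=2)
deep = best_choice(root, depth=3)
print(shallow, deep)
```

Nothing about the position changed between the two searches; only the horizon moved.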

Let's return to the position above and clarify why we can so easily assess it as a draw, while the computer doesn't give 0.00.

The engine has calculated 24 moves deep, nice. But we are calculating all possible continuations, until the end! We are not faster than computers, but we are quicker than them, because after checking several moves we understand that all continuations are trivial, and we prune them all. We calculate only a few positions and assess the position as a draw. But the engine calculates tons of them, because it's not clever enough to understand that calculating all these positions is pointless.

Now, how not to use chess engines?

As an endgame god, like it knows it all
As an opening god, like it knows it all
As a strategy god, like it knows it all
As a tactical god, like it knows it all (it generally does, but there are exceptions because of the horizon effect)

Instead, use it as:

A foolproof mate searcher
The best tactician in the world
A questionable advisor in strategical/positional evaluations

Don't forget that analysis and match play are different tasks. That humans can't beat engines in matches doesn't mean the same holds in analysis. These are actually very distinct tasks: analysis aims to find the best move with unlimited resources, while a match aims to win with finite resources, and to win it's enough to play better than your opponent. All competitive engines are tuned for matches, to achieve the highest possible success in engine tournaments.

Read also

A nice article about chess engines, relevant to this topic.
A very instructive article about interpreting engine evaluations.
A related topic: Why do chess engines sometimes miss good moves (or take forever to spot)?
Another related topic: Computer evaluations: How trustworthy are they?

Examples of Engine Fails, for Unsatisfied Readers

More examples of positions in which state-of-the-art engines can't find the correct move, which was found by human beings:

[Event "Linares "]
[White "Topalov, Veselin"]
[Black "Shirov, Alexei"]
[FEN "8/8/4kpp1/3p1b2/p6P/2B5/6P1/6K1 b - - 0 47"]

1...Bh3!!-+ {Shirov played this} ( 1...a3? {Komodo-9.3 with 50 ply, 5GB Hashtable, evaluates -2.58} )

Analysis of Shirov's game

I know it is your question, but I think it completely misses the point. The question was: when to use engines; I thought it was about when to use them to enhance learning, taking as a fact that the computer is perfect. Which indeed it is. All the drawbacks you mention were relevant in the 80's or 90's. Now even Carlsen does not hold a candle to the best computers; they are the best tactician, the best endgame player, the best opening player, and the best strategic players. The "example" you put forward is pathological and especially crafted; it never happens in real chess. — Ant, Dec 27 '15 at 12:18
@Ant Your claims are definitely wrong. Please search the internet about your claims, you will see. All I wrote in this answer is valid for today's chess. Anyway, I am going to update my answer, covering your claims, with concrete references to satisfy you. Thanks for letting me know that I need to update my answer. — ferit, Dec 27 '15 at 12:29
Alright, I will wait for your update. In any case, it is well known that the best computers are too strong for humans. I mean even your average grandmaster will be defeated 99 times out of 100, so if your ELO is below 2500 computers are basically the perfect player. — Ant, Dec 27 '15 at 12:46
Yes, you are right, engines are too strong today even for GMs, I don't claim otherwise. Are you sure that you read my post entirely? Or just read the title and downvoted? — ferit, Dec 27 '15 at 12:48
I read your post entirely. When you talk of horizon effect, for example; in today's computer, it won't happen. Of course the horizon effect is still there, but is so far away that for all scope and purposes of a human player (or at the very least a sub-2500 human player) it is basically absent. So if you're using a 3000-something chess engine, and you are a sub 2500 player, you can treat the computer as a endgame, strategical and positional god like it knows it all. Computers haven't solved chess, but they are so strong most of us can't tell the difference — Ant, Dec 27 '15 at 12:52
@Ant What you need to learn is, analysis and match are different. Humans cant beat engines in matches doesn't mean humans can't beat engines in analysis. Afterall, the purpose of analysis is different. And how can you claim that the horizon effect is not happening today, what's your argument? What's so far away? Please do some search before claiming these, please. If you believe that engines are gods of chess today, please burn all your endgame and opening books, as your engine knows it all anyway, you don't need them... — ferit, Dec 27 '15 at 12:58
Why exactly do you think that a match and analysis are so different? There is not a huge gap in my opinion. And of course I need endgame and opening books, because they explain why some position is bad and gives us humans some patterns to recognize or some familiar plans to develop. Computers may give you the correct move or correct evaluation but it won't tell you why, and there's a difference — Ant, Dec 27 '15 at 13:02
Can you please do these baseless claims in answer? If you believe that you are right, why don't you post an answer, so that we can learn? — ferit, Dec 27 '15 at 13:02
Maybe I will. By the way it's not like your claims are supported by hard evidence, so please add it in before calling my claims false ;-) — Ant, Dec 27 '15 at 13:04
@Ant Analysis aims to find the best move, within unlimited time. In match both sides aim to checkmate. To checkmate, you don't need to play best moves, it's enough to play better than your opponent. That's the difference. And engines are tuned for matches, do you know this? Especially competitive engines. — ferit, Dec 27 '15 at 13:05
I explain my arguments for my claims, I wrote a long post to explain everything. You are just claiming things without arguments and without explanation here. We can't go anywhere in such a discussion. So you had better write a good answer to criticise me. That would be more appropriate. — ferit, Dec 27 '15 at 13:08
I didn't, but again I don't think it makes a difference. It's not like the computer will blunder its queen hoping that a 900 elo opponent won't see it. If they are tuned, the difference in play will only be visible at very high levels. I am explaining my reasoning, by the way, which essentially boils down to: "Computers aren't perfect but any difference in play with respect to a "perfect player" will not be visible, neither in matches nor analysis, to a sub 2500 player because of the huge skill gap". Maybe I'll write an answer later but it is by no means inappropriate to tell why you think — Ant, Dec 27 '15 at 13:11
an answer is wrong in the comment sections. That's what comments are for. An answer like "I think Saibot's answer is wrong" is completely inappropriate, on the other hand. So maybe I'll make a more comprehensive answer later. In any case if you do not wish to continue the discussion, well that's up to you — Ant, Dec 27 '15 at 13:12
@Ant So you say, an 2500 player can't find a better move(comparing to engine) in any positions? :) Anyway, If you write an answer people can vote and comment on your answer. — ferit, Dec 27 '15 at 13:16
And this is basically wrong my friend :) Please write these in an answer. — ferit, Dec 27 '15 at 13:18
There was a game So -- Nakamura in the last Sinquefield Cup, where So fell for a prepared line in the King's Indian, which (current) engines evaluate as +1, but is actually losing for White. — Jester, Dec 27 '15 at 22:48
@Jester Thank for pointing out! Can you tell which move is it? I want to add it to my answers "Examples" section. — ferit, Dec 27 '15 at 22:59
@Saibot I agree with you that computers are not gods, and GMs can sometimes find better moves. However, for us mortals, engine analysis is the very best resource. Using engine analysis as supporting evidence shouldn't be criticized. Doctors sometimes give wrong advice, but for normal people, we must follow doctor's advice — jf328, Dec 30 '15 at 16:43
I agree with using engines as supporting evidence, that's the way they should be used. But I disagree with using engine outputs as final assessments. If the question is how to proceed in a given position, an engine suggestion is quite acceptable, but if the question is what's the BEST move here, then dumping engine output is stupid. That's what I'm trying to tell. There is a question which asks for the worst move for white on move 1, and dumps an engine output. Another one dumps an engine output as an answer. I know it's a bit long for a comment but, am I able to express myself clearly? — ferit, Dec 30 '15 at 16:57
One should simply think like this: Strong players lose against weaker players from time to time, and an engine is just a very strong player. Actually a player with known weaknesses. Would you accept every suggestion of a very strong player? No. Then why do you accept an engine's every move? Because engines are unbeatable? No, they are beatable by humans under some situations, and by other engines. — ferit, Dec 30 '15 at 17:04
This question-answer thing is great. Why was everyone so mad about it? Sailbot, I found this post really helpful. Thanks, man. — Joseph Farah, Dec 31 '15 at 23:54
@JosephFarah Thanks for appreciation! Don't forget to check extra articles at the bottom of the answer, where you can find more stuff about this. And about your question, I really don't know :-) — ferit, Jan 01 '16 at 00:17
Nice answer. Horizon effect is still there. Many problems with engine analysis like recognizing minefields that humans never cross. Better when our position is patzerproof (and opponent's not) than getting mysterious 0.2 points. Even if they were gods you still shouldn't play often it's first line as it can be hugely unpractical for humans. Also awful endgame play when approaching tablebase positions. http://chess.stackexchange.com/questions/16461/are-chess-engines-too-briliant-to-play-good-endgame — hoacin, Mar 26 '17 at 06:28
Thanks @hoacin . It's been a time, and I read what I wrote recently, I realized I forgot a very interesting thing! There is a thing in which engines are really bad, playing in lost positions. Why? Engines play best moves according to their evaluation metrics, however, in lost positions best moves are not 'best moves'. You have to play inferior tricky trappy moves to trick your opponent to get a chance. If you succeed, you can draw, or even you can win, if you fail, you will lose more advantage, but there is nothing to lose at all, you were losing anyway. Gonna add this when I'm not lazy. :D — ferit, Mar 27 '17 at 17:22
