
Here, it is shown that 90 of the 960 positions require moving a rook on one side before castling on the other side (unless that rook has already been captured). These 90 positions come from 18 arrangements in each of 5 groups of starting positions (SPs), where X is any piece other than a king or a rook:

  1. RKRXXXXX
  2. RKXRXXXX
  3. XRKRXXXX
  4. XXXXXRKR
  5. RXKRXXXX
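The "18 per group, 90 in total" count can be checked by brute force. The following is a minimal sketch, assuming the usual Chess960 constraints (bishops on opposite-coloured squares, king between the two rooks) and reading X as "queen, bishop or knight". The names `is_chess960`, `ALL_SPS`, `GROUPS` and `CHESS90` are illustrative, not taken from any source linked above.

```python
from itertools import permutations
import re


def is_chess960(sp: str) -> bool:
    """True if the back-rank string obeys the Chess960 placement rules."""
    b1, b2 = [i for i, p in enumerate(sp) if p == "B"]
    r1, r2 = [i for i, p in enumerate(sp) if p == "R"]
    # Opposite-coloured bishops: their file indices differ in parity.
    # The king must stand somewhere between the two rooks.
    return (b1 + b2) % 2 == 1 and r1 < sp.index("K") < r2


# All distinct back ranks (the set removes duplicates from repeated pieces).
ALL_SPS = sorted(
    {"".join(p) for p in permutations("KQRRBBNN") if is_chess960("".join(p))}
)

# The five groups, with X translated to "any of Q, B, N".
GROUPS = {
    name: re.compile(name.replace("X", "[QBN]"))
    for name in ("RKRXXXXX", "RKXRXXXX", "XRKRXXXX", "XXXXXRKR", "RXKRXXXX")
}

# Number of SPs per group, and the union of all five groups.
COUNTS = {
    name: sum(1 for sp in ALL_SPS if rx.fullmatch(sp))
    for name, rx in GROUPS.items()
}
CHESS90 = {sp for sp in ALL_SPS if any(rx.fullmatch(sp) for rx in GROUPS.values())}

print(len(ALL_SPS), COUNTS, len(CHESS90))
```

Because the king and rook squares differ between the five patterns, no SP can belong to two groups, so the union size is simply the sum of the per-group counts.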

Call these 90 SPs chess90 and the remaining 870 SPs chess870. Based on the Sesse evals for the 960 SPs (see here also), or on similar sources (such as practical rather than theoretical statistics like win rates, e.g. this (computer chess) [again see here]), how much larger is White's advantage in chess90 than in chess870?

Note 1: The question is not vacuous. If White actually has a bigger advantage in chess870 than in chess90, then the answer is simply negative.

Note 2: I'm answering below for the Sesse evals. I have yet to answer, or receive an answer, for the computer chess statistics.

BCLC
  • 1,892
  • 1
  • 17
  • 46

1 Answer


Group of SPs, followed by its average evaluation (rounded to 4 decimal places):

  1. RKRXXXXX - 0.1900
  2. RKXRXXXX - 0.1906
  3. XRKRXXXX - 0.1878
  4. XXXXXRKR - 0.2017
  5. RXKRXXXX - 0.1867
  6. Whole of Chess960 - 0.1801
  7. Whole of Chess90 - 0.1913
  8. Whole of Chess870 - 0.1790

Therefore,

(0.1913-0.1790)/0.1790 = 0.06871508379 ~ 6.87%, almost 7%.
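The arithmetic above can be reproduced with a few lines of Python. This is only a sketch using the rounded averages quoted in this answer. The helper name `relative_increase` is illustrative, and the pooled average is an extra consistency check (a size-weighted mean of the chess90 and chess870 averages should land close to the chess960 average).

```python
def relative_increase(a: float, b: float) -> float:
    """Relative difference of a over b, i.e. (a - b) / b."""
    return (a - b) / b


# Rounded averages quoted in this answer.
CHESS90_AVG = 0.1913
CHESS870_AVG = 0.1790
CHESS960_AVG = 0.1801

# White's relative extra advantage in chess90 compared to chess870.
increase = relative_increase(CHESS90_AVG, CHESS870_AVG)

# Consistency check: the 90 and 870 SPs together make up all 960, so
# the size-weighted mean should match the chess960 average up to the
# rounding of the inputs.
pooled = (90 * CHESS90_AVG + 870 * CHESS870_AVG) / 960

print(f"relative increase: {increase:.2%}")
print(f"pooled average:    {pooled:.5f} (reported: {CHESS960_AVG})")
```

The pooled value agrees with the reported 0.1801 to within the rounding of the two inputs, which suggests the three averages are mutually consistent.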


Remarks:

  1. Chess960 has a higher evaluation than chess870, but only by (0.1801-0.1790)/(0.1790) = 0.00614525139 ~ 0.6%.

  2. Chess90 and each of its 5 SP groups have an average evaluation higher than the average evaluation of both chess960 and chess870.

  3. SP 518 (the standard chess starting position) is evaluated at 0.22, which is higher than each of the 8 averages above.


In case you want to verify for yourself, here are my Google Sheets formulas for the 5 SP groups of chess90 (each returns 1 if the SP string in cell B2 belongs to the group, and 0 otherwise):

  1. RKRXXXXX - =IF(LEFT(B2;3)="RKR";1;0)
  2. RKXRXXXX - =IF(AND(LEFT(B2;2)="RK";MID(B2;4;1)="R");1;0)
  3. XRKRXXXX - =IF(MID(B2;2;3)="RKR";1;0)
  4. XXXXXRKR - =IF(RIGHT(B2;3)="RKR";1;0)
  5. RXKRXXXX - =IF(AND(LEFT(B2;1)="R";MID(B2;3;2)="KR");1;0)
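For readers without a spreadsheet, the same membership tests can be sketched in Python. This assumes the evaluations are available as a dict mapping each SP string to its evaluation. How that dict is loaded is not shown, and the names `GROUP_TESTS`, `group_of` and `average_evals` are illustrative only.

```python
from statistics import mean

# Python equivalents of the five sheet formulas. MID is 1-based in the
# sheet, so MID(B2;4;1) becomes sp[3] and MID(B2;2;3) becomes sp[1:4].
GROUP_TESTS = {
    "RKRXXXXX": lambda sp: sp[:3] == "RKR",
    "RKXRXXXX": lambda sp: sp[:2] == "RK" and sp[3] == "R",
    "XRKRXXXX": lambda sp: sp[1:4] == "RKR",
    "XXXXXRKR": lambda sp: sp[-3:] == "RKR",
    "RXKRXXXX": lambda sp: sp[0] == "R" and sp[2:4] == "KR",
}


def group_of(sp):
    """Return the chess90 group name for an SP string, or None."""
    for name, test in GROUP_TESTS.items():
        if test(sp):
            return name
    return None


def average_evals(evals):
    """Average evaluation per group, for chess90, chess870 and chess960.

    `evals` maps an 8-character SP string to its evaluation.
    Groups with no SPs in `evals` are left out of the result.
    """
    buckets = {name: [] for name in GROUP_TESTS}
    rest = []  # evaluations of SPs outside all five groups
    for sp, ev in evals.items():
        name = group_of(sp)
        (buckets[name] if name else rest).append(ev)

    out = {name: mean(vals) for name, vals in buckets.items() if vals}
    in90 = [ev for vals in buckets.values() for ev in vals]
    if in90:
        out["chess90"] = mean(in90)
    if rest:
        out["chess870"] = mean(rest)
    out["chess960"] = mean(list(evals.values()))
    return out
```

Calling `average_evals` on the full table of 960 evaluations should give averages comparable to the eight listed above, provided the input data is the same.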
BCLC
  • Are those differences meaningful though? What's the difference between a +0.18 position and a +0.20 one? – David Jan 20 '22 at 19:55
  • @BCLC This percentage doesn't tell the whole picture or even close: there's a 200% difference between +0.01 and +0.03 but they're hardly distinguishable as chess engine evaluations. A better way would be to hypothesis test. (That said, it's an interesting question you ask and have started analysing.) – Mobeus Zoom Jan 20 '22 at 20:32
  • @MobeusZoom oh lol you're right. thanks – BCLC Jan 21 '22 at 09:07
  • @David anyway where did you get 0.18 and 0.20 ? i computed/calculated a 7% difference '(0.1913-0.1790)/0.1790 = 0.06871508379 ~ 6.87%, almost 7%' i.e. a chess90 SP is 7% more powerful for white compared to a chess870 SP Edit: Ah you mean XXXXXRKR and chess870 ? – BCLC Jan 21 '22 at 09:08
  • Alright, call it 0.1913 and 0.1790 then. Same point applies – David Jan 23 '22 at 00:49
  • @David well i guess maybe not so meaningful based on comment of Mobeus Zoom – BCLC Jan 23 '22 at 06:10
  • @MobeusZoom even with hypothesis testing, a statistically significant difference is not the same as a practically relevant one. You could get a statistically significant result just by adding more and more possible starting positions, no matter how tiny the average difference is. But that wouldn't make the actual difference more relevant – David Jan 23 '22 at 11:42
  • @David and MobeusZoom how would i do a hypothesis test here? hypothesis testing sounds more relevant for those computer chess winning percentages than for theoretical evaluations. like here i just have 2 numbers 0.1913 and 0.1790. hypothesis testing seems to be more for like having much more data to have a sample size n. looks like sample size is just 2 points here idk. – BCLC Jan 23 '22 at 14:33
  • @David It is up to BCLC, not you, how he wishes to define sufficient advantage to be "relevant" in characterising the positions. I don't see his post suggesting that a computer-judged advantage is irrelevant if it doesn't translate to significantly disparate practical results. Given the metric for advantage is BCLC's choice, the point is only regarding the methodology. – Mobeus Zoom Jan 23 '22 at 18:53
  • @BCLC The sample size is much greater than 2. You want to compare sets of positions (of size 90 and 870) if I'm not mistaken. – Mobeus Zoom Jan 23 '22 at 18:54
  • @MobeusZoom ok so what is the/a right way to compare the 90 points vs the 870 points ? Edit: Ah wait i think i remember: you mean null hypothesis test that average mean of 1 set is less than or is unequal to or whatever the mean of the other set? but these 90 and 870 are not exactly 'samples' of 'things'....they are the actual things. afaiu, hypothesis testing assumes there's like a real 'thing' and then the data we have are samples of the real things – BCLC Jan 23 '22 at 19:39
  • @MobeusZoom Update: https://stats.stackexchange.com/questions/561590/can-you-do-hypothesis-testing-when-instead-of-a-sample-size-you-have-actual – BCLC Jan 23 '22 at 20:33