Ticket #41: STS2GNUGo.txt

File STS2GNUGo.txt, 4.8 kB (added by cisba, 13 months ago)
Some notes on the criteria I adopted
in porting the STS-RV test suite to GNU Go.

1) The first conversion step
is done with this awk script:

<awk-script>
#!/bin/awk -f
# skip comment lines, but keep the "#?" lines
/^#[^?]/ { next }
/^# *$/ { next }
{
    # replace path and function
    gsub(/Sgf_test_files/, "games/STS-RV")
    gsub(/solve_semeaiS/, "analyze_semeai")
    # translate the result code of solve_semeaiS;
    # '@' marks the codes that need manual editing
    gsub(/\[0 /, "[0 0 ")
    gsub(/\[1 /, "[1 1 ")
    gsub(/\[2 /, "[1 0 ")
    gsub(/\[3 /, "[3 @ ")
    gsub(/\[4 /, "[4 @ ")
    # remove the expected-failure mark
    gsub(/\]\*/, "]")
    print
}
</awk-script>
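As an illustration, the sketch below feeds a made-up test through
the same rules. The sample file name, test number and vertices are
invented, and the layout of the input (a loadsgf line, a numbered
command, a "#?" line with the expected result) is only an assumption
about how the original STS-RV tests look.

```sh
#!/bin/sh
# Same rules as the conversion script, wrapped in a shell function
# so they can be tried on standard input.
convert_sample() {
    awk '
    /^#[^?]/ { next }
    /^# *$/  { next }
    {
        gsub(/Sgf_test_files/, "games/STS-RV")
        gsub(/solve_semeaiS/, "analyze_semeai")
        gsub(/\[0 /, "[0 0 ")
        gsub(/\[1 /, "[1 1 ")
        gsub(/\[2 /, "[1 0 ")
        gsub(/\[3 /, "[3 @ ")
        gsub(/\[4 /, "[4 @ ")
        gsub(/\]\*/, "]")
        print
    }'
}

# Hypothetical input: every name and vertex below is invented.
convert_sample <<'EOF'
# an ordinary comment, dropped by the script
loadsgf Sgf_test_files/example.sgf
1 solve_semeaiS A1 B2
#? [2 C3]*
EOF
# prints:
#   loadsgf games/STS-RV/example.sgf
#   1 analyze_semeai A1 B2
#   #? [1 0 C3]
```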

2) Then I manually edited the special cases
marked with a '@', because the Ko result (4) could
be good [2 x ...] or bad [3 x ...].
The Unknown result (3) of solve_semeaiS never
occurred as an expected reply in the original tests.
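A minimal sketch of how the lines still carrying the '@' marker
could be located for this manual pass. The function name is
hypothetical and not part of the original procedure.

```sh
#!/bin/sh
# Print every line that still contains a "[3 @ " or "[4 @ " placeholder,
# prefixed with its file name and line number.
list_placeholders() {
    awk '/\[[34] @ / { print FILENAME ":" FNR ": " $0 }' "$@"
}
```

It would be called on the converted files, for example
list_placeholders STS-RV_*.tst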

3) When the analysis response was correct
but the move was not in the expected set,
I checked whether the move was a good reply anyway.
This is because the set was often too small: the
move might be not wrong but simply not the
best, or simply missing from the original test
although good and valid.

This revision included the very frequent case
where the test expected a PASS reply and GNU Go played
some move elsewhere, not relevant for the semeai
but not wrong. The workaround suggested by Gunnar
is a regexp test like [x y (.*)].
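The effect of the relaxed pattern can be sketched as follows.
The helper name is hypothetical, and treating the expected result
as an anchored regular expression is an assumption of this sketch,
not a description of how the GNU Go regression driver does the match.

```sh
#!/bin/sh
# Succeed when RESULT ($1) matches PATTERN ($2) as an anchored regexp.
matches() {
    awk -v result="$1" -v pattern="$2" \
        'BEGIN { exit !(result ~ ("^" pattern "$")) }'
}

matches "1 1 PASS" "1 1 PASS" && echo "PASS accepted by the strict test"
matches "1 1 Q16"  "1 1 PASS" || echo "Q16 rejected by the strict test"
matches "1 1 Q16"  "1 1 (.*)" && echo "Q16 accepted by the relaxed test"
```

The relaxed test still checks the two result codes and only stops
constraining the move.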

In the other cases I simply appended the valid move(s)
to the expected set of good replies.
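Appending a move to an expected set can be sketched like this.
The function name is hypothetical; the edits were made by hand,
so this only shows the shape of the change.

```sh
#!/bin/sh
# Widen an expected result such as "1 1 T3" or "1 1 (A10|A12|B11)"
# with one more acceptable move. A move already present is not repeated.
add_move() {
    awk -v expected="$1" -v move="$2" 'BEGIN {
        # expected is "<code> <code> <move or (move|move|...)>"
        split(expected, part, " ")
        moves = part[3]
        # drop the parentheses around an existing alternation
        gsub(/^\(|\)$/, "", moves)
        k = split(moves, m, "[|]")
        for (i = 1; i <= k; i++)
            if (m[i] == move) { print expected; exit }
        print part[1], part[2], "(" moves "|" move ")"
    }'
}

add_move "1 1 (S18|T16)" "T14"   # prints: 1 1 (S18|T16|T14)
add_move "1 1 C2" "D2"           # prints: 1 1 (C2|D2)
```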

NOTE: I implicitly assumed that if a test passes
then the test is well defined. This is not true in
general, because after future changes to the
code of GNU Go it may play
a correct (but different) move not included in the
set of good replies (as was often detected and
revised as described above).

To avoid this, *each* test would have to be revised
to verify that *all* good moves are included.
That requires too much work, so it is better
to benefit from the STS-RV test suite
anyway, despite this small loss of accuracy.

-----------

Here is the complete list of changes made to
the original test suite:


# File: STS-RV_1.tst

4 FAILED: Correct '1 1 (A10|A12|B11)', got '1 1 B8' ==> 1 1 (A10|A12|B11|A8|B8)
6 FAILED: Correct '1 1 (J2|K1|H1)', got '1 1 M1' ==> 1 1 (J2|K1|H1|M2|M1)
8 FAILED: Correct '1 1 (S8|T9|T7)', got '1 1 T11' ==> 1 1 (S8|T9|T7|R11|S11|T11)
17 FAILED: Correct '1 1 (D14|C14|C13|C12|C11|B10)', got '1 1 B19' ==> 1 1 (D14|C14|C13|C12|C11|B10|B19) # B19 irrelevant but not wrong
61 FAILED: Correct '1 1 T3', got '1 1 O2' ==> 1 1 (T3|O1|O2|P1|P3|R1)
113 FAILED: Correct '1 1 B2', got '1 1 C3' ==> 1 1 (B2|C3|D2)
129 FAILED: Correct '1 1 (S18|T16)', got '1 1 T14' ==> 1 1 (S18|T16|T14)
135 FAILED: Correct '1 1 C2', got '1 1 D2' ==> 1 1 (C2|D2)
137 FAILED: Correct '1 1 R3', got '1 1 O1' ==> 1 1 (R3|O1)
163 FAILED: Correct '1 1 C18', got '1 1 C12' ==> 1 1 (C18|D17)
170 FAILED: Correct '1 0 K18', got '1 0 L17' ==> 1 0 (K18|L17)
186 FAILED: Correct '1 1 J2', got '1 1 C4' ==> 1 1 (J2|B2|B3|D3)


# File: STS-RV_GSAT.tst

2 FAILED: Correct '1 1 (B1|A2)', got '1 1 C1' ==> 1 1 (B1|A2|C1)
60 FAILED: Correct '1 1 (P2|P3|Q1|R1|S2|S3)', got '1 1 T2' ==> 1 1 (P2|P3|Q1|R1|S2|S3|T2)
68 FAILED: Correct '1 1 (P1|P2|O3|S1|S2)', got '1 1 T3' ==> 1 1 (P1|P2|O3|S1|T2|T3)


# File: STS-RV_e.tst

127 FAILED: Correct '1 0 B4', got '1 0 C6' ==> 1 0 (B4|C6)
128 FAILED: Correct '1 1 B4', got '1 1 E3' ==> 1 1 (B4|E3)
168 FAILED: Correct '1 1 (T2|S1)', got '1 1 S2' ==> 1 1 (T2|S1|S2)
181 FAILED: Correct '1 1 C1', got '1 1 A1' ==> 1 1 (C1|A1)
201 FAILED: Correct '1 0 L2', got '1 0 M1' ==> 1 1 (M1|L2)


# File: STS-RV_Misc.tst

51 FAILED: Correct '1 1 (O14|N14|M14|K14|H12)', got '1 1 F11' ==> 1 1 (O14|N14|M14|K14|H12|F11)


# File: STS-RV_RH.tst

7 FAILED: Correct '1 1 (F19|G19|H19)', got '1 1 E19' ==> 1 1 (F19|G19|H19|E19)
8 FAILED: Correct '1 1 (C19|D19|E19)', got '1 1 F19' ==> 1 1 (C19|D19|E19|F19)
16 FAILED: Correct '1 1 PASS', got '1 1 Q16' ==> 1 1 (PASS|Q16)
39 FAILED: Correct '1 1 Q1', got '1 1 R1' ==> 3 3 (Q1|R1)
40 FAILED: Correct '1 1 Q1', got '1 1 T2' ==> 3 3 (Q1|R1|T2)
53 FAILED: Correct '1 0 G14', got '1 0 J19' ==> 1 0 (G14|J19|J18|J17|H16)
60 FAILED: Correct '1 1 (G1|K1|D1)', got '1 1 J7' ==> 1 1 (G1|K1|D1|J7)
69 FAILED: Correct '1 1 N17', got '1 1 O19' ==> 1 1 (N17|O19)
78 FAILED: Correct '1 1 C12', got '1 1 C7' ==> 1 1 (C12|A7|B7|C7|D8|D9|D10|D11)
88 FAILED: Correct '1 1 (E18|G19|D16|G18|E19)', got '1 1 H19' ==> 1 1 (E18|G19|D16|G18|E19|H19)
90 FAILED: Correct '1 1 (H1|G1|E3|K5|L7)', got '1 1 J5' ==> 1 1 (H1|G1|E3|K5|L7|J5)


NOTE: Tests 39-40 are the only test cases in which I was forced
to change the original result value (from [1 1] to [3 3]).
I think a black stone at T2 is missing from the diagram,
so if someone owns the book by Richard Hunter, please
check this diagram (pag147_P05P08.sgf) and report back.

Also for test n. 78 I suspect a defect in the diagram.