Ticket #41 (closed task: fixed)

Opened 3 years ago

Last modified 5 months ago

Adapt semeai test suite STS-RV for GNU Go.

Reported by: gunnar
Owned by: gnugo
Priority: normal
Milestone: 3.7.12
Component: regressions
Version:
Severity: minor
Keywords: semeai
Cc:
patch: no

Description (last modified by gunnar)

At http://gobase.org/reading/preview/Semeai/#STS there is a very comprehensive semeai test suite (722 tests!) compiled by Ricard Vilà. It is in GTP format but it's not a perfect match for GNU Go because it uses a custom command called solve_semeaiS, specified by:

#/***********************
# * Solving Semeai      *
# ***********************/
#/* Function:  Decide Semeai Status and move to play.
# * Arguments: vertex for essential white/black block, vertex for essential black/white block.
# * First color is assumed to play.
# * Fails:     invalid vertex, empty vertex, vertices of same colors
# * Returns:   Semeai status (if play) followed by move to play.
# * 0=Looser, 1=Winner, 2=Seki, 3=Unknown, 4=Ko
# */
# gtp command: solve_semeaiS
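
For readers who want to work with these codes in a script, the status values from the comment block above can be sketched as a small Python enum. The class name, the function name, and the assumed reply layout (a status code optionally followed by a move) are illustrative assumptions and are not part of the STS-RV suite or of GNU Go.

```python
from enum import IntEnum


class SemeaiStatus(IntEnum):
    """Status codes listed in the solve_semeaiS comment block."""
    LOSER = 0
    WINNER = 1
    SEKI = 2
    UNKNOWN = 3
    KO = 4


def parse_solve_semeais_reply(reply: str):
    """Split a reply such as '1 C3' into (status, move).

    Assumption: the reply is a status code, optionally followed by the
    move to play, as suggested by the 'Returns' line of the comment block.
    """
    fields = reply.split()
    status = SemeaiStatus(int(fields[0]))
    move = fields[1] if len(fields) > 1 else None
    return status, move


print(parse_solve_semeais_reply("1 C3"))
```

The vertex C3 in the usage line is a made-up example value.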

It would be good to have test files using the normal GNU Go commands.

Attached is an implementation of solve_semeaiS for GNU Go, but I don't propose that we include it in GNU Go.


This test suite, modified to use the usual GNU Go GTP semeai command, has now been added to the GNU Go distribution. The tests taken from "Get Strong at Tesuji" and "Counting Liberties and Winning Capturing Races" are excluded because it is unclear exactly how far the permissions given to Ricard Vilà by Richard Bozulich and Richard Hunter extend. The excluded tests can still be found in the attachments to this ticket, however.

Attachments

solve_semeaiS.diff (3.6 kB) - added by gunnar 3 years ago.
Implementation of solve_semeaiS for GNU Go
games_STS-RV.zip (82.0 kB) - added by cisba 9 months ago.
STS-RV_0.tst (3.6 kB) - added by cisba 9 months ago.
STS-RV_e.tst (13.8 kB) - added by cisba 9 months ago.
STS-RV_Misc.tst (5.0 kB) - added by cisba 9 months ago.
STS-RV_GSAT.tst (6.1 kB) - added by cisba 9 months ago.
STS-RV_RH.tst (6.5 kB) - added by cisba 9 months ago.
STS2GNUGo.txt (4.8 kB) - added by cisba 9 months ago.
STS-RV_1.tst (12.7 kB) - added by cisba 8 months ago.

Regression Results

Attachment Rev. PASS FAIL Nodes Status
solve_semeaiS.diff 2381 0% 0% 0% builds with warning(s)

Change History

  Changed 3 years ago by gunnar

  • type changed from defect to task

Changed 3 years ago by gunnar

Implementation of solve_semeaiS for GNU Go

  Changed 3 years ago by gunnar

  • description modified

  Changed 9 months ago by cisba

The file STS-RV_0.tst uses the normal GNU Go command (analyze_semeai) and makes it possible to run all the tests in semeais_0.tst from the original suite.

The file games_STS-RV.zip contains a folder named STS-RV that goes in the regression/games directory.

  Changed 9 months ago by cisba

In some problems GNU Go suggests a move instead of PASS, but the result of the semeai analysis is correct. To avoid a failure that is not relevant to the test, the move is ignored using a regular expression: #? [x y (.*)]
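
The effect of relaxing the expected result can be sketched in Python as follows. This is only an illustration: it assumes that the text inside `#? [...]` is treated as a regular expression that has to match the whole answer, and the helper name `matches_expected` is hypothetical, not part of the GNU Go regression tools.

```python
import re


def matches_expected(expected: str, actual: str) -> bool:
    """Return True if the whole answer matches the expected pattern.

    Assumption: the expected text is a regular expression that must
    match the entire answer, not just a prefix of it.
    """
    return re.fullmatch(expected, actual.strip()) is not None


# Strict expectation: PASS is the only accepted move.
print(matches_expected(r"1 1 PASS", "1 1 Q16"))   # False
# Relaxed expectation: any move is accepted, only the status pair counts.
print(matches_expected(r"1 1 (.*)", "1 1 Q16"))   # True
```

With the relaxed pattern, a test fails only when the two status values differ from the expected ones.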

File STS-RV_0.tst has been replaced with this revision.

The result is now: Summary: 25/26 passes. 0 unexpected passes, 1 unexpected failure

follow-up: ↓ 13   Changed 9 months ago by cisba

Sorry, I forgot preformat tags in the last message ...

Second semeai file: STS-RV_1.tst

I revised 12 cases, as shown below after the "==>" mark:

4 FAILED: Correct '1 1 (A10|A12|B11)', got '1 1 B8' ==> 1 1 (A10|A12|B11|A8|B8) 
6 FAILED: Correct '1 1 (J2|K1|H1)', got '1 1 M1' ==> 1 1 (J2|K1|H1|M2|M1) 
8 FAILED: Correct '1 1 (S8|T9|T7)', got '1 1 T11' ==> 1 1 (S8|T9|T7|R11|S11|T11)
17 FAILED: Correct '1 1 (D14|C14|C13|C12|C11|B10)', got '1 1 B19' ==> 1 1 (D14|C14|C13|C12|C11|B10|B19) # B19 irrelevant but not wrong
61 FAILED: Correct '1 1 T3', got '1 1 O2' ==> 1 1 (T3|O1|O2|P1|P3|R1) 
113 FAILED: Correct '1 1 B2', got '1 1 C3' ==> 1 1 (B2|C3|D2) 
129 FAILED: Correct '1 1 (S18|T16)', got '1 1 T14' ==> 1 1 (S18|T16|T14) 
135 FAILED: Correct '1 1 C2', got '1 1 D2' ==> 1 1 (C2|D2) 
137 FAILED: Correct '1 1 R3', got '1 1 O1' ==> 1 1 (R3|O1) 
163 FAILED: Correct '1 1 C18', got '1 1 C12' ==> 1 1 (C18|D17) 
170 FAILED: Correct '1 0 K18', got '1 0 L17' ==> 1 0 (K18|L17) 
186 FAILED: Correct '1 1 J2', got '1 1 C4' ==> 1 1 (J2|B2|B3|D3)

In cases 163 and 186 GNU Go fails regardless of the revision, but the test was wrong. In the other cases the original test file was wrong; after the revision GNU Go passes the test.
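
A listing in this format can be checked mechanically. The sketch below parses one "FAILED ... ==>" line and reports whether the answer GNU Go gave is accepted by the original and by the revised pattern. It assumes the patterns are regular expressions that must match the whole answer; the names `LINE_RE` and `check_revision` are hypothetical helpers, not part of the regression suite.

```python
import re

LINE_RE = re.compile(
    r"(?P<test>\d+) FAILED: Correct '(?P<old>[^']*)', got '(?P<got>[^']*)'"
    r" ==> (?P<new>.*)"
)


def check_revision(line: str):
    """Return (test number, old pattern accepts?, revised pattern accepts?)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    # Drop a trailing '# ...' remark from the revised pattern, if any.
    new = m["new"].split("#")[0].strip()
    got = m["got"]
    return (
        int(m["test"]),
        re.fullmatch(m["old"], got) is not None,
        re.fullmatch(new, got) is not None,
    )


print(check_revision("135 FAILED: Correct '1 1 C2', got '1 1 D2' ==> 1 1 (C2|D2)"))
# (135, False, True)
print(check_revision("163 FAILED: Correct '1 1 C18', got '1 1 C12' ==> 1 1 (C18|D17)"))
# (163, False, False)
```

The second result agrees with the remark that case 163 still fails after the revision.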

  Changed 9 months ago by cisba

File: STS-RV_GSAT.tst

I revised 3 cases (see after the "==>" mark):

2 FAILED: Correct '1 1 (B1|A2)', got '1 1 C1' ==> 1 1 (B1|A2|C1)
60 FAILED: Correct '1 1 (P2|P3|Q1|R1|S2|S3)', got '1 1 T2' ==> 1 1 (P2|P3|Q1|R1|S2|S3|T2)
68 FAILED: Correct '1 1 (P1|P2|O3|S1|S2)', got '1 1 T3' ==> 1 1 (P1|P2|O3|S1|T2|T3)

Result: Summary: 83/88 passes. 0 unexpected passes, 5 unexpected failures

  Changed 9 months ago by cisba

File: STS-RV_e.tst

I made 5 revisions:

127 FAILED: Correct '1 0 B4', got '1 0 C6' ==> 1 0 (B4|C6)
128 FAILED: Correct '1 1 B4', got '1 1 E3' ==> 1 1 (B4|E3)
168 FAILED: Correct '1 1 (T2|S1)', got '1 1 S2' ==> 1 1 (T2|S1|S2)
181 FAILED: Correct '1 1 C1', got '1 1 A1' ==> 1 1 (C1|A1)
201 FAILED: Correct '1 0 L2', got '1 0 M1' ==> 1 1 (M1|L2)

Summary: 209/252 passes. 0 unexpected passes, 43 unexpected failures

  Changed 9 months ago by cisba

File: STS-RV_Misc.tst

I made one revision:

51 FAILED: Correct '1 1 (O14|N14|M14|K14|H12)', got '1 1 F11' ==> 1 1 (O14|N14|M14|K14|H12|F11)

Summary: 35/54 passes. 0 unexpected passes, 19 unexpected failures

  Changed 9 months ago by cisba

Added 2 lines of credits to the file STS-RV_GSAT.tst:

# Problems taken from Get Strong At Tesuji, Richard Bozulich
# Kiseido Publishing Company

  Changed 9 months ago by cisba

File STS-RV_RH.tst

7 FAILED: Correct '1 1 (F19|G19|H19)', got '1 1 E19' ==> 1 1 (F19|G19|H19|E19)
8 FAILED: Correct '1 1 (C19|D19|E19)', got '1 1 F19' ==> 1 1 (C19|D19|E19|F19)
16 FAILED: Correct '1 1 PASS', got '1 1 Q16' ==> 1 1 (PASS|Q16)
39 FAILED: Correct '1 1 Q1', got '1 1 R1' ==> 3 3 (Q1|R1)
40 FAILED: Correct '1 1 Q1', got '1 1 T2' ==> 3 3 (Q1|R1|T2)
53 FAILED: Correct '1 0 G14', got '1 0 J19' ==> 1 0 (G14|J19|J18|J17|H16)
60 FAILED: Correct '1 1 (G1|K1|D1)', got '1 1 J7' ==> 1 1 (G1|K1|D1|J7)
69 FAILED: Correct '1 1 N17', got '1 1 O19' ==> 1 1 (N17|O19)
78 FAILED: Correct '1 1 C12', got '1 1 C7' ==> 1 1 (C12|A7|B7|C7|D8|D9|D10|D11)
88 FAILED: Correct '1 1 (E18|G19|D16|G18|E19)', got '1 1 H19' ==> 1 1 (E18|G19|D16|G18|E19|H19)
90 FAILED: Correct '1 1 (H1|G1|E3|K5|L7)', got '1 1 J5' ==> 1 1 (H1|G1|E3|K5|L7|J5)

NOTE: Tests 39-40 are the only test cases in which I was forced to change the original result value (from [1 1] to [3 3]). I think a black stone at T2 is missing from the diagram, so if someone owns Richard Hunter's book, please check this diagram (pag147_P05P08.sgf) and report back.

I also suspect a defect in the diagram for test 78.

  Changed 9 months ago by cisba

General summary: 565 passes, 157 fails, 722 total.
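
As a quick cross-check of these totals, the per-file summary lines quoted earlier in this ticket can be added up with a short Python sketch. It assumes that the per-file figures and the general summary refer to the same revisions of the test files, which the ticket does not state explicitly. No summary line was quoted for STS-RV_1.tst or STS-RV_RH.tst, so their combined figures are derived only by subtraction.

```python
import re

SUMMARY_RE = re.compile(r"Summary: (\d+)/(\d+) passes")

# Per-file summaries as quoted in the comments of this ticket.
reported = {
    "STS-RV_0.tst": "Summary: 25/26 passes. 0 unexpected passes, 1 unexpected failure",
    "STS-RV_GSAT.tst": "Summary: 83/88 passes. 0 unexpected passes, 5 unexpected failures",
    "STS-RV_e.tst": "Summary: 209/252 passes. 0 unexpected passes, 43 unexpected failures",
    "STS-RV_Misc.tst": "Summary: 35/54 passes. 0 unexpected passes, 19 unexpected failures",
}

passes = tests = 0
for summary in reported.values():
    p, t = map(int, SUMMARY_RE.search(summary).groups())
    passes += p
    tests += t

total_passes, total_fails, total_tests = 565, 157, 722
assert total_passes + total_fails == total_tests

print(passes, tests)                                # 352 420
print(total_passes - passes, total_tests - tests)   # 213 302
print(f"{total_passes / total_tests:.1%}")          # 78.3%
```

Under that assumption, STS-RV_1.tst and STS-RV_RH.tst together would account for 213 passes out of 302 tests.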

I added the STS2GNUGo.txt file, which explains the method and includes some comments.

in reply to: ↑ 6 ; follow-up: ↓ 14   Changed 8 months ago by gunnar

Replying to cisba:

Sorry, I forgot preformat tags in the last message ... Second semeai file: STS-RV_1.tst

In 4,6,8 it's also possible to play B12, H2, and S9 respectively.

In 170 M18 also works.

in reply to: ↑ 13 ; follow-up: ↓ 15   Changed 8 months ago by cisba

Replying to gunnar:

Replying to cisba:

Sorry, I forgot preformat tags in the last message ... Second semeai file: STS-RV_1.tst

In 4,6,8 it's also possible to play B12, H2, and S9 respectively. In 170 M18 also works.

Ok, I think many tests could have the same "defect". But I focused on porting the STS-RV suite to GNU Go, not on revising the suite in depth, so I fixed a defect only when I detected a false FAIL result.

Well, I had the feeling that the STS-RV suite was not very accurate, but I assumed it was "good enough".

A full revision could take many months. So before revising the suite, it is better to decide whether it is necessary, and how deep it should go.

Let me know what you think.

For now I have uploaded an STS-RV_1.tst file with your fixes.

Of course, I will be happy to receive suggestions about other defects, and to fix them.

Changed 8 months ago by cisba

in reply to: ↑ 14 ; follow-up: ↓ 16   Changed 8 months ago by gunnar

Replying to cisba:

Ok, I think many tests could have the same "defect". But I focused on porting the STS-RV suite to GNU Go, not on revising the suite in depth, so I fixed a defect only when I detected a false FAIL result.

Me too. Those were cases which turned up when I tried some code modifications. :-)

Well, I had the feeling that the STS-RV suite was not very accurate, but I assumed it was "good enough". A full revision could take many months. So before revising the suite, it is better to decide whether it is necessary, and how deep it should go. Let me know what you think.

There's nothing urgent about it. It can be fixed when problems appear.

in reply to: ↑ 15   Changed 8 months ago by gunnar

Replying to gunnar:

Me too. Those were cases which turned up when I tried some code modifications. :-)

Or maybe I misremembered. It's more likely I found those when I reviewed your modifications. Those all looked right, by the way.

  Changed 5 months ago by gunnar

  • milestone changed from 3.8 to 3.7.12

  Changed 5 months ago by gunnar

  • status changed from new to closed
  • resolution set to fixed
  • description modified

See the addendum to the description for the status of inclusion in the GNU Go distribution. I'm closing this ticket now. Reopen if new information appears that can shed new light on the exact permissions.