Ticket #127 (new enhancement)

Opened 2 years ago

Last modified 2 years ago

fourlib-depth 8 proposed as default (instead of 7)

Reported by: alain Assigned to: gnugo
Priority: normal Milestone: 3.8
Component: source Version:
Severity: normal Keywords:
Cc: patch: 1

Description (Last modified by alain)

Regression summary:
Total nodes: 1888209453 3307921 12605560
16 PASS (corrected: vie:8 is counted as a pass, since it was wrongly reported as a fail)
9 FAIL

  • many very good passes,
  • only 2 very bad failures,
  • 2 unclear failures where the proposed move is probably sente,
  • 2 unclear situations that will probably not occur now, where the test should be refined to be meaningful (restricted_genmove or a more specific test).

It seems 5% faster!

=> it seems good to make fourlib-depth 8 the default instead of 7.

13x13:8         PASS J8 [J8]
	good

9x9:197         FAIL D6 [E8|H5]
	bad, probably one more dead stone.

gifu03:304      PASS J7 [J7]
	good and quite big

kgs:230         PASS J19 [J12|J11|L10|L9|J19]
	 HUGE

manyfaces1:36   FAIL S11 [P16]
	S11 seems big and is probably sente, allowing a return to P16 afterwards

nando:6         FAIL 1 [0]
	I don't understand this kind of owl test

nicklas1:2002   FAIL H6 [J5]
	very bad failure, greedy

nicklas2:904    FAIL F7 [B1|E1]
	very bad, greedy

ninestones:260  PASS B16 [B16]
	very good

ninestones:370  PASS R5 [R5]
	very good

ninestones:790  FAIL K5 [B1]
	unclear; K5 is sente, need to dig into this...

ninestones:800  PASS K1 [K1]
	very good

nngs:120        PASS S1 [S1]
	very good, tesuji for life, only move

nngs:590        PASS G3 [G3]
	good

owl1:345        PASS 1 N13 [1 N13]
	very good

owl1:348        PASS 0 [0]
	good, it sees that the surrounded group is dead.
	might provide a huge improvement if it is a general feature

semeai:104      FAIL 1 1 B6 [1 1 (D6|C7)]
	I don't understand this kind of test

strategy4:155   FAIL P2 [D18]
	fail, _but_ the position is difficult and messy;
	such a position probably won't occur if fourlib-depth is increased.
	P2 and D18 are unrelated! This needs restricted_genmove or other,
	more specific tests.

strategy4:167   PASS D4 [D4]
	very good

strategy:50     PASS Q9 [Q9]
	very good. The test is:	
	#CATEGORY=OWL_TUNING
	#DESCRIPTION=P9 is pointless compared to Q9.
	#SEVERITY=8
	# Q9 is clearly better than Q11 because it stops a black connection
	# along the edge.
	# So much better, that I removed Q11 option -trevor
	loadsgf games/strategy13.sgf
	50 reg_genmove white
	#? [Q9]*

trevora:290     FAIL C8 [!C8]
	small failure, C8 makes paranoid life and loses some yose points.

trevorb:140     PASS K2 [L2|K2]
	very difficult ko life. It would be better to avoid being
	in such a bad position.

trevor:9        PASS E5 [E5]
	very good, the game is still lost, but it's the biggest move ;)

trevor:1060     PASS 1 [1]
	very good

vie:8           FAIL 1 R7 [1 (R8|S9)]
	# comment says: gnugo like R7 but this leave W with Q8.
	The question asked is owl_defend R10.
	R7 is a correct answer to owl_defend R10
	=> R7 is a success, not a failure.
	(I bet this test was changed from genmove to owl_defend but the
	comment was left in place ;)
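The result lines above all share one shape: test id, PASS or FAIL, the answer GNU Go gave, and the expected answer in brackets. Below is a minimal Python sketch of how such a line can be checked. It assumes that the bracketed field is a regular expression matched against the whole answer and that a leading `!` negates the match; the function names are hypothetical and this is not the code regress.pike uses.

```python
import re

def answer_matches(expected, got):
    """Return True if `got` satisfies the bracketed `expected` pattern.

    Assumption: the pattern is a regular expression that must match the
    whole answer, and a leading '!' means "anything but this".
    """
    negate = expected.startswith("!")
    pattern = expected[1:] if negate else expected
    hit = re.fullmatch(pattern, got) is not None
    return hit != negate

def line_is_consistent(line):
    """Check that a result line's PASS/FAIL label agrees with its pattern."""
    m = re.fullmatch(r"(\S+)\s+(PASS|FAIL)\s+(.*?)\s+\[(.*)\]", line.strip())
    if m is None:
        raise ValueError("not a result line: %r" % line)
    _test_id, status, got, expected = m.groups()
    return answer_matches(expected, got) == (status == "PASS")

if __name__ == "__main__":
    for line in ("kgs:230         PASS J19 [J12|J11|L10|L9|J19]",
                 "trevora:290     FAIL C8 [!C8]",
                 "vie:8           FAIL 1 R7 [1 (R8|S9)]"):
        print(line_is_consistent(line), line)
```

Under these assumptions every line in the list is self-consistent, including vie:8: R7 is outside the accepted set (R8|S9), so the harness reports a fail even if R7 is a valid defense.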

Attachments

fourlib-8.log (6.5 kB) - added by alain on 04/09/06 16:08:45.
regress.pike log with node counts. The tests may not be in the right order.
reading-fourlib.diff (0.9 kB) - added by alain on 04/11/06 17:42:56.
Probable bugfix: elsewhere the tests are similar: stackp <= fourlib_depth, or fourlib_depth < stackp

Change History

04/09/06 01:16:38 changed by alain

  • type changed from defect to enhancement.

04/09/06 01:21:11 changed by alain

  • description changed.

04/09/06 01:24:34 changed by alain

  • description changed.

04/09/06 09:15:00 changed by gunnar

  • component changed from regressions to source.

It seems 5% faster!

Can you run the regressions with regress.pike to see what the node counts have to say?

04/09/06 09:52:34 changed by gunnar

nando:6         FAIL 1 [0]
	I don't understand this kind of owl test

"owl_does_defend S2 Q2" basically means

trymove white S2
return REVERSE_RESULT(owl_attack(Q2))

although the implementation is more involved. Or in plain text, does S2 owl_defend Q2?
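The two-line pseudocode above can be modelled in Python as follows. This is only a sketch: the board operations are passed in as callables, `WIN = 5` and `LOSE = 0` are assumed values for the result codes, `reverse_result` is modelled as `WIN - result`, and the function name is hypothetical. As noted above, the real implementation is more involved.

```python
WIN = 5   # assumed value of the "full success" result code
LOSE = 0  # assumed value of the "complete failure" result code

def reverse_result(result):
    """Turn the attacker's result into the defender's (assumed: WIN - result)."""
    return WIN - result

def owl_does_defend_sketch(move, target, trymove, undo, owl_attack):
    """Play `move`, then ask whether `target` can still be attacked.

    `trymove`, `undo` and `owl_attack` are stand-ins supplied by the
    caller; they are not the real GNU Go functions.
    """
    if not trymove(move):          # illegal move: it cannot defend anything
        return LOSE
    try:
        return reverse_result(owl_attack(target))
    finally:
        undo()                     # always restore the position

if __name__ == "__main__":
    played = []
    result = owl_does_defend_sketch(
        "S2", "Q2",
        trymove=lambda m: played.append(m) or True,
        undo=played.pop,
        owl_attack=lambda t: WIN,  # pretend the attack on Q2 still succeeds
    )
    print(result)
```

In this toy run the attack still succeeds after S2, so the defense fails, which is the answer nando:6 expects.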

semeai:104      FAIL 1 1 B6 [1 1 (D6|C7)]
	I don't understand this kind of test

The test is "analyze_semeai E9 D9" and means: read the semeai of E9 vs D9 with white (E9) moving first. The correct result "1 1 (D6|C7)" means that white successfully defends its own dragon (first 1) and also successfully attacks the opponent's dragon (second 1) by playing D6 or C7. Seki (or mutual independent life) is "1 0", while a complete loss of the semeai is "0 0". The numbers can also be ko result codes.

Maybe B6 is also effective, but from a quick glance I'm doubtful.
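The reading of the semeai result described above can be written down as a small Python sketch. The function name and the verdict strings are hypothetical; only the "1 1", "1 0" and "0 0" cases come from the explanation, and any other value is simply reported as a ko code without further interpretation.

```python
def interpret_semeai(response):
    """Split an analyze_semeai answer like '1 1 B6' into (verdict, move)."""
    fields = response.split()
    defend, attack = int(fields[0]), int(fields[1])
    move = fields[2] if len(fields) > 2 else None

    if defend not in (0, 1) or attack not in (0, 1):
        verdict = "ko-dependent"      # ko result codes, not decoded here
    elif defend == 1 and attack == 1:
        verdict = "semeai won"        # own dragon lives, opponent's dies
    elif defend == 1:
        verdict = "seki or mutual life"
    elif attack == 0:
        verdict = "semeai lost"
    else:
        verdict = "not covered"       # '0 1' is not explained in the ticket
    return verdict, move

if __name__ == "__main__":
    print(interpret_semeai("1 1 B6"))
    print(interpret_semeai("1 0"))
    print(interpret_semeai("0 0"))
```

Note that semeai:104 fails even though GNU Go answers "1 1": the two result codes agree with the expected ones, but B6 is not among the accepted moves D6 and C7.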

04/09/06 16:08:45 changed by alain

  • attachment fourlib-8.log added.

regress.pike log with node counts. The tests may not be in the right order.

04/09/06 16:12:49 changed by alain

The regression log is attached.
I use a test list, so the order of the tests may be off, but it should cover the whole "official" regression suite, and nothing else.

04/09/06 19:23:47 changed by alain

This is version 3.7.9 (the tar.gz from the gnugo site),
configured with default parameters (except fourlib depth ;)
and compiled with gcc-4.0.2.

04/09/06 20:10:47 changed by gunnar

This means that there's a 12% increase of reading nodes for a 0.3% decrease of owl nodes, which makes it very unlikely that it would give a faster engine. More likely it slows it down by 10-12% or so.

04/11/06 17:22:17 changed by alain

The patch was made after the tests. I am rerunning the regression tests with the patch ...

04/11/06 17:24:12 changed by alain

  • patch set to 1.

04/11/06 17:42:56 changed by alain

  • attachment reading-fourlib.diff added.

Probable bugfix: elsewhere the tests are similar: stackp <= fourlib_depth, or fourlib_depth < stackp
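The two comparison forms quoted above are exact complements of each other, which is why a site using any other form stands out. The Python sketch below illustrates this at depth 7. The attached diff is not reproduced in this ticket, so the strict comparison shown as the odd one out is purely a hypothetical example of an inconsistent test, not a statement of what the patch changes.

```python
FOURLIB_DEPTH = 7  # the current default discussed in this ticket

def within_depth(stackp, fourlib_depth=FOURLIB_DEPTH):
    """Form quoted in the ticket: stackp <= fourlib_depth."""
    return stackp <= fourlib_depth

def beyond_depth(stackp, fourlib_depth=FOURLIB_DEPTH):
    """Complementary form quoted in the ticket: fourlib_depth < stackp."""
    return fourlib_depth < stackp

def strict_variant(stackp, fourlib_depth=FOURLIB_DEPTH):
    """Hypothetical inconsistent test: stackp < fourlib_depth."""
    return stackp < fourlib_depth

# The two quoted forms never agree and never both fail: exact complements.
complementary = all(within_depth(s) != beyond_depth(s) for s in range(20))

# The strict variant differs from the quoted form only at the boundary.
disagreements = [s for s in range(20) if within_depth(s) != strict_variant(s)]

if __name__ == "__main__":
    print(complementary, disagreements)
```

A mismatch of this kind shifts the effective depth by one ply at that site, so it would change node counts even with the default left at 7.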

04/11/06 19:46:41 changed by alain

With the patch and default parameters (including fourlib-depth), there are 3 PASS results, all already among those listed above:

13x13:8         PASS J8 [J8]
strategy:50     PASS Q9 [Q9]
trevorb:140     PASS K2 [L2|K2]

Total nodes: 1715869355 3317957 12612198
Total time: 7699.83 (7844.91)
Total uncertainty: 46.97
3 PASS
no FAIL
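For reference, relative node-count changes like those discussed earlier can be computed as in the Python sketch below. It compares the depth-8 totals from the description with the patched depth-7 totals just above. This mixes an unpatched and a patched build, so the figures are only illustrative and need not match percentages quoted elsewhere in this ticket. The first two counters are the reading and owl nodes; the third is not named in the ticket.

```python
def percent_change(new, old):
    """Relative change from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old

COUNTERS = ("reading", "owl", "third counter")

# "Total nodes" lines copied from this ticket.
depth8_unpatched = (1888209453, 3307921, 12605560)  # from the description
depth7_patched = (1715869355, 3317957, 12612198)    # patched, default depth

changes = {
    name: percent_change(new, old)
    for name, new, old in zip(COUNTERS, depth8_unpatched, depth7_patched)
}

if __name__ == "__main__":
    for name, pct in changes.items():
        print(f"{name}: {pct:+.2f}%")
```

With these inputs the reading nodes rise by about 10% while the owl nodes fall by about 0.3%. Because reading nodes outnumber owl nodes by several hundred to one, the small owl saving cannot offset the extra reading work.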

04/11/06 22:52:21 changed by alain

With the patch at fourlib-depth 8, additional regression results, with a special bonus :-)

cgf2004:70      PASS N4 [N4]      HUGE, key move of the game.
nngs3:400       FAIL N8 [N13]     failure, but interesting move.

09/18/06 09:10:35 changed by arend

Here is the breakage (as of r2363) of setting FOURLIB_DEPTH to 8:

trevora:290     FAIL C8 [!C8]
trevorb:140     PASS K2 [L2|K2]
nicklas1:2002   FAIL H6 [J5]
trevor:9        PASS E5 [E5]
trevor:1060     PASS 1 [1]
nngs:120        PASS S1 [S1]
nngs:590        PASS G3 [G3]
vie:8           FAIL 1 R7 [1 (R8|S9)]
13x13:8         PASS J8 [J8]
strategy4:155   FAIL P2 [D18]
strategy4:163   PASS P8 [O7|P8]
owl1:345        PASS 1 N13 [1 N13]
owl1:348        PASS 0 [0]
ninestones:260  PASS B16 [B16]
ninestones:370  PASS R5 [R5]
ninestones:790  FAIL K5 [B1]
ninestones:800  PASS K1 [K1]
manyfaces1:36   FAIL S11 [P16]
nando:6         FAIL 1 [0]
gifu03:304      PASS J7 [J7]
9x9:197         FAIL D6 [E8|H5]
kgs:230         PASS J19 [J12|J11|L10|L9|J19]
14 PASS
8 FAIL
Total nodes: 1893577229 3317756 12595885 (+12% -0.35% -0.019%)