Ticket #127 (new enhancement)

Opened 4 years ago

Last modified 18 months ago

fourlib-depth 8 proposed as default (instead of 7)

Reported by: alain
Owned by: gnugo
Priority: normal
Milestone: 3.9.x
Component: source
Version:
Severity: normal
Keywords:
Cc:
patch: yes

Description (last modified by alain)

Regression summary:
Total nodes: 1888209453 3307921 12605560
16 PASS (corrected: vie:8 was wrongly counted as a fail and is counted here as a pass)
9 FAIL

  • many very good passes,
  • only 2 very bad failures,
  • 2 unclear failures where the proposed move is probably sente,
  • 2 unclear situations that will probably not occur now, where the test should be refined to be meaningful (restricted_genmove or a more specific test).

It seems 5% faster!

=> It seems reasonable to make fourlib-depth 8 the default instead of 7.

13x13:8         PASS J8 [J8]
	good

9x9:197         FAIL D6 [E8|H5]
	bad, probably one more dead stone.

gifu03:304      PASS J7 [J7]
	good and quite big

kgs:230         PASS J19 [J12|J11|L10|L9|J19]
	 HUGE

manyfaces1:36   FAIL S11 [P16]
	S11 seems big and is probably sente, which allows going back to P16 afterwards

nando:6         FAIL 1 [0]
	I don't understand this kind of owl test

nicklas1:2002   FAIL H6 [J5]
	very bad failure, greedy

nicklas2:904    FAIL F7 [B1|E1]
	very bad, greedy

ninestones:260  PASS B16 [B16]
	very good

ninestones:370  PASS R5 [R5]
	very good

ninestones:790  FAIL K5 [B1]
	unclear; K5 is sente, need to dig into this...

ninestones:800  PASS K1 [K1]
	very good

nngs:120        PASS S1 [S1]
	very good, tesuji for life, only move

nngs:590        PASS G3 [G3]
	good

owl1:345        PASS 1 N13 [1 N13]
	very good

owl1:348        PASS 0 [0]
	good, it sees that the surrounded group is dead.
	might provide a huge improvement if it is a general feature

semeai:104      FAIL 1 1 B6 [1 1 (D6|C7)]
	I don't understand this kind of test

strategy4:155   FAIL P2 [D18]
	fail, _but_ difficult and messy;
	such a position probably won't occur if fourlib-depth is increased.
	P2 and D18 are unrelated! Needs restricted_genmove or other
	more specific tests.

strategy4:167   PASS D4 [D4]
	very good

strategy:50     PASS Q9 [Q9]
	very good. The test is:	
	#CATEGORY=OWL_TUNING
	#DESCRIPTION=P9 is pointless compared to Q9.
	#SEVERITY=8
	# Q9 is clearly better than Q11 because it stops a black connection
	# along the edge.
	# So much better, that I removed Q11 option -trevor
	loadsgf games/strategy13.sgf
	50 reg_genmove white
	#? [Q9]*

trevora:290     FAIL C8 [!C8]
	small failure, C8 makes paranoid life and loses some yose points.

trevorb:140     PASS K2 [L2|K2]
	very difficult ko life. It would be better to avoid ending up
	in such a bad position.

trevor:9        PASS E5 [E5]
	very good, the game is still lost, but it's the biggest move ;)

trevor:1060     PASS 1 [1]
	very good

vie:8           FAIL 1 R7 [1 (R8|S9)]
	# comment says: gnugo like R7 but this leave W with Q8.
	The test is owl_defend R10,
	and R7 is a correct answer to owl_defend R10
	=> R7 is a success, not a failure.
	(I bet this test was changed from genmove to owl_defend but the
	comment was left in place ;)
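
Each result line above pairs the engine's answer with the expected answer in brackets. A minimal Python sketch of that comparison follows. It assumes (the ticket does not say so) that the bracketed text is a regular expression matched against the whole answer, and that a leading ! inverts the match. The function name answer_matches is hypothetical, not part of the regression scripts.

```python
import re


def answer_matches(answer: str, expected: str) -> bool:
    """Return True if the engine answer satisfies the expected pattern.

    Assumption: the expected text is a regular expression that must match
    the whole answer, and a leading "!" means the answer must NOT match.
    """
    negate = expected.startswith("!")
    pattern = expected[1:] if negate else expected
    hit = re.fullmatch(pattern, answer) is not None
    return hit != negate


# Answer / expected pairs copied from the result lines in this ticket.
examples = [
    ("J19", "J12|J11|L10|L9|J19"),  # kgs:230
    ("C8", "!C8"),                  # trevora:290
    ("1 1 B6", "1 1 (D6|C7)"),      # semeai:104
    ("1 R7", "1 (R8|S9)"),          # vie:8
]
for answer, expected in examples:
    verdict = "matches" if answer_matches(answer, expected) else "does not match"
    print(f"{answer!r} {verdict} [{expected}]")
```

Under these assumptions only the kgs:230 answer matches its pattern, which agrees with the PASS and FAIL labels on those four lines.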

Attachments

fourlib-8.log (6.5 KB) - added by alain 4 years ago.
regress.pike log with node counts. The tests may not be in the right order.
reading-fourlib.diff (0.9 KB) - added by alain 4 years ago.
Probable bugfix: elsewhere the equivalent tests are written as stackp <= fourlib_depth or fourlib_depth < stackp

Regression Results

Attachment Rev. PASS FAIL Nodes Status
reading-fourlib.diff 2381 1 +2% -0.017% +0.02% details

Change History

Changed 4 years ago by alain

  • type changed from defect to enhancement

Changed 4 years ago by alain

  • description modified (diff)

Changed 4 years ago by alain

  • description modified (diff)

Changed 4 years ago by gunnar

  • component changed from regressions to source

It seems 5% faster!

Can you run the regressions with regress.pike to see what the node counts have to say?

Changed 4 years ago by gunnar

nando:6         FAIL 1 [0]
	I don't understand this kind of owl test

"owl_does_defend S2 Q2" basically means

trymove white S2
return REVERSE_RESULT(owl_attack(Q2))

although the implementation is more involved. Or in plain text, does S2 owl_defend Q2?

semeai:104      FAIL 1 1 B6 [1 1 (D6|C7)]
	I don't understand this kind of test

The test is "analyze_semeai E9 D9" and means read the semeai of E9 vs D9 with white (E9) moving first. The correct result "1 1 (D6|C7)" means that white successfully defends its dragon (first 1) and also successfully attacks the opponent dragon (second 1) by playing D6 or C7. Seki (or mutual independent life) is "1 0" while a complete loss of the semeai is "0 0". The numbers can also be ko result codes.

Maybe B6 is also effective, but from just a quick glance I'm doubtful.
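
The reading of the two semeai result numbers described above can be written down as a small lookup. This is only an illustrative Python sketch: the function name describe_semeai is hypothetical, and ko result codes are lumped into a single "other" case because the ticket does not list them.

```python
def describe_semeai(defense: int, attack: int) -> str:
    """Describe an analyze_semeai result pair for the side moving first.

    defense: 1 if the mover's own dragon is successfully defended.
    attack:  1 if the opponent's dragon is successfully attacked.
    Only the three plain cases named in the ticket are spelled out.
    """
    outcomes = {
        (1, 1): "semeai won: own dragon defended and opponent's dragon killed",
        (1, 0): "seki or mutual independent life",
        (0, 0): "complete loss of the semeai",
    }
    return outcomes.get((defense, attack), "other result (for example a ko result code)")


# semeai:104 expects "1 1 (D6|C7)": both numbers are 1.
print(describe_semeai(1, 1))
print(describe_semeai(1, 0))
print(describe_semeai(0, 0))
```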

Changed 4 years ago by alain

regress.pike log with node counts. The tests may not be in the right order.

Changed 4 years ago by alain

The regression log is attached.
I used a test list, so the order of the tests may be off, but it should cover the whole "official" regression suite, and nothing else.

Changed 4 years ago by alain

This is 3.7.9.tar.gz from the gnugo site.
Configured with default parameters (except fourlib depth ;)
Compiled with gcc-4.0.2.

Changed 4 years ago by gunnar

This means that there's a 12% increase in reading nodes for a 0.3% decrease in owl nodes, which makes it very unlikely that it would give a faster engine. More likely it slows it down by 10-12% or so.
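
For reference, a percentage like the ones above is a relative change between two node totals. The Python sketch below shows the arithmetic using the only two "Total nodes" lines reproduced in full in this ticket: the fourlib-depth 8 run from the description and the later patched run at default depth. That is not the pair compared in the comment above (its baseline log is not reproduced here), and the two runs differ in both patch and depth, so the reading figure comes out near 10% rather than 12%. The first two counters are taken to be reading and owl nodes, as the comment above suggests; the third is left unnamed. The function name percent_change is hypothetical.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


# "Total nodes" triples copied from this ticket.
# Assumed order: reading nodes, owl nodes, a third counter not named in the ticket.
default_depth_patched = (1715869355, 3317957, 12612198)
depth_8_unpatched = (1888209453, 3307921, 12605560)

changes = [
    percent_change(before, after)
    for before, after in zip(default_depth_patched, depth_8_unpatched)
]
print(" ".join(f"{change:+.2f}%" for change in changes))
```

The owl figure lands at about -0.3%, in line with the comment above.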

Changed 4 years ago by alain

The patch was made after the tests. I am rerunning the regression tests for the patch ...

Changed 4 years ago by alain

  • patch set

Changed 4 years ago by alain

Probable bugfix: elsewhere the equivalent tests are written as stackp <= fourlib_depth or fourlib_depth < stackp

Changed 4 years ago by alain

With the patch and default parameters (including fourlib-depth), there are 3 PASS, all of them already among those listed above:

13x13:8         PASS J8 [J8]
strategy:50     PASS Q9 [Q9]
trevorb:140     PASS K2 [L2|K2]

Total nodes: 1715869355 3317957 12612198
Total time: 7699.83 (7844.91)
Total uncertainty: 46.97
3 PASS
no FAIL

Changed 4 years ago by alain

With the patch at fourlib-depth 8, there are additional regression results, with a special bonus :-)

cgf2004:70      PASS N4 [N4]      HUGE, key move of the game.
nngs3:400       FAIL N8 [N13]     failure, but interesting move.

Changed 4 years ago by arend

Here is the breakage (as of r2363) of setting FOURLIB_DEPTH to 8:

trevora:290     FAIL C8 [!C8]
trevorb:140     PASS K2 [L2|K2]
nicklas1:2002   FAIL H6 [J5]
trevor:9        PASS E5 [E5]
trevor:1060     PASS 1 [1]
nngs:120        PASS S1 [S1]
nngs:590        PASS G3 [G3]
vie:8           FAIL 1 R7 [1 (R8|S9)]
13x13:8         PASS J8 [J8]
strategy4:155   FAIL P2 [D18]
strategy4:163   PASS P8 [O7|P8]
owl1:345        PASS 1 N13 [1 N13]
owl1:348        PASS 0 [0]
ninestones:260  PASS B16 [B16]
ninestones:370  PASS R5 [R5]
ninestones:790  FAIL K5 [B1]
ninestones:800  PASS K1 [K1]
manyfaces1:36   FAIL S11 [P16]
nando:6         FAIL 1 [0]
gifu03:304      PASS J7 [J7]
9x9:197         FAIL D6 [E8|H5]
kgs:230         PASS J19 [J12|J11|L10|L9|J19]
14 PASS
8 FAIL
Total nodes: 1893577229 3317756 12595885 (+12% -0.35% -0.019%)
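
The PASS and FAIL totals of a listing like this one can be recomputed mechanically. The Python sketch below is one way to do it, assuming each result line has the shape "suite:number STATUS answer [expected]" as in the listing above. The regular expression and the function name tally are illustrative and not taken from regress.pike.

```python
import re
from collections import Counter

# Assumed line shape: "<suite>:<number>  PASS|FAIL  <answer> [<expected>]"
RESULT_LINE = re.compile(r"^(\S+):(\d+)\s+(PASS|FAIL)\s+(.*?)\s*\[(.*)\]\s*$")


def tally(log_text: str) -> Counter:
    """Count PASS and FAIL result lines, ignoring anything else."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return counts


# The breakage listing from this comment, copied verbatim.
BREAKAGE = """\
trevora:290     FAIL C8 [!C8]
trevorb:140     PASS K2 [L2|K2]
nicklas1:2002   FAIL H6 [J5]
trevor:9        PASS E5 [E5]
trevor:1060     PASS 1 [1]
nngs:120        PASS S1 [S1]
nngs:590        PASS G3 [G3]
vie:8           FAIL 1 R7 [1 (R8|S9)]
13x13:8         PASS J8 [J8]
strategy4:155   FAIL P2 [D18]
strategy4:163   PASS P8 [O7|P8]
owl1:345        PASS 1 N13 [1 N13]
owl1:348        PASS 0 [0]
ninestones:260  PASS B16 [B16]
ninestones:370  PASS R5 [R5]
ninestones:790  FAIL K5 [B1]
ninestones:800  PASS K1 [K1]
manyfaces1:36   FAIL S11 [P16]
nando:6         FAIL 1 [0]
gifu03:304      PASS J7 [J7]
9x9:197         FAIL D6 [E8|H5]
kgs:230         PASS J19 [J12|J11|L10|L9|J19]
"""

counts = tally(BREAKAGE)
print(counts["PASS"], "PASS")
print(counts["FAIL"], "FAIL")
```

Run on the listing above, this reproduces the 14 PASS and 8 FAIL totals.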

Changed 18 months ago by gunnar

  • milestone changed from 3.8 to 3.9.x