Ticket #148 (new defect)
Some upgrades
| Reported by: | draqo | Owned by: | gnugo |
|---|---|---|---|
| Priority: | normal | Milestone: | Future |
| Component: | source | Version: | |
| Severity: | minor | Keywords: | |
| Cc: | patch: | yes |
Description
Ok. I'm creating a new ticket which holds changes from two of my previous tickets (nr 144 and 145). They should be closed.
There is a final relase upgrade, cache, board.c optimization and some small changes which I've done during code overlooking. Now I'll systematically go through all other engine files and optimize them and look for errors.
As for now I've found 2 of them:
1). One is in extended_chainlinks function, which was counting the string as neighbor itself (we didn't mark ourselves when both_colors were true). This caused one more regression test passed.
2). Second was in build_worms, where an origin was set for every vertex. Now it is set only for vertexes with stone on them. This change gives us two more regressions passed.
Passed regressions are (I don't remember which change gives which regression test passed): strategy2-55, arend-9, ninestones-280.
All other changes don't change a logic of the GnuGo? and gives us only a very little performance upgrade. Except one more: in komaster_trymove in scheme 2b and 4 komaster is reset only, when the owner of komaster plays at kom_pos - not when opponent plays them (I don't think that it changes anything, but it should be like that). I've done all regressions for these changes and it gives us 3 more passes as mentioned above. Other works the same as old version. I've also played about 100 games between old and new version and the result is near 50% won games.
There are no changes in performance for non-final release version, because I've added some more asserts and not optimized too much. When final release version is turned on it gives about 1,2% better performance.
Please accept these changes, because soon there will be much more :-) I've finished project in my job and now again will have some time for GnuGo?. My Master's Degree is late already... :)
Attachments
Regression Results
| Attachment | Rev. | PASS | FAIL | Nodes | Status | |
| board-mem.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| changes.diff | 2381 | patch does not apply | details | |||
| 2382 | build error | details | ||||
| changes2.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| cleanup.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| cleanup2.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| code.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| dfa.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| doc.diff | 2381 | patch does not apply | details | |||
| 2382 | 0% 0% 0% | details | ||||
| loops.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| matchpat.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| reg_tests.diff | 2381 | patch does not apply | details | |||
| 2382 | 0% 0% 0% | details | ||||
| unconditional.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| vcproj-opt.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| vcproj.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| worm1.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| worm2.diff | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| worm3.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| worm4.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details | ||||
| worm5.patch | 2381 | patch does not apply | details | |||
| 2382 | patch does not apply | details |
Change History
comment:1 in reply to: ↑ description Changed 5 years ago by draqo
I splitted changes in 3 files, because looking through them in Trac worked really slow. Now it works only slow :-) I don't split it further because you want to have all changes in even one file.
comment:2 follow-up: ↓ 5 Changed 5 years ago by arend
Some comments on changes2.diff:
- I don't like the changes str->str_pos, and I am sure others don't like them either, otherwise it would have been done long ago.
- Your "inline"s don't do anything mostly, as GCC can't inline functions when they are used in another file as the definition; OTOH, when they are in the same file, GCC probably does a better job at deciding whether to inline them or not than we do (as GCC has much more information). Also please don't inline not performance-critical functions, they only make code size bigger.
- Please don't keep your changes consistent with the existing coding style.
- To me, your changes from "deepest" to "most reliable" in the hashing code make no sense to me, but maybe I am misunderstanding s.th. "more reliable" to me implies the two entries refer to the same position, which they don't.
- Please don't make logic changes in cleanup patches (adjust_level_offset).
- Please don't clutter essential, but not at all time-critical functions as examine_position() with asserts to save time.
- Please dont mix logic changes into a cleanup patch. Please don't.
I will stop here, at about half-way through the file.
comment:3 follow-up: ↓ 6 Changed 5 years ago by arend
Could you make two separate patches for the bugs (1) and (2) you listed above? Preferably in separate tickets.
comment:4 in reply to: ↑ description ; follow-up: ↓ 7 Changed 5 years ago by arend
Replying to draqo:
There are no changes in performance for non-final release version, because I've added some more asserts and not optimized too much. When final release version is turned on it gives about 1,2% better performance.
This is a trade-off. I prefer having 1.2% worse performance in released version and finding more bug this way by reports from users. This will be much better for GNU Go's strength and performance in the long run.
comment:5 in reply to: ↑ 2 ; follow-up: ↓ 9 Changed 5 years ago by draqo
- I don't like the changes str->str_pos, and I am sure others don't like them either, otherwise it would have been done long ago.
I think that this change is helpful, because you immediately know if variable is string number or string position (in board.c both are used).
- Your "inline"s don't do anything mostly, as GCC can't inline functions when they are used in another file as the definition; OTOH, when they are in the same file, GCC probably does a better job at deciding whether to inline them or not than we do (as GCC has much more information). Also please don't inline not performance-critical functions, they only make code size bigger.
I inlined only really small functions, where parameters passing is quite costly in comparison with function code. But yes, it doesn't give us nearly anything, because most of these funcions are called once for a move. I'll change it soon.
- Please don't keep your changes consistent with the existing coding style.
?
- To me, your changes from "deepest" to "most reliable" in the hashing code make no sense to me, but maybe I am misunderstanding s.th. "more reliable" to me implies the two entries refer to the same position, which they don't.
Most reliable is for one entry only (so for one position - exactly for positions with the same hashing code). Caching code stores in the entry the shallowest node (not the deepest) and if a depth is the same, it stores results from function with bigger cost (which analyze better) - that's why I changed variable name to "most_reliable".
- Please don't make logic changes in cleanup patches (adjust_level_offset).
- Please dont mix logic changes into a cleanup patch. Please don't.
I'm looking through GnuGo? file by file, so it will be really difficult to split patch into logic and other changes patches. And I'm always writing, what logic changes I've made in ticket description.
comment:6 in reply to: ↑ 3 Changed 5 years ago by draqo
Replying to arend:
Could you make two separate patches for the bugs (1) and (2) you listed above? Preferably in separate tickets.
Ok. I could do this. I will made 3 tickets - one for bug (1), second for bug (2) and third for the rest of the changes. But I'll do it later.
comment:7 in reply to: ↑ 4 Changed 5 years ago by draqo
Replying to arend:
Replying to draqo:
There are no changes in performance for non-final release version, because I've added some more asserts and not optimized too much. When final release version is turned on it gives about 1,2% better performance.
This is a trade-off. I prefer having 1.2% worse performance in released version and finding more bug this way by reports from users. This will be much better for GNU Go's strength and performance in the long run.
That's why FINAL_RELEASE should be used only in important tournaments - once in the year or something. I wrote better descriptions of this in previous tickets (nr 144 and 145) - look there.
comment:8 in reply to: ↑ description Changed 5 years ago by gunnar
1). One is in extended_chainlinks function, which was counting the string as neighbor itself (we didn't mark ourselves when both_colors were true). This caused one more regression test passed.
That's clearly a bug.
2). Second was in build_worms, where an origin was set for every vertex. Now it is set only for vertexes with stone on them. This change gives us two more regressions passed.
Historically worms could also be built from empty points (aka caverns), but that was removed when it turned out not be of any use any longer. Nowadays it shouldn't make any difference at all and therefore the regressions point to bugs elsewhere in the code, at least some of which are addressed by ticket #155. Thanks for providing the hint. :-)
comment:9 in reply to: ↑ 5 Changed 5 years ago by arend
Replying to draqo:
- I don't like the changes str->str_pos, and I am sure others don't like them either, otherwise it would have been done long ago.
I think that this change is helpful, because you immediately know if variable is string number or string position (in board.c both are used).
Ah I see your point. However, in the rest of GNU Go, "str" always is a string position, so it would make more sense to me to always use "s" as a string number.
- Your "inline"s don't do anything mostly, as GCC can't inline functions when they are used in another file as the definition; OTOH, when they are in the same file, GCC probably does a better job at deciding whether to inline them or not than we do (as GCC has much more information). Also please don't inline not performance-critical functions, they only make code size bigger.
I inlined only really small functions, where parameters passing is quite costly in comparison with function code. But yes, it doesn't give us nearly anything, because most of these funcions are called once for a move. I'll change it soon.
But GCC knows that, too. Generally speaking, unless you really know what you are doing (as in knowing that the compiler does the wrong decision on a common platform), it is much better to leave inline decisions to the compiler these days.
comment:10 Changed 5 years ago by draqo
I've added new patch, in which I removed unnecessary inlines. In everything else it's identical to changes.diff and changes2.diff (it's just in one file now).
Changed 5 years ago by draqo
-
attachment
worm1.diff
added
worm cleanup patch nr 1 (no logic changes) - apply after code.diff
Changed 5 years ago by draqo
-
attachment
unconditional.diff
added
cleanup patch nr 2 (apply after patch nr 1)
comment:11 Changed 5 years ago by draqo
I realized, that regression tests can differ, depending on how much memory we assign for cache. And when I changed it to standard value (-1) I got one more regression passed (due to extended_chainlinks or build_worms change) - trevorb:140.
Changed 5 years ago by draqo
-
attachment
cleanup.diff
added
cleanup patch nr 3 (apply after patch nr 2)
Changed 5 years ago by draqo
-
attachment
vcproj.diff
added
Visual Studio patch (apply after cleanup patch nr 3)
comment:12 Changed 5 years ago by draqo
I've just added Visual Studio 2005 patch. It includes new solution and project files. But also it removes memory leaks - with exception when exit is called. And it adds to sgfnode.c isascii function to avoid asserts in isspace.
comment:13 follow-up: ↓ 14 Changed 5 years ago by draqo
After previous patches, tickets 162, 163 and 164 should be applied now. After adding these 3 tickets, regression tests changes to: filllib.tst:35 PASSED semeai.tst:47 PASSED trevord.tst:680 FAILED nngs4.tst:270 PASSED 9x9.tst:650 PASSED
After applying changes from one of these tickets there are possibly more passes or failures, but combination of all of them gives the above results. When I made changes from ticket 162 I got some unexpected failures and I was working them out, because I REALLY don't want any of them (which I couldn't achieve because of trevord.tst:680 - it gives failure because of different semeai move valuation - a result of semeai is the same). After repairing some failures I got another and this iterating process leads finally to all 3 tickets - that's why these tickets should be treated as 1 really (they are splitted for easier reading).
Changed 5 years ago by draqo
-
attachment
reg_tests.diff
added
Regression tests changes after tickets 162-164
comment:14 in reply to: ↑ 13 ; follow-up: ↓ 15 Changed 5 years ago by draqo
Replying to draqo:
After previous patches, tickets 162, 163 and 164 should be applied now. After adding these 3 tickets, regression tests changes to: filllib.tst:35 PASSED semeai.tst:47 PASSED trevord.tst:680 FAILED nngs4.tst:270 PASSED 9x9.tst:650 PASSED
I forgot to add regression tests changes. Now they are in reg_tests.diff patch. For 13x13c:43 K12 gives same results as G13. As for strategy2:55, B13 is good move too, maybe even better than C12, because it gives white better shape, black D15 group can't make 2 eyes anyway (the fight is really complicated but I couldn't make 2 eyes - my level is about 3 or 4 kyu - GnuGo? can't make them too) and B15 dragon wins semeai with C18. Anyway, GnuGo? generates C12 now, so it should be one more unexpected PASS, if you don't agree with me on B13 move.
comment:15 in reply to: ↑ 14 Changed 5 years ago by draqo
comment:16 follow-up: ↓ 17 Changed 5 years ago by draqo
Hmm... I've seen a second ago, that some patches were applied to GnuGo? recently. And what I'm asking is when my patches would be applied? The longer the time will be the bigger probability that these patches won't match actual files in GnuGo?. I know that they are big, but most of this code doesn't change any logic. Beside of that I tested all of them very good - make regression tests, play several games and everything is working ok. As for me you can choose to not apply any of them at all, because this work is for Masters Degree, but I want to give something from me more than straight work to school and it would be nice, when it will be used by you in GnuGo?. I am interested in AI and I'm doing this because I really want GnuGo? to play better and it gives me some satisfaction :-) You can cut, what you don't like, but apply at least logic changes, please...
comment:17 in reply to: ↑ 16 ; follow-ups: ↓ 18 ↓ 19 Changed 5 years ago by gunnar
Replying to draqo:
Hmm... I've seen a second ago, that some patches were applied to GnuGo? recently. And what I'm asking is when my patches would be applied? The longer the time will be the bigger probability that these patches won't match actual files in GnuGo?. I know that they are big, but most of this code doesn't change any logic. Beside of that I tested all of them very good - make regression tests, play several games and everything is working ok. As for me you can choose to not apply any of them at all, because this work is for Masters Degree, but I want to give something from me more than straight work to school and it would be nice, when it will be used by you in GnuGo?. I am interested in AI and I'm doing this because I really want GnuGo? to play better and it gives me some satisfaction :-) You can cut, what you don't like, but apply at least logic changes, please...
As I said in a comment to ticket #145 a prerequisite for accepting non-trivial patches is a copyright assignment to the Free Software Foundation for the changes, see http://lists.gnu.org/archive/html/gnugo-devel/2006-09/msg00011.html.
When it comes to the patches I'm generally not enthusiastic about the non-functional code changes. On the other hand the doc improvements look useful, as do the functional code changes. More specifically on the latter #162 is a step in the right direction although I probably don't agree with all details, #163 looks very interesting, and #164 is probably a good idea.
comment:18 in reply to: ↑ 17 Changed 5 years ago by draqo
Replying to gunnar:
As I said in a comment to ticket #145 a prerequisite for accepting non-trivial patches is a copyright assignment to the Free Software Foundation for the changes, see http://lists.gnu.org/archive/html/gnugo-devel/2006-09/msg00011.html.
It isn't a problem. Just tell me what I should send to you.
When it comes to the patches I'm generally not enthusiastic about the non-functional code changes. On the other hand the doc improvements look useful, as do the functional code changes.
I know that not all are valuable. Eg. like assigning to some temporary variables values which are accessed more than once or twice (values like table[xx].field1 - to which access means 3 reads from memory) - but it gives us at least one less read from memory and the code often looks easier to read for human. But some changes gives us more, eg. using memcpy instead of for loop. And there are some new comments, where not everything was easy to understand and took me a time to analyze. I think all of them gives us at least a very little enhancement in speed, code organization or readability of the code - sometimes very little indeed, but when I'm analyzing the code line by line, such ideas comes to the head, so why not apply them :-)
comment:19 in reply to: ↑ 17 Changed 5 years ago by draqo
Replying to gunnar:
More specifically on the latter #162 is a step in the right direction although I probably don't agree with all details, #163 looks very interesting, and #164 is probably a good idea.
As for 162 tell with what you don't agree (write it in #162).
And about #164: Look on the new pattern CC505, defend_both is used there. If the old version of the function would be used, this pattern would make dragons for worms, which you could connect only, when you would win a ko - and this situation definitely shouldn't be treated as one dragon. Changes in all other places, where the defend_both is used, should make this change even more valuable.
comment:20 follow-up: ↓ 21 Changed 5 years ago by nando
Hi draqo,
I see that you haven't logged on KGS in a long time, so that's probably the reason why you didn't get my message: there's a room under 'Social' named 'GNU Go Hackers'. Gunnar, Arend, Alain and I are relatively often there and you are cordially invited to join us there.
Other than that, I also have many comments about your proposed patches, but I think it's useless to discuss about those, as long as you haven't done the necessary paperwork (did you get in touch with Daniel Bump about this procedure?)
comment:21 in reply to: ↑ 20 Changed 5 years ago by draqo
Replying to nando:
I see that you haven't logged on KGS in a long time, so that's probably the reason why you didn't get my message: there's a room under 'Social' named 'GNU Go Hackers'. Gunnar, Arend, Alain and I are relatively often there and you are cordially invited to join us there.
I'll log there soon :-)
Other than that, I also have many comments about your proposed patches, but I think it's useless to discuss about those, as long as you haven't done the necessary paperwork (did you get in touch with Daniel Bump about this procedure?)
I don't know the procedure and I want to contact him. By mail or how?
Changed 5 years ago by draqo
-
attachment
worm2.diff
added
worm cleanup patch nr 2 (no logic changes, added discarded moves) - apply to 3.7.11 ver and after all mine previous patches and tickets 162-164
Changed 5 years ago by draqo
-
attachment
matchpat.diff
added
matchpat.c cleanup patch (includes also minimal change in attack.db and defense.db) - apply after worm2 patch
comment:22 follow-up: ↓ 23 Changed 5 years ago by draqo
After examining matchpat I am curious, why DFA pattern matching isn't always used? Is sometimes it takes longer than a normal matching with a grid optimization?
comment:23 in reply to: ↑ 22 Changed 5 years ago by gunnar
Replying to draqo:
After examining matchpat I am curious, why DFA pattern matching isn't always used? Is sometimes it takes longer than a normal matching with a grid optimization?
If I remember correctly it can require a significant amount of memory, which isn't always justified when doing non time critical pattern matching.
Changed 5 years ago by draqo
-
attachment
vcproj-opt.diff
added
VisualC projects optimization enable patch
Changed 5 years ago by draqo
-
attachment
dfa.patch
added
Usage of DFA in all patterns and few errors corrected + small upgrade in owl_test_cuts
comment:24 Changed 5 years ago by draqo
Ok. I've changed all patterns to use DFA. It uses about 800 kB memory more and in these times it is a small amount - the time is much more important. Due to this I've changed some things in dfa and mkpat and corrected some errors, too. Also I had to change throw_in_atari_helper and cutstone2_helper from helper to autohelper because of constant offsets inside these functions (pattern ED40 didn't work anyway because it was rotated and didn't match these offsets).
I've optimized DFA too, which hasn't been done for a really long time and added a note in the documentation about optimizing to remind it :-) After the optimization some patterns becomes rotated differently and an order of matching them changed. Due to this some regression results changed too. I've corrected this by adding counting of liberties in owl_test_cuts, but it also gives one unexpected fail (anyway, some other patterns' rotation could give this fail too) - but 2 unexpected passes are bigger than the fail (passes are: strategy4:205, ninestones:700 and the fail is: handtalk:15).
Also my signed copyright assignments arrived in Boston and finally my patches could be applied :-)
Changed 5 years ago by draqo
-
attachment
all-in-one.zip
added
All my patches in one file (from this ticket and 162, 163 and 164) - easier to apply and diffed with actual version of GnuGo?, which changed since my early patches
comment:25 Changed 5 years ago by draqo
Regression changes after all-in-one patch are (these were written in patches descriptions, but I want to gather them in one place):
- filllib:35 PASS
- semeai:47 PASS
- strategy4:205 PASS
- nngs4:270 PASS
- ninestones:280 PASS
- ninestones:700 PASS
- 9x9:650 PASS
- trevord:680 FAIL
- handtalk:15 FAIL
The program is working faster, too. After some rough tests I got 8% faster result (enabling FINAL_RELEASE gives even better result) - but this must be analyzed more precisely.
Changed 5 years ago by draqo
-
attachment
loops.patch
added
Loops optimization - and small cleanup in worm.c
comment:26 Changed 5 years ago by draqo
After previous patches, tickets #168, #169, #170, #171 and #172 should be applied now (upgrade.patch from ticket #163 have to be applied too). After adding these tickets, regression tests changes to: nngs2.tst:600 PASSED 9x9.tst:120 PASSED global:44 PASSED nicklas3:1403 PASSED ninestones:260 PASSED 13x13:64 FAILED nngs2.tst:150 FAILED 9x9.tst:197 FAILED. So we have 5 new passes and 3 new failures - but passes are in majority and best 3 of them are bigger than failures.
After applying changes from one of these tickets there are possibly more passes or failures, but combination of all of them gives the above results - it's like in comment from 01/23/07 00:19:53.
Changed 5 years ago by draqo
-
attachment
worm4.patch
added
upgraded threat finding (+ small cleanup) - fix2.patch from ticket #163 is needed to don't change regression
Changed 5 years ago by draqo
-
attachment
worm5.patch
added
upgraded: attack_threats when liberties==1 and find_lunch - no changes in regression
comment:27 Changed 5 years ago by draqo
I added ticket 173.
Changed 5 years ago by draqo
-
attachment
board-mem.patch
added
small cleanup in tables memory handling in board.c
Changed 5 years ago by draqo
-
attachment
all-in-one2.zip
added
all my patches in one file - end of my work
comment:28 Changed 5 years ago by draqo
Ok. My work finally comes to the end. I've attached a file, where all my patches are - these from tickets: 148, 162, 163, 164, 168, 169, 170, 171, 172 and 173.
Results are:
- 2% slower work, but enabling FINAL_RELEASE gives about 1,5% faster work and then the speed is almost the same
- unexpected PASSES (19): filllib:35, viking:1, nicklas3:1403, trevorc:410, global:44, semeai:47, strategy4:205, nngs2:600, nngs4:270, strategy5:295, ninestones:260, ninestones:280, ninestones:340, ninestones:700, thrash:26, 9x9:120, 9x9:300, 9x9:640, 9x9:650
- unexpected FAILURES (5): nngs:700, trevord:680, handtalk:15, nngs2:150, 9x9:197
- after playing 200 games between a recent version (in CVS) and my new version the result is 102:98 to my version (both players played 100 games as black and 100 as white)
So a result of the regression is significantly better, but unfortunately it doesn't have influence on overall level of GnuGo?. I think, that with more aggresive player strength of the new version would be bigger, but it must be tested in real life :)
Last thing is copyright assignments from my University. I have a meeting in next week, on which I should get this assignment and after another few days it should get to Boston. I'll write, when all this bureaucracy comes to the end.
comment:29 Changed 5 years ago by draqo
Finally I managed to meet with somebody on my University to get copyright assignments. I sent them today, so in few days (maybe more) they should arrive in Boston. I'll write, when this happens.
comment:30 follow-up: ↓ 36 Changed 5 years ago by arend
Have you measured whether using DFA for all the pattern databases actually speeds up the program? I doubt it, actually.
comment:31 follow-up: ↓ 37 Changed 5 years ago by arend
Why are you changing DEFAULT_MEMORY to -1?
comment:32 follow-up: ↓ 38 Changed 5 years ago by arend
What are you trying to do with SIZEOF_INT? I don't think we want to add code to take care of machines with int size smaller than 4, if that's what you are trying to do.
comment:33 follow-up: ↓ 39 Changed 5 years ago by arend
As I said earlier: let's keep str for string positions in board.c/board.h, that's what str is used for in the rest of GNU Go. If somewhere str is used for a string number, that should be fixed.
comment:34 follow-up: ↓ 40 Changed 5 years ago by arend
Can you explain why the new VC project files are needed? I know nothing about VC, but we used to have one/two developers using VC, my understanding was it used to work.
comment:35 follow-up: ↓ 41 Changed 5 years ago by arend
I think board-mem.patch is wrong, it reduces data locality and my bet is it is more likely to cause a slowdown than a speedup.
comment:36 in reply to: ↑ 30 Changed 5 years ago by draqo
Replying to arend:
Have you measured whether using DFA for all the pattern databases actually speeds up the program? I doubt it, actually.
I measured a time only after using DFA in all pattern databases. And it speeds it up quite a lot. But I don't remember now how much exactly.
comment:37 in reply to: ↑ 31 Changed 5 years ago by draqo
Replying to arend:
Why are you changing DEFAULT_MEMORY to -1?
Because it uses then a default value saved in a program. It's nearly 8 MB used when DEFAULT_MEMORY==8. But only nearly - not the same - and this could lead to different regression results, when cache is different sized. I decided to unify this, because in Visual config.h it was -1 and in Linux config.h it was 8 (if I remember correctly).
comment:38 in reply to: ↑ 32 Changed 5 years ago by draqo
Replying to arend:
What are you trying to do with SIZEOF_INT? I don't think we want to add code to take care of machines with int size smaller than 4, if that's what you are trying to do.
It is used for machines with int bigger than 4 bytes, too.
comment:39 in reply to: ↑ 33 Changed 5 years ago by draqo
Replying to arend:
As I said earlier: let's keep str for string positions in board.c/board.h, that's what str is used for in the rest of GNU Go. If somewhere str is used for a string number, that should be fixed.
When somebody new is looking on the board code it is really confusing. And now both You and new GnuGo? programmers won't have this problem. It will be clear if it is a number or not. But it's up to You to leave it as it was...
comment:40 in reply to: ↑ 34 Changed 5 years ago by draqo
Replying to arend:
Can you explain why the new VC project files are needed? I know nothing about VC, but we used to have one/two developers using VC, my understanding was it used to work.
Because there is new VC version and it uses different format for projects. I couldn't work on old version.
comment:41 in reply to: ↑ 35 ; follow-up: ↓ 42 Changed 5 years ago by draqo
Replying to arend:
I think board-mem.patch is wrong, it reduces data locality and my bet is it is more likely to cause a slowdown than a speedup.
It reduces locality but speeds up a program, because memory doesn't need to be allocated at each call to functions. It could be better cached, too. I tested using static and local tables in a small program and results was... no difference :-) Or in error margin. But I think if it would change something (on other compilers, systems, etc.), it change it on plus.
comment:42 in reply to: ↑ 41 ; follow-up: ↓ 44 Changed 5 years ago by arend
Replying to draqo:
Replying to arend:
I think board-mem.patch is wrong, it reduces data locality and my bet is it is more likely to cause a slowdown than a speedup.
It reduces locality but speeds up a program, because memory doesn't need to be allocated at each call to functions. It could be better cached, too. I tested using static and local tables in a small program and results was... no difference :-) Or in error margin. But I think if it would change something (on other compilers, systems, etc.), it change it on plus.
The memory doesn't need to be allocated, the array is just put on the stack. Anyway, in case it is unclear whether this is a speedup, then always the more maintainable code wins, which is the current version.
comment:43 follow-up: ↓ 45 Changed 5 years ago by arend
loops.patch is definitely wrong. This kind of optimization is done by every modern compiler (if it isn't, file a bug with your compiler), AND your change makes the code less readable.
comment:44 in reply to: ↑ 42 Changed 5 years ago by draqo
Replying to arend:
Replying to draqo:
Replying to arend:
I think board-mem.patch is wrong, it reduces data locality and my bet is it is more likely to cause a slowdown than a speedup.
It reduces locality but speeds up a program, because memory doesn't need to be allocated at each call to functions. It could be better cached, too. I tested using static and local tables in a small program and results was... no difference :-) Or in error margin. But I think if it would change something (on other compilers, systems, etc.), it change it on plus.
The memory doesn't need to be allocated, the array is just put on the stack. Anyway, in case it is unclear whether this is a speedup, then always the more maintainable code wins, which is the current version.
Array is put on the stack, so it is allocated - not by an user but by a program. If you don't want just don't apply this patch :)
comment:45 in reply to: ↑ 43 ; follow-up: ↓ 46 Changed 5 years ago by draqo
Replying to arend:
loops.patch is definitely wrong. This kind of optimization is done by every modern compiler (if it isn't, file a bug with your compiler), AND your change makes the code less readable.
I don't think so. But... just don't apply. You could make some tests by yourself, too.
comment:46 in reply to: ↑ 45 ; follow-up: ↓ 47 Changed 5 years ago by arend
Replying to draqo:
Replying to arend:
loops.patch is definitely wrong. This kind of optimization is done by every modern compiler (if it isn't, file a bug with your compiler), AND your change makes the code less readable.
I don't think so. But... just don't apply. You could make some tests by yourself, too.
I thought it might be nicer to explain why I don't think it is worth applying...
See, I am not an expert in low-level optimization at all, but over the years working on GNU Go I have learned a bit here and there. For example, I know that a loop variable k that is only used to index an array never actually appears in the compiler output. Try looking at the (optimized) assembler output of a compiler that has that option, and you will see. This is such a common optimization that every compiler that wasn't able to do that would be completely horrible at SPEC etc.
As a general rule, low-level optimization of C code is pretty much not worth it anymore, and not for GNU Go in particular:
- Unlike us coders, often the compiler knows which processor it is compiling the code for, and knows a lot more about that than we have ever heard. (Try learning what the scheduler of a compiler does if you don't believe me.)
- Even if you are lucky with one particular optimization and trick the compiler into generating more efficient code (which is essentially what you are trying to do), then there is no guarantee that the same will happen with a different compiler, or a new version of the same compiler.
- A lot of code in GNU Go is hardly performance critical, but will have to be maintained for a long time, and get changed many times during that period. So maintainability is more important than miniscule performance improvements.
Also, we are not asking you to do anything that we wouldn't do for our own patches. When I try a performance optimization, I will test this as a single patch changing only one thing (not throwing several possible improvements together), and post detailed numbers showing the performance improvements, all measured on an otherwise idle computer.
comment:47 in reply to: ↑ 46 ; follow-up: ↓ 48 Changed 5 years ago by draqo
Replying to arend:
Replying to draqo:
Replying to arend:
loops.patch is definitely wrong. This kind of optimization is done by every modern compiler (if it isn't, file a bug with your compiler), AND your change makes the code less readable.
I don't think so. But... just don't apply. You could make some tests by yourself, too.
I thought it might be nicer to explain why I don't think it is worth applying...
Ok, it's not a problem :-)
See, I am not an expert in low-level optimization at all, but over the years working on GNU Go I have learned a bit here and there. For example, I know that a loop variable k that is only used to index an array never actually appears in the compiler output. Try looking at the (optimized) assembler output of a compiler that has that option, and you will see. This is such a common optimization that every compiler that wasn't able to do that would be completely horrible at SPEC etc.
I read a book on it, but it was many years ago and it gave just a general rule, that pointers are faster than indexing by a variable. I have a little knowledge about compilers, but my upgrade can only give better results or make no difference in time. And readability is, I think, similar - it depends rather on style of coding. So I'm leaving decision to apply this patch to you. I thought that this could make GnuGo? working faster, but if you are sure that all standard compilers do this optimization, it doesn't need to be applied.
comment:48 in reply to: ↑ 47 Changed 5 years ago by gunnar
Replying to draqo:
I read a book on it, but it was many years ago and it gave just a general rule, that pointers are faster than indexing by a variable. I have a little knowledge about compilers, but my upgrade can only give better results or make no difference in time. And readability is, I think, similar - it depends rather on style of coding. So I'm leaving decision to apply this patch to you. I thought that this could make GnuGo? working faster, but if you are sure that all standard compilers do this optimization, it doesn't need to be applied.
Pointers were faster than indexing in the past. 8 years or so ago I gained quite a bit of performance by optimizing some convolver code that way. But it has changed. Today the recommendation is to use indexing rather than pointers because it allows the compiler the optimize more aggressively; with pointers it's much harder to e.g. determine whether they may overlap. For simple cases like here it's unlikely to make a difference either way though.
For readability I see no question about indexing being far more obvious. My convolver code, by the way, became completely unreadable from the optimizations.
comment:49 follow-up: ↓ 50 Changed 5 years ago by arend
draqo added infrastructure to free all malloc()'d memory before exiting. I think it's perfectly fine to just exit and let the OS do the clean-up, but I have no strong opinions on that. Any other opinions?
comment:50 in reply to: ↑ 49 ; follow-up: ↓ 51 Changed 5 years ago by draqo
Replying to arend:
draqo added infrastructure to free all malloc()'d memory before exiting. I think it's perfectly fine to just exit and let the OS do the clean-up, but I have no strong opinions on that. Any other opinions?
I added this, because it allows checking if there are any memory leaks somewhere. Visual C writes information about it, when the program is exiting. But there was many intended memory leaks now and it makes check impossible. So I repaired it and now, when somebody writes a new code with malloc, he can check if he is freeing memory properly (in function, which allocates memory in each move, it must be freed properly).
comment:51 in reply to: ↑ 50 ; follow-up: ↓ 52 Changed 5 years ago by arend
Replying to draqo:
Replying to arend:
draqo added infrastructure to free all malloc()'d memory before exiting. I think it's perfectly fine to just exit and let the OS do the clean-up, but I have no strong opinions on that. Any other opinions?
I added this, because it allows checking if there are any memory leaks somewhere. Visual C writes information about it, when the program is exiting. But there was many intended memory leaks now and it makes check impossible. So I repaired it and now, when somebody writes a new code with malloc, he can check if he is freeing memory properly (in function, which allocates memory in each move, it must be freed properly).
I understand that, but this kind of checking is special to Visual C, and there are better memory leak tests available on other platforms (e.g. valgrind can check whether there is unreferenced unfreed memory at program exit). So I don't think the memory leak checking by itself is worth the extras complication in the code.
comment:52 in reply to: ↑ 51 ; follow-up: ↓ 54 Changed 5 years ago by draqo
Replying to arend:
Replying to draqo:
Replying to arend:
draqo added infrastructure to free all malloc()'d memory before exiting. I think it's perfectly fine to just exit and let the OS do the clean-up, but I have no strong opinions on that. Any other opinions?
I added this, because it allows checking if there are any memory leaks somewhere. Visual C writes information about it, when the program is exiting. But there was many intended memory leaks now and it makes check impossible. So I repaired it and now, when somebody writes a new code with malloc, he can check if he is freeing memory properly (in function, which allocates memory in each move, it must be freed properly).
I understand that, but this kind of checking is special to Visual C, and there are better memory leak tests available on other platforms (e.g. valgrind can check whether there is unreferenced unfreed memory at program exit). So I don't think the memory leak checking by itself is worth the extras complication in the code.
How much extra compilation is this comparing to the whole program? Few lines to many thousands. I don't think that this is any argument... And on other platforms are other tools, but they will show actual unfreed memory, too.
comment:53 follow-up: ↓ 55 Changed 5 years ago by draqo
Ok.My assignment arrived finally in Boston,so now patches could be applied :-)
comment:54 in reply to: ↑ 52 Changed 5 years ago by arend
Replying to draqo:
How much extra compilation is this comparing to the whole program? Few lines to many thousands. I don't think that this is any argument... And on other platforms are other tools, but they will show actual unfreed memory, too.
It is not about extra code in the compiled program, it is about making the source code a tad more complicated, which should never be done without a good reason.
comment:55 in reply to: ↑ 53 ; follow-up: ↓ 56 Changed 5 years ago by arend
Replying to draqo:
Ok.My assignment arrived finally in Boston,so now patches could be applied :-)
It would REALLY help for that if you could post single-issue patches as separate tickets, as I have done for various stuff.
We really don't want to apply patches that do various unrelated things, there are many reasons not to do so.
comment:56 in reply to: ↑ 55 Changed 5 years ago by draqo
Replying to arend:
Replying to draqo:
Ok.My assignment arrived finally in Boston,so now patches could be applied :-)
It would REALLY help for that if you could post single-issue patches as separate tickets, as I have done for various stuff.
We really don't want to apply patches that do various unrelated things, there are many reasons not to do so.
I can't do that, because I don't have time. Sorry. I posted most of upgrades in different files and for all important changes created new tickets. All that I can do now is apply all changes manually (by writing in the code)... But I'll leave it for you, because I don't know, which changes you like and which you don't like. Really, I think that all this changes should be applied (without loops patch and eventually board-mem patch), but you have sometimes strange (for me) arguments. But this is your project and you are deciding. Choose what you like - all changes are throughoutly tested.
comment:57 follow-up: ↓ 58 Changed 5 years ago by draqo
I saw that you add one of my patches :-) But there is a lot more. If you don't want, don't add any that doesn't change any regression (but at least some of them are helpful, some repairs errors - especially in DFA matcher, and there are new Visual Studio files). It would be nice if my work doesn't come to nothing. Anyway, there are few that change regression: nr #162, #163, #164, #168, #169, #170, #171, #172 and #173 - and I think that 19 PASSES and 5 FAILS are good result. But these patches works best combined together, because some of them repairs failures that some others generate. As for now I don't see any activity on these patches. Does somebody overlooked them?
comment:58 in reply to: ↑ 57 Changed 5 years ago by gunnar
Replying to draqo:
Anyway, there are few that change regression: nr #162, #163, #164, #168, #169, #170, #171, #172 and #173 - and I think that 19 PASSES and 5 FAILS are good result.
Yes, that's good results.
But these patches works best combined together, because some of them repairs failures that some others generate.
Been there, done that. When you've combined a series of modifications iteratively to fix failures the end result is not very predictable. E.g. suppose patch A fixes something but introduces new failures, patch B fixes those failures and introduces some of its own, and patch C finally fixes the latter without bad side effects. Then it may well happen that patch C is great on its own, A and C together does everything that all three patches do, and B in fact is no good. So we will cherrypick and test out things first separately and then in combination if needed.
As for now I don't see any activity on these patches. Does somebody overlooked them?
There's more going on than meets the eye but our time is limited too so things are moving slowly. Most of the current action is to be found in the regressiontest tab (see the upper right of the navigation bar) where regression results for patches are reported. Many of the patches still need to be rediffed so that they apply cleanly for (semi-)automated testing.
comment:59 Changed 5 years ago by gunnar
For the record, all-in-one2.zip applies cleanly on top of r2379. It doesn't build cleanly on Linux due to init_sockets() being #defined to emptiness but that's easily fixed. The regression delta agrees with the previously stated 19 PASS / 5 FAIL, more precisely
filllib:35 PASS A8 [A8|B8] viking:1 PASS O15 [O15] nicklas3:1403 PASS H9 [J6|H9] nngs:700 FAIL K18 [K17] trevorc:410 PASS G12 [G12] global:44 PASS D5 [D5] semeai:47 PASS 1 0 R9 [1 0 (PASS|T11|T9|R9|T10|R10|R11)] trevord:680 FAIL R15 [S16] strategy4:205 PASS P11 [P11] handtalk:15 FAIL M19 [J3] nngs2:150 FAIL J18 [M3|L3] nngs2:600 PASS B5 [B5] nngs4:270 PASS M1 [!P5] strategy5:295 PASS J2 [J2] ninestones:260 PASS B16 [B16] ninestones:280 PASS S2 [R1|S1|S2] ninestones:340 PASS H4 [H4] ninestones:700 PASS K15 [K15] thrash:26 PASS A1 [E1|A1] 9x9:120 PASS F1 [F1] 9x9:197 FAIL D8 [E8|H5] 9x9:300 PASS D2 [D2] 9x9:640 PASS A8 [A8|G6|J6] 9x9:650 PASS E1 [E1] 19 PASS 5 FAIL Total nodes: 1735615389 3337766 12649478 (+2.3% +0.043% +2.7%)
