This article is rated Start-class on Wikipedia's
content assessment scale. It is of interest to the following WikiProjects: | |||||||||||
|
The pseudo code is correct, but it's hard to interpreter it correctly. I've successfully implemented it.
The code is *WRONG*! For negamax we must negate the heuristic value of node for player 2. Also, if the root of this call is for player 2, we must negate the final result. Please correct the code.
def negamax(node, depth, alpha, beta): if type(node) == type(1): return node else: for child in node: alpha = max(alpha, -negamax(child, depth-1, -beta, -alpha)) # the following if statement constitutes alpha-beta pruning} if alpha>=beta: return beta return alpha
>>> tree = [ 1 ] >>> negamax(tree, 5, -1000, 1000) -1
JSB73 ( talk) 22:38, 28 April 2009 (UTC)
The code is definitely wrong as per the reasons stated above. More importantly the code is completely unreferenced. Here is a link that mentions the negation of the heuristic: http://www.hamedahmadi.com/gametree/
I'd fix this myself if I didn't have to make up for the time that I lost trying to debug my java implementation of this negamax pseudocode. 202.89.191.178 ( talk) 17:19, 6 May 2009 (UTC)
MAKE MORE SENSE —Preceding unsigned comment added by 198.49.142.2 ( talk) 15:26, 19 February 2010 (UTC)
I would like to discuss about the last paragraph of this page:
« What can be confusing for beginners is how the heuristic value of the current node is calculated. In this implementation, the value is always calculated from the point of view of the player running the algorithm because of the color parameter. This is the same behavior as the normal minimax algorithm. If this parameter was not present, the evaluation function would need to return a score for the current player, i.e. the min or max player. »
I had really hard time to implement this algorithm directly from the given pseudo-code. It was mainly because I misunderstood the role of the 'color' variable. Indeed, in my implementation the alternation of players is handled outside of the algorithm and one can ask the object 'game' which is the current active player.
The fact is that my evaluation function for the leafs (or nodes, when 'depth=0') was already giving a '+1' when the root player was winning and a '-1' when the opponent of the root player was winning, so I discarded the color
argument...
I was wrong... but as much as the pseudo code was. Indeed, the evaluation function should give '+1' (positive number) if the current player is winning and '-1' (negative number) if the opponent of the current player is winning. It should be a local decision and not a decision based on the player who ran the initial call of the negamax function.
I realized, once this bug as been fixed in my implementation, the meaning of the last paragraph (which is partially wrong as one can notice that the 'color' parameter is here to invert the result of the evaluation function which was probably like my own function), but, for me, it should be rewritten in a clearer way. Thus, I propose this:
« This pseudo-code assumes that the heuristic value of a node/leaf will be given for the current player of the node and not for the player running the initial call. Meaning that a positive result means that the current player is gaining the given amount (and the opponent score is deduced of the exact same amount).
In the case of a heuristic value computed for the player running the initial call, one should take care of carrying the players' alternation to negate this effect.
Anyway, it must be understood that the negamax algorithm is the only one taking care of propagating the alternation and only local choices have to be taken. » —Preceding unsigned comment added by 82.242.180.106 ( talk) 14:48, 26 December 2010 (UTC)
I'm very appreciative of the code on this page, as it helped me significantly with implementing alpha-beta pruning with transposition tables.
But the following code on the page currently is incorrect:
if bestValue ≤ alphaOrig ttEntry.Flag := UPPERBOUND else if bestValue ≥ β ttEntry.Flag := LOWERBOUND else ttEntry.Flag := EXACT endif
In fact this will lead to different results than alpha-beta pruning alone. I have tested this with my own working code. alphaOrig is not a tight enough bound to avoid problems with nodes marked EXACT, even though the nodes marked UPPERBOUND and LOWERBOUND are correct (but not all nodes that should be marked UPPERBOUND and LOWERBOUND are currently marked, which is the crux of the problem actually).
The current alpha must in fact be used, and not the original alpha, if you want to get the same results as alpha-beta pruning alone. But alpha sometimes "masquerades" as beta in Negamax, I believe, because beta is never updated while processing. So you must be careful how the current alpha is used.
I'm not 100% sure since I'm not implementing Negamax in my code here, but I believe the correct code should look like the following:
if (color = 1) if bestValue ≤ alpha ttEntry.Flag := UPPERBOUND else ttEntry.Flag := EXACT endif else if bestValue ≥ alpha ttEntry.Flag := LOWERBOUND else ttEntry.Flag := EXACT endif — Preceding unsigned comment added by 66.192.104.10 ( talk) 21:29, 26 February 2015 (UTC) endif
Just leaving this here in case anyone runs into the same problem I did. — Preceding unsigned comment added by 66.192.104.10 ( talk) 21:16, 26 February 2015 (UTC)
I realize that my code above is different from other programming sites and references. I believe in fact that the existing sources are wrong, and that they did not notice because the problems (at least in my app) don't become apparent until 5 levels or deeper of lookahead, which I think is hard to realize. I think most folks verify with shallower trees. I spent a lot of time working trees out by hand and debugging using my visualization tool, and I can't see any remaining bugs, though it's always possible I missed them.
Even my changed version is not perfect -- it diverges from plain alpha-beta at 9 levels deep, but at least it's closer. I think this problem is fundamental to using a transposition table with alpha-beta pruning. The only way I could duplicate alpha-beta pruning EXACTLY was to disallow any transposition table use unless no alpha-beta pruning had happened at this layer or at any deeper layer, but that rules out an awful lot of cached nodes, and the difference in play is not large (another reason I think this mainly goes unnoticed). Furthermore, with better move ranking, this problem also becomes harder to notice. Probably I only noticed because I chose to implement my code for Reversi/Othello, where good move ranking is hard (in my opinion).
This is all due to original research on my part, and I'm not planning on publishing any of this, so I guess I'll never be able to convince folks to add this to the Wikipedia article. Probably that's the right way to do things anyway. But I at least wanted to have this information out there. Thanks for taking the time to review it. — Preceding unsigned comment added by 73.21.166.99 ( talk) 04:19, 16 March 2015 (UTC)
2A02:AB88:39C0:3900:CC2A:E662:9F18:5846 ( talk) 11:46, 28 April 2021 (UTC)
---
It appears to me that the use of the temporary variable alphaOrig in the pseudo code is confusing at best and may be divergent from some other references of "negamax with alpha beta pruning and transposition tables". The issue is that alpha can change during "Transposition Table Lookup; node is the lookup key for ttEntry", so perhaps alphaOrig should capture the value *after* this code rather than before.
For comparision, view A REVIEW OF GAME-TREE PRUNING, T.A. Marsland, page 14. In this pseudo-code version, no temporary variable is used, and alpha itself is not modified during move evaluation. (Rather than updating alpha, it recurses using the function -max( α ,score)). So alpha is the value *after* the transposition table lookup, not before as with alphaOrig in the Wikipedia pseudo-code.
So personally, I prefer the Marsland pseudo-code for education purposes. For implementation purposes, the code used by Alpha-Beta with Sibling Prediction Pruning in Chess, Jeroen W.T. Carolus is perhaps more clever, and no temporary variable is used (by checking LOWERBOUND before UPPERBOUND).
Janesn ( talk) 22:00, 27 September 2016 (UTC)
if(best <= alpha) // a lowerbound value StoreTTEntry(board.getHashKey(), best, LOWERBOUND, depth); else if(best >= beta) // an upperbound value StoreTTEntry(board.getHashKey(), best, UPPERBOUND, depth); else // a true minimax value StoreTTEntry(board.getHashKey(), best, EXACT, depth); return best;
Your comparision between Marsland and Breuker pinpoints why I find the early assignment of alphaOrig confusing. In Marsland it is clear that "flag" always captures the fact that the called negamax function did not complete exactly, thus setting upper bound. In Breuker, it can set the EXACT flag, which at least to me is not clear how he is able to make that optimization. My suspicion remains that a subsequent retrieval of the position in the transposition table could diverge from alpha-beta.
The Plaat code, while not negamax, has the same approach as Marsland. The alpha and beta variables are set after the transposition table lookup and not modified during move evaluation.
Janesn ( talk) 19:52, 1 October 2016 (UTC)
While it is OK to flag a known-to-be-exact value as lower or upper bound, it is suspicious to flag an upper bound as an exact value when the lower and upper bound scores differ. But this is what can happen in Breuker when the transposition table lookup updates α because it's a "lower bound" and then the called negamax function does not complete exactly. Marsland flags the new entry as an "upper bound" score. Both Breuker and Marsland completely replace the old entry, so the "lower bound" score is lost.
Plaat stores the lower bound and upper bound scores separately, so the same transposition table entry can be used for both lower bound and upper bound checking. Only when both scores are exactly the same is the transposition table entry deemed exact. Thus Plaat has the same outcome as Marsland, perhaps slightly faster.
As I am already repeating myself, this will be my last posting on the subject. Thank you for your discussion. Janesn ( talk) 21:10, 3 October 2016 (UTC)
The link to the C99 example code under external links is dead. — Preceding unsigned comment added by 2605:A000:131A:3E:ECE3:A970:240E:6901 ( talk) 09:12, 12 February 2016 (UTC)