Sunday, 29 March 2015

Revisiting Minimax

-D29, M03, Y15-

This week I am required to revisit a previous slog post of mine and explore it in greater detail. I am more than fine with this, for I have succeeded in implementing a working version of the minimax artificial intelligence strategy. As such, I will be revisiting my slog post from the 21st of March, with the exception of deep-copying, since I already went 'in too deep' analyzing copy issues.

After exploring a number of my fellow students' distinctly empty blogs, I stumbled upon a gif and concise interpretation of what it felt like to implement minimax and 'tippy' (an adaptation of tic-tac-toe): behold! Check out Xie's week 9 post. Also partake in the terror posted here under the Impressions of week 7 topic. What do these two 'sloggers' have in common? They both worked in groups and succeeded in creating minimax on time. As I stated on the 21st of March, I failed to do either.

BUT NOW, NOW I CAN RAISE MY FIST IN THE AIR, CHAMPION OF MINIMAX. Not only did I implement the strategy without utilising any form of summation--I instead determined each game move's chances of victory by dividing its outcome by the depth of recursion taken to reach it--but I did it without a group.
However, I must give major props to this man (I posted his slog link in my previous post as well) for opening my eyes to the fact that I could not depend upon the outcome method implemented within the games themselves. Why not depend on a game's outcome to determine a score? Quite simply because the outcome method takes a specific game-state as an argument; at a finished game-state, regardless of who the next player is, the result will always be a loss for that player. As such, upon reaching this end of the game--this leaf node in a branch of many game-states--it is necessary to compare the outcome's losing player to the player from the original--the root--game-state.
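The scoring idea above--divide a terminal outcome by the recursion depth needed to reach it, and compare the leaf's losing player against the root player--can be sketched roughly as follows. To keep it self-contained I use a toy game (players alternately take 1 or 2 stones; whoever takes the last stone wins, so at an empty pile the player to move has already lost); every class and method name here is a hypothetical stand-in, not the assignment's actual interface.

```python
class GameState:
    """Toy game: a pile of stones, players alternately take 1 or 2.
    Taking the last stone wins, so at a terminal state (empty pile)
    the player whose turn it is has already lost."""

    def __init__(self, pile, next_player):
        self.pile = pile
        self.next_player = next_player  # 'p1' or 'p2'

    def over(self):
        return self.pile == 0

    def moves(self):
        other = 'p2' if self.next_player == 'p1' else 'p1'
        return [GameState(self.pile - n, other)
                for n in (1, 2) if n <= self.pile]


def minimax_score(state, root_player, depth=1):
    """Score `state` from root_player's point of view: a won game is
    worth +1 divided by the depth, so nearer wins score higher."""
    if state.over():
        # The terminal outcome is always a loss for whoever is 'next'
        # at this leaf, so flip the sign depending on whether that
        # losing player is the player from the root game-state.
        return (-1.0 if state.next_player == root_player else 1.0) / depth
    scores = [minimax_score(s, root_player, depth + 1)
              for s in state.moves()]
    if state.next_player == root_player:
        return max(scores)   # the root player picks its best option
    return min(scores)       # the opponent picks the worst for the root


print(minimax_score(GameState(2, 'p1'), 'p1'))  # prints 0.5
```

With two stones, p1 can take both immediately (a win at depth 2, scoring 1/2), which beats the slower losing line, hence 0.5.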

Now, even with this success in mind, yet another challenge has arisen: optimisation of the dreaded minimax strategy. For the final assignment for this class, we were given the option to explore and optimise minimax (which I naturally chose, in my stubborn refusal to be bested by a program that other people managed to implement), or some other, clearly less exciting option which I skimmed over. But how can one optimise a program that always chooses the best move in any two-player game?
1) Memoization: Basically, if you have seen a game-state before, then don't bother recalculating its outcome; just grab the number obtained in the previous calculation. We human beings can easily recognize when we are wasting time: do you, at this point in time, really need to calculate 5/2 via long division each and every time you see it? Surely not! Your brain has created something like a dictionary for such menial tasks, e.g., {'5/2': 2.5}.
2) Pruning: The more one looks at the game-states as a tree, the better. To prune a tree, one cuts off unnecessary branches and leaves. Similarly, if you have 50 moves to choose from, each leading to who knows how many more choices, it would be great to cut short any computation that leads to something worse than a choice already discovered.
3) Myopia: Or maybe, we are just lazy, so rather than spend time computing all these things, we should just stop after going n levels deep into the game-state tree. If we find an outcome by that nth level, we'll take note of it; but if we don't, we'll have some other method determine a rough outcome for that state of the game without looking any deeper.

Noticeably, the last optimisation is the least accurate of the three, which also makes it the easiest to implement (I speak from experience). Memoization is more difficult: my current version is not adding as many branches and outcomes to the dictionary as it should. Finally, Pruning stands squarely as my greatest adversary of the three. None of the detailed explanations of pruning I have heard from professors and the World Wide Web have been of great help. I can see how it would work if the optimisation finds a really strong move early on, allowing it to toss the remaining x number of options in the trash for this gem of a move; however, I can't escape the intuition that one must at least glance at the other outcomes to make sure that one has indeed caught hold of the best choice. Hopefully, the majority of this intuition is correct, and the correct implementation(s) of pruning in minimax will not require significant alterations to the structure of my minimax algorithm.

In sum, minimax is still a dreadful monster of a program; nevertheless, it is not impossible, and it has been a powerful teacher of the utility of recursion. Even if my grade does not reflect what I have come to understand, I at least will know that I have learned quite a bit.

Thank you for reading.
