Enhanced Monte Carlo Tree Search in Game-Playing AI: Evaluating Deepmind's Algorithms

Gonzalez, Karla

Please use this identifier to cite or link to this item: https://hdl.handle.net/11264/1502

Full metadata record

DC Field	Value	Language
dc.contributor.author	Gonzalez, Karla	-
dc.contributor.other	Royal Military College of Canada	en_US
dc.date.accessioned	2023-09-18T14:22:52Z	-
dc.date.available	2023-09-18T14:22:52Z	-
dc.date.issued	2023-09-18	-
dc.identifier.uri	https://hdl.handle.net/11264/1502	-
dc.description.abstract	In the realm of artificial intelligence (AI), game-playing algorithms have reached remarkable levels of performance, exemplified by DeepMind's AlphaGo, AlphaZero, and MuZero algorithms. Despite these achievements, there remains a need for a comprehensive evaluation and comparison of these algorithms, as well as an exploration of modifications to the underlying Monte Carlo Tree Search (MCTS) algorithm. This thesis takes a step toward addressing these needs, aiming to shed light on the strengths and limitations of these state-of-the-art algorithms, the impact of modifications to MCTS, and their performance across diverse game environments. The thesis first covers the foundations of reinforcement learning (RL) and the MCTS algorithm, providing a solid understanding of the key concepts. It then evaluates two modifications of the MCTS algorithm, Upper Confidence Bounds for Trees (UCT) and Loss Avoidance (LA). We found that both modifications, particularly when implemented together, improved not only MCTS in our selected environment but also AlphaZero when used within its MCTS search. Building on this foundation, the thesis turns to the implementation and evaluation of the DeepMind algorithms themselves: AlphaGo, AlphaZero, and MuZero. These algorithms are evaluated in a range of game environments varying in complexity and size. By subjecting them to identical computational constraints and neural network architectures, we found that although MuZero, unlike its sibling algorithms, learns a model, it outperforms them over time in fairly complex environments such as Othello and Pinball.	en_US
dc.description.abstract	In the field of artificial intelligence, game-playing algorithms have reached remarkable levels of performance, as shown by DeepMind's AlphaGo, AlphaZero, and MuZero algorithms. Despite these achievements, there remains a need for a complete implementation, evaluation, and comparison of these algorithms, as well as an exploration of modifications to the underlying Monte Carlo Tree Search (MCTS) algorithm. This thesis represents a significant step toward meeting these needs, aiming to shed light on the strengths and limitations of these state-of-the-art algorithms, the impact of modifications to the MCTS algorithm, and their performance in various game environments. The thesis first examines the foundations of reinforcement learning and the MCTS algorithm, giving a clear understanding of the key concepts. It then evaluates two modifications of the MCTS algorithm, Upper Confidence Bounds for Trees (UCT) and Loss Avoidance. We found that both modifications, particularly when implemented together, improved not only MCTS in our selected environment but also AlphaZero when used within its MCTS search. On this basis, the thesis turns to the implementation and evaluation of the DeepMind algorithms themselves: AlphaGo, AlphaZero, and MuZero. These algorithms are evaluated in a series of game environments of varying complexity and size. By subjecting them to identical computational constraints and neural network architectures, we found that although MuZero, unlike its sibling Alpha algorithms, learns a model, it performs better over time in fairly complex environments such as Othello and Pinball.	en_US
dc.language.iso	en	en_US
dc.subject	Monte Carlo Tree Search	en_US
dc.subject	Artificial Intelligence	en_US
dc.subject	Reinforcement Learning	en_US
dc.subject	Deep Learning	en_US
dc.subject	AlphaZero	en_US
dc.subject	Algorithmic Improvements	en_US
dc.title	Enhanced Monte Carlo Tree Search in Game-Playing AI: Evaluating Deepmind's Algorithms	en_US
dc.title.translated	Amélioration de la Recherche Arborescente Monte Carlo dans l'Intelligence Artificielle pour les Jeux : Évaluation des Algorithmes de DeepMind	en_US
dc.contributor.supervisor	Rivest, Francois	-
dc.date.acceptance	2023-08-21	-
thesis.degree.discipline	Computer Science/Sciences informatiques	en_US
thesis.degree.name	MSc (Master of Science/Maîtrise ès sciences)	en_US
Appears in Collections:	Theses

Files in This Item:

File	Description	Size	Format
MASTER_THESIS_Karla_Aug_2023 (1).pdf		1.89 MB	Adobe PDF
