Proximal Reward Shaping with Action Masking to create a Pacifist NetHack Agent

WADHA SAUD NASSER ALHAMDAN

dc.contributor.advisor	Dr. Raluca D. Gaina
dc.contributor.author	WADHA SAUD NASSER ALHAMDAN
dc.date	2022
dc.date.accessioned	2022-06-04T19:33:28Z
dc.date.available	2022-04-19 20:26:09
dc.date.available	2022-06-04T19:33:28Z
dc.description.abstract	Reward shaping is a classic and effective reinforcement learning technique that uses domain knowledge to guide an agent toward a solution. This project implements a proximal variant of reward shaping that rewards or penalizes the agent for being in proximity to certain entities. It also experiments with three versions of action masking, a technique that prevents the agent from performing specified sets of actions. We run 10 experiments, producing training plots, testing results, and 10 testing videos to assess the agents qualitatively and quantitatively. This paper presents the results of applying these two methods to create a pacifist agent in the game NetHack using the NetHack Learning Environment (NLE). Given the complexity and depth of NetHack and the difficulty of maintaining a pacifist agent, this project could not create such an agent using these methods. Despite the lackluster results, several agents were created, trained, and analyzed; these results, even if disappointing, remain valuable for future research.
dc.format.extent	8
dc.identifier.other	110781
dc.identifier.uri	https://drepo.sdl.edu.sa/handle/20.500.14154/66312
dc.language.iso	en
dc.publisher	Saudi Digital Library
dc.title	Proximal Reward Shaping with Action Masking to create a Pacifist NetHack Agent
dc.type	Thesis
sdl.degree.department	Big Data Science
sdl.degree.grantor	Queen Mary University of London
sdl.thesis.level	Master
sdl.thesis.source	SACM - United Kingdom
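
The two techniques named in the abstract can be illustrated with a minimal, library-free sketch. It is not the thesis's implementation and does not call NLE. The entity kinds, weights, radius, the use of Chebyshev (grid) distance, and every function name below are assumptions chosen only for illustration.

```python
import math

# Hypothetical sketch: all names, weights, and the radius are illustrative assumptions.

def proximity_bonus(agent_pos, entities, weights, radius=3):
    """Sum the weight of every entity within `radius` grid cells of the agent.

    `entities` is a list of (kind, (x, y)) pairs; `weights` maps kind -> bonus
    (positive) or penalty (negative). Distance is Chebyshev, an assumption.
    """
    ax, ay = agent_pos
    total = 0.0
    for kind, (ex, ey) in entities:
        dist = max(abs(ax - ex), abs(ay - ey))
        if dist <= radius:
            total += weights.get(kind, 0.0)
    return total


def shaped_reward(env_reward, agent_pos, entities, weights, radius=3):
    """Environment reward plus the proximity-based shaping term."""
    return env_reward + proximity_bonus(agent_pos, entities, weights, radius)


def mask_logits(logits, forbidden):
    """Set the logits of forbidden action indices to -inf so they are never chosen."""
    return [-math.inf if i in forbidden else v for i, v in enumerate(logits)]


def greedy_action(logits, forbidden):
    """Pick the highest-scoring action among those that are not masked out."""
    masked = mask_logits(logits, forbidden)
    return max(range(len(masked)), key=masked.__getitem__)


# Example: a nearby monster is penalized, distant stairs are ignored.
entities = [("monster", (2, 2)), ("stairs", (10, 10))]
weights = {"monster": -0.5, "stairs": 1.0}
reward = shaped_reward(1.0, (1, 1), entities, weights)

# Example: action 1 (say, an attack) is masked, so the next-best action wins.
action = greedy_action([0.1, 2.0, 0.5], forbidden={1})
```

In this sketch, masking is applied to the policy's action scores before selection, so a forbidden action cannot be picked however highly the policy rates it, while shaping only changes the reward signal seen during training.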