Proximal Reward Shaping with Action Masking to create a Pacifist NetHack Agent

dc.contributor.advisor: Dr. Raluca D. Gaina
dc.contributor.author: WADHA SAUD NASSER ALHAMDAN
dc.date: 2022
dc.date.accessioned: 2022-06-04T19:33:28Z
dc.date.available: 2022-04-19 20:26:09
dc.date.available: 2022-06-04T19:33:28Z
dc.description.abstract: Reward shaping is a classic and effective technique in reinforcement learning that uses domain knowledge to guide agents to a solution. This project implements a proximal variation of reward shaping that rewards or penalizes the agent for being in proximity to certain entities. It also experiments with three versions of action masking, a technique that prevents the agent from performing sets of actions. We perform 10 experiments, producing training plots, testing results, and 10 testing videos to assess each agent qualitatively and quantitatively. This paper presents the results of using these two methods to create a pacifist agent in the game NetHack with the NetHack Learning Environment (NLE). Given the complexity and depth of NetHack and the difficulty of maintaining a pacifist agent, this project could not create such an agent using these methods. Despite the lackluster results, several agents were created, trained, and analyzed; the results, even if disappointing, are still valuable for future research.
dc.format.extent: 8
dc.identifier.other: 110781
dc.identifier.uri: https://drepo.sdl.edu.sa/handle/20.500.14154/66312
dc.language.iso: en
dc.publisher: Saudi Digital Library
dc.title: Proximal Reward Shaping with Action Masking to create a Pacifist NetHack Agent
dc.type: Thesis
sdl.degree.department: Big Data Science
sdl.degree.grantor: Queen Mary University of London
sdl.thesis.level: Master
sdl.thesis.source: SACM - United Kingdom
