SACM - United Kingdom

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9667


Search Results


Embargo
Reinforcement Learning For Game Theory
(Saudi Digital Library, 2028-07) Ulian, Salem; Hedges, Jules
Reinforcement learning, especially deep reinforcement learning, has recently achieved impressive results in playing complex board games like chess and Go, as well as video games such as StarCraft II. However, there has been limited research into how these techniques work with strategic games from game theory. This project aims to create a reinforcement learning system that learns to play a repeated game, such as the iterated prisoner's dilemma, against itself and to compare its performance with traditional strategies.
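The self-play setup described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes tabular Q-learning rather than deep reinforcement learning, the common payoff values 5/3/1/0, a state made of the previous joint move, and tit-for-tat as the traditional strategy used for comparison. All names (`QAgent`, `self_play`, `play_vs_tit_for_tat`) are hypothetical.

```python
import random

# Assumed payoffs for the prisoner's dilemma (temptation 5, reward 3,
# punishment 1, sucker 0); the abstract does not specify them.
C, D = 0, 1
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def tit_for_tat(opponent_last):
    """Cooperate on the first round, then copy the opponent's previous move."""
    return C if opponent_last is None else opponent_last


class QAgent:
    """Tabular Q-learner; the state is (own last move, opponent last move)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.q = {}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random(0)

    def values(self, state):
        return self.q.setdefault(state, [0.0, 0.0])

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return self.rng.choice((C, D))
        v = self.values(state)
        return C if v[C] >= v[D] else D

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update.
        v = self.values(state)
        target = reward + self.gamma * max(self.values(next_state))
        v[action] += self.alpha * (target - v[action])


def self_play(rounds=5000, seed=0):
    """Train a single Q-table against itself: both seats share one agent."""
    agent = QAgent(rng=random.Random(seed))
    s1 = s2 = None  # no history before the first round
    for _ in range(rounds):
        a1, a2 = agent.act(s1), agent.act(s2)
        r1, r2 = PAYOFF[(a1, a2)]
        n1, n2 = (a1, a2), (a2, a1)  # each seat sees its own move first
        agent.update(s1, a1, r1, n1)
        agent.update(s2, a2, r2, n2)
        s1, s2 = n1, n2
    return agent


def play_vs_tit_for_tat(agent, rounds=100):
    """Average per-round payoffs (agent, tit-for-tat) with exploration off."""
    saved, agent.epsilon = agent.epsilon, 0.0
    state, agent_last = None, None
    total_agent, total_tft = 0, 0
    for _ in range(rounds):
        a = agent.act(state)
        t = tit_for_tat(agent_last)
        ra, rt = PAYOFF[(a, t)]
        total_agent += ra
        total_tft += rt
        state, agent_last = (a, t), a
    agent.epsilon = saved
    return total_agent / rounds, total_tft / rounds
```

Calling `play_vs_tit_for_tat(self_play())` returns the two average payoffs; which strategy the learner settles on depends on the assumed hyperparameters and seed.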
Embargo
Reinforcement Learning for Game Theory
(Saudi Digital Library, 2028-07) Ulian, Salem; Hedges, Jules
Reinforcement learning, especially deep reinforcement learning, has recently achieved impressive results in playing complex board games like chess and Go, as well as video games such as StarCraft II. However, there has been limited research into how these techniques work with strategic games from game theory. This project aims to create a reinforcement learning system that learns to play a repeated game, such as the iterated prisoner's dilemma, against itself and to compare its performance with traditional strategies.
Embargo
Reinforcement Learning for Game Theory
(Saudi Digital Library, 2028-07) Ulian, Salem; Hedges, Jules
Reinforcement learning, especially deep reinforcement learning, has recently achieved impressive results in playing complex board games like chess and Go, as well as video games such as StarCraft II. However, there has been limited research into how these techniques work with strategic games from game theory. This project aims to create a reinforcement learning system that learns to play a repeated game, such as the iterated prisoner's dilemma, against itself and to compare its performance with traditional strategies.
Restricted
Measuring Human’s Trust in Robots in Real-time During Human-Robot Interaction
(Swansea University, 2025) Alzahrani, Abdullah Saad; Muneeb, Imtiaz Ahmad
This thesis presents a novel, holistic framework for understanding, measuring, and optimising human trust in robots, integrating cultural factors, mathematical modelling, physiological indicators, and behavioural analysis to establish foundational methodologies for trust-aware robotic systems. Through this comprehensive approach, we address the critical challenge of trust calibration in human-robot interaction (HRI) across diverse contexts. Trust is essential for effective HRI, impacting user acceptance, safety, and overall task performance in both collaborative and competitive settings. This thesis investigated a multi-faceted approach to understanding, modelling, and optimising human trust in robots across various HRI contexts. First, we explored cultural and contextual differences in trust, conducting cross-cultural studies in Saudi Arabia and the United Kingdom. Findings showed that trust factors such as controllability, usability, and risk perception vary significantly across cultures and HRI scenarios, highlighting the need for flexible, adaptive trust models that can accommodate these dynamics. Building on these cultural insights as a critical dimension of our holistic trust framework, we developed a mathematical model that emulates the layered framework of trust (initial, situational, and learned) to estimate trust in real-time. Experimental validation through repeated interactions demonstrated the model's ability to dynamically calibrate trust with both trust perception scores (TPS) and interaction sessions serving as significant predictors. This model showed promise for adaptive HRI systems capable of responding to evolving trust states. To further enhance our comprehensive trust measurement approach, this thesis explored physiological behaviours (PBs) as objective indicators. 
By using electrodermal activity (EDA), blood volume pulse (BVP), heart rate (HR), skin temperature (SKT), eye blinking rate (BR), and blinking duration (BD), we showed that specific PBs (HR, SKT) vary between trust and distrust states and can effectively predict trust levels in real-time. Extending this approach, we compared PB data across competitive and collaborative contexts and employed incremental transfer learning to improve predictive accuracy across different interaction settings. Recognising the potential of less intrusive trust indicators, we also examined vocal and non-vocal cues, such as pitch, speech rate, facial expressions, and blend shapes, as complementary measures of trust. Results indicated that these cues can reliably assess current trust states in real-time and predict trust development in subsequent interactions, with trust-related behaviours evolving over time in repeated HRI sessions. Our comprehensive analysis demonstrated that integrating these expressive behaviours provides quantifiable measurements for capturing trust, establishing them as reliable metrics within real-time assessment frameworks. As the final component of our integrated trust framework, this thesis explored reinforcement learning (RL) for trust optimisation in simulated environments. Integrating our trust model into an RL framework, we demonstrated that dynamically calibrated trust can enhance task performance and reduce the risks of both under- and over-reliance on robotic systems. Together, these multifaceted contributions advance a holistic understanding of trust measurement and calibration in HRI, encompassing cultural insights, mathematical modelling, physiological and expressive behaviour analysis, and adaptive control. This integrated approach establishes foundational methodologies for developing trust-aware robots capable of enhancing collaborative outcomes and fostering sustained user trust in real-world applications.
The framework presented in this thesis represents a significant advancement in creating robotic systems that can dynamically adapt to human trust states across diverse contexts and interaction scenarios.
Restricted
Energy Efficient D2D Communications Underlay Future Wireless Networks
(University of Exeter, 2023-05-09) Alenezi, Sami Mohammed L; Min, Geyong; Luo, Chunbo
From the first generation of mobile networks to the present, the demand for network bandwidth and energy has grown significantly as a result of the growth in users and applications. In the future, there will be billions of heterogeneous connected devices requiring high-quality network services. The demands of these cellular users are difficult to satisfy with currently available technologies, particularly because of limited spectrum resources. Device-to-Device (D2D) communication is a potential strategy for improving device performance by allowing direct communication between user pairs that are close to each other. Reducing network latency, decreasing energy consumption, increasing throughput, and improving coverage area are potential advantages of using D2D communications. However, operating D2D communications in cellular networks raises key problems that directly or indirectly affect energy and spectrum efficiency, for example, interference between D2D devices, interference between D2D devices and cellular devices, device discovery, and mode selection. Although traditional techniques have been proposed to solve such problems, device position, transmission power, and channel conditions are typically dynamic, particularly in the future dynamic cellular network environment. Because traditional optimisation techniques face increasing difficulty in rapidly changing environments, machine learning techniques have become a promising tool for effective resource allocation and interference management. From this standpoint, this thesis aims to propose machine learning-based methods to increase the energy efficiency of D2D-assisted cellular networks.
The contributions from the machine learning perspective are that the state space, action space, and reward function are defined in both a distributed and a centralised manner to further specify the problem, and that a reinforcement learning-based method is used to maximise energy efficiency. To be more specific, the key contributions of this thesis are as follows:
- Few studies have investigated the impact of user mobility on the energy and spectrum efficiency of D2D communications. The effect of user mobility on the energy efficiency of D2D communications in high-speed scenarios has not been thoroughly studied, especially since state-of-the-art research assumes very low user speeds. Thus, more research is needed to explain how D2D performance could be improved in dynamic scenarios. This thesis investigates 1) the impact of mobility on D2D communication, in order to better understand the operational efficiency of next-generation cellular network-assisted D2D technologies; 2) the potential of Machine Learning (ML) algorithms to mitigate the negative impact of unpredictable user mobility; and 3) the performance gain of the proposed methods over other ML and more traditional methods.
- The thesis further studies the energy efficiency of D2D communications in cellular networks. In particular, it proposes a centralised power control algorithm based on reinforcement learning to optimise energy usage while minimising interference to cellular users in order to maintain the Quality of Service (QoS). The centralised power control algorithm is hosted at the base station. Simulation results show that, compared to the benchmark algorithms, the proposed method can effectively increase system energy efficiency while maintaining cellular user QoS.
- Moreover, to optimise resource allocation and improve energy efficiency, the thesis proposes a Proximal Policy Optimisation (PPO)-based joint channel selection and power allocation scheme formulated as a Markov Decision Process (MDP). Channel selection and power allocation are jointly considered with the aim of maximising the overall energy efficiency of the network while guaranteeing the minimum QoS requirement. Extensive simulation experiments have been carried out to validate the effectiveness of the proposed method. The results show that the proposed method outperforms other existing algorithms in terms of energy efficiency and training time.
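The energy-efficiency objective and QoS constraint mentioned in this abstract can be made concrete with a small sketch. The abstract does not give the thesis's system model, so everything here is an assumption: energy efficiency is taken as total Shannon rate divided by total consumed power (bits per joule), QoS is modelled as a minimum SINR for the cellular user, and the function names and the fixed penalty value are hypothetical.

```python
import math


def shannon_rate(bandwidth_hz, tx_power_w, channel_gain, interference_w, noise_w):
    """Achievable rate in bit/s from the Shannon capacity formula."""
    sinr = tx_power_w * channel_gain / (interference_w + noise_w)
    return bandwidth_hz * math.log2(1.0 + sinr)


def energy_efficiency(rates_bps, tx_powers_w, circuit_power_w=0.0):
    """Assumed definition: sum of link rates over total power, in bit/J."""
    total_power = sum(tx_powers_w) + circuit_power_w * len(tx_powers_w)
    if total_power <= 0:
        raise ValueError("total power must be positive")
    return sum(rates_bps) / total_power


def reward(rates_bps, tx_powers_w, cellular_sinr, sinr_min, penalty=-1.0):
    """Hypothetical RL reward: energy efficiency, or a fixed penalty
    when the cellular user's minimum-SINR (QoS) constraint is violated."""
    if cellular_sinr < sinr_min:
        return penalty
    return energy_efficiency(rates_bps, tx_powers_w)
```

A reinforcement learning agent choosing channels and power levels would receive `reward(...)` after each step; shaping the penalty differently (for example, proportional to the SINR shortfall) is an equally plausible design choice.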
