Mohamed, Nahed Belhadj, Kaddoum, Georges and Hassan, Md. Zoheb.
2026.
"Digital twin-driven continual deep reinforcement learning for coexistence of multiple radio access technology IoT links with nonlinear receivers".
IEEE Transactions on Machine Learning in Communications and Networking, vol. 4.
pp. 591-611.
Kaddoum-G-2026-33553.pdf - Published version. License: Creative Commons CC BY.
Abstract
This article investigates the coexistence of downlink Internet-of-Things (IoT) links enabled by multiple radio access technologies (RATs), including long-term evolution (LTE) and 5G new radio (NR). The coexistence of multi-RAT IoT links is significantly challenged by adjacent channel interference (ACI) and hardware impairments (HWI) that arise from practical low-complexity radio-frequency front ends. To mitigate these challenges, we propose a radio resource optimization scheme that dynamically adjusts link adaptation parameters (transmit power, modulation, and coding rate) to maximize overall throughput while explicitly accounting for ACI and HWI. However, the proposed optimization is an NP-hard mixed-integer nonlinear programming problem that requires global channel state information and centralized optimization, making it impractical for large-scale, dynamic multi-RAT IoT networks. To enable distributed optimization under ACI and HWI, we reformulate the problem as a Markov game and develop a multi-agent deep reinforcement learning (MADRL) framework that derives equilibrium link adaptation policies from local observations. Direct deep reinforcement learning (DRL) training in real networks, however, incurs high communication overhead and can cause adverse effects due to random exploration. To overcome these limitations, we introduce a context-aware digital twin network (DTN) that provides a safe and efficient virtual environment for training. In particular, we propose a novel DTN-empowered MADRL scheme that employs a replay memory-based continual model updating strategy, enabling policies to be learned from DT-generated experiences and periodically refined with real network data. This approach alleviates the need for frequent physical network interactions and significantly reduces communication overhead.
Extensive simulations demonstrate that the proposed framework is scalable, computationally efficient, and robust in dynamic IoT environments, while outperforming 3GPP-standardized link adaptation in the presence of non-negligible ACI and HWI.
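The replay memory-based continual updating idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MixedReplayMemory`, the `real_fraction` parameter, and the two-buffer layout are all hypothetical choices made here, assuming only that training batches mix digital-twin experiences with a smaller share of real-network experiences.

```python
import random
from collections import deque


class MixedReplayMemory:
    """Hypothetical replay memory with separate digital-twin and real buffers.

    Assumption (not from the paper): each training batch draws a fixed
    fraction of transitions from real-network data and the rest from
    digital-twin (DT) generated data.
    """

    def __init__(self, capacity, real_fraction=0.25, seed=0):
        self.dt = deque(maxlen=capacity)    # DT-generated transitions
        self.real = deque(maxlen=capacity)  # real-network transitions
        self.real_fraction = real_fraction
        self.rng = random.Random(seed)

    def add(self, transition, source):
        """Store a transition; source must be 'dt' or 'real'."""
        if source not in ("dt", "real"):
            raise ValueError("source must be 'dt' or 'real'")
        (self.real if source == "real" else self.dt).append(transition)

    def sample(self, batch_size):
        """Return a shuffled batch mixing real and DT transitions."""
        # Take as many real samples as the target fraction allows,
        # limited by how many real transitions are stored.
        n_real = min(len(self.real), round(batch_size * self.real_fraction))
        # Fill the remainder of the batch from the DT buffer.
        n_dt = min(len(self.dt), batch_size - n_real)
        batch = self.rng.sample(list(self.real), n_real)
        batch += self.rng.sample(list(self.dt), n_dt)
        self.rng.shuffle(batch)
        return batch
```

In this sketch an agent would call `add(..., "dt")` during virtual training and `add(..., "real")` only when real network data arrives, so batches fall back to DT-only data while the real buffer is empty.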
| Document type: | Peer-reviewed journal article |
|---|---|
| Researcher: | Kaddoum, Georges |
| Affiliation: | Electrical Engineering |
| Deposit date: | 01 Apr. 2026 20:21 |
| Last modified: | 22 Apr. 2026 19:47 |
| URI: | https://espace2.etsmtl.ca/id/eprint/33553 |

