A Dynamic Reward-Based Deep Reinforcement Learning for IoT Intrusion Detection

Summary:

IoT devices are increasingly vulnerable to cyberattacks because they lack strong computing and storage resources, and traditional intrusion detection systems (IDS) struggle to keep up with new attacks. The paper introduces a deep reinforcement learning (DRL) model that automatically learns to detect both common and rare attack types. A key component is a dynamic reward function, which gives the model different rewards depending on how difficult a sample is to classify; this pushes the system to pay more attention to attack categories that appear rarely in the dataset.
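The dynamic reward idea can be sketched as follows. This is a minimal illustration, not the paper's exact reward schedule: the class names, counts, and scaling rule below are all hypothetical.

```python
def dynamic_reward(predicted, actual, class_counts):
    """Reward scaled by class rarity: correct calls on rare (minority)
    attack classes earn more than correct calls on common ones.
    Illustrative only -- not the paper's exact reward function."""
    total = sum(class_counts.values())
    # Rarer classes get a weight closer to 1; the majority class gets the least.
    weight = 1.0 - class_counts[actual] / total
    if predicted == actual:
        return 1.0 + weight    # bonus for correctly classifying a rare sample
    return -(1.0 + weight)     # heavier penalty for missing a rare attack

# Hypothetical, heavily imbalanced label counts (Bot-IoT-like in shape only)
counts = {"DDoS": 1_900_000, "DoS": 1_300_000, "Theft": 1_600}
```

Under this scheme a correct detection of the rare "Theft" class earns nearly 2.0, while a correct detection of the dominant "DDoS" class earns closer to 1.4, so the agent is nudged toward the minority classes.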

Key Points:

  1. IoT devices’ limitations create major security vulnerabilities.
  2. Traditional IDS struggle to detect zero-day or evolving attacks.
  3. The DRL model builds on value-based methods (DQN and SARSA) trained on encoded network-traffic data.
  4. A dynamic reward function increases recognition of minority attack classes.
  5. Bot-IoT dataset used; ~3.3 million cleaned samples after preprocessing.
  6. Achieves ~99% accuracy in classification.
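Framing intrusion detection as reinforcement learning can be sketched as a one-step episode: the state is a traffic feature vector, the action is the predicted class, and the reward is +1/-1 for a correct/incorrect call. The toy linear Q-function, synthetic data, and hyperparameters below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical, not the paper's network): each "state" is a
# feature vector and each "action" is a predicted traffic class.
n_features, n_classes = 8, 4
W = np.zeros((n_classes, n_features))   # linear Q-function: Q(s, a) = W[a] @ s
alpha, eps = 0.1, 0.2

def step(state, label):
    """One one-step episode: epsilon-greedy action, +1/-1 reward, then move
    Q(state, action) toward the reward (no next state, so no bootstrapping)."""
    q = W @ state
    if rng.random() < eps:
        action = int(rng.integers(n_classes))   # explore
    else:
        action = int(np.argmax(q))              # exploit
    reward = 1.0 if action == label else -1.0
    W[action] += alpha * (reward - q[action]) * state
    return action

# Train on synthetic data whose label is the index of the largest early feature.
for _ in range(10_000):
    s = rng.random(n_features)
    y = int(np.argmax(s[:n_classes]))
    step(s, y)
```

A real DQN replaces the linear scorer with a neural network and adds a replay buffer and target network; combined with the dynamic reward above, misclassified minority-class samples would contribute larger updates.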

Images

Figure 1: Schematic diagram of the DRL-based intrusion detection model (DRL_SchematicDiagram)
Figure 2: Data distribution chart (DRL_DistributionChart)
Figure 3: Confusion matrix plots, with the correct labels on the vertical axis and the model's predicted labels on the horizontal axis (DRL_BinaryVs.Multi): (a) binary classification confusion matrix, (b) multiclass classification confusion matrix.

Bibliography Citation:

K. Ren, L. Liu, H. Bai and Y. Wen, "A Dynamic Reward-Based Deep Reinforcement Learning for IoT Intrusion Detection,"
2024 2nd International Conference on Intelligent Communication and Networking (ICN), Shenyang, China, 2024,
pp. 110–114, doi: 10.1109/ICN64251.2024.10865958.