Predicting Host Compromise Risk on Linux Hosts Using Machine Learning over CIS Compliance Failures
DOI:
https://doi.org/10.47392/IRJAEH.2026.0447
Keywords:
Linux security, Machine learning, Configuration compliance, Predictive security analytics, Risk assessment.
Abstract
Configuration compliance tools built on security standards such as the Center for Internet Security (CIS) Benchmarks and on security frameworks such as NIST SP 800-53 are widely used to secure Linux systems. However, these tools mostly produce pass/fail reports on system configuration, leaving the security administrator to interpret what the compliance failures mean. This study presents a machine learning-based system that converts the problems identified by CIS Benchmark compliance scans into warning signs of possible cyber attacks. Unlike traditional vulnerability-based methods, the proposed framework does not rely on the CVE database or exploit traces; it depends only on the misconfigurations and hardening issues detected by CIS Benchmark scans. A systematic feature engineering process takes the compliance data from multiple Linux virtual machines and groups low-level control failures into security domains. A deterministic rule-based approach then maps the misconfigurations to potential adversarial objectives, which yields a supervised multi-class classification problem. A random forest classifier is applied to predict the types of attack that are likely to occur, including brute force, privilege escalation, persistence, remote, and multi-vector attacks, among others.
The framework is assessed experimentally using multi-class performance metrics. The results indicate that configuration compliance failures carry enough predictive information to predict accurately the types of attack that can be expected. Feature importance analysis additionally shows which CIS domains matter most for the predicted attack types. The research thus bridges the gap between configuration auditing and proactive defence.
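The domain grouping and deterministic labelling steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the control identifiers, the domain names, the rule thresholds, and the extra `low_risk` class are hypothetical.

```python
from collections import Counter

# Hypothetical control identifiers mapped to hypothetical security domains.
# Real CIS Benchmark control IDs and the paper's domain taxonomy may differ.
CONTROL_DOMAIN = {
    "ctl-ssh-root-login": "ssh",
    "ctl-ssh-max-auth-tries": "ssh",
    "ctl-pam-lockout": "authentication",
    "ctl-sudo-nopasswd": "privilege",
    "ctl-suid-audit": "privilege",
    "ctl-cron-perms": "scheduled_jobs",
    "ctl-firewall-default-deny": "network",
}


def domain_features(failed_controls):
    """Count failed controls per security domain for one host."""
    counts = Counter(
        CONTROL_DOMAIN[c] for c in failed_controls if c in CONTROL_DOMAIN
    )
    return dict(counts)


def label_host(features):
    """Apply deterministic rules that map domain failure counts to one class.

    The thresholds are illustrative assumptions, not the paper's rules.
    """
    triggered = []
    if features.get("ssh", 0) + features.get("authentication", 0) >= 2:
        triggered.append("brute_force")
    if features.get("privilege", 0) >= 1:
        triggered.append("privilege_escalation")
    if features.get("scheduled_jobs", 0) >= 1:
        triggered.append("persistence")
    if features.get("network", 0) >= 1:
        triggered.append("remote")
    if not triggered:
        return "low_risk"  # assumed fallback class, not named in the abstract
    if len(triggered) > 1:
        return "multi_vector"
    return triggered[0]


# Example: two hosts with different sets of failed controls.
host_a = ["ctl-ssh-root-login", "ctl-pam-lockout"]
host_b = ["ctl-sudo-nopasswd", "ctl-cron-perms"]
label_a = label_host(domain_features(host_a))
label_b = label_host(domain_features(host_b))
```

Under these assumptions, the per-domain counts would serve as the feature vector and the rule output as the target label when training a multi-class classifier such as a random forest.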
License
Copyright (c) 2026 International Research Journal on Advanced Engineering Hub (IRJAEH)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.