Predicting Host Compromise Risk on Linux Hosts Using Machine Learning over CIS Compliance Failures

Rahul Bhattarai; Sandeep M; G Jai Ganesh; Maranco M

doi:10.47392/IRJAEH.2026.0447

Authors

Rahul Bhattarai, UG - CSE-Cyber security, Department of Networking and Communications, SRM Institute of Science and Technology, Chennai, Tamilnadu, India
Sandeep M, UG - CSE-Cyber security, Department of Networking and Communications, SRM Institute of Science and Technology, Chennai, Tamilnadu, India
G Jai Ganesh, UG - CSE-Cyber security, Department of Networking and Communications, SRM Institute of Science and Technology, Chennai, Tamilnadu, India
Maranco M, Assistant Professor, Department of Networking and Communications, SRM Institute of Science and Technology, Chennai, Tamilnadu, India

DOI:

https://doi.org/10.47392/IRJAEH.2026.0447

Keywords:

Linux security, Machine learning, Configuration compliance, Predictive security analytics, Risk assessment.

Abstract

Configuration compliance tools built on security standards such as the Center for Internet Security (CIS) Benchmarks and frameworks such as NIST SP 800-53 are widely used to secure Linux systems. However, these tools mostly produce pass/fail reports on system configuration, leaving the security administrator to interpret what the compliance failures mean. This study presents a machine-learning-based system that converts the failures identified by CIS Benchmark compliance scans into warning signs of possible cyber attacks. Unlike traditional vulnerability-based methods, the proposed framework does not rely on the CVE database or exploit traces; it depends only on the misconfigurations and hardening issues detected by CIS Benchmark scans. A systematic feature engineering process takes compliance data from multiple Linux virtual machines and groups low-level control failures into security domains. A deterministic rule-based approach then maps the misconfigurations to potential adversarial objectives, which yields a supervised multi-class classification problem. A random forest classifier predicts the types of attack that are likely to occur, including brute force, privilege escalation, persistence, remote, and multi-vector attacks.
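The feature engineering and rule-based labelling steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the control-ID prefixes, the domain names, the thresholds, and the `low_risk` fallback label are all hypothetical, since the abstract does not give the actual mapping or rules.

```python
from collections import Counter

# Hypothetical mapping from CIS control-ID prefixes to security domains.
# These prefixes and domain names are illustrative assumptions only.
DOMAIN_BY_PREFIX = {
    "5.2.": "ssh",
    "5.3.": "authentication",
    "6.1.": "file_permissions",
    "4.1.": "auditing",
    "3.": "network",
}


def domain_failure_counts(failed_controls):
    """Aggregate low-level failed control IDs into per-domain failure counts."""
    counts = Counter()
    for control_id in failed_controls:
        for prefix, domain in DOMAIN_BY_PREFIX.items():
            if control_id.startswith(prefix):
                counts[domain] += 1
                break
        else:
            # No known prefix matched this control ID.
            counts["other"] += 1
    return counts


def label_host(counts):
    """Deterministically map domain failure counts to an attack-type label.

    The thresholds below are assumptions chosen for illustration.
    """
    hits = []
    if counts["ssh"] >= 2 and counts["authentication"] >= 1:
        hits.append("brute_force")
    if counts["file_permissions"] >= 2:
        hits.append("privilege_escalation")
    if counts["auditing"] >= 2:
        hits.append("persistence")
    if counts["network"] >= 2:
        hits.append("remote")
    if len(hits) > 1:
        return "multi_vector"
    return hits[0] if hits else "low_risk"


if __name__ == "__main__":
    failed = ["5.2.4", "5.2.10", "5.3.1"]
    counts = domain_failure_counts(failed)
    print(dict(counts), label_host(counts))
```

In a pipeline of this shape, the per-domain counts would serve as the feature vector and the rule-derived label as the training target for a multi-class classifier such as a random forest.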

The framework is evaluated experimentally using multi-class performance metrics. The results indicate that configuration compliance failures carry enough predictive information to predict accurately the types of attack that can be expected. Feature importance analysis further shows which CIS domains contribute most to the predicted attack types. The research thereby helps bridge the gap between configuration auditing and proactive defence.
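The abstract does not say which multi-class metrics were used. As one plausible example, per-class precision, recall, and F1 with a macro average could be computed as sketched below; the function names and the sample labels are hypothetical and are not results from the study.

```python
def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1)
    return report


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare attack types count equally."""
    report = per_class_report(y_true, y_pred)
    return sum(f1 for _, _, f1 in report.values()) / len(report)


if __name__ == "__main__":
    # Made-up labels purely to exercise the functions.
    y_true = ["brute_force", "brute_force", "persistence", "persistence"]
    y_pred = ["brute_force", "persistence", "persistence", "persistence"]
    for label, scores in per_class_report(y_true, y_pred).items():
        print(label, [round(s, 3) for s in scores])
    print("macro F1:", round(macro_f1(y_true, y_pred), 3))
```

Macro averaging is shown here because it weights every attack class equally regardless of how often it occurs; whether the study used macro, micro, or weighted averaging is not stated.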
