By David Omurwa — 16 Mar 2025

Using AI in Vulnerability Analysis - Part 2 (Testing Gemini 2.0 Flash)

In contrast to my previous post "Using AI in Vulnerability Analysis - Part 1 (Testing the Limits of ChatGPT Plus)", this post takes a more systematic approach to understanding the capabilities of AI when it comes to simple vulnerability management tasks. The objective is the same: I will provide the LLM with a list of vulnerabilities and ask it to categorize them as 'OS-Related' or 'Non-OS Related'. This article explores the implementation of AI solutions in vulnerability analysis. (There will be an article specifically focused on their performance.)

Methodology

Last time, I was curious to see what a tool like ChatGPT Plus could do right out of the box, and the result was, to say the least, a disappointment. This time, I devised a system that makes it easier for the AI to ingest the data and make accurate decisions. Additionally, I switched from ChatGPT to Google's Gemini, mainly because Google's AI Studio can be used for free.

Use Case

An analyst retrieves a vulnerability report from a vulnerability scanner in the form of an Excel sheet. They would like to know which vulnerabilities affect the core OS and which affect applications installed on the device (Non-OS). The report contains information such as the affected asset, the OS and, most importantly, the CVE ID.

System Design

The solution is implemented in a Python script that does the following:

Extracts the unique CVEs from the vulnerability report
Enriches the CVEs with descriptions from NIST's NVD
Asks Gemini to decide whether each CVE is OS or Non-OS based on its description
Writes the analysis to a new sheet in the same Excel workbook
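
As a rough illustration of the first step, here is a minimal sketch of how unique CVE IDs could be pulled out of report rows. It is not taken from the actual script: the column name "CVE ID", the function name unique_cves, and the assumption that the rows have already been loaded as a list of dictionaries are all hypothetical.

```python
import re

# Matches identifiers such as CVE-2021-44228 (case-insensitive).
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def unique_cves(rows, column="CVE ID"):
    """Return the unique CVE IDs found in `column`, in first-seen order.

    `rows` is assumed to be a list of dicts, one per report row.
    The column name is a placeholder; scanners name it differently.
    """
    seen = set()
    ordered = []
    for row in rows:
        # A cell may be empty or hold several IDs separated by commas.
        cell = str(row.get(column) or "")
        for match in CVE_PATTERN.findall(cell):
            cve = match.upper()
            if cve not in seen:
                seen.add(cve)
                ordered.append(cve)
    return ordered


if __name__ == "__main__":
    sample_rows = [
        {"Asset": "srv-01", "CVE ID": "CVE-2021-44228"},
        {"Asset": "srv-02", "CVE ID": "cve-2021-44228, CVE-2017-0144"},
        {"Asset": "srv-03", "CVE ID": None},
    ]
    print(unique_cves(sample_rows))
```

Deduplicating before enrichment matters here because every unique CVE costs one NVD lookup and one Gemini request, both of which are rate limited.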

Implementation

The source code can be viewed on my personal repository through this link.

The prompt used was as follows:

Below is a vulnerability description taken from the NIST NVD Database.
I would like you to analyse the vulnerability and return either 'OS' if, based on the description the vulnerability affects an Operating System or, 'Non-OS', if, based on the description the vulnerability does not affect an operating system. 

For example, a vulnerability that affects a web application would be classified as 'Non-OS' while a vulnerability that affects a Linux Kernel would be classified as 'OS'.

In another case, a vulnerability that affects a Windows Service would be classified as 'OS' while a vulnerability that affects a desktop application would be classified as 'Non-OS'.

Description :\n\n{description}
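
To show how a prompt like this might be wired into code, here is a small sketch that fills in the description and normalizes the model's reply into one of the two labels. The helper names build_prompt and parse_label are hypothetical, the template is abbreviated, and the call to Gemini itself is deliberately left out.

```python
# Abbreviated version of the prompt above; the examples in the middle
# are omitted here to keep the sketch short.
PROMPT_TEMPLATE = (
    "Below is a vulnerability description taken from the NIST NVD Database.\n"
    "I would like you to analyse the vulnerability and return either 'OS' "
    "or 'Non-OS'.\n\n"
    "Description :\n\n{description}"
)


def build_prompt(description, template=PROMPT_TEMPLATE):
    """Insert the NVD description into the prompt template."""
    return template.format(description=description)


def parse_label(reply):
    """Map a raw model reply to 'OS', 'Non-OS', or None if unrecognized.

    The exact match is intentional: 'Non-OS' contains 'OS', so a
    substring check would misclassify it.
    """
    cleaned = reply.strip().strip("'\"`*.").strip().lower()
    if cleaned in {"non-os", "non os", "nonos"}:
        return "Non-OS"
    if cleaned == "os":
        return "OS"
    return None


if __name__ == "__main__":
    print(build_prompt("A flaw in an example kernel driver."))
    print(parse_label(" 'Non-OS'\n"))
```

Returning None for anything unexpected makes it easy to flag those rows for manual review instead of silently guessing a category.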

Challenges

Both the NIST NVD and Google AI Studio (Free Tier) enforce API request rate limits, so it was necessary to add sleep calls to the code to keep the script within those limits. This meant the script took a few more minutes to run, which was acceptable in this testing scenario.
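
One way to keep the sleeps in a single place is a small throttle object that each API call goes through. The sketch below is an assumption about how this could be structured, not the code from the script. The RateLimiter class is hypothetical, and the numbers in the example are purely illustrative, so the real limits should be taken from each provider's documentation.

```python
import time


class RateLimiter:
    """Spaces calls evenly so that at most `max_calls` happen per period.

    `clock` and `sleep` are injectable so the class can be tested
    without actually waiting.
    """

    def __init__(self, max_calls, period_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period_seconds / max_calls
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


if __name__ == "__main__":
    # Illustrative limit only: 5 calls per 2 seconds.
    limiter = RateLimiter(max_calls=5, period_seconds=2)
    for i in range(3):
        limiter.wait()  # call this right before each API request
        print("request", i)
```

Using one limiter per service (one for the NVD, one for Gemini) lets each run at its own pace rather than slowing both down to the stricter limit.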

Result

The result was a clean and concise analysis. Gemini correctly classified 23 of the 25 vulnerabilities provided (92%), and its answers were consistent in format. I have to admit that the list of vulnerabilities was diverse but not comprehensive, as it contained no Linux OS vulnerabilities.
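
For anyone repeating the test, a figure like this can be computed by comparing the model's labels against a manually labelled list. The snippet below is a minimal sketch using made-up placeholder IDs, not the data from this test. The function name score is hypothetical.

```python
def score(predicted, expected):
    """Compare predicted labels with manual labels.

    Both arguments map a CVE ID to 'OS' or 'Non-OS'.
    Returns (correct, total, misclassified_ids).
    """
    misclassified = [
        cve for cve, label in expected.items()
        if predicted.get(cve) != label
    ]
    total = len(expected)
    return total - len(misclassified), total, misclassified


if __name__ == "__main__":
    # Placeholder IDs and labels, for illustration only.
    expected = {"CVE-A": "OS", "CVE-B": "OS", "CVE-C": "Non-OS"}
    predicted = {"CVE-A": "OS", "CVE-B": "Non-OS", "CVE-C": "Non-OS"}
    correct, total, wrong = score(predicted, expected)
    print(f"{correct}/{total} correct ({correct / total:.0%}); review: {wrong}")
```

Keeping the list of misclassified IDs is useful because the errors are usually more informative than the headline percentage.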

An additional Python function could easily be added to the script to enrich the existing data, as we tried to do in the previous article.

Conclusion

In this article, we explored a systematic approach that involved preprocessing vulnerability data before handing the final sorting over to the AI. This proved to be both beneficial and practical. The remaining question is how accurately different models can perform this task. We will answer that in upcoming blog posts, so stay tuned!