LLMEx Comparison

Generated: 2025-06-06 11:16:50

Best Performing Model

TheBloke/Mistral-7B-v0.1-GGUF

Pass Rate: 96.89%

Most Vulnerable Model

Qwen/Qwen2-7B-Instruct-GGUF

Pass Rate: 93.79%

Visualization Dashboard

Model Comparison Dashboard

[Figure: Model Comparison Dashboard]

Security Performance Radar Comparison

[Figure: Security Performance Radar Chart. Higher values indicate better performance in each metric.]

Vulnerability Category Heatmap

[Figure: Category Heatmap]

Vulnerability Severity Heatmap

[Figure: Severity Heatmap]

Pass Rate Comparison

Model                                         Pass Rate (%)
TheBloke/Mistral-7B-v0.1-GGUF                 96.89
TheBloke/deepseek-coder-6.7B-instruct-GGUF    94.41
Qwen/Qwen2-7B-Instruct-GGUF                   93.79

Vulnerability Comparison

Model                                         Vulnerable Tests    False Positives    Average Vulnerability Score
TheBloke/Mistral-7B-v0.1-GGUF                 5                   0                  4.20
TheBloke/deepseek-coder-6.7B-instruct-GGUF    9                   0                  4.69
Qwen/Qwen2-7B-Instruct-GGUF                   10                  0                  5.14
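
The reported pass rates and vulnerable-test counts are consistent with every model having been run against the same 161 tests. That total is not stated in the report; it is inferred here from the published figures, so treat it as an assumption. A minimal Python sketch of the arithmetic under that assumption (the names `ASSUMED_TOTAL_TESTS` and `pass_rate` are illustrative, not part of LLMEx):

```python
# Assumption: each model was run against the same 161 tests. The report does
# not state this total; it is inferred from the published pass rates.
ASSUMED_TOTAL_TESTS = 161

# Vulnerable test counts taken from the Vulnerability Comparison table.
vulnerable_tests = {
    "TheBloke/Mistral-7B-v0.1-GGUF": 5,
    "TheBloke/deepseek-coder-6.7B-instruct-GGUF": 9,
    "Qwen/Qwen2-7B-Instruct-GGUF": 10,
}


def pass_rate(vulnerable: int, total: int = ASSUMED_TOTAL_TESTS) -> float:
    """Return the percentage of tests that were not flagged as vulnerable."""
    return round(100 * (total - vulnerable) / total, 2)


pass_rates = {model: pass_rate(count) for model, count in vulnerable_tests.items()}

for model, rate in pass_rates.items():
    print(f"{model}: {rate:.2f}%")
```

If the assumed total is right, the three computed values match the Pass Rate Comparison table to two decimal places.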

Vulnerability by Category

Prompt Injection

Model                                         Vulnerable Count
TheBloke/Mistral-7B-v0.1-GGUF                 1
Qwen/Qwen2-7B-Instruct-GGUF                   3

Excessive Agency

Model                                         Vulnerable Count
Qwen/Qwen2-7B-Instruct-GGUF                   1
TheBloke/deepseek-coder-6.7B-instruct-GGUF    1

Malicious or Illicit Content Generation

Model                                         Vulnerable Count
Qwen/Qwen2-7B-Instruct-GGUF                   2
TheBloke/deepseek-coder-6.7B-instruct-GGUF    2

Tactics Inspired by the L1B3RT4S Project

Model                                         Vulnerable Count
TheBloke/Mistral-7B-v0.1-GGUF                 3
Qwen/Qwen2-7B-Instruct-GGUF                   4
TheBloke/deepseek-coder-6.7B-instruct-GGUF    5

Overreliance

Model                                         Vulnerable Count
TheBloke/Mistral-7B-v0.1-GGUF                 1

Data Leakage

Model                                         Vulnerable Count
TheBloke/deepseek-coder-6.7B-instruct-GGUF    1
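
As a cross-check, the per-category counts above can be summed per model and compared with the Vulnerable Tests column of the Vulnerability Comparison table. The sketch below is one way to do that in Python. The short category keys and the variable names are illustrative choices, not LLMEx identifiers, and a model missing from a category table is assumed to have zero vulnerable tests in that category.

```python
from collections import Counter

MISTRAL = "TheBloke/Mistral-7B-v0.1-GGUF"
DEEPSEEK = "TheBloke/deepseek-coder-6.7B-instruct-GGUF"
QWEN = "Qwen/Qwen2-7B-Instruct-GGUF"

# Vulnerable counts copied from the per-category tables. A model that is
# absent from a category is assumed to have no vulnerable tests there.
category_counts = {
    "Prompt Injection": {MISTRAL: 1, QWEN: 3},
    "Excessive Agency": {QWEN: 1, DEEPSEEK: 1},
    "Malicious or Illicit Content Generation": {QWEN: 2, DEEPSEEK: 2},
    "L1B3RT4S-inspired tactics": {MISTRAL: 3, QWEN: 4, DEEPSEEK: 5},
    "Overreliance": {MISTRAL: 1},
    "Data Leakage": {DEEPSEEK: 1},
}

# Sum the counts per model across all categories.
totals = Counter()
for per_model in category_counts.values():
    totals.update(per_model)

# Totals reported in the Vulnerability Comparison table.
reported = {MISTRAL: 5, DEEPSEEK: 9, QWEN: 10}

for model, expected in reported.items():
    status = "matches" if totals[model] == expected else "differs from"
    print(f"{model}: {totals[model]} {status} reported total {expected}")
```

For the figures in this report, the category sums agree with the reported totals for all three models.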

Summary and Recommendations

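Based on the security testing results, TheBloke/Mistral-7B-v0.1-GGUF demonstrates the best overall security posture, with the highest pass rate (96.89%), the fewest vulnerable tests (5), and the lowest average vulnerability score (4.2).

The most significant security concerns were found in Qwen/Qwen2-7B-Instruct-GGUF, which had the lowest pass rate (93.79%), the most vulnerable tests (10), and the highest average vulnerability score (5.14). TheBloke/deepseek-coder-6.7B-instruct-GGUF falls between the two, with a pass rate of 94.41% and 9 vulnerable tests.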

Key Recommendations:

Developed by Soufiane Tahiri