QuishGuard is a high-performance machine learning pipeline designed to detect malicious URLs embedded in QR codes. By leveraging an XGBoost classifier trained on over 2.2 million samples, it identifies phishing attempts with a precision-first approach to minimize False Positives in security environments.
Stealth Extraction: Parses complex .eml and multi-part MIME email files to extract hidden, "highly evasive" QR codes.
Fresh Dataset: Uses a curated live stream of malicious URLs from top-tier threat intelligence sources (PhishTank, OpenPhish, PhiUSIIL, Mendeley, and URLhaus).
Massive Scale: Trained on a robust corpus of 2.1M+ Benign URLs and 135k+ Fresh Malicious URLs, ensuring the model recognizes current-day attack patterns.
Lexicographical Profiling: Transforms raw URLs into a multi-dimensional feature vector. This stage handles missing or inconsistent data, tokenizes each URL, and extracts relevant features such as domain names, subdomains, and URL lengths.
Precision Focused: Optimized to maintain a near-zero False Positive Rate (FPR), critical for reducing "security fatigue" in SOC environments.
Fast Inference and Integration: Optimized for low-latency, real-time URL classification and seamless integration into existing security pipelines.
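As an illustration of the lexical profiling described above, here is a minimal sketch of how a raw URL might be turned into numerical features. It is not the project's actual feature set: the function name `extract_features` and every feature name below are assumptions chosen for the example, and the subdomain count is a naive label count that does not consult a public-suffix list.

```python
import ipaddress
from urllib.parse import urlparse


def _is_ip(host: str) -> bool:
    """Return True if the host is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_features(url: str) -> dict:
    """Hypothetical lexical feature extractor (illustrative only)."""
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = (parsed.hostname or "").lower()
    labels = [label for label in host.split(".") if label]
    is_ip = _is_ip(host)
    length = max(len(url), 1)  # avoid division by zero on empty input
    return {
        "url_length": len(url),
        "host_length": len(host),
        # Naive: everything left of the last two labels counts as a subdomain.
        "num_subdomains": 0 if is_ip else max(len(labels) - 2, 0),
        "tld": "" if is_ip or not labels else labels[-1],
        "num_digits": sum(c.isdigit() for c in url),
        "special_char_ratio": sum(not c.isalnum() for c in url) / length,
        "has_ip_host": is_ip,
        "uses_https": parsed.scheme == "https",
    }


features = extract_features("https://login.secure.example.com/verify?id=123")
print(features)
```

A categorical value such as `tld` would still need to be encoded numerically before being passed to a gradient-boosted model.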
- Ingestion: Receives .eml files via the Flask API.
- Extraction: Scans body and attachments for QR code objects.
- Transformation: Converts the extracted URL into 15+ numerical features (length, special character ratios, TLD, etc.).
- Classification: The XGBoost engine generates a safety verdict.
- Response: Returns a JSON report with a safety verdict and other details about the email.
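The first part of the extraction step can be sketched with Python's standard `email` package. The example below walks a multi-part message and collects image attachments and HTML bodies, which are the places a QR code is likely to sit. The helper name `collect_qr_candidates` and the returned dictionary layout are assumptions for illustration; actual QR decoding (and whatever QuishGuard does internally) is deliberately left out.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def collect_qr_candidates(raw_eml: bytes) -> dict:
    """Hypothetical helper: gather parts of an .eml that may carry a QR code."""
    msg = message_from_bytes(raw_eml, policy=policy.default)
    images, html_parts = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no payload of their own
        if part.get_content_maintype() == "image":
            # decode=True undoes base64 / quoted-printable transfer encoding
            images.append((part.get_filename() or "inline",
                           part.get_payload(decode=True)))
        elif part.get_content_type() == "text/html":
            html_parts.append(part.get_content())
    return {
        "subject": str(msg["subject"] or ""),
        "sender": str(msg["from"] or ""),
        "images": images,
        "html": html_parts,
    }


# Build a small multi-part message in memory to exercise the helper.
demo_msg = EmailMessage()
demo_msg["Subject"] = "Invoice"
demo_msg["From"] = "a@example.com"
demo_msg.set_content("plain body")
demo_msg.add_alternative("<p>hi</p>", subtype="html")
demo_msg.add_attachment(b"\x89PNG fake", maintype="image",
                        subtype="png", filename="qr.png")

demo = collect_qr_candidates(demo_msg.as_bytes())
print(demo["subject"], [name for name, _ in demo["images"]])
```

The collected image bytes and HTML fragments would then be handed to a QR decoder, which is outside the scope of this sketch.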
QuishGuard provides a lightweight Flask API for seamless integration with existing SOAR or SIEM platforms.
POST /submit
file (binary): The .eml file to be analyzed.
import requests

# Load your suspicious email file
with open("suspicious_email.eml", "rb") as f:
    files = {"file": f}
    # Send the request while the file handle is still open
    response = requests.post("http://127.0.0.1:5001/submit", files=files)

print(response.json())
Sample Response
{
  "Email status": "Rejected",
  "fragments": [],
  "https://split-flexbox.com": "malicious",
  "metadata": {
    "domain": "test.com",
    "sender": "test@test.com",
    "sender_ip": "Unknown",
    "subject": "Split QR - Flexbox"
  }
}
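In the sample response, the per-URL verdict appears as a top-level key next to the fixed fields. A consumer such as a SOAR playbook may want to separate the two. The sketch below assumes, based only on the sample above, that any top-level key other than `Email status`, `fragments`, and `metadata` is a URL mapped to its verdict; the helper name `split_report` is hypothetical.

```python
import json

# Assumption: these are the only non-URL top-level keys, as seen in the sample.
RESERVED_KEYS = {"Email status", "fragments", "metadata"}


def split_report(report: dict) -> tuple:
    """Return (email_status, {url: verdict}) from a QuishGuard-style report."""
    verdicts = {k: v for k, v in report.items() if k not in RESERVED_KEYS}
    return report.get("Email status"), verdicts


sample = json.loads("""
{
  "Email status": "Rejected",
  "fragments": [],
  "https://split-flexbox.com": "malicious",
  "metadata": {
    "domain": "test.com",
    "sender": "test@test.com",
    "sender_ip": "Unknown",
    "subject": "Split QR - Flexbox"
  }
}
""")

status, verdicts = split_report(sample)
malicious_urls = [url for url, verdict in verdicts.items() if verdict == "malicious"]
print(status, malicious_urls)
```

If the API ever adds further fixed fields, `RESERVED_KEYS` would need to be extended accordingly.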
- Python 3.10+
- Google Chrome / Chromium: Required by the html2image dependency to properly render and process HTML fragments. Ensure Chrome is installed on your system before proceeding.
git clone https://github.com/Tenzzzzzz/QuishGuard.git
cd QuishGuard
python -m venv .venv
# Windows
.venv\Scripts\activate.bat
# Linux/Mac
source .venv/bin/activate
cd Requirements
pip install -r requirements.txt
cd ..
python app.py
Or, if you want to reproduce the model from the beginning:
git clone https://github.com/Tenzzzzzz/QuishGuard.git
cd QuishGuard
python -m venv .venv
# Windows
.venv\Scripts\activate.bat
# Linux/Mac
source .venv/bin/activate
cd Requirements
pip install -r requirements.txt
cd ..
Then execute the code in the Jupyter notebook
python feature_extraction.py
python the_model.py
python app.py
- Evasion Detection: Implement detection for more advanced "broken" or "obfuscated" QR techniques.
- Zero-Trust Layer: Add an additional heuristic layer to further reduce False Positives for known corporate domains.
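One possible shape for the heuristic layer mentioned in the Zero-Trust Layer item is a domain allowlist checked after the model's verdict. This is only a sketch of the idea, not a planned design: `TRUSTED_DOMAINS`, `is_trusted`, `apply_allowlist`, and the `needs_review` label are all hypothetical. Matching is done on whole domain labels so that a look-alike host such as `example.com.evil.io` is not treated as trusted.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from configuration.
TRUSTED_DOMAINS = {"example.com"}


def is_trusted(url: str, trusted=TRUSTED_DOMAINS) -> bool:
    """True if the URL's host is a trusted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in trusted)


def apply_allowlist(url: str, model_verdict: str) -> str:
    """Downgrade a 'malicious' verdict on a trusted domain to manual review."""
    if model_verdict == "malicious" and is_trusted(url):
        return "needs_review"  # hypothetical label, not part of the current API
    return model_verdict


print(apply_allowlist("https://mail.example.com/login", "malicious"))
print(apply_allowlist("https://example.com.evil.io/login", "malicious"))
```

Routing such cases to review rather than silently marking them safe keeps a compromised corporate page from bypassing the classifier entirely.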
Contributions are welcome! Please open an issue or submit a pull request for any feature additions or bug fixes.