Phishing Website Detection Using Machine Learning

Project Overview

The objective of this project was to design, train, and evaluate a machine learning model that classifies websites as phishing or legitimate based on extracted features.

Phishing attacks are among the most widespread and damaging threats to online security. Automating their detection can help prevent fraud, protect user data, and improve trust in online services.


Dataset

We used the Phishing Websites Dataset from the UCI Machine Learning Repository, containing 11,055 records and 31 features describing website characteristics, including:

The target variable is binary: each website is labeled as either phishing or legitimate.


Preprocessing

The preprocessing pipeline included:


Models and Evaluation

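We trained and evaluated five supervised classification algorithms: Decision Tree, Random Forest, Logistic Regression, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN).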
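A minimal sketch of how such a comparison could be set up with scikit-learn. The synthetic data, the 80/20 stratified split, and every hyperparameter here are assumptions for illustration only; the post does not state the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption): numeric features, binary target.
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=10, random_state=42
)

# Hold out 20% of the rows for testing, keeping the class balance (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# One estimator per algorithm family; hyperparameters are illustrative defaults.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the training split and record accuracy on the held-out split.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

# Print the models from most to least accurate.
for name, acc in sorted(test_accuracy.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {acc:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair. The numbers this sketch prints come from synthetic data and are not comparable to the reported results.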

Evaluation Metrics: accuracy, precision, recall, and F1-score.
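To make the metrics concrete, here is a small sketch that computes them from confusion-matrix counts in plain Python. The helper name `classification_metrics`, the label encoding (1 for the positive class), and the toy labels are all assumptions. The post does not say whether its precision and recall are per-class or averaged, so this shows the single-positive-class definitions only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy example there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.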


Results

The Random Forest Classifier delivered the highest performance across all metrics:

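| Model               | Accuracy | Precision | Recall | F1-Score |
|---------------------|----------|-----------|--------|----------|
| Decision Tree       | 0.951    | 0.95      | 0.95   | 0.95     |
| Random Forest       | 0.977    | 0.98      | 0.98   | 0.98     |
| Logistic Regression | 0.926    | 0.93      | 0.93   | 0.93     |
| SVM                 | 0.961    | 0.96      | 0.96   | 0.96     |
| KNN                 | 0.934    | 0.93      | 0.93   | 0.93     |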

The Random Forest model generalized best to unseen data (0.977 accuracy, 0.98 F1-score) and is the recommended approach for deployment.
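As a quick sanity check on that recommendation, the reported scores can be ranked programmatically. The sketch below copies the values from the results table. The dictionary layout and variable names are illustrative assumptions.

```python
# Scores copied from the results table above.
results = {
    "Decision Tree": {"accuracy": 0.951, "precision": 0.95, "recall": 0.95, "f1": 0.95},
    "Random Forest": {"accuracy": 0.977, "precision": 0.98, "recall": 0.98, "f1": 0.98},
    "Logistic Regression": {"accuracy": 0.926, "precision": 0.93, "recall": 0.93, "f1": 0.93},
    "SVM": {"accuracy": 0.961, "precision": 0.96, "recall": 0.96, "f1": 0.96},
    "KNN": {"accuracy": 0.934, "precision": 0.93, "recall": 0.93, "f1": 0.93},
}

metric_names = ["accuracy", "precision", "recall", "f1"]

# For each metric, find the model with the highest score.
best_per_metric = {
    metric: max(results, key=lambda name: results[name][metric])
    for metric in metric_names
}

# Rank all models by accuracy, best first.
ranking = sorted(results, key=lambda name: results[name]["accuracy"], reverse=True)

print(best_per_metric)
print(ranking)
```

Random Forest comes out on top for every metric, followed by SVM, Decision Tree, KNN, and Logistic Regression when ranked by accuracy.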


Conclusions


Team & Acknowledgments

This was a group project completed as part of the Fundamentos de Aprendizaje de Máquina course at PUCP, with:

GitHub Repository: https://github.com/milkreator/deteccion_phishing_web

