Building a Real-Time Credit Card Fraud Detection System with FastAPI and Machine Learning

王林
Release: 2024-08-13 06:54:33

Introduction

Credit card fraud poses a significant threat to the financial industry, leading to billions of dollars in losses every year. To combat this, machine learning models have been developed to detect and prevent fraudulent transactions in real time. In this article, we'll walk through the process of building a real-time credit card fraud detection system using FastAPI, a modern web framework for Python, and a Random Forest classifier trained on the popular Credit Card Fraud Detection Dataset from Kaggle.

Overview of the Project

The goal of this project is to create a web service that predicts the likelihood of a credit card transaction being fraudulent. The service accepts transaction data, preprocesses it, and returns a prediction along with the probability of fraud. This system is designed to be fast, scalable, and easy to integrate into existing financial systems.

Key Components

  1. Machine Learning Model: A Random Forest classifier trained to distinguish between fraudulent and legitimate transactions.
  2. Data Preprocessing: Standardization of transaction features, applied identically at training time and at prediction time.
  3. API: A RESTful API built with FastAPI to handle prediction requests in real time.

Step 1: Preparing the Dataset

The dataset used in this project is the Credit Card Fraud Detection Dataset from Kaggle, which contains 284,807 transactions, of which only 492 (about 0.17%) are fraudulent. This severe class imbalance presents a challenge, which is addressed here by oversampling the minority class in the training data.
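To see why the imbalance matters, the short calculation below works out the fraud rate from the two counts quoted above, along with the accuracy a trivial model would reach by labelling every transaction as legitimate. It is only arithmetic on those figures, not an analysis of the dataset itself.

```python
# Figures quoted in the article for the Kaggle dataset
total_transactions = 284_807
fraud_transactions = 492

fraud_rate = fraud_transactions / total_transactions

# Accuracy of a trivial model that labels every transaction as legitimate
baseline_accuracy = 1 - fraud_rate

print(f"Fraud rate: {fraud_rate:.4%}")
print(f"Always-legitimate baseline accuracy: {baseline_accuracy:.4%}")
```

Because that baseline is already above 99.8%, accuracy on its own says very little about how well fraud is detected.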

Data Preprocessing

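The dataset is first split into training and testing sets. The features are then standardized with scikit-learn's StandardScaler, which is fitted on the training split only so that no test-set statistics leak into training. Given the imbalance, RandomOverSampler from imbalanced-learn is applied to the training split to balance the classes before training the model.

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import RandomOverSampler

# X holds the feature columns and y the fraud label (1 = fraud).
# Split first so the test set is untouched by scaling and resampling.
# test_size=0.2 is an assumed 80/20 split; adjust as needed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Standardize features: fit on the training split, reuse on the test split
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Balance the training set only
ros = RandomOverSampler(random_state=42)
X_resampled, y_resampled = ros.fit_resample(X_train_scaled, y_train)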
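To make the two preprocessing steps concrete, here is a hand-rolled sketch that uses only the Python standard library. It is an illustration of the ideas, not the implementation inside scikit-learn or imbalanced-learn; the helper names (`fit_standardizer`, `standardize`, `random_oversample`) and the toy amounts are made up for this example.

```python
import random
from statistics import mean, pstdev


def fit_standardizer(column):
    """Return (mean, std) learned from the training values only."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # guard against constant columns
    return mu, sigma


def standardize(column, mu, sigma):
    """Apply z = (x - mean) / std using previously fitted statistics."""
    return [(x - mu) / sigma for x in column]


def random_oversample(rows, labels, seed=42):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy training split: six legitimate amounts and two fraudulent ones
train_amounts = [10.0, 12.0, 9.0, 11.0, 13.0, 10.0, 250.0, 300.0]
train_labels = [0, 0, 0, 0, 0, 0, 1, 1]

mu, sigma = fit_standardizer(train_amounts)
train_scaled = standardize(train_amounts, mu, sigma)

# Test values reuse the training statistics; nothing is refitted on them
test_scaled = standardize([11.0, 275.0], mu, sigma)

balanced_rows, balanced_labels = random_oversample(train_scaled, train_labels)
print(balanced_labels.count(0), balanced_labels.count(1))  # prints "6 6"
```

The oversampled minority rows are exact copies of existing training rows, which is why resampling should happen after the split: otherwise copies of the same fraudulent transaction could end up in both the training and the test set.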

Step 2: Training the Machine Learning Model

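We train a Random Forest classifier, an ensemble method that tends to give robust predictions on tabular data. The model is trained on the oversampled training data, and its performance is evaluated on the held-out test set using precision, recall, and the AUC-ROC score. Accuracy is reported as well, but it should be read with care: with only about 0.17% of transactions fraudulent, a model that never predicts fraud would still be more than 99.8% accurate.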

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score

# Train the model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_resampled, y_resampled)

# Evaluate the model
y_pred = model.predict(X_test_scaled)
print(classification_report(y_test, y_pred))
print("AUC-ROC:", roc_auc_score(y_test, model.predict_proba(X_test_scaled)[:, 1]))
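For readers less familiar with the numbers that classification_report prints, the sketch below computes precision and recall for the fraud class by hand. The confusion counts are hypothetical values chosen for illustration; they are not results from the article's model.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall for the positive (fraud) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical confusion counts (NOT measured results):
# tp = fraud caught, fp = false alarms, fn = fraud missed, tn = legitimate passed
tp, fp, fn, tn = 80, 10, 20, 56_852

precision, recall = precision_recall(tp, fp, fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.4f}")
```

With these made-up counts, one in five fraudulent transactions is missed, yet accuracy still rounds to 99.9%. This is why precision and recall are the more informative figures here.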

Step 3: Building the FastAPI Application

With the trained model and scaler saved using joblib, we move on to building the FastAPI application. FastAPI is chosen for its speed and ease of use, making it ideal for real-time applications.
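The save step itself is not shown in the article. A minimal sketch of it might look like the following, assuming the file names that the API code loads (random_forest_model.pkl and scaler.pkl). The synthetic data and the temporary directory are stand-ins so that the example runs on its own; in the project you would dump the model and scaler fitted in the previous steps.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data so the sketch is self-contained;
# in the article this would be the resampled training data.
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=42)

scaler = StandardScaler().fit(X_demo)
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(scaler.transform(X_demo), y_demo)

# Same file names as the API code expects; a temporary directory is used
# here only so the example does not write into the working directory.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "random_forest_model.pkl")
scaler_path = os.path.join(out_dir, "scaler.pkl")
joblib.dump(model, model_path)
joblib.dump(scaler, scaler_path)

# Reload both objects and confirm the round trip gives identical predictions
reloaded_model = joblib.load(model_path)
reloaded_scaler = joblib.load(scaler_path)
original_pred = model.predict(scaler.transform(X_demo))
reloaded_pred = reloaded_model.predict(reloaded_scaler.transform(X_demo))
same = bool((original_pred == reloaded_pred).all())
print("Round trip OK:", same)
```

Saving the scaler alongside the model matters because the API has to apply exactly the same transformation to incoming requests that was applied to the training data.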

Creating the API

The FastAPI application defines a POST endpoint /predict/ that accepts transaction data, processes it, and returns the model's prediction and probability.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import pandas as pd

# Load the trained model and scaler
model = joblib.load("random_forest_model.pkl")
scaler = joblib.load("scaler.pkl")

app = FastAPI()

class Transaction(BaseModel):
    V1: float
    V2: float
    # Include all other features used in your model,
    # in the same order as the training columns
    Amount: float

@app.post("/predict/")
def predict(transaction: Transaction):
    try:
        # On Pydantic v2, transaction.model_dump() replaces the deprecated .dict()
        data = pd.DataFrame([transaction.dict()])
        scaled_data = scaler.transform(data)
        prediction = model.predict(scaled_data)
        prediction_proba = model.predict_proba(scaled_data)
        return {
            "fraud_prediction": int(prediction[0]),
            "probability": float(prediction_proba[0][1]),
        }
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))
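Writing out every feature field by hand is tedious. One possible shortcut is to generate the request model with Pydantic's create_model. The sketch below assumes the model was trained on V1 through V28 plus Amount, in that order; that feature list is an assumption, so adjust FEATURE_NAMES (for example by adding Time) to match your own training columns. The `to_row` helper is a hypothetical name introduced here.

```python
from pydantic import create_model

# Assumed feature list: V1..V28 plus Amount, in training-column order.
# Change this to match the columns your model was actually trained on.
FEATURE_NAMES = [f"V{i}" for i in range(1, 29)] + ["Amount"]

# Each entry is (type, default); the Ellipsis default marks the field as required
Transaction = create_model(
    "Transaction",
    **{name: (float, ...) for name in FEATURE_NAMES},
)


def to_row(transaction):
    """Return the payload as a plain dict on either Pydantic v1 or v2."""
    if hasattr(transaction, "model_dump"):  # Pydantic v2
        return transaction.model_dump()
    return transaction.dict()  # Pydantic v1


example = Transaction(**{name: 0.0 for name in FEATURE_NAMES})
row = to_row(example)
print(len(row), list(row)[:2], list(row)[-1])
```

Because the fields are created in list order, the resulting dict keeps the column order that the scaler and model expect, and a request that omits any feature is rejected by validation before it reaches the model.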

Step 4: Deploying the Application

To test the application locally, you can run the FastAPI server using uvicorn and send POST requests to the /predict/ endpoint. The service will process incoming requests, scale the data, and return whether the transaction is fraudulent.

Running the API Locally

uvicorn main:app --reload

You can then test the API using curl or a tool like Postman:

curl -X POST http://127.0.0.1:8000/predict/ \
  -H "Content-Type: application/json" \
  -d '{"V1": -1.359807134, "V2": -0.072781173, ..., "Amount": 149.62}'

The API will return a JSON object with the fraud prediction and the associated probability.
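The same request can be sent from Python. The sketch below uses only the standard library and assumes the server is running at the local address shown above; `build_request` and `predict` are hypothetical helper names, and the sample payload is deliberately truncated, so a real call needs every feature the model was trained on.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/predict/"  # local uvicorn address used above


def build_request(payload, url=API_URL):
    """Build a JSON POST request for the /predict/ endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload, url=API_URL, timeout=5.0):
    """Send the transaction and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Truncated payload for illustration only; the server will reject it
    # unless the Transaction model has exactly these three fields.
    sample = {"V1": -1.359807134, "V2": -0.072781173, "Amount": 149.62}
    try:
        print(predict(sample))
    except OSError as exc:  # covers connection errors and HTTP error responses
        print("Request failed:", exc)
```

On success the decoded response is a dict with the two keys returned by the endpoint, fraud_prediction and probability.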

Conclusion

In this article, we've built a real-time credit card fraud detection system that combines machine learning with a modern web framework. The system is designed to accept transaction data as it arrives and return a prediction immediately, making it a useful starting point for financial institutions looking to combat fraud.

Serving the model with FastAPI keeps the request-handling overhead low and lets the service handle multiple requests concurrently. This project can be further extended with more sophisticated models, improved feature engineering, or integration with a production environment.

Next Steps

To enhance the system further, consider the following:

  1. Model Improvements: Experiment with more advanced models like XGBoost or neural networks.
  2. Feature Engineering: Explore additional features that might improve model accuracy.
  3. Real-World Deployment: Deploy the application on cloud platforms like AWS or GCP for production use.

source:dev.to