Quick Start

Follow these steps to set up the Customer Churn Analyzer and run your first prediction in under five minutes.

1. Prerequisites

Make sure Python 3.8 or later is installed. You also need the churn_sample.csv dataset, either in the project root or in another directory whose path you set in step 3.
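Both prerequisites can be verified with a short check before installing anything. This is a minimal sketch and not part of the project: the function name check_prerequisites is hypothetical, and it assumes the dataset is looked up relative to the current working directory.

```python
import sys
from pathlib import Path


def check_prerequisites(data_path="churn_sample.csv", min_version=(3, 8)):
    """Return a list of problems found; an empty list means everything is ready."""
    problems = []
    # Compare the running interpreter against the minimum supported version.
    if sys.version_info < min_version:
        required = ".".join(str(part) for part in min_version)
        problems.append(f"Python {required}+ is required, found {sys.version.split()[0]}")
    # Confirm the dataset exists at the expected location.
    if not Path(data_path).is_file():
        problems.append(f"Dataset not found: {data_path}")
    return problems


problems = check_prerequisites()
print("Ready to go." if not problems else "\n".join(problems))
```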

2. Installation

Clone the repository and install the required data science libraries:

# Install dependencies
pip install pandas scikit-learn matplotlib seaborn
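
To confirm the install succeeded, the following sketch checks that each library can be found by the interpreter. The one non-obvious detail is that the scikit-learn package is imported under the name sklearn. The REQUIRED mapping and the printed messages are illustrative and do not come from the project.

```python
import importlib.util

# Maps the pip package name to the module name used in import statements.
REQUIRED = {
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}

# find_spec returns None when a top-level module cannot be located.
missing = [
    pip_name
    for pip_name, module_name in REQUIRED.items()
    if importlib.util.find_spec(module_name) is None
]

if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All dependencies are installed.")
```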

3. Prepare the Data

If you run the project locally, make sure the dataset path in the script points to where the file lives on your machine. If it does not, update the following line in customer_churn_analyzer.py:

# Change this to your local path
df = pd.read_csv('churn_sample.csv')
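
If you would rather not edit the script on every machine, one option is to read the path from an environment variable and fall back to the default file name. This is a sketch of that idea only: the variable name CHURN_DATA_PATH and the helper resolve_dataset_path are hypothetical and do not exist in the project.

```python
import os
from pathlib import Path


def resolve_dataset_path(default="churn_sample.csv", env_var="CHURN_DATA_PATH"):
    """Return the dataset path, preferring the environment variable when it is set."""
    # An unset or empty variable falls back to the default file name.
    return Path(os.environ.get(env_var) or default).expanduser()


data_path = resolve_dataset_path()
print(f"Dataset path: {data_path}")
# The read in customer_churn_analyzer.py would then become:
# df = pd.read_csv(data_path)
```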

4. Run the Analysis

Run the script to train the model and generate Exploratory Data Analysis (EDA) visualizations:

python customer_churn_analyzer.py

5. Predict Churn for a New Customer

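To predict whether a specific customer will leave, use the snippet below. The model expects categorical features to be encoded with the same LabelEncoder instances that were created during training. The snippet therefore assumes the trained model and the le_dict dictionary of encoders are still in scope, for example in the same Python session or at the end of customer_churn_analyzer.py.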

import pandas as pd

# Define new customer data. Columns must use the same names and order
# as the features the model was fit on.
new_customer = pd.DataFrame({
    'Contract': [le_dict['Contract'].transform(['Monthly'])[0]],
    'SupportCalls': [6],
    'MonthlyBill': [110.0],
    'PaymentMethod': [le_dict['PaymentMethod'].transform(['CreditCard'])[0]],
    'BillingIssues': [1],
    'DataUsageGB': [85.0],
    'TenureMonths': [2],
    'AutoPay': [le_dict['AutoPay'].transform(['No'])[0]]
})

# Generate prediction
pred = model.predict(new_customer)
status = le_dict['Churn'].inverse_transform(pred)[0]

print(f"Predicted Churn Status: {status}")
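
To show where le_dict and model come from, here is a self-contained sketch of the whole encode, train, and predict cycle on a tiny invented dataset. It makes several assumptions: the rows are made up, only a subset of the columns is used, and DecisionTreeClassifier stands in for whatever estimator customer_churn_analyzer.py actually trains.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Tiny synthetic stand-in for churn_sample.csv (all values are invented).
train = pd.DataFrame({
    "Contract": ["Monthly", "Yearly", "Monthly", "Yearly", "Monthly", "Yearly"],
    "SupportCalls": [6, 1, 5, 0, 7, 2],
    "AutoPay": ["No", "Yes", "No", "Yes", "No", "Yes"],
    "Churn": ["Yes", "No", "Yes", "No", "Yes", "No"],
})

# Fit one encoder per categorical column and keep each one for later reuse.
le_dict = {}
encoded = train.copy()
for col in ["Contract", "AutoPay", "Churn"]:
    le_dict[col] = LabelEncoder()
    encoded[col] = le_dict[col].fit_transform(train[col])

feature_cols = ["Contract", "SupportCalls", "AutoPay"]
# Assumption: a decision tree stands in for the project's real estimator.
model = DecisionTreeClassifier(random_state=0)
model.fit(encoded[feature_cols], encoded["Churn"])

# Encode the new row with the SAME fitted encoders, in the SAME column order.
new_customer = pd.DataFrame({
    "Contract": [le_dict["Contract"].transform(["Monthly"])[0]],
    "SupportCalls": [6],
    "AutoPay": [le_dict["AutoPay"].transform(["No"])[0]],
})[feature_cols]

# Map the numeric prediction back to its original label.
pred = model.predict(new_customer)
status = le_dict["Churn"].inverse_transform(pred)[0]
print(f"Predicted Churn Status: {status}")
```

Note that LabelEncoder.transform raises a ValueError for a label it did not see during fitting, so a category value missing from the training data has to be handled before encoding.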

6. Expected Output

Running the analysis script and the prediction snippet produces:

Visualizations: Correlation heatmaps and billing distribution plots.
Model Metrics: A Confusion Matrix and F1 Score (typically ~0.54 for the baseline model).
Prediction: A "Yes" or "No" classification for the input customer data.
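
For readers unfamiliar with the two metrics, the sketch below computes a confusion matrix and an F1 score with scikit-learn on a handful of invented labels. The labels are illustrative only, so the numbers have no relation to the project's baseline score, and treating "Yes" as the positive class is an assumption.

```python
from sklearn.metrics import confusion_matrix, f1_score

# Invented ground-truth and predicted churn labels for eight customers.
y_true = ["Yes", "No", "No", "Yes", "No", "Yes", "No", "No"]
y_pred = ["Yes", "No", "Yes", "No", "No", "Yes", "No", "No"]

# Rows are true classes and columns are predicted classes, ordered as in `labels`.
cm = confusion_matrix(y_true, y_pred, labels=["No", "Yes"])

# F1 is the harmonic mean of precision and recall for the positive class.
f1 = f1_score(y_true, y_pred, pos_label="Yes")

print(cm)
print(f"F1 Score: {f1:.2f}")
```

Here the model finds two of the three churners and raises one false alarm, so precision and recall are both 2/3 and the F1 score is about 0.67.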