Pi-Yu Wu

Machine Learning

Income Prediction Model

Project Goal

Predict income for new customers to shorten the path from loan application to loan approval.

Challenges

Free-form job titles; extreme values in existing income data; legal compliance; maximizing business opportunities while keeping risk under control

Methods & tools
  1. Grouped similar job titles with a BERT model (converted job titles to vectors).

  2. Segmented customers with similar incomes using a KMeans model, so that predictions are less affected by extreme values.

  3. Used LightGBM classification to identify customers likely to meet the minimum income requirement.

  4. Applied rules to keep overall risk under control and comply with legal requirements.

    – Tools used: Python / PostgreSQL / JupyterLab / VS Code / Azure DevOps
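The segmentation idea in step 2 can be sketched in a few lines. This is a minimal illustration only, not the production pipeline: it uses a plain 1-D k-means written with the standard library instead of a library KMeans model, and the income figures, the choice of three segments, and the helper names (kmeans_1d, assign, segment_medians) are all made up for the example.

```python
from statistics import median

def kmeans_1d(values, k, iterations=50):
    """Plain 1-D k-means; returns the cluster centers in ascending order."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [sum(g) / len(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)

def assign(value, centers):
    """Index of the closest center."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))

def segment_medians(values, centers):
    """Median income per segment, used here as the segment-level prediction."""
    groups = {}
    for v in values:
        groups.setdefault(assign(v, centers), []).append(v)
    return {segment: median(vs) for segment, vs in groups.items()}

# Made-up incomes (in thousands); 900 plays the role of an extreme value.
incomes = [28, 30, 31, 33, 35, 58, 60, 62, 65, 900]
centers = kmeans_1d(incomes, k=3)
medians = segment_medians(incomes, centers)
print(centers)
print(medians)
```

In this toy data the extreme value ends up alone in its own segment, so it no longer pulls the estimates for the other two segments upward.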

Impact
  1. The model was deployed in a fully automated loan approval process that is now patent-protected.

    {Patent Link}

  2. Reclaimed an average of 416 person-days per year after project launch.

 

Customer Behavior Prediction Model 1

Project Goal

Predict when customers will go abroad in the near future to offer timely services or promotions.

Challenges

Communicating with a business team that had no technical background; no ground-truth label (true Y) was available

Methods & tools
  1. Used multiple data sources, such as offline transactions in foreign countries, to approximate customers' travel behavior.

  2. Built a classification model with LightGBM. Y = whether the customer travels within the next 90 days.

  3. Designed A/B tests to shape marketing campaigns.

  4. Analyzed and explained important factors with Explainable AI.

    – Tools used: Python / PostgreSQL / JupyterLab / VS Code / Azure DevOps
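Because no true label existed, the target had to be derived from proxy signals. The sketch below shows one way such a label could be built from foreign offline transactions. It is an assumption-laden illustration, not the project's actual logic: the function name travel_label, the customer IDs, the dates, and the exact window boundaries are hypothetical.

```python
from datetime import date, timedelta

def travel_label(foreign_txn_dates, snapshot, horizon_days=90):
    """Return 1 if any foreign offline transaction falls in
    (snapshot, snapshot + horizon_days], else 0."""
    window_end = snapshot + timedelta(days=horizon_days)
    return int(any(snapshot < d <= window_end for d in foreign_txn_dates))

# Hypothetical customers and transaction dates, made up for illustration.
foreign_txns = {
    "cust_a": [date(2023, 2, 10)],                     # inside the window
    "cust_b": [date(2022, 12, 20), date(2023, 6, 1)],  # before and after it
    "cust_c": [],                                      # no foreign activity
}
snapshot = date(2023, 1, 1)
labels = {cid: travel_label(dates, snapshot) for cid, dates in foreign_txns.items()}
print(labels)
```

Features for each customer would then be computed only from data dated on or before the snapshot, so the label window never leaks into the inputs.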

Impact
  1. Decreased marketing costs by 50% while achieving the same marketing objectives.

Car Price Prediction

Project Goal

1. Predict a reasonable price when selling or buying a used car.
2. Understand which car features influence the price most, to keep in mind when buying a new car.

Challenges

Non-US dataset; the dataset contains unnamed features.

Methods & tools
  1. EDA

  2. Models used: Linear Regression / Random Forest Regression / XGBoost Regression

  3. Explainable AI: SHAP, to understand the ML black box.

    – Source of data: https://www.kaggle.com/lepchenkov/usedcarscatalog

    – Tools used: R / RStudio
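SHAP attributes a prediction to individual features using Shapley values. As a rough illustration of what those numbers mean, the sketch below computes exact Shapley values by brute force for a tiny model. It is written in Python rather than R, does not use the SHAP library, and the price formula, feature names, and car values are all hypothetical. Brute-force enumeration is only feasible for a handful of features; SHAP itself relies on faster estimation methods.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.
    Features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

# Hypothetical price model: [age in years, mileage in 10k km, engine litres].
def toy_price(features):
    age, mileage, engine = features
    return 20000 - 800 * age - 300 * mileage + 1500 * engine

car = [5, 12, 2.0]          # the car being explained (made up)
average_car = [8, 15, 1.6]  # baseline "typical" car (made up)
contributions = shapley_values(toy_price, car, average_car)
print(contributions)
```

The contributions add up to the difference between the price predicted for this car and the price predicted for the baseline car, which is the property that makes the breakdown easy to read.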


 
 
 

Underwriting Model: Avoid Bad Loans

Project Goal
  1. Provide suggestions for future loan application approvals to avoid bad loans

Challenges
  1. Limited data points and columns

Methods & tools
  1. EDA

  2. Models used: Logistic Regression / CART / LightGBM

  3. Explainable AI: SHAP

    – Tools used: Python
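As a rough picture of the simplest model in the list above, the sketch below fits a logistic regression with plain gradient descent using only the standard library. It is not the project's code: the two features, the eight applicants, the learning rate, and the epoch count are invented for illustration, and a real analysis would normally use a library implementation with a held-out test set.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def predict_proba(x, weights, bias):
    """Estimated probability that a loan turns bad."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Batch gradient descent on the average log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(x, weights, bias) - y
            grad_b += error
            for j, v in enumerate(x):
                grad_w[j] += error * v
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias

# Made-up applicants: [debt-to-income ratio, number of late payments].
# Label 1 = the loan went bad.
rows = [[0.1, 0], [0.2, 0], [0.25, 1], [0.3, 0],
        [0.5, 2], [0.6, 3], [0.7, 2], [0.8, 4]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(rows, labels)
risky = predict_proba([0.75, 3], weights, bias)
safe = predict_proba([0.15, 0], weights, bias)
print(round(risky, 3), round(safe, 3))
```

With so few data points, a simple, interpretable model like this is a reasonable baseline to compare against CART and LightGBM before trusting the more flexible models.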

 
 
 

Contact me at