📊 Day 13 - Data Download

Banking Fraud Detection - Transaction Analysis

Download CSV files for all 4 tables with 20+ rows of messy banking data

Fraud Detection Window Functions Risk Scoring Data Cleaning

💳 Table 1: transactions

25 rows of messy transaction data with inconsistent formats, NULL values, and suspicious patterns

txn_id customer_id merchant_id amount txn_date status
TXN001 C12345 M001$2,500.002024-01-15 14:23:11COMPLETED
TXN002c67890m00245.515/01/2024pending
TXN003C12345M003$-1502024-01-15 14:25:00Refund
TXN004C99999M00115000.002024/01/16 03:45FLAGGED
TXN005NULLM002$99.992024-01-16failed
TXN006C11111M004$25,000.002024-01-16 10:15:30COMPLETED
TXN007c22222m00550016/01/2024 11:30COMPLETED
TXN008C12345M001$125.502024-01-16 14:26:00completed
TXN009C67890M002$78.002024/01/17 09:00COMPLETED
TXN010c33333M0045000017/01/2024FLAGGED
TXN011C11111M005$1,200.002024-01-17 15:45:22completed
TXN012C22222M003$350.992024-01-18 08:12:00COMPLETED
TXN013 C44444 m00189.518/01/2024 10:30pending
TXN014C99999M004$8,500.002024/01/18 14:20FLAGGED
TXN015c12345M002452024-01-18 16:00:00completed
TXN016C33333M005$3,750.0019/01/2024COMPLETED
TXN017C55555M001$02024-01-19 11:15:30failed
TXN018c11111M003$-2002024/01/19 13:45Refund
TXN019C67890M002125.9919/01/2024 16:20COMPLETED
TXN020NULLM004$45,000.002024-01-20FLAGGED
TXN021C22222m001$67.502024-01-20 09:30:15completed
TXN022C44444M005$2,100.0020/01/2024 12:00COMPLETED
TXN023c99999M004120002024/01/20 15:45FLAGGED
TXN024C12345M003$450.002024-01-21 10:10:00COMPLETED
TXN025C55555M002$99.9921/01/2024pending

👥 Table 2: customers

20 rows of customer data with inconsistent names, emails, and account statuses

customer_id full_name email account_status join_date country
C12345john smithJOHN.SMITH@EMAIL.COMactive2023-05-10 USA
c67890SARAH JONESsarah@invalidACTIVE10/06/2023uk
C11111 Mike Wilson mike.w@bank.comsuspended2023-07-15Canada
C99999anna garciaanna@email.coFLAGGED2024-01-01MEXICO
C22222Bob Brown bob@mail.net active2023/08/20usa
c33333EMMA DAVISemma.d@crypto.comFLAGGED2023-09-10UK
C44444david leeDAVID@SHOP.COMactive15/10/2023 CANADA
C55555Lisa Martinezlisa_mactive2023-11-05spain
c66666JAMES TAYLORjames.t@email.comsuspended2023/12/01USA
C77777 Maria Lopez maria@invalidactive20-12-2023mexico
C88888robert clarkROBERT.C@MAIL.NETACTIVE2024-01-10UK
c10000Nancy Whitenancy.white@bank.comactive15/01/2024canada
C20000KEVIN HARRISkevin@crypto.ioFLAGGED2024/01/18 USA
C30000linda martinlinda@email.comactive2024-01-20Spain
c40000Paul Thompsonpaulsuspended22/01/2024UK
C50000 KAREN WALKER karen.w@shop.netactive2024-01-25USA
C60000mark hallMARK.H@EMAIL.COMACTIVE26/01/2024canada
c70000Sandra Youngsandra_y@invalidactive2024/01/28 MEXICO
C80000DANIEL KINGdaniel@bank.comactive2024-01-30uk
C90000betty wrightbetty.w@mail.netsuspended01-02-2024Spain

🏪 Table 3: merchants

5 rows of merchant reference data with risk levels (Clean data)

merchant_id merchant_name category risk_level
M001Amazon StoreRetailLOW
M002Gas Station XYZFuelLOW
M003Luxury Goods LtdRetailMEDIUM
M004Crypto Exchange ProFinanceHIGH
M005Online Casino 24/7GamblingHIGH

⚠️ Table 4: fraud_reports

8 rows of reported fraud cases linked to transactions

report_id txn_id report_date fraud_type verified
FR001TXN0042024-01-16Suspicious AmountYes
FR002TXN0102024-01-17Unusual LocationNo
FR003TXN0152024-01-18Multiple AttemptsYes
FR004TXN0062024-01-16High Value TransactionYes
FR005TXN0202024-01-20Suspicious AmountYes
FR006TXN0142024-01-18Rapid TransactionsNo
FR007TXN0232024-01-20Flagged AccountYes
FR008TXN0112024-01-17High Risk MerchantYes

📝 How to Import CSV Files

MySQL:

CREATE DATABASE banking_fraud;
USE banking_fraud;

-- Import transactions
LOAD DATA LOCAL INFILE 'transactions.csv'
INTO TABLE transactions
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

-- Repeat for other tables...

PostgreSQL:

CREATE DATABASE banking_fraud;
\c banking_fraud

-- Import transactions
COPY transactions FROM '/path/to/transactions.csv'
DELIMITER ',' CSV HEADER;

-- Repeat for other tables...

Tip: Create tables with appropriate data types before importing. Use VARCHAR for IDs, DECIMAL for amounts, TIMESTAMP for dates.

← Back to Day 13 Challenge