Banking Fraud Detection - Transaction Analysis
Download CSV files for all 4 tables with 20+ rows of messy banking data
25 rows of messy transaction data with inconsistent formats, NULL values, and suspicious patterns
| txn_id | customer_id | merchant_id | amount | txn_date | status |
|---|---|---|---|---|---|
| TXN001 | C12345 | M001 | $2,500.00 | 2024-01-15 14:23:11 | COMPLETED |
| TXN002 | c67890 | m002 | 45.5 | 15/01/2024 | pending |
| TXN003 | C12345 | M003 | $-150 | 2024-01-15 14:25:00 | Refund |
| TXN004 | C99999 | M001 | 15000.00 | 2024/01/16 03:45 | FLAGGED |
| TXN005 | NULL | M002 | $99.99 | 2024-01-16 | failed |
| TXN006 | C11111 | M004 | $25,000.00 | 2024-01-16 10:15:30 | COMPLETED |
| TXN007 | c22222 | m005 | 500 | 16/01/2024 11:30 | COMPLETED |
| TXN008 | C12345 | M001 | $125.50 | 2024-01-16 14:26:00 | completed |
| TXN009 | C67890 | M002 | $78.00 | 2024/01/17 09:00 | COMPLETED |
| TXN010 | c33333 | M004 | 50000 | 17/01/2024 | FLAGGED |
| TXN011 | C11111 | M005 | $1,200.00 | 2024-01-17 15:45:22 | completed |
| TXN012 | C22222 | M003 | $350.99 | 2024-01-18 08:12:00 | COMPLETED |
| TXN013 | C44444 | m001 | 89.5 | 18/01/2024 10:30 | pending |
| TXN014 | C99999 | M004 | $8,500.00 | 2024/01/18 14:20 | FLAGGED |
| TXN015 | c12345 | M002 | 45 | 2024-01-18 16:00:00 | completed |
| TXN016 | C33333 | M005 | $3,750.00 | 19/01/2024 | COMPLETED |
| TXN017 | C55555 | M001 | $0 | 2024-01-19 11:15:30 | failed |
| TXN018 | c11111 | M003 | $-200 | 2024/01/19 13:45 | Refund |
| TXN019 | C67890 | M002 | 125.99 | 19/01/2024 16:20 | COMPLETED |
| TXN020 | NULL | M004 | $45,000.00 | 2024-01-20 | FLAGGED |
| TXN021 | C22222 | m001 | $67.50 | 2024-01-20 09:30:15 | completed |
| TXN022 | C44444 | M005 | $2,100.00 | 20/01/2024 12:00 | COMPLETED |
| TXN023 | c99999 | M004 | 12000 | 2024/01/20 15:45 | FLAGGED |
| TXN024 | C12345 | M003 | $450.00 | 2024-01-21 10:10:00 | COMPLETED |
| TXN025 | C55555 | M002 | $99.99 | 21/01/2024 | pending |
20 rows of customer data with inconsistent names, emails, and account statuses
| customer_id | full_name | account_status | join_date | country | |
|---|---|---|---|---|---|
| C12345 | john smith | JOHN.SMITH@EMAIL.COM | active | 2023-05-10 | USA |
| c67890 | SARAH JONES | sarah@invalid | ACTIVE | 10/06/2023 | uk |
| C11111 | Mike Wilson | mike.w@bank.com | suspended | 2023-07-15 | Canada |
| C99999 | anna garcia | anna@email.co | FLAGGED | 2024-01-01 | MEXICO |
| C22222 | Bob Brown | bob@mail.net | active | 2023/08/20 | usa |
| c33333 | EMMA DAVIS | emma.d@crypto.com | FLAGGED | 2023-09-10 | UK |
| C44444 | david lee | DAVID@SHOP.COM | active | 15/10/2023 | CANADA |
| C55555 | Lisa Martinez | lisa_m | active | 2023-11-05 | spain |
| c66666 | JAMES TAYLOR | james.t@email.com | suspended | 2023/12/01 | USA |
| C77777 | Maria Lopez | maria@invalid | active | 20-12-2023 | mexico |
| C88888 | robert clark | ROBERT.C@MAIL.NET | ACTIVE | 2024-01-10 | UK |
| c10000 | Nancy White | nancy.white@bank.com | active | 15/01/2024 | canada |
| C20000 | KEVIN HARRIS | kevin@crypto.io | FLAGGED | 2024/01/18 | USA |
| C30000 | linda martin | linda@email.com | active | 2024-01-20 | Spain |
| c40000 | Paul Thompson | paul | suspended | 22/01/2024 | UK |
| C50000 | KAREN WALKER | karen.w@shop.net | active | 2024-01-25 | USA |
| C60000 | mark hall | MARK.H@EMAIL.COM | ACTIVE | 26/01/2024 | canada |
| c70000 | Sandra Young | sandra_y@invalid | active | 2024/01/28 | MEXICO |
| C80000 | DANIEL KING | daniel@bank.com | active | 2024-01-30 | uk |
| C90000 | betty wright | betty.w@mail.net | suspended | 01-02-2024 | Spain |
5 rows of merchant reference data with risk levels (Clean data)
| merchant_id | merchant_name | category | risk_level |
|---|---|---|---|
| M001 | Amazon Store | Retail | LOW |
| M002 | Gas Station XYZ | Fuel | LOW |
| M003 | Luxury Goods Ltd | Retail | MEDIUM |
| M004 | Crypto Exchange Pro | Finance | HIGH |
| M005 | Online Casino 24/7 | Gambling | HIGH |
8 rows of reported fraud cases linked to transactions
| report_id | txn_id | report_date | fraud_type | verified |
|---|---|---|---|---|
| FR001 | TXN004 | 2024-01-16 | Suspicious Amount | Yes |
| FR002 | TXN010 | 2024-01-17 | Unusual Location | No |
| FR003 | TXN015 | 2024-01-18 | Multiple Attempts | Yes |
| FR004 | TXN006 | 2024-01-16 | High Value Transaction | Yes |
| FR005 | TXN020 | 2024-01-20 | Suspicious Amount | Yes |
| FR006 | TXN014 | 2024-01-18 | Rapid Transactions | No |
| FR007 | TXN023 | 2024-01-20 | Flagged Account | Yes |
| FR008 | TXN011 | 2024-01-17 | High Risk Merchant | Yes |
CREATE DATABASE banking_fraud; USE banking_fraud; -- Import transactions LOAD DATA LOCAL INFILE 'transactions.csv' INTO TABLE transactions FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS; -- Repeat for other tables...
CREATE DATABASE banking_fraud; \c banking_fraud -- Import transactions COPY transactions FROM '/path/to/transactions.csv' DELIMITER ',' CSV HEADER; -- Repeat for other tables...
Tip: Create tables with appropriate data types before importing. Use VARCHAR for IDs, DECIMAL for amounts, TIMESTAMP for dates.