Projects

Hotel Booking Cancellations — Predictive & Geospatial Analysis
2025
Calibrated, reproducible Geo-ML pipeline predicting booking-level cancellation risk and explaining cross-country heterogeneity.
- Harmonised multi-year hotel data with ISO-3; enriched with World Bank WDI + Hofstede; staged imputation (linear fills → UNSD region medians → PMM-MICE) with leakage controls and principled outlier handling.
- Benchmarked Logistic Regression, Random Forest, XGBoost, LightGBM, and MLP via stratified CV in a single ColumnTransformer pipeline; tuned with Optuna; post-calibrated (Platt/Isotonic), with decision thresholds chosen for operational use.
- Explained drivers using TreeSHAP, PDP/ICE, and group-permutation importance to quantify the incremental value of external context beyond a country label.
- Geospatial lens: GeoPandas/Folium choropleths, Bayesian-shrunk country profiles, Moran’s I & LISA diagnostics, and bivariate maps linking predicted risk with macro indicators.
- Outcome: a pragmatic, explainable pipeline that turns calibrated risk into geo-tiered playbooks—supporting overbooking buffers, deposit/guarantee policy tests, and targeted pre-stay outreach.
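The Bayesian-shrunk country profiles mentioned above can be illustrated with a minimal sketch. It assumes simple additive shrinkage of each country's raw cancellation rate toward the global rate; the function name, the `prior_strength` value, and the toy bookings are hypothetical and not taken from the project.

```python
from collections import defaultdict


def shrunk_country_rates(bookings, prior_strength=50.0):
    """Shrink per-country cancellation rates toward the global rate.

    bookings: iterable of (iso3, cancelled) pairs, cancelled being 0/1.
    prior_strength: assumed pseudo-count; larger values pull small
    countries harder toward the global rate.
    """
    counts = defaultdict(lambda: [0, 0])  # iso3 -> [n_bookings, n_cancelled]
    for iso3, cancelled in bookings:
        counts[iso3][0] += 1
        counts[iso3][1] += int(cancelled)

    total = sum(n for n, _ in counts.values())
    if total == 0:
        return {}
    global_rate = sum(c for _, c in counts.values()) / total

    return {
        iso3: (c + prior_strength * global_rate) / (n + prior_strength)
        for iso3, (n, c) in counts.items()
    }


# Toy data: a large market and a country with only two bookings.
bookings = [("PRT", 1)] * 30 + [("PRT", 0)] * 70 + [("ISL", 1)] * 2
rates = shrunk_country_rates(bookings, prior_strength=10)
```

With this toy input the two-booking country no longer shows a 100% cancellation rate, while the large market's estimate barely moves, which is the behaviour that keeps sparse countries from dominating a choropleth.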

Temporal Churn Prediction (ML)
2025
Weekly churn pipeline with strict temporal causality: 90-day lookback, 38-day churn horizon, rolling 7-day snapshots.
- Features engineered predominantly in Spark SQL: RFM, tenure, inter-purchase gaps & momentum, recency-weighted monetary value, rolling frequency, product diversity, and seasonal sine/cosine encodings.
- Features winsorised, log-transformed, and standardised; temporal split (train→val→test by week) with rolling backtests; calibrated probabilities (Platt/Isotonic).
- Benchmarked Logistic Regression, Random Forest, XGBoost; interpretability via SHAP & PDPs; drift-aware weekly evaluation.
- Delivered dual outputs each week: unbiased churn-probability forecasts for planning and a thresholded churn-flag list for targeting.
- Outcome: a robust, explainable pipeline that flags at-risk customers weekly, supports targeted retention, and remains stable under temporal drift.
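The windowing described above (90-day lookback, 38-day horizon, rolling 7-day snapshots) can be sketched in plain Python. This is an illustration under stated assumptions: churn is taken to mean "no purchase inside the horizon", customers with no purchase in the lookback are skipped, and all function names are hypothetical. The project itself built its features in Spark SQL.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 90   # feature window ends strictly before the snapshot
HORIZON_DAYS = 38    # label window starts at the snapshot
STEP_DAYS = 7        # one snapshot per week


def snapshot_dates(first, last):
    """Yield weekly snapshot dates from first to last (inclusive)."""
    current = first
    while current <= last:
        yield current
        current += timedelta(days=STEP_DAYS)


def label_customer(purchase_dates, snapshot):
    """Build one training row without looking past the snapshot for features."""
    lookback_start = snapshot - timedelta(days=LOOKBACK_DAYS)
    horizon_end = snapshot + timedelta(days=HORIZON_DAYS)

    # Features may only use purchases in [lookback_start, snapshot).
    history = [d for d in purchase_dates if lookback_start <= d < snapshot]
    if not history:
        return None  # assumed rule: inactive customers are not scored

    # The label may only use purchases in [snapshot, horizon_end).
    bought_again = any(snapshot <= d < horizon_end for d in purchase_dates)
    return {
        "snapshot": snapshot,
        "frequency": len(history),
        "recency_days": (snapshot - max(history)).days,
        "churned": not bought_again,
    }


purchases = [date(2025, 1, 10), date(2025, 2, 1), date(2025, 4, 20)]
rows = [label_customer(purchases, s)
        for s in snapshot_dates(date(2025, 3, 1), date(2025, 3, 15))]
```

Keeping the feature and label windows as half-open intervals that meet at the snapshot date is what prevents a purchase from being counted on both sides.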

Term-Deposit Subscription Predictor (ML)
2024
Production-ready classifier to prioritise telemarketing contacts for fixed-term deposits.
- EDA on ~4k historical calls; addressed class imbalance (~78.5% non-subscribers); clean categorical encodings; leakage audit.
- Deliberately excluded call duration from final model to avoid target leakage, while retaining it for descriptive insight.
- Benchmarked Logistic Regression, Naive Bayes, Decision Tree, KNN, Random Forest with tuned hyperparameters and stratified evaluation.
- Champion Random Forest: accuracy ≈81.4%, precision ≈62%, recall ≈34%, ROC-AUC ≈0.72; selected for the highest precision among the benchmarked models, which reduces costly false-positive calls.
- Packaged deployable artefacts (model + encoder) with straightforward scoring instructions.
- Outcome: a pragmatic, explainable pipeline that focuses agents on receptive leads and cuts wasted call time.
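To make the precision/recall trade-off above concrete, here is a small sketch of how those metrics follow from predictions. The labels and predictions are toy values chosen for illustration, not the project's data, and `classification_metrics` is a hypothetical helper.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision and recall for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: share of predicted subscribers who really subscribe.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of real subscribers the model finds.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Each false positive is a call to an unreceptive contact.
        "wasted_calls": fp,
    }


# Toy example: 5 true subscribers among 20 contacts.
y_true = [1, 1, 1, 1, 1] + [0] * 15
y_pred = [1, 1, 1, 0, 0] + [1] + [0] * 14
metrics = classification_metrics(y_true, y_pred)
```

With an imbalanced target, accuracy alone says little; precision is the figure that tracks how many calls are spent on contacts who will not subscribe.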

Brand Analysis Using Twitter/X (ML)
2025
Actionable text-analytics workflow to compare brand conversations and surface micro-influencers.
- Text cleaning & normalisation: URL/emoji/mention stripping, tokenisation, stop-word removal, WordNet lemmatisation.
- Hashtag/mention extraction, engagement time-series; sentiment via VADER/TextBlob; TF-IDF + NMF topic modelling.
- Mention-graphs (NetworkX) for visibility/centrality; PCA + K-Means clustering of features.
- Follower-bounded micro-influencer filter; transparent weighted scoring of reach, engagement, centrality, sentiment, activity.
- Outcome: a reproducible analytics stack that clarifies brand themes, finds partnerable micro-influencers, and guides campaign planning.
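The follower-bounded filter and weighted scoring can be sketched as follows. The follower bounds, the weights, the min-max normalisation, and the field names are all assumptions made for illustration; the source only states that reach, engagement, centrality, sentiment, and activity are combined with transparent weights.

```python
# Hypothetical weights; they sum to 1 so scores stay in [0, 1].
WEIGHTS = {
    "reach": 0.20,
    "engagement": 0.30,
    "centrality": 0.20,
    "sentiment": 0.15,
    "activity": 0.15,
}


def minmax(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def score_micro_influencers(accounts, min_followers=1_000,
                            max_followers=50_000, weights=WEIGHTS):
    """Filter by follower count, then rank by a weighted sum of components."""
    eligible = [a for a in accounts
                if min_followers <= a["followers"] <= max_followers]
    if not eligible:
        return []
    # Normalise each component across eligible accounts only.
    normalised = {k: minmax([a[k] for a in eligible]) for k in weights}
    scored = [
        (a["handle"], sum(weights[k] * normalised[k][i] for k in weights))
        for i, a in enumerate(eligible)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


accounts = [
    {"handle": "a", "followers": 5_000, "reach": 10, "engagement": 0.08,
     "centrality": 0.5, "sentiment": 0.6, "activity": 20},
    {"handle": "b", "followers": 20_000, "reach": 50, "engagement": 0.02,
     "centrality": 0.1, "sentiment": 0.2, "activity": 5},
    {"handle": "c", "followers": 2_000_000, "reach": 900, "engagement": 0.01,
     "centrality": 0.9, "sentiment": 0.4, "activity": 50},
]
ranking = score_micro_influencers(accounts)
```

Because the weights are explicit and each component is normalised before weighting, any account's score can be traced back to its individual contributions.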

Customer Analytics & Segmentation (ML)
2025
End-to-end segmentation for a national convenience retailer (3k customers, 6 months).
- Consolidated the customers, category_spends, baskets, and lineitems tables with strict cleansing: date parsing, currency/format normalisation, negative-value fixes, category corrections (e.g., bakery spend recomputation from line-items).
- Engineered features: RFM, tenure, avg basket spend, avg spend per product, unique category count, proportional category spend, avg basket categories.
- Median imputation, IQR outlier clipping, log1p transforms, standardisation; PCA (≥80% variance) to stabilise clustering.
- Benchmarked K-Means (k=5–7 per brief) vs DBSCAN; evaluated with silhouette, SSE, Calinski–Harabasz, Davies–Bouldin.
- Produced pen profiles and an attractiveness_score (weighted features) to rank segments for targeting and campaign design.
- Outcome: a business-aligned, explainable segmentation that turns raw logs into campaign-ready audience playbooks and supports re-clustering cycles.
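The preprocessing order listed above (median imputation, IQR clipping, log1p, standardisation) can be shown on a single feature using only the standard library. This is a sketch under assumptions: the 1.5 × IQR fence, the inclusive quantile method, and the function names are illustrative choices, and the toy spend values are made up.

```python
import math
import statistics


def iqr_clip(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lo), hi) for v in values]


def preprocess(values):
    """Impute, clip, log-transform and standardise one numeric feature."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    filled = [median if v is None else v for v in values]  # median imputation

    clipped = iqr_clip(filled)
    # log1p compresses the right tail; negatives are floored at 0 first.
    logged = [math.log1p(max(v, 0.0)) for v in clipped]

    mean = statistics.fmean(logged)
    std = statistics.pstdev(logged)
    if std == 0:
        return [0.0] * len(logged)
    return [(x - mean) / std for x in logged]


spends = [10, 12, 11, 13, 12, 500]        # one extreme basket
clipped = iqr_clip(spends)                # the 500 is pulled in to the fence
scaled = preprocess([10, 12, None, 13, 12, 500])
```

Clipping before the log transform keeps a single extreme customer from stretching the scale that distance-based methods such as K-Means rely on.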

Zoho ↔ Zoey Integrations (EMtel)
2025
Production automations linking CRM and billing to streamline operations and engagement.
- End-to-end integrations between Zoho One and Tekton Zoey: data, events, and workflows unified across platforms.
- Custom Deluge functions, webhooks, REST integrations (Zoho CRM/Books/Campaigns) with MySQL on Plesk; designed for reliability, auditability, and low maintenance.
- Automation layers for lead routing, campaign engagement, account creation, call scheduling; standardised error handling and observability.
- Predictive signals for likelihood-to-stay and interest intent embedded into Zoho for targeted outreach and tiered follow-ups.
- Partnered with business users on requirements, rapid prototyping, training, and SOPs to drive adoption; reusable modules to accelerate future projects.
- Outcome: resilient, traceable CRM-billing automation that improves customer journeys and reduces manual effort.
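One common way to make webhook-driven automations reliable and auditable is to process each event at most once. The sketch below illustrates that idea only; it uses an in-memory SQLite table as a stand-in for the MySQL database named above, and the table name, `handle_webhook`, and the `event_id` field are hypothetical rather than part of the actual integration.

```python
import sqlite3


def make_store():
    """Create a stand-in event log; the primary key enforces uniqueness."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed_events ("
        "event_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def handle_webhook(conn, event_id, payload, apply_change):
    """Apply a webhook once; return False when the event is a duplicate."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO processed_events (event_id, payload) "
                "VALUES (?, ?)",
                (event_id, payload),
            )
            # If this raises, the insert is rolled back so a retry can succeed.
            apply_change(payload)
    except sqlite3.IntegrityError:
        return False  # event_id already recorded: skip the duplicate
    return True


conn = make_store()
applied = []
first = handle_webhook(conn, "evt-1", '{"lead": 42}', applied.append)
retry = handle_webhook(conn, "evt-1", '{"lead": 42}', applied.append)
```

The event table doubles as an audit trail, since every processed event and its payload stay queryable after the fact.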
Experience
Project Developer, EMtel Limited
- Led the Zoho One ↔ Tekton Zoey integration, unifying CRM, billing and operations by connecting data, events and workflows across platforms.
- Shipped custom Deluge functions, webhooks and REST integrations (Zoho CRM/Books/Campaigns) backed by MySQL on Plesk; designed for reliability, auditability and low maintenance.
- Built automation layers for lead routing, campaign engagement, account creation and call scheduling; standardised error handling and observability for faster troubleshooting.
- Developed predictive signals to score likelihood-to-stay and intent; surfaced results in Zoho for targeted outreach and tiered follow-ups.
- Partnered with business users and stakeholders to capture requirements, prototype quickly and iterate to fit real-world workflows; delivered training and SOPs to drive adoption.
- Implemented secure data flows and schema design in MySQL, optimised queries, and created reusable modules to accelerate future projects.
- Maintained source control and documentation in GitHub for reproducible releases and peer review.
Core stack & tools
Platforms: Zoho One, Tekton Zoey, Plesk
Languages: Deluge, Python, SQL (MySQL), PHP, HTML/CSS/JavaScript
Integration: REST APIs, webhooks, OAuth, error/exception handling, idempotency
DevOps/Tools: GitHub, logging/monitoring, documentation & SOPs
Business: Stakeholder management, requirements elicitation, user training, process redesign
Senior Solution Engineer, IBS Software
Client: Expedia Group
- Spearheaded migration of the Runtime Compute Platform (RCP) for Airmate (flight operations), delivering optimised infrastructure to support new BI capabilities.
- Bridged IT capabilities with business objectives; guided stakeholders to implement data-driven decision-making via insights platforms.
- Led testing and integration for Airmate enhancements, improving availability, quality and scalability of flight operations data.
- Applied analytical modelling and data visualisations to optimise configurations, improve scalability and streamline flight-related processes.
- Awards: Team Champ (successful RCP migration); Debutant; Passionate & Persistent (Virtual) for technical leadership and transformation impact.
- Progression: Promoted from Solution Engineer to Senior Solution Engineer for stakeholder understanding, solution ideation and measurable delivery.
Core stack & tools
Full-Stack & Languages: Core Java, Kotlin, Python, JavaScript, React.js
Cloud & DevOps: AWS, Kubernetes, Docker, Jenkins, Spinnaker, GitHub Actions
Data & DB: SQL
Practices: Project management, problem solving, business analysis, GitHub/CI/CD
Education
MSc Business Analytics, University of Nottingham (UK)
Scholarship: Postgraduate Excellence Award
Modules covered
- Machine Learning and Predictive Analytics
- Foundational Business Analytics
- Analytics Specializations and Applications
- Data at Scale: Management, Processing, Visualization
- Leading Big Data Business Projects
- Supply Chain Planning and Management
- Digital Marketing
Bachelor’s in Electronics & Communication Engineering, Adi Shankara Institute of Engineering and Technology (India)
Capstone project: Kinesics Articulation
Built a microprocessor-driven system in Python to translate sensor-captured gestures into audible speech, supporting communication for deaf and hard-of-hearing users. Also designed a mobile text-to-speech app (Android Studio + Flutter) for accessible, real-time communication. Demonstrated in schools and deaf community facilities, earning recognition from colleges and universities.
Honors & Awards
Recognitions
- Postgraduate Excellence Award Scholarship — University of Nottingham
- Team Champ Award — IBS Software
- Passionate Virtual Award — IBS Software
- Perseverant Virtual Award — IBS Software
- Debutant Award — IBS Software
- ACCESS Project Presentation Certificate — Adi Shankara Institute of Engineering and Technology
© 2025 Karthik Rajesh • Built with GitHub Pages (Jekyll)