Rehema Kemunto

Data Analysis | Statistics | Research

Connect on LinkedIn

About Me

I specialize in extracting actionable insights from complex datasets through statistical analysis and data visualization. My work spans rigorous hypothesis testing (Welch's t-tests, Mann-Whitney U, Bayesian inference), root cause analysis for operational problems, and building interactive dashboards that communicate data clearly. I've analyzed datasets ranging from 5,000 survey responses to 185,000+ e-commerce transactions, consistently translating technical findings into strategic recommendations. Whether working in R, Python, SQL, or Power BI, my goal is always the same: turn messy data into decisions that create measurable business value.

Technical Skills

Hypothesis Testing
Experimental Design
Bayesian Inference
Regression Analysis
R
Python
SQL
Power BI
Tableau
Excel
Looker Studio
SPSS
Root Cause Analysis
Data Cleaning
Dashboard Development

Featured Projects

My top 5 projects showcasing statistical rigor, business impact, and technical depth

Weather Impact Analysis: Hypothesis Testing

Statistical Research & Optimization

Conducted Welch's two-sample t-test on 17,400+ hourly observations to quantify weather impact on public transit demand. Validated statistical assumptions (normality, variance) and delivered data-backed staffing recommendations achieving £23K/month cost savings through evidence-based dynamic scheduling.

Python Hypothesis Testing Statistical Validation Operations Research
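
For readers unfamiliar with the method, here is a minimal sketch of Welch's unequal-variance t-test using only the Python standard library. It is not the project's code: the sample values and the names `welch_t_test`, `clear_hours`, and `rainy_hours` are invented for illustration, and the p-value uses a normal approximation that is only reasonable for large samples such as the 17,400+ observations above. In practice, `scipy.stats.ttest_ind(a, b, equal_var=False)` runs the same test with an exact t-distribution p-value.

```python
import math
import statistics
from statistics import NormalDist


def welch_t_test(a, b):
    """Return (t, df, approximate two-sided p) for Welch's t-test.

    The p-value uses a normal approximation, which is only reasonable
    when the degrees of freedom are large.
    """
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p


# Hypothetical hourly rider counts, invented for illustration only.
clear_hours = [120, 135, 128, 142, 131, 138, 125, 140]
rainy_hours = [98, 110, 87, 121, 95, 104, 115, 90]

t_stat, dof, p_value = welch_t_test(clear_hours, rainy_hours)
print(f"t = {t_stat:.2f}, df = {dof:.1f}, approx. p = {p_value:.4f}")
```

Welch's version is the safer default when the two groups may have different variances, because it does not pool them.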

Left-Digit Bias Audit: Pricing Psychology

Behavioral Economics Research

Analyzed 185,000+ e-commerce transactions testing whether $X.99 pricing increases sales across product categories. Applied Mann-Whitney U test after confirming non-normal distribution via Shapiro-Wilk. Found significant effect for phones (+5.16% lift, p<0.001) but no effect for laptops, demonstrating category-specific pricing psychology.

Python Mann-Whitney U Test Non-Parametric Statistics Pricing Strategy
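
As a rough illustration of the rank-based comparison described above, the sketch below computes the Mann-Whitney U statistic with a normal-approximation p-value using only the standard library. It is a simplified stand-in, not the project's code: the data and the names `mann_whitney_u`, `charm_priced`, and `round_priced` are hypothetical, the variance is not tie-corrected, and the Shapiro-Wilk normality check is omitted. With SciPy installed, `scipy.stats.shapiro` and `scipy.stats.mannwhitneyu` cover both steps.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Return (U for x, approximate two-sided p).

    Ties receive average ranks. The normal approximation here applies
    no tie or continuity correction, so treat the p-value as rough.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1..j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 2 * (1 - NormalDist().cdf(abs((u - mu) / sigma)))
    return u, p


# Hypothetical daily unit sales, invented for illustration only.
charm_priced = [14, 18, 11, 22, 17, 19, 25, 16]   # prices ending in .99
round_priced = [12, 15, 10, 13, 18, 11, 14, 9]    # round-number prices

u_stat, p_value = mann_whitney_u(charm_priced, round_priced)
print(f"U = {u_stat}, approx. p = {p_value:.4f}")
```

Because the test compares ranks rather than means, it stays valid for skewed sales data where a t-test's normality assumption fails.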

Retail Operations & Finance Optimization

MIS - End-to-End Workflow

Comprehensive MIS project tracking $66.31M in revenue. Built a Power BI dashboard, automated invoice alerts with Google Apps Script, and identified the warehouses responsible for fulfillment delays through systematic data analysis.

SQL Power BI Google Apps Script Root Cause Analysis

Bayesian A/B Testing: E-Commerce Optimization

Experimental Design

Designed a Beta-Binomial Bayesian model to test the impact of product density on conversion rates. Conducted sensitivity analysis across multiple priors and compared Bayesian and Frequentist approaches. Demonstrated a 98% probability of superiority and the advantages of probabilistic decision-making for real-time experiments.

Python Bayesian Statistics A/B Testing Monte Carlo
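
To make the Beta-Binomial idea concrete, here is a minimal Monte Carlo sketch of estimating the probability that one variant beats another, repeated under a few priors as a simple sensitivity check. Everything in it is assumed for illustration: the conversion counts, the priors, and the function name `prob_b_beats_a` are hypothetical and are not taken from the project.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, prior=(1.0, 1.0),
                   draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta posteriors."""
    rng = random.Random(seed)
    alpha0, beta0 = prior
    wins = 0
    for _ in range(draws):
        # A Beta prior with binomial data gives a Beta posterior:
        # Beta(alpha0 + conversions, beta0 + non-conversions).
        rate_a = rng.betavariate(alpha0 + conv_a, beta0 + n_a - conv_a)
        rate_b = rng.betavariate(alpha0 + conv_b, beta0 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical counts: (conversions, visitors) per variant.
variant_a = (120, 2400)
variant_b = (150, 2400)

# Sensitivity check: uniform, Jeffreys, and a mildly informative prior.
for prior in [(1.0, 1.0), (0.5, 0.5), (2.0, 38.0)]:
    p = prob_b_beats_a(*variant_a, *variant_b, prior=prior)
    print(f"prior Beta{prior}: P(B > A) is about {p:.3f}")
```

If the estimate barely moves across priors, the conclusion is driven by the data rather than by the prior choice.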

Polo Shirt Product Performance Analysis

Root Cause Analysis

Analyzed 5,000+ transactions using statistical validation to diagnose a 48% return rate. Identified a quality issue in the Black variant through systematic data segmentation and delivered evidence-based recommendations projected to increase profit margin by 23%.

SQL Power BI Root Cause Analysis Statistical Validation

Complete Project Archive

All projects organized by category

E-Commerce Superstore BI Dashboard

Data Analysis & BI

Full-cycle BI analysis from Power Query to strategic recommendations. Advanced DAX measures for Net Sales and Profit Margin. Delivered Q4 campaign and regional optimization strategies based on seasonal trends and profitability analysis.

Power BI DAX Power Query

Customer Segmentation Dashboard

Marketing Analytics

Tableau dashboard using behavioral data to create distinct customer segments (High-Value, At-Risk) enabling targeted marketing campaigns and personalized retention strategies based on spending patterns and loyalty metrics.

Tableau Customer Segmentation Marketing Analytics

Social Media Ad Performance Dashboard

Marketing Analytics

Power BI dashboard analyzing campaigns across Facebook and Instagram. Tracked 340K impressions and $101K revenue with 333% ROI. Identified optimal ad scheduling times and demographic performance patterns for budget allocation.

Power BI DAX Marketing Analytics

Nobel Prize Trends Analysis (1901-2025)

Statistical Research & Visualization

Analyzed 124 years of Nobel Prize data (1,000+ laureates) to uncover trends in age, collaboration, gender representation, and geographic distribution. Built an interactive Looker Studio dashboard revealing that the gender gap is slowly closing (6.6% historically → 19.4% in the 2020s) and that solo science is in steep decline (78% individual prizes in the early 1900s → 35% today).

R Looker Studio Data Analysis Trend Analysis

Coffee Shop Survey Analysis

Statistical Analysis

Comprehensive statistical analysis using chi-square tests, ANOVA, and logistic regression to model customer satisfaction and loyalty drivers. Identified key factors influencing Net Promoter Score and likelihood to recommend.

R Logistic Regression ANOVA Chi-Square Tests

College Event Feedback Analysis

NLP & Sentiment Analysis

Data analysis project using Natural Language Processing to evaluate student sentiment and satisfaction trends from campus event surveys. Applied text analysis techniques to extract meaningful insights from open-ended feedback.

Python NLP Sentiment Analysis Text Analysis

Demographic Data Analysis

Descriptive Statistics

Exploratory analysis of descriptive statistics, distribution characteristics, and relationships between key demographic variables in a dataset of 5,000 respondents. Applied correlation analysis and visualization techniques.

Python Pandas EDA Statistical Analysis

Cluster Analysis on Health Data

Unsupervised Learning

Applied k-means and hierarchical clustering to segment cardiovascular health data from the Framingham Heart Study. Revealed distinct participant segments based on risk factors for targeted health interventions.

R K-Means Hierarchical Clustering

Certifications

J.P. Morgan - Quantitative Research

Forage

October 2025

BCG - Data for Decision Makers

Forage

November 2025

Deloitte - Data Analytics

Forage

August 2025

Accenture - Data Analytics & Visualization

Forage

May 2025

Data Manipulation with Pandas

DataCamp

September 2025

Understanding Machine Learning

DataCamp

February 2025

View all certifications on LinkedIn

Let's Connect

I'm always open to discussing new projects, creative ideas, or opportunities to bring data-driven solutions to life. Feel free to reach out via LinkedIn or GitHub!