Hao Xu

MagicMenu - AI Powered Smart Dining Assistant

Personal Project — AI & Flutter-based Smart Menu Assistant 2025.10 – 2025.11

When dining out, especially abroad, it's often hard to decide what to order just by looking at dish names. Language barriers and poor translations make it difficult to know ingredients or cooking methods.
To address this, I developed a Flutter-based mobile app. Its core function allows users to upload menu photos, which are analyzed by LLM APIs to generate a structured menu in the user's chosen language. This new menu includes photos, ingredients, cooking methods, allergens, and cultural background. The project demonstrates my full-stack capabilities in mobile development, AI integration, and complex data processing.

1. Intelligent Multi-Image Menu Analysis: Users can capture or upload multiple menu photos at once. Using asynchronous processing streams, unstructured images are converted into structured dish data—including names, recipes, ingredients, allergens, and cultural history—presented in a standardized format.
2. AI Ordering Assistant: An AI conversational assistant based on the current menu context. Users can ask questions like "Is this dish spicy?" or "Recommend a vegetarian option," and the AI responds based on the parsed menu data.
3. LBS Nearby Restaurant Explorer: Integrated geolocator to fetch real-time user location and dynamically display nearby restaurants. Implemented distance and rating-based sorting algorithms, and an interactive details page UI with image carousels, ratings, and service tags. It can also crawl restaurant websites to generate preset menus.
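The distance and rating-based sorting in item 3 can be sketched as follows. This is an illustrative Python version, not the app's Dart code; the `Restaurant` fields, the function names, and the sample coordinates are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Restaurant:
    # Hypothetical record; the real app's model may carry more fields.
    name: str
    lat: float
    lon: float
    rating: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def sort_by_distance(restaurants, user_lat, user_lon):
    """Nearest restaurant first."""
    return sorted(restaurants, key=lambda r: haversine_km(user_lat, user_lon, r.lat, r.lon))


def sort_by_rating(restaurants):
    """Highest-rated restaurant first."""
    return sorted(restaurants, key=lambda r: r.rating, reverse=True)


# Made-up sample data around a made-up user position.
user = (57.7089, 11.9746)
places = [
    Restaurant("Far Bistro", 57.7500, 12.0500, 4.8),
    Restaurant("Near Cafe", 57.7090, 11.9750, 4.1),
    Restaurant("Mid Grill", 57.7200, 11.9900, 4.5),
]
nearest_first = [r.name for r in sort_by_distance(places, *user)]
best_first = [r.name for r in sort_by_rating(places)]
```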
Core Tech & AI Integration
1. Multi-Model AI Engine Integration: Encapsulated a generic LLM service layer supporting OpenAI (GPT-4o), Google Gemini, and Qwen. Designed System Prompts to force strict JSON output, ensuring frontend parsing stability.
2. SerpApi Search Capability: Used SerpApi's Google Maps engine to fetch real-time restaurant metadata (Place ID, hours, reviews) and to search for dish images.
3. Local Caching: Used shared_preferences for user settings (API keys, model selection) and JSON files on the local file system for history storage, so the app needs no backend.
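To illustrate why strict JSON output matters for parsing stability (item 1), here is a minimal Python sketch of a defensive parser for a model reply. It is not the app's Dart implementation; the required fields in `REQUIRED_FIELDS` and the function names are assumptions chosen for the example.

```python
import json

FENCE = "`" * 3  # Markdown code fence that some models add around JSON

# Hypothetical minimal schema: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "ingredients": list, "allergens": list}


def strip_code_fence(text):
    """Remove a surrounding Markdown fence (with optional language tag), if present."""
    text = text.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        inner = text[len(FENCE):-len(FENCE)]
        first_newline = inner.find("\n")
        # Drop a language tag such as "json" on the first line.
        if first_newline != -1 and inner[:first_newline].strip().isalpha():
            inner = inner[first_newline + 1:]
        return inner.strip()
    return text


def parse_menu_response(raw):
    """Parse a reply expected to be a JSON array of dishes; raise ValueError otherwise."""
    dishes = json.loads(strip_code_fence(raw))  # JSONDecodeError is a ValueError
    if not isinstance(dishes, list):
        raise ValueError("expected a JSON array of dishes")
    for i, dish in enumerate(dishes):
        if not isinstance(dish, dict):
            raise ValueError(f"dish {i} is not a JSON object")
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(dish.get(field), expected_type):
                raise ValueError(f"dish {i}: '{field}' missing or not a {expected_type.__name__}")
    return dishes
```

Rejecting a malformed reply early lets the caller retry the request instead of rendering a half-parsed menu.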

Flutter Dart LLM Integration SerpApi Google Maps API Mobile Development

Evaluating the Impact of a Pricing System on Revenue Using a Two-Way Fixed Effects DID Model

Stena Line, Sweden 2025.01 - 2025.02

Project Background & Objective
In 2020, Stena Line launched an EMSR-based (Expected Marginal Seat Revenue) automatic pricing system to help Regional Managers set ticket prices. The system was not mandatory, which led to very different adoption rates across routes: some routes barely used it (usage ≈ 0%), while others used it almost all the time (usage ≈ 100%).
Management wanted to assess whether this system actually increased passenger revenue. We selected 15 routes for analysis. However, COVID-19 also broke out in 2020 — at the exact same time the pricing system was introduced — causing dramatic industry-wide revenue fluctuations. This made it impossible to simply compare "pre-2020 vs. post-2020" or predict post-pandemic revenue based on historical trends.
Methodology / Solution
We applied a Two-Way Fixed Effects Difference-in-Differences (DID) model:
ln(Revenue_it + 1) = β ⋅ (UsageRate_it × Post_t) + γ_i + δ_t + ε_it
Where:
• γ_i (Route Fixed Effects) — controls for time-invariant route-specific characteristics
• δ_t (Time Fixed Effects) — controls for shocks affecting all routes at the same time
• UsageRate × Post — the key interaction term capturing post-COVID treatment intensity
Reasons:
• The adoption of the pricing system is continuous (0%–100%) rather than binary (0/1).
• Without route fixed effects, a traditional specification (β * usage_rate) would incorrectly attribute baseline revenue differences between routes to the pricing system. γ_i removes these inherent cross-route differences that are unrelated to usage.
• COVID-19 and the subsequent recovery significantly impacted passenger revenue over time. Time fixed effects δ_t eliminate industry-wide fluctuations and prevent falsely attributing common shocks to the pricing system.
• Revenue levels differ greatly across routes, and using raw values can let outliers dominate the regression. Taking the natural logarithm stabilizes variance and allows the coefficient to be interpreted as a percentage effect rather than an absolute change.
Implementation Steps
1. Data Setup
• Routes: 15
• Time Range: 2015–2025 (monthly)
• Variables: Revenue, usage rate, time dummy (post)
2. Variable Construction
• post = 1 if year ≥ 2020, else 0
• ur_post = usage_rate × post
• Apply natural log transformation to revenue: ln(Revenue + 1)
3. Model Estimation
• Use statsmodels OLS with cluster-robust standard errors
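The estimation steps above can be sketched without dependencies. For a balanced panel, two-way demeaning gives the same β as OLS with route and period dummies. This is a minimal illustration on synthetic data, not the project's statsmodels code: the data, `TRUE_BETA`, and the function names are assumptions, and cluster-robust standard errors are left out.

```python
import random


def two_way_demean(values):
    """values: {(route, period): v} for a balanced panel.
    Returns v_it - mean_i - mean_t + grand mean."""
    routes = {r for r, _ in values}
    periods = {t for _, t in values}
    grand = sum(values.values()) / len(values)
    route_mean = {r: sum(values[r, t] for t in periods) / len(periods) for r in routes}
    period_mean = {t: sum(values[r, t] for r in routes) / len(routes) for t in periods}
    return {(r, t): v - route_mean[r] - period_mean[t] + grand for (r, t), v in values.items()}


def twfe_beta(x, y):
    """Within estimator of beta in y_it = beta * x_it + gamma_i + delta_t + e_it."""
    xd, yd = two_way_demean(x), two_way_demean(y)
    return sum(xd[k] * yd[k] for k in xd) / sum(v * v for v in xd.values())


# Synthetic balanced panel: 15 routes, monthly periods for 2015 to 2025.
rng = random.Random(0)
TRUE_BETA = 0.08  # value planted in the fake data, to check the estimator
routes = range(15)
periods = [(year, month) for year in range(2015, 2026) for month in range(1, 13)]
usage = {r: r / 14 for r in routes}                      # adoption from 0% to 100%
route_effect = {r: rng.uniform(10, 14) for r in routes}  # gamma_i
period_effect = {t: rng.uniform(-1, 1) for t in periods} # delta_t, e.g. COVID shocks

ur_post, log_rev = {}, {}
for r in routes:
    for t in periods:
        post = 1 if t[0] >= 2020 else 0
        ur_post[r, t] = usage[r] * post
        # log_rev stands in for ln(Revenue + 1)
        log_rev[r, t] = (TRUE_BETA * ur_post[r, t] + route_effect[r]
                         + period_effect[t] + rng.gauss(0, 0.01))

beta_hat = twfe_beta(ur_post, log_rev)
print(round(beta_hat, 3))
```

The route and period effects in the fake data are much larger than β, yet the estimate stays close to the planted value because both sets of fixed effects are removed before fitting.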
Results
The interaction coefficient β is estimated at 0.08, meaning that a route with full adoption of the pricing system would see approximately an 8% increase in passenger revenue relative to a route with no adoption. However, the p-value is 0.12, so the estimate is positive in direction but not statistically significant at conventional thresholds (1%, 5%, or 10%).
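As a short worked check of the percentage reading: with a log outcome, the exact effect of moving usage from 0 to 1 is exp(β) − 1, which for small β is close to β itself. The snippet only restates the reported coefficient.

```python
import math

beta = 0.08  # reported interaction coefficient

# Exact percentage change implied by a log-outcome coefficient.
pct_effect = math.expm1(beta) * 100
print(f"{pct_effect:.2f}%")  # prints 8.33%
```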

Python Econometrics Statistical Analysis Difference-in-Differences Fixed Effects Models Regression Analysis Data Analysis

Estimated Ready for Pick-up Time (Estimated RPT) Optimization

Stena Line, Sweden 2024.09 - 2024.12

This project increased the accuracy of the Estimated RPT from 10% to 90%.

1. Problem Breakdown
Estimated RPT = ETA + Unloading Duration. Improving accuracy requires optimizing both parts.
• ETA: Highly uncertain, influenced by route, weather, and vessel speed, often with deviations of several hours.
• Unloading Duration: Large vessels take 6–10 hours to unload, creating significant gaps between the first and last Real RPT.
2. ETA Optimization
• Use Stena Line's offshore real-time ETA system via API integration.
• Refresh ETA at key stages: departure, hourly during voyage, and upon arrival.
3. Unloading Duration Optimization
3.1 Deck and Subsection
• Current deck-level grouping is too coarse; unloading times vary within a deck.
• Further divide decks into ~6 subsections (front/middle/rear × left/right).
• Smaller variance within subsections improves Estimated RPT accuracy.
• While zone-level averages provide a baseline, unloading times also need to be adjusted for the vessel's time of arrival, which significantly affects unloading time.
3.2 Subsection Trade-off
• Too large → high variance, low accuracy.
• Too small → sparse data, risk of overfitting, higher operational complexity.
• Optimal size determined through analysis and simulation.
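The grouping trade-off in section 3 can be illustrated with a small Python sketch: finer subsections shrink the spread of unloading offsets, while a minimum sample size guards against sparse groups. The record layout, the sample numbers, and the `min_samples` fallback rule are assumptions for illustration, not the production logic.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_stats(records, key):
    """records: (deck, subsection, minutes_after_arrival) tuples.
    Returns {group: (mean, std, count)} for the grouping given by `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec[2])
    return {k: (mean(v), pstdev(v), len(v)) for k, v in groups.items()}


def pick_stats(deck, subsection, sub_stats, deck_stats, min_samples=3):
    """Use subsection statistics when there is enough data, else fall back to the deck."""
    stats = sub_stats.get((deck, subsection))
    if stats is None or stats[2] < min_samples:
        return deck_stats[deck]
    return stats


# Made-up history: minutes between vessel arrival and a trailer being ready.
history = [
    (3, "front-left", 30), (3, "front-left", 40), (3, "front-left", 50),
    (3, "rear-right", 200), (3, "rear-right", 210), (3, "rear-right", 220),
]
deck_stats = group_stats(history, key=lambda r: r[0])
sub_stats = group_stats(history, key=lambda r: (r[0], r[1]))
```

In this toy data the deck-level standard deviation is roughly ten times the subsection-level one, which is the reason a subsection-based estimate can use a much narrower interval.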
4. Accuracy vs. Effectiveness
Estimated RPT is an interval, Real RPT a time point. Accuracy means the Real RPT falls within the interval. Expanding the interval improves accuracy but reduces prediction effectiveness. Different ports adopt tailored accuracy targets and interval widths.
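A minimal Python sketch of this accuracy definition: the share of Real RPTs that fall inside their estimated interval, computed for several interval widths. The numbers are made up and the function names are assumptions.

```python
def make_intervals(point_estimates, half_width):
    """Turn point estimates (minutes) into symmetric intervals."""
    return [(p - half_width, p + half_width) for p in point_estimates]


def coverage(real_rpts, intervals):
    """Share of real pick-up times that fall inside their estimated interval."""
    hits = sum(lo <= real <= hi for real, (lo, hi) in zip(real_rpts, intervals))
    return hits / len(real_rpts)


# Made-up data in minutes after a reference time.
estimates = [100, 200, 300, 400]
real = [110, 170, 305, 460]

for half_width in (15, 30, 60):
    acc = coverage(real, make_intervals(estimates, half_width))
    print(half_width, acc)
```

Wider intervals raise the accuracy figure, but an interval of two hours tells a haulier far less than one of thirty minutes, which is why each port chooses its own target.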

Python Data Analysis API Integration Predictive Modeling Statistical Analysis Simulation Operations Research

Full-Stack Serverless Photography Portfolio Website

Personal Project — www.haoexplore.com 2025.07 - 2025.08

Photography is one of my passions, and I had always wanted to build a personal website for it. The finished site is more than a static gallery: it is an interactive, cloud-powered platform that showcases my photos and demonstrates my end-to-end full-stack development skills.

🎨 Frontend
I designed and implemented a fully responsive, modern web application using HTML5 | CSS3 | JavaScript ES6+, featuring:
1. Interactive Photo Gallery with smooth animations, lazy loading, and intuitive navigation
2. 360° Spherical Panoramic View using Pannellum.js for immersive experiences
3. Photo Rating System with 5-star ratings and cloud synchronization
4. Leaflet.js Map Integration displaying geographic footprints with year-based filtering
5. Smart Image Processing with automatic WebP conversion and thumbnail generation
6. User Engagement Tools including email subscription and social media integration
⚙️ Backend (Serverless on AWS)
Built a highly scalable, cost-efficient serverless architecture on AWS:
1. Amazon API Gateway — RESTful API endpoints with CORS configuration and request validation
2. AWS Lambda (Python) — Business logic including: Gallery and photo management with CRUD operations; Advanced image processing using Pillow library; Direct S3 upload using presigned URLs (bypassing 10MB API Gateway limit); Photo rating system with device-based user identification, etc
3. Amazon S3 — Optimized photo storage with WebP format and intelligent tiering
4. Amazon DynamoDB — Three-table NoSQL architecture for galleries, photos, and ratings
5. AWS Lambda Layers — Pre-built layers for Pillow and requests libraries
6. Performance Optimization: WebP format conversion (95% quality originals, 40% thumbnails); Parallel photo processing and uploads; Automatic thumbnail generation with smart dimensions; Metadata synchronization between S3 and DynamoDB
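The presigned-URL upload path mentioned in item 2 might look roughly like the Lambda handler below. It is a sketch under assumptions, not the site's actual code: the bucket name, key layout, request fields, and the injectable `presign` parameter (added so the handler can be exercised without AWS credentials) are all hypothetical. The boto3 call used is `generate_presigned_url`.

```python
import json
import uuid

BUCKET = "example-photo-bucket"  # hypothetical bucket name
URL_TTL_SECONDS = 300            # hypothetical expiry for the upload URL


def _boto3_presign(key, content_type):
    """Create a presigned PUT URL so the browser uploads straight to S3."""
    import boto3  # imported lazily so the module loads where boto3 is absent

    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": BUCKET, "Key": key, "ContentType": content_type},
        ExpiresIn=URL_TTL_SECONDS,
    )


def handler(event, context, presign=_boto3_presign):
    """API Gateway proxy handler: validate the request, return an upload URL."""
    body = json.loads(event.get("body") or "{}")
    filename = body.get("filename")
    content_type = body.get("contentType", "")
    if not filename or not content_type.startswith("image/"):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "filename and an image contentType are required"}),
        }
    key = f"uploads/{uuid.uuid4()}/{filename}"
    return {
        "statusCode": 200,
        "body": json.dumps({"uploadUrl": presign(key, content_type), "key": key}),
    }
```

Only a small JSON request passes through API Gateway; the photo bytes go directly to S3, so the gateway's payload limit never applies to the image itself.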

Full-Stack Web Development End-to-End Development AWS Cloud (API Gateway, Lambda, S3, DynamoDB, SES) UI/UX Design Serverless Architecture Python

Power BI-Based Port Operations Monitoring System

Stena Line, Sweden 2023.03 - 2023.10

At the request of management, I designed and developed a Power BI Operations Monitoring System to provide multi-dimensional analysis of port efficiency (daily, weekly, monthly, as well as vessel-level and port-level granularities). The dashboard later served as the foundation for multiple spin-off projects that further improved port efficiency.

Data Integration & Extraction:
• Wrote complex SQL queries to extract vessel arrival/departure, loading/unloading, gate operations, and trailer movement data.
• Built automated ETL pipelines using Python scripts and Power Query for bulk data ingestion, cleaning, and transformation. Tasks included filling missing timestamps, validating voyage IN/OUT sequences, deduplication, and outlier handling.
• Applied Python for advanced ETL logic, such as irregular timestamp conversion and batch correction of abnormal voyages.
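Two of the cleaning tasks above, deduplication and IN/OUT sequence validation, can be sketched in plain Python. The event layout and the alternation rule (each vessel's time-ordered events must go IN, OUT, IN, OUT, and so on) are assumptions for illustration; the real pipeline's rules may differ.

```python
from collections import defaultdict
from datetime import datetime


def clean_events(events):
    """events: (vessel, direction, iso_timestamp) tuples.
    Returns (deduplicated events, vessels with a broken IN/OUT sequence)."""
    # 1. Drop exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(events))

    # 2. Parse timestamps and group events per vessel.
    by_vessel = defaultdict(list)
    for vessel, direction, ts in unique:
        by_vessel[vessel].append((datetime.fromisoformat(ts), direction))

    # 3. Flag vessels whose time-ordered events do not alternate IN, OUT, IN, ...
    invalid = []
    for vessel, rows in by_vessel.items():
        rows.sort()
        directions = [d for _, d in rows]
        expected = ["IN", "OUT"] * (len(rows) // 2 + 1)
        if directions != expected[:len(rows)]:
            invalid.append(vessel)
    return unique, sorted(invalid)


# Made-up port events: vessel A has a duplicated OUT row, vessel B has two INs in a row.
raw_events = [
    ("A", "IN", "2023-03-01T08:00:00"),
    ("A", "OUT", "2023-03-01T12:00:00"),
    ("A", "OUT", "2023-03-01T12:00:00"),
    ("B", "IN", "2023-03-01T09:00:00"),
    ("B", "IN", "2023-03-02T09:00:00"),
]
cleaned, broken = clean_events(raw_events)
```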
Data Modeling:
• Designed a star schema in Power BI, separating fact tables (shipping operations, port activities) from dimension tables (vessel, port, date, weekday).
• Created key DAX measures, including unloading/loading efficiency, average port turnaround time (PT), and gate exit rate.
Visualization & Analytics:
• Sailing Level Report: monitored unloading/loading ratios and time consumption for individual voyages.
• Port Level Report: aggregated efficiency at the port level, with time-series comparisons and trend analysis.
• Trend Charts: visualized efficiency changes over time, supporting anomaly detection and KPI monitoring.
• Delivered interactive slicers (by vessel, port, date) to support flexible, ad-hoc analysis by operations teams.
Deployment & Optimization:
• Deployed to Power BI Service with scheduled auto-refresh (twice daily).
• Reduced data refresh time by ~40% through SQL pre-aggregation and Power Query optimization.
• Built a usage monitoring report to track user adoption and continuously improve functionality.
Impact:
• System has been running reliably for over 2 years, becoming the department's core operational tool.
• Significantly improved operational visibility and decision-making efficiency compared to fragmented legacy reports.
• Enabled effective monitoring of loading/unloading efficiency and gate processes, uncovering multiple issues that led to follow-up optimization projects.

Python Power BI SQL Data & Semantic Modeling Data Pipeline Data Visualization Data Analysis

Master's Thesis – Economic and Environmental Impacts of Dry Ports and Triangulation Transport on the Empty Container Repositioning Problem

Chalmers University, Sweden 2022.01 - 2022.08

This thesis focused on the Empty Container Repositioning (ECR) problem within Sweden's inland container transport network, assessing the economic and environmental impacts of introducing dry ports and adopting triangulation transport strategies. Using the case of Gothenburg Port and Eskilstuna Dry Port, the study developed an agent-based discrete-event simulation model in AnyLogic to compare multiple scenarios with and without dry ports and different repositioning strategies.

Key Contributions & Technical Details

Research Design & Data Collection
• Conducted a systematic literature review covering container logistics, ECR strategies, and intermodal transport.
• Collected case data from importers/exporters and calibrated it against historical transport records.
• Defined key variables including container flows, transport costs, and CO₂ emission factors (train vs. truck).
Simulation Modeling
• Built four scenarios in AnyLogic:
   ◦ With Dry Port: Introduces a dry port as an inland consolidation node to relieve seaport congestion and optimize ECR.
   ◦ Without Dry Port: Baseline case where all containers move directly between the seaport and customers without inland terminals.
   ◦ With Dry Port + Triangulation: Uses triangulation strategies under the dry port model, assigning import containers directly to export shipments to reduce empty repositioning.
   ◦ With Dry Port + Street-turn: Applies a street-turn strategy under the dry port model, where import containers are reused immediately by exporters, minimizing storage and repositioning needs.
• Combined Agent-Based Modeling (ABM) for network actors (shipping lines, ports, customers) with Discrete-Event Simulation (DES) for facility-level operations.
• Incorporated stochastic parameters (e.g., demand fluctuations, transport time variability) to improve realism.
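The idea shared by the two reuse strategies, matching an emptied import container with an export need instead of sending it back empty, can be shown with a toy count of empty trips. This is a simplified accounting sketch under stated assumptions (one inland region, every container interchangeable), not the AnyLogic model, and it does not distinguish triangulation from street-turn.

```python
def empty_trips(imports, exports, reuse=False):
    """Count empty container trips for one inland region (toy model)."""
    if not reuse:
        # Every import container returns empty, and every export
        # needs an empty container delivered.
        return imports + exports
    matched = min(imports, exports)
    # Each matched pair replaces two empty trips (return + delivery)
    # with a single empty move from importer to exporter.
    return imports + exports - matched


baseline = empty_trips(100, 60)
with_reuse = empty_trips(100, 60, reuse=True)
```

With 100 imports and 60 exports, the count drops from 160 to 100 empty trips; the gain is capped by the smaller of the two flows, so an imbalanced region benefits less.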
Analysis & Results
• Introducing a dry port reduced inland transport costs by ~62–66% and CO₂ emissions by ~71–79%.
• Adding triangulation strategies provided further reductions (~25–27% in costs, ~7–10% in emissions) and significantly decreased the share of empty container movements.
• The street-turn strategy also produced benefits but was less effective than triangulation.
Impact
• Demonstrated the strategic value of dry ports in lowering inland transport costs and emissions.
• Provided quantitative insights for sustainable intermodal transport planning in Sweden.
• Advanced simulation methodology by combining ABM and DES with stochastic variables, improving both realism and applicability in logistics research.

Python Java Logistics Network Modeling Agent-based Discrete-event Simulation (AnyLogic) Academic Research & Writing

Data Scientist/Engineer | IT Professional | Supply Chain Analytics

Email

LinkedIn

Phone

WeChat

Technical Skills

Data Analytics & Data Science

Data Engineering & Development

Visualization & BI

Professional Experience

Data Scientist

Stena Line

Research Assistant (Data Analyst)

Chalmers University of Technology

Supply Chain Planner (Intern)

Midea Property Group

Education

Master of Science

Supply Chain Management

Erasmus Exchange

Operations Management and Logistics

Bachelor of Engineering

Industrial Engineering

Featured Projects

MagicMenu - AI Powered Smart Dining Assistant

Evaluating the Impact of a Pricing System on Revenue Using a Two-Way Fixed Effects DID Model

Estimated Ready for Pick-up Time (Estimated RPT) Optimization

Full-Stack Serverless Photography Portfolio Website

Power BI-Based Port Operations Monitoring System

Master's Thesis – Economic and Environmental Impacts of Dry Ports and Triangulation Transport on the Empty Container Repositioning Problem

Interests & Hobbies

Traveling

Photography

Hiking & Nature

Reading & Learning