Employer Case Study

Sports ML: evaluation before ego

A compact example of how I build model-backed decision systems: reusable Python primitives, an evaluation dashboard, and reporting that surfaces calibration, rolling accuracy, and closing line value (CLV) behavior instead of hiding behind one flattering number.

Ian Alloway | ML Engineer / Data Scientist | ian@allowayllc.com | ianalloway.xyz | March 2026

Problem

Sports models are easy to oversell with a headline accuracy number. That is rarely enough for a serious team. If the model is going to influence decisions, you need to know whether it is calibrated, whether the behavior is stable over time, and whether the evaluation language resembles the way real forecasting and risk work is reviewed.
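The checks named above can be made concrete with very little code. The following is a minimal sketch, not the dashboard's actual implementation: the function names, the five-bin layout, and the three-game window are all illustrative assumptions. It shows a Brier score, a simple calibration table, and a trailing-window accuracy series computed from forecast probabilities and 0/1 outcomes.

```python
from collections import deque


def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, n_bins=5):
    """Bucket forecasts into equal-width bins (hypothetical layout) and
    compare the mean forecast in each bin with the observed win rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than report a fake rate
        mean_forecast = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append((i, len(members), mean_forecast, observed_rate))
    return rows


def rolling_accuracy(probs, outcomes, window=3):
    """Accuracy of the 'p >= 0.5' pick over a trailing window of games."""
    recent = deque(maxlen=window)
    series = []
    for p, y in zip(probs, outcomes):
        recent.append(int((p >= 0.5) == bool(y)))
        series.append(sum(recent) / len(recent))
    return series


# Made-up forecasts and results, for illustration only.
probs = [0.90, 0.85, 0.70, 0.65, 0.30, 0.25]
outcomes = [1, 1, 0, 1, 0, 0]

print(brier_score(probs, outcomes))
for row in calibration_table(probs, outcomes):
    print(row)
print(rolling_accuracy(probs, outcomes))
```

A well-calibrated model has mean forecast and observed rate close together in every bin, and a rolling series that does not sag over time. A headline accuracy number shows neither.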

My working bias: if a dashboard makes the model look amazing but hides uncertainty, drift, or market context, it is probably a sales page, not an evaluation layer.

Outcome

This project gives employers a fast way to see how I think. I did not just train a model. I built the layer around it so another person can inspect quality, reason about tradeoffs, and decide whether the signal is real enough to trust.

1 dashboard that prioritizes model honesty
3 public repos tied into one coherent workflow
1 clear hiring narrative: evaluation-first ML systems

Approach

  • Primitives library: nba-ratings / nba-edge wraps Elo updates, logistic win probability, implied probability math, and Kelly helpers into reusable, test-backed Python primitives.
  • Monitoring layer: odds-drift-watch tracks meaningful sportsbook movement via FastAPI and webhooks for research and decision-support use cases.
  • Evaluation UI: nba-clv-dashboard surfaces calibration, rolling accuracy, Brier score, and CLV-oriented reporting, and is built so real backtest output can be swapped in without rewriting the interface.
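For readers unfamiliar with the primitives listed above, here is a minimal sketch of the underlying math. It does not reproduce the API of nba-ratings or nba-edge: every function name, the K-factor of 20, and the 400-point Elo scale are assumptions chosen for illustration. CLV is expressed here as the change in raw implied probability between bet time and close, which is one of several conventions.

```python
def elo_expected(rating_a, rating_b, scale=400.0):
    """Logistic win probability for team A under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, a_won, k=20.0):
    """Zero-sum Elo update; k=20 is an illustrative choice."""
    delta = k * ((1.0 if a_won else 0.0) - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def implied_probability(american_odds):
    """Raw implied probability from American odds (vig not removed)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)


def kelly_fraction(p_win, decimal_odds):
    """Full-Kelly stake as a fraction of bankroll, floored at zero."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    b = decimal_odds - 1.0  # net payout per unit staked
    return max((p_win * b - (1.0 - p_win)) / b, 0.0)


def closing_line_value(bet_odds, closing_odds):
    """Closing implied probability minus bet-time implied probability.
    Positive means the market moved toward the side that was bet."""
    return implied_probability(closing_odds) - implied_probability(bet_odds)


# Made-up inputs, for illustration only.
print(elo_expected(1600, 1500))
print(elo_update(1500, 1500, a_won=True))
print(implied_probability(-110))
print(kelly_fraction(0.55, 2.0))
print(closing_line_value(-110, -120))
```

Keeping these as small pure functions is what makes them easy to cover with pytest and reuse across the monitoring and evaluation layers.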

Why It Matters

The strongest signal here is not “sports betting.” It is that I know how to build evaluation-first analytics in a noisy domain where leakage, overconfidence, and vanity metrics are common. The same thinking translates well to forecasting, experimentation, fraud, risk, and decision-support products.

Python, FastAPI, Chart.js, pytest, ruff, GitHub Actions