Predict-Then-Bet: ML-Based Probability Forecasting
and Fractional Kelly Optimization
Emily Zhang, Summer Sun, Bill Li, Yiping Gao
May 4, 2025
1 Abstract
In the high-stakes world of soccer betting, predictive accuracy alone does not guarantee
profitability. We develop a machine learning–based betting strategy that combines probabilistic
outcome prediction with capital allocation via the Kelly criterion. To avoid replicating
bookmaker biases, we compare a decorrelation-based loss with a calibration-aware approach
using Classwise Expected Calibration Error (ECE). XGBoost is selected for its balance of
calibration performance and computational efficiency. Although models achieve over 70%
accuracy in static evaluation, betting simulations yield limited profitability—likely due to
bookmaker margins and lack of temporal adaptation. Odds normalization improves returns,
revealing the structural disadvantage of betting against market-implied prices. Our findings
highlight the importance of aligning model calibration with financial objectives and motivate
future work on Bayesian Kelly optimization, time-aware modeling, and portfolio-level betting
strategies.
2 Introduction
Sports betting, particularly soccer betting, is a multi-billion-dollar global industry, with soccer
alone projected to generate $53 billion in gross gaming revenue from $570 billion in total wagers
in 2024—accounting for over 56% of the global sports betting market. In a standard 1X2 format,
bettors stake money on one of three outcomes: home win, draw, or away win. Payouts are
determined by multiplying the stake by the odds, but if the prediction is wrong, the full stake is
lost.
Despite its simplicity, the system is designed to favor bookmakers, who embed profit margins
into the odds. As a result, casual bettors rarely earn sustained profits unless they can identify
inefficiencies in the market. Recent advancements in machine learning offer promising tools to
do so. By training models on historical match data, we can generate probabilistic forecasts that
are more objective than human judgment. Still, models that closely track bookmaker odds tend to
be unprofitable, as they reflect the same embedded margins.
To overcome this, we propose a two-stage framework: first, a modified loss function is used to
decorrelate model predictions from bookmaker odds; then, a fractional Kelly optimization
strategy allocates capital in a risk-aware manner. This integrated approach aims to maximize
long-term profit growth while mitigating risk in a highly uncertain environment.
3 Related Work
As mentioned above, we adopt a modified loss function and a fractional Kelly optimization
strategy to optimize long-term profit and manage risk. In the following section, we therefore
explore, explain, and integrate related research in machine learning and in Kelly optimization
strategies and their variations.
3.1 Machine Learning
Hubáček, Šourek, and Železný (2019) propose a model that penalizes the betting algorithm when
it stays close to the bookmakers' strategies and win-rate-oriented probabilities. The resulting
betting algorithm is therefore less recognizable to, and less exploitable by, bookmakers.
Moreover, to maximize profit without relying on the bookmakers' odds, it is important to train
the machine learning models on past matches so that they can recognize and exploit high-odds
matches. In this way, bettors and the algorithm can sustain profit even as bookmakers adjust
their odds (Hubáček, Šourek, & Železný, 2019).
3.2 Kelly Betting Strategy
The research paper “Kelly System for Investing and Kuhn-Tucker Conditions” provides a
fundamental understanding of the Kelly System, its variations, and methods of adapting it to the
sports betting market. The Kelly System is a well-known investment and gaming strategy
developed by John Kelly Jr. at Bell Labs. The system maximizes the expected geometric growth
of assets by controlling two aspects: sizing each individual bet relative to the total bankroll
and managing missed opportunities (Boucherie, n.d.).
However, the Kelly System is challenged in the contemporary online sports betting scene. The
system relies on compound growth over a long period of time, whereas online betting companies
adjust their offerings quickly. The betting game has also become extremely fast-paced, with many
variables, diverse betting markets, and tight latency budgets. Hence, we want a closed-form,
fast, multipurpose method that performs the optimization to increase income. The paper
introduces the Karush–Kuhn–Tucker conditions to modify the Kelly betting strategy so that the
investment algorithm can adapt to the market (Boucherie, n.d.).
The goal of the Kelly System is to maximize the growth rate of invested capital. The strategy
is to invest a fixed proportion of the bankroll in every investment or betting cycle, from which
one can expect long-term growth. However, if the bankroll is significant and the strategy is
fixed, betting companies will quickly adjust against it. There are also many more betting
markets one could take advantage of: in e-sports, for example, one can bet on kills per game or
on a kill before a time threshold, while traditional soccer betting offers live odds, the
location of a goal, and shots before a threshold. Returning to the paper's notation, $\alpha$
is the fraction of the current bankroll wagered in each cycle, and $\beta$ is the fraction one
preserves (Boucherie, n.d.).
The accumulated investment is $V_m$, the bettor's wealth after $m$ Kelly betting cycles. The
$R$ in the paper's equation is the factor determining the win or loss of each cycle
(Boucherie, n.d., p. 5).
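For reference, the single-bet dynamics behind this setup can be written out explicitly; the
following is the textbook Kelly result in our own notation (decimal odds $o$, win probability
$p$, win indicator $R_i \in \{0,1\}$), not an excerpt from the addendum:

$$V_m = V_0 \prod_{i=1}^{m} \big(1 - \alpha + \alpha\, o\, R_i\big), \qquad g(\alpha) = p \log\big(1 + (o-1)\alpha\big) + (1-p)\log(1-\alpha),$$

which is maximized at the familiar Kelly fraction $\alpha^{*} = \dfrac{p\,o - 1}{o - 1}$.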
To address the aforementioned limitations of the Kelly model, the paper suggests applying the
Karush–Kuhn–Tucker (KKT) conditions, transforming the Kelly model from a betting strategy into
an optimization model. Under the KKT conditions, the bettor maximizes over each $\alpha$ in the
Kelly formulation so that the long-run investment is optimized. The paper explains this
transformation by first casting the Kelly model as a constrained optimization problem
(Boucherie, n.d., p. 6), then rewriting that problem with the Karush–Kuhn–Tucker conditions,
ultimately turning the Kelly strategy into an algorithm (Boucherie, n.d.).
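For concreteness, the resulting optimization problem and its KKT system can be sketched as
follows; this is our notation, anticipating the three-outcome setting of Section 5, rather than
the addendum's exact derivation:

$$\max_{\alpha}\ \sum_{i} p_i \log W_i(\alpha) \quad \text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_{i} \alpha_i \le 1, \qquad W_i(\alpha) = 1 - \sum_{j} \alpha_j + \alpha_i o_i .$$

With multipliers $\lambda \ge 0$ for the budget and $\mu_i \ge 0$ for nonnegativity, the KKT
conditions require stationarity and complementary slackness:

$$\frac{\partial}{\partial \alpha_i} \sum_{k} p_k \log W_k(\alpha) - \lambda + \mu_i = 0, \qquad \lambda\Big(1 - \sum_i \alpha_i\Big) = 0, \qquad \mu_i\,\alpha_i = 0 .$$

Solving this system is what turns the Kelly strategy into an executable algorithm.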
4 Machine Learning for Match Outcome Prediction
4.1 Data Features
To predict soccer match outcomes, we extract features from the EPL dataset that capture team
performance and match context, explicitly excluding bookmaker odds to avoid market influence.
For each match, we combine features from the home and away teams, based on their prior
matches within the same season—a common window reflecting relevant form.
Team-level features include Elo ratings, win rates, goals scored/conceded, and goal differences
(Eastwood, 2025). Optional player-level features aggregate stats like key passes or tackles but
require reliable lineup data. Recent form is captured via rolling averages over the last 3–5
matches, helping models adapt to short-term momentum (Constantinou & Fenton, 2013).
Additional statistics include expected goals (xG), clean sheets, and goal variance (Mead et al.,
2023).
Contextual factors—such as home/away status, rest days, and match timing—account for fatigue
and scheduling effects (Lago-Peñas et al., 2011; Pollard, 2008). Team investment metrics, like
market value or salary, serve as proxies for long-term strength (He, 2019).
All features are numeric and combined into a feature vector $x$, used as input to the predictive
function $f(x)$. While odds can be informative, we exclude them to ensure predictions reflect
independent signals (Hubáček et al., 2019). In Section 6, we compare models trained with and
without odds to evaluate their impact.
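To illustrate the rolling-form features described above, the following is a minimal sketch,
assuming a hypothetical chronologically ordered table with one row per (team, match) and columns
team, date, goals_for, and goals_against (names ours, not our production pipeline):

```python
import pandas as pd

def add_rolling_form(df: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Attach per-team rolling averages computed over each team's prior matches."""
    df = df.sort_values("date").copy()
    grouped = df.groupby("team")
    for col in ("goals_for", "goals_against"):
        # shift(1) ensures each row only sees matches strictly before it (no leakage)
        df[f"{col}_form{window}"] = grouped[col].transform(
            lambda s: s.shift(1).rolling(window, min_periods=1).mean()
        )
    return df
```

The shift before the rolling mean is the important detail: it keeps each feature strictly
pre-match, so the model never sees the outcome it is asked to predict.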
4.2 Loss Function Selection
4.2.1 Decorrelation Strategy
High accuracy alone doesn't ensure profitability if model predictions align too closely with
bookmaker odds. To counter this, we adopt the decorrelation strategy from Hubáček et al.
(2019), which modifies the loss function to penalize similarity to bookmaker-implied
probabilities, encouraging market-independent predictions.
The penalty term takes the form

$$L_{\text{dec}} = \big(\hat{p}_w - \tfrac{1}{o_w}\big)^{2},$$

where $o_w$ is the offered odds for the winning outcome $w$ and $\hat{p}_w$ is the model's
predicted probability for it. The combined loss is

$$L = L_{\text{CE}} - c\, L_{\text{dec}},$$

with $L_{\text{CE}}$ the standard cross-entropy loss.
The constant $c$ controls the strength of the decorrelation effect. A higher $c$ increases the
model's independence from bookmaker predictions, potentially allowing it to spot mispriced
opportunities, even at the cost of slightly lower accuracy. We set $c = 0.4$ here as a good
trade-off between deviation from the odds and accuracy.
We revise this approach by replacing $1/o_w$ with the normalized implied probability derived
from all outcomes in the match:

$$\tilde{p}_i = \frac{1/o_i}{\sum_{j} 1/o_j},$$

so that the penalty becomes $L_{\text{dec}} = (\hat{p}_w - \tilde{p}_w)^{2}$. This ensures that
the penalty term compares our predicted probability against a proper probability distribution
over outcomes, improving consistency with probability-based calibration and allowing fairer
evaluation across matches with varying margin sizes.
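A minimal sketch of this normalization (decimal odds assumed; function name ours):

```python
import numpy as np

def normalized_implied_probs(odds):
    """Map one match's decimal odds to a proper probability distribution.

    Dividing each raw implied probability 1/o_i by their sum strips the
    bookmaker margin, so the result sums to exactly 1.
    """
    raw = 1.0 / np.asarray(odds, dtype=float)
    return raw / raw.sum()

# Example: odds (2.10, 3.40, 3.90) carry a margin of about 2.7%
print(normalized_implied_probs([2.10, 3.40, 3.90]))
```

The same helper is reused in Section 6.2.2, where we remove the margin before re-running the
betting pipeline.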
This strategy is particularly useful for spotting market inefficiencies. As shown by Hubáček et al.
(2019), reducing correlation with bookmaker odds increases the likelihood of identifying upsets
and undervalued outcomes. We adapt their loss formulation for our three-class setting.
4.2.2 Classwise-ECE
In sports betting, well-calibrated probability estimates are often more valuable than raw
classification accuracy, as wagering relies on both correct predictions and appropriate
confidence. Following Walsh and Joshi (2023), we adopt a calibration-aware model selection
strategy to improve downstream betting performance.
To measure calibration, we use Expected Calibration Error (ECE) and its multiclass variant,
Classwise-ECE. Unlike traditional ECE, which may be biased under class imbalance,
Classwise-ECE evaluates calibration per class and averages the results, offering a more robust
assessment of probabilistic reliability.
$$\text{cwECE} = \frac{1}{C} \sum_{c=1}^{C} \sum_{j=1}^{J} \frac{|B_{j,c}|}{N}\, \big|\, \text{acc}(B_{j,c}) - \text{conf}(B_{j,c})\, \big|,$$

where:
$C$ is the number of outcome classes,
$B_{j,c}$ denotes the set of predictions for class $c$ falling into bin $j$,
$|B_{j,c}|$ is the number of samples in bin $j$ for class $c$,
$N$ is the total number of samples,
$\text{acc}(B_{j,c})$ is the empirical accuracy in the bin, and
$\text{conf}(B_{j,c})$ is the average predicted confidence for class $c$ in that bin.
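A minimal sketch of this computation with equal-width bins (names ours), for an $N \times C$
array of predicted probabilities and integer labels:

```python
import numpy as np

def classwise_ece(probs, labels, n_bins=10):
    """Classwise-ECE for probs of shape (N, C) and integer labels of shape (N,)."""
    n, c = probs.shape
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for k in range(c):
        conf = probs[:, k]                  # predicted confidence for class k
        hits = (labels == k).astype(float)  # empirical indicator for class k
        for j in range(n_bins):
            lo, hi = edges[j], edges[j + 1]
            in_bin = (conf >= lo if j == 0 else conf > lo) & (conf <= hi)
            if in_bin.any():
                # bin mass |B|/N times the accuracy-confidence gap
                total += in_bin.mean() * abs(hits[in_bin].mean() - conf[in_bin].mean())
    return total / c
```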
We evaluate models on a held-out validation set using Classwise-ECE and select the
best-calibrated model for downstream betting. No post-hoc calibration (e.g., Platt or temperature
scaling) is applied; calibration quality stems from the model's architecture, loss, and
regularization.
This aligns with our goal of maximizing expected return rather than accuracy alone. Even
accurate models can underperform if poorly calibrated, especially when overconfident in
low-probability outcomes. Prioritizing calibration improves stake sizing and risk management by
ensuring predicted probabilities better reflect true outcome frequencies.
4.2.3 Selection Criterion
In practice, we evaluate both the decorrelation-augmented loss and the calibration-aware
selection approach based on their effectiveness in supporting profitable betting decisions. While
decorrelation encourages market divergence and value discovery, calibration improves the
reliability of predicted probabilities for stake sizing. We compare both strategies using expected
return and positive-EV hit rate, and retain the one with superior profitability for downstream
testing.
5 Optimization
5.1 Optimization Function
To optimize the Kelly criterion under real-world betting constraints, we formulated a constrained
nonlinear optimization problem. The objective was to maximize the expected logarithmic wealth
by determining optimal bet allocations (alpha values) across the three possible match outcomes:
home, draw, and away. These allocations had to satisfy two key constraints: each alpha must lie
between 0 and 1, and the total sum of the alphas must be less than or equal to 1, reflecting a
realistic betting budget.
The specific function we optimized was

$$\max_{\alpha}\; f(\alpha) = \sum_{i=1}^{3} p_i \log\Big(1 - \sum_{j=1}^{3} \alpha_j + \alpha_i o_i\Big),$$

where $p_i$ is the predicted probability of outcome $i$, $o_i$ is the bookmaker's odds, and
$\alpha_i$ is the fraction of total capital allocated to that outcome. This formulation aligns
with the classic Kelly criterion, originally proposed by Kelly (1956), and ensures that capital
is allocated in a way that maximizes long-term expected log-wealth, while directly accounting
for both model confidence and odds-implied edge.
Our approach differs from the generalized multi-outcome Kelly formulation often used in
portfolio theory, which includes an extra term to account for the probability that none of the
chosen bets pays off. That formulation is

$$f(\alpha) = \sum_{i} p_i \log\Big(1 - \sum_{j} \alpha_j + \alpha_i o_i\Big) + \Big(1 - \sum_{i} p_i\Big) \log\Big(1 - \sum_{j} \alpha_j\Big).$$
This extended version appears in the literature on multi-asset portfolio betting (Smoczynski &
Tomkins, 2010), and is relevant when the probability distribution is incomplete or when there's a
non-zero chance that none of the bets succeed. In financial or multi-asset settings, that additional
term models scenarios where none of the outcomes happen (i.e., all bets lose), and it prevents
overbetting when probabilities are uncertain or don't sum to 1. However, in our case—betting on
soccer match outcomes—exactly one outcome always occurs, and our predicted probabilities are
explicitly modeled to sum to 1. This makes the second term vanish and allows us to simplify the
optimization while preserving the theoretical foundation. Omitting the extra term reduces
computational complexity and makes interpretation easier, without sacrificing correctness.
Whelan (2023) also supports this simplification in betting contexts where the events are mutually
exclusive and exhaustive.
5.2 Optimization Solver
We used the SLSQP (Sequential Least Squares Programming) algorithm due to its strength in
handling nonlinear objectives with inequality constraints (Press et al., 2007). This makes it
well-suited for our problem, which involves both individual bounds and a total budget limit.
Unlike grid search or unconstrained methods, SLSQP guarantees feasibility at each step while
leveraging gradient-based updates for fast, accurate convergence.
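The following is a minimal sketch of this solve (function names ours), wiring the Section 5.1
objective into scipy.optimize with per-outcome bounds and the budget constraint:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_wealth(alpha, p, o):
    """Negative expected log-wealth for allocations alpha over the 3 outcomes."""
    keep = 1.0 - alpha.sum()          # un-staked fraction of the budget
    wealth = keep + alpha * o         # wealth factor under each outcome
    return -np.sum(p * np.log(np.maximum(wealth, 1e-12)))

def kelly_allocations(p, o):
    """Solve the constrained Kelly problem with SLSQP."""
    budget = [{"type": "ineq", "fun": lambda a: 1.0 - a.sum()}]  # sum(alpha) <= 1
    res = minimize(neg_log_wealth, x0=np.full(3, 0.1),
                   args=(np.asarray(p, float), np.asarray(o, float)),
                   method="SLSQP", bounds=[(0.0, 1.0)] * 3, constraints=budget)
    return res.x

# Example: a mildly mispriced home side
print(kelly_allocations([0.50, 0.25, 0.25], [2.20, 3.40, 3.60]))
```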
Figure 1. Kelly Optimization Convergence (Match 0)
Figure 1 above shows the optimization progress for one example match (Match 0). The vertical
axis represents the negative log-utility, which is the loss that the optimizer is minimizing. The
loss decreases in steps, showing that the optimizer initially explored a flat region before finding
better bet allocations in later iterations. After about 10 steps, the changes become small,
indicating that the optimizer is fine-tuning the solution. This curve demonstrates that the Kelly
optimization is both stable and efficient for the problem structure we designed, and confirms that
the optimizer quickly converges to a well-behaved solution under real-world constraints.
6 Results and Discussion
6.1 ML Model Selection
For model selection, we evaluate four classifiers suited for probabilistic prediction: LightGBM,
XGBoost, Gradient Boosting, and Logistic Regression. Results on a structured three-class dataset
(Home Win, Draw, Away Win) show that LightGBM achieves the highest accuracy at 78.6%,
followed by XGBoost at 75.6%. Both tree-based models perform well on home and away
outcomes, while draw prediction remains challenging due to class imbalance. Gradient Boosting
and Logistic Regression perform less competitively, with accuracies of 67.2% and 64.4%,
respectively.
Figure 2. Confusion Matrices and Accuracy Scores for Four Classification Models
These results may overstate real-world performance, as they are based on clean historical data
without noise or missing features typical in live settings. While LightGBM achieves the highest
accuracy, we select XGBoost for downstream tasks due to its faster training speed and
scalability. XGBoost also supports flexible objective customization, making it more suitable for
implementing decorrelation-aware and Classwise-ECE–regularized losses in large-scale
experiments.
6.1.2 Accuracy and Decorrelation Analysis
Table 1. Model Calibration Comparison and Accuracy Impact of Decorrelation Loss
We then analyze model accuracy under decorrelation constraints, aiming to balance prediction
deviation from bookmaker odds with the Kelly criterion’s sensitivity to probability errors. The
table above summarizes each model’s Classwise Expected Calibration Error (ECE) and the
corresponding accuracy change when trained with decorrelation loss. ECE reflects how well
predicted probabilities match observed outcomes, while the accuracy drop quantifies the
trade-off from reducing correlation with bookmaker pricing.
XGBoost achieves the lowest ECE at 3.05% but shows a −2.0% accuracy decrease under
decorrelation, indicating moderate sensitivity. LightGBM follows with an ECE of 3.12% and a
−1.9% drop, demonstrating both strong calibration and robustness. Gradient Boosting shows a
higher ECE of 3.47% with a smaller −1.5% decline, while Logistic Regression records the
highest ECE at 3.61% and the largest drop (−2.2%), likely due to limited adaptability.
These results highlight the advantage of tree-based ensemble methods—particularly XGBoost
and LightGBM—for producing well-calibrated, resilient predictions suitable for risk-aware
betting optimization.
Table 2. Accuracy change by prediction correlation under Classwise-ECE regularization and
decorrelation loss settings
To evaluate the impact of loss design, we analyze model accuracy under varying correlations
with both true outcomes and bookmaker-implied probabilities. As shown in the table, increasing
correlation with the true distribution—from 0.85 to 0.95—consistently improves accuracy from
74.1% to 76.9% under Classwise-ECE regularization. In contrast, changes in correlation with
bookmaker probabilities have minimal effect, with accuracy shifts under 0.03 percentage points.
Across all settings, models trained with Classwise-ECE consistently outperform those using
decorrelation loss, confirming that calibrating to ground-truth outcomes yields better predictive
performance than merely diverging from market odds. Based on this, we adopt the
Classwise-ECE–regularized model for downstream betting simulations, balancing predictive
strength with calibration fidelity.
6.2 Betting Optimization Focus
6.2.1 Kelly vs. Greedy: Strategy Comparison under Bookmaker Odds
We conducted a comparative evaluation between the Kelly strategy and a greedy baseline across
600 historical soccer matches. Each strategy committed a fixed budget of $1000 per match. The
Kelly strategy allocated the budget fractionally across the three outcomes—home, draw, and
away—based on the optimized alpha values generated from our constrained Kelly optimization.
In contrast, the greedy strategy placed the full $1000 on the outcome with the highest predicted
probability, without considering odds inefficiencies or diversification.
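A minimal sketch of this comparison loop (structure ours; kelly_allocations refers to the
Section 5.2 sketch):

```python
import numpy as np

def simulate(matches, strategy, budget=1000.0):
    """Cumulative profit over records of (model probs, decimal odds, winner index)."""
    profits = []
    for p, o, winner in matches:
        alpha = strategy(np.asarray(p), np.asarray(o))  # stake fractions of budget
        payout = budget * alpha[winner] * o[winner]     # only the winning bet pays
        profits.append(payout - budget * alpha.sum())   # net result for this match
    return np.cumsum(profits)

def greedy(p, o):
    """Stake the full budget on the most probable outcome, ignoring the odds."""
    return np.eye(len(p))[np.argmax(p)]
```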
Figure 3. Cumulative Returns of Kelly and Greedy Strategies
As shown in Figure 3, the cumulative return curve shows that the Kelly strategy consistently
outperformed the greedy approach in terms of long-term profitability. Although both strategies
invested the same total capital across all matches, Kelly achieved a higher overall return. Its
return curve remains close to the total amount invested, indicating that it minimizes capital drag
and achieves high efficiency. The greedy strategy, while occasionally outperforming in isolated
segments, underperforms cumulatively due to its inability to adapt position sizing based on risk
and model confidence.
Figure 4. Per-Match Return Comparison: Kelly vs. Greedy Strategies
To analyze performance in more detail, we examined per-match return volatility. The Kelly
strategy, while exhibiting higher variance, occasionally produces large spikes in returns. These
occur when small allocations to high-odds outcomes—guided by strong model predictions—lead
to significant payouts. Unlike overbetting, these gains result from budget-conscious scaling
based on expected edge. In contrast, the greedy strategy places a fixed $1000 on a single
outcome every time, leading to modest wins or full losses. This rigid approach lacks the
flexibility to adjust for uncertainty, making it especially vulnerable when model predictions are
only marginally different or overconfident.
The Kelly strategy’s strength lies in its adaptability. It increases investment when confidence is
high and conserves capital when signals are weak, providing better risk control and capital
efficiency. This dynamic sizing allows it to exploit market inefficiencies while reducing exposure
to error. Meanwhile, the greedy strategy amplifies mistakes by committing fully regardless of
edge strength. However, despite outperforming in risk-adjusted terms, the Kelly strategy still
fails to generate sustained profits across the full dataset—suggesting that bookmaker margins
present a structurally unfavorable environment for long-term gains.
6.2.2 Removing Bookmaker Bias: Odds Normalization and Profitability Recovery
Bookmakers embed a built-in profit margin by slightly underpaying on all outcomes. This
margin inflates the total implied probability above 1, meaning that even if a bettor had perfect
foresight, the expected return could still be negative unless the odds are adjusted. To investigate
whether this margin was a key driver of underperformance, we normalized the odds to remove
the margin before re-running the full Kelly optimization and evaluation pipeline. The
normalization method is documented in Section 4.2.1 (Decorrelation Strategy).
Figure 5. Cumulative Returns of Kelly and Greedy Strategies (Normalized Odds)
After applying this normalization, the revised cumulative return curve shows meaningful
improvement, as shown in Figure 5. The Kelly strategy now surpasses both the greedy benchmark
and the total money invested line, indicating positive profitability under fair odds. This strongly
supports the hypothesis that the original stagnation in performance was due to bookmaker
overround, not a failure of the Kelly framework or the prediction model. When evaluated in a
level playing field, the Kelly strategy demonstrates its theoretical advantage—compounding
statistically favorable bets into net profit.
7 Limitations and Future Work
7.1 Probability Prediction Focus
Despite strong offline accuracy and calibration, our strategy failed to deliver steady profits,
exposing structural flaws. A key issue is that Classwise ECE, while central to model selection,
is non-differentiable; tree-based models like LightGBM and XGBoost therefore rely on post-hoc
checks. Embedding differentiable surrogates, such as soft binning or Maximum Mean Calibration
Error, into neural architectures could enable true end-to-end calibration.
A second limitation is the absence of dynamic updating. The current pipeline evaluates on a
fixed snapshot, ignoring the continual flow of new matches that shapes real betting markets.
Incorporating rolling or walk-forward testing, together with time-sensitive features (e.g.,
exponential-decay weights, momentum embeddings), would better capture evolving team form
and yield more realistic performance estimates (Constantinou & Fenton, 2013; Groll et al., 2019).
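One simple way to realize such walk-forward testing is sketched below (window sizes
illustrative; 380 matches corresponds to one EPL season):

```python
def walk_forward_splits(n_matches, train_size=380, test_size=38):
    """Yield chronological (train, test) index ranges: fit on the past,
    evaluate on the next block of matches, then roll the window forward."""
    start = 0
    while start + train_size + test_size <= n_matches:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size
```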
Finally, the disconnect between classification accuracy and return suggests that our loss
functions and metrics remain value-agnostic. Overconfidence, calibration drift, and the lack of
profit-weighted objectives can all erode expected gains. Future work should couple probability
forecasts with return optimization through betting-specific and profit-aware objectives.
7.2 Betting Optimization Focus
The Kelly strategy is highly sensitive to errors in predicted probabilities, especially near 0 or 1.
Its logarithmic utility function amplifies the impact of overestimations, often resulting in
disproportionate bets and potential losses. This risk is well-known in finance and gambling,
where even well-calibrated models can perform poorly under full Kelly allocation (MacLean,
Thorp, & Ziemba, 2010). To mitigate this, many apply fractional Kelly betting—e.g., 50% of the
recommended stake—to reduce volatility and guard against estimation noise. Although we did
not explore this approach, future work could evaluate fractional strategies to assess the trade-off
between growth and stability.
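Concretely, the fractional variant simply scales the full-Kelly solution:

$$\alpha^{\text{frac}}_i = \lambda\, \alpha^{*}_i, \qquad \lambda \in (0, 1],$$

where $\lambda = 0.5$ corresponds to the half-Kelly staking mentioned above; smaller $\lambda$
trades expected growth for lower variance and more tolerance to errors in the predicted
probabilities.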
Second, our approach treats predicted probabilities as fixed point estimates, ignoring uncertainty
from model variance or misspecification. In reality, these probabilities stem from machine
learning models and are inherently uncertain. A more robust alternative is the Bayesian Kelly
criterion, which treats predictions as random variables and adjusts bet sizing based on posterior
distributions (Browne & Whitt, 1996). This probabilistic method offers greater resilience in
noisy or data-scarce settings, reducing the risk of overbetting and potentially improving
long-term performance.
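One simple Monte Carlo approximation in this spirit is sketched below. It is a heuristic under
a Dirichlet posterior assumption, not the full Browne and Whitt (1996) treatment; kelly_fn
stands for a solver such as the Section 5.2 sketch:

```python
import numpy as np

def bayesian_kelly(p_hat, o, kelly_fn, concentration=200.0, n_draws=500, seed=0):
    """Average Kelly allocations over posterior draws around the point estimate.

    The model's probabilities are treated as the mean of a Dirichlet posterior
    whose concentration encodes trust in them; averaging per-draw allocations
    shrinks stakes when the posterior is diffuse.
    """
    rng = np.random.default_rng(seed)
    draws = rng.dirichlet(concentration * np.asarray(p_hat), size=n_draws)
    return np.mean([kelly_fn(p, o) for p in draws], axis=0)
```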
Third, our model uses single-match optimization, treating each bet independently. While suitable
for evaluating individual outcomes, this approach overlooks capital allocation across multiple
concurrent bets—a more realistic scenario for active bettors. Portfolio-based extensions of the
Kelly criterion (Bell & Cover, 1988) show that diversifying across correlated bets can improve
capital growth and reduce drawdown. Integrating portfolio-level optimization would better
reflect real-world strategies and enable more effective budget management across full game
slates.
8 Teammate Contributions
8.1 Yiping Gao
I was responsible for designing and implementing the optimization component of our project,
which applied the Kelly criterion to allocate bets under realistic constraints. I formulated the task
as a constrained nonlinear optimization problem to maximize expected logarithmic wealth,
ensuring each allocation remained between 0 and 1 and the total budget was not exceeded. I
implemented the solution using the SLSQP algorithm from scipy.optimize for its ability to
handle nonlinear objectives with inequality constraints. To analyze optimizer performance, I
developed visualizations to track convergence and loss reduction across iterations. I also
documented the optimization design and process in detail, covering both the mathematical
formulation and the rationale for algorithm choices.
Beyond implementation, I simplified the generalized Kelly formula by removing the “no
outcome wins” term, as our setting guarantees exactly one outcome per match with probabilities
summing to one. I evaluated the strategy’s effectiveness through cumulative and per-match
return plots and compared it to a greedy baseline strategy. After identifying margin bias in
bookmaker odds, I normalized the odds using proportional adjustment and re-ran the pipeline to
assess impact. I also documented the experimental results and visualizations for the optimization
section. Finally, I wrote the limitations and future work section for the optimization portion of the
project, focusing on Kelly sensitivity to probability errors, the potential benefits of Bayesian
extensions, and the need to extend beyond single-match optimization to portfolio-level betting
strategies.
8.2 Emily Zhang
I led the machine learning component of the project, focusing on match outcome prediction,
probability calibration, and custom loss function design. I implemented and evaluated multiple
models—including Logistic Regression, Gradient Boosting, LightGBM, and XGBoost—using
classification accuracy and Classwise Expected Calibration Error (ECE) as primary evaluation
metrics. These metrics guided model selection, helping us identify the most well-calibrated
variant for downstream betting.
To better understand model behavior, I conducted a correlation analysis between predicted
probabilities, actual match outcomes, and bookmaker-implied odds. This analysis informed the
implementation of two loss strategies: Classwise-ECE–based calibration and a
decorrelation-augmented loss function that penalizes alignment with bookmaker pricing. While
the decorrelation approach encouraged divergence from market odds, the ECE-regularized model
proved more stable and was ultimately chosen for final evaluation.
I also worked on feature engineering, integrating team performance metrics and recent match
outcomes. Although we initially used a simple categorical encoding of recent form, I identified
the potential benefits of incorporating time-sensitive techniques such as decay-weighted
tracking. In addition, I contributed to the written sections on model calibration, prediction
performance, and system limitations—especially the misalignment between static accuracy and
economic performance in real-world betting.
8.3 Bill Li
I conducted research and gathered information for the presentation and the research paper. In
particular, I worked on the related-work portion of the paper and on the explanation of the
Kelly Model.
I researched and analyzed the Kelly Model, the Bayesian Kelly criterion, and the Kuhn-Tucker
conditions to discover ways to apply them to soccer betting, and accordingly wrote the related
research portion of the paper.
I designed a Kelly Model based on the information in the research papers and applied our
dataset to the model to discover whether it is profitable.
8.4 Summer Sun
I contributed to both the machine-learning pipeline and the analytical evaluation of the full
prediction–betting framework. I first designed and implemented the complete data preprocessing
pipeline, including cleaning historical match data, normalizing bookmaker odds across data
sources, resolving team-name inconsistencies, and structuring feature matrices for model
training. This ensured consistent inputs across all models and enabled reliable repeated
experimentation.
Building on this foundation, I collaborated with Emily on the development and assessment of our
probability-calibration framework. I implemented Classwise Expected Calibration Error (ECE)
as a model-selection criterion, wrote helper functions for computing per-class calibration curves,
and evaluated multiple loss-design variants, including a decorrelation-augmented loss penalizing
alignment with bookmaker-implied probabilities. These experiments clarified how calibration
quality affects downstream betting performance and informed the final selection of the
ECE-regularized model.
In addition to calibration analysis, I conducted diagnostic studies on bookmaker margin bias. I
generated plots comparing predicted probabilities, implied probabilities under varying margin
sizes, and empirical outcomes, helping the team evaluate whether odds normalization produced
more stable behavior under Kelly optimization.
Finally, I integrated the preprocessing, calibration, and optimization components into a unified
end-to-end workflow, allowing the team to run large-scale experiments and benchmarking
efficiently.
9 References
Baio, G., & Blangiardo, M. (2010). Bayesian hierarchical model for the prediction of football
results. Journal of Applied Statistics, 37(2), 253–264.
Bell, R. M., & Cover, T. M. (1988). Game-theoretic optimal portfolios. Management Science,
34(6), 724–733. https://doi.org/10.1287/mnsc.34.6.724
Browne, S., & Whitt, W. (1996). Portfolio choice and the Bayesian Kelly criterion. Advances in
Applied Probability, 28(4), 1145–1176. https://doi.org/10.2307/1428183
Boucherie, R. J. (n.d.). Addendum to 5.8.4: Kelly System for Investing and Kuhn-Tucker
Conditions. University of Twente.
https://www.utwente.nl/en/eemcs/sor/boucherie/Operations%20Research/584operationsresearchk
ellybetting.pdf
Constantinou, A. C., & Fenton, N. E. (2013). Determining the level of ability of football teams
by dynamic ratings based on the relative discrepancies in scores between adversaries. Journal of
Quantitative Analysis in Sports, 9(1), 37–50.
Eastwood, M. (2025, April 14). Pi Ratings: The smarter way to rank football teams. Pi Ratings.
Groll, A., Schauberger, G., & Tutz, G. (2019). Prediction of major international soccer
tournaments based on team-specific regularized Poisson regression. Statistical Modelling, 19(1),
55–77.
He, Q. (2019). Match performance variables influential in selection into the English national
football team (B.S. thesis). School of Biological Sciences, Nanyang Technological University,
Singapore.
Hubáček, O., Šourek, G., & Železný, F. (2019). Exploiting sports-betting market using machine
learning. International Journal of Forecasting, 35(2), 712–723.
Kelly, J. L. (1956). A new interpretation of information rate. Bell System Technical Journal,
35(4), 917–926. https://doi.org/10.1002/j.1538-7305.1956.tb03809.x
Lago-Peñas, C., Lago-Ballesteros, J., & Rey, E. (2011). The influence of a congested calendar on
performance in elite soccer. Journal of Strength and Conditioning Research, 25(8), 2111–2117.
MacLean, L. C., Thorp, E. O., & Ziemba, W. T. (2010). Long-term capital growth: The good and
bad properties of the Kelly and fractional Kelly capital growth criteria. Quantitative Finance,
10(7), 681–687. https://doi.org/10.1080/14697680903162379
Mead, J., O’Hare, A., & McMenemy, P. (2023). Expected goals in football: Improving model
performance and demonstrating value. PLOS ONE, 18(4), e0282295.
https://doi.org/10.1371/journal.pone.0282295
Pollard, R. (2008). Home advantage in football: A current review of an unsolved puzzle. The
Open Sports Sciences Journal, 1(1), 12–14.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (2007). Numerical recipes:
The art of scientific computing (3rd ed.). Cambridge University Press.
Rico-González, M., Ortega, J. P., & Clemente, F. (2023). Machine learning application in soccer:
A systematic review. Biology of Sport – Open, 40(1), 249–263.
Smoczynski, P., & Tomkins, D. (2010). An explicit solution to the problem of optimizing the
allocations of a bettor's wealth when wagering on horse races. Mathematical Scientist, 35(1),
10–17.
Walsh, C., & Joshi, A. (2023). Machine learning for sports betting: Should model selection be
based on accuracy or calibration? arXiv preprint. https://arxiv.org/abs/2303.06021
Whelan, K. (2023). Fortune’s Formula or the Road to Ruin? The Generalized Kelly Criterion
with Multiple Outcomes [Working paper]. SSRN. https://doi.org/10.2139/ssrn.4382822