Predict-Then-Bet: ML-Based Probability Forecasting
and Fractional Kelly Optimization
Emily Zhang, Summer Sun, Bill Li, Yiping Gao
May 4, 2025
1 Abstract
In the high-stakes world of soccer betting, predictive accuracy alone does not guarantee
profitability. We develop a machine learning–based betting strategy that combines probabilistic
outcome prediction with capital allocation via the Kelly criterion. To avoid replicating
bookmaker biases, we compare a decorrelation-based loss with a calibration-aware approach
using Classwise Expected Calibration Error (ECE). XGBoost is selected for its balance of
calibration performance and computational efficiency. Although models achieve over 70%
accuracy in static evaluation, betting simulations yield limited profitability—likely due to
bookmaker margins and lack of temporal adaptation. Odds normalization improves returns,
revealing the structural disadvantage of betting against market-implied prices. Our findings
highlight the importance of aligning model calibration with financial objectives and motivate
future work on Bayesian Kelly optimization, time-aware modeling, and portfolio-level betting
strategies.
2 Introduction
Sports betting, particularly soccer betting, is a multi-billion-dollar global industry, with soccer
alone projected to generate $53 billion in gross gaming revenue from $570 billion in total wagers
in 2024—accounting for over 56% of the global sports betting market. In a standard 1X2 format,
bettors stake money on one of three outcomes: home win, draw, or away win. Payouts are
determined by multiplying the stake by the odds, but if the prediction is wrong, the full stake is
lost.
Despite its simplicity, the system is designed to favor bookmakers, who embed profit margins
into the odds. As a result, casual bettors rarely earn sustained profits unless they can identify
inefficiencies in the market. Recent advancements in machine learning offer promising tools to
do so. By training models on historical match data, we can generate probabilistic forecasts that
are more objective than human judgment. Still, models that closely track bookmaker odds tend to
be unprofitable, as they reflect the same embedded margins.
To overcome this, we propose a two-stage framework: first, a modified loss function is used to
decorrelate model predictions from bookmaker odds; then, a fractional Kelly optimization
strategy allocates capital in a risk-aware manner. This integrated approach aims to maximize
long-term profit growth while mitigating risk in a highly uncertain environment.
3 Related Work
As mentioned above, we adopt a modified loss function and a fractional Kelly optimization
strategy to optimize long-term profit and manage risk. In the following section, we therefore
explore, explain, and integrate related research in machine learning and in Kelly optimization
strategies and their variations.
3.1 Machine Learning
Hubáček, Šourek, and Železný (2019) propose a model that penalizes the betting algorithm when
it stays close to the bookmakers' strategies and win-rate-oriented probabilities. The resulting
betting algorithm is therefore less recognizable to, and less exploitable by, bookmakers.
Moreover, to maximize profit without relying on the bookmakers' odds, it is important to train
the machine learning models on past matches so that they can recognize and exploit high-odds
matches. In this way, bettors and the algorithm can sustain profit even as bookmakers adjust
their odds (Hubáček, Šourek, & Železný, 2019).
3.2 Kelly Betting Strategy
The research paper “Kelly System for Investing and Kuhn-Tucker Conditions” provides a
fundamental understanding of the Kelly System, its variations, and methods of adapting it to the
sports betting market. The Kelly System is a well-known investment and gaming strategy
developed by John Kelly Jr. at Bell Labs. The system maximizes the expected geometric growth
of assets by controlling two aspects: sizing each individual bet relative to the total bankroll
and managing missed opportunities (Boucherie, n.d.).
However, the Kelly System is challenged in the contemporary online sports betting scene. The
system relies on compound growth over a long period of time, whereas online betting companies
adjust their offerings quickly. The betting game has also become extremely fast-paced, with many
variables, diverse betting markets, and tight latency budgets. Hence, we want a closed-form,
fast, multipurpose method that performs the optimization to increase income. The paper
introduces the Karush–Kuhn–Tucker conditions to modify the Kelly betting strategy so that the
investment algorithm can adapt to the market (Boucherie, n.d.).
The goal of the Kelly System is to maximize the growth rate of invested capital. The strategy
is to invest a fixed proportion of the bankroll in every investment or betting cycle, from which
one can expect long-term growth. However, if the bankroll is significant and the strategy is
fixed, betting companies will quickly adjust against it. There are also many more betting
markets one could take advantage of: in e-sports, for example, one can bet on kills per game or
on a kill before a time threshold, while traditional soccer betting offers live odds, the
location of a goal, and shots before a threshold. Returning to the paper's notation, $\alpha$
is the fraction of the current bankroll wagered in each cycle, and $\beta$ is the fraction one
preserves (Boucherie, n.d.).
The accumulated investment is $V_m$, the bettor's wealth after $m$ Kelly betting cycles. The
$R$ in the paper's equation is the factor determining the win or loss of each cycle
(Boucherie, n.d., p. 5).
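For reference, the single-bet dynamics behind this setup can be written out explicitly; the
following is the textbook Kelly result in our own notation (decimal odds $o$, win probability
$p$, win indicator $R_i \in \{0,1\}$), not an excerpt from the addendum:

$$V_m = V_0 \prod_{i=1}^{m} \big(1 - \alpha + \alpha\, o\, R_i\big), \qquad g(\alpha) = p \log\big(1 + (o-1)\alpha\big) + (1-p)\log(1-\alpha),$$

which is maximized at the familiar Kelly fraction $\alpha^{*} = \dfrac{p\,o - 1}{o - 1}$.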
To address the aforementioned limitations of the Kelly model, the paper suggests applying the
Karush–Kuhn–Tucker (KKT) conditions, transforming the Kelly model from a betting strategy into
an optimization model. Under the KKT conditions, the bettor maximizes over each $\alpha$ in the
Kelly formulation so that the long-run investment is optimized. The paper explains this
transformation by first casting the Kelly model as a constrained optimization problem
(Boucherie, n.d., p. 6), then rewriting that problem with the Karush–Kuhn–Tucker conditions,
ultimately turning the Kelly strategy into an algorithm (Boucherie, n.d.).
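For concreteness, the resulting optimization problem and its KKT system can be sketched as
follows; this is our notation, anticipating the three-outcome setting of Section 5, rather than
the addendum's exact derivation:

$$\max_{\alpha}\ \sum_{i} p_i \log W_i(\alpha) \quad \text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_{i} \alpha_i \le 1, \qquad W_i(\alpha) = 1 - \sum_{j} \alpha_j + \alpha_i o_i .$$

With multipliers $\lambda \ge 0$ for the budget and $\mu_i \ge 0$ for nonnegativity, the KKT
conditions require stationarity and complementary slackness:

$$\frac{\partial}{\partial \alpha_i} \sum_{k} p_k \log W_k(\alpha) - \lambda + \mu_i = 0, \qquad \lambda\Big(1 - \sum_i \alpha_i\Big) = 0, \qquad \mu_i\,\alpha_i = 0 .$$

Solving this system is what turns the Kelly strategy into an executable algorithm.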
4 Machine Learning for Match Outcome Prediction
4.1 Data Features
To predict soccer match outcomes, we extract features from the EPL dataset that capture team
performance and match context, explicitly excluding bookmaker odds to avoid market influence.
For each match, we combine features from the home and away teams, based on their prior
matches within the same season—a common window reflecting relevant form.
Team-level features include Elo ratings, win rates, goals scored/conceded, and goal differences
(Eastwood, 2025). Optional player-level features aggregate stats like key passes or tackles but
require reliable lineup data. Recent form is captured via rolling averages over the last 3–5
matches, helping models adapt to short-term momentum (Constantinou & Fenton, 2013).
Additional statistics include expected goals (xG), clean sheets, and goal variance (Mead et al.,
2023).
Contextual factors—such as home/away status, rest days, and match timing—account for fatigue
and scheduling effects (Lago-Peñas et al., 2011; Pollard, 2008). Team investment metrics, like
market value or salary, serve as proxies for long-term strength (He, 2019).
All features are numeric and combined into a feature vector $x$, used as input to the predictive
function $f(x)$. While odds can be informative, we exclude them to ensure predictions reflect
independent signals (Hubáček et al., 2019). In Section 6, we compare models trained with and
without odds to evaluate their impact.
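To illustrate the rolling-form features described above, the following is a minimal sketch,
assuming a hypothetical chronologically ordered table with one row per (team, match) and columns
team, date, goals_for, and goals_against (names ours, not our production pipeline):

```python
import pandas as pd

def add_rolling_form(df: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Attach per-team rolling averages computed over each team's prior matches."""
    df = df.sort_values("date").copy()
    grouped = df.groupby("team")
    for col in ("goals_for", "goals_against"):
        # shift(1) ensures each row only sees matches strictly before it (no leakage)
        df[f"{col}_form{window}"] = grouped[col].transform(
            lambda s: s.shift(1).rolling(window, min_periods=1).mean()
        )
    return df
```

The shift before the rolling mean is the important detail: it keeps each feature strictly
pre-match, so the model never sees the outcome it is asked to predict.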
4.2 Loss Function Selection
4.2.1 Decorrelation Strategy
High accuracy alone doesn't ensure profitability if model predictions align too closely with
bookmaker odds. To counter this, we adopt the decorrelation strategy from Hubáček et al.
(2019), which modifies the loss function to penalize similarity to bookmaker-implied
probabilities, encouraging market-independent predictions.
The penalty term takes the form

$$L_{\text{dec}} = \big(\hat{p}_w - \tfrac{1}{o_w}\big)^{2},$$

where $o_w$ is the offered odds for the winning outcome $w$ and $\hat{p}_w$ is the model's
predicted probability for it. The combined loss is

$$L = L_{\text{CE}} - c\, L_{\text{dec}},$$

with $L_{\text{CE}}$ the standard cross-entropy loss.
The constant $c$ controls the strength of the decorrelation effect. A higher $c$ increases the
model's independence from bookmaker predictions, potentially allowing it to spot mispriced
opportunities, even at the cost of slightly lower accuracy. We set $c = 0.4$ here as a good
trade-off between deviation from the odds and accuracy.
We revise this approach by replacing $1/o_w$ with the normalized implied probability derived
from all outcomes in the match:

$$\tilde{p}_i = \frac{1/o_i}{\sum_{j} 1/o_j},$$

so that the penalty becomes $L_{\text{dec}} = (\hat{p}_w - \tilde{p}_w)^{2}$. This ensures that
the penalty term compares our predicted probability against a proper probability distribution
over outcomes, improving consistency with probability-based calibration and allowing fairer
evaluation across matches with varying margin sizes.
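A minimal sketch of this normalization (decimal odds assumed; function name ours):

```python
import numpy as np

def normalized_implied_probs(odds):
    """Map one match's decimal odds to a proper probability distribution.

    Dividing each raw implied probability 1/o_i by their sum strips the
    bookmaker margin, so the result sums to exactly 1.
    """
    raw = 1.0 / np.asarray(odds, dtype=float)
    return raw / raw.sum()

# Example: odds (2.10, 3.40, 3.90) carry a margin of about 2.7%
print(normalized_implied_probs([2.10, 3.40, 3.90]))
```

The same helper is reused in Section 6.2.2, where we remove the margin before re-running the
betting pipeline.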
This strategy is particularly useful for spotting market inefficiencies. As shown by Hubáček et al.
(2019), reducing correlation with bookmaker odds increases the likelihood of identifying upsets
and undervalued outcomes. We adapt their loss formulation for our three-class setting.
4.2.2 Classwise-ECE
In sports betting, well-calibrated probability estimates are often more valuable than raw
classification accuracy, as wagering relies on both correct predictions and appropriate
confidence. Following Walsh and Joshi (2023), we adopt a calibration-aware model selection
strategy to improve downstream betting performance.
To measure calibration, we use Expected Calibration Error (ECE) and its multiclass variant,
Classwise-ECE. Unlike traditional ECE, which may be biased under class imbalance,
Classwise-ECE evaluates calibration per class and averages the results, offering a more robust
assessment of probabilistic reliability.
$$\text{cwECE} = \frac{1}{C} \sum_{c=1}^{C} \sum_{j=1}^{J} \frac{|B_{j,c}|}{N}\, \big|\, \text{acc}(B_{j,c}) - \text{conf}(B_{j,c})\, \big|,$$

where:
$C$ is the number of outcome classes,
$B_{j,c}$ denotes the set of predictions for class $c$ falling into bin $j$,
$|B_{j,c}|$ is the number of samples in bin $j$ for class $c$,
$N$ is the total number of samples,
$\text{acc}(B_{j,c})$ is the empirical accuracy in the bin, and
$\text{conf}(B_{j,c})$ is the average predicted confidence for class $c$ in that bin.
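A minimal sketch of this computation with equal-width bins (names ours), for an $N \times C$
array of predicted probabilities and integer labels:

```python
import numpy as np

def classwise_ece(probs, labels, n_bins=10):
    """Classwise-ECE for probs of shape (N, C) and integer labels of shape (N,)."""
    n, c = probs.shape
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for k in range(c):
        conf = probs[:, k]                  # predicted confidence for class k
        hits = (labels == k).astype(float)  # empirical indicator for class k
        for j in range(n_bins):
            lo, hi = edges[j], edges[j + 1]
            in_bin = (conf >= lo if j == 0 else conf > lo) & (conf <= hi)
            if in_bin.any():
                # bin mass |B|/N times the accuracy-confidence gap
                total += in_bin.mean() * abs(hits[in_bin].mean() - conf[in_bin].mean())
    return total / c
```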
We evaluate models on a held-out validation set using Classwise-ECE and select the
best-calibrated model for downstream betting. No post-hoc calibration (e.g., Platt or temperature
scaling) is applied; calibration quality stems from the model's architecture, loss, and
regularization.
This aligns with our goal of maximizing expected return rather than accuracy alone. Even
accurate models can underperform if poorly calibrated, especially when overconfident in
low-probability outcomes. Prioritizing calibration improves stake sizing and risk management by
ensuring predicted probabilities better reflect true outcome frequencies.
4.2.3 Selection Criterion
In practice, we evaluate both the decorrelation-augmented loss and the calibration-aware
selection approach based on their effectiveness in supporting profitable betting decisions. While
decorrelation encourages market divergence and value discovery, calibration improves the
reliability of predicted probabilities for stake sizing. We compare both strategies using expected
return and positive-EV hit rate, and retain the one with superior profitability for downstream
testing.
5 Optimization
5.1 Optimization Function
To optimize the Kelly criterion under real-world betting constraints, we formulated a constrained
nonlinear optimization problem. The objective was to maximize the expected logarithmic wealth
by determining optimal bet allocations (alpha values) across the three possible match outcomes:
home, draw, and away. These allocations had to satisfy two key constraints: each alpha must lie
between 0 and 1, and the total sum of the alphas must be less than or equal to 1, reflecting a
realistic betting budget.
The specific function we optimized was

$$\max_{\alpha}\; f(\alpha) = \sum_{i=1}^{3} p_i \log\Big(1 - \sum_{j=1}^{3} \alpha_j + \alpha_i o_i\Big),$$

where $p_i$ is the predicted probability of outcome $i$, $o_i$ is the bookmaker's odds, and
$\alpha_i$ is the fraction of total capital allocated to that outcome. This formulation aligns
with the classic Kelly criterion, originally proposed by Kelly (1956), and ensures that capital
is allocated in a way that maximizes long-term expected log-wealth, while directly accounting
for both model confidence and odds-implied edge.
Our approach differs from the generalized multi-outcome Kelly formulation often used in
portfolio theory, which includes an extra term to account for the probability that none of the
chosen bets pays off. That formulation is

$$f(\alpha) = \sum_{i} p_i \log\Big(1 - \sum_{j} \alpha_j + \alpha_i o_i\Big) + \Big(1 - \sum_{i} p_i\Big) \log\Big(1 - \sum_{j} \alpha_j\Big).$$
This extended version appears in the literature on multi-asset portfolio betting (Smoczynski &
Tomkins, 2010), and is relevant when the probability distribution is incomplete or when there's a
non-zero chance that none of the bets succeed. In financial or multi-asset settings, that additional
term models scenarios where none of the outcomes happen (i.e., all bets lose), and it prevents
overbetting when probabilities are uncertain or don't sum to 1. However, in our case—betting on
soccer match outcomes—exactly one outcome always occurs, and our predicted probabilities are
explicitly modeled to sum to 1. This makes the second term vanish and allows us to simplify the
optimization while preserving the theoretical foundation. Omitting the extra term reduces
computational complexity and makes interpretation easier, without sacrificing correctness.
Whelan (2023) also supports this simplification in betting contexts where the events are mutually
exclusive and exhaustive.
5.2 Optimization Solver
We used the SLSQP (Sequential Least Squares Programming) algorithm due to its strength in
handling nonlinear objectives with inequality constraints (Press et al., 2007). This makes it
well-suited for our problem, which involves both individual bounds and a total budget limit.
Unlike grid search or unconstrained methods, SLSQP guarantees feasibility at each step while
leveraging gradient-based updates for fast, accurate convergence.
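The following is a minimal sketch of this solve (function names ours), wiring the Section 5.1
objective into scipy.optimize with per-outcome bounds and the budget constraint:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_wealth(alpha, p, o):
    """Negative expected log-wealth for allocations alpha over the 3 outcomes."""
    keep = 1.0 - alpha.sum()          # un-staked fraction of the budget
    wealth = keep + alpha * o         # wealth factor under each outcome
    return -np.sum(p * np.log(np.maximum(wealth, 1e-12)))

def kelly_allocations(p, o):
    """Solve the constrained Kelly problem with SLSQP."""
    budget = [{"type": "ineq", "fun": lambda a: 1.0 - a.sum()}]  # sum(alpha) <= 1
    res = minimize(neg_log_wealth, x0=np.full(3, 0.1),
                   args=(np.asarray(p, float), np.asarray(o, float)),
                   method="SLSQP", bounds=[(0.0, 1.0)] * 3, constraints=budget)
    return res.x

# Example: a mildly mispriced home side
print(kelly_allocations([0.50, 0.25, 0.25], [2.20, 3.40, 3.60]))
```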
Figure 1. Kelly Optimization Convergence (Match 0)
Figure 1 above shows the optimization progress for one example match (Match 0). The vertical
axis represents the negative log-utility, which is the loss that the optimizer is minimizing. The
loss decreases in steps, showing that the optimizer initially explored a flat region before finding
better bet allocations in later iterations. After about 10 steps, the changes become small,
indicating that the optimizer is fine-tuning the solution. This curve demonstrates that the Kelly
optimization is both stable and efficient for the problem structure we designed, and confirms that
the optimizer quickly converges to a well-behaved solution under real-world constraints.
6 Results and Discussion
6.1 ML Model Selection
For model selection, we evaluate four classifiers suited for probabilistic prediction: LightGBM,
XGBoost, Gradient Boosting, and Logistic Regression. Results on a structured three-class dataset
(Home Win, Draw, Away Win) show that LightGBM achieves the highest accuracy at 78.6%,
followed by XGBoost at 75.6%. Both tree-based models perform well on home and away
outcomes, while draw prediction remains challenging due to class imbalance. Gradient Boosting
and Logistic Regression perform less competitively, with accuracies of 67.2% and 64.4%,
respectively.
Figure 2. Confusion Matrices and Accuracy Scores for Four Classification Models
These results may overstate real-world performance, as they are based on clean historical data
without noise or missing features typical in live settings. While LightGBM achieves the highest
accuracy, we select XGBoost for downstream tasks due to its faster training speed and
scalability. XGBoost also supports flexible objective customization, making it more suitable for
implementing decorrelation-aware and Classwise-ECE–regularized losses in large-scale
experiments.
6.1.2 Accuracy and Decorrelation Analysis
Table 1. Model Calibration Comparison and Accuracy Impact of Decorrelation Loss
We then analyze model accuracy under decorrelation constraints, aiming to balance prediction
deviation from bookmaker odds with the Kelly criterion’s sensitivity to probability errors. The
table above summarizes each model’s Classwise Expected Calibration Error (ECE) and the
corresponding accuracy change when trained with decorrelation loss. ECE reflects how well
predicted probabilities match observed outcomes, while the accuracy drop quantifies the
trade-off from reducing correlation with bookmaker pricing.
XGBoost achieves the lowest ECE at 3.05% but shows a −2.0% accuracy decrease under
decorrelation, indicating moderate sensitivity. LightGBM follows with an ECE of 3.12% and a
−1.9% drop, demonstrating both strong calibration and robustness. Gradient Boosting shows a
higher ECE of 3.47% with a smaller −1.5% decline, while Logistic Regression records the
highest ECE at 3.61% and the largest drop (−2.2%), likely due to limited adaptability.
These results highlight the advantage of tree-based ensemble methods—particularly XGBoost
and LightGBM—for producing well-calibrated, resilient predictions suitable for risk-aware
betting optimization.
Table 2. Accuracy change by prediction correlation under Classwise-ECE regularization and
decorrelation loss settings
To evaluate the impact of loss design, we analyze model accuracy under varying correlations
with both true outcomes and bookmaker-implied probabilities. As shown in the table, increasing
correlation with the true distribution—from 0.85 to 0.95—consistently improves accuracy from
74.1% to 76.9% under Classwise-ECE regularization. In contrast, changes in correlation with
bookmaker probabilities have minimal effect, with accuracy shifts under 0.03 percentage points.
Across all settings, models trained with Classwise-ECE consistently outperform those using
decorrelation loss, confirming that calibrating to ground-truth outcomes yields better predictive
performance than merely diverging from market odds. Based on this, we adopt the
Classwise-ECE–regularized model for downstream betting simulations, balancing predictive
strength with calibration fidelity.
6.2 Betting Optimization Focus
6.2.1 Kelly vs. Greedy: Strategy Comparison under Bookmaker Odds
We conducted a comparative evaluation between the Kelly strategy and a greedy baseline across
600 historical soccer matches. Each strategy committed a fixed budget of $1000 per match. The
Kelly strategy allocated the budget fractionally across the three outcomes—home, draw, and
away—based on the optimized alpha values generated from our constrained Kelly optimization.
In contrast, the greedy strategy placed the full $1000 on the outcome with the highest predicted
probability, without considering odds inefficiencies or diversification.
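A minimal sketch of this comparison loop (structure ours; kelly_allocations refers to the
Section 5.2 sketch):

```python
import numpy as np

def simulate(matches, strategy, budget=1000.0):
    """Cumulative profit over records of (model probs, decimal odds, winner index)."""
    profits = []
    for p, o, winner in matches:
        alpha = strategy(np.asarray(p), np.asarray(o))  # stake fractions of budget
        payout = budget * alpha[winner] * o[winner]     # only the winning bet pays
        profits.append(payout - budget * alpha.sum())   # net result for this match
    return np.cumsum(profits)

def greedy(p, o):
    """Stake the full budget on the most probable outcome, ignoring the odds."""
    return np.eye(len(p))[np.argmax(p)]
```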
Figure 3. Cumulative Returns of Kelly and Greedy Strategies
As shown in Figure 3, the cumulative return curve shows that the Kelly strategy consistently
outperformed the greedy approach in terms of long-term profitability. Although both strategies
invested the same total capital across all matches, Kelly achieved a higher overall return. Its
return curve remains close to the total amount invested, indicating that it minimizes capital drag
and achieves high efficiency. The greedy strategy, while occasionally outperforming in isolated
segments, underperforms cumulatively due to its inability to adapt position sizing based on risk
and model confidence.
Figure 4. Per-Match Return Comparison: Kelly vs. Greedy Strategies
To analyze performance in more detail, we examined per-match return volatility. The Kelly
strategy, while exhibiting higher variance, occasionally produces large spikes in returns. These
occur when small allocations to high-odds outcomes—guided by strong model predictions—lead
to significant payouts. Unlike overbetting, these gains result from budget-conscious scaling
based on expected edge. In contrast, the greedy strategy places a fixed $1000 on a single
outcome every time, leading to modest wins or full losses. This rigid approach lacks the
flexibility to adjust for uncertainty, making it especially vulnerable when model predictions are
only marginally different or overconfident.
The Kelly strategy’s strength lies in its adaptability. It increases investment when confidence is
high and conserves capital when signals are weak, providing better risk control and capital
efficiency. This dynamic sizing allows it to exploit market inefficiencies while reducing exposure
to error. Meanwhile, the greedy strategy amplifies mistakes by committing fully regardless of
edge strength. However, despite outperforming in risk-adjusted terms, the Kelly strategy still
fails to generate sustained profits across the full dataset—suggesting that bookmaker margins
present a structurally unfavorable environment for long-term gains.
6.2.2 Removing Bookmaker Bias: Odds Normalization and Profitability Recovery
Bookmakers embed a built-in profit margin by slightly underpaying on all outcomes. This
margin inflates the total implied probability above 1, meaning that even if a bettor had perfect
foresight, the expected return could still be negative unless the odds are adjusted. To investigate
whether this margin was a key driver of underperformance, we normalized the odds to remove
the margin before re-running the full Kelly optimization and evaluation pipeline. The
normalization method is documented in Section 4.2.1 (Decorrelation Strategy).
Figure 5. Cumulative Returns of Kelly and Greedy Strategies (Normalized Odds)
After applying this normalization, the revised cumulative return curve shows meaningful
improvement, as shown in Figure 5. The Kelly strategy now surpasses both the greedy benchmark
and the total money invested line, indicating positive profitability under fair odds. This strongly
supports the hypothesis that the original stagnation in performance was due to bookmaker
overround, not a failure of the Kelly framework or the prediction model. When evaluated in a
level playing field, the Kelly strategy demonstrates its theoretical advantage—compounding
statistically favorable bets into net profit.
7 Limitations and Future Work
7.1 Probability Prediction Focus
Despite strong offline accuracy and calibration, our strategy failed to deliver steady profits,
exposing structural flaws. A key issue is that Classwise ECE, while central to model selection,
is non-differentiable; tree-based models like LightGBM and XGBoost therefore rely on post-hoc
checks. Embedding differentiable surrogates, such as soft binning or Maximum Mean Calibration
Error, into neural architectures could enable true end-to-end calibration.
A second limitation is the absence of dynamic updating. The current pipeline evaluates on a
fixed snapshot, ignoring the continual flow of new matches that shapes real betting markets.
Incorporating rolling or walk-forward testing, together with time-sensitive features (e.g.,
exponential-decay weights, momentum embeddings), would better capture evolving team form
and yield more realistic performance estimates (Constantinou & Fenton, 2013; Groll et al., 2019).
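One simple way to realize such walk-forward testing is sketched below (window sizes
illustrative; 380 matches corresponds to one EPL season):

```python
def walk_forward_splits(n_matches, train_size=380, test_size=38):
    """Yield chronological (train, test) index ranges: fit on the past,
    evaluate on the next block of matches, then roll the window forward."""
    start = 0
    while start + train_size + test_size <= n_matches:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size
```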
Finally, the disconnect between classification accuracy and return suggests that our loss
functions and metrics remain value-agnostic. Overconfidence, calibration drift, and the lack of
profit-weighted objectives can all erode expected gains. Future work should couple probability
forecasts with return optimization through betting-specific and profit-aware objectives.
7.2 Betting Optimization Focus
The Kelly strategy is highly sensitive to errors in predicted probabilities, especially near 0 or 1.
Its logarithmic utility function amplifies the impact of overestimations, often resulting in
disproportionate bets and potential losses. This risk is well-known in finance and gambling,
where even well-calibrated models can perform poorly under full Kelly allocation (MacLean,
Thorp, & Ziemba, 2010). To mitigate this, many apply fractional Kelly betting—e.g., 50% of the
recommended stake—to reduce volatility and guard against estimation noise. Although we did
not explore this approach, future work could evaluate fractional strategies to assess the trade-off
between growth and stability.
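Concretely, the fractional variant simply scales the full-Kelly solution:

$$\alpha^{\text{frac}}_i = \lambda\, \alpha^{*}_i, \qquad \lambda \in (0, 1],$$

where $\lambda = 0.5$ corresponds to the half-Kelly staking mentioned above; smaller $\lambda$
trades expected growth for lower variance and more tolerance to errors in the predicted
probabilities.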
Second, our approach treats predicted probabilities as fixed point estimates, ignoring uncertainty
from model variance or misspecification. In reality, these probabilities stem from machine
learning models and are inherently uncertain. A more robust alternative is the Bayesian Kelly
criterion, which treats predictions as random variables and adjusts bet sizing based on posterior
distributions (Browne & Whitt, 1996). This probabilistic method offers greater resilience in
noisy or data-scarce settings, reducing the risk of overbetting and potentially improving
long-term performance.
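One simple Monte Carlo approximation in this spirit is sketched below. It is a heuristic under
a Dirichlet posterior assumption, not the full Browne and Whitt (1996) treatment; kelly_fn
stands for a solver such as the Section 5.2 sketch:

```python
import numpy as np

def bayesian_kelly(p_hat, o, kelly_fn, concentration=200.0, n_draws=500, seed=0):
    """Average Kelly allocations over posterior draws around the point estimate.

    The model's probabilities are treated as the mean of a Dirichlet posterior
    whose concentration encodes trust in them; averaging per-draw allocations
    shrinks stakes when the posterior is diffuse.
    """
    rng = np.random.default_rng(seed)
    draws = rng.dirichlet(concentration * np.asarray(p_hat), size=n_draws)
    return np.mean([kelly_fn(p, o) for p in draws], axis=0)
```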
Third, our model uses single-match optimization, treating each bet independently. While suitable
for evaluating individual outcomes, this approach overlooks capital allocation across multiple
concurrent bets—a more realistic scenario for active bettors. Portfolio-based extensions of the
Kelly criterion (Bell & Cover, 1988) show that diversifying across correlated bets can improve
capital growth and reduce drawdown. Integrating portfolio-level optimization would better
reflect real-world strategies and enable more effective budget management across full game
slates.
8 Teammate Contributions
8.1 Yiping Gao
I was responsible for designing and implementing the optimization component of our project,
which applied the Kelly criterion to allocate bets under realistic constraints. I formulated the task
as a constrained nonlinear optimization problem to maximize expected logarithmic wealth,
ensuring each allocation remained between 0 and 1 and the total budget was not exceeded. I
implemented the solution using the SLSQP algorithm from scipy.optimize for its ability to
handle nonlinear objectives with inequality constraints. To analyze optimizer performance, I
developed visualizations to track convergence and loss reduction across iterations. I also
documented the optimization design and process in detail, covering both the mathematical
formulation and the rationale for algorithm choices.
Beyond implementation, I simplified the generalized Kelly formula by removing the “no
outcome wins” term, as our setting guarantees exactly one outcome per match with probabilities
summing to one. I evaluated the strategy’s effectiveness through cumulative and per-match
return plots and compared it to a greedy baseline strategy. After identifying margin bias in
bookmaker odds, I normalized the odds using proportional adjustment and re-ran the pipeline to
assess impact. I also documented the experimental results and visualizations for the optimization
section. Finally, I wrote the limitations and future work section for the optimization portion of the
project, focusing on Kelly sensitivity to probability errors, the potential benefits of Bayesian
extensions, and the need to extend beyond single-match optimization to portfolio-level betting
strategies.
8.2 Emily Zhang
I led the machine learning component of the project, focusing on match outcome prediction,
probability calibration, and custom loss function design. I implemented and evaluated multiple
models—including Logistic Regression, Gradient Boosting, LightGBM, and XGBoost—using
classification accuracy and Classwise Expected Calibration Error (ECE) as primary evaluation
metrics. These metrics guided model selection, helping us identify the most well-calibrated
variant for downstream betting.
To better understand model behavior, I conducted a correlation analysis between predicted
probabilities, actual match outcomes, and bookmaker-implied odds. This analysis informed the
implementation of two loss strategies: Classwise-ECE–based calibration and a
decorrelation-augmented loss function that penalizes alignment with bookmaker pricing. While
the decorrelation approach encouraged divergence from market odds, the ECE-regularized model
proved more stable and was ultimately chosen for final evaluation.
I also worked on feature engineering, integrating team performance metrics and recent match
outcomes. Although we initially used a simple categorical encoding of recent form, I identified
the potential benefits of incorporating time-sensitive techniques such as decay-weighted
tracking. In addition, I contributed to the written sections on model calibration, prediction
performance, and system limitations—especially the misalignment between static accuracy and
economic performance in real-world betting.
8.3 Bill Li
I conducted research and gathered information for the presentation and the research paper. In
particular, I worked on the related-work portion of the paper and on the explanation of the
Kelly Model.
I researched and analyzed the Kelly Model, the Bayesian Kelly criterion, and the Kuhn-Tucker
conditions to discover ways to apply them to soccer betting, and accordingly wrote the related
research portion of the paper.
I designed a Kelly Model based on the information in the research papers and applied our
dataset to the model to discover whether it is profitable.
8.4 Summer Sun
I contributed to both the machine-learning pipeline and the analytical evaluation of the full
prediction–betting framework. I first designed and implemented the complete data preprocessing
pipeline, including cleaning historical match data, normalizing bookmaker odds across data
sources, resolving team-name inconsistencies, and structuring feature matrices for model
training. This ensured consistent inputs across all models and enabled reliable repeated
experimentation.
Building on this foundation, I collaborated with Emily on the development and assessment of our
probability-calibration framework. I implemented Classwise Expected Calibration Error (ECE)
as a model-selection criterion, wrote helper functions for computing per-class calibration curves,
and evaluated multiple loss-design variants, including a decorrelation-augmented loss penalizing
alignment with bookmaker-implied probabilities. These experiments clarified how calibration
quality affects downstream betting performance and informed the final selection of the
ECE-regularized model.
In addition to calibration analysis, I conducted diagnostic studies on bookmaker margin bias. I
generated plots comparing predicted probabilities, implied probabilities under varying margin
sizes, and empirical outcomes, helping the team evaluate whether odds normalization produced
more stable behavior under Kelly optimization.
Finally, I integrated the preprocessing, calibration, and optimization components into a unified
end-to-end workflow, allowing the team to run large-scale experiments and benchmarking
efficiently.
9 References
Baio, G., & Blangiardo, M. (2010). Bayesian hierarchical model for the prediction of football
results. Journal of Applied Statistics, 37(2), 253–264.
Bell, R. M., & Cover, T. M. (1988). Game-theoretic optimal portfolios. Management Science,
34(6), 724–733. https://doi.org/10.1287/mnsc.34.6.724
Browne, S., & Whitt, W. (1996). Portfolio choice and the Bayesian Kelly criterion. Advances in
Applied Probability, 28(4), 1145–1176. https://doi.org/10.2307/1428183
Boucherie, R. J. (n.d.). Addendum to 5.8.4: Kelly System for Investing and Kuhn-Tucker
Conditions. University of Twente.
https://www.utwente.nl/en/eemcs/sor/boucherie/Operations%20Research/584operationsresearchk
ellybetting.pdf
Constantinou, A. C., & Fenton, N. E. (2013). Determining the level of ability of football teams
by dynamic ratings based on the relative discrepancies in scores between adversaries. Journal of
Quantitative Analysis in Sports, 9(1), 37–50.
Eastwood, M. (2025, April 14). Pi Ratings: The smarter way to rank football teams. Pi Ratings.
Groll, A., Schauberger, G., & Tutz, G. (2019). Prediction of major international soccer
tournaments based on team-specific regularized Poisson regression. Statistical Modelling, 19(1),
55–77.
He, Q. (2019). Match performance variables influential in selection into the English national
football team (B.S. thesis). School of Biological Sciences, Nanyang Technological University,
Singapore.
Hubáček, O., Šourek, G., & Železný, F. (2019). Exploiting sports-betting market using machine
learning. International Journal of Forecasting, 35(2), 712–723.
Kelly, J. L. (1956). A new interpretation of information rate. Bell System Technical Journal,
35(4), 917–926. https://doi.org/10.1002/j.1538-7305.1956.tb03809.x
Lago-Peñas, C., Lago-Ballesteros, J., & Rey, E. (2011). The influence of a congested calendar on
performance in elite soccer. Journal of Strength and Conditioning Research, 25(8), 2111–2117.
MacLean, L. C., Thorp, E. O., & Ziemba, W. T. (2010). Long-term capital growth: The good and
bad properties of the Kelly and fractional Kelly capital growth criteria. Quantitative Finance,
10(7), 681–687. https://doi.org/10.1080/14697680903162379
Mead, J., O’Hare, A., & McMenemy, P. (2023). Expected goals in football: Improving model
performance and demonstrating value. PLOS ONE, 18(4), e0282295.
https://doi.org/10.1371/journal.pone.0282295
Pollard, R. (2008). Home advantage in football: A current review of an unsolved puzzle. The
Open Sports Sciences Journal, 1(1), 12–14.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (2007). Numerical recipes:
The art of scientific computing (3rd ed.). Cambridge University Press.
Rico-González, M., Ortega, J. P., & Clemente, F. (2023). Machine learning application in soccer:
A systematic review. Biology of Sport – Open, 40(1), 249–263.
Smoczynski, P., & Tomkins, D. (2010). An explicit solution to the problem of optimizing the
allocations of a bettor's wealth when wagering on horse races. Mathematical Scientist, 35(1),
10–17.
Walsh, C., & Joshi, A. (2023). Machine learning for sports betting: Should model selection be
based on accuracy or calibration? arXiv preprint. https://arxiv.org/abs/2303.06021
Whelan, K. (2023). Fortune’s Formula or the Road to Ruin? The Generalized Kelly Criterion
with Multiple Outcomes [Working paper]. SSRN. https://doi.org/10.2139/ssrn.4382822