Many of you may have seen the movie (or read the book) Moneyball. The story charts Oakland A’s general manager Billy Beane’s attempt to put together a baseball club on a budget by employing computer-generated analysis, getting the most out of what were considered to be very average players.
Can we transfer the idea of computer-generated analysis to our betting decisions?
There are a few hundred betting websites, and many of them will have thousands of clients. Everyone wants to be a winner and so people will tend to subscribe to the tipsters that they perceive to be the best.
The media industry has also caught on to the idea that people want to “know more, enjoy more, and share more” in terms of their betting experience, and bookmakers have been advised that their target audience is open to what I describe as “bet stimulation”. The volume of the online in-running betting market has increased dramatically since a company called Bettorlogic started to supply data to the industry.
Here is an example for the Norwich v Wigan game.
As the data flashes up during a game it stimulates our brains to “trigger” a bet as we become confident that we have information that will give us an edge.
Bookmakers have gone even further by having actors appear at half-time advising you to consider backing the 1-3 scoreline in a game at 14/1. In one particular game, with Real Madrid 1-0 up away at HT, I checked the probability of them winning 3-1. I discovered that, in the last 4 seasons, Real Madrid and Barcelona had each won 3-1 away only once after leading 1-0 away at HT.
The bookmakers are using data to give them an edge over the punter by providing really poor value “trigger” bets.
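To put a number on "poor value", a fractional price can be converted into the probability it implies and compared with your own estimate. This is only a minimal sketch: the function names and the 4% estimate are illustrative assumptions, not figures from the data above.

```python
from fractions import Fraction


def implied_probability(numerator: int, denominator: int = 1) -> float:
    """Probability implied by fractional odds such as 14/1.

    This ignores the bookmaker's margin, so it is a rough guide only.
    """
    return float(Fraction(denominator, numerator + denominator))


def is_value_bet(own_probability: float, numerator: int, denominator: int = 1) -> bool:
    """A back bet only has value if your own estimate beats the implied probability."""
    return own_probability > implied_probability(numerator, denominator)


# 14/1 implies a chance of 1 in 15 for the 3-1 scoreline.
print(round(implied_probability(14, 1), 4))

# Hypothetical own estimate of 4% for that scoreline (an assumption for illustration).
print(is_value_bet(0.04, 14, 1))
```

If your own estimate for the scoreline sits below the implied probability, the "trigger" bet is one to leave alone.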
The next thing to look at is random behavior. If I put £10 on black and win, and then put £10 on black again, my chance of winning has no relationship with my previous bet. This is in fact how many people bet. A strategy of joining trends, leaving them as soon as a bet loses, and then backing Liverpool because they do not lose on a Friday will lead to long-term losses, unless you are the lucky chap who recently won £180,000 for predicting 4 correct scores with an initial stake of £11.
Bookmakers have an edge, so is it possible for us to have one?
Let us look at Man Utd and their shots on target data at home. Do you think it will be consistent, or inconsistent depending on the type of team they are up against?
As I write I am opening up my database and I can see that this season the numbers read 7, 5, 7, 7, 7, 6, 7, 7, and I welcome you to check who the away teams were. This is certainly not a random event.
In every home game this season Man Utd have had between 5 and 7 shots on target, which is a very narrow range. However, goals to shots-on-target ratios are not consistent and vary from game to game.
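The narrowness of that range is easy to check for yourself. A minimal sketch using the eight home figures quoted above; the `summarise` helper is a hypothetical name, and population standard deviation is just one reasonable choice of spread measure.

```python
from statistics import mean, pstdev

# Man Utd home shots on target this season, as quoted above.
home_sot = [7, 5, 7, 7, 7, 6, 7, 7]


def summarise(values):
    """Return the average, range and spread of a list of match figures."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "spread": pstdev(values),  # population standard deviation
    }


summary = summarise(home_sot)
print(summary)
```

A spread well under one shot per game is what "not a random event" looks like in numbers. Running the same helper on goals per game would show the wider variation mentioned above.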
Patterns
We can, though, look at the shots-on-target data, much like the general manager in Moneyball, to see any patterns forming that will define what we are looking for: expectation of a goal.
Have a look at Fulham v Norwich and we see that there was a cluster of shots on target before the first Fulham goal.
Of course, not all games will fit that cluster pattern.
This is where Liverpool had 9 shots on target but failed to score, which does happen but not very often.
Again, there are teams that dominate the game in terms of initial shots-on-target data and still lose.
A good example is Everton at Reading: Everton dominated the game and lost 2-1.
People find it very easy to change their reading of momentum when watching a game, and this is a great example from the Sky Sports commentary on Swansea v Liverpool:
- Liverpool are now the team pressing for a late goal, but are unable to create a noteworthy chance.
- 89′ - Swansea look the most likely to grab a late winner at the moment but Liverpool can catch a break as they win a throw in inside Swansea’s half.
What can we do?
So what can we do to find an edge to give the bookmakers a run for our hard cash?
- Look at the strength of a shot, for example: 39′ Chelsea win a free-kick in a dangerous position. Luiz takes it but it is a poor effort and straight at Hart with little pace on it.
- A shot like that is not likely to lead to a goal. If you do not have pictures, use a site like EPLIndex’s Stats Centre (a shots-at-goal area graphic is available) to see where the shots are being taken from. I went to an OPTA seminar last week which showed an excellent graphic of where goals are scored from, and the simple message is that not many are scored from outside the box. Just looking at where QPR shoot from shows you instantly why they have struggled to score: they are shooting from too far out. Harry has already stopped that and they were certainly much better in the final third at Wigan.
- Download the FourFourTwo StatsZone app, which I think is free now, and track a game in running. How good is the team that you are watching in the first 30 minutes of a game?
- Download the data from Football Data UK into a spreadsheet to see the profile of a 0-0 in the Premiership in terms of average shots on target. A word of warning: the site uses Press Association data for shots and shots on target, so I manually change it back to an OPTA source like EPLIndex.
- Opening up my spreadsheet, I can see that there have been fourteen 0-0s in the Premiership this season. The average shots on target for the home team is 2.85 and for the away team 3.07, which has been pushed up by Liverpool having 9 shots on target at Swansea and not scoring.
- Look at historical trends such as 1-0 HT, which I discussed in a previous article. If Newcastle have not won away when losing 1-0 at HT since they beat Sunderland 4-1 in 2005, then there was little point in thinking at 1-1 last week at Fulham (1-0 at HT) that they could win the game. If you had also noted the lack of draws when Newcastle have been 1-0 down at HT, then this should have “triggered” a bet, and it would not have been to back Newcastle.
- See if you can predict expectation of goals. Looking at the shots-on-target data for the Norwich game at Swansea, I suspected that Norwich would score very soon, which they did 60 seconds later.
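The 0-0 profile described in the list above does not need a spreadsheet. Here is a minimal sketch that averages shots on target across goalless games in a CSV. The column names (`FTHG`, `FTAG`, `HST`, `AST`) follow the Football Data UK layout as I understand it, so check them against the site's notes file. The sample rows are made up purely for illustration and are not real results.

```python
import csv
import io

# Hypothetical sample rows in the Football Data UK layout (NOT real results).
SAMPLE_CSV = """HomeTeam,AwayTeam,FTHG,FTAG,HST,AST
Team A,Team B,0,0,3,2
Team C,Team D,2,1,7,4
Team E,Team F,0,0,2,4
"""


def goalless_profile(rows):
    """Return (games, average home SOT, average away SOT) for 0-0 games."""
    goalless = [r for r in rows if int(r["FTHG"]) == 0 and int(r["FTAG"]) == 0]
    if not goalless:
        return None
    games = len(goalless)
    home_avg = sum(int(r["HST"]) for r in goalless) / games
    away_avg = sum(int(r["AST"]) for r in goalless) / games
    return games, home_avg, away_avg


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(goalless_profile(rows))
```

To use it on a real season, swap `io.StringIO(SAMPLE_CSV)` for an opened file. Bear in mind the warning above about the shot data source.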
If you do find that you can predict goal expectation, there are two ways to try to profit from it:
a) Back in the goal market.
b) Lay the correct score as time decays.
Say Liverpool, in real time at Swansea, have had 6 shots on target after 80 minutes. You check your spreadsheet and see that the average for a 0-0 is far lower than that, so you trigger a LAY bet on the correct score, i.e. 0-0, in anticipation of the goal. In this case it did not arrive, but there are a greater number of games where the goal will arrive.
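That check can be written down as a simple rule. The sketch below uses the 0-0 averages quoted earlier (2.85 home, 3.07 away). The `margin` threshold and the function name are hypothetical choices for illustration, and the rule deliberately leaves out the price, which still has to be right before any bet is placed.

```python
# Average shots on target in this season's 0-0 games, from the spreadsheet figures above.
AVG_SOT_IN_GOALLESS = {"home": 2.85, "away": 3.07}


def lay_nil_nil_trigger(side, live_sot, score, margin=2.0):
    """Return True when a goalless game looks unlike a typical 0-0.

    `margin` is a hypothetical threshold: how far above the 0-0 average
    a team's shots on target must be before the trigger fires.
    """
    if score != (0, 0):
        return False  # the correct score market has already moved on
    return live_sot >= AVG_SOT_IN_GOALLESS[side] + margin


# Liverpool away at Swansea: 6 shots on target after 80 minutes at 0-0.
print(lay_nil_nil_trigger("away", 6, (0, 0)))
```

Tightening or loosening `margin` trades the number of triggers against how unusual the game has to look.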
Some of you may be aware that more goals are scored in the 90-FT band than in any other time band. By backing a goal, without worrying about who is going to score, you have both teams playing for you as the game nears its end.
The same is true in reverse: if the game is quiet, there is an expectation of a lack of goals. I am not a fan of this idea, though, as I have seen quiet games explode into floods of goals.
You also need to look at accuracy prevention, which is the ability of a team to stop the other team turning shots into shots on target. Stoke are past masters of it.
At Villa last week, Villa had 13 shots and just two shots on target, so the Stoke accuracy prevention ratio was 11/13, a massive 0.85. It is no wonder Villa fired a blank, FT 0-0.
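The ratio is simply the share of the opponent's shots that were kept off target. A minimal sketch, with a hypothetical function name:

```python
def accuracy_prevention(shots_faced: int, shots_on_target_faced: int) -> float:
    """Share of the opponent's shots that did NOT end up on target."""
    if shots_faced <= 0:
        raise ValueError("shots_faced must be positive")
    if not 0 <= shots_on_target_faced <= shots_faced:
        raise ValueError("shots on target must be between 0 and shots faced")
    return (shots_faced - shots_on_target_faced) / shots_faced


# Stoke at Villa: 13 shots faced, 2 of them on target, so 11 of 13 kept off target.
print(round(accuracy_prevention(13, 2), 3))
```

Tracking this per game for each defence would show whether a figure like Stoke's is a one-off or a habit.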
I leave you with a breakdown of the EPL using the time bands 0-30, 31-60 and 61-FT, which you are welcome to change to suit your strategy, and a shots-on-target table.
Conclusion
In conclusion, betting can be random and bookmakers are using techniques to stimulate “random bets”, but I argue that football is predictable, and that if you look hard you may find some interesting patterns and historical trends.
| TEAM | GLS | TSH | SOT | GLS/SOT |
|---|---|---|---|---|
| Arsenal | 26 | 244 | 75 | 0.34 |
| Aston Villa | 12 | 183 | 52 | 0.23 |
| Chelsea | 28 | 231 | 82 | 0.34 |
| Fulham | 27 | 229 | 89 | 0.30 |
| Liverpool | 22 | 294 | 78 | 0.28 |
| Man City | 30 | 284 | 104 | 0.28 |
| Man UTD | 40 | 250 | 93 | 0.43 |
| Newcastle | 18 | 226 | 78 | 0.23 |
| Norwich | 17 | 192 | 62 | 0.27 |
| QPR | 13 | 212 | 68 | 0.19 |
| Reading | 19 | 188 | 59 | 0.32 |
| Southampton | 22 | 218 | 68 | 0.32 |
| Stoke | 14 | 161 | 47 | 0.29 |
| Sunderland | 17 | 175 | 54 | 0.31 |
| Swansea | 26 | 231 | 78 | 0.33 |
| Spurs | 29 | 258 | 92 | 0.31 |
| WBA | 24 | 198 | 67 | 0.35 |
| West Ham | 21 | 205 | 64 | 0.32 |
| Wigan | 17 | 198 | 64 | 0.26 |
| Everton | 27 | 295 | 100 | 0.27 |
Shots on target do not include efforts that hit the woodwork or are blocked.
TABLE KEY
SOT = Shots on Target
GLS = Goals
TSH = Total Shots
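The GLS/SOT column can be reproduced, and the teams ranked, with a few lines of code. This sketch copies the figures from the table above. Reading TSH as total shots is an assumption on my part, and the helper names are hypothetical. Printed values may differ from the table in the second decimal place because of rounding.

```python
# (goals, total shots, shots on target), copied from the table above.
# Reading TSH as "total shots" is an assumption.
TABLE = {
    "Arsenal": (26, 244, 75),
    "Aston Villa": (12, 183, 52),
    "Chelsea": (28, 231, 82),
    "Fulham": (27, 229, 89),
    "Liverpool": (22, 294, 78),
    "Man City": (30, 284, 104),
    "Man UTD": (40, 250, 93),
    "Newcastle": (18, 226, 78),
    "Norwich": (17, 192, 62),
    "QPR": (13, 212, 68),
    "Reading": (19, 188, 59),
    "Southampton": (22, 218, 68),
    "Stoke": (14, 161, 47),
    "Sunderland": (17, 175, 54),
    "Swansea": (26, 231, 78),
    "Spurs": (29, 258, 92),
    "WBA": (24, 198, 67),
    "West Ham": (21, 205, 64),
    "Wigan": (17, 198, 64),
    "Everton": (27, 295, 100),
}


def goals_per_sot(team):
    """Goals scored per shot on target."""
    goals, _, sot = TABLE[team]
    return goals / sot


def shot_accuracy(team):
    """Share of all shots that were on target."""
    _, total_shots, sot = TABLE[team]
    return sot / total_shots


# Rank teams from most to least clinical.
ranked = sorted(TABLE, key=goals_per_sot, reverse=True)
for team in ranked:
    print(f"{team:<12} {goals_per_sot(team):.2f} {shot_accuracy(team):.2f}")
```

Sorting by `shot_accuracy` instead gives a quick view of who is getting their shots on target, regardless of whether they go in.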
| Team / Stats | SOT 0-30 | SOT 31-60 | SOT 61-FT | GLS 0-30 | GLS 31-60 | GLS 61-FT |
|---|---|---|---|---|---|---|
| Arsenal | 20 | 20 | 35 | 6 | 9 | 11 |
| Villa | 14 | 19 | 19 | 4 | 4 | 4 |
| Chelsea | 27 | 27 | 28 | 11 | 8 | 9 |
| Everton | 32 | 35 | 33 | 9 | 10 | 8 |
| Fulham | 22 | 16 | 51 | 6 | 6 | 15 |
| Liverpool | 23 | 24 | 31 | 5 | 8 | 9 |
| Man City | 18 | 47 | 39 | 2 | 9 | 19 |
| Man UTD | 30 | 30 | 33 | 14 | 10 | 16 |
| Newcastle | 14 | 33 | 31 | 4 | 8 | 6 |
| Norwich | 24 | 18 | 20 | 5 | 6 | 6 |
| QPR | 21 | 26 | 21 | 4 | 6 | 3 |
| Reading | 15 | 15 | 28 | 7 | 5 | 7 |
| Southampton | 15 | 29 | 24 | 4 | 8 | 10 |
| Stoke | 16 | 13 | 18 | 5 | 5 | 4 |
| Sunderland | 15 | 15 | 24 | 4 | 6 | 7 |
| Swansea | 21 | 28 | 29 | 5 | 6 | 15 |
| Spurs | 23 | 31 | 38 | 7 | 10 | 12 |
| West Ham | 14 | 26 | 24 | 4 | 9 | 8 |
| West Brom | 14 | 23 | 30 | 4 | 10 | 10 |
| Wigan | 20 | 23 | 21 | 5 | 6 | 6 |
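The time-band table lends itself to two quick calculations: how much of a team's scoring comes in the 61-FT band, and how conversion varies by band. A minimal sketch using four rows copied from the table above; the function names are hypothetical.

```python
BANDS = ("0-30", "31-60", "61-FT")

# team: (shots on target per band, goals per band), copied from the table above.
TIME_BANDS = {
    "Fulham": ((22, 16, 51), (6, 6, 15)),
    "Man City": ((18, 47, 39), (2, 9, 19)),
    "Man UTD": ((30, 30, 33), (14, 10, 16)),
    "Stoke": ((16, 13, 18), (5, 5, 4)),
}


def late_goal_share(team):
    """Fraction of a team's goals scored in the 61-FT band."""
    _, goals = TIME_BANDS[team]
    return goals[2] / sum(goals)


def band_conversion(team):
    """Goals per shot on target in each time band."""
    sot, goals = TIME_BANDS[team]
    return {band: g / s for band, s, g in zip(BANDS, sot, goals)}


for team in TIME_BANDS:
    print(team, round(late_goal_share(team), 2), band_conversion(team))
```

Changing `BANDS` and the tuples to your own time bands is all that is needed to fit a different strategy.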
Images taken from the excellent FourFourTwo StatsZone app.
All of the stats from this article have been taken from the Opta Stats Centre at EPLIndex.com.
Very impressive article.
Thanks Debbie.
Man UTD v Sunderland
Man UTD had 5 shots on target and I predicted 5.79.
Random or explained? Gave me a 2-1 scoreline and Stoke 1-1.
You will get games like Liverpool that are way off the prediction.
Man United v Sunderland
0.34*5.79 v 0.40*2.86 = 1.96 + 1.14 = 3.10
Man UTD are so consistent on the shot on target front so no surprise. Simply says Sunderland will be restricted to 2/3 shots on target but could score. If you are backing both teams to score then be warned that Sunderland goal to shot ratio has been dropping week on week.
Stoke v Everton
0.34*3.44 v 0.18*4.71 = 1.16 + 0.84 = 2.00
Expectation that Stoke will restrict Everton in the final third as they do to every team they play.
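The arithmetic in those two previews is each team's goals per shot on target multiplied by its predicted shots on target, with the two results added. A minimal sketch of that calculation using the Man United v Sunderland figures quoted above; the function names are hypothetical, and this is only my reading of the sums, not the full in-running model. The figures quoted above look truncated to two decimals, so rounded output can differ by 0.01.

```python
def expected_goals(goals_per_sot, predicted_sot):
    """Goal expectation for one team: conversion rate times predicted shots on target."""
    return goals_per_sot * predicted_sot


def match_expectation(home, away):
    """Return (home, away, total) goal expectation from (rate, predicted SOT) pairs."""
    home_exp = expected_goals(*home)
    away_exp = expected_goals(*away)
    return home_exp, away_exp, home_exp + away_exp


# Man United v Sunderland, using the rates and predicted shots on target quoted above.
home_exp, away_exp, total = match_expectation((0.34, 5.79), (0.40, 2.86))
print(f"{home_exp:.2f} + {away_exp:.2f} = {total:.2f}")
```

The same call with `(0.34, 3.44)` and `(0.18, 4.71)` gives the Stoke v Everton expectation.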
Hi Johnny,
Good article once again mate. I’ve long procrastinated over developing a model incorporating Opta to identify the environment in running that’s most conducive to goals being scored/not scored, so nice work getting one set up. I use some similarish stats to assess who’s over/underperforming generally and to inform some weekly bets and season-long trades. Am also interested to know whether you’ve tried incorporating the clear chances created/conceded metrics? No idea if helpful, but I realise a portion of these wouldn’t end up in your shots on target figures.
Still, in relation to betting I feel this piece misses out a vital piece of context: price. Without knowing what price you’re taking to back or lay, it’s impossible to have an absolute strategy. So even in the examples you’ve given, it could be that the market has adjusted the price so that the actual percentage chance of what you’re backing is lower than the price implies, or, if laying, higher, so they are not viable bets. You mention Stoke, for instance, but the odds on low-scoring games are already adjusted short accordingly. The markets tend to be so clever that, in the main, when the implied percentage chance of a price is incorrect it’s only by a small amount, so profits are eroded over time by the commission you pay to the exchange. Of course edges can be found, but as I’ve experienced on numerous occasions, much to my annoyance, the markets can wise up at any time so the same successful back/lay prices are no longer found.
A counter intuitive thing to explain to people who don’t bet is that you could have 2 rival tipping services, one who backs a selection every week and the other who lays exactly the same one. Yet even though they’re opposing the other’s outcome, without specifying a price it’s possible that both of them, neither of them or only one of them end up in profit long term, dependent on the different prices they were matched at. I know you’re aware of this, I remember some interesting posts of yours on the Daq forum, but it’s something that’s worth pointing out for those familiarising themselves with betting/trading.
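A small worked example makes the point concrete. The sketch below computes the long-run expected profit per unit of backer's stake for a back bet and a lay bet on the same selection at different decimal prices. The 50% true probability, the prices and the 5% commission on winnings are all hypothetical values chosen for illustration.

```python
def back_ev(true_prob, price, commission=0.05):
    """Expected profit per unit staked when backing at a decimal price."""
    return true_prob * (price - 1) * (1 - commission) - (1 - true_prob)


def lay_ev(true_prob, price, commission=0.05):
    """Expected profit per unit of backer's stake when laying at a decimal price."""
    return (1 - true_prob) * (1 - commission) - true_prob * (price - 1)


TRUE_PROB = 0.5  # hypothetical true chance of the selection winning

# Backer matched at 2.2, layer matched at 1.8: both sides show a long-term profit.
print(back_ev(TRUE_PROB, 2.2), lay_ev(TRUE_PROB, 1.8))

# Prices swapped: backer at 1.8, layer at 2.2, and both lose long term.
print(back_ev(TRUE_PROB, 1.8), lay_ev(TRUE_PROB, 2.2))
```

Same selection, opposite sides, and the outcome for each tipster depends entirely on the price they were matched at.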
Also, it’s certainly worthwhile examining trends such as how teams have got on after trailing by x goals at half time, but that has more significance for making decisions at that part of the game, and I’d have to slightly challenge your Newcastle example. You say it certainly wasn’t worth backing them once they’d got back to 1-1 as they’d not come from behind at halftime to win away from home since 2005, but you’d really need to assess in how many of those games they ended up level again after 53 minutes. I’d guess the sample size would be too small to draw any definitive conclusions from, so it would also be worthwhile just looking at the end outcome of all the games they’d been level in away after 53 mins regardless of halftime result. It could well be that they’re still not worth backing, but once again it’s entirely dependent upon the price that’s available.
Anyway, not trying to be negative. It’s an interesting article and there is definitely some crossover in some of the types of analysis we do, but I certainly thought it worth mentioning the importance of price.
Agh, some annoying typos on my last response!
Anyway, also interesting you mention Bettorlogic, as effectively they’ve switched sides from punter to bookie. When they first started up they were trying to provide a betting service to calculate value bets and created an algorithm to determine how valuable each player was to every team, to adjust odds according to team news. They had a fair amount of money behind them too, including Bert Black, the creator of Betfair, but obviously struggled with the original business plan. Now they’ve ended up supplying info to the bookies to lure mug punters into poor value bets!
I do incorporate price at every trigger. Markets can be very inefficient. Bookmakers control the data flow as they pay the bills. The Newcastle historical trend is what is called survival analysis; I call it fightback, and it is also called leverage. Sample size is being confused with historical trends. Bookmakers look at the last 6 games max. I have been working on the model for 2 years and the in-running team I work with, at a place I will not promote, has seen amazing results. This is not academic research, as I have applied it to in-running betting. Please find me on Twitter to discuss further.
Last three in-running model bets were QPR O at HT (won), Norwich DNB at 1-1 (won) and lay 0-0 Spurs at evens (won). In all three games the price of the outcome was massively away from my expected prices. Triggers hit and pulled, and 3/3.
The person who clicked the ok button must be a bookmaker.
Cheers for the response Jonny, will add you on twitter.
I wasn’t challenging whether you were looking at price. More for others reading the piece to be aware that if they are backing or laying something on the back of certain criteria of shots etc. being met, without awareness of the price, they won’t be making a profit long term.
Of course the markets can be inefficient, pre-event much less so than during windows in running, but the intelligence behind the money makes it accurate enough that the vast majority will lose money long term when a 2-5% commission is factored into it. You’ve done incredibly well mate to find lots of market inefficiencies. I’ve put a lot of work in over the last 4 years now and I’ve managed to find a few edges, but am always aware that the edge is precarious and can be (and often has been) lost. There are lots of sharp minds searching to find the same things, and it’s significant that most of those facing the higher end of Betfair’s premium charge (courtsiders aside) are those trading the movements of the markets rather than those betting outright on specific outcomes.
Anyway, what you guys are doing sounds very interesting, definitely like to find out more. Particularly like the counter intuitive stuff so would be fascinated to learn more about your fightback analysis. Would’ve assumed using historical half time results as a predictor for full time results would become somewhat redundant 10 minutes into the second half if they’ve equalised without knowing how many times they’d made that initial fightback (and analysing what the end results were from that point). I’d have guessed full time result prediction would be more accurately found from finding instances of equivalent scores (or result) after that elapsed time but perhaps that doesn’t incorporate other nuanced factors involved when a team has come from behind.
I am exhausted, but the quick response is that I profile shots on target as time decays in games and can predict to a high accuracy:
1. when a goal is coming
2. I know the profile of, say, 0-0/1-0, 0-0/2-0, 0-1/1-1 (the first is the HT score)
3. I also GRADE TEAMS……
That is also very powerful.
Certainly redundant 10 mins into the second half if a team fight back, as that is another massive indicator. Cannot speak now as I have to go out.
should say NOT redundant
Arsenal scored on 14 mins at Reading and I have written on the time of the away goal. An early away goal increases further expectation of goals (more than expected in terms of the goal expectation before the game); see Dixon and Coles (1997) and Dixon and Robinson (1998). Information is out there. In all my blogs I have tried to add each element of the “in running” model.
http://www.eplindex.com/21884/crucial-goal-time-goal-time-of-first-goal.html
This is where i talk about “fightback” analysis.
I wrote an article about the time of the away goal on Dec 10th for another site. This was copied on another site on Dec 12th. My work is being reproduced with no mention …
Thanks to all the people on twitter who are giving a very positive response to this article. I really appreciate it. The data is out there for everyone now making it a level playing field so we should see some interesting developments in the study of raw football data.
I will be doing an updated article on shot on target data soon, showing teams like West Brom and Sunderland who revert to the mean, i.e. their shot on target data at the start of the season is above the average and then drops to a lower level around the average.
Very impressive article Jonny. I am a firm believer in your work. Would you be interested in discussing further over email? I could assist you with the leg work regarding all the data you gather and then manipulate to attain the model you have.