Written by: Jonny Grossmark - December 17, 2012

Shots on Target & Goals Stats: Using Data Analysis to form betting decisions?


Many of you may have seen the movie (or read the book) Moneyball. The story charts Oakland A’s general manager Billy Beane’s attempt to put together a baseball club on a budget by employing computer-generated analysis, getting the most out of what were considered to be very average players.

Can we transfer the idea of computer-generated analysis to our betting decisions?

There are a few hundred betting websites, and many of them will have thousands of clients. Everyone wants to be a winner and so people will tend to subscribe to the tipsters that they perceive to be the best.

The media industry has also caught up with the idea that people want to “know more, enjoy more, and share more” in terms of their betting experience and bookmakers have been advised that their target audience is open to what I describe as “bet stimulation”. The online in-running betting market has increased in volume dramatically since a company called Bettorlogic started to supply data to the industry.

Here is an example for the Norwich v Wigan game.

Norwich have scored first in 5/8 home matches against bottom-six teams. There have been +2.5 goals in 15/20 Wigan away matches.

As the data flashes up during a game it stimulates our brains to “trigger” a bet as we become confident that we have information that will give us an edge.

Bookmakers have gone even further by having actors appear at half-time advising you to consider backing the 1-3 scoreline in a game at 14/1. In one particular game I checked the probability of Real Madrid winning 3-1, as they were 1-0 up away at HT, and I discovered that in the last 4 seasons Real Madrid and Barcelona had each won 3-1 away only once after leading 1-0 away at HT.

The bookmakers are using data to give them an edge over the punter by providing really poor value “trigger” bets.

The next thing to look at is random behavior. If I put £10 on black and win, and then put £10 on black again, my chance of winning has no relationship with my previous bet. This is in fact what happens to many people when they bet. The strategy of joining trends, leaving them as soon as a bet loses, and then backing Liverpool because they do not lose on a Friday will lead to long-term losses, unless you are the lucky chap who recently won £180,000 for predicting 4 correct scores with an initial stake of £11.

Bookmakers have an edge so is it possible for us to have an edge?

Let us look at Man Utd and their shots on target data at home. Do you think it will be consistent, or inconsistent depending on the type of team they are up against?

As I write I am opening up my database and I can see that this season the numbers read 7, 5, 7, 7, 7, 6, 7, 7, and I welcome you to check who the away teams were. This is certainly not a random event.

In every home game this season Man Utd have had between 5 and 7 shots on target, which is a very narrow range. However, goals to shots-on-target ratios are not consistent and vary from game to game.

Patterns

We can, though, look at the shots-on-target data, much like the general manager in Moneyball, to see any patterns forming that will define what we are looking for: expectation of a goal.

Have a look at Fulham v Norwich and we see that there was a cluster of shots on target before the first Fulham goal.

[Image: Fulham v Norwich, shots on target and goals]

Of course, not all games will fit that cluster pattern.

This is where Liverpool had 9 shots on target but failed to score, which does happen but not very often.

[Image: Swansea v Liverpool, shots on target and goals]

Again there are teams that dominate the game in terms of initial shots-on-target data and still lose.

A good example is Everton at Reading. Everton dominated the game and lost 2-1.

[Image: Reading v Everton, shots on target and goals]

People find it very easy to change their view of which team has the momentum when watching a game, and this is a great example from the Sky Sports commentary on Swansea v Liverpool:

  1. Liverpool are now the team pressing for a late goal, but are unable to create a noteworthy chance.
  2. 89′ - Swansea look the most likely to grab a late winner at the moment but Liverpool can catch a break as they win a throw in inside Swansea’s half.

What can we do?

So what can we do to find an edge to give the bookmakers a run for our hard cash?

  1. Look at the strength of a shot, for example: 39′ Chelsea win a free-kick in a dangerous position. Luiz takes it but it is a poor effort and straight at Hart with little pace on it. This is not going to be a likely goal.
  2. Use a site like EPLIndex’s Stats Centre (shots at goal area graphic available) if you do not have pictures to see where the shots are being taken from. I went to an OPTA seminar last week which showed an excellent graphic of where goals are scored from. The simple conclusion is that not many are scored from outside the box, so just looking at where QPR shoot from will show you instantly that they have struggled to score because they are shooting from too far out. Harry has already stopped that and they were certainly much better in the final third at Wigan.
  3. Download the FourFourTwo StatsZone app, which I think is free now, and track a game in running. How good is the team that you are watching in the first 30 minutes of a game?
  4. Download the data from Football Data UK on to a spreadsheet to see what the profile of a 0-0 in the Premiership is in terms of average shots on target. A word of warning: the site uses Press Association data for shots and shots on target, so I manually change it back to an OPTA source like EPLIndex.
  5. Opening up my spreadsheet I can see that there have been fourteen 0-0s in the Premiership this season. The average shots on target for the home team is 2.85 and for the away team it is 3.07, which has been pushed up by Liverpool having 9 shots on target at Swansea and not scoring.
  6. Look at historical trends such as 1-0 HT, which I discussed in a previous article. If Newcastle have not won away when losing 1-0 at HT since they beat Sunderland 4-1 in 2005, then there is little point in thinking, at 1-1 last week at Fulham (HT 1-0), that they could win the game. If you had also noted the lack of draws when Newcastle have been 1-0 down at HT, then this should have “triggered” a bet, and it would not be to back Newcastle.
  7. See if you can predict expectation of goals. I was looking at the shot on target data for the Norwich game at Swansea and suspected that Norwich would score very soon, which they did 60 seconds later.
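
Points 4 and 5 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names FTHG, FTAG, HST and AST are assumed to match the Football Data UK CSV layout, the function name nil_nil_profile is hypothetical, and the four rows are invented sample rows, not real results.

```python
import csv
import io

# Hypothetical sample rows, NOT real results. Column names are assumed to
# follow the Football Data UK layout: FTHG/FTAG = full-time home/away goals,
# HST/AST = home/away shots on target.
SAMPLE_CSV = """HomeTeam,AwayTeam,FTHG,FTAG,HST,AST
Team A,Team B,0,0,3,2
Team C,Team D,2,1,7,4
Team E,Team F,0,0,2,4
Team G,Team H,1,1,5,5
"""


def nil_nil_profile(csv_text):
    """Return (avg home SOT, avg away SOT, number of games) over 0-0 games."""
    rows = [
        row for row in csv.DictReader(io.StringIO(csv_text))
        if int(row["FTHG"]) == 0 and int(row["FTAG"]) == 0
    ]
    if not rows:
        return None  # no goalless games in the data
    home_avg = sum(int(r["HST"]) for r in rows) / len(rows)
    away_avg = sum(int(r["AST"]) for r in rows) / len(rows)
    return home_avg, away_avg, len(rows)


print(nil_nil_profile(SAMPLE_CSV))
```

With a real season file in place of the sample rows, the two averages correspond to the 0-0 profile figures quoted in point 5.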

This is not the best way to profit if you do find that you can predict goal expectation.

a) back in the goal market

b) lay the correct score as time decays.

Say Liverpool, in real time at Swansea, have had 6 shots on target after 80 minutes. You check your spreadsheet and see that the average for a 0-0 is far lower than that, so you trigger a LAY bet on the correct score, i.e. 0-0, in anticipation of the goal. In this case the goal did not arrive, but there are a greater number of games where it will.
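
The check described here can be written down as a small rule. The sketch below is an assumption-laden illustration, not the author's model: the function name lay_nil_nil_trigger, the min_minute cut-off and the simple "more than the 0-0 average" test are all hypothetical, and price is deliberately left out. The two averages are the ones quoted earlier in the article.

```python
# Average shots on target in 0-0 games, as quoted in the article.
AVG_SOT_HOME_00 = 2.85
AVG_SOT_AWAY_00 = 3.07


def lay_nil_nil_trigger(score, minute, home_sot, away_sot, min_minute=60):
    """Return True when a 0-0 game shows more shots on target than the
    typical 0-0 profile. The rule and min_minute are illustrative
    assumptions; price is not considered in this sketch."""
    if score != (0, 0) or minute < min_minute:
        return False
    return home_sot > AVG_SOT_HOME_00 or away_sot > AVG_SOT_AWAY_00


# Swansea v Liverpool from the text: Liverpool (away) on 6 shots on target
# after 80 minutes. Swansea's count is not given, so 0 is a placeholder.
print(lay_nil_nil_trigger((0, 0), 80, home_sot=0, away_sot=6))
```

A real trigger would also compare the available lay price with the price implied by the data before placing anything.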

Some of you may be aware that more goals are scored in the 90-FT band than in any other time band, so by backing a goal, without worrying about who is going to score, you have both teams playing for you as the game nears its end.

The same is true in reverse which is if the game is quiet there is expectation of a lack of goals but I am not a fan of this idea as I have seen quiet games that have exploded resulting in floods of goals.

You also need to look at accuracy prevention, which is the ability of a team to stop the other team turning shots into shots on target. Stoke are past masters of this.

At Villa last week, Villa had 13 shots and just two shots on target, so the Stoke accuracy prevention ratio was 11/13, which is a massive 0.85. It is no wonder Villa fired a blank, FT 0-0.
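
The ratio is simply the share of the opponent's shots that did not end up on target. A minimal sketch, where the function name accuracy_prevention is hypothetical and the figures are the Villa v Stoke numbers from the text:

```python
def accuracy_prevention(total_shots, shots_on_target):
    """Share of the opponent's shots that did NOT end up on target."""
    if total_shots == 0:
        return None  # undefined when the opponent had no shots
    return (total_shots - shots_on_target) / total_shots


# Villa v Stoke from the text: Villa had 13 shots, 2 of them on target.
ratio = accuracy_prevention(13, 2)
print(round(ratio, 3))
```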

I leave you with a shots-on-target table and a breakdown of the EPL using the time bands 0-30, 31-60 and 61-FT, which you are welcome to change to suit your strategy.

Conclusion

In conclusion, betting can be random and bookmakers are using techniques to stimulate “random bets”, but I argue that football is predictable, and that if you look hard you may find some interesting patterns and historical trends.

TEAM          GLS   TSH   SOT   GLS/SOT
Arsenal        26   244    75   0.34
Aston Villa    12   183    52   0.23
Chelsea        28   231    82   0.34
Fulham         27   229    89   0.30
Liverpool      22   294    78   0.28
Man City       30   284   104   0.28
Man UTD        40   250    93   0.43
Newcastle      18   226    78   0.23
Norwich        17   192    62   0.27
QPR            13   212    68   0.19
Reading        19   188    59   0.32
Southampton    22   218    68   0.32
Stoke          14   161    47   0.29
Sunderland     17   175    54   0.31
Swansea        26   231    78   0.33
Spurs          29   258    92   0.31
WBA            24   198    67   0.35
West Ham       21   205    64   0.32
Wigan          17   198    64   0.26
Everton        27   295   100   0.27

Shot on target does not include the woodwork or blocks.

TABLE KEY
GLS = Goals
TSH = Total Shots
SOT = Shots on Target
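
The GLS/SOT column can be recomputed from the GLS and SOT columns. The sketch below does that for four teams from the table and ranks them; the names TEAMS, conversion and ranked are purely illustrative.

```python
# (goals, shots on target) copied from the table above for a few teams.
TEAMS = {
    "Man UTD": (40, 93),
    "Arsenal": (26, 75),
    "Everton": (27, 100),
    "QPR": (13, 68),
}


def conversion(goals, sot):
    """Goals per shot on target."""
    return goals / sot


# Rank the teams from most to least clinical.
ranked = sorted(TEAMS, key=lambda t: conversion(*TEAMS[t]), reverse=True)
for team in ranked:
    print(f"{team:8s} {conversion(*TEAMS[team]):.3f}")
```

Some recomputed values differ from the table in the second decimal place (Arsenal comes out at about 0.347 against 0.34 in the table), which suggests the table figures were truncated rather than rounded.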

Team / Stats   SOT 0-30   SOT 31-60   SOT 61-FT   GLS 0-30   GLS 31-60   GLS 61-FT
Arsenal              20          20          35          6           9          11
Villa                14          19          19          4           4           4
Chelsea              27          27          28         11           8           9
Everton              32          35          33          9          10           8
Fulham               22          16          51          6           6          15
Liverpool            23          24          31          5           8           9
Man City             18          47          39          2           9          19
Man UTD              30          30          33         14          10          16
Newcastle            14          33          31          4           8           6
Norwich              24          18          20          5           6           6
QPR                  21          26          21          4           6           3
Reading              15          15          28          7           5           7
Southampton          15          29          24          4           8          10
Stoke                16          13          18          5           5           4
Sunderland           15          15          24          4           6           7
Swansea              21          28          29          5           6          15
Spurs                23          31          38          7          10          12
West Ham             14          26          24          4           9           8
West Brom            14          23          30          4          10          10
Wigan                20          23          21          5           6           6
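
One way to read the time-band table is to ask what share of a team's goals arrive in the 61-FT band. The sketch below does this for three teams from the table; the names BANDS and late_goal_share are hypothetical and the calculation is just an illustration of how the bands might be used.

```python
# (SOT 0-30, SOT 31-60, SOT 61-FT, GLS 0-30, GLS 31-60, GLS 61-FT)
# copied from the table above for three teams.
BANDS = {
    "Man City": (18, 47, 39, 2, 9, 19),
    "Fulham": (22, 16, 51, 6, 6, 15),
    "Swansea": (21, 28, 29, 5, 6, 15),
}


def late_goal_share(row):
    """Fraction of a team's goals scored in the 61-FT band."""
    goals = row[3:]
    return goals[2] / sum(goals)


for team, row in BANDS.items():
    print(f"{team:8s} {late_goal_share(row):.2f}")
```

The same function works unchanged if the bands are redefined, as long as the last three entries of each row remain the goals per band.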

Images taken from the excellent FourFourTwo StatsZone app.

All of the stats from this article have been taken from the Opta Stats Centre at EPLIndex.com.







About the Author

Jonny Grossmark
My first taste of football in a stadium was Gillingham v Aston Villa in 1971 and I still have the programme, which cost 5p. I have been lucky to have seen a number of Cup Finals but missed the Sunderland goal in 1973 as I was in the toilet. I have recently been watching Margate and also watch around 50 other matches a month on my computer.




 
 

 

 




16 Comments


  1. Debbie

    Very impressive article.


  2. Jonny Grossmark

    Thanks Debbie.

    Man UTD v Sunderland

    Man UTD had 5 shots on target and I predicted 5.79

    Random or explained? Gave me a 2-1 scoreline and Stoke 1-1.

    You will get games like Liverpool that are way off the prediction.

    Man United v Sunderland
    0.34*5.79 v 0.40*2.86 = 1.96 + 1.14 = 3.10
    Man UTD are so consistent on the shot on target front so no surprise. Simply says Sunderland will be restricted to 2/3 shots on target but could score. If you are backing both teams to score then be warned that Sunderland goal to shot ratio has been dropping week on week.

    Stoke v Everton
    0.34*3.44 v 0.18*4.71 = 1.16 + 0.84 = 2.00
    Expectation that Stoke will restrict Everton in the final third as they do to every team they play.
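
The arithmetic in this comment multiplies each team's goals-per-shot-on-target ratio by its predicted shots on target and adds the two results. A minimal sketch of that calculation, assuming this reading of the numbers is right; the function names expected_goals and match_expectation are hypothetical, and the inputs are the figures quoted in the comment:

```python
def expected_goals(conversion, predicted_sot):
    """Goals per shot on target multiplied by predicted shots on target."""
    return conversion * predicted_sot


def match_expectation(home, away):
    """home and away are (conversion, predicted_sot) pairs.
    Returns (home expectation, away expectation, match total)."""
    h = expected_goals(*home)
    a = expected_goals(*away)
    return h, a, h + a


# Figures quoted in the comment above.
print(match_expectation((0.34, 5.79), (0.40, 2.86)))  # Man United v Sunderland
print(match_expectation((0.34, 3.44), (0.18, 4.71)))  # Stoke v Everton
```

The unrounded totals come out slightly above the 3.10 and 2.00 quoted, which is consistent with the comment truncating each term to two decimals before adding.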


  3. betterbettor

    Hi Johnny,

    Good article once again mate, long procrastinated over developing a model incorporating opta, to identify the environment in running that’s most conducive to goals being scored/not scored so nice work getting one set up. I use some similarish stats to assess who’s over/underperforming generally and inform some weekly bets and season long trades. Am also interested to know whether you’ve tried incorporating the clear chances created/conceded metrics? Know idea if helpful but realise a portion of these wouldn’t end up in your shots on target figures.

    Still, in relation to betting I feel this piece misses out a vital piece of context. Price. Without knowing what price you’re taking to back or lay it’s impossible to have an absolutest strategy. So even in the examples you’ve given it could be that the market has adjusted the price so the actual percentage chance of what you’re backing are lower than the price available or if laying it’s a higher chance so are not viable bets. You mention Stoke for instance but the odds on low scoring games are already adjusted short accordingly. The markets tend to be so clever that in the main when the implied percentage chance of a price is incorrect it’s only small, so profits are eroded over time by the commission you pay to the exchange. Of course edges can be found but as I’ve experienced numerous occasions, much to my annoyance, the markets can wise up at any time so the same successful back/lay prices are no longer found.

    A counter intuitive thing to explain to people who don’t bet is that you could have 2 rival tipping services, one who backs a selection every week and the other who lays exactly the same one. Yet even though they’re opposing the other’s outcome, without specifying a price it’s possible that both of them, neither of them or only one of them end up in profit long term, dependent on the different prices they were matched at. I know you’re aware of this, I remember some interesting posts of yours on the Daq forum, but it’s something that’s worth pointing out for those familiarising themselves with betting/trading.

    Also, It’s certainly worthwhile examining trends such as how teams have got on after trailing by x goals at half time but that has more significance for making decisions at that part of the game and I’d have to slightly challenge your Newcastle example. You say it certainly wasn’t worth backing them once they’d got back to 1-1 as they’d not come from behind at halftime to win away from home since 2005 but you’d really need to assess in how many of those games they ended up level again after 53 minutes. I’d guess the sample size would be too small to draw any definitive conclusions from so it would also be worthwhile just looking at the end outcome of the all the games they’d been level away after 53 mins regardless of halftime result. It could well be that they’re still not worth backing but once again it’s entirely dependent upon the price that’s available.

    Anyway, not trying to be negative, is an interesting article and is definitely some cross over in some of the the types of analysis we do but certainly thought it worth mentioning about the importance of price.


  4. betterbettor

    Agh, some annoying typos on my last response!

    Anyway, also interesting you mention bettorlogic as effectively they’ve switched sides from punter to bookie. When they first started up they were trying to provide a betting service to calculate value bets and created an algorithm to determine how valuable each player was to every team to adjust odds according to team news. They had a fair amount of money behind them too, including Bert Black, the creator of betfair, but obviously struggled with the original business plan. Now they’ve ended up supplying info to the bookies to lure mug punters into poor value bets!


  5. jonny

    I do incorporate price at every trigger. Markets can be very inefficient. Bookmakers control data flow as they pay the bills. The Newcastle historical trend is what is called survival analysis; I call it fightback, and it is also called leverage. Sample size is being confused with historical trends. Bookmakers look at the last 6 games max. I have been working on the model for 2 years and the in-running team I work with, at a place I will not promote, has seen amazing results. This is not academic research as I have applied it to in-running betting. Please find me on Twitter to discuss further.


  6. jonny

    Last three in-running model bets were: QPR O at HT, won; Norwich DNB at 1-1, won; and lay 0-0 Spurs at evens, won. In all three games the price of the outcome was massively away from my expected prices. Triggers hit and pulled and 3/3.


  7. jonny

    The person who clicked the ok button must be a bookmaker.


  8. betterbettor

    Cheers for the response Jonny, will add you on twitter.

    I wasn’t challenging whether you were looking at price. More for others reading the piece to be aware that if they backing or laying something on the back of a certain criteria of shots etc being met without awareness of the price they won’t be making a profit long term.

    Of course the markets can be inefficient, pre event much less so than during windows inrunning but the intelligence behind the money makes it accurate enough that the vast majority will lose money long term when a 2-5% commission is factored into it. You’ve done incredibly well mate to find lots of market inefficiencies. I’ve put a lot of work in over the last 4 years now and I’ve managed to find a few edges but am always aware that the edge is precarious and can (and often has been) lost. Are lots of sharp minds searching to find the same things and it’s significant that most of those facing the higher end of betfair’s premium charge (courtsiders aside) are those trading the movements of the markets rather than those betting outright on specific outcomes.

    Anyway, what you guys are doing sounds very interesting, definitely like to find out more. Particularly like the counter intuitive stuff so would be fascinated to learn more about your fightback analysis. Would’ve assumed using historical half time results as a predictor for full time results would become somewhat redundant 10 minutes into the second half if they’ve equalised without knowing how many times they’d made that initial fightback (and analysing what the end results were from that point). I’d have guessed full time result prediction would be more accurately found from finding instances of equivalent scores (or result) after that elapsed time but perhaps that doesn’t incorporate other nuanced factors involved when a team has come from behind.


  9. Jonny Grossmark

    I am exhausted but quick response is I profile shots on target as time decays to games and can predict to a high accuracy

    1. when a goal is coming
    2. i know the profile of say 0-0 1-0 0-0 2-0 0-1 1-1 first is ht score

    3. also GRADE TEAMS……

    that is also very powerful.


  10. Jonny Grossmark

    certainly redundant 10 mins into second half if a team fight back as another massive indicator. cannot speak now as have to go out


  11. Jonny Grossmark

    should say NOT redundant


  12. Jonny Grossmark

    Arsenal scored on 14 mins at Reading and I have written on the time of the away goal. An early away goal increases further expectation of goals (more than expected in terms of the goal expectation before the game); see Dixon and Coles (1997) and Dixon and Robinson (1998). Information is out there. In all my blogs I have tried to add each element of the “in running” model.


  13. Jonny Grossmark

    http://www.eplindex.com/21884/crucial-goal-time-goal-time-of-first-goal.html

    This is where i talk about “fightback” analysis.

    I wrote an article about the time of the away goal on dec 10th for another site. This was copied on another site on dec 12th. My work is being reproduced and no mention …


  14. Jonny Grossmark

    Thanks to all the people on twitter who are giving a very positive response to this article. I really appreciate it. The data is out there for everyone now making it a level playing field so we should see some interesting developments in the study of raw football data.


  15. Jonny Grossmark

    I will be doing an updated article on shot on target data soon showing teams like West Brom and Sunderland who revert to the mean… ie their shot on target data at start of season is above the average and then drops to a lower level around the average.


  16. Ben

    Very impressive article Jonny. I am a firm believer in your work. Would you be interested in discussing further over email ? I could assist you with the leg work regarding all the data you gather and then manipulate to attain the model you have.


