
Goals Subtracted (g-): Set piece edition — American Soccer Analysis



However, there is (at least) one notable flaw with the above approach. Most of these players, 227 to be exact, played on the same team both years, so this correlation may be due more to team effects than to player effects. Indeed, the slope for those 227 players who stuck with the same club was more than 0.40 (40% year-over-year stability), but the slope for the 32 players who changed teams was statistically insignificant and negative. Hmmmm. So I normalized each player’s g- value within his team and ran the regression again. This helps control for the team effect, and measures whether players perform similarly season over season relative to their teammates.
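To make the within-team normalization concrete, here is a minimal sketch. The article does not say exactly how the normalization was done, so subtracting the team's mean g- is an assumption, as are the function names and the made-up player values. The slope is an ordinary least squares slope of season-two values on season-one values.

```python
from collections import defaultdict
from statistics import mean

def center_within_team(rows):
    """rows: list of (player, team, g_minus) tuples.
    Returns {player: g_minus minus that team's mean g_minus}.
    Assumption: 'normalizing within team' means subtracting the team mean."""
    by_team = defaultdict(list)
    for _, team, value in rows:
        by_team[team].append(value)
    team_mean = {team: mean(values) for team, values in by_team.items()}
    return {player: value - team_mean[team] for player, team, value in rows}

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

# Made-up g- values for four hypothetical players on two teams.
season_1 = [("a", "X", 0.10), ("b", "X", 0.30), ("c", "Y", -0.20), ("d", "Y", 0.00)]
season_2 = [("a", "X", 0.05), ("b", "X", 0.15), ("c", "Y", -0.10), ("d", "Y", 0.10)]

rel_1 = center_within_team(season_1)
rel_2 = center_within_team(season_2)

# Only players present in both seasons enter the regression.
players = sorted(rel_1.keys() & rel_2.keys())
slope = ols_slope([rel_1[p] for p in players], [rel_2[p] for p in players])
print(f"year-over-year slope on team-relative g-: {slope:.2f}")
```

Because every value is measured against the team average, a team that is good or bad as a whole no longer pushes all of its players in the same direction. What remains is how stable each player is relative to his teammates.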

With the 1,000 minute cutoff in both seasons, the overall slope fell from above 0.30 to below 0.10, and was no longer statistically significant. With a 2,000 minute cutoff, the slope bounced back up to nearly 0.30 for players staying on the same team. In other words, relative to their teammates, bad defenders were more likely to be bad defenders again, and good defenders more likely to remain good defenders, but with a fair amount of regression to the mean. 

It’s hardly conclusive, but I think we can say two things fairly confidently: (1) when it comes to defense, team defense is more than an aggregation of individual defenders, and (2) bigger sample sizes improve stability year over year. We can check the stability of g- again when we build this out across more seasons and more leagues, to try to confirm that the 30% slope, or “stability score”, actually holds up.

Future considerations

Defender bias

While reviewing the results in aggregate, we noticed a clear bias hurting players who make a lot of interrupting actions in front of the goal. This became most obvious when we added goalkeepers to the open-play position list as an experiment. They took 67% of the responsibility for the area just in front of their goal (open play zone 3, technically) and thus they took a huge g- burden, two to three times that of any other position in total. That just seemed wrong. To a lesser extent this happens to center backs, too, both in set pieces and open play. For set pieces, we created more granular zones in front of goal to help with this issue, and we may just need to go even more granular.

If you’re wondering how this bias happens, think about this hypothetical: a ball comes flying into the box, headed toward your beast of a center back. They win the ball because they win these balls 75% of the time, and thus get many interrupting touches for doing their job. This also adds to their zonal responsibility burden. A second ball swings in toward your short defensive center mid, who is dropping in for a set piece or other dangerous attack. They watch the ball fly past, and only win these types of balls 25% of the time, because they are short and bad at The Defense. The overall positional responsibility ends up getting weighted toward center backs for essentially doing their job, while their mates basically suck and don’t get the interrupting touches which populate the numerator of zonal responsibility. There’s nothing to catch bad defenders not being there. The solution? We need DUELS (duels enthusiasts rejoice). Specifically, we need lost duels to help better understand who is responsible for a given plot of land. Duels weren’t readily available in our data during this research, but this is definitely a plug for duels if you were looking for a plug for duels.
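The arithmetic behind that hypothetical can be sketched in a few lines. This is a toy illustration, not the actual g- responsibility calculation: the 100 balls per player, the position labels, and the idea of counting every contested ball (won or lost) as a stand-in for duels data are all assumptions made for the example.

```python
def responsibility_shares(counts):
    """Turn per-position event counts into shares that sum to 1."""
    total = sum(counts.values())
    return {pos: n / total for pos, n in counts.items()}

# Hypothetical: each player has 100 balls played into their area.
balls_in = {"CB": 100, "CDM": 100}
win_rate = {"CB": 0.75, "CDM": 0.25}  # win rates from the hypothetical above

# Interrupting touches only exist when the defender wins the ball.
won = {pos: balls_in[pos] * win_rate[pos] for pos in balls_in}

# Touch-only view: responsibility follows successful interruptions.
touch_only = responsibility_shares(won)

# Duel-aware view (assumed): won and lost contests both count.
with_lost_duels = responsibility_shares(balls_in)

print(touch_only)       # the center back carries most of the share
print(with_lost_duels)  # the share is split evenly
```

In the touch-only view, the center back ends up with three times the responsibility of the center mid despite facing the same number of balls. Counting lost duels evens that out.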

Circumstance segmentation

Again, with the goal of determining who should be where and when, we could segment more. Further splitting by home and away, for example, would combine with gamestate (already in the framework) to possibly better explain how offensive or defensive certain positions tend to play, and where we should expect those players to be playing. But we have to be wary of over-segmenting the data, which creates small groupings from which to derive zonal responsibility, and also blows up my computer. Right now, the set piece coding pipeline, for a brief moment, generates 100 million combinations of things. I had to implement some slick tricks to get it to work with just two seasons already.
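A rough sketch of why over-segmenting hurts: every new split multiplies the number of groupings, so the average number of actions per grouping shrinks by the same factor. The dimension sizes and the action count below are hypothetical round numbers, not figures from the actual pipeline.

```python
from math import prod

def cell_count(levels):
    """Number of distinct groupings given the level count of each split."""
    return prod(levels.values())

# Hypothetical segmentation dimensions and sizes.
levels = {"zone": 30, "position": 10, "gamestate": 3}
total_actions = 90_000  # hypothetical number of actions in the data

cells_before = cell_count(levels)
cells_after = cell_count({**levels, "venue": 2})  # add a home/away split

avg_before = total_actions / cells_before
avg_after = total_actions / cells_after

print(f"groupings: {cells_before} -> {cells_after}")
print(f"average actions per grouping: {avg_before:.0f} -> {avg_after:.0f}")
```

A two-level split like home and away halves the average sample behind each responsibility estimate, and real data is spread unevenly, so the sparsest groupings get hit hardest.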

Set piece and open play definitions

Regarding how set pieces are defined as distinct from open play, we did not test those thresholds for this analysis. In retrospect, the 10-action threshold for when a free kick becomes open play seems long, and we could probably quibble over others as well. Maybe there should even be a third “phase”, distinct from open play and set pieces, that refines these middle grounds. That gets into another piece of information that could help answer the question of who should be where and when. We discussed further controlling for how far into the possession we are, such as how many actions have been completed to get to this point, which would help define the phase of play, players’ roles, and the resulting zonal responsibility weights between interrupting and offensive action locations by position.

I guess the only way to conclude this is to thank you for making it to the end!
