I did some further analysis looking at the shooting data, based on more feedback and discussion on Twitter yesterday.

Again, I started with the 5 seasons of shooting data, men and women, with the current season up to the 5th event. I then filtered for Mass Starts and Pursuits, and did some data cleaning, leaving 96,580 shots for the analysis, not limiting the data by certain ranks only.

Question 1

Aldo Ramos on Twitter wondered if we could determine the percentages of athletes that would go clean on the first 15 shots, and then had one miss. And where did that miss happen?

After moving the data back and forth between Tableau and Google Sheets, I was able to show what percentage of athletes clean on the first 15 missed their 16th target, or went clear on the first 16 and missed the 17th target, etc.

For clarity’s sake, the T16 column includes MHHHH, MMHHH, MMMHH, etc. The question was really about the shot at which the “clean spell” is broken. As athletes pick up more misses as the fourth (and last) shooting goes on, the percentages go down. The exception is going clean on all 20, which 41% of the athletes who were clean after 15 achieve.
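The Tableau and Google Sheets steps are not shown here, but the idea can be sketched in a few lines of Python. This is only an illustration, assuming each athlete's race is available as a 20-character string of H (hit) and M (miss); the function name and the sample sequences are made up for the example and are not real race data.

```python
from collections import Counter

def clean_spell_break(results, clean_through=15):
    """Share of athletes, among those clean on the first `clean_through`
    shots, whose first miss falls on each later shot.
    The key None means the athlete stayed clean on every shot."""
    counts = Counter()
    for seq in results:
        if "M" in seq[:clean_through]:
            continue  # skip athletes who were not clean going into the last shooting
        pos = seq.find("M", clean_through)  # 0-based index of the first miss, -1 if none
        counts[None if pos == -1 else pos + 1] += 1
    total = sum(counts.values())
    return {shot: n / total for shot, n in counts.items()} if total else {}

# Hypothetical 20-shot sequences, for illustration only.
sample = [
    "H" * 20,
    "H" * 15 + "MHHHH",
    "H" * 15 + "MMHHH",   # counted under shot 16: the clean spell breaks there
    "H" * 15 + "HHHHM",
    "HHMHH" + "H" * 15,   # excluded: miss in the first 15
]
shares = clean_spell_break(sample)
```

Note that MHHHH and MMHHH both land under shot 16, which mirrors how the T16 column is defined above.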

Question 2

Bjorn then asked on Twitter how many athletes missed more than one shot in the last shooting.

The chart above shows all sequences of the last 5 shots. As we saw above, 41% go clean all the way; after that the most common occurrence is missing the 16th shot, then missing shot 20 with 8.44%, etc.

But it doesn’t really clearly answer Bjorn’s question of how many misses in the last five were shot. The chart below answers that question more clearly:

Just over 40% went clear, and the same percentage had one miss. The group with two misses was 15%, three misses 2.5% and just over 1% missed four times. No one in the group that hit the first 15 shots had 5 misses in the last shooting.
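Counting the misses in the last five shots is a simple grouping step. Here is a minimal sketch, again assuming one 20-character H/M string per athlete and race; the names and the sample data are hypothetical.

```python
from collections import Counter

def misses_in_last_shooting(results, clean_through=15):
    """Distribution of the number of misses in the shots after
    `clean_through`, for athletes who were clean up to that point."""
    counts = Counter(
        seq[clean_through:].count("M")
        for seq in results
        if "M" not in seq[:clean_through]
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)} if total else {}

# Hypothetical sequences, for illustration only.
sample = [
    "H" * 20,
    "H" * 15 + "MHHHH",
    "H" * 15 + "HHHHM",
    "H" * 15 + "MMHHH",
    "HHMHH" + "H" * 15,   # excluded: miss in the first 15
]
distribution = misses_in_last_shooting(sample)
```

The keys of the result are the number of misses (0 to 5) and the values are the share of the clean-after-15 group.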

I had a fun discussion with Bjorn Ferry, Biathlon23 and SportInDepth on Twitter. We debated if taking the 4th and 5th shot after shooting clean on the first three, is (mentally) harder than when you already have a miss in your first three shots.

My thesis is that the last shot is difficult only if you did not miss earlier in the competition. If you only look at those who hit the first 9 shots on a sprint, I think the last shot has bad statistics. But not for those who missed already in prone.

I went through the individual and the sprint competitions in Pokljuka back in 2018 to find out if it is the case that the last shot is more difficult. I compared all shooting series (6410 shots) with those that arrived at the last standing with the chance to shoot clean.

Shot 1,2,3,4,5; which is most often missed? For the entire starting field there was no clear tendency to miss shots four or five more often. For the group that arrives at the last stand with the chance to shoot clean, it becomes clear that shots four and five are harder.

It is also logical because then you have the chance to make a really good result. The pressure is increasing. *whole field = blue bars; those with the chance to shoot full = orange bars

It would be fun to see if this is true if you look at an entire season. If you hit the first 8 targets in a sprint, or the first 18 targets in the other disciplines. What does it look like then? I think the fourth and fifth shots are harder.

Caveat

Before I continue, I want to be very clear. One, I am not a statistician, and two, Bjorn probably knows a little more about biathlon than I do, based on his 7 wins and 22 top-3’s in the IBU World Cup, World Champs and Olympics, not even counting his 7 medals in relays. So to go out and say I did not agree with his findings was, well, scary.

Data

My data, however, told me a slightly different story. We also started with different datasets. His data was from the individual and sprint competitions (not sure if it includes men and women) in Pokljuka in 2018, only looking at the final standing shooting, for 6,410 shots. Mine was from the 2017-18 season onwards to the Oberhof event in 2021-22. It includes the final standing shooting of the Sprint, Pursuit, Mass Start and Individual events. I also only included the top 30 athletes. And 6,410 shots versus 31,419 will make a difference: the larger sample is not better or worse, but it is less influenced by outliers and more predictable.

I did not have the data in a format where I could replicate the chart from Bjorn, so initially I took a different approach in my analysis.

Analysis, approach one

The chart shows the number of hits and misses per shot of the last shooting of an event (so 2nd shooting for Sprint, and 4th shooting for the other events). As shown in the previous article, the first and last shots have the lowest hit rates.

This data can be copied from Tableau to Google Sheets, where I can create a table that shows the probability for every shot combination.
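The spreadsheet itself is not reproduced here. One way to build such a table is to multiply the per-shot hit and miss rates for each of the 32 possible combinations, which assumes the five shots are independent of each other. That independence is an assumption of this sketch, not something the source confirms, and the hit rates below are placeholders rather than the values from the chart.

```python
from itertools import product

def combination_probabilities(hit_rates):
    """Probability of every hit/miss pattern for one shooting,
    ASSUMING the shots are independent of each other."""
    probs = {}
    for pattern in product("HM", repeat=len(hit_rates)):
        p = 1.0
        for outcome, rate in zip(pattern, hit_rates):
            p *= rate if outcome == "H" else 1.0 - rate
        probs["".join(pattern)] = p
    return probs

# Hypothetical hit rates for T1..T5, NOT the values from the chart.
rates = [0.82, 0.85, 0.86, 0.87, 0.84]
probs = combination_probabilities(rates)

# Group the 32 combinations the same way as the summary table.
all_hits = probs["HHHHH"]
first_four_hit_miss_t5 = probs["HHHHM"]
```

A table built this way cannot show any dependence between earlier and later shots, because the independence is baked into the multiplication.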

Now that we have a list of probabilities for every shot combination, we can group these combinations into:

Combi | Probability
All hits | 44.3%
First 4 hits, miss T5 | 8.4%
One or more misses in first 4, hit T5 | 40.1%
One or more misses in first 4, miss T5 | 7.6%

So hitting the fifth target after hitting the first four targets has a higher probability than when missing one or more targets in the first four shots. This conclusion is different from the conclusion Bjorn drew from his chart, but perhaps this is simply because I didn’t replicate the chart he used.
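The percentages in the table are joint probabilities, and the two groups (clean after four shots, or not) are not the same size. As a cross-check, here is a small sketch that turns the published numbers into conditional hit rates for T5 within each group. The variable names are mine; the four values are copied from the table.

```python
# Joint probabilities copied from the table (in percent).
all_hits = 44.3        # first 4 hit, T5 hit
clean4_miss5 = 8.4     # first 4 hit, T5 missed
missed_hit5 = 40.1     # one or more misses in first 4, T5 hit
missed_miss5 = 7.6     # one or more misses in first 4, T5 missed

def conditional_hit_rate(hit_t5, miss_t5):
    """Hit rate on T5 within one group: hits / (hits + misses)."""
    return hit_t5 / (hit_t5 + miss_t5)

after_clean = conditional_hit_rate(all_hits, clean4_miss5)
after_miss = conditional_hit_rate(missed_hit5, missed_miss5)

print(f"T5 hit rate after four hits:         {after_clean:.1%}")
print(f"T5 hit rate after at least one miss: {after_miss:.1%}")
```

With the rounded table values, both rates come out at roughly 84%. Read this way, the table shows no penalty on T5 for arriving clean, which still differs from Bjorn's conclusion, but it does not show a clear advantage either. That is what one would expect if the combination probabilities were built from per-shot hit rates.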

Analysis, approach two

To compare apples to apples I want to duplicate the chart from Bjorn. First, I used a different subset. Still from the same seasons, I included all athletes rather than the top 30 only. I also limited the data to only Mass Starts and Pursuits to limit the number of records (Google Sheets crashed with more than one million cells). Also, these race disciplines probably have even more pressure, since it is (wo)man to (wo)man rather than time-based. This left me with 107 races and 358 athletes for 98,400 shots.

I moved this data over to Google Sheets, where I could use the Pivot functionality to create the following table with Athlete, Race and the full (20-shot) combination, in which M indicates a miss and H a hit:

Going back to Tableau, I could now calculate the misses for every shot of the fourth (and last) shooting session, and create two groups of athletes: one that started the fourth shooting with 15 hits, and one that had at least one miss in the first 15 shots.
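The grouping step can be sketched outside Tableau as well. The following is a minimal illustration, assuming the pivot output is a list of 20-character H/M strings. It expresses the result as a miss rate per shot within each group, which is my reading of the percentage used in the chart and may differ from what Bjorn computed.

```python
def last_shooting_miss_rates(results, clean_through=15):
    """Miss rate per shot of the last shooting, split by whether the
    athlete was clean on the first `clean_through` shots."""
    groups = {"clean": [], "not_clean": []}
    for seq in results:
        key = "clean" if "M" not in seq[:clean_through] else "not_clean"
        groups[key].append(seq[clean_through:])  # keep only the last shooting
    rates = {}
    for key, tails in groups.items():
        if not tails:
            rates[key] = []
            continue
        rates[key] = [
            sum(tail[i] == "M" for tail in tails) / len(tails)
            for i in range(len(tails[0]))
        ]
    return rates

# Hypothetical sequences, for illustration only.
sample = [
    "H" * 15 + "MHHHH",
    "H" * 15 + "HHHHM",
    "HHMHH" + "H" * 10 + "HHMHH",
    "MHHHH" + "H" * 10 + "HHHHH",
]
rates = last_shooting_miss_rates(sample)
```

Setting `clean_through=18` would restrict the clean group to athletes who hit their first 18 targets.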

Going back and forth

Unfortunately, the way the data is formatted I could not convert this data to a percentage, so back to Google Sheets I went to create a conversion table, which I then could use to create a chart similar to Bjorn’s.

When I display the percentages for the group of all athletes as well as those that came into the fourth shooting with zero misses, again the conclusion that can be drawn is different from Bjorn.

Let’s quickly put them side by side again to compare:

It is clear why Bjorn felt his data and chart support his thesis that hitting targets 19 and 20 is harder when you go clean in the first 15 shots, compared to when you have a miss already. And I do agree with him that, if you think about it, it would make sense. But no matter how I look at my data, I cannot come to the same result. The fourth shot is actually missed the least and the fifth is missed less often than shots one, two or three. Of course, for the fifth shot, this includes those who shot clean on the first 15 and then missed one or more shots in the final shooting.

First 18 shots clean

So what if we look at those shootings where the first 18 were clean? I basically did the same exercise adding up the misses per group in Tableau. Obviously the first three shots have no misses since the first 18 are clean. That doesn’t leave us with much data, with only 84 misses in total on the fourth and fifth shots.

Again we bring this over to Google Sheets and calculate the percentages.

From there we create a chart, from which I conclude that, if anything, the fifth shot is harder than the fourth, confirming our earlier findings that the first and last shots of a shooting are the hardest. But drawing any conclusions based on only 84 misses is not something I would recommend.

Conclusion

From the chart I created, I cannot conclude that the 4th and 5th shots are harder when clean, compared to when one already has a miss. If anything, it seems there are fewer misses in shots four and five of the last shooting when one is clean.

One of the reasons I mentioned on Twitter is that if you make it to the fourth shooting without any misses, you are a pretty darn good shot. The odds of making the next five shots are pretty good. On the other hand, if you already had one or more misses in the first fifteen, perhaps you still have some work to do on the shooting, and having another miss is not an unrealistic expectation. We need to remember that this includes all athletes from Pursuits and Mass Starts, so up to 60 per race. That includes shooters who can use some improvement still, as well as the Laegreids and Eders of biathlon who shoot around 90%.

How else can we explain the differences? The last thing I want is to create the impression that I think I know better than Bjorn Ferry, and that his chart is wrong. This is not the case (to be clear)! Just the fact that my resulting chart does not support the thesis Bjorn stated makes me nervous, especially as he mentioned his chart confirms what he expected. Someone with his experience of course knows what he is talking about. But my data and analysis don’t live up to these expectations…

I mentioned I’m not a statistician, but I double and triple checked my data and process, so I’m confident there are no mistakes in either. But if anyone can tell me after reading this article where I went wrong, I’d be happy to hear from you!

Another reason I believe there are different results is that the two analyses are based on different data sources and sample sizes. I should also mention that I don’t know the details of the process that Bjorn used to create his chart. Perhaps I misunderstood what he did and used, which may have led me to do a different analysis. The larger sample size I used typically leads to less obvious differences and fewer extremes. And Bjorn used data from one event with a couple of races, which will make it subject to conditions specific to Pokljuka. That is something levelled out by using more data from different event locations with different weather conditions.

All in all, it was great fun and interesting to do this analysis, and I thank Bjorn Ferry for reaching out and sharing his chart and work. Although my work does not support his thesis, I hope this article hasn’t lost me a follower on Twitter… ;o)

Let me know what you think about this article by sending me a Tweet or DM! Any feedback is highly appreciated.

In my last article, “Is one shot like any other in biathlon?”, I looked at the hit rates for every shot in biathlon for non-team races. The following looks at hit rates again, but this time for Relays. Relays are different in the sense that every shooting has three extra rounds that can be manually reloaded, so determining which shots were hits and which were misses works a little differently.

I also looked at the data to see if I could link the hit rates to pressure. But again the three reloads and also the fact that we are dealing with multiple athletes per relay team, puts a whole new perspective on that.

Data

Unfortunately, I found that the data I used for non-team races to determine per shot if it was a hit or miss, was not usable for determining the same for relay races. What I need for relays is only available since the 2020-2021 season. To be more specific, what I need is tracking of which target is hit by which shot.

For example, since the 2020-2021 season, a shooting result can look like 57621. This means the alpha target was hit with the fifth shot, the bravo target with the 7th shot, charlie with the sixth shot, delta with the second shot and echo with the first shot (biathlon targets are usually listed from left to right, abcde). We can also conclude that the athlete shot right to left with a shooting order of hit, hit, miss, miss, hit, hit, hit and that the athlete used two reloads.
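The decoding described above can be written out as a short function. This is a sketch based only on the example in this article: it assumes a five-character code with one digit per target (alpha to echo, left to right), where each digit is the number of the shot that hit that target. Codes where a target stayed up are not covered, since the article does not say how those are recorded, and the function name is my own.

```python
TARGETS = ("alpha", "bravo", "charlie", "delta", "echo")

def decode_relay_shooting(code):
    """Decode a shooting code such as "57621" into the shot-by-shot
    order of hits and misses. Only handles shootings where all five
    targets went down (digits 1-8, all different)."""
    if len(code) != 5 or len(set(code)) != 5 or any(c not in "12345678" for c in code):
        raise ValueError(f"unsupported shooting code: {code!r}")
    hit_by_shot = {int(digit): target for digit, target in zip(code, TARGETS)}
    shots_fired = max(hit_by_shot)  # the athlete stops once the last target is down
    target_order = [hit_by_shot.get(shot) for shot in range(1, shots_fired + 1)]
    return {
        "target_order": target_order,  # None marks a shot that missed
        "outcomes": ["hit" if t else "miss" for t in target_order],
        "reloads": max(0, shots_fired - 5),
    }

example = decode_relay_shooting("57621")
```

For 57621 this gives hit, hit, miss, miss, hit, hit, hit with two reloads, and the first two targets to go down are echo and delta, matching the description above.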

Before the 2020-2021 season, this would have been tracked as 54321, meaning that every target (eventually) went down. But it would not be possible to determine which shot actually knocked the targets down and how many bullets (and reloads) were used. This wasn’t a problem for non-team races as there are no reloads, and thus always five shots only.

What this all comes down to is that I can only use data since the 2020-2021 season, which means we only have data for 24 races including the most recent ones in Oberhof 2022. So a total of 27,702 shots for all teams that finished a race.

Some quick and cool facts

Some initial findings after looking at the data are that the most common shooting is 54321, so shooting right to left and hitting all targets without reloads (15.3%). This is followed by 12345, shooting left to right and hitting all targets without reloads (8.8%). At the other end of the scale, one of the old-school shooting orders that starts with target charlie, the centre target, 32154, only happened 5 times (0.02%), all by Yurie Tanaka, a 33-year-old soldier from Japan.

To include or not include, that’s the question

As I write this I just saw the single mixed relay in Oberhof, and after watching that race I was debating whether I should include the data from the single mixed relay races or not. It follows the same pattern as all other team races, but because of the short loops and the fact that only two athletes participate, the shooting seems very chaotic and not very representative of the athletes’ abilities. Excluding them would remove another 4 races.

On the other hand, perhaps the Oberhof race was just an exception due to the conditions. And when averaging values out of 24 races, one race that may have been a little out of the ordinary will not have a major impact. So for now, I’ll be including the single mixed relays as well.

Matrix

So let’s look at all the shots (left to right) and shootings (top to bottom) and determine the hit rates for every combination. The bottom left square is shooting 8, shot 1 and the top right is shooting 1 and shot 8. The final three shots of every shooting are the manual reloads, and since not every athlete needs them, there are fewer of those. The total numbers are the averages for the respective rows and columns. Finally, the labels indicate the hit rate, also represented by the colour, total number of shots and number of hits in brackets.

One thing to remember is that all odd-numbered shootings are prone, and all even-numbered shootings are standing, and shots six, seven and eight are manual reloads.
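The matrix itself was built in Tableau. The same cells can be computed with a short aggregation, sketched here under the assumption that the data has been flattened to one record per shot as (shooting number 1-8, shot number 1-8, hit yes/no). The record layout and the sample values are made up for the example.

```python
from collections import defaultdict

def hit_rate_matrix(shots):
    """Aggregate (shooting, shot, hit) records into one cell per
    (shooting, shot): (hit rate, total shots, hits)."""
    totals = defaultdict(lambda: [0, 0])  # (shooting, shot) -> [hits, shots]
    for shooting, shot, hit in shots:
        cell = totals[(shooting, shot)]
        cell[0] += int(hit)
        cell[1] += 1
    return {key: (hits / n, n, hits) for key, (hits, n) in totals.items()}

# Hypothetical records, for illustration only.
records = [
    (1, 1, True),
    (1, 1, False),
    (1, 6, True),   # a reload shot in the first (prone) shooting
    (2, 1, True),
]
matrix = hit_rate_matrix(records)
```

Each value carries the same three numbers as the labels in the chart: the hit rate, the total number of shots and the number of hits.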

How do “regular” shots compare to reload shots?

One thing we can conclude from the matrix above is that the hit rates for the first five shots are better than the three reloads. One can argue this is not a fair comparison as there are fewer reload shots, but I think there are enough shots to see that they differ. Let’s look at the five regular shots first.

The only thing that stands out to me here is that the 7th and 8th shootings are not the worst for prone and standing shooting respectively. With regards to having more pressure on the last shooting, this doesn’t prove (or disprove) anything. But as mentioned before, the additional reloads put a different perspective on pressure. So let’s look at the last three shots for the eight shootings below.

Here we also see that the last prone and standing shootings are not the worst of the whole series. One highly important thing to remember is that the shooters are not the same for all shootings. Can we perhaps say something about tactics based on the four groups of two shootings with regards to the quality of shooters?

Let’s summarize this a little

If we aggregate the shots per shooting (so basically look at the totals per row of the two charts above) we could conclude that the first athlete is the best shooter, followed by the second shooter, then the fourth shooter, with the third shooter being the worst of them. Note that this also includes the single mixed relay as mentioned earlier, which only includes two athletes that each have double shooting responsibilities compared to full team races.

If we take out the individual athletes we can summarize this even further by summarizing the shootings, or columns. Fair to say is that Prone shooting has a higher hit rate than Standing (duh), and Reloads have a lower hit rate than the First 5. This is not surprising as the manual reload of every bullet breaks the rhythm and shooting position every time. I also think (but cannot prove based on the data) that there is some added pressure with Reloads, as you really want to avoid the penalty loop when you have three spares at this level.
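These roll-ups all follow the same pattern: label every shot, then average within each label. Below is a minimal sketch, assuming the same hypothetical (shooting, shot, hit) records as input. The leg mapping assumes a four-athlete relay with two shootings each, so it does not hold for single mixed relays, where two athletes share the eight shootings.

```python
from collections import defaultdict

def hit_rate_by(shots, key):
    """Hit rate per group, where `key(shooting, shot)` labels each shot."""
    totals = defaultdict(lambda: [0, 0])  # label -> [hits, shots]
    for shooting, shot, hit in shots:
        cell = totals[key(shooting, shot)]
        cell[0] += int(hit)
        cell[1] += 1
    return {label: hits / n for label, (hits, n) in totals.items()}

def position(shooting, shot):
    return "prone" if shooting % 2 == 1 else "standing"  # odd shootings are prone

def shot_type(shooting, shot):
    return "first 5" if shot <= 5 else "reload"  # shots 6-8 are manual reloads

def leg(shooting, shot):
    return (shooting + 1) // 2  # shootings 1-2 -> athlete 1, 3-4 -> athlete 2, ...

# Hypothetical records, for illustration only.
records = [
    (1, 1, True), (1, 6, False),
    (2, 1, True), (2, 2, False),
    (3, 1, True), (8, 7, True),
]
by_position = hit_rate_by(records, position)
by_type = hit_rate_by(records, shot_type)
by_leg = hit_rate_by(records, leg)
```

Passing a different key function, for example one that combines position and shot type, gives the finer breakdowns without changing the aggregation.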

These line charts show the same data as the left matrix above, with the first chart showing the “first 5” and Reload shots, and the second chart the Total column. In these charts, orange represents Standing shooting, and blue the Prone shooting. We can more clearly see that the third shooter is typically the worst of the four.

What if we do this for the top 5 (or 10, 15)?

As if we haven’t talked about enough variables yet that impact our possible conclusions, here is one more: so far we have only looked at data from all the athletes and races. Would this look different if we only looked at the top 5, top 10 or top 15?

Other than the hit rates going up from left to right as we limit ourselves to better athletes, the patterns remain the same. So from that perspective, my confidence is pretty high in saying that…

Conclusion

… also in the relay, a shot is not a shot like any other, especially between shooting from a cartridge/magazine and after loading by hand. But there are just too many factors to say something useful otherwise. And specifically anything about pressure or relay tactics, with only some hints about which shootings have the better athletes, and some of the lower reload hit rates potentially being caused by more pressure.

Do you think I could have come to any other conclusions based on the data and charts above? Please let me know on Twitter or Instagram!