In the third issue of Puck Possessed Biathlon, I want to look at the influence of things like weather and snow conditions, as well as course information. This is all summarized in reports made available on the https://biathlonresults.com/ website as Final Results – Competition Data Summary:
From this report, I used the measurements provided, except for the measurement taken half an hour before the race, as it doesn’t seem that relevant. Also, all these measurements should be taken with a grain of salt: it is unclear how accurately they are measured, they come from a single measurement location, and some “measurements” are qualitative. In addition, I tried my best to find a general elevation for the biathlon stadiums using Google Earth, so that data quality is also limited. Lastly, working only with the data I have, I had to make some assumptions. For example, I realize that a maximum climb right before the shooting range makes a course harder than one right after the stadium. I tried looking into course profiles, but they are surprisingly hard to get (in a useful format).
To make all this data a bit easier to work with, I created a number of categories or indexes based on similar/related measurements, rather than using all data individually:
- Wind strength (using the maximum value of the Wind Direction/Speed row);
- Wind direction variability (the maximum difference in degrees between the three measured wind directions);
- Wind strength variability (difference between minimum and maximum).
- Weather description (qualitative) is typically the same during the race, with a few exceptions (two out of 25 at the time of writing). I grouped values that are very similar in terms of visibility into categories:
- Clear sky & Sunny
- Cloudy, Low-level cloud, Partly cloudy
- Light rain, Light snow, Light snowfall and Rain
- Heavy snow & Snow
- Total Course Length;
- Height Difference;
- Maximum Climb;
- Total Climb;
- Snow of the track.
- Air Temperature. Even though it varies, I don’t see how this could have an impact on performance, especially since events get cancelled when the temperature drops below a value where it could impact shooting. Note that I am aware that temperature impacts the tracks, but I think that is better measured by using Snow temperature;
- Humidity. I tried to find any correlation between humidity and shooting performance but was unable to, leading to the conclusion that humidity by itself has no impact on shooting performance. Of course humidity is related to precipitation, but that aspect is covered in the Weather section.
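The three wind indexes and the weather grouping listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the group labels and the sample readings are made up, and measuring direction differences "around the compass" (so that 350° and 10° are 20° apart) is my assumption, not necessarily how the values in this post were calculated.

```python
def angular_diff(a, b):
    """Smallest angle in degrees between two compass directions."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def wind_features(readings):
    """readings: list of (direction_degrees, speed) tuples, one per measurement."""
    directions = [direction for direction, _ in readings]
    speeds = [speed for _, speed in readings]
    return {
        # Wind strength: the maximum measured speed
        "strength_max": max(speeds),
        # Wind strength variability: maximum minus minimum speed
        "strength_diff": round(max(speeds) - min(speeds), 1),
        # Wind direction variability: largest pairwise difference in degrees
        "direction_diff": max(
            angular_diff(a, b) for a in directions for b in directions
        ),
    }


# Grouping of weather descriptions by visibility (group labels are hypothetical)
WEATHER_GROUPS = {
    "Clear sky": "clear", "Sunny": "clear",
    "Cloudy": "cloudy", "Low-level cloud": "cloudy", "Partly cloudy": "cloudy",
    "Light rain": "light_precipitation", "Light snow": "light_precipitation",
    "Light snowfall": "light_precipitation", "Rain": "light_precipitation",
    "Heavy snow": "snow", "Snow": "snow",
}

# Made-up example: three measurements taken during one race
features = wind_features([(350, 1.2), (10, 2.5), (30, 0.8)])
print(features)
```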
Now the question is how to measure shooting performance. The obvious measurement is the number of shots missed, but I don’t want to ignore shooting times. For example, if athlete A has no misses but takes 30 seconds longer to shoot than athlete B, who may have one miss, that still says something about the shooting performance of A compared to B. I also considered including range time, but I consider that to be more related to ski performance. So for this exercise I am using Shooting Times and Penalty Times (in seconds), as the latter are directly related to misses and allow for combining misses with shooting speed.
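To make that definition concrete, here is a minimal sketch, assuming the indicator is simply the sum of all shooting times and penalty times in seconds. The function name and all numbers are invented for illustration.

```python
def shooting_performance(shooting_times, penalty_times):
    """Seconds spent shooting plus seconds lost to penalties (lower is better)."""
    return sum(shooting_times) + sum(penalty_times)


# Athlete A shoots clean but slowly; athlete B is 30 seconds faster on the
# range but has one miss (the 24-second penalty is a made-up value).
athlete_a = shooting_performance([45.0, 47.0], [0.0, 0.0])
athlete_b = shooting_performance([30.0, 32.0], [0.0, 24.0])
print(athlete_a, athlete_b)
```

With these invented numbers, athlete B still comes out ahead despite the miss, which is exactly the kind of difference that counting misses alone would hide.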
Next step is indexing the different categories, starting with Wind. Let’s look first at the correlation between the different wind factors and shooting performance as described above:
This tells me that the biggest correlation (and most reliable) is the wind strength, and that both strength and direction variability are not significant:
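For readers who want to reproduce this kind of check outside a charting tool, the strength of a linear relationship can be sketched with a hand-written Pearson correlation. This is a minimal illustration with made-up data and hypothetical names; it returns r and R² only and does not compute the p-value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of the variance in ys explained by a straight line through xs."""
    return pearson_r(xs, ys) ** 2


# Made-up event averages: maximum wind strength vs. shooting performance (s)
wind_max = [0.5, 1.0, 2.0, 3.5, 4.0]
performance = [70.0, 74.0, 73.0, 85.0, 90.0]
print(pearson_r(wind_max, performance), r_squared(wind_max, performance))
```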
Let’s dig a little deeper here. Although on its own the maximum wind speed may have the most (and only) impact, how about the combination of wind speed, speed variability and direction variability?
The following charts show there is actually an almost 70% correlation between wind strength variability and maximum strength (and none at all with direction variability):
So we’ll need to look at combinations of maximum wind speed and change in speed. Logically it makes sense too. Even if the wind changes direction, if the wind is not very strong it won’t have much of an impact. But variable wind speeds, especially with some strong gusts, are tough to adjust to.
Now how about visibility? That becomes a bit more complicated, or less objective, as we don’t have measures for visibility, but rather subjective observations. Let’s look at the number of athletes with a specific number of misses per race per season, and relate that to the weather description:
This gives me some indication of which shooting conditions are good, and which ones are less preferable. Let’s simplify this a bit more, by assuming a solid shooting performance is two misses or less; anything more and you are typically out of the race for gold (except when you have exceptional ski speed):
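The "two misses or less" simplification boils down to a share per weather description. A minimal sketch, assuming one (weather description, misses) pair per athlete per race; the function name and the sample rows are invented:

```python
from collections import defaultdict


def solid_share_by_weather(results, max_misses=2):
    """Share of performances with at most max_misses, per weather description.

    results: iterable of (weather_description, misses) pairs.
    """
    totals = defaultdict(int)
    solid = defaultdict(int)
    for weather, misses in results:
        totals[weather] += 1
        if misses <= max_misses:
            solid[weather] += 1
    return {weather: solid[weather] / totals[weather] for weather in totals}


# Made-up sample rows, not actual race data
sample = [
    ("Sunny", 0), ("Sunny", 1), ("Sunny", 3), ("Sunny", 2),
    ("Light rain", 4), ("Light rain", 1),
]
shares = solid_share_by_weather(sample)
print(shares)
```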
Based on all this information (and knowingly ignoring other factors that contribute to these numbers), I’m going to state that Clear sky, Sunny, Cloudy, Light snowfall and Rain typically lead to solid shooting performances, with well over 70% of all athletes having 2 misses or less, whereas Partly cloudy, Snow, Heavy snow, Light rain, Light snow and Low-level cloud lead to lesser shooting performances. Partly cloudy, Light snow and Light rain appear to be the worst conditions.
That leaves us with the course conditions. Other than Total Climb in meters (which is still not statistically significant, with a p-value of 0.06), none of the course condition factors show any correlation to shooting performance (defined as shooting and penalty times), with p-values over 0.7 and R2-values lower than 0.005:
These charts look at event averages, but looking at individual athlete shooting performances the results are very similar:
Although it is hard to imagine course conditions having no indirect impact on shooting performance (many steep climbs, especially before entering the stadium, or wet, slow snow which makes the athletes work harder, etc.) I’m going to assume there is no direct impact on shooting performance. But that would be an interesting analysis for a future edition of Puck Possessed Biathlon for sure.
So in summary, we are going to index or score wind influence and visibility influence. Based on the information we gathered so far, I’m going to score visibility as follows:
Clear sky, Sunny, Cloudy, Light snowfall and Rain = Good
Snow, Heavy snow and Low-level cloud = Medium
Partly cloudy, Light snow and Light rain = Bad
For wind, the scoring rule (written as a Tableau-style calculated field) is:
IF [WindStrengthMAX (copy)] >= 2 AND [WindStrengthDiff] >= 1.2 THEN "Bad"
ELSEIF [WindStrengthMAX (copy)] >= 2 AND [WindStrengthDiff] < 1.2 THEN "Medium"
ELSEIF [WindStrengthMAX (copy)] < 2 AND [WindStrengthDiff] >= 1.2 THEN "Medium"
ELSE "Good"
END
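The same wind rule can be written as a small Python function, which makes it easy to test the thresholds. This is a sketch: the function and parameter names are mine, and treating the remaining case (maximum strength below 2 and strength difference below 1.2) as "Good" is an assumption that follows from the three categories used in this post.

```python
def wind_category(strength_max, strength_diff,
                  max_threshold=2.0, diff_threshold=1.2):
    """Classify wind conditions as "Good", "Medium" or "Bad"."""
    strong = strength_max >= max_threshold
    gusty = strength_diff >= diff_threshold
    if strong and gusty:
        return "Bad"
    if strong or gusty:
        return "Medium"
    # Assumption: neither strong nor variable wind counts as "Good"
    return "Good"


print(wind_category(2.5, 1.7))  # strong and variable
print(wind_category(2.5, 0.4))  # strong but steady
print(wind_category(1.0, 0.3))  # calm and steady
```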
Now we can assign values to good, medium and bad (1, 2 and 3) and create an External Factor Index, which we can then try to measure up against the Shooting Performance indicator described earlier:
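As a rough sketch of how such an index could be built: the scores 1, 2 and 3 and the visibility categories come from this post, but adding the wind score and the visibility score together is my assumption for illustration, as are all names in the code.

```python
SCORES = {"Good": 1, "Medium": 2, "Bad": 3}

# Visibility category per weather description, as listed in this post
VISIBILITY = {
    "Clear sky": "Good", "Sunny": "Good", "Cloudy": "Good",
    "Light snowfall": "Good", "Rain": "Good",
    "Snow": "Medium", "Heavy snow": "Medium", "Low-level cloud": "Medium",
    "Partly cloudy": "Bad", "Light snow": "Bad", "Light rain": "Bad",
}


def external_factor_index(wind_cat, weather_description):
    """Combine wind and visibility scores (assumption: a simple sum)."""
    return SCORES[wind_cat] + SCORES[VISIBILITY[weather_description]]


print(external_factor_index("Good", "Sunny"))      # best conditions
print(external_factor_index("Bad", "Light rain"))  # worst conditions
```

With a simple sum the index runs from 2 (good wind, good visibility) to 6 (bad wind, bad visibility).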
All in all a lot of work to come to the conclusion that there is a correlation between our defined Shooting Performance, and the External Factor Index, mostly based on wind and weather: the P-value is 0.0041 and thus significant, and the R2-value is 0.295.
As I am sure you have figured out if you got this far, my statistical knowledge is limited. But I would say that, based on all the assumptions made above, roughly 30% of the variation in shooting performance is explained by the weather conditions mentioned above.
Of course this research can use a lot of improvement. For example rather than comparing average shooting performances per event, look at standardized shooting performances. And the External Factor Index is based on a number of assumptions that are, to say the least, arbitrary. But the exercise was fun, and I believe I learned a lot more about the data of women’s biathlon sprint races.
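One possible way to standardize shooting performances, sketched here as an assumption rather than a worked-out method, is to turn each athlete's value into a z-score within their own event, so that events with different conditions become comparable. The function name and the numbers are made up.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-scores: distance from the event mean, in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Everyone performed identically; no spread to standardize against
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


# Made-up shooting performances (seconds) for three athletes in one event
event = [80.0, 90.0, 100.0]
z_scores = standardize(event)
print(z_scores)
```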
If you have any feedback or comments, please reach out on Twitter: @rjweise
About Post Author
Proud dad&husband; analyst & visualization specialist (Tableau, SQL & R); creator of Biathlon Analytics; blog poster on realbiathlon.com; passionate about biathlon, cross country skiing and canoeing