real biathlon

Recent Articles

  • Most improved athletes this winter
  • New biathlon point system
  • Historic biathlon results create expectations. But what about points?
  • What do you expect? Practical applications of the W.E.I.S.E.
  • Introducing W. E. I. S. E: the Win Expectancy Index based on Statistical Exploration, version 1

Categories

  • Biathlon Media
  • Biathlon News
  • Long-term trends
  • Statistical analysis
  • Website updates

Archives

  • 2022
    • December
    • June
    • May
    • March
    • February
    • January
  • 2021
    • December
    • November
    • September
    • July
    • June
    • May
    • April
    • March
    • February
    • January
  • 2020
    • December
    • November
    • August
    • June
    • March
  • 2015
    • December
  • 2013
    • August
    • July
  • 2012
    • July

Author: biathlonanalytics

Proud dad&husband; analyst & visualization specialist (Tableau, SQL & R); creator of Biathlon Analytics; blog poster on realbiathlon.com; passionate about biathlon, cross country skiing and canoeing

Rallenta Lisa; the athlete with 2 faces

Posted on 2022-02-09 | by biathlonanalytics

Introduction

When watching her today, it is hard to imagine Lisa Vitozzi was fighting for the crystal globe just three seasons ago. She regularly shows flashes of fast skiing and good shooting. But unfortunately, those performances are paired with some very poor shooting. Her first shooting, in prone, has been especially weak, at just 41% this season. Yet when she shows up for the first shooting of a relay race, we see a completely different athlete, shooting 84%.

Of course, shooting in a relay is not the same as shooting in non-team events due to the additional three bullets. But although one could argue relay shooting becomes easier due to this fact, another argument can be made that athletes may take more risk, as they have three bullets to spare.

Having had four or more misses in her first shootings in the last six non-team events will have a major impact on her mindset. By now I can only imagine it’s the one thing she doesn’t want to think of, but will regardless, especially if she misses the first shot. But I wondered if there was more to it than just the mental aspect. Perhaps a different tactic has had an impact as well? In this article, I examine whether the data shows that there is more to Vitozzi’s struggles in the non-team events than her mental state alone.

Data

I started with data for the 2020-2021 season and the current season to date, including the Individual at the Olympic Games in Beijing. In this timespan, Vitozzi participated in 39 non-team events and 11 relays, for which I analyzed all shots in her first shootings. Well, not exactly all shots, as a shooting percentage for relays typically includes spare bullets that were used as well.

For relays, I only looked at the first five shots and calculated the shooting percentage for those five shots only, knowing that this is not a completely fair comparison due to the above-mentioned difference. Also, 11 relay races is a small sample size, but I chose not to go back further in time as this mostly appears to be an issue of the current season.

Goal

I was curious whether her skiing tactics play a role in her shooting problems. To be more precise, does she shoot worse because she pushes harder in the first lap of non-team events than in relays? After all, it is pretty common to see the first lap of a relay go at a pace that reminds us more of a warmup lap.

Since weather conditions, course profiles, types of snow and elevation all have a significant impact on skiing, I couldn’t just compare lap times. So I did the following: I looked at Vitozzi’s lap times in a race (based on course time) and compared the first lap to the average of all her laps in that race. This gave me an indication of her first lap being faster or slower than her average time on the course.
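
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual workflow behind the charts; the function name `first_lap_ratio` and the lap times are made up for the example.

```python
from statistics import mean


def first_lap_ratio(lap_course_times):
    """Return the first-lap course time as a percentage of the mean lap course time.

    Below 100 means the first lap was faster than the athlete's average lap
    in that race; above 100 means it was slower.
    """
    if not lap_course_times:
        raise ValueError("need at least one lap time")
    return 100.0 * lap_course_times[0] / mean(lap_course_times)


# Hypothetical course times in seconds for a three-lap race (not real data).
laps = [402.0, 410.0, 418.0]
print(f"First lap at {first_lap_ratio(laps):.1f}% of the average lap")
```

Because every lap is compared with the same athlete's own average in the same race, differences in snow, weather and course profile between venues largely drop out of the comparison.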

Visualization

The following chart shows all of Vitozzi’s races in the current season so far, represented by a coloured dot. They are ordered by date on the horizontal axis, with the older races on the left and the most recent on the right. The vertical axis shows how the course time of the first loop relates to the average course time of all loops. Below 100% is a faster loop, above 100% a slower one. The labels show her shooting percentage for the first shooting of that race.

A couple of things stand out: in the four out of five relays this season in which she started slower than her average, she shot 80% or better in her first shooting. In the one relay where she skied a faster first lap, she shot 60%. With a few exceptions, in the races where she starts slower, she hits four out of five. Where she starts faster, she misses between three and five shots.

Now if we look at all the races from the dataset, combined with the averages for discipline, the story the data tells is not much different.

In the relays, in which on average she starts her first lap slower than her average lap time, she shoots between 85 and 90%. But in all the other disciplines she starts faster (on average) than her average lap time and shoots worse.

Conclusion

It is clear that Vitozzi’s issue is going to take a lot of mental healing before we see any improvement for her on the range. And we need to be careful not to draw too firm a conclusion while using averages on small sample sizes. But considering that Vitozzi is probably looking for anything to help her right now, slowing down on her first lap may be another factor that can contribute to her getting back to the level we all know she is capable of. Rallenta Lisa!

Posted in Statistical analysis

Welcome to the Penalty Loop Podcast

Posted on 2022-02-04 | by biathlonanalytics

Hey everyone, a quick post here about a podcast that I (Biathlon Analytics) started with the creator of Penalty Loop. You can find our podcast on any of the major podcast players by searching for “THE PENALTY LOOP PODCAST”.

Episode one: Welcome to the Penalty Loop Podcast with Jordan Gottschalk and regular weekly guest RJ Weise. In this episode we’ll start off by introducing ourselves and what you can expect from this podcast going forward. We’ll move on to a brief overview of the IBU World Cup season to date (9:55), we’ll check out the Power Rankings (31:00), an Under the Radar athlete or two (43:50) and some Up and Coming Biathletes (48:15). We then discuss our Topics of the Week (56:00) where we discuss what we know about the Olympic venue we’ll be seeing over the next two weeks as well as a brief touch on wax prep. Finally, we finish up with a discussion, the Stat of the Week (1:07:15) which this week is about how we measure and discuss ski speed.

Posted in Biathlon Media | Tagged podcast

Cool ways to measure and display ski speed in biathlon

Posted on 2022-02-04 | by biathlonanalytics

Introduction

The ski speed in biathlon is measured and displayed in many different ways. This article makes an attempt to review those different ways and come up with a “best way” to display ski speed, as I believe that some of the current options make ski speed hard to understand. The goal is to have a clear measure and display unit that is understandable to both die-hard biathlon fans as well as the occasional biathlon watcher.

It is also important to know that I looked at these different options from both a per-race and a seasonal-average perspective.

Lastly, I want to emphasize that whatever I end up with, none of the options is bad or wrong, and I definitely don’t want to claim I have all the knowledge to make a final decision on what is best. This is my take, and I look forward to further discussing this topic.

Data

One way or another, all the data eventually comes from the IBU data center. Here, the IBU provides race times, ski times and course times in different formats, but all from the same source. The data is tracked and collected by Siwidata, using trackers that all athletes wear around their ankles and data collection points along the track.

Time data

The competition analysis report, available directly from the IBU data center, combines all the time information per athlete per loop and combined:

Course data

Also important for some calculations described further down in this article is that we have a total course length for the event, and this includes the skiing tracks as well as the range section:

Unfortunately, we do not have data for just the skiing part of the track versus the range part of the track. But when we take into consideration that there typically are 30 lanes of 2.75-3m width, and the range includes 10m on either end, we can guesstimate that the range length is about 110m long. This is a very small part of the total course length (even for a sprint it is 1%). And since we don’t have a better alternative, the total course length is used in the calculations for some of the measures further down in this article.
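
The guesstimate above is simple arithmetic, shown here as a small sketch. The lane count, lane widths and end margins are the figures from the paragraph; the function name is just for illustration.

```python
def range_length_m(lanes=30, lane_width_m=3.0, end_margin_m=10.0):
    """Estimate the length of the range section: all lanes plus a margin on either end."""
    return lanes * lane_width_m + 2 * end_margin_m


low = range_length_m(lane_width_m=2.75)   # narrow lanes
high = range_length_m(lane_width_m=3.0)   # wide lanes
print(f"Estimated range length: {low:.1f} m to {high:.1f} m")
```

With 3 m lanes this gives the 110 m used above; with 2.75 m lanes the estimate is slightly shorter.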

Course map

The following image clarifies what parts of the racetrack are considered course time, range time and penalty time:

It should be noted that a small section of the penalty loop overlaps with the “normal” track, which explains why every athlete will have a couple of seconds of penalty time per loop, even if they shoot clean. Important to note here is that for ski speed in biathlon, we use only the course time data.

Time data, part II

All races have data for the total race time and the total course time. All disciplines except the individual also have penalty time; the individual uses ski time instead:

  • Race time (above with the header “Finish”), typically called Loop time, is the total time from start to finish, including range and penalty time;
  • Ski time is the Race time minus the one-minute penalty times in individual races (other race disciplines do not have ski time data);
  • Course time is the total time skiing, excluding time on the range and in the penalty loop. This is the data used for measuring how fast the athletes were skiing.

Measurements

The following is a list of measurements based on the data described above, currently used in the “biathlon world”. I try to give some pros, cons and comments on each of them.

  • Seconds behind: absolute measure of the distance between athletes in seconds behind the fastest skier at the finish line
    • Good way to show actual time distance between the athletes at the finish
    • Doesn’t work well for aggregating multiple race disciplines as 10 seconds behind on the sprint is different than 10 seconds behind on the individual
  • Seconds behind leader per 1,000 meter: absolute measure of the distance between athletes in seconds behind the fastest skier per 1,000 meters
    • This normalizes the distance between athletes in seconds so it can be aggregated between multiple race disciplines
    • Uses total course length to calculate
    • Used by the IBU in biathlon information
  • Seconds behind per penalty loop: absolute measure of the distance between athletes in seconds behind the fastest skier per 150 meters, the length of a penalty loop
    • Same as previous but normalizes to a distance people, both die-hard and occasional fans, can easily relate to
    • Can be aggregated between multiple race disciplines
    • Numbers can get very small
    • Uses total course length to calculate
  • Per cent behind, or per cent back: relative measure of the distance between athletes in a percentage of the ski time of the fastest skier
    • Fastest athlete is always 0%
    • There is no fixed range for the percentages; the slowest athlete can be +33% in one race and +50% in another
  • Per cent of average ski speed (of the whole field): relative measure of the distance between athletes in a percentage of the average ski time of all skiers
    • Average ski time (0%) includes every athlete from best to worst
    • You don’t really know how fast or slow someone is, as you don’t know the minimal and maximum percentage values
    • Negative values (for example -3.4%) represent being faster than average, so a good result
  • Per cent of the average of top 5/10/30 skiers: relative measure of the distance between athletes in a percentage of the average ski time of the top 5/10/30 skiers
    • Average ski time can be focussed only on top 5/10/30 skiers
    • This can level the field when comparing sprints (over 100 athletes) to mass starts (30 athletes)
  • Meters behind leader: absolute measure of the distance between athletes in meters behind the fastest skier on the total course length at the finish line
    • Apparently the way Siwidata now measures for the IBU
    • Doesn’t work well for averaging multiple race disciplines as 10 meters behind on the sprint is different than 10 meters behind on the individual for example
  • Meters behind leader per penalty loop: absolute measure of the distance between athletes in meters behind the fastest skier per 150 meters, the length of a penalty loop
    • Same as previous but normalizes to a distance people, both die-hard and occasional fans, can easily relate to
    • Can be aggregated between multiple race disciplines
    • Numbers can get very small
    • Uses total course length to calculate
  • Ski speed: absolute measure of the difference between athletes in kilometres per hour
    • Doesn’t say much about how the speed relates to time difference between athletes
    • Has different effect on race depending on the race distance
  • Ski speed rank: absolute measure of the difference between athletes in the rank in ski speed
    • Fastest is rank 1, slowest is the highest number
    • Downside is that it loses the actual distance between racers. One can be the third ranked skier by two seconds behind the leader or two minutes behind the leader
  • Zscore: computed measure of the difference between athletes in the number of standard deviations by which course times are above or below the mean, based on seconds behind from median of all athletes
    • Can be hard to relate to by all fans
    • Although a precise value, only gives a general sense of someone being faster or slower than the field mean
    • Negative values represent being faster than average, so a good result
  • Course time: absolute measure of the difference between athletes in the actual course times
    • Gives a good idea of how long the athletes took to ski the track
    • Need to calculate the actual differences
  • Time behind score: relative measure of the difference between athletes on a 0-100% range, where fastest athlete is 100% and slowest is 0%
    • Further explanation can be found in this post
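
To make the list above more concrete, here is a rough Python sketch that derives most of these measures from course times and the total course length. The formulas are my reading of the descriptions, not official IBU or Siwidata definitions. In particular, the meters-behind calculation (distance still to ski at the athlete's own average speed when the leader finishes), the Z-score around the field mean, and the linear 0-100 scaling of the time behind score are assumptions, and the athletes and times are invented.

```python
from statistics import mean, pstdev

PENALTY_LOOP_M = 150.0  # penalty loop length used in the list above


def ski_speed_measures(course_times, course_length_m):
    """course_times maps athlete name -> total course time in seconds."""
    times = list(course_times.values())
    fastest, slowest = min(times), max(times)
    avg, sd = mean(times), pstdev(times)
    loops = course_length_m / PENALTY_LOOP_M  # course length in penalty loops
    km = course_length_m / 1000.0
    measures = {}
    ranked = sorted(course_times, key=course_times.get)
    for rank, name in enumerate(ranked, start=1):
        t = course_times[name]
        behind_s = t - fastest
        # Assumption: distance left to ski when the fastest skier finishes.
        behind_m = course_length_m * (1 - fastest / t)
        measures[name] = {
            "rank": rank,
            "seconds_behind": behind_s,
            "seconds_behind_per_km": behind_s / km,
            "seconds_behind_per_penalty_loop": behind_s / loops,
            "percent_behind": 100 * behind_s / fastest,
            "percent_of_average": 100 * (t - avg) / avg,
            "meters_behind": behind_m,
            "meters_behind_per_penalty_loop": behind_m / loops,
            "speed_kmh": km / (t / 3600),
            # Assumption: standard deviations from the field mean.
            "z_score": (t - avg) / sd if sd else 0.0,
            # Assumption: linear scale, fastest = 100, slowest = 0.
            "time_behind_score": (
                100 * (slowest - t) / (slowest - fastest) if slowest > fastest else 100.0
            ),
        }
    return measures


# Invented course times (seconds) on a hypothetical 10 km course.
example = ski_speed_measures({"A": 1500.0, "B": 1530.0, "C": 1600.0}, 10_000)
for name, values in example.items():
    print(name, {key: round(value, 2) for key, value in values.items()})
```

Running this on a single race shows how the same gaps look under each unit: the absolute measures grow with race distance, while the per-kilometre, per-penalty-loop and relative measures do not.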

Ways of communicating

Tables

All measures can be shown in a table, which provides a detailed overview of the race results per that specific measure. They are great for looking up specific athletes and giving the exact numbers, but they are harder to interpret quickly and to envision the fastest skiers and how far they are from each other.

Charts

Charts, on the other hand, are easy to interpret quickly while still providing detailed information per individual athlete (especially when created interactively), and context (see example at end of article) to make it even easier to read.

Description/talking

Probably the most complicated but least considered aspect of communicating ski speed is how it can be discussed. What does it mean when someone says Laegreid was -2.6% from the average, Latipov was 13 seconds behind, Boe was 16 meters behind, Lesser was 3% back, and so on? For die-hard biathlon fans who are “into data”, this may make sense. For casual fans and those who just love to watch biathlon, I believe this way of describing ski speed is not very meaningful or useful.

Rankings make it clear who was faster and who was slower, but not by how much. Measures per a relatable distance, like a penalty loop, are easy to understand and visualize for biathlon fans at all levels. They don’t even need to know the actual distance of a penalty loop!

Conclusion

When we take a look at most of the measurements in chart format, we can see that the majority show the same data but just with a different axis and units (the red, orange and yellow icons indicate rank 1, 10 and 30 respectively). And with all charts being equal we can decide which unit of measurement would be the easiest to communicate to all types of biathlon fans while having the ability to aggregate the data for a whole season.

I already mentioned that although Seconds behind, Meters behind leader, and Course time are easy to relate to, they are not ideal for aggregation as race disciplines have different distances. The opposite is the case for Per cent behind, Per cent of average ski speed, and Zscore, as they aggregate well but are not so easy to relate to for all fans. The Ski speed in km/h is cool in the sense that it makes you realize how fast these athletes go. But it doesn’t say much about the end result (difference between athletes) and it would be hard to aggregate.

Ski speed rank shows a different picture of the data when we put it in a chart. The ski speed rank is easy to relate to and very clear to communicate and aggregate, but it loses the information about the space between athletes.

For both single races and season aggregations, the Meters behind leader per penalty loop is a measure that is easy to understand for any biathlon fan, can be aggregated for a whole season with different race disciplines as it is normalized to a specific distance (150 meters), and it keeps the information on space between athletes intact. And on top of that, it can be visualized in cool ways.

It is also very similar to what the IBU currently uses (seconds behind per 1km) but I think that meters behind are easier to visualize mentally and understand than seconds behind and that it is good to use a distance people can relate to directly. The only downside to using the 150-meter loop is that athletes do appear very close to each other.

What is your preferred way of measuring ski speed? And do you agree or disagree with my comments above? Let’s have a conversation on Twitter, I’d love to hear other people’s perspectives.

Posted in Statistical analysis | Tagged ski speed

Has the field gotten narrower?

Posted on 2022-01-21 | by biathlonanalytics

Introduction

After listening to an episode of Doppelzimmer, a German podcast in which Erik Lesser and Arnd Peiffer talk about biathlon, I got curious about whether the field has gotten narrower between biathlon nations in the last two decades. Erik and Arnd were talking about this and saying that people always mention the field is getting narrower, but that it would be interesting to do some analysis to see if this is actually true. This is my analysis on that topic.

Data

For this research I used all Men’s Relay races in the IBU World Cup, World Championships and Olympic Games since the 2000-2001 season, ending at the 5th event of the current 2021-2022 season. I removed all nations that did not start, did not finish, got lapped, etc. Then I converted times from hours, minutes and seconds into seconds and validated the data, as some years had poor data quality in some fields.
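
As an illustration of the clean-up step, the sketch below converts result strings into seconds and drops teams without a valid finish time. The input layout (nation, result text) and the status markers in the example are assumptions for the sake of the example, not the actual format of the source data.

```python
def to_seconds(text):
    """Convert 'H:MM:SS.s', 'MM:SS.s' or 'SS.s' into seconds."""
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected time format: {text!r}")
    seconds = 0.0
    for part in parts:
        seconds = seconds * 60 + float(part)
    return seconds


def finishing_times(results):
    """Keep only (nation, seconds) pairs for results that parse as a time.

    Entries such as 'DNS', 'DNF' or 'LAP' (assumed markers) fail to parse
    and are skipped.
    """
    cleaned = []
    for nation, result in results:
        try:
            cleaned.append((nation, to_seconds(result)))
        except ValueError:
            continue
    return cleaned


# Invented rows, not real results.
rows = [("NOR", "1:12:34.5"), ("FRA", "1:13:01.2"), ("XXX", "LAP"), ("YYY", "DNF")]
print(finishing_times(rows))
```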

Measures

Then the question was how to measure the narrowing of the field in biathlon. I took a two-sided approach on this: for one I looked at how many seconds the 15th ranked team was behind the eventual winners of the race. This should give me a good idea of how much the weaker teams are behind the top team, expressed in time. The other approach was to see how many teams finished within 5 minutes of the winning team. This gave me another look at how many teams could be considered stronger teams.

The 15th rank and the 5 minutes are variables that can be debated forever. But the reasoning for the 15th team is that in many cases there weren’t that many more teams in the race that finished. The 5 minutes is an arbitrary number I decided on after spending some time going through the data and looking at the distance in time between the better teams of the time.
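
Both measures are easy to express in code. The sketch below assumes a list of finish times in seconds for the teams that finished one race; the function names are hypothetical, and the choice to include the winner in the count is my assumption. The rank and the time window are parameters, so other values can be tried.

```python
def gap_at_rank(finish_times, rank=15):
    """Seconds between the winner and the team at the given rank.

    Returns None when fewer than `rank` teams finished.
    """
    ordered = sorted(finish_times)
    if len(ordered) < rank:
        return None
    return ordered[rank - 1] - ordered[0]


def teams_within(finish_times, window_s=300.0):
    """Number of teams finishing within `window_s` seconds of the winner (winner included)."""
    if not finish_times:
        return 0
    best = min(finish_times)
    return sum(1 for t in finish_times if t - best <= window_s)


# Invented finish times in seconds for a race with five finishers.
race = [4350.0, 4400.0, 4500.0, 4660.0, 4900.0]
print(gap_at_rank(race, rank=3), teams_within(race, window_s=300.0))
```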

Analysis

As one would expect, due to different venues, weather conditions, team lineups and other factors, the results are kind of up and down from one race to the next.

Seconds behind lead for rank 15
Nations within 5 minutes of winning team

Although we can kind of see some vague hints of a trend in the second chart, it really doesn’t show it at a level that would make me comfortable to claim a trend exists.

Luckily we can use the moving average function. For every race, this takes the race’s result plus the previous 10 races (about two seasons worth) and averages them. This gives a clearer picture and a better idea of the trends over the last two decades.
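
A trailing moving average of this kind can be sketched as follows. The charts were not necessarily produced this way; in particular, how the first few races (with fewer than 10 predecessors) are handled is an assumption. Here they are simply averaged over whatever races are available.

```python
def trailing_average(values, previous=10):
    """Average each value with up to `previous` values before it, in race order."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - previous): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Invented seconds-behind values for consecutive races.
gaps = [520.0, 480.0, 610.0, 450.0]
print(trailing_average(gaps, previous=1))
```

A longer window smooths out more of the race-to-race noise, at the cost of reacting more slowly to a real change.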

Results

The first chart, about the time behind the leaders for teams ranked 15th, shows that despite some waves going up and down, over time the time behind the leaders has slowly but steadily decreased, from about 500 seconds to between 250 and 300 seconds. That means the 15th ranked teams have gotten 3 to 4 minutes closer in the last 20 years.

The second chart, with the number of nations within 5 minutes from the winning team, shows an even more wavy pattern. Since about 12-15 seasons ago, the general number of teams has been going up, ranging between 13 and 16, but from there it seems to have stabilized.

Overall, I think the “smaller” biathlon nations are getting closer to the leaders of the pack, but the number of top nations appears to have stabilized in the last 5-10 years. What do you think? Would other ranks and seconds from the winner values be better to use for this analysis? Please let me know on Twitter or use the interactive version of this chart and see for yourself!

Posted in Long-term trends, Statistical analysis

And we continue our Shot data analysis

Posted on 2022-01-15 | by biathlonanalytics

I did some further analysis looking at the shooting data, based on more feedback and discussion on Twitter yesterday.

Again, I started with the 5 seasons of shooting data, men and women, with the current season up to the 5th event. I then filtered for Mass Starts and Pursuits, and did some data cleaning, leaving 96,580 shots for the analysis, not limiting the data by certain ranks only.

Question 1

Aldo Ramos on Twitter wondered if we could determine the percentages of athletes that would go clean on the first 15 shots, and then had one miss. And where did that miss happen?

After moving the data back and forth between Tableau and Google Sheets, I was able to show what percentage of athletes clean on the first 15 missed their 16th target, or went clear on the first 16 and missed the 17th target, etc.

For clarity’s sake, the T16 column includes MHHHH, MMHHH, MMMHH, etc. The question was really about at what shot the “clean spell” is broken. As athletes pick up more misses as the last shooting goes on, the percentages go down. The exception is going clean on all 20, which 41% of the athletes who were clean after 15 achieve.
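
The grouping described above was done in Tableau and Google Sheets; the same logic in Python might look like the sketch below. It assumes each athlete who was clean after 15 shots is represented by a five-character string for the last shooting, with H for a hit and M for a miss. The sequences are invented.

```python
from collections import Counter


def clean_spell_break(last_five, first_shot_number=16):
    """Return the column label for the shot where the clean spell ends.

    'MHHHH' and 'MMHHH' both give 'T16'; 'HHHHH' gives 'Clean'.
    """
    index = last_five.find("M")
    return "Clean" if index == -1 else f"T{first_shot_number + index}"


def break_percentages(sequences):
    """Percentage of athletes per break point."""
    counts = Counter(clean_spell_break(s) for s in sequences)
    return {label: 100 * n / len(sequences) for label, n in counts.items()}


# Invented last-shooting sequences for athletes who hit their first 15 shots.
sample = ["HHHHH", "MHHHH", "MMHHH", "HHHHM", "HHHHH"]
print(break_percentages(sample))
```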

Question 2

Bjorn then asked on Twitter how many athletes missed more than one shot in the last shooting.

The chart above shows all sequences of the last 5 shots. As we saw above, 41% go clean all the way. After that, the most common occurrence is missing the 16th shot, then missing shot 20 with 8.44%, etc.

But it doesn’t really clearly answer Bjorn’s question of how many misses in the last five were shot. The chart below answers that question more clearly:

Just over 40% went clear, and the same percentage had one miss. The group with two misses was 15%, three misses 2.5% and just over 1% missed four times. No one in the group that hit the first 15 shots had 5 misses in the last shooting.
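
Counting misses per athlete in the last shooting is a small variation on the same idea. As before, this is a sketch that assumes five-character H/M strings per athlete; the data is invented.

```python
from collections import Counter


def miss_distribution(sequences):
    """Percentage of athletes with 0 to 5 misses in the last shooting."""
    counts = Counter(s.count("M") for s in sequences)
    total = len(sequences)
    return {misses: 100 * counts[misses] / total for misses in range(6)}


# Invented last-shooting sequences for athletes who hit their first 15 shots.
sample = ["HHHHH", "HHHHH", "MHHHH", "HHMHH", "MMHHH"]
print(miss_distribution(sample))
```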

Posted in Statistical analysis
