Tennis' Data Crunch
A few days ago, one of my co-workers sent me an article about the lack of data analytics and stats in tennis in comparison with other sports like baseball. She asked if it was something I thought was interesting. My several-paragraphs-long response was a definitive "yes," and this post is the end result of a few days of mulling the subject over.

My initial reaction to the article was one of resounding agreement. As a humble part-time tennis blogger, I frequently find myself digging through the freely available data on the ATP, WTA, and tournament websites in search of fun stats. While there is a fair amount of data out there, it isn't always meaningful and it isn't always in a useful format. So, when I want to see Novak Djokovic's record against left-handed players in World Tour Masters 1000 events played on hard courts, I typically have to pull and combine several separate data sets myself to get exactly what I need.
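
To give a rough idea of what that combining looks like, here is a minimal Python sketch. Everything in it is an assumption for illustration: the player and event names are placeholders, the results are made up, and the field names (`hand`, `level`, `surface`) are not taken from any official ATP or WTA data format.

```python
# Three separate, made-up data sets that have to be joined by hand.
# All names, fields, and results here are hypothetical placeholders.
players = {
    "Player A": {"hand": "R"},
    "Player B": {"hand": "L"},
    "Player C": {"hand": "R"},
    "Player D": {"hand": "L"},
}

tournaments = {
    "Event 1": {"level": "Masters 1000", "surface": "Hard"},
    "Event 2": {"level": "Masters 1000", "surface": "Clay"},
    "Event 3": {"level": "Grand Slam", "surface": "Hard"},
}

matches = [
    {"tournament": "Event 1", "winner": "Player A", "loser": "Player B"},
    {"tournament": "Event 1", "winner": "Player D", "loser": "Player A"},
    {"tournament": "Event 2", "winner": "Player A", "loser": "Player B"},
    {"tournament": "Event 3", "winner": "Player A", "loser": "Player D"},
    {"tournament": "Event 1", "winner": "Player A", "loser": "Player C"},
]


def record_against(player, opponent_hand, level, surface):
    """Return (wins, losses) for `player` against opponents of the given
    handedness, restricted to events of the given level and surface."""
    wins = losses = 0
    for match in matches:
        # Skip matches the player did not take part in.
        if player not in (match["winner"], match["loser"]):
            continue
        # Look up the event details in the tournament data set.
        event = tournaments[match["tournament"]]
        if event["level"] != level or event["surface"] != surface:
            continue
        # Look up the opponent's handedness in the player data set.
        opponent = match["loser"] if match["winner"] == player else match["winner"]
        if players[opponent]["hand"] != opponent_hand:
            continue
        if match["winner"] == player:
            wins += 1
        else:
            losses += 1
    return wins, losses


wins, losses = record_against("Player A", "L", "Masters 1000", "Hard")
print(f"{wins}-{losses}")
```

The logic itself is simple. The real work is getting the match results, player details, and tournament details into a consistent shape in the first place.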

However, after re-reading the article and thinking about tennis data more and more, I began to change my view a bit. I realized that the issue might be less about the existence of data, and more about the availability and format of said data. After all, TV commentators are always spouting off statistics about a player's serve holding percentage going up when they win the first point of the game, and there are sometimes even fancy graphics showing how far each player has run during the match. So, that data has to be somewhere, right? A quick Google search on "tennis analytics" seemed to confirm my hunch and revealed a few interesting results at the same time.
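
A stat like the hold percentage after winning the first point is easy to compute once game-level data exists. The Python sketch below assumes a hypothetical record per service game with a `first_point_won` flag and a `held` flag; the numbers are made up, and this is not how the broadcasters actually produce their figures.

```python
# Made-up service games for one hypothetical player.
# "first_point_won": the server won the opening point of the game.
# "held": the server went on to win the game.
service_games = [
    {"server": "Player A", "first_point_won": True, "held": True},
    {"server": "Player A", "first_point_won": True, "held": True},
    {"server": "Player A", "first_point_won": True, "held": True},
    {"server": "Player A", "first_point_won": True, "held": False},
    {"server": "Player A", "first_point_won": False, "held": True},
    {"server": "Player A", "first_point_won": False, "held": False},
]


def hold_percentage(games, server, first_point_won):
    """Percentage of service games held by `server`, counting only games
    where the first point went the way `first_point_won` says."""
    subset = [
        g for g in games
        if g["server"] == server and g["first_point_won"] == first_point_won
    ]
    if not subset:
        return None  # No games match, so there is nothing to report.
    held = sum(1 for g in subset if g["held"])
    return 100.0 * held / len(subset)


print(hold_percentage(service_games, "Player A", True))
print(hold_percentage(service_games, "Player A", False))
```

With real point-by-point data in place of the made-up games, the same comparison would show whether the commentators' claim holds up for a given player.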

I found some very interesting blog posts on GameSetMap about leveraging HawkEye data in match analysis. HawkEye technology is best known as the electronic line-calling system used on the pro tours for call reviews, but the system's design also allows it to track a wealth of additional information. Using the ball-tracking cameras, HawkEye can collect information about player movement speed, distance, hit point and more.

I also found a post on a separate blog that reminded me of IBM's data partnership with the Grand Slam events and SAP's data partnership with the WTA. Both firms collect loads of data during matches, which is then used for in-match graphics and post-match summaries.

Unfortunately, in all three cases, the data is simply not available to the public at this time. The author of GameSetMap did manage to wrangle a few matches' worth of HawkEye data, but only with the permission of the tournament and lots of legwork. Hopefully the owners of the data and the professional tours are putting it to use, but beyond that, it seems destined to be locked in a vault away from prying eyes.

The good news is that there are some faint bright spots when it comes to tennis data and analytics. Babolat introduced the first ever connected racquet and app in late 2013. They've already released several more connected Play models since. Wilson, HEAD, Prince, and Yonex are getting in on the connected racquet movement as well. All four brands have partnered with Sony to make their new racquets compatible with the Sony Smart Tennis Sensor.

For now, those technologies are mostly targeted at amateur players. That said, the Sony Smart Tennis Sensor is ITF Approved, so pros could start using it as soon as it's available, and Babolat Play technology is already in use on the pro tour by the likes of Rafael Nadal. There's no telling if or when any pros' data will become available from either piece of technology, but I'll hold out hope, even if I won't hold my breath.

Now, by no means do I or most of the tennis-loving public have the statistical acumen to draw any real conclusions from heaps of raw tennis data. So, aside from making my life as a blogger and tennis fan a little easier and more interesting, why would I be so interested in seeing more raw data made available? Well, for my own interest, I'd enjoy just playing around with the data: plugging it into some computer programs and seeing what there is to see. But even more than that, I'd love to see what some of the really smart people in the world could do with that much data. Is there some hidden gem of insight just waiting in the data that could totally change our perception of today's game?

For now, we'll just have to make do with the stats we have and wait to see what the future might hold!