KEVIN GOODFELLOW: SPORTS DATA MANAGEMENT – A KEY TO ANALYTIC ADVANTAGES

Following several presentations at sports conferences on the components of elite sports analytics, Kevin Goodfellow, founder of Sports Data Hub, outlines the key challenges of Data Management, with some thoughts from three top professionals in American sports.

Data Management is the fundamental piece of the sports analytics process on which everything else is built. It is the science of acquiring, moving, storing, organising, cleaning, and connecting fragmented data into a form that enables deeper analysis, and as sports analytics has evolved over the years this work has become increasingly difficult. Today the top teams face three major challenges that strain existing Data Management processes and require more advanced skills: width, depth, and speed. Let me explain…

Width – Width is the concept of data variety. Gone are the days of strictly examining on-the-field metrics to understand and optimise behaviour and performance. Analytics has expanded well beyond game data to include physiology, biology, psychology, neurology, strategy, coaching, scouting, training, trades, salary data, and more. This widened scope helps paint a more robust and complete picture, but comes with the corresponding challenges of collecting and connecting these different sources of data.
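As a rough illustration of what "connecting" different sources can look like, the sketch below merges three small, made-up datasets into one profile per player. It assumes every source shares a `player_id` key, which in practice is often the hardest part to arrange; the field names, values, and the `connect_sources` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from three unrelated sources (all values invented).
# Assumption: every source identifies players by the same player_id.
game_stats = [
    {"player_id": 7, "games": 12, "points": 180},
    {"player_id": 9, "games": 10, "points": 95},
]
salaries = [
    {"player_id": 7, "salary_usd": 2_500_000},
    {"player_id": 9, "salary_usd": 900_000},
]
wellness = [
    {"player_id": 7, "sleep_hours": 7.5},  # no wellness record for player 9
]


def connect_sources(**sources):
    """Merge records from every named source into one profile per player_id."""
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            # Keep everything except the key itself under the source's name.
            fields = {k: v for k, v in record.items() if k != "player_id"}
            profiles[record["player_id"]][source_name] = fields
    return dict(profiles)


profiles = connect_sources(game=game_stats, salary=salaries, wellness=wellness)
```

Each profile then holds whatever each source knows about that player, and a missing source (here, wellness data for player 9) simply leaves a gap rather than blocking the merge.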

Depth – Depth refers to the increasing size and detail of datasets. Increasingly, this fine-grained data is generated by automated systems rather than collected manually. While this automation is a major advance, the result is often a huge amount of data that can’t be managed with simple spreadsheets. Examples include: pitch tracking, where the position of the ball in space is captured in enough detail that the physics of the pitch can be observed; motion tracking, where player location and orientation are measured by GPS or video so that a player’s place on the field is accurately recorded multiple times each second; and sensor data, where heart rate, breathing rate, impacts, and similar measures are recorded for use in fitness, readiness, and safety assessments. All of these lead to extremely large amounts of data that must be managed, stored, and cleaned before they can be used for analysis.
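One common way to tame data of this depth is to reduce it before analysis. The sketch below, with invented sample values and a hypothetical `downsample_per_second` helper, averages several position samples per second down to one position per second. It is only one possible reduction, and it assumes timestamps are non-negative seconds from the start of play.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tracking samples: (seconds from start, x in metres, y in metres).
samples = [
    (0.0, 10.0, 4.0),
    (0.5, 12.0, 6.0),
    (1.0, 20.0, 8.0),
    (1.25, 22.0, 8.0),
    (1.75, 24.0, 8.0),
]


def downsample_per_second(samples):
    """Average all samples that fall within the same whole second."""
    buckets = defaultdict(list)
    for t, x, y in samples:
        buckets[int(t)].append((x, y))  # int() truncates 1.75 to second 1
    return {
        second: (mean(p[0] for p in points), mean(p[1] for p in points))
        for second, points in sorted(buckets.items())
    }


per_second = downsample_per_second(samples)
```

Averaging trades detail for size, so the raw samples would normally still be kept in storage for analyses that need the full resolution.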

Speed – With coaches and staff squeezed for time, faster analysis can be a distinct advantage, giving a team the ability to do more preparation in less time. Solutions that shorten the time from raw data to answer are therefore particularly desirable.

The challenges of width, depth, and speed are changing the world of sports analytics like never before, causing teams that want to stay ahead to place greater emphasis on their Data Management skills and technology. These skills have not traditionally been an area of expertise in front offices or coaching staffs, but that is starting to change.

The sections below contain some brief comments from experts in American sports, outlining the increasing role of Data Management in their organisations, the challenges that they face, and the benefits that can result.

Sig Mejdal
Director of Decision Sciences
World Champion St. Louis Cardinals / Houston Astros (MLB)

Data Management in Major League Baseball (MLB)
Much of the recent attention in MLB has been directed at the analysts, focusing on their mathematical skills and metric discoveries. While those skills are very important, without the back-end work to store and manage the data, many analysts have difficulty taking advantage of the ever-increasing data volumes.

Much of the standard “must see” data for MLB decision makers comes from a variety of player-oriented sources, such as recent and historical in-house scouting reports, draft information, contract information, past game statistics, health records, and notes on competitor interest, just to name a few. While this is already a substantial amount of data to manage and analyse, more and more front offices are looking to analyse large volumes of detailed performance data from sources like PITCHf/x (pitch tracking) and HITf/x (batted-ball tracking) for modelling purposes.

With the increasing amounts of data and the variety of sources available, data management skills are becoming some of the most important skills needed in the front office. However, these skills are often the least appreciated and least understood. This is understandable, since data management skills have not traditionally been a part of the standard front office skill set. It is my opinion that in the very near future these indispensable skills will be a requirement in all front offices. Without them, it will be hard to compete.

Ben Alamar
Director of Analytics
Oklahoma City Thunder (NBA)
NFL Analytics Consultant

Data Management in the National Football League (NFL)
The volume of data available to an NFL head coach has grown dramatically over the years, in part due to the growth of coaching staffs. Typical NFL teams now have a coaching and training staff of twenty to thirty. Head coaches are so pressed for time during the week between games that they do not have an opportunity to gather, process, and evaluate all of the data collected. Instead, they rely on their staffs to provide them with the most relevant data so that they can set the strategy for the week.

In the week of preparation for an upcoming opponent, each member of the staff is collecting and amassing data through film study, working directly with athletes, or talking to coaches who have played the upcoming opponent. Each staffer collects the data specific to their area of expertise (e.g. QB coaches collect data relevant to QBs, without concern for how it relates to defensive schemes), and may not see potentially valuable connections between their data and that of other coaches. This not only can make the coaches imperfect and inefficient filters of information for the head coach, but also means that the opportunity to gain important insight from the combination of all of their data is lost.

Efficient data management, however, can summarise and connect information for the head coach and staff, giving them a more comprehensive view of the available information about an opponent, as well as the ability to drill deeper into specific focus areas. This efficiency and speed lets the head coach get answers extremely quickly, so he can ask and answer a larger set of questions in a shorter period of time, with less last-minute scrambling from his staff. It is clear that a solid data management strategy results in a team that is better prepared to compete on game day.
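By way of editorial illustration only, and not a description of any team's actual system, the sketch below shows one simple way staff observations could be connected. It assumes each coach's note carries topic tags; the notes, tags, and the `build_index` and `cross_staff_topics` helpers are all hypothetical. Indexing by tag lets a head coach drill into one topic and see where separate coaches have independently noticed the same thing.

```python
from collections import defaultdict

# Hypothetical scouting notes; coaches, tags, and text are invented.
notes = [
    {"coach": "QB coach", "tags": ["blitz", "third_down"],
     "text": "Opponent blitzes often on third down."},
    {"coach": "O-line coach", "tags": ["blitz"],
     "text": "Blitz usually comes from the left edge."},
    {"coach": "DB coach", "tags": ["deep_pass"],
     "text": "Opponent throws deep after turnovers."},
]


def build_index(notes):
    """Map each tag to every note that mentions it, regardless of author."""
    index = defaultdict(list)
    for note in notes:
        for tag in note["tags"]:
            index[tag].append(note)
    return index


def cross_staff_topics(index):
    """Return tags that more than one coach has written about."""
    return sorted(
        tag for tag, tagged in index.items()
        if len({note["coach"] for note in tagged}) > 1
    )


index = build_index(notes)
shared = cross_staff_topics(index)
blitz_notes = [note["text"] for note in index["blitz"]]
```

Here `shared` surfaces the topic that two coaches raised separately, and `blitz_notes` is the drill-down view of everything recorded on it.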

Dr. Peter Vint
Director of High Performance
United States Olympic Committee

Data Management in Olympic Sport
The 2012 Summer Olympic Games in London will comprise 26 sports, in 39 different disciplines, and 302 contested events. Approximately 10,500 athletes from 204 nations will compete for 2,100 medals. With numbers like these, the role of data management in Olympic sport is significant and multifaceted.

In a big picture sense, the ability to rapidly harvest, manage, and analyse competition results data is critical to our understanding of international competitiveness across sports and within events. We rely on these data to identify who and where world-class performers and performances are today, who and where they will be “tomorrow”, and what it will take for us to either hold on to existing competitive advantages or to become more competitive than we are currently.

On a much finer scale, terabytes of data can be accumulated on factors that affect an individual athlete’s performance (e.g., medical histories, testing results, responses to training modalities, performance trajectories and historical outcomes, caloric and nutrient balance). I believe it is at this level that the greatest challenges and most substantial opportunities lie. The challenges are many, and include managing groups of users who work with different systems and different data, across different sports and different parts of the country. The upside to getting it right, however, is just as dramatic. In general, it will be the ability to effectively manage, analyse, interpret, and make actionable the data from a myriad of sources and systems that has the potential to be “Game Changing”.

I agree with Peter: it is most definitely “Game Changing”. I hope that this brief exploration of Data Management has provided a good perspective on what it is, the role it plays in sports analytics, the current challenges, and the benefits that other sports professionals have observed.

Kevin is the founder of Sports Data Hub, a company focusing on pro sports data solutions. He has over 16 years of analytics experience in a wide range of industries, for companies large and small. For the last 10 years he has provided Data Warehousing consulting specializing in Data Architecture and Data Integration. He is a graduate of Cal Poly and the University of Colorado, earning B.S. and M.S. degrees in Mechanical Engineering. Early in his career, Kevin was a systems engineer for Boeing in Seattle, working with data while designing, analyzing, and flight-testing commercial aircraft landing gear systems.
