Sportico is proud to partner with The Harvard Sports Analysis Collective, a student-run organization dedicated to the quantitative analysis of sports strategy and management.
Unlike the men’s side, where Novak Djokovic has won every Grand Slam so far this year, the women’s tour has seen different players raise trophies at the Australian Open, Roland Garros and Wimbledon. The pattern holds over a longer span as well: The past 18 Slams have produced 13 unique winners on the women’s side, compared to only four for the ATP.
Heading into the U.S. Open, there is a mix of traditional favorites and aspiring first-time champions. The Americans have an impressive 16 players in the Top 100 and have always found success in front of the home crowd in New York. Ash Barty will look to go back-to-back after winning Wimbledon. Naomi Osaka returns in her first Slam since withdrawing from the French Open. The women’s draw has no shortage of story lines, making for an intriguing tournament. Amid the unpredictability of the women’s side, we attempted to forecast a winner, simulating the tournament to estimate the probability that each player in the draw reaches any given round and wins it all.
Trying to predict outcomes with probabilities is common in sports. But it is much easier to do for a single game between two teams or even a standard 16-team playoff bracket than it is for a Grand Slam tennis tournament with a draw size of 128. This is because many of the probabilities are interdependent and can change meaningfully with each result, particularly unexpected ones. For example, if Barty were to lose in an earlier round in the draw (as unlikely as that might be), No. 3 seed Osaka’s probability of winning would rise drastically, since she would no longer have to play Barty.
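To make that interdependence concrete, here is a minimal worked example for a four-player bracket. The players (A, B, C, D) and every head-to-head probability below are hypothetical numbers chosen for illustration; they are not taken from our model or from any real matchup.

```python
# Hypothetical head-to-head win probabilities for a four-player bracket.
# Semifinals: A vs. B and C vs. D; the winners meet in the final.
p_a_beats_b = 0.80   # A is the heavy favorite in her semifinal
p_c_beats_d = 0.70
p_c_beats_a = 0.40   # C is an underdog against A...
p_c_beats_b = 0.65   # ...but a favorite against B

# C's title odds before any match is played: win the semifinal, then beat
# whichever opponent comes through the other half of the bracket.
p_c_title = p_c_beats_d * (
    p_a_beats_b * p_c_beats_a + (1 - p_a_beats_b) * p_c_beats_b
)

# C's title odds once A has been upset by B: only B can be waiting in the final.
p_c_title_if_a_out = p_c_beats_d * p_c_beats_b

print(f"C wins the title:            {p_c_title:.1%}")
print(f"C wins the title if A loses: {p_c_title_if_a_out:.1%}")
```

With these made-up inputs, C’s title chances climb from roughly 31.5% to 45.5% once the favorite is out, even though C has not played a point. In a 128-player draw, the same effect ripples through seven rounds, which is why enumerating every path by hand quickly becomes impractical.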
Nevertheless, we can still try to predict the outcome of the U.S. Open through a simulation. For our simulation, we used Tennis Abstract’s Elo ratings, which can be adjusted for different surfaces. For example, Osaka ranks second in Elo on hard courts but 12th on clay. Now, with the actual draw, we can simulate the tournament thousands of times using Elo ratings to see the different potential outcomes. This will ultimately give us a probability of each player making a certain round or even winning. For instance, if Barty wins the simulation 3,000 out of 10,000 times, she has approximately a 30% chance of winning. While there is seeding in tennis (No. 1-No. 32), the draw is more random than in other professional playoffs, which rely purely on seeding. In a Grand Slam, a No. 1 seed could play a No. 17 seed or a No. 32 seed in the third round. This creates possibilities for many interesting matchups throughout the tournament, with potential upsets besetting star players.
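The approach described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact code behind our numbers: It assumes the standard Elo win-expectancy formula with a 400-point scale, treats every match as an independent coin flip weighted by that expectancy, and uses a hypothetical eight-player draw with placeholder names and ratings. The function names (`elo_win_prob`, `simulate_tournament`, `estimate_odds`) are our own labels for this sketch.

```python
import random

def elo_win_prob(rating_a, rating_b):
    """Standard Elo expectancy that player A beats player B (assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def simulate_tournament(draw, ratings, rng):
    """Play one single-elimination bracket.

    `draw` lists players in bracket order (length must be a power of two).
    Returns a dict mapping each player to the number of matches she won.
    """
    wins = {player: 0 for player in draw}
    alive = list(draw)
    while len(alive) > 1:
        next_round = []
        # Adjacent players in the current list meet each other.
        for a, b in zip(alive[0::2], alive[1::2]):
            a_wins = rng.random() < elo_win_prob(ratings[a], ratings[b])
            winner = a if a_wins else b
            wins[winner] += 1
            next_round.append(winner)
        alive = next_round
    return wins

def estimate_odds(draw, ratings, n_sims=10_000, seed=0):
    """Estimate, for each player, the chance of winning at least k matches.

    Returns {player: [P(>=1 win), P(>=2 wins), ..., P(title)]}.
    """
    rng = random.Random(seed)
    n_rounds = len(draw).bit_length() - 1
    counts = {player: [0] * n_rounds for player in draw}
    for _ in range(n_sims):
        for player, won in simulate_tournament(draw, ratings, rng).items():
            for k in range(won):
                counts[player][k] += 1
    return {p: [c / n_sims for c in row] for p, row in counts.items()}

# Hypothetical eight-player draw in bracket order, with made-up ratings.
draw = [f"Player {i}" for i in range(1, 9)]
ratings = dict(zip(draw, [2200, 1850, 1950, 1900, 2000, 1800, 1875, 1925]))

odds = estimate_odds(draw, ratings)
for player in draw:
    print(player, [f"{p:.1%}" for p in odds[player]])
```

For a real Slam, `draw` would hold the 128 names in the order they appear in the bracket and `ratings` would hold each player’s surface-specific Elo. The last number in each player’s list is her estimated title probability, and because every simulated tournament crowns exactly one champion, those title probabilities add up to 100% across the draw.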
Ash Barty: While Djokovic has been dominating on the men’s side, Barty has been commanding the women’s tour. She captured her first Wimbledon in July, dropping only two sets throughout the tournament. She did falter in the first round of the Tokyo Olympics, but she then cruised in the Cincinnati 1000 without dropping a single set. The Australian has held the No. 1 ranking since September 2019 yet has not converted as much on the Grand Slam level, with only two to her name. But 2021 has been her best year yet, with five titles and six finals appearances out of the 19 tournaments she has played. A potential finals matchup with Naomi Osaka (Barty would be a 57% favorite) would be intriguing, as the pair have split their four matches and haven’t met since 2019. In the midst of her career year, Barty has a 27% chance to take home her first U.S. Open.
Naomi Osaka: Barty might be the No. 1 seed and the overall favorite, but the most attention will likely be on Osaka in her first Slam back since withdrawing from the French Open after citing mental health concerns. In her first two tournaments back, she lost in the third round of the Olympics, in her home country, to eventual silver medalist Marketa Vondrousova, and then lost in the second round of Cincinnati. Nevertheless, she still ranks second in hard-court Elo due to her strong prior history. Osaka is playing on her career-best surface (70% winning percentage) and at her best Slam, having won the U.S. Open in 2018 and 2020. The hard courts suit her penetrating offensive ground strokes and big serve (she wins nearly 64% of her serve points, second on tour). Osaka will look to have a triumphant return with a 21% chance at the title on the biggest stage in New York City.
With Serena Williams pulling out of the tournament due to the hamstring injury she suffered at Wimbledon, the 23 other Americans in the draw will look to make a deep run. Of these 23, nearly half are 21 years old or younger. Coco Gauff, the 17-year-old phenom, headlines that young group and has the fourth-highest odds of any American to take the title home at 0.7%, along with a 28% chance to make the second week, something she has never done at the U.S. Open, historically her weakest Slam. Part of this is due to her tough draw. In the second round, Gauff would play either 2017 finalist Madison Keys or 2017 champion Sloane Stephens, who face off in what could be an epic first-round match. Gauff is also in the same quarter as Osaka, who just defeated her in a tight three-set match in Cincinnati.
Jenn Brady, the highest-seeded American at No. 13 and a 2020 semi-finalist, has better odds than Gauff (2% for the title and 42% to make the second weekend), as do No. 23 Jessica Pegula (2% and 40%) and Danielle Collins (1% and 29%). Even though the GOAT of American women’s tennis won’t be there (she’s made the semis or better in every U.S. Open since 2008), there is still a strong probability that an American will make a deep run.
Bianca Andreescu: The No. 6 seed has actually never lost a match in the main draw of the U.S. Open. But that is just a statistical quirk based on a very small sample size. A closer look at the 2019 winner, who was 19 at the time and now ranks 24th in hard-court Elo, shows that her odds rank only 16th. The young Canadian has struggled this year, going 13-9, yet her WTA ranking is still high due to ranking protections for COVID-19. Besides her 2019 U.S. Open, Andreescu has never made it to the third round of any other Slam. She has lost in the first round of the French Open and Wimbledon (our simulations for both of those Slams also gave her low odds despite her high seed) as well as the Cincinnati Masters, which is why she enters the Billie Jean King Tennis Center with only a 36% chance to make the second weekend and a 1% chance at the title.
Iga Swiatek: Like Andreescu, the 20-year-old Swiatek is a young player without much Slam experience yet, which lowers her odds. Swiatek hasn’t made it past the third round in two appearances at Flushing Meadows. The model could be low on her odds (she ranks 18th in hard-court Elo) due to her lack of experience and “match data.” Even so, she has not had the best warmup results, losing in the second round at Tokyo and the first round at Cincinnati. As the youngest player in the Top 20, she might make her deepest U.S. Open run yet but is not expected to (41% to make the second weekend) based on the math.
Although tennis fans will definitely feel the absence of Serena Williams at her career-best tournament, many other interesting players will fill the void. Whether it’s the emerging leaders of the WTA in Barty and Osaka, teenagers like Gauff, or other Americans vying for a Slam on home soil, the 2021 women’s U.S. Open promises excitement.