The PWHL with sportsdataverse-py
Welcome to professional women's hockey! The Professional Women's Hockey League (PWHL) dropped its first puck in January 2024 with six clubs (Boston, Minnesota, Montréal, New York, Ottawa and Toronto), and it's been must-watch hockey ever since.
sportsdataverse.pwhl gives you the whole league two ways:
- load_pwhl_* release loaders: fast, reliable parquet snapshots (schedules, boxscores, play-by-play, scoring & penalty summaries, rosters). Perfect for season-long analysis, and they work great offline.
- pwhl_* live wrappers + analytics: straight off the HockeyTech stats feed (standings, leaders, rosters, stats, single-game PBP) plus derived on-ice metrics (Corsi, time-on-ice, shifts).
And the best part: no API key needed, because the public HockeyTech client key ships with the package. R companion: fastRhockey. Let's drop the puck!
The toolbox
Everything returns a tidy polars DataFrame by default; pass return_as_pandas=True for pandas. The loaders read pre-built release parquets (one parquet per season); the live wrappers hit the HockeyTech API in real time. Both are premium PWHL sources. Click any name for the full reference:
| Function | What it gives you | Source |
|---|---|---|
| load_pwhl_schedule | Games + results, one row per game | loader |
| load_pwhl_rosters | One row per player per team (skaters + goalies) | loader |
| load_pwhl_skater_box | Skater boxscore, one row per player per game | loader |
| load_pwhl_goalie_box | Goalie boxscore (saves, shots against, GAA inputs) | loader |
| load_pwhl_team_box | Team boxscore (shots, PP, faceoffs) | loader |
| load_pwhl_pbp | Event-level play-by-play (wide, with coordinates) | loader |
| load_pwhl_scoring_summary | Tidy goal log (scorer + assists + situation flags) | loader |
| load_pwhl_penalty_summary | Tidy penalty log (infraction, minutes, who took it) | loader |
| load_pwhl_shots_by_period | Per-period shot & goal totals per game | loader |
| load_pwhl_three_stars | Post-game three-star selections | loader |
| pwhl_schedule | Live schedule, one row per game | live |
| pwhl_standings | Live standings, one row per team | live |
| pwhl_teams | Teams in a season (grab team_ids) | live |
| pwhl_team_roster | A team's roster | live |
| pwhl_leaders | Statistical leaders | live |
| pwhl_stats | Aggregate skater/goalie stats | live |
| pwhl_player_search | Find a player_id by name | live |
| pwhl_player_stats | A player's season-by-season stat lines | live |
| pwhl_pbp | Enriched single-game play-by-play | live |
| pwhl_game_corsi | On-ice Corsi / Fenwick per player | live |
| pwhl_player_toi | Time-on-ice per player | live |
| pwhl_game_shifts | Raw shift stints | live |
| most_recent_pwhl_season · pwhl_season_id | Season helpers | live |
Setup
pip install sportsdataverse
No key, no config: just import and go.
import polars as pl
import sportsdataverse.pwhl as pwhl
# The inaugural season is 2024; this helper tracks the latest known season.
print("most recent PWHL season:", pwhl.most_recent_pwhl_season())
most recent PWHL season: 2027
The live HockeyTech feed is seasonal and occasionally rate-limited, so a tiny safe() helper runs those calls defensively: you get the frame when the feed is up, and a friendly one-liner when it isn't (never a scary traceback). The loaders read release parquets and are rock-solid, so they don't need the wrapper.
def safe(label, thunk):
    """Run a live-feed call; return its result, or None with a one-line note on failure."""
    try:
        out = thunk()
        print(f"✅ {label}")
        return out
    except Exception as e:  # noqa: BLE001 -- demo resilience
        print(f"{label}: unavailable right now ({type(e).__name__})")
        return None
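Because the feed is described as occasionally rate-limited, a retrying variant can be handy. The sketch below is a hypothetical helper (safe_retry, its attempts and base_delay parameters, and the flaky stand-in are all assumptions, not part of sportsdataverse) that waits with exponential backoff before giving up quietly.

```python
import time


def safe_retry(label, thunk, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Hypothetical variant of safe(): retry with backoff, then return None."""
    for attempt in range(1, attempts + 1):
        try:
            out = thunk()
            print(f"✅ {label} (attempt {attempt})")
            return out
        except Exception as e:  # noqa: BLE001 -- demo resilience
            if attempt == attempts:
                print(f"{label}: unavailable right now ({type(e).__name__})")
                return None
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in thunk that fails twice and then succeeds (simulated, no network).
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated rate limit")
    return "frame"


result = safe_retry("demo feed", flaky, base_delay=0.0)
```

In real use you would pass the same lambdas you hand to safe(), with a non-zero base_delay.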
The schedule (loader)
load_pwhl_schedule returns one row per game with the result and a set of flag/URL columns pointing at the per-game feeds. Pass seasons=[2024] (a list, so you can stack multiple seasons). Heads up: home_score/away_score come back as strings, so cast them before doing arithmetic. game_id is also a string here but an integer in the boxscore loaders, and game_date is a label with no year.
schedule = pwhl.load_pwhl_schedule(seasons=[2024])
schedule.shape
(85, 29)
schedule.select([
'game_id', 'game_date', 'home_team', 'away_team',
'home_score', 'away_score', 'winner', 'game_type',
]).head()
shape: (5, 8)
βββββββββββ¬ββββββββββββββ¬ββββββββββββ¬ββββββββββββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬ββββββββββββ
β game_id β game_date β home_team β away_team β home_score β away_score β winner β game_type β
β --- β --- β --- β --- β --- β --- β --- β --- β
β str β str β str β str β str β str β str β str β
βββββββββββͺββββββββββββββͺββββββββββββͺββββββββββββͺβββββββββββββͺβββββββββββββͺββββββββββββͺββββββββββββ‘
β 84 β Wed, May 8 β Toronto β Minnesota β 4 β 0 β Toronto β playoffs β
β 98 β Wed, May 29 β Boston β Minnesota β 0 β 3 β Minnesota β playoffs β
β 90 β Wed, May 15 β Minnesota β Toronto β 1 β 0 β Minnesota β playoffs β
β 63 β Wed, May 1 β Toronto β Minnesota β 4 β 1 β Toronto β regular β
β 45 β Wed, Mar 6 β Toronto β Boston β 3 β 1 β Toronto β regular β
βββββββββββ΄ββββββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββββββββ΄βββββββββββββ΄ββββββββββββ΄ββββββββββββ
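The game_date values shown above are labels like "Wed, May 8" with no year, so they do not sort chronologically as strings. A minimal sketch of one way to turn them into real dates, assuming you supply the calendar year yourself and that English day and month names apply (parse_game_date is a hypothetical helper, not a package function):

```python
from datetime import date, datetime


def parse_game_date(label: str, year: int) -> date:
    """Parse a label such as 'Wed, May 8' into a date, given the calendar year."""
    # The label carries no year, so append one before parsing.
    return datetime.strptime(f"{label} {year}", "%a, %b %d %Y").date()


labels = ["Wed, May 8", "Wed, May 29", "Wed, Mar 6"]
dates = sorted(parse_game_date(s, 2024) for s in labels)
```

Seasons that span two calendar years would need the year chosen per game rather than one fixed value.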
Rosters (loader)
load_pwhl_rosters gives one row per player per team; the player_type column distinguishes skaters from goalies.
rosters = pwhl.load_pwhl_rosters(seasons=[2024])
rosters.select([
'team', 'team_abbr', 'player_type', 'first_name', 'last_name',
'jersey_number', 'position',
]).head()
shape: (5, 7)
ββββββββββββββββ¬ββββββββββββ¬ββββββββββββββ¬βββββββββββββ¬ββββββββββββ¬ββββββββββββββββ¬βββββββββββ
β team β team_abbr β player_type β first_name β last_name β jersey_number β position β
β --- β --- β --- β --- β --- β --- β --- β
β str β str β str β str β str β i32 β str β
ββββββββββββββββͺββββββββββββͺββββββββββββββͺβββββββββββββͺββββββββββββͺββββββββββββββββͺβββββββββββ‘
β PWHL Toronto β TOR β skater β Jocelyne β Larocque β 3 β LD β
β PWHL Toronto β TOR β skater β Lauriane β Rougeau β 5 β LD β
β PWHL Toronto β TOR β skater β Kali β Flanagan β 6 β RD β
β PWHL Toronto β TOR β skater β Olivia β Knowles β 7 β RD β
β PWHL Toronto β TOR β skater β Alexa β Vasko β 10 β C β
ββββββββββββββββ΄ββββββββββββ΄ββββββββββββββ΄βββββββββββββ΄ββββββββββββ΄ββββββββββββββββ΄βββββββββββ
Boxscores (loader)
Boxscores come in three flavours (team_box, skater_box, and goalie_box), each with one row per team or player per game.
| Function | One row per… |
|---|---|
| load_pwhl_team_box | team per game |
| load_pwhl_skater_box | skater per game |
| load_pwhl_goalie_box | goalie per game |
skater_box = pwhl.load_pwhl_skater_box(seasons=[2024])
skater_box.select([
'game_id', 'first_name', 'last_name', 'position',
'goals', 'assists', 'points', 'shots', 'plus_minus', 'time_on_ice',
]).head()
shape: (5, 10)
βββββββββββ¬βββββββββββββ¬ββββββββββββ¬βββββββββββ¬ββββ¬βββββββββ¬ββββββββ¬βββββββββββββ¬ββββββββββββββ
β game_id β first_name β last_name β position β β¦ β points β shots β plus_minus β time_on_ice β
β --- β --- β --- β --- β β --- β --- β --- β --- β
β i32 β str β str β str β β i32 β i32 β i32 β f64 β
βββββββββββͺβββββββββββββͺββββββ ββββββͺβββββββββββͺββββͺβββββββββͺββββββββͺβββββββββββββͺββββββββββββββ‘
β 2 β Jocelyne β Larocque β LD β β¦ β 0 β 2 β -2 β 26.7 β
β 2 β Lauriane β Rougeau β LD β β¦ β 0 β 0 β 0 β 12.1 β
β 2 β Kali β Flanagan β RD β β¦ β 0 β 1 β -1 β 21.6 β
β 2 β Olivia β Knowles β RD β β¦ β 0 β 0 β 0 β 9.7 β
β 2 β Alexa β Vasko β C β β¦ β 0 β 3 β 0 β 10.5 β
βββββββββββ΄βββββββββββββ΄ββββββββββββ΄βββββββββββ΄ββββ΄βββββββββ΄ββββββββ΄βββββββββββββ΄ββββββββββββββ
goalie_box = pwhl.load_pwhl_goalie_box(seasons=[2024])
goalie_box.select([
'game_id', 'first_name', 'last_name',
'saves', 'shots_against', 'goals_against', 'time_on_ice',
]).head()
shape: (5, 7)
βββββββββββ¬βββββββββββββ¬βββββββββββββ¬ββββββββ¬ββββββββββββββββ¬ββββββββββββββββ¬ββββββββββββββ
β game_id β first_name β last_name β saves β shots_against β goals_against β time_on_ice β
β --- β --- β --- β --- β --- β --- β --- β
β i32 β str β str β i32 β i32 β i32 β f64 β
βββββββββββͺβββββββββββββͺβββββββββββββͺββββββββͺββββββββββββββββͺββββββββββββββββͺββββββββββββββ‘
β 2 β Erica β Howe β 0 β 0 β 0 β null β
β 2 β Kristen β Campbell β 24 β 28 β 4 β 60.0 β
β 2 β Corinne β Schroeder β 29 β 29 β 0 β 60.0 β
β 2 β Abbey β Levy β 0 β 0 β 0 β null β
β 3 β Sandra β Abstreiter β 0 β 0 β 0 β null β
βββββββββββ΄βββββββββββββ΄βββββββββββββ΄ββββββββ΄ββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββ
Play-by-play (loader)
load_pwhl_pbp returns a wide event log. The event column tags each row as faceoff, shot, goal, or penalty, and there are several coordinate systems (x_coord/y_coord plus rink-normalized *_fixed / *_right variants) for drawing rink plots.
pbp = pwhl.load_pwhl_pbp(seasons=[2024])
pbp.shape
(10456, 95)
(pbp
.group_by('event')
.agg(pl.len().alias('events'))
.sort('events', descending=True))
shape: (4, 2)
βββββββββββ¬βββββββββ
β event β events β
β --- β --- β
β str β u32 β
βββββββββββͺβββββββββ‘
β shot β 4922 β
β faceoff β 4631 β
β penalty β 518 β
β goal β 385 β
βββββββββββ΄βββββββββ
Cookbook: common PWHL tasks
Now the fun part: a baker's dozen of recipes you'll reach for constantly. Recipes 1-11 lean on the rock-solid loaders (great offline); recipes 12-13 tour the live wrappers, wrapped in safe() so an offseason or a flaky feed never breaks your run. Every recipe ends in a tidy, ready-to-read frame.
Recipe 1: Standings from the schedule
No dedicated standings loader is needed for a quick table: the schedule's winner column makes a regular-season win count a one-liner.
(schedule
.filter(pl.col('game_type') == 'regular')
.group_by('winner')
.agg(pl.len().alias('wins'))
.sort('wins', descending=True))
shape: (6, 2)
βββββββββββββ¬βββββββ
β winner β wins β
β --- β --- β
β str β u32 β
βββββββββββββͺβββββββ‘
β Toronto β 17 β
β Montreal β 13 β
β Boston β 12 β
β Minnesota β 12 β
β New York β 9 β
β Ottawa β 9 β
βββββββββββββ΄βββββββ
Recipe 2: Season scoring leaders
Aggregate the skater boxscore across every game to build a points leaderboard for the inaugural season. Nothing filters on game type here, so the totals combine regular-season and playoff games.
(skater_box
.group_by(['player_id', 'first_name', 'last_name'])
.agg(
pl.col('goals').sum().alias('goals'),
pl.col('assists').sum().alias('assists'),
pl.col('points').sum().alias('points'),
)
.sort('points', descending=True)
.select(['first_name', 'last_name', 'goals', 'assists', 'points'])
.head(10))
shape: (10, 5)
ββββββββββββββββ¬ββββββββββββ¬ββββββββ¬ββββββββββ¬βββββββββ
β first_name β last_name β goals β assists β points β
β --- β --- β --- β --- β --- β
β str β str β i32 β i32 β i32 β
ββββββββββββββββͺββββββββββββͺββββββββͺββββββββββͺβββββββββ‘
β Natalie β Spooner β 21 β 8 β 29 β
β Marie-Philip β Poulin β 11 β 14 β 25 β
β Sarah β Nurse β 11 β 13 β 24 β
β Alex β Carpenter β 8 β 15 β 23 β
β Taylor β Heise β 9 β 12 β 21 β
β Ella β Shelton β 7 β 14 β 21 β
β Emma β Maltais β 5 β 16 β 21 β
β Erin β Ambrose β 4 β 16 β 20 β
β Brianne β Jenner β 9 β 11 β 20 β
β Grace β Zumwinkle β 12 β 8 β 20 β
ββββββββββββββββ΄ββββββββββββ΄ββββββββ΄ββββββββββ΄βββββββββ
Recipe 3: Goalie save-percentage leaders
Sum saves and shots-against from the goalie boxscore, then compute a season save percentage. We require at least 100 shots against so a one-game cameo doesn't top the list.
(goalie_box
.group_by(['player_id', 'first_name', 'last_name'])
.agg(
pl.col('saves').sum().alias('saves'),
pl.col('shots_against').sum().alias('shots_against'),
pl.col('goals_against').sum().alias('goals_against'),
)
.filter(pl.col('shots_against') >= 100)
.with_columns(
(pl.col('saves') / pl.col('shots_against')).round(3).alias('save_pct')
)
.sort('save_pct', descending=True)
.select(['first_name', 'last_name', 'shots_against', 'goals_against', 'save_pct'])
.head(10))
shape: (10, 5)
┌────────────┬────────────┬───────────────┬───────────────┬──────────┐
│ first_name ┆ last_name  ┆ shots_against ┆ goals_against ┆ save_pct │
│ ---        ┆ ---        ┆ ---           ┆ ---           ┆ ---      │
│ str        ┆ str        ┆ i32           ┆ i32           ┆ f64      │
╞════════════╪════════════╪═══════════════╪═══════════════╪══════════╡
│ Elaine     ┆ Chuli      ┆ 253           ┆ 13            ┆ 0.949    │
│ Aerin      ┆ Frankel    ┆ 790           ┆ 49            ┆ 0.938    │
│ Kristen    ┆ Campbell   ┆ 718           ┆ 48            ┆ 0.933    │
│ Corinne    ┆ Schroeder  ┆ 511           ┆ 36            ┆ 0.93     │
│ Nicole     ┆ Hensley    ┆ 492           ┆ 37            ┆ 0.925    │
│ Maddie     ┆ Rooney     ┆ 362           ┆ 27            ┆ 0.925    │
│ Ann-Renée  ┆ Desbiens   ┆ 580           ┆ 44            ┆ 0.924    │
│ Emerance   ┆ Maschmeyer ┆ 599           ┆ 51            ┆ 0.915    │
│ Abbey      ┆ Levy       ┆ 254           ┆ 24            ┆ 0.906    │
│ Emma       ┆ Söderberg  ┆ 170           ┆ 17            ┆ 0.9      │
└────────────┴────────────┴───────────────┴───────────────┴──────────┘
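The goalie boxscore is described earlier as carrying the inputs for goals-against average too. Here is a small sketch of both formulas as plain functions, assuming time_on_ice is measured in minutes (the 60.0 values in the goalie boxscore suggest so, but that is an assumption); save_pct and goals_against_average are hypothetical helper names.

```python
from typing import Optional


def save_pct(saves: int, shots_against: int) -> Optional[float]:
    """Saves divided by shots against, or None when no shots were faced."""
    if not shots_against:
        return None
    return round(saves / shots_against, 3)


def goals_against_average(goals_against: int, minutes: Optional[float]) -> Optional[float]:
    """Goals allowed per 60 minutes played, or None when no time was logged."""
    if not minutes:
        return None
    return round(goals_against * 60 / minutes, 2)


# One goalie line from the boxscore output earlier: 24 saves on 28 shots,
# 4 goals against in 60.0 minutes.
game_sv = save_pct(24, 28)
game_gaa = goals_against_average(4, 60.0)
```

The None guards matter because backup goalies appear in the boxscore with zero shots and a null time_on_ice.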
Recipe 4: Biggest blowouts of the season
Cast the string scores to integers, compute the margin, and sort: the season's most lopsided games fall right out.
(schedule
.with_columns(
pl.col('home_score').cast(pl.Int32),
pl.col('away_score').cast(pl.Int32),
)
.with_columns(
(pl.col('home_score') - pl.col('away_score')).abs().alias('margin')
)
.sort('margin', descending=True)
.select(['game_date', 'home_team', 'home_score',
'away_score', 'away_team', 'winner', 'margin'])
.head(10))
shape: (10, 7)
βββββββββββββββ¬ββββββββββββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬ββββββββββββ¬βββββββββ
β game_date β home_team β home_score β away_score β away_team β winner β margin β
β --- β --- β --- β --- β --- β --- β --- β
β str β str β i32 β i32 β str β str β i32 β
βββββββββββββββͺββββββββββββͺβββββββββββββͺβββββββββββββͺββββββββββββͺββββββββββββͺβββββββββ‘
β Wed, May 8 β Toronto β 4 β 0 β Minnesota β Toronto β 4 β
β Wed, Mar 13 β Minnesota β 4 β 0 β Boston β Minnesota β 4 β
β Sun, Apr 28 β New York β 2 β 6 β Toronto β Toronto β 4 β
β Sat, Mar 16 β Minnesota β 5 β 1 β New York β Minnesota β 4 β
β Sat, Jan 13 β Toronto β 1 β 5 β Ottawa β Ottawa β 4 β
β Sat, Apr 20 β Ottawa β 4 β 0 β Minnesota β Ottawa β 4 β
β Mon, Jan 1 β Toronto β 0 β 4 β New York β New York β 4 β
β Wed, May 29 β Boston β 0 β 3 β Minnesota β Minnesota β 3 β
β Wed, May 1 β Toronto β 4 β 1 β Minnesota β Toronto β 3 β
β Wed, Mar 20 β New York β 0 β 3 β Ottawa β Ottawa β 3 β
βββββββββββββββ΄ββββββββββββ΄βββββββββββββ΄βββββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββββ
Recipe 5: Team offense, shots & shooting %
Roll the skater boxscore up to the club level for a quick offensive profile: total goals, shot volume, and finishing rate. The team boxscore is used only to map team_id to team_abbr.
# Map each team_id to its abbreviation (both Int32-keyed), then roll up the
# skater box to the club level for a quick offensive profile.
team_lookup = (pwhl.load_pwhl_team_box(seasons=[2024])
.select(['team_id', 'team_abbr']).unique())
(skater_box
.join(team_lookup, on='team_id', how='left')
.group_by('team_abbr')
.agg(
pl.col('goals').sum().alias('goals'),
pl.col('shots').sum().alias('shots'),
)
.with_columns(
(pl.col('goals') / pl.col('shots') * 100).round(1).alias('shooting_pct')
)
.filter(pl.col('team_abbr').is_not_null())
.sort('goals', descending=True))
shape: (6, 4)
βββββββββββββ¬ββββββββ¬ββββββββ¬βββββββββββββββ
β team_abbr β goals β shots β shooting_pct β
β --- β --- β --- β --- β
β str β i32 β i32 β f64 β
βββββββββββββͺββββββββͺββββββββͺβββββββββββββββ‘
β TOR β 74 β 790 β 9.4 β
β MIN β 72 β 1025 β 7.0 β
β MTL β 64 β 814 β 7.9 β
β BOS β 62 β 907 β 6.8 β
β OTT β 61 β 721 β 8.5 β
β NY β 52 β 667 β 7.8 β
βββββββββββββ΄ββββββββ΄ββββββββ΄βββββββββββββββ
Recipe 6: Power-play conversion leaders
The team boxscore carries pp_goals and pp_opportunities, so a season power-play percentage is a single division.
team_box = pwhl.load_pwhl_team_box(seasons=[2024])
(team_box
.group_by('team_abbr')
.agg(
pl.col('pp_goals').sum().alias('pp_goals'),
pl.col('pp_opportunities').sum().alias('pp_opportunities'),
)
.with_columns(
(pl.col('pp_goals') / pl.col('pp_opportunities') * 100).round(1).alias('pp_pct')
)
.sort('pp_pct', descending=True))
shape: (6, 4)
βββββββββββββ¬βββββββββββ¬βββββββββββββββββββ¬βββββββββ
β team_abbr β pp_goals β pp_opportunities β pp_pct β
β --- β --- β --- β --- β
β str β i32 β i32 β f64 β
βββββββββββββͺβββββββββββͺβββββββββββββββββββͺβββββββββ‘
β OTT β 16 β 64 β 25.0 β
β NY β 19 β 78 β 24.4 β
β MTL β 16 β 94 β 17.0 β
β TOR β 11 β 80 β 13.8 β
β MIN β 7 β 87 β 8.0 β
β BOS β 4 β 68 β 5.9 β
βββββββββββββ΄βββββββββββ΄βββββββββββββββββββ΄βββββββββ
Recipe 7: Faceoff specialists
The skater boxscore tracks faceoff wins and attempts. Aggregate, keep players with at least 200 draws, and the dot-dominators rise to the top.
(skater_box
.group_by(['first_name', 'last_name'])
.agg(
pl.col('faceoff_wins').sum().alias('fo_wins'),
pl.col('faceoff_attempts').sum().alias('fo_attempts'),
)
.filter(pl.col('fo_attempts') >= 200)
.with_columns(
(pl.col('fo_wins') / pl.col('fo_attempts') * 100).round(1).alias('fo_pct')
)
.sort('fo_pct', descending=True)
.head(10))
shape: (10, 5)
ββββββββββββββββ¬ββββββββββββββββ¬ββββββββββ¬ββββββββββββ ββ¬βββββββββ
β first_name β last_name β fo_wins β fo_attempts β fo_pct β
β --- β --- β --- β --- β --- β
β str β str β i32 β i32 β f64 β
ββββββββββββββββͺββββββββββββββββͺββββββββββͺββββββββββββββͺβββββββββ‘
β Abby β Roque β 205 β 339 β 60.5 β
β Marie-Philip β Poulin β 326 β 546 β 59.7 β
β Alex β Carpenter β 245 β 415 β 59.0 β
β Kelly β Pannek β 344 β 630 β 54.6 β
β Brianne β Jenner β 125 β 230 β 54.3 β
β Taylor β Heise β 264 β 495 β 53.3 β
β Hannah β Brandt β 270 β 510 β 52.9 β
β Kristin β O'Neill β 240 β 460 β 52.2 β
β Jade β Downie-Landry β 116 β 225 β 51.6 β
β Jesse β Compher β 119 β 233 β 51.1 β
ββββββββββββββββ΄ββββββββββββββββ΄ββββββββββ΄ββββββββββββββ΄βββββββββ
Recipe 8: Two-way workhorses, hits + blocks
Not every contribution shows up on the scoresheet. Sum hits and blocked shots from the skater box to surface the players doing the dirty work; defenders usually own this list.
(skater_box
.group_by(['first_name', 'last_name', 'position'])
.agg(
pl.col('hits').sum().alias('hits'),
pl.col('blocked_shots').sum().alias('blocks'),
)
.with_columns(
(pl.col('hits') + pl.col('blocks')).alias('hits_plus_blocks')
)
.sort('hits_plus_blocks', descending=True)
.head(10))
shape: (10, 6)
ββββββββββββββ¬βββββββββββββ¬βββββββββββ¬βββββββ¬βββββββββ¬βββββββββββββββββββ
β first_name β last_name β position β hits β blocks β hits_plus_blocks β
β --- β --- β --- β --- β --- β --- β
β str β str β str β i32 β i32 β i32 β
ββββββββββββββͺβββββββββββββͺβββββββββββͺβββββββͺβββββββββͺβββββββββββββββββββ‘
β Renata β Fast β RD β 77 β 23 β 100 β
β Megan β Keller β LD β 64 β 33 β 97 β
β Kaleigh β Fratkin β RD β 65 β 19 β 84 β
β Blayre β Turnbull β C β 62 β 14 β 76 β
β Allie β Munroe β LD β 44 β 25 β 69 β
β Jessica β DiGirolamo β LD β 36 β 31 β 67 β
β Emma β Maltais β LW β 53 β 8 β 61 β
β Emma β Greco β LD β 32 β 29 β 61 β
β Lee β Stecklein β LD β 36 β 25 β 61 β
β Kelly β Pannek β C β 28 β 30 β 58 β
ββββββββββββββ΄βββββββββββββ΄βββββββββββ΄βββββββ΄βββββββββ΄βββββββββββββββββββ
Recipe 9: The penalty box
load_pwhl_penalty_summary is a tidy per-infraction log. Two quick cuts: the most common infractions league-wide, and the players spending the most time in the box.
penalties = pwhl.load_pwhl_penalty_summary(seasons=[2024])
# Most common infractions
top_infractions = (penalties
.group_by('description')
.agg(pl.len().alias('count'))
.sort('count', descending=True)
.head(8))
top_infractions
shape: (8, 2)
ββββββββββββββββββ¬ββββββββ
β description β count β
β --- β --- β
β str β u32 β
ββββββββββββββββββͺββββββββ‘
β Tripping β 106 β
β Hooking β 91 β
β Roughing β 64 β
β Interference β 53 β
β Slashing β 35 β
β Boarding β 30 β
β Cross Checking β 26 β
β Holding β 26 β
ββββββββββββββββββ΄ββββββββ
# PIM leaders (players who actually took the penalty)
(penalties
.filter(pl.col('taken_by_last').is_not_null())
.group_by(['taken_by_first', 'taken_by_last'])
.agg(
pl.col('minutes').sum().alias('pim'),
pl.len().alias('penalties'),
)
.sort('pim', descending=True)
.head(10))
shape: (10, 4)
┌────────────────┬───────────────┬─────┬───────────┐
│ taken_by_first ┆ taken_by_last ┆ pim ┆ penalties │
│ ---            ┆ ---           ┆ --- ┆ ---       │
│ str            ┆ str           ┆ i32 ┆ u32       │
╞════════════════╪═══════════════╪═════╪═══════════╡
│ Tereza         ┆ Vanišová      ┆ 37  ┆ 13        │
│ Kaleigh        ┆ Fratkin       ┆ 36  ┆ 18        │
│ Abby           ┆ Roque         ┆ 31  ┆ 10        │
│ Jesse          ┆ Compher       ┆ 25  ┆ 7         │
│ Megan          ┆ Keller        ┆ 22  ┆ 11        │
│ Gabbie         ┆ Hughes        ┆ 20  ┆ 10        │
│ Allie          ┆ Munroe        ┆ 20  ┆ 10        │
│ Renata         ┆ Fast          ┆ 18  ┆ 9         │
│ Sarah          ┆ Nurse         ┆ 18  ┆ 9         │
│ Emma           ┆ Maltais       ┆ 18  ┆ 9         │
└────────────────┴───────────────┴─────┴───────────┘
Recipe 10: When do goals get scored?
Slice the goal log out of the play-by-play and bucket it by period, then pull the league's top finishers straight from the event == 'goal' rows while you're there.
goal_events = pbp.filter(pl.col('event') == 'goal')
# Goals by period
goals_by_period = (goal_events
.group_by('period_of_game')
.agg(pl.len().alias('goals'))
.sort('period_of_game'))
goals_by_period
shape: (6, 2)
ββββββββββββββββββ¬ββββββββ
β period_of_game β goals β
β --- β --- β
β str β u32 β
ββββββββββββββββββͺββββββββ‘
β 1 β 109 β
β 2 β 120 β
β 3 β 138 β
β 4 β 15 β
β 5 β 2 β
β 6 β 1 β
ββββββββββββββββββ΄ββββββββ
# Top goal-scorers from the play-by-play feed
(goal_events
.filter(pl.col('player_name_last').is_not_null())
.group_by(['player_name_first', 'player_name_last'])
.agg(pl.len().alias('goals'))
.sort('goals', descending=True)
.head(10))
shape: (10, 3)
βββββββββββββββββββββ¬βββββββββββββββββββ¬ββββββββ
β player_name_first β player_name_last β goals β
β --- β --- β --- β
β str β str β u32 β
βββββββββββββββββββββͺβββββββββββββββββββͺββββββββ‘
β Natalie β Spooner β 21 β
β Grace β Zumwinkle β 12 β
β Marie-Philip β Poulin β 11 β
β Sarah β Nurse β 11 β
β Laura β Stacey β 10 β
β Daryl β Watts β 10 β
β Taylor β Heise β 9 β
β Brianne β Jenner β 9 β
β Gabbie β Hughes β 9 β
β Michela β Cava β 9 β
βββββββββββββββββββββ΄βββββββββββββββββββ΄ββββββββ
Recipe 11: Three-stars honour roll and a head-to-head series
Two compact filters. First, who collected the most first-star nods (load_pwhl_three_stars). Then a head-to-head series view from the schedule; swap in any two clubs.
three_stars = pwhl.load_pwhl_three_stars(seasons=[2024])
# First-star honour roll
(three_stars
.filter(pl.col('star') == 1)
.group_by(['first_name', 'last_name'])
.agg(pl.len().alias('first_stars'))
.sort('first_stars', descending=True)
.head(10))
shape: (10, 3)
ββββββββββββββββ¬ββββββββββββββββ¬ββββββββββββββ
β first_name β last_name β first_stars β
β --- β --- β --- β
β str β str β u32 β
ββββββββββββββββͺββββββββββββββββͺββββββββββββββ‘
β Natalie β Spooner β 7 β
β Nicole β Hensley β 4 β
β Kristen β Campbell β 4 β
β Alex β Carpenter β 3 β
β Gabbie β Hughes β 3 β
β Sarah β Nurse β 3 β
β Hilary β Knight β 3 β
β Marie-Philip β Poulin β 3 β
β Susanna β Tapani β 3 β
β Jade β Downie-Landry β 2 β
ββββββββββββββββ΄ββββββββββββββββ΄ββββββββββββββ
# Head-to-head: Boston vs. Montreal, every meeting in 2024
A, B = 'Boston', 'Montreal'
(schedule
.filter(
((pl.col('home_team') == A) & (pl.col('away_team') == B)) |
((pl.col('home_team') == B) & (pl.col('away_team') == A))
)
.select(['game_date', 'home_team', 'home_score',
'away_score', 'away_team', 'winner', 'game_status']))
shape: (7, 7)
βββββββββββββββ¬ββββββββββββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββββ¬βββββββββββ¬ββββββββββββββ
β game_date β home_team β home_score β away_score β away_team β winner β game_status β
β --- β --- β --- β --- β --- β --- β --- β
β str β str β str β str β str β str β str β
βββββββββββββββͺββββββ ββββββͺβββββββββββββͺβββββββββββββͺββββββββββββͺβββββββββββͺββββββββββββββ‘
β Tue, May 14 β Boston β 3 β 2 β Montreal β Boston β Final OT β
β Thu, May 9 β Montreal β 1 β 2 β Boston β Boston β Final OT β
β Sun, Feb 4 β Boston β 1 β 2 β Montreal β Montreal β Final OT β
β Sat, May 4 β Boston β 4 β 3 β Montreal β Boston β Final β
β Sat, May 11 β Montreal β 1 β 2 β Boston β Boston β Final OT3 β
β Sat, Mar 2 β Montreal β 3 β 1 β Boston β Montreal β Final β
β Sat, Jan 13 β Montreal β 2 β 3 β Boston β Boston β Final OT β
βββββββββββββββ΄ββββββββββββ΄βββββββββββββ΄βββββββββββββ΄ββββββββββββ΄βββββββββββ΄ββββββββββββββ
Recipe 12: Find a player, then pull her career lines
A classic two-step lookup off the live feed: pwhl_player_search resolves a name to a player_id, then pwhl_player_stats returns her season-by-season stat lines. Both are safe()-wrapped for offseason resilience.
hit = safe('player search: Spooner', lambda: pwhl.pwhl_player_search('Spooner'))
if hit is not None and getattr(hit, 'height', 0):
    pid = int(hit['player_id'][0])
    career = safe(f'player stats {pid}', lambda: pwhl.pwhl_player_stats(player_id=pid))
    if career is not None and career.height:
        # Keep only the columns the feed actually returned.
        keep = [c for c in ['season_name', 'team_code', 'games_played',
                            'goals', 'assists', 'points', 'points_per_game']
                if c in career.columns]
        out = career.select(keep)
    else:
        out = 'player stats feed unavailable right now'
else:
    out = 'player search feed unavailable right now'
out
✅ player search: Spooner
✅ player stats 100
shape: (10, 7)
ββββββββββββββββββββββββββ¬ββββββββββββ¬βββββββββββββββ¬ββββββββ¬ββββββ ββββ¬βββββββββ¬ββββββββββββββββββ
β season_name β team_code β games_played β goals β assists β points β points_per_game β
β --- β --- β --- β --- β --- β --- β --- β
β str β str β str β str β str β str β str β
ββββββββββββββββββββββββββͺββββββββββββͺβββββββββββββββͺββββββββͺββββββββββͺβββββββββͺββββββββββββββββββ‘
β 2025-26 Regular Season β TOR β 30 β 3 β 5 β 8 β 0.27 β
β 2024-25 Regular Season β TOR β 14 β 3 β 2 β 5 β 0.36 β
β 2024 Regular Season β TOR β 24 β 20 β 7 β 27 β 1.13 β
β Total β null β 68 β 26 β 14 β 40 β 0.59 β
β 2025-26 Preseason β TOR β 1 β 0 β 1 β 1 β 1.00 β
β 2024 Preseason β TOR β 1 β 0 β 0 β 0 β 0.00 β
β Total β null β 2 β 0 β 1 β 1 β 0.50 β
β 2025 Playoffs β TOR β 4 β 0 β 1 β 1 β 0.25 β
β 2024 Playoffs β TOR β 3 β 1 β 1 β 2 β 0.67 β
β Total β null β 7 β 1 β 2 β 3 β 0.43 β
ββββββββββββββββββββββββββ΄ββββββββββββ΄βββββββββββββββ΄ββββββββ΄ββββββββββ΄βββββββββ΄ββββββββββββββββββ
Recipe 13: A team, its roster, and a game's PBP + Corsi
The full live tour. List teams with pwhl_teams, grab a team_id, pull the roster with pwhl_team_roster, take a game_id from the loader schedule, then fetch enriched events with pwhl_pbp and shot-attempt share with pwhl_game_corsi, all from the same feed. Everything is safe()-wrapped, so offline this prints a friendly note instead of raising.
teams = safe('PWHL teams', lambda: pwhl.pwhl_teams(season=2024))
if teams is not None and teams.height:
    tid = int(teams['team_id'][0])
    roster = safe(f'PWHL roster {tid}', lambda: pwhl.pwhl_team_roster(team_id=tid, season=2024))
    # Fall back to the teams frame if the roster call failed.
    out = (roster.select([c for c in ['first_name', 'last_name', 'position', 'jersey_number']
                          if c in roster.columns]).head()
           if roster is not None else teams.head())
else:
    out = 'teams feed unavailable right now'
out
✅ PWHL teams
✅ PWHL roster 1
shape: (5, 3)
ββββββββββββββ¬ββββββββββββ¬βββββββββββ
β first_name β last_name β position β
β --- β --- β --- β
β str β str β str β
ββββββββββββββͺββββββββββββͺβββββββββββ‘
β Emily β Brown β D β
β Megan β Keller β D β
β Sidney β Morin β D β
β Lexie β Adzija β F β
β Sophie β Shirley β F β
ββββββββββββββ΄β βββββββββββ΄βββββββββββ
# A game_id from the loader schedule (offline-safe), then enrich it live.
gid = int(schedule['game_id'][0])
pbp_live = safe(f'PWHL pbp {gid}', lambda: pwhl.pwhl_pbp(game_id=gid))
corsi = safe(f'PWHL corsi {gid}', lambda: pwhl.pwhl_game_corsi(game_id=gid))
print('live pbp rows:', None if pbp_live is None else pbp_live.height,
'| corsi rows:', None if corsi is None else corsi.height)
✅ PWHL pbp 84
✅ PWHL corsi 84
live pbp rows: 188 | corsi rows: 39
Live standings & leaders
Straight off the HockeyTech feed: pwhl_standings for the live table and pwhl_leaders for the statistical leaderboard. Both take a season end-year. We keep them safe()-wrapped because live endpoints are seasonal.
standings = safe('PWHL standings', lambda: pwhl.pwhl_standings(season=2024))
if standings is not None and standings.height:
    keep = [c for c in ['team', 'team_code', 'games_played', 'wins', 'losses', 'points']
            if c in standings.columns]
    out = standings.select(keep).head(10)
else:
    out = 'standings feed unavailable right now'
out
✅ PWHL standings
shape: (6, 6)
ββββββββββββββββββββββ¬ββββββββββββ¬βββββββββββββββ¬βββββββ¬βββββββββ¬βββββββββ
β team β team_code β games_played β wins β losses β points β
β --- β --- β --- β --- β --- β --- β
β str β str β str β i64 β str β i64 β
ββββββββββββββββββββββͺββββββββββββͺβββββββββββββββͺβββββββͺβββββββββͺβββββββββ‘
β x - PWHL Toronto β x - TOR β 24 β 17 β 7 β 47 β
β x - PWHL Montreal β x - MTL β 24 β 13 β 6 β 41 β
β x - PWHL Boston β x - BOS β 24 β 12 β 9 β 35 β
β x - PWHL Minnesota β x - MIN β 24 β 12 β 9 β 35 β
β e - PWHL Ottawa β e - OTT β 24 β 9 β 9 β 32 β
β e - PWHL New York β e - NY β 24 β 9 β 12 β 26 β
ββββββββββββββββββββββ΄ββββββββββββ΄βββββββββββββββ΄βββββββ΄βββββββββ΄βββββββββ
leaders = safe('PWHL leaders', lambda: pwhl.pwhl_leaders(season=2024))
if leaders is not None and getattr(leaders, 'height', 0):
    keep = [c for c in ['rank', 'name', 'team_code', 'stat_formatted', 'type_formatted']
            if c in leaders.columns]
    out = leaders.select(keep).head(10)
else:
    out = 'leaders feed unavailable right now'
out
✅ PWHL leaders
shape: (10, 5)
ββββββββ¬ββββββββββββββββββββββ¬ββββββββββββ¬βββββββββββββββββ¬βββββββββββββββββ
β rank β name β team_code β stat_formatted β type_formatted β
β --- β --- β --- β --- β --- β
β i64 β str β str β str β str β
ββββββββͺββββββββββββββββββββββͺββββββββββββͺβββββββββββββββββͺβββββββββββββββββ‘
β 1 β Natalie Spooner β TOR β 27 β Points β
β 2 β Sarah Nurse β TOR β 23 β Points β
β 3 β Marie-Philip Poulin β MTL β 23 β Points β
β 4 β Alex Carpenter β NY β 23 β Points β
β 5 β Ella Shelton β NY β 21 β Points β
β 1 β Natalie Spooner β TOR β 20 β Goals β
β 2 β Sarah Nurse β TOR β 11 β Goals β
β 3 β Grace Zumwinkle β MIN β 11 β Goals β
β 4 β Marie-Philip Poulin β MTL β 10 β Goals β
β 5 β Laura Stacey β MTL β 10 β Goals β
ββββββββ΄ββββββββββββββββββββββ΄ββββββββββββ΄βββββββββββββββββ΄βββββββββββββββββ
On-ice analytics
Beyond the box score, three analytics helpers derive advanced metrics from the same shift + play-by-play feed:
| Function | Metric |
|---|---|
| pwhl_game_corsi | Corsi / Fenwick shot-attempt share, with per-60 rates |
| pwhl_player_toi | summed time-on-ice + shift counts per player |
| pwhl_game_shifts | raw shift stints (who's on the ice, when) |
Corsi note: the HockeyTech feed has no missed-shot event, so Corsi and Fenwick here are proxies counting shots + blocked shots + goals only (corsi_includes_missed = False).
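For readers new to these metrics, the arithmetic behind a shot-attempt share and a per-60 rate can be sketched in a few lines. The function names and the 920-second ice time below are hypothetical, and this is the textbook formula rather than a confirmed description of what pwhl_game_corsi computes internally.

```python
from typing import Optional


def corsi_for_pct(corsi_for: int, corsi_against: int) -> Optional[float]:
    """Share of on-ice shot attempts taken by the player's team, in percent."""
    total = corsi_for + corsi_against
    if total == 0:
        return None
    return round(100 * corsi_for / total, 1)


def per60(count: int, toi_seconds: float) -> Optional[float]:
    """Scale a raw count to a rate per 60 minutes of ice time."""
    if not toi_seconds:
        return None
    return count * 3600 / toi_seconds


# Hypothetical skater: 18 attempts for, 5 against, 920 seconds on the ice.
share = corsi_for_pct(18, 5)
rate = per60(18, 920)
```

A share above 50 means the player's team out-attempted the opponent while she was on the ice.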
toi = safe(f'PWHL TOI {gid}', lambda: pwhl.pwhl_player_toi(game_id=gid))
if toi is not None and toi.height:
    out = (toi.select([c for c in ['first_name', 'last_name', 'toi_seconds', 'num_shifts']
                       if c in toi.columns])
           .sort('toi_seconds', descending=True).head())
else:
    out = 'time-on-ice feed unavailable right now'
out
✅ PWHL TOI 84
shape: (5, 4)
ββββββββββββββ¬ββββββββββββ¬ββββββββββββββ¬βββββββββββββ
β first_name β last_name β toi_seconds β num_shifts β
β --- β --- β --- β --- β
β str β str β i64 β u32 β
ββββββββββββββͺββββββββββββͺββββββββββββββͺβββββββββββββ‘
β Nicole β Hensley β 3600 β 3 β
β Kristen β Campbell β 3600 β 3 β
β Jocelyne β Larocque β 1677 β 29 β
β Renata β Fast β 1674 β 28 β
β Sophie β Jaques β 1402 β 26 β
ββββββββββββββ΄ββββββββββββ΄ββββββββββββββ΄βββββββββββββ
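Raw seconds are awkward to read, so a tiny formatter helps when presenting the table above. format_toi is a hypothetical helper, shown as a minimal sketch:

```python
def format_toi(toi_seconds: int) -> str:
    """Render a time-on-ice value in seconds as M:SS."""
    minutes, seconds = divmod(int(toi_seconds), 60)
    return f"{minutes}:{seconds:02d}"


# Two values taken from the toi_seconds column above.
skater_toi = format_toi(1677)
goalie_toi = format_toi(3600)
```

Here 1677 seconds reads as 27:57 and a full 3600 seconds as 60:00.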
if corsi is not None and corsi.height:
    # Add corsi_net first, so the column filter below can see it.
    corsi_plus = corsi.with_columns(
        (pl.col('corsi_for') - pl.col('corsi_against')).alias('corsi_net'))
    out = (corsi_plus
           .select([c for c in ['player_id', 'corsi_for', 'corsi_against', 'corsi_net', 'corsi_for_per60']
                    if c in corsi_plus.columns])
           .sort('corsi_for_per60', descending=True)
           .head())
else:
    out = 'corsi feed unavailable right now'
out
shape: (5, 5)
┌───────────┬───────────┬───────────────┬───────────┬─────────────────┐
│ player_id ┆ corsi_for ┆ corsi_against ┆ corsi_net ┆ corsi_for_per60 │
│ ---       ┆ ---       ┆ ---           ┆ ---       ┆ ---             │
│ str       ┆ i64       ┆ i64           ┆ i64       ┆ f64             │
╞═══════════╪═══════════╪═══════════════╪═══════════╪═════════════════╡
│ 115       ┆ 18        ┆ 5             ┆ 13        ┆ 84.155844       │
│ 76        ┆ 20        ┆ 5             ┆ 15        ┆ 78.26087        │
│ 89        ┆ 17        ┆ 7             ┆ 10        ┆ 72.943981       │
│ 100       ┆ 19        ┆ 10            ┆ 9         ┆ 66.86217        │
│ 20        ┆ 20        ┆ 17            ┆ 3         ┆ 64.228368       │
└───────────┴───────────┴───────────────┴───────────┴─────────────────┘
Bonus: tidy goal log + pandas interop
load_pwhl_scoring_summary is a clean per-goal log: scorer plus up to two assists, with situation flags like power play, short handed, and game-winning. And because every loader takes return_as_pandas=True, dropping into the pandas world is one keyword away.
scoring = pwhl.load_pwhl_scoring_summary(seasons=[2024])
scoring.select([
'game_id', 'period', 'time', 'team_abbr',
'scorer_first', 'scorer_last', 'is_power_play', 'is_game_winning',
]).head()
shape: (5, 8)
βββββββββββ¬βββββββββ¬ββββββββ¬ββββββββββββ¬βββββββββββββββ¬ββββββββββββββ¬ββββββββββββββββ¬βββββββββββββββ
β game_id β period β time β team_abbr β scorer_first β scorer_last β is_power_play β is_game_winn β
β --- β --- β --- β --- β --- β --- β --- β ing β
β i32 β str β str β str β str β str β i32 β --- β
β β β β β β β β i32 β
βββββββββββͺβββββββββͺββββββββͺββββββββββββͺβββββββββββββββͺββββββββββββββͺββββββββββββββββͺβββββββββββββββ‘
β 2 β 1st β 10:43 β NY β Ella β Shelton β 0 β 1 β
β 2 β 3rd β 2:53 β NY β Alex β Carpenter β 0 β 0 β
β 2 β 3rd β 4:57 β NY β Jill β Saulnier β 0 β 0 β
β 2 β 3rd β 7:42 β NY β Kayla β Vespa β 0 β 0 β
β 3 β 2nd β 16:24 β OTT β Hayley β Scamurra β 1 β 0 β
βββββββββββ΄βββββββββ΄ββββββββ΄ββββββββββββ΄βββββββββββββββ΄ββββββββββββββ΄ββββββββββββββββ΄ βββββββββββββββ
# Same skater box, but as a pandas DataFrame: group with the pandas API.
skater_pd = pwhl.load_pwhl_skater_box(seasons=[2024], return_as_pandas=True)
print('type:', type(skater_pd).__name__, '| shape:', skater_pd.shape)
(skater_pd
.groupby(['first_name', 'last_name'], as_index=False)['points'].sum()
.sort_values('points', ascending=False)
.head(10))
type: DataFrame | shape: (3205, 22)
first_name last_name points
101 Natalie Spooner 29
90 Marie-Philip Poulin 25
114 Sarah Nurse 24
4 Alex Carpenter 23
35 Ella Shelton 21
40 Emma Maltais 21
126 Taylor Heise 21
18 Brianne Jenner 20
42 Erin Ambrose 20
47 Grace Zumwinkle 20
Where to next
- Loaders are your offline-friendly workhorses: stack seasons with seasons=[2024, 2025] and pass return_as_pandas=True for pandas.
- Live wrappers (pwhl_*) pull fresh data and add analytics (Corsi, TOI, shifts), no key required.
- Full reference: the PWHL - Loaders and Additional functions pages in the sidebar.
- Junior & minor hockey? The same HockeyTech surface powers the AHL / OHL / WHL / QMJHL; see 11_junior_hockey_intro.ipynb.
- The men's game and the modern NHL APIs live in 07_nhl_intro.ipynb.
- R user? The same data lives in fastRhockey (NHL + PWHL).
Now go tell the story of the PWHL: the data's all here.