Table of Contents generated with DocToc
- 0.0.51 (unreleased)
- New: MLB module (greenfield)
- New: NHL — api-web.nhle.com migration + EDGE / Stats REST / Records
- New: ESPN cross-league port
- New: NCAA bracketology
- New: _common_espn_parsers.py (polars / pandas parser layer)
- New: return_parsed=True dispatch shim
- New: nhl_edge_parsers.py
- New: Site v2 summary dispatcher (20 sub-parsers)
- New: 100% ENDPOINT_PARSERS coverage (121/121)
- New: weekly cron live-test drift detector
- New: MLB Stats API parser layer
- Test infrastructure
- Documentation
- 0.0.50 Release: May 7, 2026
- Packaging modernization
- Conda installability
- Linting & pre-commit modernization
- Documentation toolchain
- Runnable docstring examples (~190 functions)
- Example notebooks
- Contributor docs and templates
- NFL — nflreadpy parity
- NFL — caching and configuration
- NFL — static datasets
- NFL — pickcenter / odds modern path
- NFL — load_nfl_schedule parquet port
- WBB / WNBA — new ESPN scrape modules
- CFB — cfb_play_participants and __add_player_cols collapse
- CFB — pandas → polars 1.x bug-fix reconciliation (0.36-live → main)
- Infrastructure and tooling
- Bug fixes
- Deprecations
- 0.0.40 Release: December 6, 2025
- 0.0.38-39 Release: August 28, 2023
- 0.0.36-37 Release: July 9, 2023
- 0.0.34-35 Release: May 7-9, 2023
- 0.0.18 Release: July 25, 2022
- 0.0.17 Release: July 9, 2022
- 0.0.15 Release: May 8, 2022
- 0.0.14 Release: March 16, 2022
- 0.0.12 Release: February 24, 2022
- 0.0.5 Release: October 20, 2021
0.0.51 (unreleased)​
A second big release on top of 0.0.50. The headline items:
- New sportsdataverse.mlb module (greenfield) — 175 functions spanning three data surfaces:
  - 113 ESPN cross-league wrappers + 5 ESPN originals
  - 40 official MLB Stats API wrappers (statsapi.mlb.com)
  - 17 Baseball Savant / Statcast wrappers including auto-chunked 25,000-row truncation handling on /statcast_search/csv
- NHL migrated to api-web.nhle.com/v1/ — the deprecated statsapi.web.nhl.com host is gone; replaced with 26 modern nhl_web_* wrappers grounded in the OpenAPI spec at fastRhockey/data-raw/nhl_api_web_openapi.yaml.
- Cross-league ESPN port from hoopR / wehoop / cfbfastR — 804 new wrappers across 8 leagues (NBA, MBB, WNBA, WBB, CFB, NFL, MLB, NHL) via a single ~80-function core (_common_espn.py) parameterized on the (sport, league) slug. Each per-league extension module is a 5-line file calling make_league_module() to mass-register the wrappers with proper __name__ / __qualname__ / __doc__ for IDE discoverability.
- 3 new NHL modules for the historical / Statcast surfaces:
  - nhl_edge — 35 wrappers for the NHL EDGE player-tracking system (api-web.nhle.com/v1/edge/*)
  - nhl_stats_rest — 21 wrappers for the official stats REST API (api.nhle.com/stats/rest/) with verbatim Cayenne filter expression support
  - nhl_records — 50 wrappers for the records site (records.nhl.com/site/api/) covering awards, coaches, franchises, HOF, draft, all-star, GMs
- NCAA bracketology — espn_mbb_bracketology() and espn_wbb_bracketology() for the non-league sports.core.api.espn.com/v2/tournament/{22,23}/seasons/{y}/bracketology endpoint (live during the projection window, Jan-Mar).
- 20 polars/pandas parsers in _common_espn_parsers.py covering the most-used ESPN payload shapes (scoreboard, teams, standings, groups, athlete overview/stats/gamelog/splits, leaders, coaches, draft, event-competitor surface, team schedule/roster, news, injuries, generic Core v2 paginated lists).
- 4 NHL EDGE family parsers + 3 sub-frame parsers in nhl_edge_parsers.py, schema-grounded against live captures from 2026-05-23.
- return_parsed=True dispatch shim — every wrapper whose short name has a registered parser (57 keys currently in ENDPOINT_PARSERS) gains an optional return_parsed=True kwarg that routes the raw response through the parser and returns a polars DataFrame (pandas via return_as_pandas=True). The raw-Dict path is unchanged — the shim is backwards-compatible and strictly additive.
- 80 offline parser tests (NHL EDGE 32 + universal ESPN 16 + the cross-league shim suite) + 32 live-gated integration tests under SDV_PY_LIVE_TESTS=1 so default test runs never hit live endpoints.
New: MLB module (greenfield)​
- New top-level sportsdataverse.mlb package with 8 submodules.
- mlb_api.py (40 functions) wraps the official MLB Stats API. IDs to know: sportId=1 is MLB, leagueId 103=AL / 104=NL, gameType slugs R/F/D/L/W/S/A/E/PO. Player IDs (personId / batter / pitcher) are the same MLBAM id space shared with Baseball Savant.
- mlb_statcast.py (17 functions) wraps Baseball Savant. The unofficial CSV search at /statcast_search/csv truncates at exactly 25,000 rows with no pagination; statcast_search raises RuntimeError when the response hits that cap (default, raise_on_truncation=True). Use statcast_search_chunked for multi-week ranges — it auto-chunks the date range and stitches client-side.
- mlb_espn_ext.py registers 113 cross-league ESPN wrappers via make_league_module(..., include_mlb=True), which adds the MLB-only espn_mlb_athlete_hotzones to the universal surface.
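The chunk-and-stitch idea behind statcast_search_chunked can be sketched roughly as below. This is an illustration only, not the package's implementation: chunk_date_range, fetch_chunked, the fetch callable, and the 7-day window are all hypothetical names and assumptions.

```python
from datetime import date, timedelta

ROW_CAP = 25_000  # truncation cap on /statcast_search/csv, as described above


def chunk_date_range(start, end, days=7):
    """Yield inclusive (chunk_start, chunk_end) windows covering start..end."""
    cur = start
    while cur <= end:
        chunk_end = min(cur + timedelta(days=days - 1), end)
        yield cur, chunk_end
        cur = chunk_end + timedelta(days=1)


def fetch_chunked(fetch, start, end, days=7):
    """Call fetch(chunk_start, chunk_end) per window and stitch the rows.

    `fetch` is a hypothetical callable that returns a list of row dicts.
    """
    rows = []
    for lo, hi in chunk_date_range(start, end, days):
        part = fetch(lo, hi)
        if len(part) >= ROW_CAP:
            # A chunk at the cap was probably truncated by the server.
            raise RuntimeError(f"chunk {lo}..{hi} hit the {ROW_CAP}-row cap")
        rows.extend(part)
    return rows


# 17 days split into 7-day windows: two full windows plus a 3-day remainder.
windows = list(chunk_date_range(date(2025, 4, 1), date(2025, 4, 17)))
```

If a single window still reaches the cap, a smaller `days` value is the obvious knob to turn in this sketch.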
New: NHL — api-web.nhle.com migration + EDGE / Stats REST / Records​
- The deprecated statsapi.web.nhl.com is gone. nhl_api.py keeps a small set of backward-compatible aliases that warn and delegate to nhl_api_web.
- nhl_api_web.py (26 functions) covers the modern game-feed API at https://api-web.nhle.com/v1/.
- nhl_edge.py (35 functions) wraps the NHL EDGE player-tracking surface — skater / goalie / team detail, shot-location, shot-speed, skating distance, zone time, plus 12 *_top_10 leaderboards.

  Note: all 12 *_top_10 URL paths return 404 as of 2026-05-23 — the OpenAPI spec lists them but they're not live. The wrappers and parse_edge_top10 are kept for forward-compatibility.
- nhl_stats_rest.py (21 functions) wraps the official Stats REST API at api.nhle.com/stats/rest/. Verbatim Cayenne filter expression support via cayenneExp / factCayenneExp kwargs.
- nhl_records.py (50 functions) wraps the records site at records.nhl.com/site/api/ — awards, coaches, franchises, skaters, goalies, draft, all-star, HOF, GMs, attendance, fastest goals, team records.
New: ESPN cross-league port​
- _common_espn.py exposes ~80 core functions parameterized on (sport, league).
- make_league_module(sport, league, prefix, globals(), include_ncaa=, include_football=, include_mlb=) mass-registers wrappers in the caller's namespace. Each per-league extension file is a 5-line wrapper.
- Wrappers use functools.partial with explicit __name__ / __qualname__ / __doc__ so they behave like real functions for help(), IDE auto-complete, and inspect.signature().
- The _NCAA_WRAPPERS table adds rankings, season_recruits, season_week_rankings for mbb, wbb, cfb.
- The _FOOTBALL_WRAPPERS table adds season_qbr, season_qbr_week for nfl, cfb.
- The new _MLB_WRAPPERS table adds athlete_hotzones for mlb.
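A minimal sketch of the factory pattern described above: bind (sport, league) with functools.partial, then set the naming attributes before registering the wrapper in a namespace. The names _core_scoreboard, _WRAPPERS_SKETCH, and make_league_module_sketch are hypothetical, and the URL is a placeholder; the real make_league_module signature takes more arguments than shown here.

```python
import functools
import inspect


def _core_scoreboard(sport, league, dates=None):
    """Hypothetical core function; returns the request it would make."""
    return {"url": f"https://example.invalid/{sport}/{league}/scoreboard", "dates": dates}


# Short name -> core function (stand-in for the universal wrapper table).
_WRAPPERS_SKETCH = {"scoreboard": _core_scoreboard}


def make_league_module_sketch(sport, league, prefix, namespace):
    """Register one bound wrapper per short name in `namespace`."""
    registered = []
    for short, core in _WRAPPERS_SKETCH.items():
        name = f"{prefix}_{short}"
        bound = functools.partial(core, sport, league)
        # partial objects accept attribute assignment, so tooling that reads
        # these attributes sees a normally named function.
        bound.__name__ = name
        bound.__qualname__ = name
        bound.__doc__ = core.__doc__
        namespace[name] = bound
        registered.append(name)
    return registered


ns = {}
names = make_league_module_sketch("basketball", "nba", "espn_nba", ns)
params = list(inspect.signature(ns["espn_nba_scoreboard"]).parameters)
```

In a real extension file the namespace would be the module's globals(), which is what keeps each per-league file so short.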
New: NCAA bracketology​
- espn_mbb_bracketology(season, iteration=None) / espn_wbb_bracketology(...) at sports.core.api.espn.com/v2/tournament/{22,23}/seasons/{y}/bracketology.
- The endpoint is seasonal — live during the projection window (roughly January through March each year) and 404s the rest of the year. Integration tests handle this with pytest.xfail so off-season CI runs don't fail.
New: _common_espn_parsers.py (polars / pandas parser layer)​
- 20 parsers covering the highest-traffic ESPN payload shapes. All parsers are league-agnostic — the same parser handles MLB, NFL, NBA, etc. because ESPN's payload shapes are identical across leagues.
- Every parser returns polars by default; return_as_pandas=True yields pandas. Empty / malformed payloads return zero-row frames rather than raising.
- Output columns snake-cased via sportsdataverse.dl_utils.underscore.
- ENDPOINT_PARSERS registry has 57 short-name keys mapped to 20 unique parsers; covers the universal table plus NCAA / football / MLB extras.
- parser_for(short_name) lookup helper.
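The parser contract above (snake-cased columns, zero rows instead of an exception, registry lookup by short name) can be sketched as follows. To stay dependency-free, a plain list of row dicts stands in for the polars DataFrame. Every name ending in _sketch is hypothetical, the underscore helper is a simplified stand-in for dl_utils.underscore, and returning None for an unknown short name is an assumption.

```python
import re


def underscore_sketch(name):
    """Convert camelCase to snake_case (simplified stand-in)."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def parse_items_sketch(payload):
    """Return one row per element of payload['items']; [] on bad input."""
    items = payload.get("items") if isinstance(payload, dict) else None
    if not isinstance(items, list):
        return []  # zero-row result rather than raising
    return [
        {underscore_sketch(key): value for key, value in item.items()}
        for item in items
        if isinstance(item, dict)
    ]


# Several short names can share one parser, as in the 57-keys-to-20-parsers registry.
ENDPOINT_PARSERS_SKETCH = {
    "teams": parse_items_sketch,
    "venues": parse_items_sketch,
}


def parser_for_sketch(short_name):
    """Look up the parser for a wrapper short name (None if unregistered)."""
    return ENDPOINT_PARSERS_SKETCH.get(short_name)


rows = parser_for_sketch("teams")({"items": [{"displayName": "Sample Team", "id": "1"}]})
```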
New: return_parsed=True dispatch shim​
- _bind() in _common_espn.py was extended with an optional parser= argument. When present, the bound wrapper is a closure that adds return_parsed=False and return_as_pandas=False kwargs; when return_parsed=True, the closure dispatches the raw response through the parser and returns a DataFrame.
- make_league_module() looks up the parser via parser_for(short) on each wrapper registration. The lookup is lazy-imported so a missing parsers module doesn't break the package.
- API contract: every existing caller continues to get raw Dict — the shim is opt-in via the new kwargs.
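The opt-in closure described above might look roughly like this. It is a sketch under assumptions, not the code in _common_espn.py: bind_sketch, fake_fetch, and fake_parser are hypothetical, and the parser returns a tagged list in place of a real DataFrame.

```python
import functools


def bind_sketch(fetch, parser=None):
    """Return fetch unchanged, or wrapped with opt-in parsing kwargs."""
    if parser is None:
        return fetch

    @functools.wraps(fetch)
    def wrapper(*args, return_parsed=False, return_as_pandas=False, **kwargs):
        raw = fetch(*args, **kwargs)
        if not return_parsed:
            return raw  # default path: existing callers still get the raw dict
        return parser(raw, return_as_pandas=return_as_pandas)

    return wrapper


def fake_fetch(team_id):
    """Hypothetical raw wrapper; a real one would perform an HTTP request."""
    return {"items": [{"id": team_id}]}


def fake_parser(raw, return_as_pandas=False):
    """Hypothetical parser; tags rows with the frame type it would build."""
    kind = "pandas" if return_as_pandas else "polars"
    return [(kind, item["id"]) for item in raw["items"]]


espn_demo_teams = bind_sketch(fake_fetch, parser=fake_parser)
raw_result = espn_demo_teams(7)
parsed_result = espn_demo_teams(7, return_parsed=True)
```

Because both new kwargs default to False and are keyword-only, positional call sites are unaffected.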
New: nhl_edge_parsers.py​
- 4 family parsers (parse_edge_top10, parse_edge_detail, parse_edge_shot_location, parse_edge_zone_time) + generic fallback (parse_edge_payload).
- 3 sub-frame parsers (parse_edge_sog_details, parse_edge_sog_summary, parse_edge_hardest_shots) for unrolling the rich nested lists inside detail payloads that parse_edge_detail deliberately stringifies.
- EDGE_ENDPOINT_PARSERS registers 33 of the 35 EDGE wrappers (the remaining 2 fall through to the generic parser via parser_for_edge).
- EDGE_SUBFRAME_PARSERS maps each detail wrapper to the tuple of sub-frame parsers that apply.
New: Site v2 summary dispatcher (20 sub-parsers)​
The Site v2 summary endpoint
(espn_{league}_summary(event_id=...)) ships ~19-22 top-level sections
per game (~700 KB to 1.8 MB per call). Rather than collapse that into
one parser, the summary surface now has 20 targeted sub-parsers plus a
dispatcher:
- parse_summary_boxscore_player — one row per (team × athlete) with the parallel keys / stats arrays zipped (e.g. NBA produces 27 rows with min, fg, 3pt, ft, reb, ast, ... columns).
- parse_summary_boxscore_team — one row per (team × stat) with stat_name, stat_label, stat_display_value.
- parse_summary_plays — one row per play (~450 rows per NBA game).
- parse_summary_winprobability — one row per win-prob tick (joinable to plays via play_id).
- parse_summary_leaders — one row per (team × category × leader) from the 3-level leaders[] nesting.
- parse_summary_game_info, parse_summary_officials, parse_summary_header, parse_summary_season_series, parse_summary_against_the_spread, parse_summary_standings, parse_summary_broadcasts, parse_summary_format, parse_summary_pickcenter, parse_summary_odds, parse_summary_article, parse_summary_injuries, parse_summary_news — one row per (or one row total for) the corresponding summary section.
- parse_summary_drives, parse_summary_scoring_plays — NFL / CFB specific (NFL summary ships drives.previous[] + scoringPlays instead of top-level plays). Return zero-row frames for non-football leagues.
- parse_summary(payload, section=None) — dispatcher. With section=None returns a dict of all 20 sub-frames keyed by section name; with section="<name>" returns just that frame. Empty payload returns a dict of 20 zero-row frames.
- SUMMARY_SECTION_PARSERS — public registry mapping section name to parser.
Cross-league parity tests verify the dispatcher works against captured fixtures for NBA / MLB / NFL / NHL / WNBA — same code path handles every league's summary endpoint.
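The dispatcher contract and the keys / stats zip can be sketched with two sections instead of twenty. All names ending in _sketch are hypothetical, lists of dicts stand in for DataFrames, and the payload layout used here (boxscore.players[].statistics[].athletes[]) is an assumption about the summary shape made for illustration.

```python
def parse_boxscore_player_sketch(payload):
    """One row per (team x athlete), zipping the parallel keys/stats arrays."""
    rows = []
    for team in payload.get("boxscore", {}).get("players", []):
        team_id = team.get("team", {}).get("id")
        for block in team.get("statistics", []):
            keys = block.get("keys", [])
            for entry in block.get("athletes", []):
                row = {"team_id": team_id, "athlete_id": entry.get("athlete", {}).get("id")}
                row.update(zip(keys, entry.get("stats", [])))
                rows.append(row)
    return rows


def parse_plays_sketch(payload):
    """One row per play."""
    return [dict(play) for play in payload.get("plays", [])]


SECTION_PARSERS_SKETCH = {
    "boxscore_player": parse_boxscore_player_sketch,
    "plays": parse_plays_sketch,
}


def parse_summary_sketch(payload, section=None):
    """Return one section's rows, or a dict of every section when section is None."""
    payload = payload or {}
    if section is not None:
        return SECTION_PARSERS_SKETCH[section](payload)
    return {name: parser(payload) for name, parser in SECTION_PARSERS_SKETCH.items()}


sample = {
    "boxscore": {"players": [{
        "team": {"id": "1"},
        "statistics": [{
            "keys": ["min", "reb"],
            "athletes": [{"athlete": {"id": "10"}, "stats": ["31", "8"]}],
        }],
    }]},
    "plays": [{"id": "p1", "text": "Jump ball"}],
}
all_sections = parse_summary_sketch(sample)
empty_sections = parse_summary_sketch({})
```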
New: 100% ENDPOINT_PARSERS coverage (121/121)​
Every wrapper short name across all 4 wrapper tables
(_UNIVERSAL_WRAPPERS, _NCAA_WRAPPERS, _FOOTBALL_WRAPPERS,
_MLB_WRAPPERS) is now registered in ENDPOINT_PARSERS. Every
factory-bound wrapper plus the hand-bound NCAA bracketology helpers
accepts return_parsed=True and return_as_pandas=True.
Two new generic fall-through parsers cover the long tail:
- parse_single_entity — flattens any single-resource Core v2 payload (team, venue, franchise, coach, award, position, season_info, athlete_core, event_competitor, etc.) to a one-row frame.
- parse_items was already generic for {items: [...]} Core v2 lists and Core v2 {entries: [...]} (athlete_statisticslog); this release expands its registration to ~30 more list-shape endpoints (calendar variants, event lists, season_powerindex, talentpicks, etc.).
register_ncaa_bracketology was upgraded to wrap the bracketology
helpers in the same return_parsed=True shim used by make_league_module
— previously they were hand-bound without the shim.
Three regression tests lock in the invariant:
- test_every_wrapper_short_name_has_a_registered_parser
- test_no_stale_entries_in_endpoint_parsers_registry
- test_return_parsed_shim_active_on_every_wrapper_across_all_leagues (walks the __all__ of every league extension module and verifies 819+ wrappers carry the shim).
New: weekly cron live-test drift detector​
.github/workflows/live-tests-cron.yml runs the full live test suite
(tests/test_espn_live.py and any other SDV_PY_LIVE_TESTS=1 gated
tests) every Monday 13:00 UTC and on workflow_dispatch. On failure,
the workflow uses actions/github-script to find or create a tracking
issue labeled live-tests:drift:
- First failure opens a new issue with the last 4 KB of pytest output plus a run URL.
- Subsequent failures comment on the existing open issue instead of duplicating.
- Closing the issue resets state.
Catches upstream API drift (ESPN schema changes, NHL EDGE 404s, MLB Stats API URL moves) on a regular cadence even when the repo is otherwise quiet between releases.
New: MLB Stats API parser layer​
sportsdataverse.mlb.mlb_api_parsers turns the 40 raw-Dict
mlb_api_* wrappers into tidy polars / pandas DataFrames. Mirrors
the design of _common_espn_parsers:
- Every parser returns polars by default; pandas via return_as_pandas=True.
- Empty / malformed payloads return zero-row frames.
- Output columns snake-cased via sportsdataverse.dl_utils.underscore.
- Most parsers use pandas.json_normalize for one-pass flattening.
Five dedicated parsers handle the high-traffic endpoints with their own unrolling logic:
- parse_mlb_api_schedule — walks dates[].games[] and prefixes the schedule date onto each game row (one row per game with teams.home.* / teams.away.* / venue.* / status.* flattened).
- parse_mlb_api_teams — one row per team from teams[].
- parse_mlb_api_team_roster — one row per player from roster[] with person, position, status sub-dicts flattened.
- parse_mlb_api_standings — walks records[].teamRecords[], prefixes division identifiers (namespaced standings_* to avoid column collisions with team-record fields like lastUpdated), and produces one row per (division × team).
- parse_mlb_api_person_stats — walks stats[].splits[] (also handles mlb_api_team_stats with the same shape), prefixes stats_type / stats_group from the parent block, and flattens the inner stat block to wide stat columns.
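As an illustration of the dates[].games[] walk, here is a dependency-free sketch that carries the parent date onto each game row and flattens nested dicts. The names flatten_sketch and parse_schedule_sketch, the schedule_date column, and the underscore-joined key style are assumptions; the real parser relies on pandas.json_normalize and the package's own column naming.

```python
def flatten_sketch(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_sketch(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def parse_schedule_sketch(payload):
    """One row per game, with the parent date carried onto each row."""
    rows = []
    for day in (payload or {}).get("dates", []):
        for game in day.get("games", []):
            row = {"schedule_date": day.get("date")}
            row.update(flatten_sketch(game))
            rows.append(row)
    return rows


sample_schedule = {
    "dates": [{
        "date": "2026-05-24",
        "games": [
            {"gamePk": 1, "teams": {"home": {"score": 3}, "away": {"score": 2}}},
            {"gamePk": 2, "venue": {"name": "Sample Park"}},
        ],
    }]
}
schedule_rows = parse_schedule_sketch(sample_schedule)
```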
A generic parse_mlb_api_list fallback handles every list-shape
endpoint that doesn't need extra unrolling (venues, sports, leagues,
divisions, seasons, awards, umpires, draft, draft_prospects,
attendance, team_leaders, team_alumni, team_affiliates, stats,
stats_leaders, stats_streaks, people, sport_players).
MLB_API_ENDPOINT_PARSERS registry has 26 entries (7 dedicated + 19
generic). parser_for_mlb_api(fn_name) returns the registered
parser; unknown names fall back to parse_mlb_api_list so the
caller always gets a DataFrame-returning callable.
Test fixtures captured 2026-05-24 from statsapi.mlb.com (8 captures
in tests/fixtures/mlb_api/). 17 offline tests in
tests/test_mlb_api_parsers.py exercise each dedicated parser plus
the generic fallback against the live fixtures.
Test infrastructure​
- New tests/test_espn_universal_parsers.py (65 tests), tests/test_mlb_api_parsers.py (17 tests), and tests/test_nhl_edge_parsers.py (32 tests) run offline against captured fixtures.
- New tests/test_espn_live.py (32 live tests) gated by SDV_PY_LIVE_TESTS=1 for live integration verification.
- Captured fixtures live under tests/fixtures/espn/ (12 captures — the original 7 plus summary captures for NBA / MLB / NFL / NHL / WNBA), tests/fixtures/mlb_api/ (8 captures: schedule, teams, roster, standings, person_stats, venues, sports, divisions), and tests/fixtures/nhl_edge/ (7 captures), each with a README documenting provenance.
- Parametrized cross-league parity tests in test_espn_universal_parsers.py exercise the summary dispatcher against all 5 captured leagues and assert the full 20-section dispatch contract for each (boxscore_player + boxscore_team + plays + winprobability + leaders + 13 metadata sections + 2 football-only).
Documentation​
- New documentation pages:
  - docs/architecture/espn-cross-league.md — the factory + shim architecture.
  - docs/parsers/index.md — the parser layer + ENDPOINT_PARSERS.
  - docs/mlb/index.md — MLB module overview (ESPN + Stats API + Statcast).
  - docs/nhl/edge.md, edge-parsers.md, stats-rest.md, records.md — the new NHL surface.
0.0.50 Release: May 7, 2026​
This release is a big one. The headline items:
- A near-drop-in nflreadpy-parity surface inside sportsdataverse.nfl: six new loaders, two unified per-type loaders, a caching layer, runtime config, three static datasets, 25 load_* aliases, and current-season / current-week helpers.
- 11 new ESPN scrape modules across wbb and wnba (team rosters, season player & team stats, standings, draft, event officials), each with full @overload typing.
- A new cfb_play_participants module and a corresponding ~340-line collapse inside cfb_pbp.__add_player_cols.
- The long-running 0.36-live → main polars-1.x reconciliation across all seven *_pbp.py modules (~165 API translation sites).
- Packaging fully modernized to PEP 621 pyproject.toml (no more setup.py), conda-installable via the new recipe/meta.yaml.
- Lint chain re-baselined on Ruff (replacing black + isort + pycln + flake8) plus a richer pre-commit set.
- Runnable Example: sections on ~190 public callables and seven new intro / intermediate Jupyter notebooks under examples/notebooks/.
- Sphinx docs build is clean under sphinx-build -W.
Round bump to 0.0.50 (rather than 0.0.41) to signal scope; we are still alpha.
Packaging modernization​
- Migrated all packaging metadata from setup.py to PEP 621 [project] in pyproject.toml. setup.py is removed; python -m build is the only supported build path.
- License switched from classifier (License :: OSI Approved :: MIT License) to SPDX expression (license = "MIT" + license-files = ["LICENSE"]) for Metadata 2.4 compliance.
- Python target widened to 3.9–3.14 (3.6/3.7/3.8 dropped). Dependency lower bounds modernized (polars>=1.0,<2.0, pyarrow>=14.0, numpy>=1.23, pandas>=2.0, etc.).
- [tool.setuptools.packages.find] excludes tests*, Sphinx-docs*, docs*, examples*, archive*, recipe*, dev* from the wheel.
- [tool.setuptools.package-data] retains the cfb/models/* + nfl/models/* shipping list.
- MANIFEST.in trimmed to current-relevance patterns.
- .gitignore extended to ignore dev/, dist_check/, and the Sphinx _build/ + _static/ artifacts; tracked Sphinx-docs/_build/ files were untracked.
Conda installability​
- New recipe/meta.yaml: noarch: python conda-build recipe that mirrors [project.dependencies] and consumes pyproject.toml directly. Two source modes documented — local path: .. for dev, PyPI url: + sha256: for conda-forge submission.
- New recipe/README.md: walks through the local conda build recipe/ workflow and the conda-forge staged-recipes submission flow.
- New .github/workflows/conda-build.yml: verifies the recipe on every PR that touches recipe/ or pyproject.toml, plus on every release. Uses conda-incubator/setup-miniconda@v3 + miniforge / mamba; builds, installs the resulting .conda, smoke-imports all seven sport subpackages, uploads the built package as a workflow artifact.
Linting & pre-commit modernization​
- Replaced the legacy black + isort + pycln + flake8 chain with Ruff (lint, import-sort, pyupgrade, format, unused-import removal).
- pyproject.toml [tool.ruff] pins line-length = 120, fix = true, show-fixes = true. The standalone isort hook is retained ONLY to inject from __future__ import annotations at the top of every Python file via its --add-import flag — Ruff handles all other import concerns.
- pyproject.toml [tool.ruff.lint] ignores E712 (intentional pl.col(...) == True/False for polars boolean masks), E501 / E402 (long-URL docstrings + module-level imports), F601 / F841 (legacy parser idioms). Per-file ignores cover star-imports + re-exports in __init__.py files (F401 / F403).
- New pre-commit hooks alongside Ruff:
  - pre-commit-hooks (trailing-whitespace, check-merge-conflict, check-ast, check-toml/json/xml/yaml, check-symlinks, end-of-file-fixer, requirements-txt-fixer, check-added-large-files, debug-statements). The check-yaml hook excludes recipe/meta.yaml because its Jinja2 templating isn't valid pre-substitution YAML.
  - pygrep-hooks: python-use-type-annotations, python-no-eval, python-no-log-warn, rst-backticks, rst-directive-colons, rst-inline-touching-normal, text-unicode-replacement-char, python-check-mock-methods, python-check-blanket-noqa, python-check-blanket-type-ignore.
  - add-trailing-comma, sync-pre-commit-deps.
  - check-jsonschema --check-github-workflows validates .github/workflows/*.yml against the GitHub Actions schema.
  - actionlint for workflow expressions / shell.
  - yamlfmt (config in .yamlfmt: line_ending: lf, eof_newline: true).
  - doctoc regenerates Markdown TOCs.
  - markdownlint-cli2 against .markdownlint-cli2.yaml. The config disables a handful of rules that fight legacy README / CHANGELOG content (MD013 line-length, MD030 list-marker-space, MD045 alt-text, MD051 link-fragments, MD060 table-column-style) and allows <a>, <img>, <br>, <sub>, <sup> in MD033 for the README's badge / logo HTML.
Documentation toolchain​
- Added sphinx.ext.napoleon to Sphinx-docs/conf.py with explicit Google-style settings — the new wbb / wnba / nfl / cfb modules use Google-style docstrings (Args: / Returns: / Raises:) and these were producing 22 docutils warnings on build before napoleon was wired up.
- Added a no-op visit_abbreviation shim to the markdown translator in Sphinx-docs/conf.py. Sphinx 9 emits abbreviation nodes for the keyword-only * separator in rendered function signatures, and sphinx-markdown-builder 0.6.10 has no visitor for that node type. The shim emits the inner text and skips the node, so the build is now warning-free under sphinx-build -W.
- Module docstrings in cfb_play_participants.py and nfl/utils_date.py had bullet lists immediately following a Caveats: / NFL season convention: paragraph header. Added the required blank line + asterisk markers so docutils parses them as proper RST bullet lists.
- Sphinx-docs/sportsdataverse.{cfb,mbb,nba,nfl,nhl,wbb,wnba}.rst register automodule entries for every new ESPN scrape module shipped this release.
- Sphinx-docs/setup.rst deleted (was an auto-generated apidoc page for the now-removed setup.py).
- Sphinx-docs/index.rst fixed a single-backtick toctree typo so the rst-backticks pre-commit hook passes.
Runnable docstring examples (~190 functions)​
- Every public callable across cfb, nfl, nba, nhl, mbb, wbb, wnba, dl_utils, decorators, errors, nfl/cache, nfl/config, nfl/datasets, nfl/utils_date, and the top-level package now ships a multi-block Example: section: a quick-start invocation, one or two useful parameter combinations, a one-line pipeline next-step, and a See Also: block with cross-links to companion R packages (wehoop, hoopR, cfbfastR, baseballr, fastRhockey), nflverse, nflreadpy, nba_api, and nhl-api-py where applicable.
- Examples use the napoleon literal-block format (heading + :: + 4-space indented code) so they render as proper code blocks in the markdown docs without triggering sphinx.ext.doctest. Users can copy-paste any block and run it as-is.
- Existing one-line backtick-wrapped examples (the legacy Example: <inline call> shape) were replaced (not appended) so each function has exactly one Example: section.
Example notebooks​
- Seven new Jupyter notebooks under examples/notebooks/: 01_quickstart.ipynb, 02_cfb_intro.ipynb, 03_nfl_intro.ipynb, 04_nba_intro.ipynb, 05_wbb_wnba_intro.ipynb, 06_mbb_intro.ipynb, 07_nhl_intro.ipynb. Intro / intermediate level — schedule, pbp, team / player / season-stats endpoints, the nfl.update_config / clear_cache / get_current_* runtime surface, and a small pipeline example per sport. Outputs cleared so the user runs them locally; cross-references link to companion R packages and alternative Python libraries.
- .gitignore keeps *.ipynb ignored at the repo level (so scratch + checkpoint notebooks aren't accidentally tracked) but adds a negative pattern !examples/notebooks/*.ipynb so the curated tutorial notebooks are explicitly tracked.
Contributor docs and templates​
- New CLAUDE.md and .github/copilot-instructions.md capture the project conventions for AI-assisted development: branching, conventional commit messages, polars 1.x rules, HTTP layer, module patterns, NFL nflreadpy-parity surface, CFB cfb_play_participants, test conventions, packaging, Sphinx toolchain, the docstring conventions for new functions, common pitfalls.
- New CONTRIBUTING.md: canonical onboarding doc covering uv workflow, conda fallback, Python target 3.9–3.14, code standards (ruff, mypy), polars 1.x rules, test gating with skip_if_no_live, new-module spec.
- New .github/PULL_REQUEST_TEMPLATE.md and .github/ISSUE_TEMPLATE/ (config.yml, bug_report.yml, feature_request.yml, data_quality.yml). The PR template includes an "I have NOT included AI agents (Claude / Copilot / Cursor / GPT / Gemini) as commit co-authors" checkbox enforcing project policy.
NFL — nflreadpy parity​
- Six new loaders: load_nfl_team_stats, load_nfl_ftn_charting, load_nfl_trades, load_nfl_ff_playerids, load_nfl_ff_rankings, load_nfl_ff_opportunity.
- Two new utility helpers in nfl/utils_date.py: get_current_nfl_season(), get_current_nfl_week().
- Unified load_nfl_nextgen_stats(stat_type=...) consolidating the per-type variants. The per-type functions are kept as aliases that emit DeprecationWarning and forward to the unified entry point.
- Unified load_nfl_pfr_advstats(stat_type=, summary_level=) consolidating eight per-type / per-summary functions, with the same deprecation alias pattern.
- 25 nflreadpy-parity aliases inside sportsdataverse.nfl (load_pbp ↔ load_nfl_pbp, etc.). Identity-equivalent — no perf overhead, just a friendlier import surface for nflreadpy users.
- kind= parameter added to load_nfl_ff_rankings as the preferred name; type= retained for nflreadpy parity.
NFL — caching and configuration​
- New caching layer in sportsdataverse.nfl.cache with both memory and filesystem backends and TTL support.
- clear_cache() for explicit invalidation.
- New NflConfig plus update_config() / get_config() / reset_config(), with env-var initialization: SDV_PY_NFL_CACHE, SDV_PY_NFL_CACHE_DIR, SDV_PY_NFL_CACHE_DURATION, SDV_PY_NFL_VERBOSE, SDV_PY_NFL_TIMEOUT, SDV_PY_NFL_USER_AGENT.
- All 23 canonical loaders plus the 11 deprecated aliases are decorated with @cached_loader.
- return_as_pandas=True round-trips correctly through the cache: a single polars frame is stored, and conversion happens on read.
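The store-once, convert-on-read behavior can be sketched with an in-memory TTL cache. This is not the sportsdataverse.nfl.cache implementation: cached_loader_sketch, its to_pandas and clock parameters, and load_demo are hypothetical, and a tuple and a list stand in for the polars and pandas frames.

```python
import functools
import time


def cached_loader_sketch(ttl_seconds, to_pandas, clock=time.monotonic):
    """Memoize a loader for ttl_seconds; convert only when reading."""

    def decorator(loader):
        store = {}  # key -> (stored_at, frame)

        @functools.wraps(loader)
        def wrapper(*args, return_as_pandas=False, **kwargs):
            # return_as_pandas is deliberately excluded from the key, so both
            # output types share one stored frame.
            key = (args, tuple(sorted(kwargs.items())))
            now = clock()
            hit = store.get(key)
            if hit is None or now - hit[0] >= ttl_seconds:
                hit = (now, loader(*args, **kwargs))
                store[key] = hit
            frame = hit[1]
            return to_pandas(frame) if return_as_pandas else frame

        wrapper.clear_cache = store.clear
        return wrapper

    return decorator


calls = []
ticks = iter([0.0, 1.0, 100.0])  # fake clock readings for the three calls


@cached_loader_sketch(ttl_seconds=60, to_pandas=list, clock=lambda: next(ticks))
def load_demo(season):
    calls.append(season)
    return (("season", season),)


first = load_demo(2024)                          # miss: loader runs
second = load_demo(2024, return_as_pandas=True)  # hit: converted on read
third = load_demo(2024)                          # TTL expired: loader runs again
```

Keeping the output flag out of the cache key is the design choice that avoids storing two copies of the same data.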
NFL — static datasets​
- team_abbr_mapping (143 entries, relocations folded into the modern abbreviation: OAK -> LV, SD -> LAC, STL -> LA).
- team_abbr_mapping_norelocate (143 entries, history preserved).
- player_name_mapping (136 entries, common-variant → canonical).
- All three are eagerly loaded at import time and inline-bundled in the package — no separate JSON files to ship.
NFL — pickcenter / odds modern path​
- __helper__espn_nfl_odds_information__ now hits the modern sports.core.api.espn.com/v2/.../events/{gid}/competitions/{gid}/odds endpoint when the legacy summary?event= pickcenter array is empty (true for all 2024+ games).
- Cascades to defaults (2.5, 55.5, True, False) only if both modern and legacy paths fail.
- For example, the 2024 CFP semifinal previously returned (2.5, 55.5, True, False) and now correctly returns (-3.5, 67.5, True, True).
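The lookup order described above (legacy pickcenter first, modern odds endpoint when that is empty, defaults only when both fail) can be sketched as a small cascade. resolve_odds_sketch and the two lookup callables are hypothetical, and the odds tuple is treated as opaque because its field names are not given here.

```python
DEFAULTS = (2.5, 55.5, True, False)  # fallback tuple quoted above


def resolve_odds_sketch(game_id, legacy_lookup, modern_lookup):
    """Try the legacy source, then the modern one, then fall back to DEFAULTS."""
    for lookup in (legacy_lookup, modern_lookup):
        try:
            odds = lookup(game_id)
        except Exception:
            # In this sketch a failed request is treated like an empty result.
            odds = None
        if odds:
            return odds
    return DEFAULTS


def empty_legacy(game_id):
    return None  # simulates an empty pickcenter array


def modern_ok(game_id):
    return (-3.5, 67.5, True, True)


def broken(game_id):
    raise ConnectionError("endpoint unavailable")


from_modern = resolve_odds_sketch("demo-game", empty_legacy, modern_ok)
from_defaults = resolve_odds_sketch("demo-game", empty_legacy, broken)
```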
NFL — load_nfl_schedule parquet port​
- Switched from the stale nflverse-pbp/master/schedules/sched_{season}.rds (which was 404'ing on every season) to the modern nflverse-data/releases/download/schedules/games.parquet. One combined file, 1999–2025, 7,276 rows × 46 cols.
WBB / WNBA — new ESPN scrape modules​
Eleven new modules across sportsdataverse.wbb and sportsdataverse.wnba, plus their __init__.py re-exports and live-gated smoke tests. The WNBA modules (other than wnba_draft) are thin shims onto a shared _espn_basketball_* helper that lives in the corresponding wbb_*.py file (league slug fixed to "wnba"), keeping the wbb/wnba pair DRY.
- wbb_team_roster / wnba_team_roster: per (team_id, season) roster, flattened to one row per athlete. Snake-case columns; stable schema on empty rosters.
- wbb_player_stats / wnba_player_stats: per (athlete_id, season) stats. Multi-table dict with canonical keys Averages / Totals / Misc (always present, empty-frame fallback) plus an Other bucket only added when ESPN ships a non-canonical category.
- wbb_team_stats / wnba_team_stats: per (team_id, season) stats. Same multi-table shape as player stats; ESPN ships these as General / Offensive / Defensive categories that map onto the canonical Averages / Totals / Misc keys. Endpoint corrected to site.web.api.espn.com/.../teams/{id}/statistics?season=... (the common/v3 path the original spec named 404s).
- wbb_standings / wnba_standings: one-row-per-team season standings. WBB defaults to group=50 (Division I women); WNBA has no group filter.
- wnba_draft: one-row-per-pick draft history. Modern endpoint at site.web.api.espn.com/apis/site/v2/sports/basketball/wnba/draft (the site/v3 variant 404s).
- wbb_event_officials / wnba_event_officials: one-row-per-official game-level officials list.
- All eleven ship with full @overload typing (mypy-strict), polars 1.x APIs, and snake_case columns via dl_utils.underscore.
CFB — cfb_play_participants and __add_player_cols collapse​
- New cfb_play_participants module hits the ESPN events/{gid}/competitions/{gid}/plays participants endpoint, with $ref resolution (default-on, resolve_missing=True) for athletes missing from the sidecar.
- cfb_pbp.__add_player_cols shrunk from 471 lines of regex extraction to ~130 lines that delegate to the participants module.
- All 19 legacy _player_name columns preserved via an alias mapping.
- Hybrid scalar + list-column output: {type}_player_name plus {type}_player_names, so multi-entry types like split sacks aren't silently collapsed to a single name.
- Targeted regex fallbacks retained as a tertiary safety net for sack_player_name2, fg_block_player_name, punt_block_player_name, and interception_player_name — ESPN's sidecar has documented gaps for those.
CFB — pandas → polars 1.x bug-fix reconciliation (0.36-live → main)​
- Foundation: new cleaned_text column normalizes ESPN play descriptions and is the single source of truth for downstream feature extraction.
- Behavioral: kneel-down semantics flag plus scrimmage_play exclusion.
- Yardage: structural rewrite of __add_yardage_cols (~150-line np.select chain → pl.when().then() chain), pass-yards regex tightened from (?<=for) to (?<=[\s,]for), full punt rewrite, fair-catch fix.
- Helper-features: end-state edge cases, NCG 2025 GW play hardcode, lead_half end-of-half fix, OOB punts block, FG classification correction, end.TimeSecsRem shift direction flipped from lag to lead — which is what WPA inputs expected all along.
- WPA: __process_wpa end-of-game branch rewrite plus onside-kick rewrite, plus penalty_assessed_on_kickoff plumbing across __setup_penalty_data + __process_epa + __process_wpa.
- Player names: extraction migrated to cleaned_text everywhere.
Infrastructure and tooling​
- Polars 1.x migration across cfb/cfb_pbp.py, nfl/nfl_pbp.py, mbb/mbb_pbp.py, nba/nba_pbp.py, nhl/nhl_pbp.py, wbb/wbb_pbp.py, wnba/wnba_pbp.py. Roughly 165 API translation sites: groupby → group_by, with_row_count → with_row_index, apply → map_elements (with explicit return_dtype), struct list-arg → varargs, shift_and_fill → shift, cumsum → cum_sum, str.strip → str.strip_chars, str.n_chars → str.len_chars, outer-join → full + coalesce, write_json kwargs.
- Polars 1.x is_in same-datatype deprecation: switched to .implode() for the global-containment idiom.
- pkg_resources.resource_filename → importlib.resources.files() in cfb_pbp.py and nfl_pbp.py via small _cfb_resource_filename / _nfl_resource_filename helpers. Setuptools 81+ removed pkg_resources, which made the legacy import emit a UserWarning at module load and (eventually) break entirely.
- download() retry rewrite: iterative loop instead of recursion, defensive response = None init, re-raises the last captured exception when the retry budget is exhausted.
- psutil made optional in decorators.py (lazy import, previously an undeclared transitive dep that broke autodoc).
- pytest.ini filterwarnings for the transitive sphinxcontrib-jsmath legacy nspkg.pth UserWarning and the pkg_resources API DeprecationWarning surfacing from setuptools 81+.
- New tests under tests/wbb/, tests/wnba/, tests/conftest.py (with the @skip_if_no_live decorator gated by SDV_PY_LIVE_TESTS=1), and tests/README.md capturing the test conventions. NFL test files renamed to drop legacy-phase-jargon filenames in favor of descriptive names (test_nfl_loaders_parity_loaders.py, _unified.py, _aliases.py).
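The retry shape described for download() (iterative loop, response initialized to None, last exception re-raised once the budget is spent) can be sketched as below. download_sketch and its fetch and backoff parameters are hypothetical; the real function's signature and its handling of HTTP status codes are not shown here.

```python
import time


def download_sketch(fetch, url, num_retries=3, backoff=0.0):
    """Call fetch(url) up to num_retries + 1 times, then re-raise the last error."""
    response = None  # defensive init so the name is always bound
    last_exc = None
    for attempt in range(num_retries + 1):
        try:
            response = fetch(url)
            return response
        except Exception as exc:  # a real client would catch narrower errors
            last_exc = exc
            if attempt < num_retries:
                time.sleep(backoff)
    raise last_exc


attempts = []


def flaky_fetch(url):
    """Fail twice, then succeed (simulates a transient outage)."""
    attempts.append(url)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "payload"


result = download_sketch(flaky_fetch, "https://example.invalid/data")
```

A loop keeps the stack depth constant regardless of the retry budget, which recursion does not.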
Bug fixes​
- `test_havoc_rate` corrected for both `cfb` and `nfl`: `def_int` field name fix, bounded `<=` assertion, `def_box.sort()` for deterministic group_by emit order, and `turnover_box` now produces a CLI warning instead of silently padding an empty dict.
- `yds_punted` duplicate definition removed.
- `drive.id` NCG 2025 GW play hardcode.
- `is_in(col)` → `is_in(col.implode())` for global containment, applied across `cfb_pbp` and `nfl_pbp`.
- Pickcenter regression test added for both CFB and NFL: a 2024+ game must NOT silently fall back to the `(2.5, 55.5, True, False)` defaults; a pre-2024 game with populated legacy `pickcenter` must continue to use that legacy path.
- NFL `__helper_nfl_pbp_features` defensive cast for the case where ESPN returns `overUnder` as a Python float (no `.astype()`); same shape fix as the cfb_pbp version.
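A defensive cast of the kind mentioned for `overUnder` might look like the sketch below. The helper name `coerce_over_under` is hypothetical, and reading `55.5` from the defaults tuple as the over/under slot is an assumption. The idea is that plain `float()` works for Python floats, numeric strings, and NumPy scalars alike, so nothing depends on the value having an `.astype()` method.

```python
# Assumption: 55.5 in the (2.5, 55.5, True, False) defaults is the over/under.
DEFAULT_OVER_UNDER = 55.5


def coerce_over_under(value, default=DEFAULT_OVER_UNDER):
    """Return `value` as a float, or `default` when it cannot be converted.

    Hypothetical helper, not the code in `nfl_pbp.py`.
    """
    if value is None:
        return default
    try:
        # float() accepts Python floats, ints, numeric strings, and NumPy
        # scalars, so no .astype() call is required.
        return float(value)
    except (TypeError, ValueError):
        return default


print(coerce_over_under(47.5))    # 47.5
print(coerce_over_under("47.5"))  # 47.5
print(coerce_over_under(None))    # 55.5
```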
Deprecations​
- Four NFL loader families now consolidate per-type variants into a single unified function: `load_nfl_nextgen_stats(stat_type=...)` and `load_nfl_pfr_advstats(stat_type=, summary_level=)`. The per-type names continue to work but emit a `DeprecationWarning` pointing at the unified function. No removal yet.
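One common way to keep per-type names working while steering callers to a unified function is a thin alias that warns and then delegates. The sketch below illustrates that pattern only. `load_nfl_nextgen_stats` here is a local stand-in that returns a dict, and both `make_deprecated_alias` and the per-type name `load_nfl_ngs_passing` are hypothetical, not names confirmed by the changelog.

```python
import warnings


def load_nfl_nextgen_stats(stat_type="passing"):
    """Local stand-in for the unified loader; returns a dict, not real data."""
    return {"stat_type": stat_type}


def make_deprecated_alias(name, unified, **fixed_kwargs):
    """Build a per-type wrapper that warns, then delegates to `unified`."""
    hint = ", ".join(f"{key}={value!r}" for key, value in fixed_kwargs.items())

    def alias(*args, **kwargs):
        warnings.warn(
            f"{name}() is deprecated; call {unified.__name__}({hint}) instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not this wrapper
        )
        return unified(*args, **fixed_kwargs, **kwargs)

    alias.__name__ = name
    return alias


# Hypothetical per-type name kept alive as an alias.
load_nfl_ngs_passing = make_deprecated_alias(
    "load_nfl_ngs_passing", load_nfl_nextgen_stats, stat_type="passing"
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # DeprecationWarning is hidden by default in many contexts
    result = load_nfl_ngs_passing()

print(result)
print(caught[0].message)
```

Python often hides `DeprecationWarning` outside of test runners, so callers who want to see these messages may need `-W default` or an explicit filter.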
0.0.40 Release: December 6, 2025​
- Minor changes to mbb_calendar and wbb_calendar functions to include all games, even when top 25 teams are not competing
0.0.38-39 Release: August 28, 2023​
- Minor changes to cfb_pbp functions to improve WP calculation and player parsing.
0.0.36-37 Release: July 9, 2023​
- Switched most under-the-hood dataframe operations to use the Python `polars` library, and many functions now have a parameter `return_as_pandas`, which defaults to `False` but can be set to `True` to return a pandas dataframe instead of a polars dataframe. This is a breaking change.
- Added `**kwargs`, which pass arguments to the `dl_utils.download()` function, including `headers`, `proxy`, `timeout` (default 30s), `num_retries` (default = 15), `logger` (default = None).
- Function `espn_cfb_game_rosters()` added.
- Function `espn_nba_game_rosters()` added.
- Function `espn_nfl_game_rosters()` added.
- Function `espn_nhl_game_rosters()` added.
- Function `espn_wbb_game_rosters()` added.
- Function `espn_wnba_game_rosters()` added.
- Function `load_cfb_betting_lines()` added (only 2006 through 2019).
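The `**kwargs` pass-through noted in this release can be pictured with a stub. In the sketch below, `download` is a local stand-in that only echoes its settings (no network I/O), and `example_game_rosters` and its placeholder URL are hypothetical. The `timeout`, `num_retries`, and `logger` defaults come from the entry above. The positional `url` argument and the `None` defaults for `headers` and `proxy` are assumptions.

```python
def download(url, headers=None, proxy=None, timeout=30, num_retries=15, logger=None):
    """Stand-in for dl_utils.download(): echoes its settings, no network I/O."""
    return {
        "url": url,
        "headers": headers,
        "proxy": proxy,
        "timeout": timeout,
        "num_retries": num_retries,
        "logger": logger,
    }


def example_game_rosters(game_id, **kwargs):
    """Hypothetical wrapper showing how **kwargs reach download() untouched."""
    url = f"https://example.invalid/rosters/{game_id}"  # placeholder URL
    return download(url, **kwargs)


defaults = example_game_rosters(401)
tuned = example_game_rosters(401, timeout=5, num_retries=2)

print(defaults["timeout"], defaults["num_retries"])  # 30 15
print(tuned["timeout"], tuned["num_retries"])        # 5 2
```

Because the wrapper forwards keyword arguments untouched, new download options need no change in the wrapper itself.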
0.0.34-35 Release: May 7-9, 2023​
- Reconfigured some imports
- Improved compliance with pandas upgrades
- Updated loader locations to use sportsdataverse-data releases and nflverse releases
- Flattened the results returned by the "sportsdataverse.cfb.espn_cfb_schedule()" functions somewhat; they now also include some nested data frame and list columns
0.0.18 Release: July 25, 2022​
- Added ondays parameter to ESPN calendar functions
- Renamed "sportsdataverse.cfb.cfb_teams()" to "sportsdataverse.cfb.espn_cfb_teams()" to avoid an edge case issue when running the function.
0.0.17 Release: July 9, 2022​
- Added MLBAM API functionality to the sportsdataverse-py package. For more information on how to use these new functions, refer to the docs.
- Fixed a bug where the "sportsdataverse.nfl.load_nfl_schedule()" function would cause a 404 error when run.
- For functions where multiple files are loaded in, progress bars have been added to indicate how far along the sportsdataverse-py package is in completing its task(s).
- Renamed "sportsdataverse.cfb.cfb_teams()" to "sportsdataverse.cfb.get_cfb_teams()" to avoid an edge case issue when running the function.
0.0.15 Release: May 8, 2022​
- Refactor schedule and teams functions for all existing leagues.
- Created more robust home/away mappings to simplify assignment.
0.0.14 Release: March 16, 2022​
- Refactor schedule and teams functions for all existing leagues.
- Created more robust home/away mappings to simplify assignment.
0.0.12 Release: February 24, 2022​
- Minor refactor to all the pbp functions, attempting to normalize behavior.
- Added a raw parameter to the same functions to return the object as it comes in, without any transformation.
- Added some config file corrections.
0.0.5 Release: October 20, 2021​
- f'in round
- findin' out