GGalaxy0505

hf-hub-v1-migrator

Migrate huggingface_hub v0.x Python call sites to v1.x-safe APIs with deterministic LibCST rules and AI review for hard cases.

Tags: upgrade, breaking-change, huggingface, python

Run locally

npx codemod hf-hub-v1-migrator

Hugging Face Hub v1 Migrator

LibCST codemod for migrating huggingface_hub v0.x Python call sites toward v1.x-safe APIs.

The safety model is intentionally conservative:

  • High-confidence API changes are rewritten deterministically.
  • Risky semantic migrations are reported for bounded AI review instead of being rewritten blindly.
  • The output is syntax-checked in tests and the codemod is idempotent.

Why this exists

huggingface_hub v1.0 moves its HTTP layer from requests to HTTPX and drops git-oriented workflows such as Repository. The official migration guide documents the changes, but large AI projects still need repeatable source migration, review reports, and safe handling of ambiguous cases.

This project targets the Boring AI pattern: deterministic CST rules first, AI fallback only for low-confidence blocks.

Deterministic fixes

| Legacy pattern | New pattern |
| --- | --- |
| use_auth_token= | token= |
| hf_hub_download(..., resume_download=...) | argument removed |
| hf_hub_download(..., force_filename=...) | argument removed |
| hf_hub_download(..., local_dir_use_symlinks=...) | argument removed |
| InferenceApi(...) | InferenceClient(...) |
| update_repo_visibility(...) | update_repo_settings(...) |
| HfFolder.get_token() | get_token() |
| except requests.HTTPError around HF calls | except HfHubHTTPError |
| login(write_permission=...) | argument removed |
| login(new_session=True/False) | skip_if_logged_in=False/True |
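
To make two of these rules concrete, here is a minimal sketch that renames use_auth_token= to token= and drops the removed hf_hub_download arguments. It deliberately uses the standard-library ast module (Python 3.9+ for ast.unparse) so it runs without dependencies; the project itself uses LibCST, which preserves comments and formatting where ast.unparse does not. The HF_CALLS allow-list and all function names below are illustrative assumptions, not the codemod's actual rule tables.

```python
import ast

# Hypothetical allow-list: only calls to these names are touched, so an
# unrelated function that happens to accept use_auth_token= is left alone.
HF_CALLS = {"hf_hub_download", "snapshot_download"}
REMOVED_DOWNLOAD_ARGS = {"resume_download", "force_filename", "local_dir_use_symlinks"}


def _call_name(node: ast.Call) -> str:
    """Return the called name for f(...) or obj.f(...), else an empty string."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


class HubKwargRules(ast.NodeTransformer):
    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)  # rewrite nested calls first
        name = _call_name(node)
        if name not in HF_CALLS:
            return node
        kept = []
        for kw in node.keywords:
            if name == "hf_hub_download" and kw.arg in REMOVED_DOWNLOAD_ARGS:
                continue  # "argument removed" rows of the table
            if kw.arg == "use_auth_token":
                kw.arg = "token"
            kept.append(kw)
        node.keywords = kept
        return node


def migrate(source: str) -> str:
    """Apply the two sketch rules and return the regenerated source."""
    tree = HubKwargRules().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


before = (
    "hf_hub_download(repo, name, use_auth_token=tok, resume_download=True)\n"
    "other(use_auth_token=tok)"
)
print(migrate(before))
```

The second call is left unchanged because other is not in the allow-list, which mirrors the "same-name kwargs are untouched" guarantee in the test strategy.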

Review-only findings

These are intentionally not auto-fixed:

  • Repository(...): requires semantic migration to snapshot_download, HfApi.upload_file, or HfApi.upload_folder.
  • **kwargs passed into HF calls: may hide removed keys such as use_auth_token.
  • list_models(library=..., task=..., tags=...): these filters changed in v1.x, and simply dropping them would silently broaden the results.
  • configure_http_backend(...): v1 uses HTTPX client factories (such as set_client_factory), so a requests-based backend cannot be carried over mechanically.
  • HfFolder.save_token() / HfFolder.delete_token(): requires login/logout intent detection.
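
As a rough illustration of how two of these findings could be detected without rewriting anything, the sketch below walks a module with the standard-library ast module and records Repository(...) calls and **kwargs splats into Hugging Face calls. The rule identifiers, the HF_CALLS set, and the finding fields are hypothetical; the project's LibCST-based detection may look quite different.

```python
import ast

# Hypothetical set of Hugging Face entry points to inspect.
HF_CALLS = {"hf_hub_download", "snapshot_download", "list_models", "login"}


def review_findings(source: str) -> list[dict]:
    """Report risky call sites; the source itself is never modified."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", "")
        if name == "Repository":
            findings.append({"line": node.lineno, "rule": "repository-class"})
        elif name in HF_CALLS and any(kw.arg is None for kw in node.keywords):
            # In the ast module, a **mapping splat is a keyword whose arg is None.
            findings.append({"line": node.lineno, "rule": "kwargs-splat"})
    return sorted(findings, key=lambda f: f["line"])
```

A splat into an unrelated call such as print(**opts) is ignored, since only names in HF_CALLS are inspected.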

Usage

bash

AI fallback

AI is used only for high-risk findings. It does not overwrite files. The suggestion is stored in the JSON report as ai_suggestion.

Currently, AI fallback is triggered for cases such as:

  • Repository(...), because it may need snapshot_download, HfApi.upload_file, or HfApi.upload_folder depending on intent.
  • **kwargs passed into Hugging Face Hub calls, because deprecated keys can be hidden inside a dict.
  • list_models(...) filters that changed behavior in v1.x.
  • custom HTTP backend configuration and non-trivial HfFolder token flows.
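
The contract described above (AI only for high-risk findings, no file writes, suggestion stored under ai_suggestion) can be sketched as follows. Only the ai_suggestion key comes from this README; the rule identifiers, the other finding fields, and the suggest callback are assumptions made for illustration.

```python
import json
from typing import Callable

# Hypothetical rule identifiers for the high-risk categories listed above.
HIGH_RISK_RULES = {
    "repository-class",
    "kwargs-splat",
    "list-models-filters",
    "http-backend",
    "hffolder-token-flow",
}


def add_ai_suggestions(findings: list[dict], suggest: Callable[[dict], str]) -> list[dict]:
    """Return copies of the findings, adding ai_suggestion to high-risk ones only.

    Nothing is written to disk here: the suggestion lives in the report data.
    """
    enriched = []
    for finding in findings:
        item = dict(finding)  # copy so the caller's data is left untouched
        if item.get("rule") in HIGH_RISK_RULES:
            item["ai_suggestion"] = suggest(item)
        enriched.append(item)
    return enriched


# Usage with a stub in place of a real model call.
report = add_ai_suggestions(
    [{"line": 3, "rule": "repository-class"}, {"line": 9, "rule": "use-auth-token"}],
    suggest=lambda f: f"review line {f['line']} manually",
)
print(json.dumps(report, indent=2))
```

Injecting suggest as a callback keeps the routing logic testable without network access.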

Put your URL and API key in a local .env file. Do not commit it:

bash
dotenv

Then run:

bash

If your provider gives the full chat-completions URL, this also works:

dotenv
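
One plausible way to accept both forms of the URL is to append the chat-completions path only when it is missing. This is a sketch of that idea, not the tool's confirmed behavior; the function name and the /chat/completions suffix (the common OpenAI-compatible convention) are assumptions.

```python
def resolve_chat_url(url: str) -> str:
    """Accept either a base URL or a full chat-completions URL."""
    suffix = "/chat/completions"
    trimmed = url.rstrip("/")
    return trimmed if trimmed.endswith(suffix) else trimmed + suffix


print(resolve_chat_url("https://api.example.com/v1"))
print(resolve_chat_url("https://api.example.com/v1/chat/completions"))
```

Both calls resolve to the same endpoint, so either style of .env entry would behave identically under this scheme.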

For local development:

bash

Example

Before:

python

After:

python

The remaining HfFolder import is left intact unless a dedicated unused-import cleanup pass is run. That is deliberate: this codemod avoids deleting imports unless it can prove the import is fully unused.
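
A conservative "is this import still needed?" check might look like the sketch below. It uses the standard-library ast module and treats any remaining reference, including a string mention such as an __all__ entry, as usage. The function name is hypothetical and the check is an illustration of the principle, not the codemod's actual cleanup pass.

```python
import ast


def is_name_unused(source: str, name: str) -> bool:
    """True if no expression or string constant in the module refers to name.

    Import statements store names as ast.alias nodes, not ast.Name,
    so the import line itself does not count as a reference.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == name:
            return False
        # Stay conservative: a string mention (for example in __all__) counts as usage.
        if isinstance(node, ast.Constant) and node.value == name:
            return False
    return True


migrated = "from huggingface_hub import HfFolder, get_token\ntok = get_token()\n"
print(is_name_unused(migrated, "HfFolder"))
print(is_name_unused(migrated + "HfFolder.save_token(tok)\n", "HfFolder"))
```

Only when such a check returns True would removing the import be provably safe; anything less certain is better left to a dedicated cleanup tool.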

Report shape

json

Test strategy

  • Golden fixtures run the transform on before.py and require the result to match after.py exactly.
  • Negative fixtures assert unrelated APIs and same-name kwargs are untouched.
  • Every golden output is compiled with compile(..., mode="exec").
  • Idempotency is enforced by running the transform twice.
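
The golden, syntax, and idempotency checks can be combined into a single helper, sketched here under stated assumptions. toy_transform is a regex stand-in used only to keep the example self-contained; a regex would also rewrite strings and comments, which is exactly why the real codemod works on a syntax tree. The helper name and messages are hypothetical.

```python
import re


def toy_transform(source: str) -> str:
    """Stand-in for the real codemod: rename one keyword argument."""
    return re.sub(r"\buse_auth_token=", "token=", source)


def check_fixture(before: str, expected_after: str, transform=toy_transform) -> None:
    actual = transform(before)
    # Golden check: exact string equality with the expected output.
    assert actual == expected_after, "golden mismatch"
    # Syntax check: compile() parses the code but does not execute it.
    compile(actual, "<after.py>", "exec")
    # Idempotency: a second run over the output must change nothing.
    assert transform(actual) == actual, "transform is not idempotent"


check_fixture("f(use_auth_token=t)\n", "f(token=t)\n")
```

Because compile() only parses, fixtures can reference huggingface_hub without the package being installed in the test environment.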

Validation snapshot

Local fixture suite:

text

AI fallback smoke test:

text

Real project dry-run: datasets==2.14.0 source distribution from PyPI.

text

Representative diff:

diff

Codemod workflow

The root workflow.yaml wraps the Python engine so the same migration can run as a Codemod workflow package.
