Hugging Face Hub v1 Migrator
LibCST codemod for migrating huggingface_hub v0.x Python call sites toward v1.x-safe APIs.
The safety model is intentionally conservative:
- High-confidence API changes are rewritten deterministically.
- Risky semantic migrations are reported for bounded AI review instead of being rewritten blindly.
- The output is syntax-checked in tests and the codemod is idempotent.
Why this exists
huggingface_hub v1.0 replaces its requests-based internals with HTTPX and moves away from git-oriented workflows such as Repository(...) in favor of HTTP-based APIs. The official migration guide documents the changes, but large AI projects still need a repeatable source migration, review reports, and safe handling of ambiguous cases.
This project targets the Boring AI pattern: deterministic CST rules first, AI fallback only for low-confidence blocks.
Deterministic fixes
| Legacy pattern | New pattern |
|---|---|
| `use_auth_token=` | `token=` |
| `hf_hub_download(..., resume_download=...)` | argument removed |
| `hf_hub_download(..., force_filename=...)` | argument removed |
| `hf_hub_download(..., local_dir_use_symlinks=...)` | argument removed |
| `InferenceApi(...)` | `InferenceClient(...)` |
| `update_repo_visibility(...)` | `update_repo_settings(...)` |
| `HfFolder.get_token()` | `get_token()` |
| `except requests.HTTPError` around HF calls | `except HfHubHTTPError` |
| `login(write_permission=...)` | argument removed |
| `login(new_session=True/False)` | `skip_if_logged_in=False/True` |
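To make the idea of a deterministic rule concrete, here is a minimal sketch that uses only the standard library `ast` module. It is not the project's LibCST implementation: LibCST preserves formatting and comments, while `ast.unparse` normalizes them. The names `HF_CALLS`, `RENAMED`, `REMOVED`, `KwargMigrator`, and `migrate` are hypothetical, and the sketch covers only a few rows of the table above. Matching on the bare function name is a simplification; a real codemod would also resolve imports.

```python
import ast

# Hypothetical rule tables covering only a few rows of the table above.
HF_CALLS = {"hf_hub_download", "snapshot_download"}
RENAMED = {"use_auth_token": "token"}
REMOVED = {
    "hf_hub_download": {"resume_download", "force_filename", "local_dir_use_symlinks"},
}


def call_name(node: ast.Call) -> str:
    """Return the called name for `f(...)` or `obj.f(...)`, else an empty string."""
    if isinstance(node.func, ast.Name):
        return node.func.id
    if isinstance(node.func, ast.Attribute):
        return node.func.attr
    return ""


class KwargMigrator(ast.NodeTransformer):
    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)  # handle nested calls first
        name = call_name(node)
        if name not in HF_CALLS:
            return node  # unrelated APIs and same-name kwargs stay untouched
        kept = []
        for kw in node.keywords:
            if kw.arg in REMOVED.get(name, set()):
                continue  # argument removed in v1.x
            if kw.arg in RENAMED:
                kw.arg = RENAMED[kw.arg]
            kept.append(kw)
        node.keywords = kept
        return node


def migrate(source: str) -> str:
    """Apply the sketch rules and return the rewritten source."""
    tree = KwargMigrator().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


if __name__ == "__main__":
    print(migrate('hf_hub_download("r", "f", use_auth_token=tok, resume_download=True)'))
```

Because each rule either renames or drops a keyword that no longer exists after the first pass, running `migrate` on its own output changes nothing.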
Review-only findings
These are intentionally not auto-fixed:
- `Repository(...)`: requires semantic migration to `snapshot_download`, `HfApi.upload_file`, or `HfApi.upload_folder`.
- `**kwargs` passed into HF calls: may hide removed keys such as `use_auth_token`.
- `list_models(library=..., task=..., tags=...)`: removing filters can broaden results.
- `configure_http_backend(...)`: v1 uses HTTPX client factories.
- `HfFolder.save_token()` / `HfFolder.delete_token()`: requires login/logout intent detection.
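A review-only rule reports a location and leaves the source untouched. The following is a minimal sketch of that idea using the standard library `ast` module rather than the project's LibCST engine. The `Finding` dataclass, the rule identifiers, and the `HF_CALLS` set are hypothetical and do not reflect the actual report format.

```python
import ast
from dataclasses import dataclass


@dataclass
class Finding:
    line: int
    rule: str      # hypothetical rule identifier
    message: str


# Hypothetical set of Hub entry points to inspect for `**kwargs`.
HF_CALLS = {"hf_hub_download", "snapshot_download", "list_models"}


def call_name(node: ast.Call) -> str:
    """Return the called name for `f(...)` or `obj.f(...)`, else an empty string."""
    if isinstance(node.func, ast.Name):
        return node.func.id
    if isinstance(node.func, ast.Attribute):
        return node.func.attr
    return ""


class ReviewCollector(ast.NodeVisitor):
    """Collects findings; never modifies the tree."""

    def __init__(self) -> None:
        self.findings: list[Finding] = []

    def visit_Call(self, node: ast.Call) -> None:
        name = call_name(node)
        if name == "Repository":
            self.findings.append(
                Finding(node.lineno, "repository", "needs intent-based migration")
            )
        elif name in HF_CALLS and any(kw.arg is None for kw in node.keywords):
            # A keyword with arg=None is a `**mapping` splat.
            self.findings.append(
                Finding(node.lineno, "kwargs-splat", "may hide removed keys")
            )
        self.generic_visit(node)


def collect_findings(source: str) -> list[Finding]:
    collector = ReviewCollector()
    collector.visit(ast.parse(source))
    return collector.findings


if __name__ == "__main__":
    sample = "repo = Repository('x')\nhf_hub_download('r', 'f', **opts)\n"
    for finding in collect_findings(sample):
        print(finding)
```

Keeping detection in a visitor that cannot rewrite anything makes it hard for a risky case to be changed by accident.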
Usage
bash
AI fallback
AI is used only for high-risk findings. It does not overwrite files. The suggestion is stored in the JSON report as ai_suggestion.
Currently AI fallback is triggered for cases such as:
- `Repository(...)`, because it may need `snapshot_download`, `HfApi.upload_file`, or `HfApi.upload_folder` depending on intent.
- `**kwargs` passed into Hugging Face Hub calls, because deprecated keys can be hidden inside a dict.
- `list_models(...)` filters that changed behavior in v1.x.
- custom HTTP backend configuration and non-trivial `HfFolder` token flows.
Put your URL and API key in a local .env file. Do not commit it:
bash
dotenv
Then run:
bash
If your provider gives the full chat-completions URL, this also works:
dotenv
For local development:
bash
Example
Before:
python
After:
python
The remaining HfFolder import is left intact unless a dedicated unused-import cleanup pass is run. That is deliberate: this codemod avoids deleting imports unless it can prove the import is fully unused.
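One conservative way to decide whether an import is still needed is to look for any remaining reference to the name. The sketch below uses the standard library `ast` module; the function name `is_name_referenced` and the sample snippets are hypothetical, and this is not the project's cleanup pass. It does not see dynamic access (for example `getattr` or `globals()`) or re-exports through `__all__`, so a real pass should keep the import whenever those are possible.

```python
import ast


def is_name_referenced(source: str, name: str) -> bool:
    """Return True if `name` appears as an identifier outside import statements.

    Import statements store their names as `ast.alias` nodes, not `ast.Name`,
    so the import itself does not count as a reference.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == name:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical migrated file: get_token() was rewritten, save_token() was not.
    still_used = (
        "from huggingface_hub import HfFolder, get_token\n"
        "token = get_token()\n"
        "HfFolder.save_token(token)\n"
    )
    print(is_name_referenced(still_used, "HfFolder"))
```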
Report shape
json
Test strategy
- Golden fixtures compare the transformed `before.py` to `after.py` exactly.
- Negative fixtures assert unrelated APIs and same-name kwargs are untouched.
- Every golden output is compiled with `compile(..., mode="exec")`.
- Idempotency is enforced by running the transform twice.
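The golden, syntax, and idempotency checks above can be combined into one helper. This is a minimal sketch, not the project's test suite: `toy_transform` is a hypothetical stand-in for the real codemod, and `check_output` is an assumed helper name.

```python
from typing import Callable


def toy_transform(source: str) -> str:
    """Hypothetical stand-in for the codemod: renames one keyword argument."""
    return source.replace("use_auth_token=", "token=")


def check_output(transform: Callable[[str], str], before: str, expected: str) -> None:
    """Run the three checks from the test strategy on a single fixture."""
    after = transform(before)
    # Golden check: the output must match the expected file exactly.
    assert after == expected, "golden mismatch"
    # Syntax check: compile() raises SyntaxError on invalid output; nothing is executed.
    compile(after, "<after.py>", mode="exec")
    # Idempotency check: a second pass must be a no-op.
    assert transform(after) == after, "transform is not idempotent"


if __name__ == "__main__":
    check_output(toy_transform, "f(use_auth_token=t)\n", "f(token=t)\n")
    print("fixture passed")
```

Compiling rather than executing the output keeps the check safe for fixtures that import packages not installed in the test environment.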
Validation snapshot
Local fixture suite:
text
AI fallback smoke test:
text
Real project dry-run: datasets==2.14.0 source distribution from PyPI.
text
Representative diff:
diff
Codemod workflow
The root workflow.yaml wraps the Python engine so the same migration can run as a Codemod workflow package.