csharp-service-call-mining
Read-only mining codemod that maps direct HTTP service-to-service calls in a
C# codebase. It emits one metric row per call site so a service-architecture
cleanup starts from code-level truth — caller class → callee, with call-site
count, transport details, URL provenance, and per-file auth presence — instead
of a high-level diagram somebody drew last quarter.
What it detects
HttpClient-shaped calls of the form <receiver>.<Method>Async(...):
- GetAsync, PostAsync, PutAsync, DeleteAsync, PatchAsync, SendAsync
- GetStringAsync, GetByteArrayAsync, GetStreamAsync
- GetFromJsonAsync<T>, PostAsJsonAsync, PutAsJsonAsync, PatchAsJsonAsync
The match is structural (kind-based on invocation_expression →
member_access_expression → method name), so the receiver does not have
to be named httpClient. Real callers like _httpClient, this._http, or
_apiClient are all captured; the receiver text is reported on the row.
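To illustrate why the receiver name does not matter, here is a minimal Python sketch of the name-based part of the match. It is not the codemod's implementation: the real match runs on ast-grep nodes, while this sketch assumes the receiver and method text have already been extracted. The names `TRACKED_METHODS` and `match_call` are hypothetical, mapping SendAsync to ANY is an assumption, and so is reporting the method name without its generic arguments.

```python
from typing import Optional

# Verb per tracked method name. SendAsync -> "ANY" is an assumption: the verb
# lives on the HttpRequestMessage, not in the method name.
TRACKED_METHODS = {
    "GetAsync": "GET",
    "PostAsync": "POST",
    "PutAsync": "PUT",
    "DeleteAsync": "DELETE",
    "PatchAsync": "PATCH",
    "SendAsync": "ANY",
    "GetStringAsync": "GET",
    "GetByteArrayAsync": "GET",
    "GetStreamAsync": "GET",
    "GetFromJsonAsync": "GET",
    "PostAsJsonAsync": "POST",
    "PutAsJsonAsync": "PUT",
    "PatchAsJsonAsync": "PATCH",
}


def match_call(receiver: str, method: str) -> Optional[dict]:
    """Return a partial row for a tracked call, or None for anything else."""
    bare_name = method.split("<", 1)[0]  # drop generic arguments such as <Customer>
    verb = TRACKED_METHODS.get(bare_name)
    if verb is None:
        return None
    # The receiver is reported on the row, never used as a filter.
    return {"receiver": receiver, "method": bare_name, "httpMethod": verb}
```

With this sketch, `match_call("_apiClient", "PostAsJsonAsync")` yields a row while `match_call("_logger", "LogAsync")` yields `None`, regardless of how either receiver is named.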
What it ignores
- Generated files (*.Designer.cs, *.g.cs, *.g.i.cs)
- bin/, obj/, Generated/ directories
- SendAsync(HttpRequestMessage) is recorded but the URL gets
  urlKind=request-object since the URL lives inside the request object, not
  the call site
The codemod is strictly read-only: it returns null and never edits
source. It is safe to run repeatedly during a cleanup to track progress.
Output
For every match, the codemod emits a csharp-service-http-calls metric row
with these fields:
| field | meaning |
|---|---|
| sourceService | Enclosing C# type name (class / record / struct / interface) |
| sourceNamespace | Enclosing namespace (block-scoped or file-scoped) |
| receiver | Text of the call's receiver expression (e.g. _httpClient, this._http) |
| method | The exact method name (GetAsync, PostAsJsonAsync, …) |
| httpMethod | Derived verb: GET / POST / PUT / PATCH / DELETE / ANY |
| urlKind | literal-absolute, literal-relative, interpolated-absolute, interpolated-relative, config, concatenation, variable, request-object, or expression |
| targetHost | Host pulled from absolute URLs (literal or interpolated leading text) |
| targetService | First non-empty path segment, or the config key when the URL comes from configuration |
| urlConfigKey | The key/property used when urlKind=config (e.g. "CustomerService:BaseUrl") |
| argCount | Argument count (helps spot payload variation across call sites) |
| fileAuth | configured if the file mentions any well-known auth surface (Authorization literal, DefaultRequestHeaders.Authorization, AuthenticationHeaderValue, SetBearerToken, …); absent otherwise |
| file | Relative path of the C# file |
| line | 1-based line number of the call site |
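The urlKind values are easier to read with an example of how a URL argument might map to each one. The Python sketch below is an assumption about the classification, not the codemod's actual rules: it works on the raw argument text rather than on AST nodes, the function name `classify_url` is hypothetical, and the precedence between categories (for instance, a config lookup that is also concatenated) is a guess. request-object is left out because it depends on the argument's type, which text alone does not reveal.

```python
import re
from typing import Tuple

ABSOLUTE = re.compile(r"^https?://", re.IGNORECASE)
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")
CONFIG_INDEXER = re.compile(r'\[\s*"([^"]+)"\s*\]')  # e.g. _config["Some:Key"]


def classify_url(arg: str) -> Tuple[str, str]:
    """Return (urlKind, urlConfigKey) for the text of a C# URL argument."""
    text = arg.strip()
    single_string = text.endswith('"') and text.count('"') == 2
    if single_string and text.startswith('"'):
        prefix = "literal"
        body = text[1:-1]
    elif single_string and text.startswith('$"'):
        prefix = "interpolated"
        body = text[2:-1]
    else:
        prefix = ""
        body = ""
    if prefix:
        suffix = "absolute" if ABSOLUTE.match(body) else "relative"
        return (f"{prefix}-{suffix}", "")
    if "+" in text:
        return ("concatenation", "")
    config = CONFIG_INDEXER.search(text)
    if config:
        return ("config", config.group(1))
    if IDENTIFIER.match(text):
        return ("variable", "")
    return ("expression", "")
```

Under these assumptions, `classify_url('_config["CustomerService:BaseUrl"]')` returns the config kind together with the key, which is the pairing the urlKind and urlConfigKey columns describe.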
Aggregating those rows in Codemod Insight (or jq) gives you the dependency
map the Notion brief asks for:
text
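As an illustration of that aggregation, here is a minimal Python sketch that folds metric rows into caller → callee edges. It assumes the rows have already been loaded into a list of dictionaries keyed by the field names above; how rows are exported is not covered here. `SAMPLE_ROWS`, `dependency_edges`, and the `(unresolved)` label are hypothetical, and the sample values are made up for illustration.

```python
from collections import defaultdict

# Made-up rows using a subset of the documented fields.
SAMPLE_ROWS = [
    {"sourceService": "OrderController", "targetHost": "crm.internal",
     "targetService": "customers", "fileAuth": "configured"},
    {"sourceService": "OrderController", "targetHost": "crm.internal",
     "targetService": "customers", "fileAuth": "configured"},
    {"sourceService": "InvoiceJob", "targetHost": "",
     "targetService": "EmailService:BaseUrl", "fileAuth": "absent"},
]


def dependency_edges(rows):
    """Count call sites per (caller, callee) edge and how many sit in files without auth."""
    edges = defaultdict(lambda: {"calls": 0, "authAbsent": 0})
    for row in rows:
        # Prefer the host; fall back to the service name or config key.
        callee = row.get("targetHost") or row.get("targetService") or "(unresolved)"
        edge = edges[(row["sourceService"], callee)]
        edge["calls"] += 1
        if row.get("fileAuth") == "absent":
            edge["authAbsent"] += 1
    return dict(edges)


if __name__ == "__main__":
    for (caller, callee), stats in sorted(dependency_edges(SAMPLE_ROWS).items()):
        print(f"{caller} -> {callee}: {stats['calls']} call site(s), "
              f"{stats['authAbsent']} in files without auth")
```

Keying edges on targetHost first keeps calls to the same host together even when their path segments differ; grouping on targetService instead would give a finer-grained map.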
Usage
Run against a checkout:
bash
Run only the codemod (no workflow):
bash
Test:
bash
Heuristics & caveats
- targetHost requires either a literal absolute URL or an interpolated
  string whose leading text contains the host. Configured / variable URLs
  surface as urlKind=config / variable with the relevant key in
  urlConfigKey.
- targetService is best-effort: for absolute URLs it is the first path
  segment (/customers/{id} → customers); for configured URLs it falls back
  to the config key (EmailService:BaseUrl); for SendAsync it is empty.
- fileAuth is per file, not per call site. It does not validate scopes,
  token lifetimes, or refresh logic. It only flags files where authorization
  is wired somewhere, so cross-checking each fileAuth=absent row is part
  of the manual triage.
- The codemod assumes ast-grep / tree-sitter understands the C# in the file.
  Generated, partial, or heavily preprocessed files may yield ERROR nodes
  that are silently skipped.
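The targetHost and targetService rules above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the codemod's code: `target_fields` is a hypothetical name, the URL text is assumed to be the string contents without quotes or the `$` prefix, and treating an interpolated host such as `{host}` as unresolved is a guess at how "leading text contains the host" is enforced.

```python
from urllib.parse import urlsplit

URL_TEXT_KINDS = {
    "literal-absolute", "literal-relative",
    "interpolated-absolute", "interpolated-relative",
}


def target_fields(url_kind, url_text="", config_key=""):
    """Derive targetHost / targetService from a classified URL argument."""
    if url_kind == "config":
        # Configured URLs fall back to the config key.
        return {"targetHost": "", "targetService": config_key}
    if url_kind not in URL_TEXT_KINDS:
        # request-object, variable, concatenation, expression: nothing to parse.
        return {"targetHost": "", "targetService": ""}
    parts = urlsplit(url_text)
    # A host that is itself an interpolation hole is not known statically.
    host = "" if "{" in parts.netloc else parts.netloc
    segments = [segment for segment in parts.path.split("/") if segment]
    service = segments[0] if segments else ""
    return {"targetHost": host, "targetService": service}
```

For example, `target_fields("interpolated-absolute", "https://crm.internal/customers/{id}")` resolves both fields, while `target_fields("request-object")` leaves both empty, matching the SendAsync caveat.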
Why this is the right first step
The Notion brief
(How we'd modernize a tangled C# service architecture)
spells it out: boundary problems beat language problems, and you can't fix
boundaries you can't see. Start with the call-graph, prove the edges with
code, then introduce contracts on the worst offenders. This codemod gives you
the call-graph in a deterministic, repeatable way — no guessing, no
hand-drawn diagrams.