Methodology

How RepoShield decides

We don't hide behind a "results may be incorrect" banner. Every check is documented below with what it actually does, the confidence level you should give it, and the known false-positive cases. If you spot a bad finding, the "report false positive" link on each finding emails us the snippet and we patch the rule.

Scope: what RepoShield does and doesn't do

RepoShield is a pre-clone GitHub repository audit. Given a repo URL, we fetch its file tree, source files, lockfiles, CI workflows, and git history, and run 49 live supply-chain checks against that source (3 more checks are on the roadmap below — they appear with a ROADMAP badge and are listed for transparency, not yet active).

We do not publish the exact regex, AST visitor, or threshold used by each check on this page. Doing so would tell attackers exactly which patterns to obfuscate around. The check purpose, severity, and confidence are shown; the implementation stays server-side.

We do not currently scan npm registry tarballs directly. That means a package compromised by a direct npm publish with no corresponding GitHub commit (the Axios March 2026 pattern) won't be caught by scanning the GitHub source: the malicious code lives in the .tgz on registry.npmjs.org, not in the repo. DP-007 (publisher provenance regression) does query the npm registry and diff the publisher metadata, so we can flag the OIDC → manual-publish + email-changed signal, but we don't inspect the tarball contents.

Registry-tarball scanning (`shieldrepo audit-package express@4.18.2`) is the v1.1 ship. We're not pretending it's done.

Coverage on large codebases

Scans run inside a 60-second budget (Vercel function ceiling). On large repos (>150 source files) we sample a strategic 120-file subset: manifests, lockfiles, CI workflows, and install hooks always, then risk-pattern matches, then size-fill. Every result reports files_scanned / files_total / coverage_pct on the results page. If coverage is <80% the verdict is labeled partial scan; below 30% it's flagged low confidence. For unbounded coverage on enterprise monorepos use the CLI in CI with extended budgets (post-launch v1.1).
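The coverage labels can be sketched as a small function. This is a minimal illustration of the thresholds stated above, not RepoShield's server-side code; the function name and the "full scan" label for coverage of 80% or more are assumptions made for the example.

```python
def coverage_label(files_scanned: int, files_total: int):
    """Return (coverage_pct, label) using the 80% and 30% thresholds above."""
    if files_total <= 0:
        # Nothing to scan: treat as the weakest label (assumption).
        return 0.0, "low confidence"
    coverage_pct = 100.0 * files_scanned / files_total
    if coverage_pct < 30:
        label = "low confidence"
    elif coverage_pct < 80:
        label = "partial scan"
    else:
        label = "full scan"  # hypothetical label, not named in the text
    return round(coverage_pct, 1), label
```

For example, a 120-file sample of a 200-file repo gives 60% coverage, which falls in the partial scan band.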

HIGH CONFIDENCE

Deterministic match. Even when this fires on a non-malicious repo, it is almost certainly a real finding worth investigating.

MEDIUM CONFIDENCE

Pattern-based heuristic. Catches the malicious case reliably but occasionally fires on unusual but legitimate code (e.g. legitimate uses of eval, child_process). Review the snippet.

HEURISTIC

Indicator of suspicion. Not by itself a reason to block a build — use it together with other findings or as a flag for human review.

Scoring. Each finding contributes a severity-weighted penalty (critical = 28, medium = 10, info = 4) with diminishing returns (each subsequent finding of the same severity scaled by 0.75ⁿ). Mature repos (>2y old, >500 stars) get a 0.75× discount; very mature (>4y, >5k stars) get 0.55×. Five or more critical findings trigger a floor penalty of 65 to force the "danger" verdict regardless of reputation (raised from 3 after validation showed mature repos can legitimately accumulate 3-4 criticals from CI patterns + stale OSV advisories).

Verdict. Score < 40 = danger · 40–74 = caution · 75+ = trust.
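A rough sketch of how the scoring and verdict rules might combine. Several details are not stated above and are assumptions here: that the score is 100 minus the total penalty (clamped at 0), that n counts from 0 within each severity, that the maturity discount applies to the summed penalty, and that the 65 floor is applied after the discount. All function names are hypothetical.

```python
WEIGHTS = {"critical": 28, "medium": 10, "info": 4}

def total_penalty(counts: dict, maturity_discount: float = 1.0) -> float:
    """Sum severity-weighted penalties with 0.75**n diminishing returns."""
    penalty = 0.0
    for severity, count in counts.items():
        weight = WEIGHTS[severity]
        # n = 0 for the first finding of a severity (assumption).
        penalty += sum(weight * 0.75 ** n for n in range(count))
    penalty *= maturity_discount  # 0.75 mature, 0.55 very mature, else 1.0
    if counts.get("critical", 0) >= 5:
        # Floor penalty forces the "danger" verdict regardless of reputation.
        penalty = max(penalty, 65.0)
    return penalty

def verdict(score: float) -> str:
    if score < 40:
        return "danger"
    if score < 75:
        return "caution"
    return "trust"

def score_repo(counts: dict, maturity_discount: float = 1.0):
    # Assumption: score = 100 - penalty, never below 0.
    score = max(0.0, 100.0 - total_penalty(counts, maturity_discount))
    return score, verdict(score)
```

Under these assumptions a single critical finding on a young repo scores 72 (caution), while five criticals on a very mature repo hit the floor and score 35 (danger) even after the 0.55× discount.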

Fixture suppression. Findings inside tests/, __tests__/, fixtures/, examples/, docs/, templates/, and e2e/ directories are suppressed for secret-pattern checks (SK-001/002/003/005) because real provider regex hits in these paths are almost always intentional fakes.
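As an illustration only, the suppression rule could look like the sketch below. It assumes the listed directory names match at any depth in the path, which the text above does not specify; the function name is hypothetical.

```python
from pathlib import PurePosixPath

SUPPRESSED_DIRS = {"tests", "__tests__", "fixtures", "examples",
                   "docs", "templates", "e2e"}
SECRET_CHECKS = {"SK-001", "SK-002", "SK-003", "SK-005"}

def is_suppressed(check_id: str, path: str) -> bool:
    """True when a secret-pattern finding sits under a fixture-style directory."""
    if check_id not in SECRET_CHECKS:
        return False
    # Look only at directory components, not the file name itself.
    directories = PurePosixPath(path).parts[:-1]
    return any(part in SUPPRESSED_DIRS for part in directories)
```

Note that SK-004 (git history scan) is not in the suppressed set, so a hit under tests/ would still be reported for that check.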

Install Script Risks

Pre/postinstall hooks, curl|bash patterns, binary drops.

EX-001 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Dangerous npm Lifecycle Scripts

Inspects preinstall/postinstall/prepare in package.json. Any shell command here is immediately suspicious.
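To show the kind of data this check looks at, here is a minimal sketch that pulls the three lifecycle hooks out of a package.json. It is not the EX-001 rule itself (which is not published); the function name is hypothetical and the sketch does no risk classification of the commands it finds.

```python
import json

LIFECYCLE_HOOKS = ("preinstall", "postinstall", "prepare")

def lifecycle_scripts(package_json_text: str) -> dict:
    """Return the lifecycle hooks declared in a package.json, if any."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in LIFECYCLE_HOOKS if hook in scripts}
```

A reviewer would then read each returned command by hand; an ordinary `test` or `build` script is ignored because npm does not run it at install time.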

EX-002 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

curl | bash Patterns

Any variant of curl/wget piped to sh/bash/zsh — the single most abused malware delivery pattern.
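For readers unfamiliar with the pattern, a deliberately simple regex is sketched below. It is an illustrative approximation, not the production rule, and it misses variants such as process substitution (`bash <(curl ...)`) or a download followed by a separate execution step.

```python
import re

# curl or wget, anything up to a pipe on the same line, then a shell.
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:sh|bash|zsh)\b"
)

def has_pipe_to_shell(text: str) -> bool:
    return PIPE_TO_SHELL.search(text) is not None
```

The trailing word boundary keeps commands like `curl ... | shasum` from matching.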

EX-003 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Hidden node_modules Commit

node_modules committed to repo. Bypasses npm integrity checks and can contain modified package code.

EX-004 · MEDIUM CONFIDENCE · severity medium · tier free · 2 creds

Dockerfile Remote Execution

RUN curl|bash, ADD http://, or secret exfil via build args in Dockerfile.

EX-005 · HEURISTIC · severity critical · tier pro · 2 creds

Binary Executables in Repo

Finds ELF/Mach-O/PE binaries. Almost never legitimate in a source repo. Often pre-compiled malware.
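Executable formats are usually recognized by their leading magic bytes. The sketch below shows that general technique under the assumption that only the first few bytes of each file are inspected; it is not the EX-005 implementation, and a two-byte prefix like `MZ` can also occur at the start of an ordinary text file.

```python
from typing import Optional

# Well-known magic numbers; thin Mach-O appears in both byte orders.
MAGIC_PREFIXES = {
    b"\x7fELF": "ELF",
    b"MZ": "PE",
    b"\xfe\xed\xfa\xce": "Mach-O",  # 32-bit, big-endian
    b"\xfe\xed\xfa\xcf": "Mach-O",  # 64-bit, big-endian
    b"\xce\xfa\xed\xfe": "Mach-O",  # 32-bit, little-endian
    b"\xcf\xfa\xed\xfe": "Mach-O",  # 64-bit, little-endian
}

def binary_format(header: bytes) -> Optional[str]:
    """Classify a file by its first bytes; None if no known magic matches."""
    for magic, name in MAGIC_PREFIXES.items():
        if header.startswith(magic):
            return name
    return None
```

Universal (fat) Mach-O files are left out on purpose because their magic number, `CA FE BA BE`, is shared with Java class files.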

EX-006 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

setup.py exec / install hooks

Arbitrary code inside setup.py / pyproject.toml that runs at pip install. The Python analogue of npm postinstall.

redacted · MAXX · SECRET · tier maxx

Advanced check — details withheld

One of the Maxx-tier sensitive detectors. The name, description, and detection signal are intentionally omitted from the public methodology so attackers can't build evasion. Maxx subscribers see the full details inside their dashboard.

Code Patterns

AST-level signals: eval, child_process, sensitive file reads.

SC-001 · MEDIUM CONFIDENCE · severity critical · tier free · 2 creds

eval() / Function() Usage

Detects runtime code evaluation. Almost never legitimate in modern code. Used to hide payloads from static scanners.

SC-002 · MEDIUM CONFIDENCE · severity critical · tier free · 2 creds

child_process Shell Execution

Flags exec/spawn/execSync with dynamic or user-controlled arguments. Command injection + RCE vector.

SC-003 · HIGH CONFIDENCE · severity medium · tier free · 2 creds

Dynamic require()

require() with variable or template literal arguments. Used to load obfuscated modules at runtime.

SC-004 · MEDIUM CONFIDENCE · severity critical · tier free · 2 creds

Base64 Payload Detection

Finds base64 strings >500 chars, especially when paired with Buffer.from() + eval. Classic payload hiding.
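A minimal sketch of the length-based part of this idea, assuming "more than 500 chars" means a run of at least 501 base64-alphabet characters. It does not look for the Buffer.from() + eval pairing, and it is not the SC-004 rule; long hex strings or minified identifiers can also match, which is one reason the real check is only medium confidence.

```python
import re

def long_base64_runs(source: str, min_length: int = 501) -> list:
    """Return runs of base64-alphabet characters of at least min_length."""
    pattern = re.compile(
        r"(?<![A-Za-z0-9+/])[A-Za-z0-9+/]{%d,}={0,2}" % min_length
    )
    return pattern.findall(source)
```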

SC-005 · HIGH CONFIDENCE · severity medium · tier free · 2 creds

Hex/Unicode Obfuscation

Detects strings built entirely from \x## or \u#### sequences — the classic 'hidden string' obfuscation trick.
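To make the pattern concrete, the sketch below finds quoted literals made up only of `\x##` or `\u####` escapes and decodes them for review. The minimum of four escapes per literal is an assumption chosen for the example, and the function name is hypothetical; this is not the published rule.

```python
import re

ESCAPE = r"\\x[0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}"
# A quoted literal whose whole body is four or more escape sequences.
ESCAPED_LITERAL = re.compile(r"([\"'])((?:%s){4,})\1" % ESCAPE)

def fully_escaped_strings(source: str) -> list:
    """Return the decoded text of every fully escaped string literal."""
    decoded = []
    for _quote, body in ESCAPED_LITERAL.findall(source):
        # Strip the two-character prefix (\x or \u) and parse the hex digits.
        text = re.sub(ESCAPE, lambda m: chr(int(m.group(0)[2:], 16)), body)
        decoded.append(text)
    return decoded
```

A literal such as `"\x65\x76\x61\x6c"` decodes to `eval`, while a normal string with one accented character written as `\u00e9` is ignored.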

SC-006 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Sensitive File Access

Flags fs reads of ~/.ssh, ~/.aws, ~/.env, browser cookie stores, wallet files, keychain paths.

SC-007 · MEDIUM CONFIDENCE · severity critical · tier free · 3 creds

Crypto-Stealer Heuristics

Patterns for wallet file access, MetaMask/Phantom store reads, clipboard hijack + address-regex replace.

SC-008 · MEDIUM CONFIDENCE · severity medium · tier pro · 2 creds

Prototype Pollution Sinks

Flags Object.assign / lodash merge / JSON.parse reviver patterns that taint Object.prototype.

SC-009 · MEDIUM CONFIDENCE · severity critical · tier maxx · 3 creds

Cross-File Behavior Chain Analysis

Correlates signals across files: env read → encrypt → fetch — detects multi-step exfil that single-file AST misses.

SC-010 · MEDIUM CONFIDENCE · severity medium · tier pro · 2 creds

Anti-Analysis / Debugger Detection

Code that alters behavior when a debugger is attached, or checks env like CI, NODE_INSPECT. Classic evasion.

SC-011 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

TLS / Cert Validation Bypass

Detects code that disables TLS chain or hostname verification (rejectUnauthorized:false, NODE_TLS_REJECT_UNAUTHORIZED=0, verify=False). Lets a self-signed C2 cert MITM the connection.

SC-012 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Hardcoded Backdoor / Magic-String Auth

Catches `if (token === 'supersecret')`-style auth bypasses and undocumented `x-debug` / `x-admin` header gates that almost never appear in legit code.

SC-013 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

npm Token Theft / Self-Propagating Worm

Catches the defining 2025-2026 attack pattern: code that reads `_authToken` out of `~/.npmrc`, runs `npm whoami` / `npm access ls-packages` to enumerate publishable packages, then spawns `npm publish` from runtime. Almost no legitimate package does any of these from runtime code.

Dependencies

Known-bad packages, typosquats, fresh deps, transitive risk.

DP-001 · HIGH CONFIDENCE · severity critical · tier free · 3 creds

Known Malicious Package DB

Cross-references package names against OSV, GitHub Advisory DB, npm security feed. Instant block for known bad.

DP-002 · HIGH CONFIDENCE · severity critical · tier free · 3 creds

Typosquatting Detection

Levenshtein distance against top 5000 npm packages. Catches 'lodahs', 'expresss', 'reakt' style attacks.
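The core of this approach is plain edit distance. The sketch below assumes a maximum distance of 2 and takes the popular-package list as a parameter; both the threshold and the function names are illustrative, not the values DP-002 actually uses.

```python
from typing import Optional

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def typosquat_of(name: str, popular: list, max_distance: int = 2) -> Optional[str]:
    """Return the popular package that name appears to imitate, if any."""
    if name in popular:
        return None  # exact match is the real package
    for candidate in popular:
        if levenshtein(name, candidate) <= max_distance:
            return candidate
    return None
```

With `["lodash", "express", "react"]` as the list, `lodahs` is at distance 2 from `lodash` (a swap counts as two substitutions here), and `expresss` and `reakt` are each at distance 1 from their targets. Very short package names would need a tighter threshold to avoid false positives.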

DP-003 · MEDIUM CONFIDENCE · severity medium · tier free · 2 creds

Very New Dependency

Flags packages published <30 days ago with <1000 downloads. Fresh packages are statistically most likely to be malicious.
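The two thresholds combine as a simple conjunction. In this sketch the publish date and download count are passed in by the caller; where they come from, and which download window is counted, is not stated above and is left as an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def is_very_new(published_at: datetime, downloads: int,
                now: Optional[datetime] = None) -> bool:
    """True when a package is under 30 days old AND under 1000 downloads."""
    now = now or datetime.now(timezone.utc)
    return (now - published_at) < timedelta(days=30) and downloads < 1000
```

Both conditions must hold, so a week-old package with 50,000 downloads is not flagged by this rule.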

DP-004 · MEDIUM CONFIDENCE · severity medium · tier pro · 3 creds

Deep Transitive Dependency Scan

Recursively scans full dependency tree (not just direct). Most malware hides 3-4 levels deep.

DP-005 · MEDIUM CONFIDENCE · severity critical · tier pro · 3 creds

Maintainer Compromise Heuristics

Flags when a long-dormant package is published by a new npm account within 24h — classic account takeover signature.

DP-007 · HIGH CONFIDENCE · severity critical · tier free · 3 creds

Publisher Provenance Regression

Catches the Axios-2026 attack signature: a package that previously published with OIDC + SLSA provenance suddenly publishes without it, or with a different maintainer email. Either signal alone is a five-alarm fire.

DP-006 · HIGH CONFIDENCE · severity medium · tier pro · 2 creds

Package Lock Drift

package-lock / pnpm-lock references registries, hashes, or tarball URLs that don't match the declared package sources.

Network Indicators

Hardcoded URLs, raw IPs, env exfil, webhook C2 channels.

NW-001 · MEDIUM CONFIDENCE · severity medium · tier free · 2 creds

Hardcoded URL Extraction

Extracts all URLs from source. Compares against known-bad domain list + flags suspicious TLDs (.tk, .ml, free DDNS).

NW-002 · MEDIUM CONFIDENCE · severity medium · tier free · 2 creds

Raw IP Address Communication

Code talking to raw IPs instead of domains. Common evasion technique to avoid DNS-based blocking.
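One way to spot this, sketched with the Python standard library: extract URLs, then test whether the host parses as an IP literal. Skipping loopback addresses is an assumption added here to cut obvious local-development noise; the URL regex is deliberately loose and the function name is hypothetical.

```python
import ipaddress
import re
from urllib.parse import urlsplit

URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")

def raw_ip_urls(source: str) -> list:
    """Return URLs in source whose host is an IP address rather than a name."""
    hits = []
    for url in URL_PATTERN.findall(source):
        try:
            host = urlsplit(url).hostname
            address = ipaddress.ip_address(host or "")
        except ValueError:
            continue  # malformed URL or a normal domain name
        if not address.is_loopback:
            hits.append(url)
    return hits
```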

NW-003 · MEDIUM CONFIDENCE · severity critical · tier pro · 3 creds

Environment Variable Exfiltration

process.env access immediately followed by HTTP POST/fetch. Credential theft signature.

NW-005 · HIGH CONFIDENCE · severity critical · tier pro · 2 creds

Webhook & Discord/Telegram Bots

Flags hardcoded Discord/Telegram webhooks — popular C2 channels for low-effort malware.

NW-004 · HIGH CONFIDENCE · severity medium · tier pro · 2 creds

DNS Over HTTPS Abuse

Code routes traffic through DoH endpoints (cloudflare-dns, google, quad9) to evade DNS monitoring.

redacted · MAXX · SECRET · tier maxx

Advanced check — details withheld

NW-007 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Decentralized C2 / IPFS / ICP Canister Exfil

Detects calls to ICP canister endpoints (*.icp0.io, *.ic0.app, raw.icp.host), IPFS gateways (ipfs.io, pinata.cloud, w3s.link, dweb.link, etc.), and bare ICP canister IDs near network sinks. The defining exfil channel of 2025-2026 npm worms (CanisterWorm, Shai-Hulud variants).

Secrets & Credentials

Committed .env, API keys, private keys, git history leaks.

SK-001 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Committed .env / credentials files

Files named .env, credentials.json, service-account.json in the working tree. Instant leak signal.

SK-002 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Hardcoded API Keys

Provider-specific tokens (AWS AKIA, OpenAI sk-, Stripe sk_live, Google AIza…) in source.
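Provider prefixes make these tokens easy to match by shape. The patterns below are commonly cited approximations for three of the providers named above, not RepoShield's rules and not authoritative format specifications; the OpenAI `sk-` format is left out because its exact shape is not assumed here.

```python
import re

# Approximate, widely used shapes; real provider formats may differ.
KEY_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_secret": re.compile(r"\bsk_live_[0-9A-Za-z]{24,}\b"),
    "google_api_key": re.compile(r"\bAIza[0-9A-Za-z_\-]{35}"),
}

def find_keys(source: str) -> list:
    """Return (pattern_name, matched_text) pairs found in source."""
    findings = []
    for name, pattern in KEY_PATTERNS.items():
        for match in pattern.findall(source):
            findings.append((name, match))
    return findings
```

A hit on a documented placeholder such as AWS's `AKIAIOSFODNN7EXAMPLE` is exactly the kind of intentional fake that the fixture suppression rule is meant to absorb.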

SK-003 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

Private Key Blocks

BEGIN RSA/OPENSSH/EC PRIVATE KEY blocks committed to source or release artifacts.

SK-004 · HIGH CONFIDENCE · severity critical · tier pro · 3 creds

Git History Secret Scan

Deep-scans every commit in history — secrets deleted in HEAD but still reachable via git log are still leaked.

SK-005 · HEURISTIC · severity medium · tier pro · 2 creds

High-Entropy String Clusters

Catches unknown-format secrets by entropy + length + character-class mix, tuned to avoid base64 false positives.
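Entropy-based detection usually starts from Shannon entropy over the characters of a token. The sketch below combines it with length and character-class checks as described; the specific thresholds (20 characters, 4.0 bits per character, two classes) are illustrative assumptions, not the tuned values behind SK-005.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, from observed character frequencies."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(text).values())

def looks_like_secret(token: str, min_length: int = 20,
                      min_entropy: float = 4.0) -> bool:
    # Count how many of lower / upper / digit classes appear in the token.
    classes = sum([any(c.islower() for c in token),
                   any(c.isupper() for c in token),
                   any(c.isdigit() for c in token)])
    return (len(token) >= min_length
            and classes >= 2
            and shannon_entropy(token) >= min_entropy)
```

A repeated character has zero entropy and a string of two equally frequent characters has exactly one bit per character, so long but repetitive strings are rejected.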

redacted · MAXX · SECRET · tier maxx

Advanced check — details withheld

CI/CD & Actions

Unpinned actions, over-privileged workflows, pull_request_target abuse.

CI-001 · HIGH CONFIDENCE · severity medium · tier free · 2 creds

Unpinned GitHub Actions

Actions referenced by tag (@v2) instead of SHA — a single upstream compromise can inject arbitrary code into CI.
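A line-based sketch of the idea, assuming a reference counts as pinned only when the part after `@` is a full 40-character commit SHA. It scans the workflow text with a regex instead of parsing YAML, skips local (`./`) and `docker://` references, and is not the CI-001 implementation.

```python
import re

USES = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"[0-9a-f]{40}")

def unpinned_actions(workflow_text: str) -> list:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for reference in USES.findall(workflow_text):
        if reference.startswith(("./", "docker://")):
            continue  # local actions and container images are out of scope
        _action, _sep, ref = reference.partition("@")
        if not FULL_SHA.fullmatch(ref):
            unpinned.append(reference)
    return unpinned
```

A reference like `actions/checkout@v4` is reported, while the same action followed by a 40-character hex SHA is not.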

CI-002 · HIGH CONFIDENCE · severity critical · tier free · 2 creds

pull_request_target Abuse

Workflows using pull_request_target with secrets + untrusted code checkout — a privileged RCE vector.

CI-003 · MEDIUM CONFIDENCE · severity medium · tier free · 2 creds

Over-Privileged Workflow Tokens

permissions: write-all or unscoped GITHUB_TOKEN — amplifies blast radius of any CI compromise.

CI-004 · MEDIUM CONFIDENCE · severity medium · tier pro · 2 creds

Release Publishing Chain

Who can push to npm/PyPI/crates? Are secrets scoped? This check maps the full publish trust chain.

CI-005 · MEDIUM CONFIDENCE · severity info · tier maxx · 3 creds

Workflow Permission Audit Matrix

Per-workflow, per-trigger permission matrix with diffs across the last 30 commits. Maxx tier only.

Repo Surface & Trust

Age, contributor trust, takeover signals, release integrity.

RS-001 · HEURISTIC · severity medium · tier free · 1 cred

Repo Age & Activity

Flags repos less than 14 days old with install scripts. New repo + executable payload = classic npm supply chain pattern.

RS-002 · HEURISTIC · severity medium · tier free · 2 creds

Star/Fork Anomaly

Detects inflated metrics — sudden star bursts, fork-to-star ratio mismatch, stars from zero-activity accounts (bought stars).

RS-003 · HEURISTIC · severity info · tier free · 1 cred

Contributor Trust Score

Scores maintainers by account age, public contribution history, verified email, 2FA status. Solo new account = higher risk.

RS-006 · MEDIUM CONFIDENCE · severity critical · tier pro · 3 creds

Release vs Code Mismatch

Detects when the tagged release contains code not in any branch. xz-utils style attack where the published tarball differs from git source.

RS-004 · MEDIUM CONFIDENCE · severity critical · tier pro · 3 creds

Account Takeover Indicators

Detects force-push to default branch, email change in commit author, or new committer with admin rights in last 72h.

RS-005 · MEDIUM CONFIDENCE · severity medium · tier free · 1 cred

Recently Transferred Repo

Repo transferred to a new owner within 30 days. Common in typosquat / trust-transfer attacks.

RS-007 · MEDIUM CONFIDENCE · severity critical · tier maxx · 3 creds

Dependency Confusion Window

Public package name collides with an internal private org namespace — opens the door to dependency-confusion attacks.
