How RepoShield decides
We don't hide behind a "results may be incorrect" banner. Every check is documented below with what it actually does, the confidence level you should give it, and the known false-positive cases. If you spot a bad finding, the "report false positive" link on each finding emails us the snippet and we patch the rule.
Scope: what RepoShield does and doesn't do
RepoShield is a pre-clone GitHub repository audit. Given a repo URL, we fetch its file tree, source files, lockfiles, CI workflows, and git history, and run 49 live supply-chain checks against that source (3 more checks are on the roadmap below — they appear with a ROADMAP badge and are listed for transparency, not yet active).
We do not publish the exact regex, AST visitor, or threshold used by each check on this page. Doing so would tell attackers exactly which patterns to obfuscate around. The check purpose, severity, and confidence are shown; the impl stays server-side.
We do not currently scan npm registry tarballs directly. That means a package compromised by a direct npm publish with no corresponding GitHub commit (the Axios March 2026 pattern) won't be caught by scanning the GitHub source: the malicious code lives in the .tgz on registry.npmjs.org, not in the repo. DP-007 (publisher provenance regression) does query the npm registry and diff publisher metadata, so we can flag the OIDC → manual-publish + email-changed signal, but we don't inspect the tarball contents.
Registry-tarball scanning (`shieldrepo audit-package express@4.18.2`) is the v1.1 ship. We're not pretending it's done.
Coverage on large codebases
Scans run inside a 60-second budget (Vercel function ceiling). On large repos (>150 source files) we sample a strategic 120-file subset: manifests, lockfiles, CI workflows, and install hooks are always included, then risk-pattern matches, then size-fill. Every result reports files_scanned / files_total / coverage_pct on the results page. If coverage is <80% the verdict is labeled partial scan; below 30% it's flagged low confidence. For unbounded coverage on enterprise monorepos use the CLI in CI with extended budgets (post-launch v1.1).
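The coverage labels above can be sketched as a small function. This is only an illustration of the stated thresholds: the function name and the "full" label for scans at or above 80% are assumptions, not names taken from RepoShield.

```python
from typing import Tuple


def coverage_label(files_scanned: int, files_total: int) -> Tuple[float, str]:
    """Return (coverage_pct, label) using the 80% and 30% thresholds stated above."""
    if files_total <= 0:
        # Nothing to scan: treat as the weakest label (assumption).
        return 0.0, "low confidence"
    pct = 100.0 * files_scanned / files_total
    if pct < 30:
        label = "low confidence"
    elif pct < 80:
        label = "partial scan"
    else:
        label = "full"  # hypothetical label; the page only names the two degraded states
    return round(pct, 1), label


if __name__ == "__main__":
    # A 400-file repo sampled down to the 120-file subset.
    print(coverage_label(120, 400))
```

Under these assumptions, a repo needs more than 400 source files before the 120-file sample drops below 30% coverage.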
Deterministic match. If this fires on a non-malicious repo, it is almost certainly a real finding worth investigating.
Pattern-based heuristic. Catches the malicious case reliably but occasionally fires on unusual but legitimate code (e.g. legitimate uses of eval, child_process). Review the snippet.
Indicator of suspicion. Not by itself a reason to block a build — use it together with other findings or as a flag for human review.
Scoring. Each finding contributes a severity-weighted penalty (critical = 28, medium = 10, info = 4) with diminishing returns (the nth subsequent finding of the same severity is scaled by 0.75^n). Mature repos (>2y old, >500 stars) get a 0.75× discount; very mature (>4y, >5k stars) get 0.55×. Five or more critical findings trigger a floor penalty of 65 to force the "danger" verdict regardless of reputation (raised from 3 after validation showed mature repos can legitimately accumulate 3-4 criticals from CI patterns + stale OSV advisories).
Verdict. Score < 40 = danger · 40–74 = caution · 75+ = trust.
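The scoring and verdict rules can be worked through with a short sketch. Several details are assumptions made for illustration: that the score is 100 minus the total penalty, that the maturity discount multiplies the summed penalty, that the first finding of a severity is unscaled, and that the floor is applied after the discount. The function names are hypothetical.

```python
WEIGHTS = {"critical": 28, "medium": 10, "info": 4}
DECAY = 0.75  # diminishing-returns factor per additional finding of one severity


def maturity_discount(age_years: float, stars: int) -> float:
    """Penalty multiplier for mature and very mature repos."""
    if age_years > 4 and stars > 5000:
        return 0.55
    if age_years > 2 and stars > 500:
        return 0.75
    return 1.0


def score(findings, age_years: float, stars: int) -> float:
    """findings is a list of severity strings, e.g. ["critical", "info"]."""
    penalty = 0.0
    seen = {}
    for severity in findings:
        n = seen.get(severity, 0)          # earlier findings of this severity
        penalty += WEIGHTS[severity] * DECAY ** n
        seen[severity] = n + 1
    penalty *= maturity_discount(age_years, stars)
    if seen.get("critical", 0) >= 5:
        penalty = max(penalty, 65.0)       # floor forces the "danger" band
    return max(0.0, 100.0 - penalty)       # assumption: score = 100 - penalty


def verdict(value: float) -> str:
    if value < 40:
        return "danger"
    if value < 75:
        return "caution"
    return "trust"


if __name__ == "__main__":
    s = score(["critical"] * 5, age_years=5, stars=10_000)
    print(s, verdict(s))
```

With these assumptions, five criticals on a very mature repo would carry a discounted penalty of about 47, and the floor lifts it to 65, which puts the score at 35 and the verdict at danger.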
Fixture suppression. Findings inside tests/, __tests__/, fixtures/, examples/, docs/, templates/, and e2e/ directories are suppressed for secret-pattern checks (SK-001/002/003/005) because real provider regex hits in these paths are almost always intentional fakes.
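A minimal sketch of the suppression rule, assuming a finding is suppressed when any directory component of its path matches one of the listed names (matching at any depth is an assumption, as is the function name):

```python
from pathlib import PurePosixPath

SUPPRESSED_DIRS = {"tests", "__tests__", "fixtures", "examples", "docs", "templates", "e2e"}
SECRET_CHECKS = {"SK-001", "SK-002", "SK-003", "SK-005"}


def is_suppressed(check_id: str, path: str) -> bool:
    """True if a secret-pattern finding at `path` falls inside a fixture-style directory."""
    if check_id not in SECRET_CHECKS:
        return False
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    return any(part in SUPPRESSED_DIRS for part in directories)


if __name__ == "__main__":
    print(is_suppressed("SK-001", "packages/api/tests/fake_keys.js"))
    print(is_suppressed("SK-001", "src/config.js"))
```

Only the directory components are compared, so a file such as src/tests.js is not suppressed.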
Install Script Risks
Pre/postinstall hooks, curl|bash patterns, binary drops.
Dangerous npm Lifecycle Scripts
Inspects preinstall/postinstall/prepare in package.json. Any shell command here is immediately suspicious.
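As a rough illustration of what inspecting these hooks involves (this is not the server-side rule, and the function name is hypothetical), the three lifecycle entries can be pulled out of a package.json like so:

```python
import json

LIFECYCLE_HOOKS = ("preinstall", "postinstall", "prepare")


def lifecycle_scripts(package_json_text: str) -> dict:
    """Return the lifecycle hooks declared in a package.json, keyed by hook name."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in LIFECYCLE_HOOKS if hook in scripts}


if __name__ == "__main__":
    sample = '{"name": "demo", "scripts": {"test": "jest", "postinstall": "node setup.js"}}'
    print(lifecycle_scripts(sample))
```

Anything this returns would still need a reviewer or a further rule to judge whether the command is benign.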
curl | bash Patterns
Any variant of curl/wget piped to sh/bash/zsh — the single most abused malware delivery pattern.
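To make the pattern concrete, here is a deliberately simple regex for the most common form. It is an illustrative assumption, not the rule RepoShield runs, and it misses variants such as `bash <(curl ...)` or command substitution.

```python
import re

# curl or wget, anything up to a pipe on the same line, then a shell (optionally via sudo).
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:sh|bash|zsh)\b"
)


def has_pipe_to_shell(text: str) -> bool:
    """True if the text contains a download piped straight into a shell."""
    return PIPE_TO_SHELL.search(text) is not None


if __name__ == "__main__":
    print(has_pipe_to_shell("curl -fsSL https://example.com/install.sh | bash"))
    print(has_pipe_to_shell("curl https://example.com/file -o file"))
```

The trailing word boundary keeps commands like `curl ... | shasum` from matching.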
Hidden node_modules Commit
node_modules committed to repo. Bypasses npm integrity checks and can contain modified package code.
Dockerfile Remote Execution
RUN curl|bash, ADD http://, or secret exfil via build args in Dockerfile.
Binary Executables in Repo
Finds ELF/Mach-O/PE binaries. Almost never legitimate in a source repo. Often pre-compiled malware.
setup.py exec / install hooks
Arbitrary code inside setup.py / pyproject.toml that runs at pip install. The Python analogue of npm postinstall.
Advanced check — details withheld
One of the Maxx-tier sensitive detectors. The name, description, and detection signal are intentionally omitted from the public methodology so attackers can't build evasion. Maxx subscribers see the full details inside their dashboard.
Code Patterns
AST-level signals: eval, child_process, sensitive file reads.
eval() / Function() Usage
Detects runtime code evaluation. Almost never legitimate in modern code. Used to hide payloads from static scanners.
child_process Shell Execution
Flags exec/spawn/execSync with dynamic or user-controlled arguments. Command injection + RCE vector.
Dynamic require()
require() with variable or template literal arguments. Used to load obfuscated modules at runtime.
Base64 Payload Detection
Finds base64 strings >500 chars, especially when paired with Buffer.from() + eval. Classic payload hiding.
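The length criterion can be illustrated with a short sketch. It assumes a "base64 string" means an unbroken run of more than 500 base64-alphabet characters that also decodes cleanly; the real check and its pairing with Buffer.from() + eval are not reproduced here.

```python
import base64
import re

# More than 500 base64-alphabet characters, with optional padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{501,}={0,2}")


def long_base64_strings(source: str) -> list:
    """Return runs longer than 500 chars that decode as valid base64."""
    hits = []
    for match in B64_RUN.finditer(source):
        blob = match.group(0)
        try:
            base64.b64decode(blob, validate=True)
        except ValueError:  # binascii.Error is a ValueError subclass
            continue
        hits.append(blob)
    return hits


if __name__ == "__main__":
    payload = base64.b64encode(b"x" * 600).decode()
    code = 'eval(Buffer.from("%s", "base64").toString())' % payload
    print(len(long_base64_strings(code)))
```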
Hex/Unicode Obfuscation
Detects strings built entirely from \x## or \u#### sequences — the classic 'hidden string' obfuscation trick.
Sensitive File Access
Flags fs reads of ~/.ssh, ~/.aws, ~/.env, browser cookie stores, wallet files, keychain paths.
Crypto-Stealer Heuristics
Patterns for wallet file access, MetaMask/Phantom store reads, clipboard hijack + address-regex replace.
Prototype Pollution Sinks
Flags Object.assign / lodash merge / JSON.parse reviver patterns that taint Object.prototype.
Cross-File Behavior Chain Analysis
Correlates signals across files: env read → encrypt → fetch — detects multi-step exfil that single-file AST misses.
Anti-Analysis / Debugger Detection
Code that alters behavior when a debugger is attached, or checks env like CI, NODE_INSPECT. Classic evasion.
TLS / Cert Validation Bypass
Detects code that disables TLS chain or hostname verification (rejectUnauthorized:false, NODE_TLS_REJECT_UNAUTHORIZED=0, verify=False). Lets a self-signed C2 cert MITM the connection.
Hardcoded Backdoor / Magic-String Auth
Catches `if (token === 'supersecret')`-style auth bypasses and undocumented `x-debug` / `x-admin` header gates that almost never appear in legit code.
npm Token Theft / Self-Propagating Worm
Catches the defining 2025-2026 attack pattern: code that reads `_authToken` out of `~/.npmrc`, runs `npm whoami` / `npm access ls-packages` to enumerate publishable packages, then spawns `npm publish` from runtime. Almost no legitimate package does any of these from runtime code.
Dependencies
Known-bad packages, typosquats, fresh deps, transitive risk.
Known Malicious Package DB
Cross-references package names against OSV, GitHub Advisory DB, npm security feed. Instant block for known bad.
Typosquatting Detection
Levenshtein distance against top 5000 npm packages. Catches 'lodahs', 'expresss', 'reakt' style attacks.
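For readers unfamiliar with the metric, a minimal sketch of Levenshtein-based matching follows. The three-package list stands in for the top 5000, and the distance cutoff of 2 is an assumed value, not RepoShield's threshold.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def possible_typosquat(name: str, popular: list, max_distance: int = 2):
    """Return the popular package `name` is suspiciously close to, or None."""
    if name in popular:
        return None
    best = min(popular, key=lambda p: levenshtein(name, p))
    return best if levenshtein(name, best) <= max_distance else None


if __name__ == "__main__":
    popular = ["lodash", "express", "react"]
    for candidate in ("lodahs", "expresss", "reakt"):
        print(candidate, "->", possible_typosquat(candidate, popular))
```

Note that plain Levenshtein counts a transposition such as 'lodahs' as two edits, which is why a cutoff of 1 would miss it.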
Very New Dependency
Flags packages published <30 days ago with <1000 downloads. Fresh packages are statistically most likely to be malicious.
Deep Transitive Dependency Scan
Recursively scans full dependency tree (not just direct). Most malware hides 3-4 levels deep.
Maintainer Compromise Heuristics
Flags when a long-dormant package is published by a new npm account within 24h — classic account takeover signature.
Publisher Provenance Regression
Catches the Axios-2026 attack signature: a package that previously published with OIDC + SLSA provenance suddenly publishes without it, or with a different maintainer email. Either signal alone is a five-alarm fire.
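The two signals can be expressed as a comparison between consecutive publish records. The record shape and field names below are hypothetical and are not the npm registry's actual metadata schema; the sketch only shows the either-signal logic described above.

```python
from dataclasses import dataclass


@dataclass
class PublishRecord:
    # Hypothetical, simplified view of one published version.
    version: str
    has_provenance: bool
    publisher_email: str


def provenance_regression(previous: PublishRecord, current: PublishRecord) -> list:
    """Return the regression signals seen between two consecutive publishes."""
    signals = []
    if previous.has_provenance and not current.has_provenance:
        signals.append("provenance_dropped")
    if previous.publisher_email.lower() != current.publisher_email.lower():
        signals.append("publisher_email_changed")
    return signals


if __name__ == "__main__":
    old = PublishRecord("1.0.0", True, "maintainer@example.com")
    new = PublishRecord("1.0.1", False, "someone-else@example.net")
    print(provenance_regression(old, new))
```

A non-empty result from either condition would be enough to raise the finding.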
Package Lock Drift
package-lock / pnpm-lock references registries, hashes, or tarball URLs that don't match the declared package sources.
Network Indicators
Hardcoded URLs, raw IPs, env exfil, webhook C2 channels.
Hardcoded URL Extraction
Extracts all URLs from source. Compares against known-bad domain list + flags suspicious TLDs (.tk, .ml, free DDNS).
Raw IP Address Communication
Code talking to raw IPs instead of domains. Common evasion technique to avoid DNS-based blocking.
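A simplified sketch of spotting raw-IP URLs, limited to IPv4 and http(s). Skipping loopback and private ranges is an assumption made here to cut local-development noise; it is not a documented RepoShield behavior.

```python
import ipaddress
import re

# An http(s) URL whose host is a dotted quad and does not continue as a domain name.
IP_URL = re.compile(r"https?://([0-9]{1,3}(?:\.[0-9]{1,3}){3})(?![\w.-])")


def raw_ip_hosts(source: str) -> list:
    """Return public IPv4 addresses used directly as URL hosts."""
    hits = []
    for match in IP_URL.finditer(source):
        try:
            ip = ipaddress.ip_address(match.group(1))
        except ValueError:  # e.g. an octet above 255
            continue
        if ip.is_loopback or ip.is_private:
            continue
        hits.append(str(ip))
    return hits


if __name__ == "__main__":
    print(raw_ip_hosts('fetch("http://8.8.8.8:8080/collect")'))
```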
Environment Variable Exfiltration
process.env access immediately followed by HTTP POST/fetch. Credential theft signature.
Webhook & Discord/Telegram Bots
Flags hardcoded Discord/Telegram webhooks — popular C2 channels for low-effort malware.
DNS Over HTTPS Abuse
Code routes traffic through DoH endpoints (cloudflare-dns, google, quad9) to evade DNS monitoring.
Advanced check — details withheld
One of the Maxx-tier sensitive detectors. The name, description, and detection signal are intentionally omitted from the public methodology so attackers can't build evasion. Maxx subscribers see the full details inside their dashboard.
Decentralized C2 / IPFS / ICP Canister Exfil
Detects calls to ICP canister endpoints (*.icp0.io, *.ic0.app, raw.icp.host), IPFS gateways (ipfs.io, pinata.cloud, w3s.link, dweb.link, etc.), and bare ICP canister IDs near network sinks. The defining exfil channel of 2025-2026 npm worms (CanisterWorm, Shai-Hulud variants).
Secrets & Credentials
Committed .env, API keys, private keys, git history leaks.
Committed .env / credentials files
Files named .env, credentials.json, service-account.json in the working tree. Instant leak signal.
Hardcoded API Keys
Provider-specific tokens (AWS AKIA, OpenAI sk-, Stripe sk_live, Google AIza…) in source.
Private Key Blocks
BEGIN RSA/OPENSSH/EC PRIVATE KEY blocks committed to source or release artifacts.
Git History Secret Scan
Deep-scans every commit in history — secrets deleted in HEAD but still reachable via git log are still leaked.
High-Entropy String Clusters
Catches unknown-format secrets by entropy + length + character-class mix, tuned to avoid base64 false positives.
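The three ingredients named above (entropy, length, character-class mix) can be combined as in the sketch below. The thresholds of 20 characters, 4.0 bits per character, and three character classes are illustrative assumptions, not the tuned values.

```python
import math
import string
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def character_classes(s: str) -> int:
    """Count how many of lower/upper/digit/other classes appear in s."""
    return sum([
        any(ch in string.ascii_lowercase for ch in s),
        any(ch in string.ascii_uppercase for ch in s),
        any(ch in string.digits for ch in s),
        any(not ch.isalnum() for ch in s),
    ])


def looks_like_secret(s: str, min_len: int = 20, min_entropy: float = 4.0) -> bool:
    """Assumed thresholds; a real rule would be tuned against false positives."""
    return (
        len(s) >= min_len
        and shannon_entropy(s) >= min_entropy
        and character_classes(s) >= 3
    )


if __name__ == "__main__":
    print(looks_like_secret("q7Zp2LmX9vRt4KbN8wYc1HsD"))
    print(looks_like_secret("aaaaaaaaaaaaaaaaaaaaaaaa"))
```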
Advanced check — details withheld
One of the Maxx-tier sensitive detectors. The name, description, and detection signal are intentionally omitted from the public methodology so attackers can't build evasion. Maxx subscribers see the full details inside their dashboard.
CI/CD & Actions
Unpinned actions, over-privileged workflows, pull_request_target abuse.
Unpinned GitHub Actions
Actions referenced by tag (@v2) instead of SHA — a single upstream compromise can inject arbitrary code into CI.
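A line-based sketch of the pinning check, assuming "pinned" means the ref after @ is a full 40-character commit SHA. It scans `uses:` lines with a regex rather than parsing YAML, and skips local and docker:// references; both choices are simplifications for illustration.

```python
import re

USES = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_yaml: str) -> list:
    """Return `uses:` references that are not pinned to a full commit SHA."""
    hits = []
    for match in USES.finditer(workflow_yaml):
        ref = match.group(1)
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container images are out of scope here
        _action, _, version = ref.partition("@")
        if not FULL_SHA.match(version):
            hits.append(ref)
    return hits


if __name__ == "__main__":
    workflow = (
        "steps:\n"
        "  - uses: actions/checkout@v4\n"
        "  - uses: actions/setup-node@" + "a" * 40 + "\n"
        "  - uses: ./local-action\n"
    )
    print(unpinned_actions(workflow))
```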
pull_request_target Abuse
Workflows using pull_request_target with secrets + untrusted code checkout — a privileged RCE vector.
Over-Privileged Workflow Tokens
permissions: write-all or unscoped GITHUB_TOKEN — amplifies blast radius of any CI compromise.
Release Publishing Chain
Who can push to npm/PyPI/crates? Are secrets scoped? This check maps the full publish trust chain.
Workflow Permission Audit Matrix
Per-workflow, per-trigger permission matrix with diffs across the last 30 commits. Maxx tier only.
Repo Surface & Trust
Age, contributor trust, takeover signals, release integrity.
Repo Age & Activity
Flags repos less than 14 days old with install scripts. New repo + executable payload = classic npm supply chain pattern.
Star/Fork Anomaly
Detects inflated metrics — sudden star bursts, fork-to-star ratio mismatch, stars from zero-activity accounts (bought stars).
Contributor Trust Score
Scores maintainers by account age, public contribution history, verified email, 2FA status. Solo new account = higher risk.
Release vs Code Mismatch
Detects when the tagged release contains code not in any branch. xz-utils style attack where the published tarball differs from git source.
Account Takeover Indicators
Detects force-push to default branch, email change in commit author, or new committer with admin rights in last 72h.
Recently Transferred Repo
Repo transferred to a new owner within 30 days. Common in typosquat / trust-transfer attacks.
Dependency Confusion Window
Public package name collides with an internal private org namespace — opens the door to dependency-confusion attacks.