What is SAST and why does it matter?

SAST (Static Application Security Testing) analyses source code, bytecode, or binaries without executing the application. It finds injection flaws, hardcoded secrets, insecure library use, and logic errors earlier in the development cycle than DAST.

Which programming languages does Pencheff SAST support?

Pencheff runs Semgrep, Bandit (Python), gosec (Go), Brakeman (Ruby on Rails), ESLint security rules (JavaScript/TypeScript), and a tree-sitter pack for niche languages that those tools don't reach cleanly, starting with Solidity. CodeQL was removed in v0.7.

How does Pencheff find hardcoded secrets in code?

Pencheff runs gitleaks over the full git history and working tree, detecting API keys, tokens, passwords, and private keys across all commits — not just the current HEAD. YARA rules additionally flag malware patterns and backdoors.

Does SAST replace DAST, or do they complement each other?

They complement each other. SAST finds flaws in code that may not be reachable at runtime, while DAST finds runtime vulnerabilities that may not be apparent from reading the source. Pencheff combines both into a unified findings stream with de-duplication.

SAST and secrets

Semgrep, Bandit, gosec, Brakeman, ESLint security, tree-sitter rules, and gitleaks.

Repository scanning gives source findings the same operational treatment as runtime findings: scanner provenance, line-level evidence, remediation guidance, SARIF, GitHub annotations, and fix state.

Scope: Security Surfaces

Section: Platform

Method: Deterministic-first

Output: Unified evidence

Profile: Code security

Coverage

What does SAST and secrets test?

Semgrep OSS packs, Bandit, gosec, Brakeman, ESLint security, tree-sitter rules, and niche-language scaffolds.
Secret detection with gitleaks and suspicious-code indicators with YARA-style patterns.
GitHub repository connection, webhook-triggered scans, hardlink staging, gitignore-aware filtering, and default-deny controls.
SARIF and GitHub check run output so developers see findings where they work.
Auto-fix preparation for Semgrep autofix, SCA version bumps, and reviewer-friendly patch synthesis.

Execution

How does Pencheff run this?

Connect or register a repository and choose a branch, scan profile, and scanner policy.
Stage the source safely, fan out language-specific scanners, and capture raw scanner output.
Normalize results into repo findings with file, line, rule, severity, scanner, and remediation metadata.
Merge code results with SCA, IaC, secrets, and runtime context to reduce duplicate triage.
Send annotations, SARIF, reports, fix PRs, or dashboard tasks depending on the workflow.

Evidence

What evidence does this produce?

File path, line number, rule id, scanner name, confidence, language, and vulnerable snippet context.
Suggested fix, fixed-version data when applicable, and status across suppressions or rechecks.
GitHub check output, SARIF upload, comments, and links back into the finding record.
Cross-finding signals when a code pattern aligns with runtime exploitation.

Controls

How is this kept safe to run?

Scanner choices are explicit and permissively licensed where used in the repo pipeline.
Secrets are handled as findings rather than echoed into broad UI surfaces.
CI gates can be tuned by severity, reachability, policy, and target branch.
Generated fixes remain reviewer-owned and trace back to original scanner evidence.

From the Pencheff docs

Scanners

Each repo scan fans out to several scanners. Every match is normalised into a shared RepoFinding row so the UI and the API don't care which engine produced it.
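
As a rough illustration of that normalisation step, the sketch below maps one raw scanner record onto a shared row. The field names on this RepoFinding class and the keys of the raw record are assumptions made for the example; the real schema and each scanner's actual output format will differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class RepoFinding:
    """Hypothetical shared row; the real RepoFinding schema may differ."""

    scanner: str
    rule_id: str
    file_path: str
    line_start: int
    line_end: int
    severity: str
    message: str
    remediation: Optional[str] = None


def normalise(scanner: str, raw: dict[str, Any]) -> RepoFinding:
    """Map one raw record (hypothetical key names) onto the shared row."""
    line_start = int(raw["line"])
    return RepoFinding(
        scanner=scanner,
        rule_id=str(raw["rule"]),
        file_path=str(raw["path"]),
        line_start=line_start,
        # Single-line findings reuse the start line as the end line.
        line_end=int(raw.get("end_line", line_start)),
        severity=str(raw.get("severity", "info")).lower(),
        message=str(raw.get("message", "")),
        remediation=raw.get("fix"),
    )
```

Because every engine lands in the same shape, code downstream of normalise only needs to read the scanner field when it wants provenance.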

CodeQL was removed in v0.7 — the CodeQL CLI is not licensed for commercial use on third-party code, and Pencheff scans customer code. The SAST role is now filled by the five permissively-licensed tools listed below, all run as subprocesses (no static linking).

Semgrep OSS — multi-language SAST

Pinned to an explicit allowlist of OSS Semgrep Registry packs — never --config=auto, never any Semgrep Pro Engine / Pro rules. Default pack list:

p/owasp-top-ten p/security-audit p/cwe-top-25 p/secrets p/jwt
p/django p/flask p/express p/nodejs p/golang p/r2c-security-audit

Override per-deployment with the PENCHEFF_SEMGREP_PACKS env var (comma-separated). The runner script lives at bench/runners/semgrep.sh. License: LGPL-2.1 (subprocess-only).
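
A minimal sketch of how the override could be resolved, assuming the env var replaces the default list wholesale and that each pack is passed to Semgrep as a repeated --config flag. The function names and the guard against auto are illustrative assumptions, not the contents of bench/runners/semgrep.sh.

```python
import os

# Default pack list copied from the docs above.
DEFAULT_PACKS = [
    "p/owasp-top-ten", "p/security-audit", "p/cwe-top-25", "p/secrets", "p/jwt",
    "p/django", "p/flask", "p/express", "p/nodejs", "p/golang",
    "p/r2c-security-audit",
]


def resolve_packs(environ=None):
    """Return the pack allowlist, honouring PENCHEFF_SEMGREP_PACKS if set."""
    env = os.environ if environ is None else environ
    raw = env.get("PENCHEFF_SEMGREP_PACKS", "")
    packs = [p.strip() for p in raw.split(",") if p.strip()]
    # Assumption: the allowlist policy also applies to overrides.
    if "auto" in packs:
        raise ValueError("--config=auto is not allowed")
    return packs or list(DEFAULT_PACKS)


def config_flags(packs):
    """Expand the pack list into repeated --config arguments."""
    flags = []
    for pack in packs:
        flags += ["--config", pack]
    return flags
```

For example, setting PENCHEFF_SEMGREP_PACKS to "p/jwt,p/flask" would yield the flags --config p/jwt --config p/flask under these assumptions.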

Severity maps via the existing _canonical_severity helper — ERROR/WARNING/INFO collapse to our five-level scale.

Bandit — Python SAST

Apache-2.0; runs bandit -r <repo> skipping B101 (assert in tests). Captures CWE ids when Bandit emits them.
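
The invocation and CWE capture might look roughly like the sketch below. It assumes Bandit's JSON formatter is used and that each result carries test_id, filename, line_number, and an optional issue_cwe object with an id; verify those key names against the Bandit version in use. The helper names are hypothetical.

```python
import json


def bandit_argv(repo):
    """Recursive scan, skipping B101, with JSON output on stdout."""
    return ["bandit", "-r", repo, "-s", "B101", "-f", "json"]


def extract_findings(report_json):
    """Pull rule, location, and CWE id (when present) from a Bandit report."""
    report = json.loads(report_json)
    findings = []
    for issue in report.get("results", []):
        # Older Bandit releases may omit issue_cwe, so treat it as optional.
        cwe = issue.get("issue_cwe") or {}
        findings.append({
            "rule_id": issue.get("test_id"),
            "file_path": issue.get("filename"),
            "line": issue.get("line_number"),
            "cwe_id": cwe.get("id"),
        })
    return findings
```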

gosec — Go SAST

Apache-2.0; only fires when the staged tree contains .go files outside vendor/. Reports CWE id + confidence on every issue.
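
One way to express that gate is sketched below. It assumes "outside vendor/" means no directory named vendor anywhere in the file's relative path; the function name is hypothetical.

```python
from pathlib import Path


def has_go_sources(root):
    """True if the tree holds at least one .go file not under a vendor/ dir."""
    root = Path(root)
    for path in root.rglob("*.go"):
        if not path.is_file():
            continue
        # Check only the directory components, not the file name itself.
        parent_parts = path.relative_to(root).parts[:-1]
        if "vendor" not in parent_parts:
            return True
    return False
```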

Brakeman — Ruby on Rails SAST

MIT; auto-skips when the tree isn't a Rails app (no app/ + config/ directories). Confidence levels collapse to severity: high→high, medium→medium, weak→low.
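
The skip check and the confidence mapping are small enough to sketch directly. The function names, and the fallback to info for an unrecognised confidence value, are assumptions for illustration.

```python
from pathlib import Path

# Mapping taken from the docs above: high->high, medium->medium, weak->low.
CONFIDENCE_TO_SEVERITY = {"high": "high", "medium": "medium", "weak": "low"}


def looks_like_rails_app(root):
    """A tree counts as a Rails app only if both app/ and config/ exist."""
    root = Path(root)
    return (root / "app").is_dir() and (root / "config").is_dir()


def severity_from_confidence(confidence):
    """Collapse a Brakeman confidence label to a severity."""
    # Assumption: unknown labels fall back to "info" rather than raising.
    return CONFIDENCE_TO_SEVERITY.get(confidence.strip().lower(), "info")
```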

ESLint + eslint-plugin-security — JS / TS SAST

Both MIT. Invoked via npx --no-install eslint against a pinned flat config at bench/runners/eslint_security.config.cjs — ignores any .eslintrc in the target repo so the security ruleset is identical on every scan. Only security/* rule hits surface as findings.

Tree-sitter pack — niche-language SAST

Phase 2.3: per-language sub-packs under plugins/pencheff/pencheff/modules/sast/treesitter_pack/ cover languages that Semgrep OSS / Bandit / gosec / Brakeman / ESLint don't reach cleanly. Solidity ships at v0.7 with 4 hand-curated rules: tx.origin auth, weak randomness, deprecated selfdestruct, and unchecked low-level calls. Sub-packs for Lua, Scala, Dart, Kotlin, Swift, COBOL, and Erlang are scaffold-ready: drop a queries.scm + rules.json pair into a sibling directory. Each sub-pack is skipped gracefully when the language grammar isn't installed.
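
Sub-pack discovery under that layout could be sketched as follows. The sketch assumes a sub-pack is any directory holding both queries.scm and rules.json, and that the set of installed grammars is known by directory name; both the function and the installed_grammars parameter are hypothetical.

```python
from pathlib import Path


def runnable_subpacks(pack_root, installed_grammars):
    """Split sub-pack directories into (runnable, skipped) name lists."""
    runnable, skipped = [], []
    for entry in sorted(Path(pack_root).iterdir()):
        complete = (
            entry.is_dir()
            and (entry / "queries.scm").is_file()
            and (entry / "rules.json").is_file()
        )
        if not complete:
            continue  # not a sub-pack (or an unfinished scaffold)
        if entry.name in installed_grammars:
            runnable.append(entry.name)
        else:
            # Grammar missing: skip quietly instead of failing the scan.
            skipped.append(entry.name)
    return runnable, skipped
```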

GHSA Advisory DB — SCA

Dependency-vulnerability scan against the GitHub Advisory Database, sourced via osv-scanner (which mirrors GHSA along with PyPA, RustSec, Go Vulndb, and several other ecosystem feeds).

Walks every manifest the engine recognises:

package-lock.json, yarn.lock, pnpm-lock.yaml
requirements.txt, Pipfile.lock, poetry.lock
Gemfile.lock, Cargo.lock, composer.lock
go.sum, pom.xml, build.gradle

Findings include package, installed_version, fixed_version, and the GHSA-prefixed alias as rule_id when present (otherwise the OSV ID). CVE aliases populate the cve field. Severity maps from the CVSS v3 score: 9+ critical, 7+ high, 4+ medium, else low.
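
The thresholds and the alias preference can be written out as a short sketch. The function names are illustrative, and the sketch assumes the thresholds are inclusive lower bounds, as "9+" suggests.

```python
def severity_from_cvss(score):
    """Map a CVSS v3 base score to a severity using the thresholds above."""
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"


def pick_ids(osv_id, aliases):
    """Return (rule_id, cve): prefer a GHSA id, else fall back to the OSV id."""
    ids = [osv_id, *aliases]
    rule_id = next((i for i in ids if i.startswith("GHSA-")), osv_id)
    cve = next((i for i in ids if i.startswith("CVE-")), None)
    return rule_id, cve
```

A score of exactly 7.0 therefore lands in high, and an advisory with no GHSA alias keeps its OSV ID as the rule_id.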

For App-installed repos, Dependabot push webhooks deliver alerts straight into the same bucket — they merge with the on-disk scan.

gitleaks — secrets

Scans the working tree for credential patterns: AWS keys, GCP service accounts, Slack tokens, private SSH keys, generic high-entropy strings. Every match is high severity — the right call is almost always to revoke and rotate.

YARA — malware / backdoor patterns

Runs the YARA engine against every file using Pencheff's bundled rule pack at bench/rules/yara/. Targets that actually appear in real source trees:

Minimal PHP webshells (eval($_GET[…]) families)
Obfuscated JS loaders (eval(atob(…)), Function(decodeURIComponent(…)))
Crypto-miner pool configs (stratum+tcp://, xmrig)
Python pickle RCE gadgets
Classic reverse-shell oneliners

Drop your own *.yar files into bench/rules/yara/ to extend the pack without touching Pencheff code.

Trivy IaC — infrastructure misconfigurations

Runs trivy config over the staged repo. Picks up Terraform, CloudFormation, Helm charts, Kubernetes manifests, and Dockerfiles without configuration. Includes CIS benchmarks and AWS / Azure / GCP provider-specific rules.

Checkov — policy-as-code

1,000+ policy-as-code rules across the same IaC surface as Trivy plus ARM, Bicep, Serverless, OpenAPI. Useful complement when an organisation cares about specific compliance frameworks (Trivy is broader, Checkov is opinionated).

Filtering — what gets scanned

Before any scanner runs, the repo is staged into a clean directory using hardlinks (cheap, no byte copy on the same filesystem). Staging respects:

.gitignore (root and nested)
A default-deny list: .git, .env*, node_modules, .venv, build / dist directories, __pycache__, …

stats.filter on each RepoScan records included / excluded counts and the method (git ls-files if available, fallback walk).
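
The fallback-walk path of staging could be sketched like this. It covers only the default-deny list (using the entries named above and nothing behind the elided "…"), leaves out .gitignore handling entirely, and counts a pruned directory as one excluded entry; all names and the shape of the returned stats are assumptions.

```python
import fnmatch
import os
from pathlib import Path

# Only the entries spelled out in the docs; the real list is longer.
DENY_PATTERNS = (".git", ".env*", "node_modules", ".venv",
                 "build", "dist", "__pycache__")


def is_denied(name):
    """True if a single file or directory name matches the deny list."""
    return any(fnmatch.fnmatch(name, pat) for pat in DENY_PATTERNS)


def stage(src, dst):
    """Hardlink allowed files from src into dst; return include/exclude counts."""
    src, dst = Path(src), Path(dst)
    stats = {"included": 0, "excluded": 0, "method": "walk"}
    for dirpath, dirnames, filenames in os.walk(src):
        # Prune denied directories in place so os.walk never descends into them.
        pruned = [d for d in dirnames if is_denied(d)]
        dirnames[:] = [d for d in dirnames if d not in pruned]
        stats["excluded"] += len(pruned)
        rel_dir = Path(dirpath).relative_to(src)
        for name in filenames:
            if is_denied(name):
                stats["excluded"] += 1
                continue
            target = dst / rel_dir / name
            target.parent.mkdir(parents=True, exist_ok=True)
            # os.link creates a hardlink: no bytes are copied, but src and
            # dst must live on the same filesystem.
            os.link(Path(dirpath) / name, target)
            stats["included"] += 1
    return stats
```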

From the Pencheff docs

GitHub Check Run + SARIF + Pencheff Suggest

When the Pencheff GitHub App is installed on a repo, every PR scan posts a Pencheff Check Run on the head commit with inline annotations on the diff, and uploads a SARIF v2.1.0 document to Security → Code scanning.

A separate bot — Pencheff Suggest — reads PR comments and acts on pencheff: suppress … directives so reviewers can mark findings noise without leaving GitHub.

(The bot name is provisional pending the Phase 0.6 trademark search; final name TBD.)

Check Run surface

Layer	What you see
Per-commit check	`Pencheff` check appears alongside `lint` / `test` / `build` on every PR. Conclusion is `success` when no critical/high; `failure` otherwise.
Inline annotations	One annotation per finding, anchored at `(file_path, line_start..line_end)` with severity → `failure` / `warning` / `notice`. GitHub caps at 50 per Check-Run POST; Pencheff pages remaining annotations via PATCH.
Summary	Count strip — N critical · N high · N medium · N low · N info — rendered in the check's `output.summary`.
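
The three rows of the table reduce to a few lines of logic, sketched here under the assumption that each finding carries one of the five severity labels as a lowercase string. The function names are hypothetical, and the sketch does not call the GitHub API.

```python
from collections import Counter

SEVERITIES = ("critical", "high", "medium", "low", "info")
MAX_ANNOTATIONS_PER_REQUEST = 50  # GitHub's per-request annotation cap


def conclusion(severities):
    """failure if any finding is critical or high, otherwise success."""
    blocking = ("critical", "high")
    return "failure" if any(s in blocking for s in severities) else "success"


def summary(severities):
    """Render the count strip in fixed severity order."""
    counts = Counter(severities)
    return " · ".join(f"{counts.get(s, 0)} {s}" for s in SEVERITIES)


def batches(annotations, size=MAX_ANNOTATIONS_PER_REQUEST):
    """Split annotations into request-sized chunks, preserving order."""
    return [annotations[i:i + size] for i in range(0, len(annotations), size)]
```

With 120 annotations, the first batch would go out with the initial POST and the remaining two with follow-up PATCH requests.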

SARIF upload

A separate path uploads the same findings as a SARIF v2.1.0 document to GitHub's Code Scanning ingest endpoint. The findings then show up under the repo's Security → Code scanning alerts tab and inherit the standard GitHub triage UI (dismiss, mark resolved, alert routing).

POST /repos/{owner}/{repo}/code-scanning/sarifs
Authorization: Bearer <installation-token>
Content-Type: application/json

{ "commit_sha": "...", "ref": "refs/heads/main",
  "sarif": "<base64-gzip>", "tool_name": "Pencheff",
  "checkout_uri": "https://github.com/owner/repo" }

The Pencheff GitHub App requires the security_events permission (write) for SARIF upload. Customers using the PAT path need a token scoped to security_events.
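
The "<base64-gzip>" placeholder in the request body means the SARIF JSON is gzip-compressed and then Base64-encoded. A minimal sketch of building that body, with hypothetical function names and no HTTP call:

```python
import base64
import gzip
import json


def encode_sarif(sarif_doc):
    """Serialise, gzip, then Base64-encode a SARIF document."""
    raw = json.dumps(sarif_doc).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def build_upload_body(sarif_doc, commit_sha, ref, checkout_uri):
    """Assemble the JSON body shown in the request above."""
    return {
        "commit_sha": commit_sha,
        "ref": ref,
        "sarif": encode_sarif(sarif_doc),
        "tool_name": "Pencheff",
        "checkout_uri": checkout_uri,
    }
```

Decoding reverses the two steps: Base64-decode, gunzip, then parse the JSON.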

Pencheff Suggest — PR-comment suppression

Reviewers can suppress a finding directly from a PR comment:

Looks fine to me — running on staging only.

pencheff: suppress 47bf3c92 reason="accepted_risk" notes="staging-only test fixture"

The bot parses the directive, validates the reason against the allowlist, and calls POST /findings/{id}/suppress on your behalf. Valid reasons: accepted_risk, wont_fix, false_positive, duplicate, out_of_scope. Anything else is rejected silently — there's no way to inject a custom reason via the comment surface.
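
A sketch of the parsing and allowlist check follows. The regular expression is an assumption about the directive grammar (a hex finding id, a required quoted reason, an optional quoted notes field, all on one line); the bot's real parser may accept other forms.

```python
import re

ALLOWED_REASONS = frozenset(
    {"accepted_risk", "wont_fix", "false_positive", "duplicate", "out_of_scope"}
)

# Assumed grammar: pencheff: suppress <hex-id> reason="..." [notes="..."]
DIRECTIVE_RE = re.compile(
    r'^[ \t]*pencheff:[ \t]*suppress[ \t]+(?P<finding_id>[0-9a-fA-F]+)'
    r'[ \t]+reason="(?P<reason>[^"\n]*)"'
    r'(?:[ \t]+notes="(?P<notes>[^"\n]*)")?[ \t]*$',
    re.MULTILINE,
)


def parse_directive(comment):
    """Return the parsed directive, or None if absent or not allowlisted."""
    match = DIRECTIVE_RE.search(comment)
    if match is None:
        return None
    if match.group("reason") not in ALLOWED_REASONS:
        # Mirrors the silent rejection described above.
        return None
    return {
        "finding_id": match.group("finding_id"),
        "reason": match.group("reason"),
        "notes": match.group("notes") or "",
    }
```

Checking the reason against a fixed set, rather than passing it through, is what keeps free-form text out of the suppression record.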

How to enable

Install the Pencheff GitHub App on the org or specific repos (see Connect a repo).
Grant the Checks permission (write) and security_events permission (write) when the app installer prompts.
The next push or PR triggers an automatic Check Run + SARIF upload alongside the existing scan.

For repos connected via PAT (no GitHub App), the Check Run / SARIF features require a PAT with security_events write — the standard fine-grained PAT path doesn't expose this scope, so most PAT-only deployments use the unified findings stream + DOCX report instead.

Source

apps/api/pencheff_api/services/github_check_runs.py
