degraded mode policy
what it answers: what happens when verification cannot complete?
verification can fail for many reasons: service unavailable, config missing, parsing error, timeout.
the question is what the system does next.
primary domain default
software supply chain (container signing) is the primary domain informed by published research.
| domain | default | rationale |
| container | configurable (dev: fail-open with audit; prod: fail-closed) | balance velocity vs. release integrity |
mapped domains (conceptual)
the mappings below reflect common constraints in each domain and are intended as starting points, not universal defaults.
- fintech: usually fail-closed + fast override path (high false-negative cost c_fn, regulatory constraints).
- firmware: usually fail-closed (false-negative cost c_fn dominates: bricking, safety).
- AI/ML: often configurable (dev: permissive; prod: conservative).
- crypto: often tiered by value (high-value: fail-closed; low-value: configurable).
- document: often fail-open with audit (or configurable) when errors are reversible.
every override is logged with: who, when, why, evidence, expiry.
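the five logged fields can be sketched as a small record type. this is a minimal illustration, not a prescribed schema: the class name `OverrideRecord`, the `is_active` helper, and the sample values are assumptions; only the field names come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class OverrideRecord:
    """One logged override; field names follow the list above."""
    who: str          # identity of the person granting the override
    when: datetime    # time the override was granted (UTC)
    why: str          # stated rationale
    evidence: str     # reference to supporting evidence (format is hypothetical)
    expiry: datetime  # time after which the override no longer applies

    def is_active(self, now: datetime) -> bool:
        # an override applies from its grant time up to, but not including, expiry
        return self.when <= now < self.expiry


# hypothetical sample values, for illustration only
granted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
record = OverrideRecord(
    who="alice@example.com",
    when=granted,
    why="registry outage during hotfix",
    evidence="ticket-123",
    expiry=granted + timedelta(hours=4),
)
```

making the record immutable (`frozen=True`) keeps a logged override from being edited after the fact; an extension would be a new record rather than a change to `expiry`.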
configuration example
degraded_mode:
  defaults:
    container: fail_open_with_audit  # dev
    container_prod: fail_closed  # prod
    # note: non-primary domains below are illustrative mappings, not validated defaults.
    fintech: fail_closed_fast_override
    firmware: fail_closed
    ai_ml: configurable
    crypto: tiered_by_value
    document: fail_open_with_audit
  override_log_fields: [who, when, why, evidence, expiry]
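one way a caller might consume these defaults when verification cannot complete is sketched below. the function `on_verification_failure`, the `Decision` enum, and the audit-entry layout are hypothetical; the fallback to `fail_closed` for unknown or unresolved domains is an assumption, not something the configuration specifies.

```python
from enum import Enum


class Decision(Enum):
    ALLOW_WITH_AUDIT = "allow_with_audit"
    BLOCK = "block"


# mirrors the defaults in the configuration example; entries that need further
# resolution (ai_ml: configurable, crypto: tiered_by_value) are left out
DEFAULTS = {
    "container": "fail_open_with_audit",
    "container_prod": "fail_closed",
    "fintech": "fail_closed_fast_override",
    "firmware": "fail_closed",
    "document": "fail_open_with_audit",
}


def on_verification_failure(domain: str, reason: str, audit_log: list) -> Decision:
    """Decide what to do when verification could not complete."""
    # assumption: anything not listed above falls back to fail_closed
    policy = DEFAULTS.get(domain, "fail_closed")
    if policy == "fail_open_with_audit":
        decision = Decision.ALLOW_WITH_AUDIT
    else:
        decision = Decision.BLOCK
    # every degraded-mode decision is recorded, whether it allows or blocks
    audit_log.append({
        "domain": domain,
        "reason": reason,
        "policy": policy,
        "decision": decision.value,
    })
    return decision
```

for example, `on_verification_failure("container", "timeout", log)` allows the operation and appends an audit entry, while the same call with `"container_prod"` blocks it.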
appeal and resolution
what it answers: what if legitimate operations get blocked?
false positives are inevitable in any security system.
the question is whether there's a path to resolve them without disabling the control entirely.
flow
- block occurs with reason code and evidence snapshot.
- operator requests review with counter-evidence.
- reviewer checks the block against the invariant definition: is this a rule bug or an actual violation?
- outcome: override granted, invariant updated, or block upheld.
- every step logged with identity, timestamp, rationale.
time bounds: deployment-defined targets (recommended defaults: initial response within 24h, resolution within 72h for standard cases).
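the flow above can be sketched as a small state machine that logs each step. the class `Appeal`, its method names, the state strings, and the `"system"` identity used for the initial block are all hypothetical; the three outcomes are the ones listed in the flow.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# the three outcomes named in the flow
OUTCOMES = {"override_granted", "invariant_updated", "block_upheld"}


@dataclass
class Appeal:
    reason_code: str
    evidence_snapshot: str
    state: str = "blocked"
    log: list = field(default_factory=list)

    def __post_init__(self):
        # the block itself is the first logged step
        self._record("blocked", "system", self.reason_code)

    def _record(self, step: str, identity: str, rationale: str) -> None:
        # every step carries identity, timestamp, and rationale
        self.log.append({
            "step": step,
            "identity": identity,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "rationale": rationale,
        })

    def request_review(self, operator: str, counter_evidence: str) -> None:
        if self.state != "blocked":
            raise ValueError("review can only be requested for a blocked operation")
        self.state = "under_review"
        self._record("review_requested", operator, counter_evidence)

    def resolve(self, reviewer: str, outcome: str, rationale: str) -> None:
        if self.state != "under_review":
            raise ValueError("no review in progress")
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        self.state = outcome
        self._record("resolved", reviewer, rationale)
```

rejecting out-of-order calls keeps the log a faithful trace of the flow: an outcome cannot be recorded unless a review was requested first.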
invariant lifecycle
what it answers: how do rules get created, updated, and retired?
static rules become false-positive generators.
what was valid yesterday may block legitimate operations tomorrow.
invariants need a lifecycle.
stages
- proposal: new invariant proposed with evidence of attack pattern.
- experimental: warn-only mode to gather false-positive (fp) and false-negative (fn) data; lasts 7-30 days.
- active: enforcing mode, ongoing monitoring.
- deprecated: sunset period in warn-only mode; lasts 30 days.
- retired: removed from enforcement, archived for audit.
review cadence: quarterly review of all active invariants.
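the stages can be sketched as an enum with a transition table. this assumes stages advance strictly in the listed order; the source does not say whether skipping a stage or moving backwards (for example, active back to experimental) is allowed. the names `Stage`, `advance`, and `action_on_violation` are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    PROPOSAL = "proposal"
    EXPERIMENTAL = "experimental"
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


# assumption: each stage has exactly one successor, in the listed order
NEXT_STAGE = {
    Stage.PROPOSAL: Stage.EXPERIMENTAL,
    Stage.EXPERIMENTAL: Stage.ACTIVE,
    Stage.ACTIVE: Stage.DEPRECATED,
    Stage.DEPRECATED: Stage.RETIRED,
}

WARN_ONLY = {Stage.EXPERIMENTAL, Stage.DEPRECATED}


def advance(stage: Stage) -> Stage:
    """Move an invariant to its next lifecycle stage."""
    if stage not in NEXT_STAGE:
        raise ValueError(f"{stage.value} is terminal")
    return NEXT_STAGE[stage]


def action_on_violation(stage: Stage) -> str:
    """What a violation triggers at each stage."""
    if stage is Stage.ACTIVE:
        return "block"   # enforcing mode
    if stage in WARN_ONLY:
        return "warn"    # warn-only mode
    return "ignore"      # proposed or retired invariants are not evaluated
```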
error cost framework
what it answers: how do you choose thresholds?
optimal thresholds depend on domain-specific error costs.
blocking a firmware update has different consequences than blocking a low-value transaction.
inputs
- c_fp: cost of a false positive (blocking a legitimate operation).
- c_fn: cost of a false negative (allowing an attack).
- base_rate: expected attack frequency in this context.
output: threshold and bias direction.
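one standard way to turn these inputs into a threshold is expected-cost minimization: block when the expected cost of allowing (p × c_fn) exceeds the expected cost of blocking ((1 − p) × c_fp), where p is the probability that the operation is an attack. this is a textbook decision rule offered as a sketch, not necessarily the method intended here; the function names are hypothetical, and it assumes costs can be given as positive numbers on a common scale.

```python
def posterior_threshold(c_fp: float, c_fn: float) -> float:
    """Attack probability above which blocking has the lower expected cost.

    Derived from p * c_fn > (1 - p) * c_fp, i.e. p > c_fp / (c_fp + c_fn).
    """
    if c_fp <= 0 or c_fn <= 0:
        raise ValueError("costs must be positive")
    return c_fp / (c_fp + c_fn)


def likelihood_ratio_threshold(c_fp: float, c_fn: float, base_rate: float) -> float:
    """Same rule expressed on the evidence's likelihood ratio.

    Posterior odds = prior odds * likelihood ratio, so blocking is cheaper when
    the likelihood ratio exceeds (c_fp / c_fn) * (1 - base_rate) / base_rate.
    """
    if c_fp <= 0 or c_fn <= 0:
        raise ValueError("costs must be positive")
    if not 0 < base_rate < 1:
        raise ValueError("base_rate must be strictly between 0 and 1")
    return (c_fp / c_fn) * (1 - base_rate) / base_rate
```

under this rule, a higher c_fn lowers the threshold (more conservative), while a higher c_fp or a lower base_rate raises it (more permissive).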
primary domain example
| domain | c_fp | c_fn | bias |
| container | low (dev), medium (prod) | extreme | permissive (dev), conservative (prod) |
mapped domains (conceptual)
- fintech: c_fp high, c_fn extreme → conservative + fast appeal.
- firmware: c_fp medium, c_fn extreme → very conservative.
- AI/ML: c_fp low (dev) / medium (prod), c_fn high → permissive (dev), conservative (prod).
- crypto: value-tiered error costs → configurable by value tier.
- document: often reversible errors → balanced or permissive with audit.
recalibration triggers: threat landscape change, fp rate exceeds threshold, incident post-mortem.