
Decision Pipeline

How rules, ML, and LLM escalation work together to make fast decisions.

The decision pipeline is the core of Sparkient. Every call to /decide passes through up to three stages, each slower and more capable than the last, and stops at the first stage that produces a decision.

The Three Stages

Input → [1. Rules] → [2. Classifier] → [3. Escalation] → Response
           < 1ms        < 100ms           150ms+
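The control flow in the diagram can be sketched as a short chain of stage functions. This is only an illustration, not Sparkient's implementation: the `Decision` type, the `decide` function, and the toy stages with their field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    decision: str
    stage: str


# Hypothetical stage signature: return a Decision, or None to hand the
# request to the next stage.
Stage = Callable[[dict], Optional[Decision]]


def decide(payload: dict, rules: Stage, classifier: Stage,
           escalation: Callable[[dict], Decision]) -> Decision:
    """Run the stages in order and stop at the first one that decides."""
    for stage in (rules, classifier):
        result = stage(payload)
        if result is not None:
            return result
    # Neither earlier stage decided, so the LLM stage must answer.
    return escalation(payload)


# Toy stages for illustration only; the field names are assumptions.
def toy_rules(payload: dict) -> Optional[Decision]:
    if payload.get("amount", 0) > 50_000:
        return Decision("reject", "rules")
    return None


def toy_classifier(payload: dict) -> Optional[Decision]:
    # Pretend the model is only confident about short text.
    if len(payload.get("text", "")) < 20:
        return Decision("approve", "classifier")
    return None


def toy_escalation(payload: dict) -> Decision:
    return Decision("flag", "escalation")
```

Calling `decide({"amount": 60_000}, toy_rules, toy_classifier, toy_escalation)` stops at the rules stage, so the slower stages never run for that request.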

Stage 1: Hard Rules (CEL)

Latency: < 1ms

CEL (Common Expression Language) rules are evaluated first. If any rule matches, its decision is returned immediately. This is the fastest path: pure logic, with no ML involved.

Use this for:

  • Compliance requirements ("always block amounts over $50,000")
  • Known patterns ("reject if the user is banned")
  • Rate limiting ("escalate if more than 10 requests in 1 minute")
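The three use cases above can be sketched as a first-match rule list. The sketch makes several assumptions: Python predicates stand in for real CEL evaluation, the CEL strings and input field names (`amount`, `user_banned`, `requests_last_minute`) are invented for illustration, and "first match wins" is one plausible ordering policy rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    name: str
    cel: str                            # illustrative CEL text, not evaluated here
    predicate: Callable[[dict], bool]   # Python stand-in for the CEL expression
    decision: str


# Hypothetical rules mirroring the three use cases listed above.
RULES = (
    Rule("amount_limit", "input.amount > 50000",
         lambda i: i.get("amount", 0) > 50_000, "reject"),
    Rule("banned_user", "input.user_banned",
         lambda i: bool(i.get("user_banned", False)), "reject"),
    Rule("rate_limit", "input.requests_last_minute > 10",
         lambda i: i.get("requests_last_minute", 0) > 10, "escalate"),
)


def first_match(payload: dict, rules: Sequence[Rule] = RULES) -> Optional[Rule]:
    """Return the first rule whose predicate holds, or None if none match."""
    for rule in rules:
        if rule.predicate(payload):
            return rule
    return None
```

A `None` result corresponds to "no rules match", which is the condition for moving on to Stage 2.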

Stage 2: ML Classifier (ONNX)

Latency: < 100ms (typically 5–30ms)

If no rules match and a trained model is deployed, the classifier runs inference. It uses the ONNX-exported LightGBM model with pre-computed features and optional text embeddings.

The classifier returns a decision along with:

  • Confidence score (0.0 to 1.0)
  • Class probabilities for all options
  • Reason codes from the training data

If the confidence is above the auto_decide threshold, the classifier's decision is returned. If it is below the escalation threshold, the request moves to Stage 3.
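The threshold check can be sketched as a small routing function. The threshold values below are made up, and the function name and return labels are hypothetical. The text does not say what happens when the confidence falls between the two thresholds, so the sketch reports that band explicitly instead of guessing.

```python
def route(confidence: float, auto_decide: float = 0.85,
          escalation: float = 0.60) -> str:
    """Map a classifier confidence to the next pipeline action.

    The default thresholds are illustrative assumptions, not documented values.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence > auto_decide:
        return "return_decision"    # confident enough: answer from Stage 2
    if confidence < escalation:
        return "escalate_to_llm"    # too uncertain: hand off to Stage 3
    # Behavior between the two thresholds is not described in the text.
    return "unspecified"
```

With these assumed thresholds, a confidence of 0.94 is returned directly and a confidence of 0.30 is escalated.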

Stage 3: LLM Escalation (Gemini)

Latency: 150ms–3s

This stage is the fallback for low-confidence decisions and for requests handled when no model is deployed. Gemini receives the input and the decision type's context, and produces a structured decision with an explanation.

This stage includes:

  • Automatic retry with exponential backoff
  • Structured output parsing
  • Timeout protection
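The three safeguards above can be combined in one wrapper, sketched below. Everything here is an assumption made for illustration: `call_llm` is a hypothetical callable that takes a timeout in seconds and returns raw JSON text, and the retry count, delays, and set of retryable errors are invented. It does not use or describe the real Gemini client.

```python
import json
import time
from typing import Callable, Sequence

# Errors treated as retryable in this sketch. json.JSONDecodeError is a
# subclass of ValueError, so malformed output is retried as well.
RETRYABLE = (TimeoutError, ConnectionError, ValueError)


def parse_decision(raw: str, options: Sequence[str]) -> dict:
    """Structured output parsing: require a JSON object with a known decision."""
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("decision") not in options:
        raise ValueError(f"unexpected LLM output: {raw!r}")
    return data


def escalate(call_llm: Callable[[float], str], options: Sequence[str], *,
             max_attempts: int = 3, base_delay: float = 0.2,
             timeout_s: float = 3.0,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call the LLM with a per-attempt timeout, retrying with backoff."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            # Timeout protection is delegated to the (hypothetical) client.
            return parse_decision(call_llm(timeout_s), options)
        except RETRYABLE as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(
        f"LLM escalation failed after {max_attempts} attempts"
    ) from last_error
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays. A production version would likely add jitter to the backoff.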

In practice, a well-trained model escalates fewer than 5% of decisions. The LLM is a safety net, not the primary path.

Response Format

Every decision — regardless of which stage produced it — returns the same structured format:

{
  "decision": "approve",
  "confidence": 0.94,
  "reason_codes": ["safe_content"],
  "latency_ms": 8.3,
  "stage": "classifier",
  "escalate": false,
  "fallback_used": false,
  "rules_triggered": [],
  "class_probabilities": {
    "approve": 0.94,
    "flag": 0.04,
    "reject": 0.02
  },
  "request_id": "req_abc123"
}

The stage field tells you which stage produced the decision:

  • "rules": a hard rule matched
  • "classifier": the ML model decided
  • "escalation": the LLM made the decision through normal escalation
  • "fallback": the LLM made the decision because an error triggered escalation
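A client might load the response into a typed structure and branch on the stage field, as in the sketch below. The `DecisionResponse` class and the helper names are hypothetical and not part of any Sparkient SDK. The field list is taken from the sample response above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DecisionResponse:
    decision: str
    confidence: float
    stage: str
    escalate: bool
    fallback_used: bool
    latency_ms: float
    request_id: str
    reason_codes: list = field(default_factory=list)
    rules_triggered: list = field(default_factory=list)
    class_probabilities: dict = field(default_factory=dict)


def parse_response(body: str) -> DecisionResponse:
    """Build a DecisionResponse; raises TypeError on unknown or missing keys."""
    return DecisionResponse(**json.loads(body))


def used_llm(resp: DecisionResponse) -> bool:
    """True when the decision came from the LLM stage, by either route."""
    return resp.stage in ("escalation", "fallback")


# The sample response shown above, as a JSON string.
SAMPLE = """
{
  "decision": "approve", "confidence": 0.94,
  "reason_codes": ["safe_content"], "latency_ms": 8.3,
  "stage": "classifier", "escalate": false, "fallback_used": false,
  "rules_triggered": [],
  "class_probabilities": {"approve": 0.94, "flag": 0.04, "reject": 0.02},
  "request_id": "req_abc123"
}
"""

resp = parse_response(SAMPLE)
```

Because every stage returns the same shape, the same parsing code works whether the decision came from a rule, the classifier, or the LLM.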

Latency Breakdown

Stage                   Typical Latency   When It Runs
Rules                   0.1–0.5ms         Always (first check)
Feature extraction      1–5ms             If no rule matched
Text embedding          2–8ms             If input has text fields
ONNX inference          0.5–2ms           If model is deployed
Total (no escalation)   5–30ms            95%+ of requests
LLM escalation          150–3000ms        Low-confidence or no model
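To see how much the escalated minority affects average latency, the figures above can be combined into a blended mean. This is a back-of-the-envelope sketch: the function name is hypothetical, the 5% escalation rate is the upper bound quoted earlier, and the LLM latency is treated as the whole cost of an escalated request, which is a simplifying assumption.

```python
def blended_latency_ms(fast_ms: float, escalated_ms: float,
                       escalation_rate: float) -> float:
    """Mean latency when a fraction of requests take the escalation path."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0.0 and 1.0")
    return (1.0 - escalation_rate) * fast_ms + escalation_rate * escalated_ms


# Low and high ends of the ranges in the table, at a 5% escalation rate.
best_case = blended_latency_ms(5, 150, 0.05)
worst_case = blended_latency_ms(30, 3000, 0.05)
```

Under these assumptions the blended mean is roughly 12 ms at the low end and roughly 179 ms at the high end, so even a small escalation rate can dominate average latency when LLM calls are slow.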
