Routing Policies

Routing policies store task-level preferences for deployment selection and fallback.

When a live inference request does not pin a deployment explicitly, Orlo can evaluate the routing policy against the task's evaluated deployment candidates and choose the best available option.

Endpoints

`POST /v1/routing-policies`

Create a routing policy.

Common fields:

task_id
weight_accuracy
weight_latency
weight_cost
weight_validation
min_accuracy
max_latency_ms
min_validation_rate
max_cost_per_1k
sla_latency_p95_ms
sla_availability
sla_max_error_rate
fallback_model_id

`GET /v1/routing-policies`

List policies, optionally filtered by task_id.

Runtime effect

When routing is active for a request, Orlo exposes the result in two places:

x-orlo-routing-mode on POST /v1/chat/completions
debug.routing on POST /v1/tasks/:task_id/run when explain is debug or audit

Notes

Routing policies work against evaluated deployment candidates, not arbitrary models with no deployment snapshot.
Routing policy records are part of Orlo's control-plane data model.