Routing Policies
Routing policies store task-level preferences for deployment selection and fallback.
When a live inference request does not pin a deployment explicitly, Orlo can evaluate the routing policy against the task's evaluated deployment candidates and choose the best available option.
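This page does not spell out how the evaluation works, so the following is only a minimal sketch of one plausible reading: treat the min_/max_ fields as hard thresholds, rank the remaining candidates by a weighted score, and use fallback_model_id when nothing qualifies. The Candidate class, the pick_deployment function, and the normalisation are assumptions for illustration, not Orlo's actual algorithm. The sla_ fields are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    # Hypothetical shape of an evaluated deployment candidate.
    deployment_id: str
    accuracy: float         # 0..1, higher is better
    latency_ms: float       # lower is better
    cost_per_1k: float      # lower is better
    validation_rate: float  # 0..1, higher is better


def pick_deployment(policy: dict, candidates: list[Candidate]) -> Optional[str]:
    """Return the id of the best candidate, or the policy's fallback_model_id."""
    # Step 1: drop candidates that violate a hard threshold (assumed semantics).
    eligible = [
        c for c in candidates
        if c.accuracy >= policy.get("min_accuracy", 0.0)
        and c.latency_ms <= policy.get("max_latency_ms", float("inf"))
        and c.validation_rate >= policy.get("min_validation_rate", 0.0)
        and c.cost_per_1k <= policy.get("max_cost_per_1k", float("inf"))
    ]
    if not eligible:
        # How Orlo really applies the fallback is not documented here.
        return policy.get("fallback_model_id")

    # Step 2: scale latency and cost to 0..1 so the weights are comparable.
    max_latency = max(c.latency_ms for c in eligible) or 1.0
    max_cost = max(c.cost_per_1k for c in eligible) or 1.0

    def score(c: Candidate) -> float:
        return (
            policy.get("weight_accuracy", 0.0) * c.accuracy
            + policy.get("weight_latency", 0.0) * (1 - c.latency_ms / max_latency)
            + policy.get("weight_cost", 0.0) * (1 - c.cost_per_1k / max_cost)
            + policy.get("weight_validation", 0.0) * c.validation_rate
        )

    # Step 3: the highest weighted score wins.
    return max(eligible, key=score).deployment_id
```

Under this reading, raising min_accuracy narrows the pool before any weighting happens, so a cheap, fast deployment can only win if it first clears every threshold.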
Endpoints
POST /v1/routing-policies
Create a routing policy.
Common fields:
- task_id
- weight_accuracy
- weight_latency
- weight_cost
- weight_validation
- min_accuracy
- max_latency_ms
- min_validation_rate
- max_cost_per_1k
- sla_latency_p95_ms
- sla_availability
- sla_max_error_rate
- fallback_model_id
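As a rough illustration, a create request could be assembled as below. The field names come from the list above. The host, the Bearer authorization scheme, every field value, and the assumption that the body is JSON are placeholders that this page does not confirm.

```python
import json
import urllib.request

BASE_URL = "https://orlo.example.com"  # hypothetical host


def build_create_policy_request(policy: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/routing-policies request."""
    body = json.dumps(policy).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/routing-policies",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check your Orlo deployment's requirements.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Illustrative values only; units and valid ranges are assumptions.
policy = {
    "task_id": "task_123",
    "weight_accuracy": 0.5,
    "weight_latency": 0.2,
    "weight_cost": 0.2,
    "weight_validation": 0.1,
    "min_accuracy": 0.8,
    "max_latency_ms": 1000,
    "fallback_model_id": "model-fallback",
}

request = build_create_policy_request(policy, api_key="YOUR_API_KEY")
# Passing `request` to urllib.request.urlopen would send it.
```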
GET /v1/routing-policies
List policies, optionally filtered by task_id.
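A small sketch of building the list URL, assuming the task_id filter is passed as a query-string parameter (the page names the filter but not how it is supplied). The list_policies_url helper and the host are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def list_policies_url(base_url: str, task_id: Optional[str] = None) -> str:
    """Return the GET /v1/routing-policies URL, optionally filtered by task."""
    url = f"{base_url}/v1/routing-policies"
    if task_id is not None:
        # urlencode escapes characters that are not safe in a query string.
        url += "?" + urlencode({"task_id": task_id})
    return url
```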
Runtime effect
When routing is active for a request, Orlo exposes the result in two places:
- x-orlo-routing-mode on POST /v1/chat/completions
- debug.routing on POST /v1/tasks/:task_id/run when explain is debug or audit
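A client might read these two signals roughly as follows. This sketch assumes x-orlo-routing-mode is a response header and that debug.routing denotes a routing key nested under debug in the JSON response body. Both helper names are hypothetical, and the shape of the routing value is not specified on this page.

```python
from collections.abc import Mapping
from typing import Any, Optional


def routing_mode_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the x-orlo-routing-mode header value, if present."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "x-orlo-routing-mode":
            return value
    return None


def routing_debug_from_body(body: Mapping[str, Any]) -> Optional[Any]:
    """Return body["debug"]["routing"], or None when either key is missing."""
    debug = body.get("debug")
    if isinstance(debug, Mapping):
        return debug.get("routing")
    return None
```

Both helpers return None rather than raising, since routing output is only present when routing was active for the request and, for task runs, when explain was set to debug or audit.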
Notes
- Routing policies apply only to evaluated deployment candidates, not to arbitrary models that have no deployment snapshot.
- Routing policy records are part of Orlo's control-plane data model.