Rose 1 API documentation
Send a query and long context, receive a smaller prompt plus an auditable receipt. Rose sits before your model call and keeps the downstream provider flow unchanged.
Quickstart
Create a project key in the dashboard, then call Rose before your model provider. The output is plain text that can be passed directly to the next LLM request.
curl https://api.rose.dev/v1/compress \
  -H "Authorization: Bearer rose_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rose-1",
    "query": "Which incidents mention database saturation?",
    "input": "Long logs, tickets, transcripts, docs...",
    "compression": {
      "target_ratio": 0.3
    },
    "include_spans": false
  }'
Authentication
Production traffic uses project-scoped bearer keys. Keys can be revoked without affecting other projects in the same workspace.
Authorization: Bearer rose_live_...
Compression request
Rose is query-aware: the request should include the task your model will answer and the full context you want reduced.
model (string): Use rose-1 for the production compression route.
query (string): The task or user question Rose should preserve context for.
input (string): Long logs, docs, tickets, transcripts, retrieved chunks, or memory.
compression.target_ratio (number): Optional target output ratio. Default is 0.3.
include_spans (boolean): Return selected spans for debugging and audit workflows.
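The fields above can be assembled into a request with the Python standard library. This is a minimal sketch, not an official client: the helper name build_compress_request and the client-side range check on target_ratio are assumptions for illustration, and the function only builds the request without sending it.

```python
import json
import urllib.request

ROSE_URL = "https://api.rose.dev/v1/compress"  # endpoint shown in the Quickstart


def build_compress_request(api_key, query, context, target_ratio=None, include_spans=False):
    """Build (but do not send) a POST request for /v1/compress.

    Hypothetical helper: the name and signature are not part of the Rose API.
    """
    if not query or not context:
        raise ValueError("query and input must both be non-empty")
    payload = {
        "model": "rose-1",
        "query": query,
        "input": context,
        "include_spans": include_spans,
    }
    # target_ratio is optional; leaving it out lets the server apply its default.
    if target_ratio is not None:
        # Assumption: a ratio is expected to fall in (0, 1]. The documentation
        # does not state the accepted range, so this check is only a guess.
        if not 0 < target_ratio <= 1:
            raise ValueError("target_ratio is assumed to lie in (0, 1]")
        payload["compression"] = {"target_ratio": target_ratio}
    return urllib.request.Request(
        ROSE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen and decoding the JSON body would give access to the output field, which can then be placed in the prompt for the downstream model.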
Response receipt
Every response includes token accounting, compression ratio, latency, and risk metadata so product and finance teams can audit each call.
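For audit workflows, a receipt can be sanity-checked on the client. The sketch below assumes two relationships that are inferred from the sample response in this section rather than stated by the API: tokens_saved equals original_tokens minus output_tokens, and compression_ratio equals output_tokens divided by original_tokens. The function name check_receipt and the tolerance are illustrative.

```python
def check_receipt(receipt, tolerance=0.001):
    """Return a list of inconsistencies found in a receipt dict (empty if none).

    Hypothetical helper; the arithmetic is inferred from the sample response.
    """
    problems = []
    original = receipt["original_tokens"]
    output = receipt["output_tokens"]

    # Inferred from the sample: 4200 - 980 = 3220.
    if receipt["tokens_saved"] != original - output:
        problems.append("tokens_saved != original_tokens - output_tokens")

    # Inferred from the sample: 980 / 4200 is about 0.233 (rounded to 3 places).
    if original > 0:
        expected_ratio = output / original
        if abs(receipt["compression_ratio"] - expected_ratio) > tolerance:
            problems.append("compression_ratio != output_tokens / original_tokens")

    return problems
```

A sample response with these fields: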
{
  "model": "rose-1",
  "output": "Selected context...",
  "receipt": {
    "original_tokens": 4200,
    "output_tokens": 980,
    "tokens_saved": 3220,
    "compression_ratio": 0.233,
    "latency_ms": 4.8,
    "risk": { "level": "low", "flags": [] }
  }
}
Batch compression
Use batch jobs for evaluation sets, large retrieval reprocessing, and asynchronous backfills. Responses use the same receipt shape as synchronous compression.
{
  "model": "rose-1",
  "input_file_id": "file_eval_123",
  "endpoint": "/v1/compress",
  "metadata": {
    "eval": "support-copilot-regression"
  }
}
Errors
Error responses include a stable code, a human-readable message, and a request id for support and log correlation.
400: Malformed JSON, missing input, or invalid compression options.
401: Missing, revoked, or malformed bearer key.
402: Workspace quota exceeded or billing disabled.
429: Project rate limit exceeded.
500: Compression worker unavailable. Retry with backoff.
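The "retry with backoff" advice for 500 responses could look like the sketch below. It makes several assumptions: 429 is also treated as retryable (a common convention that this documentation does not confirm), the delay uses exponential backoff with full jitter, and send stands for any zero-argument callable that returns a (status_code, body) pair. None of these names come from the Rose API.

```python
import random
import time

# 500 is documented as retryable; including 429 is an assumption.
RETRYABLE_STATUSES = {429, 500}


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send: zero-argument callable returning (status_code, body).
    sleep: injectable for testing; defaults to time.sleep.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        is_last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last_attempt:
            return status, body
        # Full jitter: wait a random time up to base_delay * 2^attempt seconds.
        sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Statuses 400, 401, and 402 are returned immediately because they point to a problem with the request, the key, or billing that a retry would not fix.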