Resources

Everything around the compression hop.

Docs, billing notes, implementation patterns, and dashboard links for putting Rose in front of production LLM traffic.

Open API docs

API reference

Request shape, auth headers, response receipts, batch jobs, and error codes.

Read more
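As a rough illustration of the request shape described above, here is a minimal sketch of assembling an authenticated compression call. The endpoint URL and field names (`query`, `context`) are assumptions for illustration, not the documented API — see the API reference for the real contract.

```python
import json

# Placeholder endpoint — the real URL lives in the API reference.
ROSE_API = "https://api.example.com/v1/compress"

def build_request(query: str, context: str, api_key: str) -> tuple[dict, bytes]:
    """Assemble headers and a JSON body for a compression call (shape assumed)."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # scoped project key
        "Content-Type": "application/json",
    }
    body = json.dumps({"query": query, "context": context}).encode()
    return headers, body

headers, body = build_request("refund policy?", "…retrieved passages…", "rk_test_123")
```

From here, any HTTP client can POST `body` with `headers` to the endpoint; the response carries the compressed text plus a receipt.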

Pricing guide

How saved-token billing works and when to move from free to production.

Read more

Project keys

Create scoped bearer keys, rotate credentials, and isolate production traffic.

Read more

Receipt format

Understand token counts, compression ratio, latency, risk flags, and audit metadata.

Read more
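A sketch of what reading a receipt might look like. The field names below are assumptions based on the description above (token counts, ratio, latency, risk flags, audit metadata) — the receipt-format docs define the real schema.

```python
import json

# Hypothetical receipt payload — field names are illustrative, not the documented schema.
raw = json.dumps({
    "input_tokens": 4200,
    "output_tokens": 1260,
    "compression_ratio": 0.30,
    "latency_ms": 48,
    "risk_flags": [],
    "audit": {"request_id": "req_abc", "project": "prod"},
})

receipt = json.loads(raw)
# Tokens you no longer send to the downstream model on this request.
saved = receipt["input_tokens"] - receipt["output_tokens"]
```

Aggregating `saved` and `risk_flags` per request is what makes saved-token billing and audits traceable.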

Gateway pattern

Put Rose before OpenAI, Anthropic, DeepSeek, local models, or your own router.

Read more
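The gateway pattern is just two hops: compress first, then forward the smaller context to whichever model you route to. A minimal sketch, with `rose_compress` and `call_model` as stand-ins for real clients (the names and the 3:1 ratio are assumptions):

```python
def rose_compress(query: str, context: str) -> str:
    """Stub for the compression hop; a real client would POST to Rose."""
    return context[: len(context) // 3]  # pretend 3:1 compression

def call_model(query: str, context: str) -> str:
    """Stub for the downstream LLM call (OpenAI, Anthropic, local, ...)."""
    return f"answer({query!r}, {len(context)} chars of context)"

def gateway(query: str, context: str) -> str:
    compressed = rose_compress(query, context)  # hop 1: shrink the context
    return call_model(query, compressed)        # hop 2: model sees less input

reply = gateway("refund policy?", "x" * 9000)
```

Because the downstream call is unchanged, the same `gateway` function fronts any provider or router.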

Deployment notes

Docker services, migrations, readiness checks, Azure Container Apps, and Postgres.

Read more

Fastest path to production

The same sequence works for agents, RAG pipelines, support copilots, and model gateways.

01

Create workspace

Sign in, create an organization, and open a production project.

02

Issue key

Generate a bearer key for the service that owns the model request.

03

Compress context

Send the query plus retrieved context to Rose before your model call.

04

Audit receipt

Track saved tokens, output ratio, latency, and risk flags by request.
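The four steps above, sketched end to end. Every name here is an illustrative stand-in (key format, receipt fields, 2:1 ratio are all assumptions); step 01, creating the workspace, happens in the dashboard rather than in code.

```python
def issue_key(project: str) -> str:
    """Step 02: a scoped bearer key for the calling service (format assumed)."""
    return f"rk_{project}_example"

def compress(key: str, query: str, context: str) -> dict:
    """Step 03: stand-in for the Rose call; returns compressed text plus a receipt."""
    text = context[: len(context) // 2]  # pretend 2:1 compression
    return {
        "text": text,
        "receipt": {"saved_tokens": len(context) - len(text), "risk_flags": []},
    }

def audit(receipt: dict) -> None:
    """Step 04: track savings and flags per request before the model call."""
    assert receipt["saved_tokens"] >= 0

key = issue_key("prod")
result = compress(key, "refund policy?", "x" * 8000)
audit(result["receipt"])
```

In production, `compress` would be the real API call from the reference above, and `audit` would ship the receipt to your metrics pipeline.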