API Endpoints
This content is for v1.0.0. Go to the latest version for the most up-to-date documentation.
AI Foundation Services provides an OpenAI-compatible REST API. All endpoints use the base URL:

https://llm-server.llmhub.t-systems.net/v2

For the complete OpenAPI specification, see the interactive API docs (Redoc).
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /models | List all available models |
| GET | /models/{model_id} | Get model details and metadata |
| POST | /chat/completions | Create a chat completion |
| POST | /completions | Create a text completion |
| POST | /embeddings | Create embeddings |
| POST | /audio/transcriptions | Transcribe audio to text |
| POST | /audio/translations | Translate audio to English |
| GET | /audio/models | List available audio models |
| POST | /images/generations | Generate images from text |
| POST | /responses | Create a response (Responses API) |
| POST | /fine_tuning/jobs | Create a fine-tuning job |
| GET | /fine_tuning/jobs | List fine-tuning jobs |
| POST | /fine_tuning/jobs/{id}/cancel | Cancel a fine-tuning job |
| GET | /fine_tuning/jobs/{id}/events | List fine-tuning events |
| POST | /files | Upload a file |
| GET | /files | List uploaded files |
| DELETE | /files/{id} | Delete a file |
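As a sketch, a request to one of these endpoints (here GET /models) can be built with Python's standard library. The API key is a placeholder; nothing is sent until you call urlopen:

```python
import urllib.request

BASE_URL = "https://llm-server.llmhub.t-systems.net/v2"
API_KEY = "YOUR_API_KEY"  # placeholder; see Authentication below

# Build (but do not send) a GET /models request.
req = urllib.request.Request(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
    method="GET",
)

# Sending it would be: urllib.request.urlopen(req)
print(req.full_url)  # https://llm-server.llmhub.t-systems.net/v2/models
```

The same pattern applies to every endpoint in the table; only the method, path, and (for POST) the JSON body change.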
Queue Endpoints (Asynchronous)
All major endpoints have /queue variants for asynchronous processing. See the Asynchronous Requests guide for details.
| Method | Endpoint | Description |
|---|---|---|
| POST | /queue/chat/completions | Async chat completion |
| POST | /queue/completions | Async text completion |
| POST | /queue/embeddings | Async embeddings |
| POST | /queue/audio/transcriptions | Async audio transcription |
| POST | /queue/audio/translations | Async audio translation |
| POST | /queue/images/generations | Async image generation |
| POST | /queue/images/edits | Async image editing |
| GET | /queue/models | List models (queue) |
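Because /queue mirrors the /v2 endpoints, the async variant of a path can be derived mechanically. A hypothetical helper (the function name is illustrative, not part of the API):

```python
def queue_path(path: str) -> str:
    """Map a synchronous /v2 path to its /queue mirror."""
    prefix = "/v2/"
    if not path.startswith(prefix):
        raise ValueError(f"expected a /v2 path, got {path!r}")
    return "/queue/" + path[len(prefix):]

print(queue_path("/v2/chat/completions"))  # /queue/chat/completions
```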
API Versioning
The API uses versioned paths:
| Path Prefix | Purpose | Examples |
|---|---|---|
| /v2 | Default: LLM inference, embeddings, audio, images | /v2/chat/completions, /v2/embeddings |
| /v1 | Visual RAG, vector stores, file management | /v1/vector_stores, /v1/files |
| /queue | Asynchronous processing (mirrors /v2 endpoints) | /queue/chat/completions |
- Use /v2 for all standard LLM API calls (this is the default when using OpenAI SDKs with our base URL)
- Use /v1 for Visual RAG and file management operations
- Use /queue for long-running or batch workloads
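The rules above can be sketched as a small lookup table. This is a hypothetical helper following the versioning table; the family names are illustrative, not part of the API:

```python
BASE = "https://llm-server.llmhub.t-systems.net"

# Path prefix per operation family, per the versioning table above.
PATH_PREFIX = {
    "chat": "/v2",           # standard LLM inference (SDK default)
    "embeddings": "/v2",
    "vector_stores": "/v1",  # Visual RAG / vector stores
    "files": "/v1",          # file management
}

def endpoint_url(family: str, path: str) -> str:
    """Join base URL, version prefix, and endpoint path."""
    return f"{BASE}{PATH_PREFIX[family]}{path}"

print(endpoint_url("chat", "/chat/completions"))
print(endpoint_url("vector_stores", "/vector_stores"))
```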
Authentication
All requests require an API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY

See Authentication for setup details.
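A minimal sketch of building that header in Python, assuming the key is stored in the OPENAI_API_KEY environment variable (as in the curl example below):

```python
import os

def auth_headers(api_key=None):
    """Return the Authorization header expected by the API."""
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    return {"Authorization": f"Bearer {key}"}

print(auth_headers("YOUR_API_KEY"))  # {'Authorization': 'Bearer YOUR_API_KEY'}
```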
Request Format
All POST requests use JSON:

```shell
curl -X POST "https://llm-server.llmhub.t-systems.net/v2/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "Llama-3.3-70B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'
```

Response Format
Responses follow the OpenAI response format. For chat completions:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "Llama-3.3-70B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 9,
    "total_tokens": 19
  }
}
```
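As a sketch, the fields of such a response can be pulled out with Python's standard json module. The payload here is the documentation example above, not live API output:

```python
import json

# Example response body, copied from the documentation above.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "Llama-3.3-70B-Instruct",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "Hello! How can I help you today?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19}
}
"""

resp = json.loads(raw)
# The assistant's reply lives in choices[0].message.content.
answer = resp["choices"][0]["message"]["content"]
total = resp["usage"]["total_tokens"]
print(answer)  # Hello! How can I help you today?
print(total)   # 19
```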