This document provides a comprehensive guide to using the Sahara Model Hub API. It walks you through discovering available models and compute providers, querying model metadata, and making inference requests using both raw HTTP and OpenAI-compatible Python clients.
The API is especially useful for developers integrating multiple model providers into their workflow while maintaining a unified interface.
You will learn how to:
Query all available models and compute providers
Filter models or providers using specific criteria
Access model usage details
Send inference requests through LangChain, the OpenAI SDK, or direct HTTP
Implement multi-agent logic with routing
Preparation
API Setup
To access the Sahara Model Hub API, you need a valid API key. This key is required to authenticate every API request.
How to Get Your API Key
1. Go to the Developer Portal
Open: https://portal.saharalabs.ai
2. Log In and Access API Keys
Click your profile icon (top-right) → select "API Key".
3. Create a New Key
Click "Create API Key", assign it a name like "dev-client", and generate it.
4. Copy and Store Securely
You can only view the key once. Save it securely in an environment variable, config file, or secret manager.
Note: Never expose your API key in public code or repositories. Treat it as a secret credential.
Once you have your API key, configure it in your script. This will be required in all requests sent to the Sahara Model Hub API.
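For example, one common pattern is to load the key from an environment variable at startup. The variable names SAHARA_DEVPORTAL_API_KEY and MODEL_BASE_URL below match the ones used in the snippets that follow; the base URL value itself comes from the Developer Portal and is intentionally not hard-coded here:

```python
import os

# Read credentials from the environment rather than hard-coding them.
# SAHARA_DEVPORTAL_API_KEY holds the key created in the Developer Portal;
# MODEL_BASE_URL holds the Sahara Model Hub endpoint (copy it from the
# portal or your project configuration).
SAHARA_DEVPORTAL_API_KEY = os.environ.get("SAHARA_DEVPORTAL_API_KEY", "")
MODEL_BASE_URL = os.environ.get("MODEL_BASE_URL", "")

if not SAHARA_DEVPORTAL_API_KEY:
    print("Warning: SAHARA_DEVPORTAL_API_KEY is not set")
```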
If you prefer to use OpenAI's SDK, the Sahara endpoint fully supports OpenAI-compatible APIs.
Non-Streaming Response
from openai import OpenAI

# MODEL_BASE_URL and SAHARA_DEVPORTAL_API_KEY are assumed to be
# configured as described above.
client = OpenAI(
    base_url=MODEL_BASE_URL,
    api_key=SAHARA_DEVPORTAL_API_KEY,
    organization="openai"
)

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello! Who are you?"}
    ]
)
print(completion.choices[0].message)
Sample Output
ChatCompletionMessage(content="Hello! I'm an AI assistant here to help you with any questions or information you need. How can I assist you today?", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)
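Because the endpoint is OpenAI-compatible, you can also call it over raw HTTP without any SDK. The sketch below uses only the Python standard library and assumes the usual OpenAI-style `/chat/completions` path under MODEL_BASE_URL; verify the exact path against the portal documentation before relying on it:

```python
import json
import os
import urllib.request

MODEL_BASE_URL = os.environ.get("MODEL_BASE_URL", "")
SAHARA_DEVPORTAL_API_KEY = os.environ.get("SAHARA_DEVPORTAL_API_KEY", "")

# An OpenAI-compatible chat completion payload, built by hand.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello! Who are you?"}],
}
headers = {
    "Authorization": f"Bearer {SAHARA_DEVPORTAL_API_KEY}",
    "Content-Type": "application/json",
}

def send_chat_request():
    """POST the payload and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{MODEL_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example (requires valid credentials):
# body = send_chat_request()
# print(body["choices"][0]["message"]["content"])
```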
Streaming Response
import asyncio
import json

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

# Model/provider pairs to test; extend this list as needed.
model_provider_combinations = [
    {"model_name": "gpt-4o", "model_provider": "openai"},
]

async def generate(model_name, model_provider):
    print(f"Testing Streaming Output of {model_name} on {model_provider}")
    chat = ChatOpenAI(
        model=model_name,
        api_key=SAHARA_DEVPORTAL_API_KEY,
        openai_api_base=MODEL_BASE_URL,
        organization=model_provider,
        streaming=True,
        extra_body={
            "compute_provider": "lepton"
        }
    )
    messages = [
        HumanMessage(content="Hello! How are you?")
    ]
    try:
        full_content = ""
        async for chunk in chat.astream(messages):
            if chunk.content:
                full_content += chunk.content
                # Print the accumulated text as each chunk arrives.
                print(full_content)
    except Exception as e:
        print(f"Streaming error: {e}")
        error_data = {"type": "error", "message": str(e)}
        print(f"data: {json.dumps(error_data)}\n\n")

async def main():
    for combination in model_provider_combinations[:1]:
        await generate(combination["model_name"], combination["model_provider"])

if __name__ == '__main__':
    asyncio.run(main())
Sample Response
Testing Streaming Output of gpt-4o on openai
Hello
Hello!
Hello! I'm
Hello! I'm here
Hello! I'm here and
Hello! I'm here and ready
Hello! I'm here and ready to
Hello! I'm here and ready to help
Hello! I'm here and ready to help.
Hello! I'm here and ready to help. What
Hello! I'm here and ready to help. What can
Hello! I'm here and ready to help. What can I
Hello! I'm here and ready to help. What can I do
Hello! I'm here and ready to help. What can I do for
Hello! I'm here and ready to help. What can I do for you
Hello! I'm here and ready to help. What can I do for you today
Hello! I'm here and ready to help. What can I do for you today?
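The streaming example above routes the request to a specific compute provider via extra_body. The plain OpenAI SDK accepts the same field on chat.completions.create; the sketch below only builds the keyword arguments (a compute_provider value such as "lepton" comes from the provider list you can query from the hub):

```python
# Keyword arguments for client.chat.completions.create(). The OpenAI SDK
# forwards fields passed via extra_body verbatim in the request body,
# which the Sahara endpoint reads to select a compute provider.
request_kwargs = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "extra_body": {"compute_provider": "lepton"},
}

# Usage: completion = client.chat.completions.create(**request_kwargs)
```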
Model Inference Using Langchain
Prerequisites
Ensure the following tools and packages are installed before continuing:
pip install langchain_openai
langchain_openai is a Python library that provides integration between LangChain and OpenAI’s API.
You can interact with Sahara models using the LangChain interface. This is useful for testing streaming outputs and experimenting with conversational flows.
Below is a non-streaming example using a single model (gpt-4o) served through the openai provider:
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
import json

model_name = "gpt-4o"
model_provider = "openai"

chat = ChatOpenAI(
    model=model_name,
    api_key=SAHARA_DEVPORTAL_API_KEY,
    openai_api_base=MODEL_BASE_URL,
    organization=model_provider,
    streaming=False,
)

messages = [
    HumanMessage(content="Hello! How are you?")
]

def generate():
    try:
        res = chat.invoke(messages)
        print(res)
    except Exception as e:
        print(f"Inference error: {e}")
        error_data = {"type": "error", "message": str(e)}
        print(f"data: {json.dumps(error_data)}\n\n")

if __name__ == '__main__':
    generate()
Sample Response
content="Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?" additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 13, 'total_tokens': 43, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_eb9dce56a8', 'finish_reason': 'stop', 'logprobs': None} id='run-427fd56e-853e-4cb4-9c29-8f48cccab9d6-0' usage_metadata={'input_tokens': 13, 'output_tokens': 30, 'total_tokens': 43, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}
Multi-Agent Integration (OpenAI Agents SDK)
The Sahara API supports OpenAI's Agents SDK (the `openai-agents` Python package). This example sets up three agents:
A Spanish-speaking agent
An English-speaking agent
A triage agent that routes input based on language
Prerequisites
Ensure the following tools and packages are installed before continuing: