
Azure Anthropic (Claude via Azure Foundry)

LiteLLM supports Claude models deployed via Microsoft Azure Foundry, including Claude Sonnet 4.5, Claude Haiku 4.5, and Claude Opus 4.1.

Available Models

Azure Foundry supports the following Claude models:

  • claude-sonnet-4-5 - Anthropic's most capable model for building real-world agents and handling complex, long-horizon tasks
  • claude-haiku-4-5 - Near-frontier performance with the right speed and cost for high-volume use cases
  • claude-opus-4-1 - Industry leader for coding, delivering sustained performance on long-running tasks

| Property | Details |
|---|---|
| Description | Claude models deployed via Microsoft Azure Foundry. Uses the same API as Anthropic's Messages API but with Azure authentication. |
| Provider Route on LiteLLM | azure_ai/ (add this prefix to Claude model names, e.g. azure_ai/claude-sonnet-4-5) |
| Provider Doc | Azure Foundry Claude Models |
| API Endpoint | https://<resource-name>.services.ai.azure.com/anthropic/v1/messages |
| Supported Endpoints | /chat/completions, /anthropic/v1/messages |
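
As a small illustration of the provider route, the sketch below adds the azure_ai/ prefix to a bare Claude model name. The helper name to_litellm_model is hypothetical and not part of LiteLLM; it only shows the naming convention described above.

```python
PROVIDER_PREFIX = "azure_ai/"


def to_litellm_model(model_name: str) -> str:
    """Return the model name with the azure_ai/ route prefix (hypothetical helper)."""
    # Leave names that already carry the prefix untouched.
    if model_name.startswith(PROVIDER_PREFIX):
        return model_name
    return PROVIDER_PREFIX + model_name


print(to_litellm_model("claude-sonnet-4-5"))
```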

Key Features

  • Extended thinking: Enhanced reasoning capabilities for complex tasks
  • Image and text input: Strong vision capabilities for analyzing charts, graphs, technical diagrams, and reports
  • Code generation: Advanced thinking with code generation, analysis, and debugging (Claude Sonnet 4.5 and Claude Opus 4.1)
  • Same API as Anthropic: All request/response transformations are identical to the main Anthropic provider

Authentication

Azure Anthropic supports two authentication methods:

  1. API Key: Use the api-key header
  2. Azure AD Token: Use Authorization: Bearer <token> header (Microsoft Entra ID)
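
To make the two methods concrete, here is a minimal sketch of the HTTP headers each one produces. The function build_auth_headers is hypothetical (LiteLLM builds these headers internally), and preferring the API key when both credentials are supplied is an assumption made for this example only.

```python
from typing import Optional


def build_auth_headers(
    api_key: Optional[str] = None,
    azure_ad_token: Optional[str] = None,
) -> dict:
    """Return the auth header for one of the two methods (hypothetical helper)."""
    if api_key:
        # Method 1: API key in the api-key header.
        return {"api-key": api_key}
    if azure_ad_token:
        # Method 2: Microsoft Entra ID token as a Bearer token.
        return {"Authorization": f"Bearer {azure_ad_token}"}
    raise ValueError("Provide either api_key or azure_ad_token")


print(build_auth_headers(api_key="your-azure-api-key"))
print(build_auth_headers(azure_ad_token="your-azure-ad-token"))
```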

API Keys and Configuration

import os

# Option 1: API Key authentication
os.environ["AZURE_API_KEY"] = "your-azure-api-key"
os.environ["AZURE_API_BASE"] = "https://<resource-name>.services.ai.azure.com/anthropic"

# Option 2: Azure AD Token authentication
os.environ["AZURE_AD_TOKEN"] = "your-azure-ad-token"
os.environ["AZURE_API_BASE"] = "https://<resource-name>.services.ai.azure.com/anthropic"

# Optional: Azure AD Token Provider (for automatic token refresh)
os.environ["AZURE_TENANT_ID"] = "your-tenant-id"
os.environ["AZURE_CLIENT_ID"] = "your-client-id"
os.environ["AZURE_CLIENT_SECRET"] = "your-client-secret"
os.environ["AZURE_SCOPE"] = "https://cognitiveservices.azure.com/.default"

Usage - LiteLLM Python SDK

Basic Completion

import os

from litellm import completion

# Set environment variables
os.environ["AZURE_API_KEY"] = "your-azure-api-key"
os.environ["AZURE_API_BASE"] = "https://<resource-name>.services.ai.azure.com/anthropic"

# Make a completion request
response = completion(
    model="azure_ai/claude-sonnet-4-5",
    messages=[
        {"role": "user", "content": "What are 3 things to visit in Seattle?"}
    ],
    max_tokens=1000,
    temperature=0.7,
)

print(response)

Completion with API Key Parameter

import litellm

response = litellm.completion(
    model="azure_ai/claude-sonnet-4-5",
    api_base="https://<resource-name>.services.ai.azure.com/anthropic",
    api_key="your-azure-api-key",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1000,
)

Completion with Azure AD Token

import litellm

response = litellm.completion(
    model="azure_ai/claude-sonnet-4-5",
    api_base="https://<resource-name>.services.ai.azure.com/anthropic",
    azure_ad_token="your-azure-ad-token",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    max_tokens=1000,
)

Streaming

from litellm import completion

response = completion(
    model="azure_ai/claude-sonnet-4-5",
    messages=[
        {"role": "user", "content": "Write a short story"}
    ],
    stream=True,
    max_tokens=1000,
)

# Print each text delta as it arrives
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Tool Calling

from litellm import completion

response = completion(
    model="azure_ai/claude-sonnet-4-5",
    messages=[
        {"role": "user", "content": "What's the weather in Seattle?"}
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    tool_choice="auto",
    max_tokens=1000,
)

print(response)
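
The request above only asks the model which tool to call; your code still has to run the tool. The following sketch shows one way to do that, assuming the response follows the OpenAI-style shape (message.tool_calls entries with id, function.name, and function.arguments as a JSON string). The get_weather body, LOCAL_TOOLS, and run_tool_calls are hypothetical names introduced for illustration.

```python
import json
from types import SimpleNamespace


def get_weather(location: str) -> str:
    """Stand-in implementation; a real version would query a weather service."""
    return f"Sunny in {location}"


# Map tool names declared in the request to local Python callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_calls(message) -> list:
    """Run each requested tool and return OpenAI-style tool result messages."""
    results = []
    for call in getattr(message, "tool_calls", None) or []:
        func = LOCAL_TOOLS[call.function.name]
        args = json.loads(call.function.arguments)  # arguments arrive as JSON text
        results.append(
            {"role": "tool", "tool_call_id": call.id, "content": func(**args)}
        )
    return results


# Fake assistant message so the example runs offline.
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            id="call_1",
            function=SimpleNamespace(
                name="get_weather",
                arguments='{"location": "Seattle, WA"}',
            ),
        )
    ]
)
print(run_tool_calls(fake_message))
```

With a live response, pass response.choices[0].message and append the returned messages to the conversation before calling completion again.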

Usage - LiteLLM Proxy Server

1. Save key in your environment

export AZURE_API_KEY="your-azure-api-key"
export AZURE_API_BASE="https://<resource-name>.services.ai.azure.com/anthropic"

2. Configure the proxy

model_list:
  - model_name: claude-sonnet-4-5
    litellm_params:
      model: azure_ai/claude-sonnet-4-5
      api_base: https://<resource-name>.services.ai.azure.com/anthropic
      api_key: os.environ/AZURE_API_KEY

3. Test it

curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "max_tokens": 1000
  }'
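
The same test request can be built from Python with only the standard library. This is a sketch, not an official client: build_proxy_request is a hypothetical helper, and the optional api_key argument assumes your proxy is configured to require a key.

```python
import json
import urllib.request
from typing import Optional


def build_proxy_request(
    prompt: str,
    model: str = "claude-sonnet-4-5",
    base_url: str = "http://0.0.0.0:4000",
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST to the proxy's /chat/completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1000,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_proxy_request("Hello!")
print(request.get_method(), request.full_url)

# To send it against a running proxy:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```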

Messages API

Azure Anthropic also supports the native Anthropic Messages API. The endpoint structure is the same as Anthropic's /v1/messages API.

Using Anthropic SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="your-azure-api-key",
    base_url="https://<resource-name>.services.ai.azure.com/anthropic"
)

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1000,
    messages=[
        {"role": "user", "content": "Hello, world"}
    ]
)

print(response)
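
Printing the response shows the whole message object. To pull out only the generated text, you can join the text blocks in response.content. The sketch below assumes the Messages API shape in which content is a list of blocks with a type field and, for text blocks, a text field; extract_text and the fake response are hypothetical and used only so the example runs offline.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Concatenate the text of all text-type content blocks in a response."""
    return "".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )


# Fake response with one non-text block and two text blocks.
fake_response = SimpleNamespace(
    content=[
        SimpleNamespace(type="thinking", thinking="(internal reasoning)"),
        SimpleNamespace(type="text", text="Hello! "),
        SimpleNamespace(type="text", text="How can I help?"),
    ]
)
print(extract_text(fake_response))  # Hello! How can I help?
```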

Using LiteLLM Proxy

curl --request POST \
  --url http://0.0.0.0:4000/anthropic/v1/messages \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --header "Authorization: bearer sk-anything" \
  --data '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, world"}
    ]
  }'

Supported OpenAI Parameters

Azure Anthropic supports the same parameters as the main Anthropic provider:

"stream",
"stop",
"temperature",
"top_p",
"max_tokens",
"max_completion_tokens",
"tools",
"tool_choice",
"extra_headers",
"parallel_tool_calls",
"response_format",
"user",
"thinking",
"reasoning_effort"

Note: The Azure Anthropic API requires max_tokens. If you do not provide it, LiteLLM automatically sends max_tokens=4096.
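
The defaulting behavior can be pictured with a few lines of Python. This is an illustrative sketch of the rule stated above, not LiteLLM's actual implementation; with_default_max_tokens is a hypothetical name, and the sketch looks only at the max_tokens key.

```python
DEFAULT_MAX_TOKENS = 4096  # default value stated in the note above


def with_default_max_tokens(params: dict) -> dict:
    """Return a copy of params with max_tokens filled in when it is missing."""
    merged = dict(params)  # copy so the caller's dict is not mutated
    if merged.get("max_tokens") is None:
        merged["max_tokens"] = DEFAULT_MAX_TOKENS
    return merged


print(with_default_max_tokens({"temperature": 0.7}))
print(with_default_max_tokens({"max_tokens": 1000}))
```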

Differences from Standard Anthropic Provider

The only difference between Azure Anthropic and the standard Anthropic provider is authentication:

  • Standard Anthropic: Uses x-api-key header
  • Azure Anthropic: Uses api-key header or Authorization: Bearer <token> for Azure AD authentication

All other request/response transformations, tool calling, streaming, and feature support are identical.

API Base URL Format

The API base URL should follow this format:

https://<resource-name>.services.ai.azure.com/anthropic

LiteLLM will automatically append /v1/messages if not already present in the URL.
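
The appending rule can be sketched as follows. This mirrors the behavior described above under the assumption that the check is a simple suffix match; build_messages_url is a hypothetical helper, not a LiteLLM function.

```python
MESSAGES_PATH = "/v1/messages"


def build_messages_url(api_base: str) -> str:
    """Append /v1/messages to the base URL unless it is already there."""
    base = api_base.rstrip("/")  # tolerate a trailing slash
    if base.endswith(MESSAGES_PATH):
        return base
    return base + MESSAGES_PATH


print(build_messages_url("https://my-resource.services.ai.azure.com/anthropic"))
print(build_messages_url("https://my-resource.services.ai.azure.com/anthropic/v1/messages"))
```

Both calls print the same full endpoint, so either form of the base URL is acceptable in this sketch.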

Example: Full Configuration

import os
from litellm import completion

# Configure Azure Anthropic
os.environ["AZURE_API_KEY"] = "your-azure-api-key"
os.environ["AZURE_API_BASE"] = "https://my-resource.services.ai.azure.com/anthropic"

# Make a request
response = completion(
    model="azure_ai/claude-sonnet-4-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=1000,
    temperature=0.7,
    stream=False,
)

print(response.choices[0].message.content)

Troubleshooting

Missing API Base Error

If you see an error about missing API base, ensure you've set:

os.environ["AZURE_API_BASE"] = "https://<resource-name>.services.ai.azure.com/anthropic"

Or pass it directly:

response = completion(
    model="azure_ai/claude-sonnet-4-5",
    api_base="https://<resource-name>.services.ai.azure.com/anthropic",
    # ...
)

Authentication Errors

  • API Key: Ensure AZURE_API_KEY is set or passed as api_key parameter
  • Azure AD Token: Ensure AZURE_AD_TOKEN is set or passed as azure_ad_token parameter
  • Token Provider: For automatic token refresh, configure AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET
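
A quick pre-flight check can catch the problems listed above before any request is sent. The sketch below only inspects the environment variable names used on this page; find_config_problems is a hypothetical helper, and treating the three token-provider variables as a complete credential set is an assumption.

```python
import os
from typing import Mapping

TOKEN_PROVIDER_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def find_config_problems(env: Mapping[str, str]) -> list:
    """Return human-readable messages for missing Azure Anthropic settings."""
    problems = []
    if not env.get("AZURE_API_BASE"):
        problems.append("AZURE_API_BASE is not set")
    has_key = bool(env.get("AZURE_API_KEY"))
    has_token = bool(env.get("AZURE_AD_TOKEN"))
    has_provider = all(env.get(name) for name in TOKEN_PROVIDER_VARS)
    if not (has_key or has_token or has_provider):
        problems.append(
            "No credentials found: set AZURE_API_KEY, AZURE_AD_TOKEN, or all of "
            + ", ".join(TOKEN_PROVIDER_VARS)
        )
    return problems


# Check the current process environment and report anything missing.
for problem in find_config_problems(os.environ):
    print("Config problem:", problem)
```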