Python’s rich ecosystem and async capabilities make it an excellent choice for building AI applications with Claude in Azure AI Foundry. This comprehensive guide demonstrates production-ready Python patterns including async/await implementations, Azure Identity integration, advanced tool use, vision capabilities, and cost optimization strategies unique to Python’s ecosystem.
While Part 3 covered Node.js implementation, Part 4 focuses on Python-specific patterns that leverage asyncio, type hints, context managers, and Python’s extensive data science libraries for building sophisticated AI applications.
Environment Setup
Prerequisites and Dependencies
Python 3.8 or higher is required. We recommend Python 3.11 or 3.12 for optimal performance with async operations.
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Install Anthropic SDK with Foundry support
pip install anthropic
# Install Azure Identity for Entra ID authentication
pip install azure-identity
# Install additional utilities
pip install python-dotenv aiofiles pydantic
Project Structure and Configuration
Create a .env file for configuration:
# Azure Foundry Configuration
AZURE_FOUNDRY_RESOURCE=your-resource-name
AZURE_FOUNDRY_BASE_URL=https://your-resource-name.services.ai.azure.com/anthropic
# API Key Authentication
ANTHROPIC_FOUNDRY_API_KEY=your-api-key
# Model Configuration
DEFAULT_MODEL=claude-sonnet-4-5
MAX_TOKENS=4096
Basic Synchronous Implementation
Start with a basic synchronous implementation to understand core concepts:
from anthropic import AnthropicFoundry
from dotenv import load_dotenv
import os
# Load environment variables from the local .env file.
load_dotenv()

# Initialize the Claude client with API-key authentication.
# Pass os.getenv(...) straight through: wrapping it in an f-string would
# turn a missing variable (None) into the literal string "None", which
# silently misconfigures the client's base URL.
client = AnthropicFoundry(
    api_key=os.getenv("ANTHROPIC_FOUNDRY_API_KEY"),
    base_url=os.getenv("AZURE_FOUNDRY_BASE_URL")
)
def simple_chat(user_message: str) -> str:
    """Send a single user message to Claude and return the reply text.

    Token usage is printed so costs stay visible during development.
    """
    response = client.messages.create(
        model=os.getenv("DEFAULT_MODEL", "claude-sonnet-4-5"),
        max_tokens=int(os.getenv("MAX_TOKENS", "1024")),
        messages=[{"role": "user", "content": user_message}],
    )

    # A response may contain several content blocks; keep only text ones.
    parts = [block.text for block in response.content if hasattr(block, 'text')]

    print(f"Input tokens: {response.usage.input_tokens}")
    print(f"Output tokens: {response.usage.output_tokens}")
    return "".join(parts)
# Test the function
if __name__ == "__main__":
result = simple_chat("Explain Python async/await in 3 sentences.")
print(result)Async Implementation with Azure Identity
For production applications, use async patterns with Entra ID authentication:
import asyncio
from anthropic import AsyncAnthropicFoundry
from azure.identity.aio import DefaultAzureCredential
from azure.core.credentials import AccessToken
from typing import AsyncIterator
import os
class AzureTokenProvider:
    """Callable that supplies Entra ID bearer tokens for Azure AI Foundry.

    Instances can be handed to the SDK wherever an async token provider is
    expected; call :meth:`close` when finished to release the credential.
    """

    def __init__(self):
        # DefaultAzureCredential walks the standard Azure auth chain
        # (environment variables, managed identity, CLI login, ...).
        self.credential = DefaultAzureCredential()
        self.scope = "https://ai.azure.com/.default"

    async def __call__(self) -> str:
        """Fetch an access token for the Foundry scope; return its string."""
        access = await self.credential.get_token(self.scope)
        return access.token

    async def close(self):
        """Release the underlying credential's resources."""
        await self.credential.close()
async def create_client() -> AsyncAnthropicFoundry:
    """Build an async Claude client authenticated via Entra ID.

    NOTE(review): the AzureTokenProvider created here is never closed by
    callers, so its credential may be leaked -- consider returning it
    alongside the client or wrapping both in a context manager.
    """
    provider = AzureTokenProvider()
    return AsyncAnthropicFoundry(
        azure_ad_token_provider=provider,
        base_url=os.getenv("AZURE_FOUNDRY_BASE_URL"),
    )
async def async_chat(user_message: str) -> str:
    """Send one message to Claude using Entra ID auth; return the reply text."""
    client = await create_client()
    try:
        response = await client.messages.create(
            model=os.getenv("DEFAULT_MODEL", "claude-sonnet-4-5"),
            max_tokens=1024,
            messages=[{"role": "user", "content": user_message}]
        )
        text_pieces = (b.text for b in response.content if hasattr(b, 'text'))
        return "".join(text_pieces)
    finally:
        # Always release HTTP resources, even when the request fails.
        await client.close()
# Run async function
if __name__ == "__main__":
result = asyncio.run(async_chat("What are Python's key features?"))
print(result)Streaming Responses with Async Generators
Python’s async generators provide elegant streaming implementations:
from anthropic import AsyncAnthropicFoundry
from anthropic.types import MessageStreamEvent
from typing import AsyncIterator
import asyncio
async def stream_chat(
    client: AsyncAnthropicFoundry,
    user_message: str
) -> AsyncIterator[str]:
    """Yield Claude's reply incrementally as text fragments arrive."""
    request = {
        "model": "claude-sonnet-4-5",
        "max_tokens": 2048,
        "messages": [{"role": "user", "content": user_message}],
    }
    async with client.messages.stream(**request) as stream:
        async for fragment in stream.text_stream:
            yield fragment
async def demonstrate_streaming():
    """Print a streamed Claude reply to stdout as it arrives."""
    client = await create_client()
    try:
        print("Claude: ", end="", flush=True)
        async for piece in stream_chat(client, "Explain generators in Python."):
            print(piece, end="", flush=True)
        print("\n")
    finally:
        await client.close()
# Run streaming example
asyncio.run(demonstrate_streaming())
Multi-Turn Conversation Manager
Build a conversation manager with proper type hints and context management:
from dataclasses import dataclass, field
from typing import List, Optional
from anthropic import AsyncAnthropicFoundry
from anthropic.types import MessageParam
@dataclass
class ConversationConfig:
    """Settings that control how a conversation is sent to Claude."""
    # Model deployment name to target.
    model: str = "claude-sonnet-4-5"
    # Upper bound on tokens generated per reply.
    max_tokens: int = 2048
    # Optional system prompt; None means no system message is sent.
    system_prompt: Optional[str] = None
class ConversationManager:
    """Manage multi-turn conversations with Claude.

    Keeps the running message history, sends each new user turn together
    with that history, and records the assistant's reply.
    """

    def __init__(
        self,
        client: "AsyncAnthropicFoundry",
        config: "Optional[ConversationConfig]" = None
    ):
        self.client = client
        self.config = config or ConversationConfig()
        # Alternating user/assistant turns, oldest first.
        self.messages: "List[MessageParam]" = []

    async def send_message(self, user_message: str) -> str:
        """Send *user_message* plus the full history; return the reply text.

        On API failure the just-appended user turn is rolled back so the
        history never contains a user message without a matching reply.
        """
        self.messages.append({
            "role": "user",
            "content": user_message
        })

        # Build the request, omitting `system` entirely when no system
        # prompt is configured: the SDK expects the parameter to be
        # absent, not None.
        request = {
            "model": self.config.model,
            "max_tokens": self.config.max_tokens,
            "messages": self.messages,
        }
        if self.config.system_prompt is not None:
            request["system"] = self.config.system_prompt

        try:
            response = await self.client.messages.create(**request)
        except Exception:
            # Roll back the dangling user turn so the history stays valid.
            self.messages.pop()
            raise

        # Concatenate the text blocks of the reply.
        assistant_message = "".join(
            block.text for block in response.content
            if hasattr(block, 'text')
        )

        self.messages.append({
            "role": "assistant",
            "content": assistant_message
        })
        return assistant_message

    def get_history(self) -> "List[MessageParam]":
        """Return a shallow copy of the conversation history."""
        return self.messages.copy()

    def clear_history(self) -> None:
        """Forget all prior turns."""
        self.messages.clear()

    def estimate_tokens(self) -> int:
        """Rough token estimate (~4 characters per token)."""
        total_chars = sum(
            len(msg.get("content", ""))
            for msg in self.messages
        )
        return total_chars // 4
# Usage example
async def conversation_example():
client = await create_client()
config = ConversationConfig(
system_prompt="You are a Python expert helping with async programming.",
max_tokens=2048
)
conversation = ConversationManager(client, config)
try:
response1 = await conversation.send_message(
"What's the difference between asyncio.create_task and await?"
)
print("Claude:", response1)
response2 = await conversation.send_message(
"Can you show me a practical example?"
)
print("Claude:", response2)
print(f"\nEstimated tokens: {conversation.estimate_tokens()}")
finally:
await client.close()Tool Use and Function Calling
Implement tool use for agentic applications:
from anthropic import AsyncAnthropicFoundry
from anthropic.types import ToolUseBlock
from typing import Dict, Any, List
import json
# Tool definitions advertised to Claude. `input_schema` is JSON Schema
# describing the arguments Claude may pass when it calls the tool.
# Define tools
tools = [
{
"name": "get_weather",
"description": "Get current weather for a location",
"input_schema": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name, e.g. San Francisco"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit"
}
},
# Only `location` is mandatory; `unit` is optional.
"required": ["location"]
}
}
]
# Tool implementations
async def get_weather(location: str, unit: str = "celsius") -> Dict[str, Any]:
    """Return canned weather data for *location*.

    Stand-in for a real weather service; replace with an actual API call
    in production.
    """
    temperature = 72 if unit == "fahrenheit" else 22
    return {
        "location": location,
        "temperature": temperature,
        "unit": unit,
        "conditions": "Partly cloudy",
    }

async def execute_tool(tool_name: str, tool_input: Dict[str, Any]) -> str:
    """Dispatch *tool_name* to its implementation; return result as JSON.

    Raises:
        ValueError: if *tool_name* does not name a known tool.
    """
    if tool_name != "get_weather":
        raise ValueError(f"Unknown tool: {tool_name}")
    payload = await get_weather(**tool_input)
    return json.dumps(payload)
async def tool_use_example(client: AsyncAnthropicFoundry, query: str):
    """Run an agentic loop: call Claude, execute requested tools, repeat.

    Returns the final text answer once Claude stops requesting tools.
    """
    messages = [{"role": "user", "content": query}]
    while True:
        response = await client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

        requested = [b for b in response.content if isinstance(b, ToolUseBlock)]
        if not requested:
            # Claude produced a plain answer -- the loop is done.
            return "".join(
                b.text for b in response.content if hasattr(b, 'text')
            )

        # Record the assistant turn, then answer every tool call it made.
        messages.append({
            "role": "assistant",
            "content": response.content
        })
        results = [
            {
                "type": "tool_result",
                "tool_use_id": call.id,
                "content": await execute_tool(call.name, call.input),
            }
            for call in requested
        ]
        messages.append({
            "role": "user",
            "content": results
        })
# Test tool use
async def test_tools():
client = await create_client()
try:
result = await tool_use_example(
client,
"What's the weather like in Paris?"
)
print(result)
finally:
await client.close()Vision Capabilities
Process images with Claude’s vision capabilities:
import base64
from pathlib import Path
from anthropic import AsyncAnthropicFoundry
async def analyze_image(
    client: AsyncAnthropicFoundry,
    image_path: str,
    prompt: str
) -> str:
    """Send a local image plus *prompt* to Claude; return the reply text."""
    # Base64-encode the raw image bytes for the API payload.
    raw = Path(image_path).read_bytes()
    encoded = base64.standard_b64encode(raw).decode("utf-8")

    # Map the file extension to a MIME type; unknown extensions fall
    # back to JPEG.
    suffix = Path(image_path).suffix.lower()
    mime_types = {
        ".jpg": "image/jpeg",
        ".jpeg": "image/jpeg",
        ".png": "image/png",
        ".gif": "image/gif",
        ".webp": "image/webp"
    }
    media_type = mime_types.get(suffix, "image/jpeg")

    image_block = {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": encoded
        }
    }
    text_block = {"type": "text", "text": prompt}

    response = await client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=2048,
        messages=[{"role": "user", "content": [image_block, text_block]}]
    )
    return "".join(b.text for b in response.content if hasattr(b, 'text'))
# Usage
async def vision_example():
client = await create_client()
try:
result = await analyze_image(
client,
"diagram.png",
"Describe this architecture diagram in detail."
)
print(result)
finally:
await client.close()Prompt Caching for Cost Optimization
from anthropic import AsyncAnthropicFoundry
from anthropic.types import TextBlock, CacheControlEphemeralParam
async def cached_documentation_qa(
client: AsyncAnthropicFoundry,
documentation: str,
question: str
) -> str:
"""Answer questions about documentation with caching."""
response = await client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
system=[
{
"type": "text",
"text": "You are a helpful documentation assistant."
},
{
"type": "text",
"text": f"Documentation:\n\n{documentation}",
"cache_control": {"type": "ephemeral"}
}
],
messages=[
{
"role": "user",
"content": question
}
]
)
# Log cache usage
usage = response.usage
print(f"Cache creation tokens: {usage.cache_creation_input_tokens or 0}")
print(f"Cache read tokens: {usage.cache_read_input_tokens or 0}")
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
return "".join([
block.text for block in response.content
if hasattr(block, 'text')
])Production Application Example
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Optional
from anthropic import AsyncAnthropicFoundry
from azure.identity.aio import DefaultAzureCredential
import logging
# Module-wide logging: INFO level, logger named after this module.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
class ClaudeApp:
    """Production-ready Claude application.

    Usable as an async context manager; supports both Entra ID and
    API-key authentication plus plain and streaming chat.
    """

    def __init__(self, use_entra_id: bool = True):
        self.use_entra_id = use_entra_id
        self.client: Optional[AsyncAnthropicFoundry] = None
        self.credential: Optional[DefaultAzureCredential] = None

    async def __aenter__(self):
        """Set up resources on `async with` entry."""
        await self.initialize()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Release resources on `async with` exit."""
        await self.cleanup()

    async def initialize(self):
        """Create the SDK client using the configured auth mechanism."""
        base_url = os.getenv("AZURE_FOUNDRY_BASE_URL")
        if not self.use_entra_id:
            # Simple API-key authentication.
            self.client = AsyncAnthropicFoundry(
                api_key=os.getenv("ANTHROPIC_FOUNDRY_API_KEY"),
                base_url=base_url
            )
            return

        # Entra ID: hand the SDK a coroutine that mints tokens on demand.
        self.credential = DefaultAzureCredential()

        async def fetch_token():
            granted = await self.credential.get_token(
                "https://ai.azure.com/.default"
            )
            return granted.token

        self.client = AsyncAnthropicFoundry(
            azure_ad_token_provider=fetch_token,
            base_url=base_url
        )

    async def cleanup(self):
        """Close the client and credential if they were created."""
        if self.client:
            await self.client.close()
        if self.credential:
            await self.credential.close()

    async def chat(
        self,
        message: str,
        *,
        stream: bool = False,
        max_tokens: int = 1024
    ) -> str:
        """Send *message* to Claude; stream to stdout when *stream* is True.

        Raises:
            RuntimeError: if called before `initialize()`.
        """
        if not self.client:
            raise RuntimeError("Client not initialized")
        if stream:
            return await self._stream_chat(message, max_tokens)

        response = await self.client.messages.create(
            model=os.getenv("DEFAULT_MODEL", "claude-sonnet-4-5"),
            max_tokens=max_tokens,
            messages=[{"role": "user", "content": message}]
        )
        return "".join(
            block.text for block in response.content
            if hasattr(block, 'text')
        )

    async def _stream_chat(self, message: str, max_tokens: int) -> str:
        """Stream the reply to stdout and return the accumulated text."""
        collected = []
        async with self.client.messages.stream(
            model=os.getenv("DEFAULT_MODEL", "claude-sonnet-4-5"),
            max_tokens=max_tokens,
            messages=[{"role": "user", "content": message}]
        ) as stream:
            async for fragment in stream.text_stream:
                print(fragment, end="", flush=True)
                collected.append(fragment)
        print()  # New line after stream
        return "".join(collected)
# Application entry point
async def main():
    """Demonstrate plain and streaming chat with ClaudeApp."""
    async with ClaudeApp(use_entra_id=True) as app:
        # Example 1: Simple chat
        response = await app.chat("What is Python used for?")
        logger.info(f"Response: {response}")

        # Example 2: Streaming chat
        logger.info("Streaming response:")
        await app.chat(
            "Explain asyncio in detail.",
            stream=True,
            max_tokens=2048
        )

if __name__ == "__main__":
    asyncio.run(main())
Conclusion
This guide covered Python implementation with Claude in Azure AI Foundry, demonstrating async patterns, Azure Identity integration, streaming with async generators, conversation management, tool use, vision capabilities, and prompt caching. Python’s async ecosystem and type system provide powerful tools for building production AI applications.
In Part 5, we will explore C# integration with Microsoft Extensions.AI, showing how to build enterprise-grade applications using .NET’s robust framework and unified AI abstractions.
