Building production-ready applications with Claude in Azure AI Foundry requires understanding the Node.js SDK, proper authentication patterns, error handling strategies, and performance optimization techniques. This comprehensive guide walks through creating a complete Node.js application that leverages Claude Sonnet 4.5 for intelligent conversations, code generation, and complex reasoning tasks.
Parts 1 and 2 of this series covered strategic overview and deployment fundamentals. Part 3 focuses entirely on practical Node.js implementation, providing production-ready code examples that you can adapt for your specific use cases.
Environment Setup
Proper environment configuration ensures smooth development and production deployments. We will set up a TypeScript-based Node.js project with all necessary dependencies and configuration files.
Prerequisites
Ensure you have Node.js 18 LTS or later installed. TypeScript 4.7 or later is required, since the project configuration below uses NodeNext module resolution. You should have completed the deployment steps from Part 2, with at least one Claude model deployed in Azure AI Foundry.
Project Initialization
Create a new Node.js project and install required dependencies:
# Create project directory
mkdir claude-azure-app
cd claude-azure-app
# Initialize Node.js project
npm init -y
# Install Anthropic Foundry SDK
npm install @anthropic-ai/foundry-sdk
# Install Azure Identity for Entra ID auth
npm install @azure/identity
# Install TypeScript and type definitions
npm install --save-dev typescript @types/node
# Install dotenv for environment variables
npm install dotenv
# Install development tools
npm install --save-dev tsx nodemon
TypeScript Configuration
Create a tsconfig.json file for TypeScript compilation settings:
{
"compilerOptions": {
"target": "ES2022",
"module": "NodeNext",
"moduleResolution": "NodeNext",
"lib": ["ES2022"],
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"declaration": true,
"declarationMap": true,
"sourceMap": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist"]
}
Environment Variables
Create a .env file to store configuration. Never commit this file to source control:
# Azure Foundry Resource Configuration
AZURE_FOUNDRY_RESOURCE=your-resource-name
AZURE_FOUNDRY_BASE_URL=https://your-resource-name.services.ai.azure.com
# API Key Authentication (Option 1)
ANTHROPIC_FOUNDRY_API_KEY=your-api-key-here
# Entra ID Authentication (Option 2)
# These are automatically detected by DefaultAzureCredential
# AZURE_CLIENT_ID=your-client-id
# AZURE_TENANT_ID=your-tenant-id
# AZURE_CLIENT_SECRET=your-client-secret
# Model Configuration
DEFAULT_MODEL=claude-sonnet-4-5
MAX_TOKENS=4096
Create a .env.example file with dummy values to commit to source control, showing required configuration without exposing secrets.
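It also helps to fail fast at startup when a required variable is missing. The snippet below is an optional sketch, not part of the original setup; the variable names match the .env file above, and the API key check only applies when you use API key authentication:
import dotenv from 'dotenv';

dotenv.config();

// Optional startup check: throw immediately if required configuration is missing.
const requiredVars = ['AZURE_FOUNDRY_BASE_URL', 'ANTHROPIC_FOUNDRY_API_KEY'];
for (const name of requiredVars) {
  if (!process.env[name]) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
}

// Centralized view of the configuration used throughout the examples.
export const appConfig = {
  baseURL: `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`,
  model: process.env.DEFAULT_MODEL || 'claude-sonnet-4-5',
  maxTokens: parseInt(process.env.MAX_TOKENS || '4096', 10),
};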
Package Scripts
Update package.json with helpful development scripts:
{
"name": "claude-azure-app",
"version": "1.0.0",
"type": "module",
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc",
"start": "node dist/index.js",
"type-check": "tsc --noEmit"
}
}
Basic Chat Implementation
Let’s build a basic chat implementation that demonstrates core SDK usage. This example uses API key authentication for simplicity.
Simple Chat Example
Create src/index.ts with a basic chat implementation:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
import dotenv from 'dotenv';
// Load environment variables
dotenv.config();
// Initialize client with API key
const client = new AnthropicFoundry({
apiKey: process.env.ANTHROPIC_FOUNDRY_API_KEY,
baseURL: `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`,
});
async function simpleChat(userMessage: string) {
try {
const response = await client.messages.create({
model: process.env.DEFAULT_MODEL || 'claude-sonnet-4-5',
max_tokens: parseInt(process.env.MAX_TOKENS || '1024'),
messages: [
{
role: 'user',
content: userMessage,
},
],
});
// Extract text content from response
const textContent = response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
console.log('Claude:', textContent);
console.log('\nToken usage:', response.usage);
return textContent;
} catch (error) {
console.error('Error:', error);
throw error;
}
}
// Test the function
simpleChat('Explain quantum computing in 3 sentences.');
Run the example with npm run dev. You should see Claude’s response and token usage statistics.
Understanding the Response Structure
Claude API responses include multiple components. The content array contains response blocks of different types (text, tool use, etc.). The usage object provides token consumption metrics including input tokens, output tokens, and cache statistics. The model field confirms which deployment processed the request. The stop reason indicates why generation ended (end_turn, max_tokens, stop_sequence).
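As a quick illustration, the sketch below inspects these fields on a non-streaming response; it assumes the client initialized in the previous example:
async function inspectResponse() {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Summarize the SOLID principles.' }],
  });

  // Which deployment handled the request and why generation stopped
  console.log('Model:', response.model);
  console.log('Stop reason:', response.stop_reason); // end_turn, max_tokens, or stop_sequence

  // Walk the content blocks; text blocks carry the generated prose
  for (const block of response.content) {
    if (block.type === 'text') {
      console.log('Text block:', block.text);
    } else {
      console.log('Other block type:', block.type); // e.g. tool_use
    }
  }

  // Token consumption metrics, including cache statistics when caching is used
  console.log('Usage:', response.usage);
}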
Entra ID Authentication
For production deployments, Microsoft Entra ID provides superior security compared to API keys. Here’s how to implement it:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
import { DefaultAzureCredential, getBearerTokenProvider } from '@azure/identity';
import dotenv from 'dotenv';
dotenv.config();
// Create Azure credential
const credential = new DefaultAzureCredential();
// Get token provider for AI Foundry scope
const scope = 'https://ai.azure.com/.default';
const azureADTokenProvider = getBearerTokenProvider(credential, scope);
// Initialize client with Entra ID
const client = new AnthropicFoundry({
azureADTokenProvider,
baseURL: `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`,
});
async function authenticatedChat(userMessage: string) {
const response = await client.messages.create({
model: process.env.DEFAULT_MODEL || 'claude-sonnet-4-5',
max_tokens: 1024,
messages: [{ role: 'user', content: userMessage }],
});
return response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
}
export { authenticatedChat };
DefaultAzureCredential automatically discovers credentials from environment variables, managed identity, Azure CLI, Visual Studio, and other sources in a specific order. This makes the code portable across development and production environments without changes.
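If you want to restrict which credential sources are consulted rather than rely on the full discovery chain, @azure/identity lets you compose a chain explicitly. The following sketch reuses the scope and base URL from the example above:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
import {
  AzureCliCredential,
  ChainedTokenCredential,
  ManagedIdentityCredential,
  getBearerTokenProvider,
} from '@azure/identity';

// Try managed identity first (production), then the Azure CLI (local development).
const credential = new ChainedTokenCredential(
  new ManagedIdentityCredential(),
  new AzureCliCredential()
);

const explicitClient = new AnthropicFoundry({
  azureADTokenProvider: getBearerTokenProvider(credential, 'https://ai.azure.com/.default'),
  baseURL: `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`,
});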
Multi-Turn Conversations
Real applications require multi-turn conversations that maintain context. Here’s a conversation manager implementation:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
import type { MessageParam } from '@anthropic-ai/foundry-sdk/resources/messages';
interface ConversationOptions {
systemPrompt?: string;
maxTokens?: number;
}
class ConversationManager {
private client: AnthropicFoundry;
private messages: MessageParam[] = [];
private systemPrompt?: string;
private maxTokens: number;
private model: string;
constructor(
client: AnthropicFoundry,
options: ConversationOptions = {}
) {
this.client = client;
this.systemPrompt = options.systemPrompt;
this.maxTokens = options.maxTokens || 1024;
this.model = process.env.DEFAULT_MODEL || 'claude-sonnet-4-5';
}
async sendMessage(userMessage: string): Promise<string> {
// Add user message to conversation history
this.messages.push({
role: 'user',
content: userMessage,
});
// Call Claude API
const response = await this.client.messages.create({
model: this.model,
max_tokens: this.maxTokens,
system: this.systemPrompt,
messages: this.messages,
});
// Extract assistant response
const assistantMessage = response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
// Add assistant response to conversation history
this.messages.push({
role: 'assistant',
content: assistantMessage,
});
return assistantMessage;
}
getHistory(): MessageParam[] {
return [...this.messages];
}
clearHistory(): void {
this.messages = [];
}
getTokenCount(): number {
// Estimate token count (rough approximation)
const allText = this.messages
.map((m) =>
typeof m.content === 'string'
? m.content
: m.content.map((c) =>
c.type === 'text' ? c.text : ''
).join('')
)
.join('');
// Rough estimate: 4 characters per token
return Math.ceil(allText.length / 4);
}
}
// Usage example
async function conversationExample() {
const client = new AnthropicFoundry({
apiKey: process.env.ANTHROPIC_FOUNDRY_API_KEY,
baseURL: `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`,
});
const conversation = new ConversationManager(client, {
systemPrompt: 'You are a helpful coding assistant specializing in TypeScript and Node.js.',
maxTokens: 2048,
});
// Multi-turn conversation
const response1 = await conversation.sendMessage(
'How do I implement error handling in async functions?'
);
console.log('Claude:', response1);
const response2 = await conversation.sendMessage(
'Can you show me a practical example?'
);
console.log('Claude:', response2);
// Check conversation length
console.log('Estimated tokens:', conversation.getTokenCount());
}
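// Optional pattern (an illustrative addition, not part of the class above):
// use the public getTokenCount() and clearHistory() methods to keep the
// conversation history from growing without bound.
async function sendWithBudget(
  conversation: ConversationManager,
  message: string,
  maxEstimatedTokens = 150_000
): Promise<string> {
  if (conversation.getTokenCount() > maxEstimatedTokens) {
    // A simple strategy: reset once the estimate gets large. Real applications
    // might summarize or trim only the oldest turns instead.
    conversation.clearHistory();
  }
  return conversation.sendMessage(message);
}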
export { ConversationManager };
Streaming Responses
For real-time user experiences, streaming provides incremental response delivery. This dramatically improves perceived latency for long responses:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
async function streamingChat(client: AnthropicFoundry, userMessage: string) {
const stream = await client.messages.create({
model: 'claude-sonnet-4-5',
max_tokens: 2048,
messages: [{ role: 'user', content: userMessage }],
stream: true,
});
console.log('Claude: ');
let fullResponse = '';
for await (const event of stream) {
if (event.type === 'content_block_delta') {
if (event.delta.type === 'text_delta') {
process.stdout.write(event.delta.text);
fullResponse += event.delta.text;
}
}
if (event.type === 'message_stop') {
console.log('\n\nStream completed.');
}
}
return fullResponse;
}
// Advanced streaming with event handlers
async function advancedStreaming(client: AnthropicFoundry, userMessage: string) {
const stream = client.messages
.stream({
model: 'claude-sonnet-4-5',
max_tokens: 2048,
messages: [{ role: 'user', content: userMessage }],
})
.on('text', (text) => {
process.stdout.write(text);
})
.on('message', (message) => {
console.log('\n\nFull message:', message);
})
.on('error', (error) => {
console.error('Stream error:', error);
});
// Wait for completion
const finalMessage = await stream.finalMessage();
console.log('\n\nToken usage:', finalMessage.usage);
return finalMessage;
}The streaming API provides two approaches. The low-level approach iterates through events manually, giving fine-grained control. The high-level approach uses event handlers for cleaner code and automatic message accumulation.
Error Handling and Retry Logic
Production applications require robust error handling. Here’s a comprehensive error handling wrapper:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
import type { Message, MessageCreateParams } from '@anthropic-ai/foundry-sdk/resources/messages';
interface RetryOptions {
maxRetries?: number;
baseDelay?: number;
maxDelay?: number;
}
class ClaudeClient {
private client: AnthropicFoundry;
private retryOptions: Required<RetryOptions>;
constructor(client: AnthropicFoundry, retryOptions: RetryOptions = {}) {
this.client = client;
this.retryOptions = {
maxRetries: retryOptions.maxRetries || 3,
baseDelay: retryOptions.baseDelay || 1000,
maxDelay: retryOptions.maxDelay || 10000,
};
}
private async sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
private calculateBackoff(attempt: number): number {
const delay = this.retryOptions.baseDelay * Math.pow(2, attempt);
return Math.min(delay, this.retryOptions.maxDelay);
}
private isRetryableError(error: any): boolean {
// Retry on rate limits and temporary server errors
if (error.status === 429) return true; // Rate limit
if (error.status >= 500) return true; // Server errors
if (error.code === 'ECONNRESET') return true; // Network errors
if (error.code === 'ETIMEDOUT') return true; // Timeout errors
return false;
}
async createMessage(
params: MessageCreateParams,
attempt: number = 0
): Promise<Message> {
try {
const response = await this.client.messages.create(params);
return response;
} catch (error: any) {
// Log error details
console.error(`API error (attempt ${attempt + 1}):`, {
status: error.status,
code: error.code,
message: error.message,
});
// Check if we should retry
if (
this.isRetryableError(error) &&
attempt < this.retryOptions.maxRetries
) {
const delay = this.calculateBackoff(attempt);
console.log(`Retrying in ${delay}ms...`);
await this.sleep(delay);
return this.createMessage(params, attempt + 1);
}
// Handle specific error types
if (error.status === 401) {
throw new Error(
'Authentication failed. Check your API key or Entra ID credentials.'
);
}
if (error.status === 429) {
throw new Error(
'Rate limit exceeded. Consider implementing request queuing or requesting quota increase.'
);
}
if (error.status === 404) {
throw new Error(
'Model deployment not found. Verify deployment name and region.'
);
}
// Re-throw original error
throw error;
}
}
}
// Usage
async function robustChatExample() {
const baseClient = new AnthropicFoundry({
apiKey: process.env.ANTHROPIC_FOUNDRY_API_KEY,
baseURL: `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`,
});
const client = new ClaudeClient(baseClient, {
maxRetries: 3,
baseDelay: 1000,
maxDelay: 10000,
});
try {
const response = await client.createMessage({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
messages: [
{
role: 'user',
content: 'Explain the benefits of TypeScript.',
},
],
});
const text = response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
console.log('Success:', text);
} catch (error) {
console.error('Final error:', error);
}
}
export { ClaudeClient };
Cost Optimization with Prompt Caching
Prompt caching can reduce costs by up to 90% for applications with repeated context. Here's how to implement it effectively:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
async function cachedContextChat(
client: AnthropicFoundry,
largeContext: string,
userQuery: string
) {
const response = await client.messages.create({
model: 'claude-sonnet-4-5',
max_tokens: 1024,
system: [
{
type: 'text',
text: 'You are a helpful assistant analyzing the provided documentation.',
},
{
type: 'text',
text: largeContext,
cache_control: { type: 'ephemeral' }, // Cache this block
},
],
messages: [
{
role: 'user',
content: userQuery,
},
],
});
// Check cache usage
console.log('Cache statistics:', {
cacheCreationTokens: response.usage.cache_creation_input_tokens || 0,
cacheReadTokens: response.usage.cache_read_input_tokens || 0,
inputTokens: response.usage.input_tokens,
outputTokens: response.usage.output_tokens,
});
return response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
}
// Example: Documentation Q&A with caching
class DocumentationAssistant {
private client: AnthropicFoundry;
private documentation: string;
constructor(client: AnthropicFoundry, documentation: string) {
this.client = client;
this.documentation = documentation;
}
async ask(question: string): Promise<string> {
const response = await this.client.messages.create({
model: 'claude-sonnet-4-5',
max_tokens: 2048,
system: [
{
type: 'text',
text: 'You are a documentation expert. Answer questions based on the provided documentation.',
},
{
type: 'text',
text: `Documentation:\n\n${this.documentation}`,
cache_control: { type: 'ephemeral' },
},
],
messages: [{ role: 'user', content: question }],
});
return response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
}
}
export { cachedContextChat, DocumentationAssistant };
Cache control markers indicate which content blocks should be cached. The first request with new cache content pays cache write costs (1.25x for 5-minute cache, 2x for 1-hour cache). Subsequent requests within the cache TTL pay only 0.1x for cache reads, providing 90% cost savings on repeated context.
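To make the economics concrete, the helper below estimates input cost for a single request from the usage statistics returned by the API, using the multipliers above and the Sonnet 4.5 input rate quoted later in this guide ($3 per million input tokens). Treat it as an illustrative sketch and substitute your own pricing:
function estimateInputCost(
  usage: {
    input_tokens: number;
    cache_creation_input_tokens?: number | null;
    cache_read_input_tokens?: number | null;
  },
  baseRatePerMTok = 3 // USD per million uncached input tokens
) {
  const writes = usage.cache_creation_input_tokens ?? 0; // billed at 1.25x (5-minute cache)
  const reads = usage.cache_read_input_tokens ?? 0; // billed at 0.1x
  const uncached = usage.input_tokens; // billed at the base rate

  const withCaching =
    ((uncached + writes * 1.25 + reads * 0.1) / 1_000_000) * baseRatePerMTok;
  const withoutCaching =
    ((uncached + writes + reads) / 1_000_000) * baseRatePerMTok;

  return { withCaching, withoutCaching };
}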
Complete Production Application
Here's a complete example combining all best practices into a production-ready application:
import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk';
import { DefaultAzureCredential, getBearerTokenProvider } from '@azure/identity';
import dotenv from 'dotenv';
dotenv.config();
interface AppConfig {
useEntraID: boolean;
enableCaching: boolean;
enableRetry: boolean;
maxRetries: number;
}
class ClaudeApp {
private client: AnthropicFoundry;
private config: AppConfig;
constructor(config: Partial<AppConfig> = {}) {
this.config = {
useEntraID: config.useEntraID ?? true,
enableCaching: config.enableCaching ?? true,
enableRetry: config.enableRetry ?? true,
maxRetries: config.maxRetries ?? 3,
};
this.client = this.initializeClient();
}
private initializeClient(): AnthropicFoundry {
const baseURL = `${process.env.AZURE_FOUNDRY_BASE_URL}/anthropic`;
if (this.config.useEntraID) {
const credential = new DefaultAzureCredential();
const scope = 'https://ai.azure.com/.default';
const azureADTokenProvider = getBearerTokenProvider(credential, scope);
return new AnthropicFoundry({
azureADTokenProvider,
baseURL,
});
} else {
return new AnthropicFoundry({
apiKey: process.env.ANTHROPIC_FOUNDRY_API_KEY,
baseURL,
});
}
}
async chat(
userMessage: string,
options: {
systemPrompt?: string;
maxTokens?: number;
stream?: boolean;
} = {}
): Promise<string> {
const params = {
model: process.env.DEFAULT_MODEL || 'claude-sonnet-4-5',
max_tokens: options.maxTokens || 1024,
system: options.systemPrompt,
messages: [{ role: 'user' as const, content: userMessage }],
};
if (options.stream) {
return this.streamingChat(params);
}
const response = await this.client.messages.create(params);
return response.content
.filter((block) => block.type === 'text')
.map((block) => block.text)
.join('\n');
}
private async streamingChat(params: any): Promise<string> {
let fullResponse = '';
const stream = await this.client.messages.create({
...params,
stream: true,
});
for await (const event of stream) {
if (event.type === 'content_block_delta') {
if (event.delta.type === 'text_delta') {
process.stdout.write(event.delta.text);
fullResponse += event.delta.text;
}
}
}
console.log('\n');
return fullResponse;
}
}
// Application entry point
async function main() {
const app = new ClaudeApp({
useEntraID: true,
enableCaching: true,
enableRetry: true,
});
try {
// Example 1: Simple chat
const response1 = await app.chat(
'Write a TypeScript function to calculate factorial.'
);
console.log('Response:', response1);
// Example 2: Streaming chat
console.log('\nStreaming response:');
await app.chat('Explain async/await in JavaScript.', {
stream: true,
maxTokens: 2048,
});
// Example 3: Custom system prompt
const response3 = await app.chat(
'How do I optimize database queries?',
{
systemPrompt: 'You are a database optimization expert specializing in PostgreSQL.',
maxTokens: 2048,
}
);
console.log('Expert response:', response3);
} catch (error) {
console.error('Application error:', error);
process.exit(1);
}
}
// Run if executed directly
if (import.meta.url === `file://${process.argv[1]}`) {
main();
}
export { ClaudeApp };
Testing and Debugging
Proper testing ensures reliability. Here's a testing setup using Jest:
# Install testing dependencies
npm install --save-dev jest @types/jest ts-jest
# Create Jest configuration
npx ts-jest config:init
Create a test file src/claude.test.ts:
import { ClaudeApp } from './index';
describe('ClaudeApp', () => {
let app: ClaudeApp;
beforeAll(() => {
app = new ClaudeApp({
useEntraID: false, // Use API key for testing
});
});
test('should respond to simple question', async () => {
const response = await app.chat('What is 2+2?');
expect(response).toBeTruthy();
expect(response.length).toBeGreaterThan(0);
}, 30000); // 30 second timeout
test('should handle system prompts', async () => {
const response = await app.chat('Tell me about cats', {
systemPrompt: 'You are a veterinarian.',
});
expect(response).toBeTruthy();
}, 30000);
});
Performance Monitoring
Track API performance and costs with custom monitoring:
interface Metrics {
totalRequests: number;
totalInputTokens: number;
totalOutputTokens: number;
totalCost: number;
averageLatency: number;
}
class MetricsTracker {
private metrics: Metrics = {
totalRequests: 0,
totalInputTokens: 0,
totalOutputTokens: 0,
totalCost: 0,
averageLatency: 0,
};
private latencies: number[] = [];
trackRequest(
inputTokens: number,
outputTokens: number,
latencyMs: number
): void {
this.metrics.totalRequests++;
this.metrics.totalInputTokens += inputTokens;
this.metrics.totalOutputTokens += outputTokens;
// Calculate cost (Sonnet 4.5 pricing)
const inputCost = (inputTokens / 1_000_000) * 3;
const outputCost = (outputTokens / 1_000_000) * 15;
this.metrics.totalCost += inputCost + outputCost;
// Track latency
this.latencies.push(latencyMs);
this.metrics.averageLatency =
this.latencies.reduce((a, b) => a + b, 0) / this.latencies.length;
}
getMetrics(): Metrics {
return { ...this.metrics };
}
reset(): void {
this.metrics = {
totalRequests: 0,
totalInputTokens: 0,
totalOutputTokens: 0,
totalCost: 0,
averageLatency: 0,
};
this.latencies = [];
}
}
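// Example usage (illustrative, not part of the original tracker): time a request
// and feed the token counts from response.usage into the tracker. Assumes
// `import { AnthropicFoundry } from '@anthropic-ai/foundry-sdk'` at the top of the file.
async function trackedRequest(
  client: AnthropicFoundry,
  tracker: MetricsTracker,
  userMessage: string
) {
  const start = Date.now();
  const response = await client.messages.create({
    model: process.env.DEFAULT_MODEL || 'claude-sonnet-4-5',
    max_tokens: 1024,
    messages: [{ role: 'user', content: userMessage }],
  });
  tracker.trackRequest(
    response.usage.input_tokens,
    response.usage.output_tokens,
    Date.now() - start
  );
  return response;
}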
export { MetricsTracker };
Conclusion
This comprehensive guide covered building production-ready Node.js applications with Claude in Azure AI Foundry. We explored environment setup, basic and advanced chat implementations, Entra ID authentication, multi-turn conversations, streaming responses, error handling with retry logic, cost optimization through prompt caching, and complete application examples.
The patterns and code examples provided form a solid foundation for building sophisticated AI applications. In Part 4, we will explore Python implementation, demonstrating how to leverage Claude's capabilities using Python's rich ecosystem and async capabilities.
