Amit Arora ff5fdffd42
fix(02-use-cases): Add multi-region support for SRE-Agent (#246)
* Add multi-region support for SRE-Agent

- Add AWS region configuration parameter to agent_config.yaml
- Update gateway main.py to validate region matches endpoint URL
- Modify SRE agent to read region from config and pass through function chain
- Update memory client and LLM creation to use configurable region
- Fixes hardcoded us-east-1 region dependencies

Closes #245

* Move architecture file to docs/ and improve setup instructions

- Move sre_agent_architecture.md to docs/ folder for better organization
- Update graph export code to generate architecture file in docs/ folder
- Add automatic docs directory creation if it doesn't exist
- Improve README setup instructions:
  - Fix .env.example copy path to use sre_agent folder
  - Add note that Amazon Bedrock users don't need to modify .env
  - Add START_API_BACKEND variable to conditionally start backend servers
  - Useful for workshop environments where backends are already running

* Improve gateway configuration documentation and setup instructions

- Update config.yaml.example to use REGION placeholder instead of hardcoded us-east-1
- Add gateway configuration step to README setup instructions
- Document .cognito_config file in auth.md automated setup section
- Remove duplicate credential_provider_name from config.yaml.example
- Update configuration.md to include .cognito_config in files overview
- Add clear instructions to copy and edit gateway/config.yaml before creating gateway

* Improve IAM role guidance and region handling

- Add clear guidance about IAM role options in gateway/config.yaml.example
- Explain that testing can use current EC2/notebook role
- Recommend dedicated role for production deployments
- Add aws sts get-caller-identity command to help users find their role
- Update deployment scripts to use AWS_REGION env var as fallback
- Scripts now follow: CLI arg -> AWS_REGION env var -> us-east-1 default

* Remove unnecessary individual Cognito ID files

- Remove creation of .cognito_user_pool_id file
- Remove creation of .cognito_client_id file
- Keep only .cognito_config as the single source of truth
- Simplifies configuration management

* Implement region fallback logic for SRE Agent

- Added region fallback chain: agent_config.yaml -> AWS_REGION env -> us-east-1
- Modified agent_config.yaml to comment out region parameter to enable fallback
- Updated multi_agent_langgraph.py with comprehensive fallback implementation
- Added logging to show which region source is being used
- Ensures flexible region configuration without breaking existing deployments
- Maintains backward compatibility while adding multi-region support
2025-08-13 08:32:37 -04:00

Configuration

This document provides a comprehensive overview of all configuration files used in the SRE Agent system. Configuration files are organized across different directories based on their purpose and scope.

Configuration Files Overview

| File Path | Type | Purpose | Manual Edit Required? | Auto-Generated? |
|---|---|---|---|---|
| sre_agent/.env | ENV | SRE agent-specific settings | Yes | Yes (GATEWAY_ACCESS_TOKEN by setup) |
| gateway/.env | ENV | Gateway authentication settings | Yes | No |
| gateway/config.yaml | YAML | AgentCore Gateway configuration | Yes | Partially (provider_arn by setup) |
| deployment/.env | ENV | Soft link to sre_agent/.env | No (uses sre_agent/.env) | N/A (symlink) |
| deployment/.cognito_config | ENV | Cognito configuration details | No | Yes (by setup_cognito.sh) |
| sre_agent/config/agent_config.yaml | YAML | Agent-to-tool mapping configuration | No | Yes (gateway URI by setup) |
| scripts/user_config.yaml | YAML | Script-specific user configuration | No | No |
| backend/openapi_specs/*.yaml | YAML | OpenAPI specifications for tools | No | Yes (from templates by setup) |

Setup Instructions

For files with .example versions:

  1. Copy the .example file to create the actual configuration file
  2. Edit the copied file with your environment-specific values
  3. Never commit the actual configuration files to version control

# Example setup commands
cp sre_agent/.env.example sre_agent/.env
cp gateway/.env.example gateway/.env
cp gateway/config.yaml.example gateway/config.yaml

Files Automatically Updated During Setup

The following files are automatically modified by the setup scripts:

  1. sre_agent/.env - The GATEWAY_ACCESS_TOKEN is automatically appended
  2. sre_agent/config/agent_config.yaml - The gateway.uri field is updated with the created gateway URI
  3. gateway/config.yaml - The provider_arn field is updated when creating the credential provider
  4. backend/openapi_specs/*.yaml - Generated from templates with your backend domain
  5. deployment/.cognito_config - Created by setup_cognito.sh with USER_POOL_ID, CLIENT_ID, CLIENT_SECRET, and other Cognito settings
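
The first item in the list above, adding GATEWAY_ACCESS_TOKEN to sre_agent/.env, can be sketched as follows. This is a minimal illustration, not the setup script's actual code: the function name `upsert_env_var` is hypothetical, and replacing an existing entry instead of appending a duplicate is an assumption made here.

```python
from pathlib import Path


def upsert_env_var(env_path: Path, key: str, value: str) -> None:
    """Set KEY=value in a .env file (hypothetical helper).

    Replaces an existing assignment of the key, or appends one if the
    key is not present. Commented-out lines are left untouched.
    """
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    prefix = f"{key}="
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith(prefix):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    env_path.write_text("\n".join(lines) + "\n")
```

For example, `upsert_env_var(Path("sre_agent/.env"), "GATEWAY_ACCESS_TOKEN", token)` would leave every other line of the file as it was.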

Environment Variables

The SRE Agent uses environment variables for sensitive and environment-specific configuration values. Create a .env file in the sre_agent/ directory with the following variables; the comments mark each one as required or optional:

# Required: credentials for Claude model access (choose one option)
# Option 1: Anthropic direct access
ANTHROPIC_API_KEY=sk-ant-api-key-here

# Option 2: Amazon Bedrock access
AWS_DEFAULT_REGION=us-east-1
AWS_PROFILE=your-profile-name  # Or use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

# Required: AgentCore Gateway authentication
GATEWAY_ACCESS_TOKEN=your-gateway-token-here  # Generated by gateway setup

# Optional: Debugging and logging
LOG_LEVEL=INFO  # Options: DEBUG, INFO, WARNING, ERROR
DEBUG=false     # Enable debug mode for verbose output

Note: The SRE Agent looks for the .env file in the sre_agent/ directory, not the project root. This keeps agent settings separate from the gateway and deployment settings.
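
Before starting the agent, it can help to check that the .env file contains what the section above asks for. The sketch below uses only the standard library. Both function names are hypothetical, and the rule that either ANTHROPIC_API_KEY or AWS_DEFAULT_REGION must be set is an assumption drawn from the two access options listed above, not from the agent's code.

```python
from pathlib import Path


def parse_env_file(env_path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "INFO  # Options: ..."
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_settings(values: dict[str, str]) -> list[str]:
    """Return descriptions of the required settings that are absent (assumed rules)."""
    missing = []
    if not values.get("GATEWAY_ACCESS_TOKEN"):
        missing.append("GATEWAY_ACCESS_TOKEN")
    # Assumption: one of the two model-access options must be configured.
    if not (values.get("ANTHROPIC_API_KEY") or values.get("AWS_DEFAULT_REGION")):
        missing.append("ANTHROPIC_API_KEY or AWS_DEFAULT_REGION")
    return missing
```

Calling `missing_settings(parse_env_file(Path("sre_agent/.env")))` returns an empty list when nothing is missing under these assumed rules.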

Agent Configuration

The agent behavior is configured through sre_agent/config/agent_config.yaml. This file defines the mapping between agents and their available tools, as well as LLM parameters:

# Agent to tool mapping
agents:
  kubernetes_agent:
    name: "Kubernetes Infrastructure Agent"
    description: "Specializes in Kubernetes operations and troubleshooting"
    tools:
      - get_pod_status
      - get_deployment_status
      - get_cluster_events
      - get_resource_usage
      - get_node_status

  logs_agent:
    name: "Application Logs Agent"
    description: "Expert in log analysis and pattern detection"
    tools:
      - search_logs
      - get_error_logs
      - analyze_log_patterns
      - get_recent_logs
      - count_log_events

  metrics_agent:
    name: "Performance Metrics Agent"
    description: "Analyzes performance metrics and trends"
    tools:
      - get_performance_metrics
      - get_error_rates
      - get_resource_metrics
      - get_availability_metrics
      - analyze_trends

  runbooks_agent:
    name: "Operational Runbooks Agent"
    description: "Provides operational procedures and guides"
    tools:
      - search_runbooks
      - get_incident_playbook
      - get_troubleshooting_guide
      - get_escalation_procedures
      - get_common_resolutions

# Global tools available to all agents
global_tools:
  - x-amz-bedrock-agentcore-search  # AgentCore search tool
  
# Gateway configuration
gateway:
  uri: "https://your-gateway-url.com"  # Updated during setup
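
To illustrate how the mapping above could be read, the sketch below computes the tools visible to one agent: its own `tools` list followed by the `global_tools`. The function name `tools_for_agent` is hypothetical, and the configuration is written as an inline dict so the example has no dependencies. How the agent actually combines these lists is not shown in this document, so treat this as one plausible reading.

```python
def tools_for_agent(config: dict, agent_key: str) -> list[str]:
    """Return an agent's own tools followed by the global tools, without duplicates."""
    agent = config["agents"][agent_key]
    tools = list(agent.get("tools", []))
    for tool in config.get("global_tools", []):
        if tool not in tools:
            tools.append(tool)
    return tools


# A trimmed-down stand-in for the parsed agent_config.yaml shown above.
config = {
    "agents": {
        "logs_agent": {
            "name": "Application Logs Agent",
            "tools": ["search_logs", "get_error_logs"],
        }
    },
    "global_tools": ["x-amz-bedrock-agentcore-search"],
}

print(tools_for_agent(config, "logs_agent"))
# ['search_logs', 'get_error_logs', 'x-amz-bedrock-agentcore-search']
```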

Gateway Environment Variables

The AgentCore Gateway requires additional environment variables for authentication. Create a .env file in the gateway/ directory with the following:

# Required: Backend API key for credential provider authentication
BACKEND_API_KEY=your-backend-api-key-here

# Optional: Override config.yaml values with environment variables
# ACCOUNT_ID=123456789012
# REGION=us-east-1
# ROLE_NAME=your-role-name
# GATEWAY_NAME=MyAgentCoreGateway
# CREDENTIAL_PROVIDER_NAME=sre-agent-api-key-credential-provider

Note: The BACKEND_API_KEY is used by the create_gateway.sh script to authenticate with the credential provider service.
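
The optional overrides above imply a precedence rule: a value set in the environment wins over the one in config.yaml. A minimal sketch of that rule follows. The mapping from variable names to config.yaml keys and the function name `apply_env_overrides` are assumptions for illustration; the gateway scripts may implement this differently.

```python
import os

# Assumed mapping from override variables to config.yaml keys.
ENV_OVERRIDES = {
    "ACCOUNT_ID": "account_id",
    "REGION": "region",
    "ROLE_NAME": "role_name",
    "GATEWAY_NAME": "gateway_name",
    "CREDENTIAL_PROVIDER_NAME": "credential_provider_name",
}


def apply_env_overrides(config: dict, environ=None) -> dict:
    """Return a copy of config with any non-empty override variables applied."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for env_name, config_key in ENV_OVERRIDES.items():
        value = environ.get(env_name)
        if value:  # unset or empty variables leave the config value alone
            merged[config_key] = value
    return merged
```

Passing `environ` explicitly makes the function easy to test without touching the real process environment.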

Gateway Configuration

The AgentCore Gateway is configured through gateway/config.yaml. Copy it from gateway/config.yaml.example and edit it for your environment before creating the gateway; the setup scripts then update some fields, such as provider_arn:

# AgentCore Gateway Configuration Template
# Copy this file to config.yaml and update with your environment-specific settings

# AWS Configuration
account_id: "YOUR_ACCOUNT_ID"
region: "us-east-1"
role_name: "YOUR_ROLE_NAME"
endpoint_url: "https://bedrock-agentcore-control.us-east-1.amazonaws.com"
credential_provider_endpoint_url: "https://us-east-1.prod.agent-credential-provider.cognito.aws.dev"

# Cognito Configuration
user_pool_id: "YOUR_USER_POOL_ID"
client_id: "YOUR_CLIENT_ID"

# S3 Configuration
s3_bucket: "your-agentcore-schemas-bucket"
s3_path_prefix: "devops-multiagent-demo"  # Path prefix for OpenAPI schema files

# Provider Configuration
# This ARN is automatically generated by create_gateway.sh when it runs create_credentials_provider.py
provider_arn: "arn:aws:bedrock-agentcore:REGION:ACCOUNT_ID:token-vault/default/apikeycredentialprovider/YOUR_PROVIDER_NAME"

# Gateway Configuration
gateway_name: "MyAgentCoreGateway"
gateway_description: "AgentCore Gateway for API Integration"

# Target Configuration
target_description: "S3 target for OpenAPI schema"
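
In the configuration above, the region appears both as its own field and inside the endpoint URLs, so the two can drift apart when you switch regions. The commit notes for this file mention that the gateway validates that the region matches the endpoint URL. The sketch below shows one way such a check could look. The function name `region_matches_endpoint` is hypothetical, and this is not the gateway's actual implementation.

```python
from urllib.parse import urlparse


def region_matches_endpoint(region: str, endpoint_url: str) -> bool:
    """Return True if the region is one of the dot-separated labels of the URL's host."""
    host = urlparse(endpoint_url).hostname or ""
    return region in host.split(".")


endpoint = "https://bedrock-agentcore-control.us-east-1.amazonaws.com"
print(region_matches_endpoint("us-east-1", endpoint))  # True
print(region_matches_endpoint("us-west-2", endpoint))  # False
```

Comparing whole host labels rather than substrings avoids false matches when one region name is a prefix of another string in the URL.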

Configuration File Details

SRE Agent .env File

  • Location: sre_agent/.env
  • Purpose: Agent-specific configuration separate from deployment settings
  • Setup: Copy from sre_agent/.env.example and customize
  • Auto-Updates: The setup script automatically adds GATEWAY_ACCESS_TOKEN to this file
  • Note: The agent looks for this file specifically in the sre_agent/ directory

Gateway .env File

  • Location: gateway/.env
  • Purpose: Gateway authentication and backend API configuration
  • Setup: Copy from gateway/.env.example and customize
  • Key Variables: BACKEND_API_KEY (backend API key for credential provider authentication)

Deployment .env File

  • Location: deployment/.env
  • Purpose: Symbolic link to sre_agent/.env
  • Setup: No manual setup required; the file is a symbolic link
  • Note: This symlink ensures deployment scripts use the same environment variables as the agent
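
If deployment scripts appear to read stale settings, it is worth confirming that deployment/.env is still a symlink to sre_agent/.env. The sketch below is a hypothetical diagnostic helper (`env_link_status` is not part of the project) built on the standard library's pathlib.

```python
from pathlib import Path


def env_link_status(link_path: Path, target_path: Path) -> str:
    """Classify the state of a symlink that should point at target_path."""
    if not link_path.is_symlink():
        return "not-a-symlink"  # missing, or a regular file copied in place
    if not link_path.exists():
        return "broken"  # symlink exists but its target does not
    if link_path.resolve() != target_path.resolve():
        return "wrong-target"
    return "ok"
```

For example, `env_link_status(Path("deployment/.env"), Path("sre_agent/.env"))` returns `"ok"` when the link is intact.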

Gateway Configuration (config.yaml)

  • Location: gateway/config.yaml
  • Purpose: AgentCore Gateway settings including AWS, Cognito, and S3 configuration
  • Setup: Copy from gateway/config.yaml.example and customize
  • Auto-Updates: The create_gateway.sh script automatically updates certain fields like provider_arn

Agent Configuration (agent_config.yaml)

  • Location: sre_agent/config/agent_config.yaml
  • Purpose: Defines agent-to-tool mappings and agent capabilities
  • Setup: Edit directly (no example file)
  • Auto-Updates: The setup script automatically updates the gateway.uri field with the created gateway URI
  • Content: Agent definitions, tool assignments, and global tool configurations

User Configuration File

  • Location: scripts/user_config.yaml
  • Purpose: User personas and preferences for memory-enhanced personalization
  • Setup: Edit directly to add or modify user personas
  • Content: Predefined user preferences (Alice: technical, Carol: executive)

OpenAPI Specifications

  • Location: backend/openapi_specs/*.yaml
  • Purpose: Define the API contracts for various backend services
  • Files:
    • k8s_api.yaml - Kubernetes operations API
    • logs_api.yaml - Log analysis API
    • metrics_api.yaml - Metrics collection API
    • runbooks_api.yaml - Runbook management API
  • Auto-Generation: These files are generated from templates during setup when you run generate_specs.sh
  • Note: Do not edit these directly - modify the templates instead