Amit Arora ff5fdffd42
fix(02-use-cases): Add multi-region support for SRE-Agent (#246)
* Add multi-region support for SRE-Agent

- Add AWS region configuration parameter to agent_config.yaml
- Update gateway main.py to validate region matches endpoint URL
- Modify SRE agent to read region from config and pass through function chain
- Update memory client and LLM creation to use configurable region
- Fixes hardcoded us-east-1 region dependencies

Closes #245

* Move architecture file to docs/ and improve setup instructions

- Move sre_agent_architecture.md to docs/ folder for better organization
- Update graph export code to generate architecture file in docs/ folder
- Add automatic docs directory creation if it doesn't exist
- Improve README setup instructions:
  - Fix .env.example copy path to use sre_agent folder
  - Add note that Amazon Bedrock users don't need to modify .env
  - Add START_API_BACKEND variable to conditionally start backend servers
  - Useful for workshop environments where backends are already running

* Improve gateway configuration documentation and setup instructions

- Update config.yaml.example to use REGION placeholder instead of hardcoded us-east-1
- Add gateway configuration step to README setup instructions
- Document .cognito_config file in auth.md automated setup section
- Remove duplicate credential_provider_name from config.yaml.example
- Update configuration.md to include .cognito_config in files overview
- Add clear instructions to copy and edit gateway/config.yaml before creating gateway

* Improve IAM role guidance and region handling

- Add clear guidance about IAM role options in gateway/config.yaml.example
- Explain that testing can use current EC2/notebook role
- Recommend dedicated role for production deployments
- Add aws sts get-caller-identity command to help users find their role
- Update deployment scripts to use AWS_REGION env var as fallback
- Scripts now follow: CLI arg -> AWS_REGION env var -> us-east-1 default

* Remove unnecessary individual Cognito ID files

- Remove creation of .cognito_user_pool_id file
- Remove creation of .cognito_client_id file
- Keep only .cognito_config as the single source of truth
- Simplifies configuration management

* Implement region fallback logic for SRE Agent

- Added region fallback chain: agent_config.yaml -> AWS_REGION env -> us-east-1
- Modified agent_config.yaml to comment out region parameter to enable fallback
- Updated multi_agent_langgraph.py with comprehensive fallback implementation
- Added logging to show which region source is being used
- Ensures flexible region configuration without breaking existing deployments
- Maintains backward compatibility while adding multi-region support
2025-08-13 08:32:37 -04:00

# Configuration
This document provides a comprehensive overview of all configuration files used in the SRE Agent system. Configuration files are organized across different directories based on their purpose and scope.
## Configuration Files Overview
| File Path | Type | Purpose | Manual Edit Required? | Auto-Generated? |
|-----------|------|---------|----------------------|-----------------|
| `sre_agent/.env` | ENV | SRE agent-specific settings | Yes | Partially (GATEWAY_ACCESS_TOKEN by [setup](../README.md#use-case-setup)) |
| `gateway/.env` | ENV | Gateway authentication settings | Yes | No |
| `gateway/config.yaml` | YAML | AgentCore Gateway configuration | Yes | Partially (provider_arn by [setup](../README.md#use-case-setup)) |
| `deployment/.env` | ENV | Soft link to `sre_agent/.env` | No (uses sre_agent/.env) | N/A (symlink) |
| `deployment/.cognito_config` | ENV | Cognito configuration details | No | Yes (by [setup_cognito.sh](../deployment/setup_cognito.sh)) |
| `sre_agent/config/agent_config.yaml` | YAML | Agent-to-tool mapping configuration | No | Partially (gateway URI by [setup](../README.md#use-case-setup)) |
| `scripts/user_config.yaml` | YAML | Script-specific user configuration | No | No |
| `backend/openapi_specs/*.yaml` | YAML | OpenAPI specifications for tools | No | Yes (from templates by [setup](../README.md#use-case-setup)) |
### Setup Instructions
For files with `.example` versions:
1. Copy the `.example` file to create the actual configuration file
2. Edit the copied file with your environment-specific values
3. Never commit the actual configuration files to version control
```bash
# Example setup commands
cp sre_agent/.env.example sre_agent/.env
cp gateway/.env.example gateway/.env
cp gateway/config.yaml.example gateway/config.yaml
```
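Because setup later writes values such as `GATEWAY_ACCESS_TOKEN` into the copied files, re-running a plain `cp` can overwrite them. The following is a minimal Python sketch of the same copy step that skips any target that already exists. The helper name `copy_examples` and the `EXAMPLE_FILES` list are illustrative and not part of the repository.

```python
import shutil
from pathlib import Path

# The three example files named in the setup commands above.
EXAMPLE_FILES = [
    "sre_agent/.env.example",
    "gateway/.env.example",
    "gateway/config.yaml.example",
]


def copy_examples(root: Path) -> list:
    """Copy each *.example file to its real name unless the target exists."""
    created = []
    for rel in EXAMPLE_FILES:
        src = root / rel
        # "config.yaml.example" -> "config.yaml", ".env.example" -> ".env"
        dst = src.with_name(src.name[: -len(".example")])
        if src.is_file() and not dst.exists():
            shutil.copyfile(src, dst)
            created.append(dst)
    return created
```

Calling `copy_examples(Path("."))` from the use-case root returns the list of files it created, so a second run returns an empty list and leaves edited files untouched.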
### Files Automatically Updated During Setup
The following files are automatically modified by the setup scripts:
1. **`sre_agent/.env`** - The `GATEWAY_ACCESS_TOKEN` is automatically appended
2. **`sre_agent/config/agent_config.yaml`** - The `gateway.uri` field is updated with the created gateway URI
3. **`gateway/config.yaml`** - The `provider_arn` field is updated when creating the credential provider
4. **`backend/openapi_specs/*.yaml`** - Generated from templates with your backend domain
5. **`deployment/.cognito_config`** - Created by `setup_cognito.sh` with USER_POOL_ID, CLIENT_ID, CLIENT_SECRET, and other Cognito settings
## Environment Variables
The SRE Agent uses environment variables for sensitive configuration values. Create a `.env` file in the `sre_agent/` directory with the following variables; the comments in the example mark each group as required or optional:
```bash
# Required: Claude model access (configure one of the two providers below)
# For Anthropic direct access:
ANTHROPIC_API_KEY=sk-ant-api-key-here
# For Amazon Bedrock access:
AWS_DEFAULT_REGION=us-east-1
AWS_PROFILE=your-profile-name # Or use AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
# Required: AgentCore Gateway authentication
GATEWAY_ACCESS_TOKEN=your-gateway-token-here # Generated by gateway setup
# Optional: Debugging and logging
LOG_LEVEL=INFO # Options: DEBUG, INFO, WARNING, ERROR
DEBUG=false # Enable debug mode for verbose output
```
**Note**: The SRE Agent looks for the `.env` file in the `sre_agent/` directory, not the project root. This keeps agent settings separate from the gateway and deployment settings.
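To illustrate the lookup location, here is a minimal standard-library sketch that reads `sre_agent/.env` relative to a project root and parses `KEY=VALUE` lines, including the trailing `# ...` comments used in the example above. The agent's actual loading code is not shown in this document and may use a dedicated library instead; `parse_env_text` and `load_agent_env` are hypothetical names.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "INFO # Options: ...".
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def load_agent_env(project_root: Path) -> dict:
    """Read sre_agent/.env under project_root; return {} if it is missing."""
    env_path = project_root / "sre_agent" / ".env"
    if not env_path.is_file():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))
```

This simplified parser does not handle values that themselves contain ` #`, so treat it as a reading aid rather than a replacement for a full `.env` loader.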
## Agent Configuration
The agent behavior is configured through `sre_agent/config/agent_config.yaml`. This file defines the mapping between agents and their available tools, as well as LLM parameters:
```yaml
# Agent to tool mapping
agents:
  kubernetes_agent:
    name: "Kubernetes Infrastructure Agent"
    description: "Specializes in Kubernetes operations and troubleshooting"
    tools:
      - get_pod_status
      - get_deployment_status
      - get_cluster_events
      - get_resource_usage
      - get_node_status
  logs_agent:
    name: "Application Logs Agent"
    description: "Expert in log analysis and pattern detection"
    tools:
      - search_logs
      - get_error_logs
      - analyze_log_patterns
      - get_recent_logs
      - count_log_events
  metrics_agent:
    name: "Performance Metrics Agent"
    description: "Analyzes performance metrics and trends"
    tools:
      - get_performance_metrics
      - get_error_rates
      - get_resource_metrics
      - get_availability_metrics
      - analyze_trends
  runbooks_agent:
    name: "Operational Runbooks Agent"
    description: "Provides operational procedures and guides"
    tools:
      - search_runbooks
      - get_incident_playbook
      - get_troubleshooting_guide
      - get_escalation_procedures
      - get_common_resolutions
# Global tools available to all agents
global_tools:
  - x-amz-bedrock-agentcore-search  # AgentCore search tool
# Gateway configuration
gateway:
  uri: "https://your-gateway-url.com"  # Updated during setup
```
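The commit notes at the top of this page describe a region fallback chain for the agent: `agent_config.yaml`, then the `AWS_REGION` environment variable, then `us-east-1`. The sketch below shows one way that chain could look. It assumes the optional parameter is a top-level `region` key, which the example above does not show, and that the YAML has already been parsed into a dictionary. The function name `resolve_region` is hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_REGION = "us-east-1"


def resolve_region(
    agent_config: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (region, source): agent_config.yaml -> AWS_REGION -> default."""
    env = os.environ if env is None else env
    # Assumed key name: a top-level "region" entry in agent_config.yaml.
    configured = agent_config.get("region")
    if isinstance(configured, str) and configured.strip():
        return configured.strip(), "agent_config.yaml"
    from_env = env.get("AWS_REGION", "").strip()
    if from_env:
        return from_env, "AWS_REGION"
    return DEFAULT_REGION, "default"
```

Returning the source alongside the region makes it easy to log which step of the chain supplied the value, as the commit notes say the agent does.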
## Gateway Environment Variables
The AgentCore Gateway requires additional environment variables for authentication. Create a `.env` file in the `gateway/` directory with the following:
```bash
# Required: Backend API key for credential provider authentication
BACKEND_API_KEY=your-backend-api-key-here
# Optional: Override config.yaml values with environment variables
# ACCOUNT_ID=123456789012
# REGION=us-east-1
# ROLE_NAME=your-role-name
# GATEWAY_NAME=MyAgentCoreGateway
# CREDENTIAL_PROVIDER_NAME=sre-agent-api-key-credential-provider
```
**Note**: The `BACKEND_API_KEY` is used by the `create_gateway.sh` script to authenticate with the credential provider service.
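The optional overrides listed in the example can be pictured as a merge in which a non-empty environment variable wins over the value from `config.yaml`. The following is a minimal sketch of that idea, not the gateway scripts' actual code. The mapping from variable names to lowercase config keys is an assumption based on the key names used in `gateway/config.yaml`, and `apply_env_overrides` is a hypothetical helper.

```python
from typing import Dict, Mapping

# Environment variable -> config.yaml key (key names are assumed).
OVERRIDES = {
    "ACCOUNT_ID": "account_id",
    "REGION": "region",
    "ROLE_NAME": "role_name",
    "GATEWAY_NAME": "gateway_name",
    "CREDENTIAL_PROVIDER_NAME": "credential_provider_name",
}


def apply_env_overrides(
    config: Mapping[str, str], env: Mapping[str, str]
) -> Dict[str, str]:
    """Return a copy of config with non-empty environment values applied."""
    merged = dict(config)
    for env_name, key in OVERRIDES.items():
        value = env.get(env_name, "").strip()
        if value:
            merged[key] = value
    return merged
```

For example, `apply_env_overrides(cfg, os.environ)` would leave `cfg` itself unchanged and return the effective settings.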
## Gateway Configuration
The AgentCore Gateway is configured through `gateway/config.yaml`. Copy `gateway/config.yaml.example` to `gateway/config.yaml` and edit it before creating the gateway; the setup scripts then update generated fields such as `provider_arn`:
```yaml
# AgentCore Gateway Configuration Template
# Copy this file to config.yaml and update with your environment-specific settings
# AWS Configuration
account_id: "YOUR_ACCOUNT_ID"
region: "us-east-1"
role_name: "YOUR_ROLE_NAME"
endpoint_url: "https://bedrock-agentcore-control.us-east-1.amazonaws.com"
credential_provider_endpoint_url: "https://us-east-1.prod.agent-credential-provider.cognito.aws.dev"
# Cognito Configuration
user_pool_id: "YOUR_USER_POOL_ID"
client_id: "YOUR_CLIENT_ID"
# S3 Configuration
s3_bucket: "your-agentcore-schemas-bucket"
s3_path_prefix: "devops-multiagent-demo" # Path prefix for OpenAPI schema files
# Provider Configuration
# This ARN is automatically generated by create_gateway.sh when it runs create_credentials_provider.py
provider_arn: "arn:aws:bedrock-agentcore:REGION:ACCOUNT_ID:token-vault/default/apikeycredentialprovider/YOUR_PROVIDER_NAME"
# Gateway Configuration
gateway_name: "MyAgentCoreGateway"
gateway_description: "AgentCore Gateway for API Integration"
# Target Configuration
target_description: "S3 target for OpenAPI schema"
```
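The commit notes at the top of this page mention that the gateway validates that the configured region matches the endpoint URL. A minimal sketch of such a check follows, assuming the region appears as one dot-separated label of the hostname, as it does in the example above. Checking `credential_provider_endpoint_url` as well is an assumption, and both function names are hypothetical.

```python
from urllib.parse import urlparse


def endpoint_matches_region(endpoint_url: str, region: str) -> bool:
    """True when region is one of the dot-separated labels of the host."""
    host = urlparse(endpoint_url).hostname or ""
    return region in host.split(".")


def check_gateway_config(config: dict) -> list:
    """Return human-readable problems; an empty list means consistent."""
    problems = []
    region = config.get("region", "")
    # Assumption: both endpoint fields embed the region in their hostname.
    for key in ("endpoint_url", "credential_provider_endpoint_url"):
        url = config.get(key, "")
        if url and not endpoint_matches_region(url, region):
            problems.append(f"{key} does not contain region {region!r}: {url}")
    return problems
```

Running `check_gateway_config` on the parsed `config.yaml` before creating the gateway would catch the common mistake of changing `region` while leaving a `us-east-1` endpoint in place.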
## Configuration File Details
### SRE Agent `.env` File
- **Location**: `sre_agent/.env`
- **Purpose**: Agent-specific configuration separate from deployment settings
- **Setup**: Copy from `sre_agent/.env.example` and customize
- **Auto-Updates**: The setup script automatically adds `GATEWAY_ACCESS_TOKEN` to this file
- **Note**: The agent looks for this file specifically in the `sre_agent/` directory
### Gateway `.env` File
- **Location**: `gateway/.env`
- **Purpose**: Gateway authentication and backend API configuration
- **Setup**: Copy from `gateway/.env.example` and customize
- **Key Variables**: `BACKEND_API_KEY`, used for credential provider authentication
### Deployment `.env` File
- **Location**: `deployment/.env`
- **Purpose**: Symbolic link to `sre_agent/.env`
- **Setup**: No manual setup required - this is a soft link
- **Note**: This symlink ensures deployment scripts use the same environment variables as the agent
### Gateway Configuration (`config.yaml`)
- **Location**: `gateway/config.yaml`
- **Purpose**: AgentCore Gateway settings including AWS, Cognito, and S3 configuration
- **Setup**: Copy from `gateway/config.yaml.example` and customize
- **Auto-Updates**: The `create_gateway.sh` script automatically updates certain fields like `provider_arn`
### Agent Configuration (`agent_config.yaml`)
- **Location**: `sre_agent/config/agent_config.yaml`
- **Purpose**: Defines agent-to-tool mappings and agent capabilities
- **Setup**: Edit directly (no example file)
- **Auto-Updates**: The setup script automatically updates the `gateway.uri` field with the created gateway URI
- **Content**: Agent definitions, tool assignments, and global tool configurations
### User Configuration File
- **Location**: `scripts/user_config.yaml`
- **Purpose**: User personas and preferences for memory-enhanced personalization
- **Setup**: Edit directly to add or modify user personas
- **Content**: Predefined user preferences (Alice: technical, Carol: executive)
### OpenAPI Specifications
- **Location**: `backend/openapi_specs/*.yaml`
- **Purpose**: Define the API contracts for various backend services
- **Files**:
  - `k8s_api.yaml` - Kubernetes operations API
  - `logs_api.yaml` - Log analysis API
  - `metrics_api.yaml` - Metrics collection API
  - `runbooks_api.yaml` - Runbook management API
- **Auto-Generation**: These files are generated from templates during setup when you run `generate_specs.sh`
- **Note**: Do not edit these directly - modify the templates instead