mirror of
https://github.com/awslabs/amazon-bedrock-agentcore-samples.git
synced 2025-09-08 20:50:46 +00:00
* feat: Deploy SRE agent on Amazon Bedrock AgentCore Runtime - Add agent_runtime.py with FastAPI endpoints for AgentCore compatibility - Create Dockerfile for ARM64-based containerization - Add deployment scripts for automated ECR push and AgentCore deployment - Update backend API URLs from placeholders to actual endpoints - Update gateway configuration for production use - Add dependencies for AgentCore runtime support Implements #143 * chore: Add deployment artifacts to .gitignore - Add deployment/.sre_agent_uri, deployment/.env, and deployment/.agent_arn to .gitignore - Remove already tracked deployment artifacts from git * feat: Make ANTHROPIC_API_KEY optional in deployment - Update deploy_agent_runtime.py to conditionally include ANTHROPIC_API_KEY - Show info message when using Amazon Bedrock as provider - Update .env.example to clarify ANTHROPIC_API_KEY is optional - Only include ANTHROPIC_API_KEY in environment variables if it exists * fix: Use uv run python instead of python in build script - Update build_and_deploy.sh to use 'uv run python' for deployment - Change to parent directory to ensure uv environment is available - Fixes 'python: command not found' error during deployment * refactor: Improve deployment script structure and create .env symlink - Flatten nested if-else blocks in deploy_agent_runtime.py for better readability - Add 10-second sleep after deletion to ensure cleanup completes - Create symlink from deployment/.env to sre_agent/.env to avoid duplication - Move time import to top of file with other imports * feat: Add debug mode support and comprehensive deployment guide Add --debug command line flag and DEBUG environment variable support: - Created shared logging configuration module - Updated CLI and runtime to support --debug flag - Made debug traces conditional on DEBUG environment variable - Added debug mode for container and AgentCore deployments Enhanced build and deployment script: - Added command line argument for ECR repository 
name - Added help documentation and usage examples - Added support for local builds (x86_64) vs AgentCore builds (arm64) - Added environment variable pass-through for DEBUG, LLM_PROVIDER, ANTHROPIC_API_KEY Created comprehensive deployment guide: - Step-by-step instructions from local testing to production - Docker platform documentation (x86_64 vs arm64) - Environment variable configuration with .env file usage - Debug mode examples and troubleshooting guide - Provider configuration for Bedrock and Anthropic Updated README with AgentCore Runtime deployment section and documentation links. * docs: Update SRE Agent README with deployment flow diagram and fix directory reference - Fix reference from 04-SRE-agent to SRE-agent in README - Add comprehensive flowchart showing development to production deployment flow - Update overview to mention Amazon Bedrock AgentCore Runtime deployment - Remove emojis from documentation for professional appearance * docs: Replace mermaid diagram with ASCII step-by-step flow diagram - Change from block-style mermaid diagram to ASCII flow diagram - Show clear step-by-step progression from development to production - Improve readability with structured boxes and arrows - Minor text improvements for clarity * feat: Implement comprehensive prompt management system and enhance deployment guide - Create centralized prompt template system with external files in config/prompts/ - Add PromptLoader utility class with LRU caching and template variable substitution - Integrate PromptConfig into SREConstants for centralized configuration management - Update all agents (nodes, supervisor, output_formatter) to use prompt loader - Replace 150+ lines of hardcoded prompts with modular, maintainable template system - Enhance deployment guide with consistent naming (my_custom_sre_agent) throughout - Add quick-start copy-paste command sequence for streamlined deployment - Improve constants system with comprehensive model, AWS, timeout, and prompt configs - 
Add architectural assessment document to .gitignore for local analysis - Run black formatting across all updated Python files * docs: Consolidate deployment and security documentation - Rename deployment-and-security.md to security.md and remove redundant deployment content - Enhance security.md with comprehensive production security guidelines including: - Authentication and authorization best practices - Encryption and data protection requirements - Operational security monitoring and logging - Input validation and prompt security measures - Infrastructure security recommendations - Compliance and governance frameworks - Update README.md to reference new security.md file - Eliminate redundancy between deployment-guide.md and deployment-and-security.md - Improve documentation organization with clear separation of concerns * config: Replace hardcoded endpoints with placeholder domains - Update OpenAPI specifications to use placeholder domain 'your-backend-domain.com' - k8s_api.yaml: mcpgateway.ddns.net:8011 -> your-backend-domain.com:8011 - logs_api.yaml: mcpgateway.ddns.net:8012 -> your-backend-domain.com:8012 - metrics_api.yaml: mcpgateway.ddns.net:8013 -> your-backend-domain.com:8013 - runbooks_api.yaml: mcpgateway.ddns.net:8014 -> your-backend-domain.com:8014 - Update agent configuration to use placeholder AgentCore gateway endpoint - agent_config.yaml: Replace specific gateway ID with 'your-agentcore-gateway-endpoint' - Improve security by removing hardcoded production endpoints from repository - Enable template-based configuration that users can customize during setup - Align with existing documentation patterns for placeholder domain replacement
187 lines
7.0 KiB
Bash
Executable File
#!/bin/bash

# Exit on error
set -e

# Show usage if --help is passed
if [[ "$1" == "--help" ]] || [[ "$1" == "-h" ]]; then
    echo "Usage: $0 [ECR_REPO_NAME]"
    echo ""
    echo "Arguments:"
    echo "  ECR_REPO_NAME      Name for the ECR repository (default: sre_agent)"
    echo ""
    echo "Environment Variables:"
    echo "  LOCAL_BUILD        Set to 'true' for local container build without ECR push"
    echo "  PLATFORM           Set to 'x86_64' to build for local testing (default: arm64 for AgentCore)"
    echo "  DEBUG              Set to 'true' to enable debug mode in deployed agent"
    echo "  LLM_PROVIDER       Set to 'anthropic' or 'bedrock' (default: bedrock)"
    echo "  ANTHROPIC_API_KEY  Required when using anthropic provider"
    echo ""
    echo "Examples:"
    echo "  # Deploy with default repo name"
    echo "  ./build_and_deploy.sh"
    echo ""
    echo "  # Deploy with custom repo name"
    echo "  ./build_and_deploy.sh my_custom_sre_agent"
    echo ""
    echo "  # Local build for testing"
    echo "  LOCAL_BUILD=true ./build_and_deploy.sh"
    echo ""
    echo "  # Deploy with debug and anthropic provider"
    echo "  DEBUG=true LLM_PROVIDER=anthropic ./build_and_deploy.sh my_sre_agent"
    exit 0
fi

# Get the directory where this script is located
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
PARENT_DIR="$(dirname "$SCRIPT_DIR")"

# Configuration
AWS_REGION="${AWS_REGION:-us-east-1}"
ECR_REPO_NAME="${1:-sre_agent}"
RUNTIME_NAME="${RUNTIME_NAME:-$ECR_REPO_NAME}"
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ECR_REPO_URI="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$ECR_REPO_NAME"

# Platform configuration (default to ARM64 for AgentCore)
PLATFORM="${PLATFORM:-arm64}"
LOCAL_BUILD="${LOCAL_BUILD:-false}"

# Get current caller identity and construct role ARN
CALLER_IDENTITY=$(aws sts get-caller-identity --output json)
# Quote the JSON so the shell does not word-split or glob it before jq sees it
CURRENT_ARN=$(echo "$CALLER_IDENTITY" | jq -r '.Arn')

# Extract role name from ARN and construct role ARN
# This handles both assumed-role and user scenarios
if [[ "$CURRENT_ARN" == *":assumed-role/"* ]]; then
    # Extract role name from assumed role ARN
    ROLE_NAME=$(echo "$CURRENT_ARN" | sed 's/.*:assumed-role\/\([^\/]*\).*/\1/')
    ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${ROLE_NAME}"
else
    # Default role if not running with an assumed role
    ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/BedrockAgentCoreRole"
    echo "⚠️ Not running with an assumed role. Will use default role: $ROLE_ARN"
fi

# Allow override via environment variable
ROLE_ARN="${AGENT_ROLE_ARN:-$ROLE_ARN}"
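
# Illustrative sketch, not part of the original script: the ARN rewrite above
# expressed as a standalone function, so the mapping can be checked against a
# sample ARN without calling AWS. The function name role_arn_from_caller and
# the sample values in the usage comment are hypothetical.
role_arn_from_caller() {
    local caller_arn="$1" account_id="$2" role_name
    if [[ "$caller_arn" == *":assumed-role/"* ]]; then
        # arn:aws:sts::<account>:assumed-role/<RoleName>/<session> -> <RoleName>
        role_name="${caller_arn#*:assumed-role/}"
        role_name="${role_name%%/*}"
        printf 'arn:aws:iam::%s:role/%s\n' "$account_id" "$role_name"
    else
        # Same fallback as the block above when the caller is not an assumed role
        printf 'arn:aws:iam::%s:role/BedrockAgentCoreRole\n' "$account_id"
    fi
}
# Usage (sample values only):
#   role_arn_from_caller "arn:aws:sts::123456789012:assumed-role/MyRole/my-session" 123456789012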

echo "🔐 Logging in to Amazon ECR..."
aws ecr get-login-password --region "$AWS_REGION" | docker login --username AWS --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com"

# Create repository if it doesn't exist
echo "📦 Creating ECR repository if it doesn't exist..."
# describe-repositories exits non-zero when the repository is missing, which triggers the create
aws ecr describe-repositories --repository-names "$ECR_REPO_NAME" --region "$AWS_REGION" || \
    aws ecr create-repository --repository-name "$ECR_REPO_NAME" --region "$AWS_REGION"

# Determine which Dockerfile to use and set up build environment
if [ "$PLATFORM" = "x86_64" ] || [ "$LOCAL_BUILD" = "true" ]; then
    echo "🏗️ Building Docker image for linux/amd64 (x86_64)..."
    DOCKERFILE="$PARENT_DIR/Dockerfile.x86_64"
    # Force platform to linux/amd64 for x86_64 builds
    docker build --platform linux/amd64 -f "$DOCKERFILE" -t "$ECR_REPO_NAME" "$PARENT_DIR"
else
    # Set up QEMU for ARM64 emulation
    echo "🔧 Setting up QEMU for ARM64 emulation..."
    docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

    # Build the Docker image for ARM64 (Dockerfile is at root level)
    echo "🏗️ Building Docker image for linux/arm64 (this may take longer due to emulation)..."
    DOCKERFILE="$PARENT_DIR/Dockerfile"
    # Explicitly set platform for ARM64
    DOCKER_BUILDKIT=0 docker build --platform linux/arm64 -f "$DOCKERFILE" -t "$ECR_REPO_NAME" "$PARENT_DIR"
fi

# For local builds, skip ECR push and deployment
if [ "$LOCAL_BUILD" = "true" ]; then
    echo "✅ Successfully built local image: $ECR_REPO_NAME:latest"
    echo ""
    echo "📝 To run the container locally:"
    echo "docker run -p 8080:8080 --env-file $PARENT_DIR/sre_agent/.env $ECR_REPO_NAME:latest"
    echo ""
    echo "Or with AWS credentials (bedrock provider - default):"
    echo "docker run -p 8080:8080 -v ~/.aws:/root/.aws:ro -e AWS_PROFILE=default -e GATEWAY_ACCESS_TOKEN=\$GATEWAY_ACCESS_TOKEN $ECR_REPO_NAME:latest"
    echo ""
    echo "Or with Anthropic provider:"
    echo "docker run -p 8080:8080 -e LLM_PROVIDER=anthropic -e ANTHROPIC_API_KEY=\$ANTHROPIC_API_KEY -e GATEWAY_ACCESS_TOKEN=\$GATEWAY_ACCESS_TOKEN $ECR_REPO_NAME:latest"
    exit 0
fi

# Tag the image
echo "🏷️ Tagging image..."
docker tag "$ECR_REPO_NAME":latest "$ECR_REPO_URI":latest

# Push the image to ECR
echo "⬆️ Pushing image to ECR..."
docker push "$ECR_REPO_URI":latest

echo "✅ Successfully built and pushed image to:"
echo "$ECR_REPO_URI:latest"

# Save the container URI to a file in the script directory
echo "💾 Saving container URI to .sre_agent_uri file..."
echo "$ECR_REPO_URI:latest" > "$SCRIPT_DIR/.sre_agent_uri"
echo "Container URI saved to $SCRIPT_DIR/.sre_agent_uri"

# Deploy the agent runtime
echo ""
echo "🚀 Deploying agent runtime..."
echo "Using role ARN: $ROLE_ARN"
echo "Using runtime name: $RUNTIME_NAME"
echo "Using region: $AWS_REGION"

# Check if .env file exists
if [ ! -f "$SCRIPT_DIR/.env" ]; then
    echo "❌ Error: .env file not found at $SCRIPT_DIR/.env"
    echo "Please create a .env file with GATEWAY_ACCESS_TOKEN (and ANTHROPIC_API_KEY when using the anthropic provider)"
    echo "You can use .env.example as a template"
    exit 1
fi

echo ".env file found at $SCRIPT_DIR/.env"

# Deploy using the Python script
cd "$SCRIPT_DIR"

# Create a temporary file to capture output
TEMP_OUTPUT=$(mktemp)

# Log environment variables being passed
echo "🔧 Environment variables for deployment:"
echo "  DEBUG: ${DEBUG:-not set}"
echo "  LLM_PROVIDER: ${LLM_PROVIDER:-bedrock (default)}"
if [ -n "$ANTHROPIC_API_KEY" ]; then
    # Show only the last 8 characters of the key
    echo "  ANTHROPIC_API_KEY: ***...${ANTHROPIC_API_KEY: -8}"
else
    echo "  ANTHROPIC_API_KEY: not set"
fi
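
# Illustrative sketch, not part of the original script: the masking above as a
# reusable helper. The name mask_secret is hypothetical. As an added assumption,
# values of 8 characters or fewer are masked completely, so that a short value
# is never echoed in full.
mask_secret() {
    local secret="$1"
    if [ -z "$secret" ]; then
        echo "not set"
    elif [ "${#secret}" -le 8 ]; then
        echo "***"
    else
        # Keep only the last 8 characters visible
        echo "***...${secret: -8}"
    fi
}
# Usage: echo "  ANTHROPIC_API_KEY: $(mask_secret "$ANTHROPIC_API_KEY")"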

# Change to parent directory to use uv
cd "$PARENT_DIR"

# Run the Python script and capture both return code and output
# Pass through DEBUG, LLM_PROVIDER, and ANTHROPIC_API_KEY environment variables
if DEBUG="$DEBUG" LLM_PROVIDER="${LLM_PROVIDER:-bedrock}" ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" uv run python deployment/deploy_agent_runtime.py \
    --container-uri "$ECR_REPO_URI:latest" \
    --role-arn "$ROLE_ARN" \
    --runtime-name "$RUNTIME_NAME" \
    --region "$AWS_REGION" \
    --force-recreate > "$TEMP_OUTPUT" 2>&1; then

    # Success - show output
    DEPLOY_OUTPUT=$(cat "$TEMP_OUTPUT")
    echo "$DEPLOY_OUTPUT"
else
    # Failure - show error output and exit
    echo "❌ Agent runtime deployment failed!"
    echo "Error output:"
    cat "$TEMP_OUTPUT"
    rm -f "$TEMP_OUTPUT"
    exit 1
fi

# Clean up temporary file
rm -f "$TEMP_OUTPUT"

echo ""
echo "🎉 Build and deployment complete!"