rohillasandeep 01246a98b2
Configuration Management Fixes (#223)
* feat: Add AWS Operations Agent with AgentCore Runtime

- Complete rewrite of AWS Operations Agent using Amazon Bedrock AgentCore
- Added comprehensive deployment scripts for DIY and SDK runtime modes
- Implemented OAuth2/PKCE authentication with Okta integration
- Added MCP (Model Context Protocol) tool support for AWS service operations
- Sanitized all sensitive information (account IDs, domains, client IDs) with placeholders
- Added support for 17 AWS services: EC2, S3, Lambda, CloudFormation, IAM, RDS, CloudWatch, Cost Explorer, ECS, EKS, SNS, SQS, DynamoDB, Route53, API Gateway, SES, Bedrock, SageMaker
- Includes chatbot client, gateway management scripts, and comprehensive testing
- Ready for public GitHub with security-cleared configuration files

Security: All sensitive values replaced with <YOUR_AWS_ACCOUNT_ID>, <YOUR_OKTA_DOMAIN>, <YOUR_OKTA_CLIENT_ID> placeholders

* Update AWS Operations Agent architecture diagram

* feat: Enhance AWS Operations Agent with improved testing and deployment

- Update README with new local container testing approach using run-*-local-container.sh scripts
- Replace deprecated SAM-based MCP Lambda deployment with ZIP-based deployment
- Add no-cache flag to Docker builds to ensure clean builds
- Update deployment scripts to use consolidated configuration files
- Add comprehensive cleanup scripts for all deployment components
- Improve error handling and credential validation in deployment scripts
- Add new MCP tool deployment using ZIP packaging instead of Docker containers
- Update configuration management to use dynamic-config.yaml structure
- Add local testing capabilities with containerized agents
- Remove outdated test scripts and replace with interactive chat client approach

* fix: Update IAM policy configurations

- Update bac-permissions-policy.json with enhanced permissions
- Update bac-trust-policy.json for improved trust relationships

* fix: Update Docker configurations for agent runtimes

- Update Dockerfile.diy with improved container configuration
- Update Dockerfile.sdk with enhanced build settings

* fix: Update OAuth iframe flow configuration

- Update iframe-oauth-flow.html with improved OAuth handling

* feat: Update AWS Operations Agent configuration and cleanup

- Update IAM permissions policy with enhanced access controls
- Update IAM trust policy with improved security conditions
- Enhance OAuth iframe flow with better UX and error handling
- Improve chatbot client with enhanced local testing capabilities
- Remove cache files and duplicate code for cleaner repository

* docs: Add architecture diagrams and update README

- Add architecture-2.jpg and flow.jpg diagrams for better visualization
- Update README.md with enhanced documentation and diagrams

* Save current work before resolving merge conflicts

* Keep AWS-operations-agent changes (local version takes precedence)

* Fix: Remove merge conflict markers from AWS-operations-agent files - restore clean version

* Fix deployment and cleanup script issues

Major improvements and fixes:

Configuration Management:
- Fix role assignment in gateway creation (use bac-execution-role instead of Lambda role)
- Add missing role_arn cleanup in MCP tool deletion script
- Fix OAuth provider deletion script configuration clearing
- Improve memory deletion script to preserve quote consistency
- Add Lambda invoke permissions to bac-permissions-policy.json

Script Improvements:
- Reorganize deletion scripts: 11-delete-oauth-provider.sh, 12-delete-memory.sh, 13-cleanup-everything.sh
- Fix interactive prompt handling in cleanup scripts (echo -e format)
- Add yq support with sed fallbacks for better YAML manipulation
- Remove obsolete 04-deploy-mcp-tool-lambda-zip.sh script

Architecture Fixes:
- Correct gateway role assignment to use runtime.role_arn (bac-execution-role)
- Ensure proper role separation between gateway and Lambda execution
- Fix configuration cleanup to clear all dynamic config fields consistently

Documentation:
- Update README with clear configuration instructions
- Maintain security best practices with placeholder values
- Add comprehensive deployment and cleanup guidance

These changes address systematic issues with cleanup scripts, role assignments,
and configuration management while maintaining security best practices.

* Update README.md with comprehensive documentation

Enhanced documentation includes:
- Complete project structure with 75 files
- Step-by-step deployment guide with all 13 scripts
- Clear configuration instructions with security best practices
- Dual agent architecture documentation (DIY + SDK)
- Authentication flow and security implementation details
- Troubleshooting guide and operational procedures
- Local testing and container development guidance
- Tool integration and MCP protocol documentation

The README now provides complete guidance for deploying and operating
the AWS Support Agent with Amazon Bedrock AgentCore system.

---------

Co-authored-by: name <alias@amazon.com>
2025-08-09 13:51:24 -07:00

252 lines
9.4 KiB
Bash
Executable File
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

#!/bin/bash
# Delete all AgentCore Runtimes
echo "🗑️ Deleting all AgentCore Runtimes..."
# Configuration - Get project directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_DIR="$(dirname "$(dirname "$SCRIPT_DIR")")" # Go up two levels to reach AgentCore root
RUNTIME_DIR="$(dirname "$SCRIPT_DIR")" # agentcore-runtime directory
CONFIG_DIR="${PROJECT_DIR}/config"
# Load configuration from YAML (fallback if yq not available)
if command -v yq >/dev/null 2>&1; then
REGION=$(yq eval '.aws.region' "${CONFIG_DIR}/static-config.yaml")
ACCOUNT_ID=$(yq eval '.aws.account_id' "${CONFIG_DIR}/static-config.yaml")
else
echo "⚠️ yq not found, using default values from existing config"
# Fallback: extract from YAML using grep/sed
REGION=$(grep "region:" "${CONFIG_DIR}/static-config.yaml" | head -1 | sed 's/.*region: *["'\'']*\([^"'\'']*\)["'\'']*$/\1/')
ACCOUNT_ID=$(grep "account_id:" "${CONFIG_DIR}/static-config.yaml" | head -1 | sed 's/.*account_id: *["'\'']*\([^"'\'']*\)["'\'']*$/\1/')
fi
echo "📝 Configuration:"
echo " Region: $REGION"
echo " Account ID: $ACCOUNT_ID"
echo ""
# Get AWS credentials
echo "🔐 Getting AWS credentials..."
if [ -n "$AWS_PROFILE" ]; then
echo " Using AWS profile: $AWS_PROFILE"
aws configure list --profile "$AWS_PROFILE"
else
echo " Using default AWS credentials"
aws configure list
fi
# Check AWS credentials
echo "🔍 Verifying AWS credentials..."
if aws sts get-caller-identity --region "$REGION" >/dev/null 2>&1; then
CALLER_IDENTITY=$(aws sts get-caller-identity --region "$REGION" 2>/dev/null)
CURRENT_ACCOUNT=$(echo "$CALLER_IDENTITY" | grep -o '"Account": "[^"]*"' | cut -d'"' -f4)
echo "✅ AWS credentials configured"
echo " Current Account: $CURRENT_ACCOUNT"
echo " Target Account: $ACCOUNT_ID"
if [ "$CURRENT_ACCOUNT" != "$ACCOUNT_ID" ]; then
echo "⚠️ Warning: Current account ($CURRENT_ACCOUNT) differs from config account ($ACCOUNT_ID)"
echo " Proceeding with current account credentials..."
fi
else
echo "❌ AWS credentials not configured or invalid"
echo " Please run: aws configure or aws sso login --profile <your-profile>"
echo " Current AWS config:"
aws configure list
exit 1
fi
echo ""
# Function to delete runtime
delete_runtime() {
local runtime_arn="$1"
local runtime_name="$2"
if [ -z "$runtime_arn" ]; then
echo " ⚠️ No $runtime_name runtime ARN found - skipping"
return 0
fi
# Extract runtime ID from ARN (format: arn:aws:bedrock-agentcore:region:account:runtime/runtime-id)
local runtime_id=$(echo "$runtime_arn" | sed 's|.*runtime/||')
echo "🗑️ Deleting $runtime_name runtime..."
echo " ARN: $runtime_arn"
echo " ID: $runtime_id"
# Check if runtime exists first
if ! aws bedrock-agentcore-control get-agent-runtime \
--agent-runtime-id "$runtime_id" \
--region "$REGION" >/dev/null 2>&1; then
echo " $runtime_name runtime not found or already deleted"
return 0
fi
# Delete the runtime
if aws bedrock-agentcore-control delete-agent-runtime \
--agent-runtime-id "$runtime_id" \
--region "$REGION" 2>/dev/null; then
echo "$runtime_name runtime deletion initiated"
# Wait for deletion to complete
echo " ⏳ Waiting for $runtime_name runtime deletion to complete..."
local max_attempts=30
local attempt=1
while [ $attempt -le $max_attempts ]; do
if ! aws bedrock-agentcore-control get-agent-runtime \
--agent-runtime-id "$runtime_id" \
--region "$REGION" >/dev/null 2>&1; then
echo "$runtime_name runtime deleted successfully"
return 0
fi
echo " ⏳ Attempt $attempt/$max_attempts - $runtime_name runtime still exists..."
sleep 10
((attempt++))
done
echo " ⚠️ $runtime_name runtime deletion timeout - may still be in progress"
return 1
else
echo " ❌ Failed to delete $runtime_name runtime"
echo " 💡 This might be normal if the runtime was already deleted"
return 1
fi
}
# Get runtime ARNs from dynamic configuration
echo "📖 Reading runtime ARNs from dynamic configuration..."
DYNAMIC_CONFIG="${CONFIG_DIR}/dynamic-config.yaml"
if [ -f "$DYNAMIC_CONFIG" ]; then
if command -v yq >/dev/null 2>&1; then
DIY_RUNTIME_ARN=$(yq eval '.runtime.diy_agent.arn' "$DYNAMIC_CONFIG")
SDK_RUNTIME_ARN=$(yq eval '.runtime.sdk_agent.arn' "$DYNAMIC_CONFIG")
else
# Fallback parsing
DIY_RUNTIME_ARN=$(grep -A 10 "diy_agent:" "$DYNAMIC_CONFIG" | grep "arn:" | head -1 | sed 's/.*arn: *["'\'']*\([^"'\'']*\)["'\'']*$/\1/')
SDK_RUNTIME_ARN=$(grep -A 10 "sdk_agent:" "$DYNAMIC_CONFIG" | grep "arn:" | head -1 | sed 's/.*arn: *["'\'']*\([^"'\'']*\)["'\'']*$/\1/')
fi
# Clean up empty values
[ "$DIY_RUNTIME_ARN" = "null" ] && DIY_RUNTIME_ARN=""
[ "$SDK_RUNTIME_ARN" = "null" ] && SDK_RUNTIME_ARN=""
echo " DIY Runtime ARN: ${DIY_RUNTIME_ARN:-'Not configured'}"
echo " SDK Runtime ARN: ${SDK_RUNTIME_ARN:-'Not configured'}"
else
echo " ⚠️ Dynamic configuration file not found: $DYNAMIC_CONFIG"
DIY_RUNTIME_ARN=""
SDK_RUNTIME_ARN=""
fi
echo ""
# List all existing runtimes for reference
echo "📋 Listing all existing runtimes in account..."
if aws bedrock-agentcore-control list-agent-runtimes --region "$REGION" >/dev/null 2>&1; then
runtime_list=$(aws bedrock-agentcore-control list-agent-runtimes --region "$REGION" --query 'agentRuntimes[*].{Name:agentRuntimeName,ID:agentRuntimeId,Status:status}' --output table 2>/dev/null)
if [ -n "$runtime_list" ] && echo "$runtime_list" | grep -q "agentcoreDIYTest\|bac_runtime"; then
echo "$runtime_list"
else
echo " No runtimes found in account"
fi
else
echo " ⚠️ Unable to list runtimes (may be a permissions issue)"
fi
echo ""
# Delete runtimes
echo "🗑️ Starting runtime deletion process..."
echo ""
# Delete DIY runtime
if [ -n "$DIY_RUNTIME_ARN" ]; then
delete_runtime "$DIY_RUNTIME_ARN" "DIY"
else
echo "⚠️ No DIY runtime found to delete"
fi
echo ""
# Delete SDK runtime
if [ -n "$SDK_RUNTIME_ARN" ]; then
delete_runtime "$SDK_RUNTIME_ARN" "SDK"
else
echo "⚠️ No SDK runtime found to delete"
fi
echo ""
# Update dynamic configuration to clear runtime ARNs
echo "📝 Updating dynamic configuration to clear runtime ARNs..."
if command -v yq >/dev/null 2>&1; then
yq eval '.runtime.diy_agent.arn = ""' -i "$DYNAMIC_CONFIG"
yq eval '.runtime.diy_agent.ecr_uri = ""' -i "$DYNAMIC_CONFIG"
yq eval '.runtime.diy_agent.endpoint_arn = ""' -i "$DYNAMIC_CONFIG"
yq eval '.runtime.sdk_agent.arn = ""' -i "$DYNAMIC_CONFIG"
yq eval '.runtime.sdk_agent.ecr_uri = ""' -i "$DYNAMIC_CONFIG"
yq eval '.runtime.sdk_agent.endpoint_arn = ""' -i "$DYNAMIC_CONFIG"
echo " ✅ Dynamic configuration updated - runtime ARNs and ECR URIs cleared"
else
echo " ⚠️ yq not found - using sed fallback to clear runtime fields"
# Fallback: use sed to clear the fields
sed -i '' \
-e 's|arn: "arn:aws:bedrock-agentcore:.*"|arn: ""|g' \
-e 's|ecr_uri: ".*\.dkr\.ecr\..*"|ecr_uri: ""|g' \
-e 's|endpoint_arn: "arn:aws:bedrock-agentcore:.*"|endpoint_arn: ""|g' \
"$DYNAMIC_CONFIG"
echo " ✅ Dynamic configuration updated - runtime ARNs and ECR URIs cleared"
fi
echo ""
# Optional: Clean up ECR repositories
echo "🧹 Optional: Clean up ECR repositories..."
echo " This will delete Docker images but keep the repositories"
ECR_REPOS=("bac-runtime-repo-diy" "bac-runtime-repo-sdk")
for repo in "${ECR_REPOS[@]}"; do
echo " Checking ECR repository: $repo"
# List images in repository
if aws ecr describe-images --repository-name "$repo" --region "$REGION" >/dev/null 2>&1; then
echo " 📦 Found ECR repository: $repo"
# Get image digests
image_digests=$(aws ecr list-images --repository-name "$repo" --region "$REGION" --query 'imageIds[*].imageDigest' --output text 2>/dev/null)
if [ -n "$image_digests" ] && [ "$image_digests" != "None" ]; then
echo " 🗑️ Deleting images from $repo..."
for digest in $image_digests; do
aws ecr batch-delete-image \
--repository-name "$repo" \
--image-ids imageDigest="$digest" \
--region "$REGION" >/dev/null 2>&1
done
echo " ✅ Images deleted from $repo"
else
echo " No images found in $repo"
fi
else
echo " ECR repository $repo not found or not accessible"
fi
done
echo ""
echo "✅ Runtime cleanup completed!"
echo ""
echo "📋 Summary:"
echo " • Deleted all AgentCore runtimes"
echo " • Cleared runtime ARNs from dynamic configuration"
echo " • Cleaned up ECR repository images"
echo ""
echo "💡 Next steps:"
echo " • Run 09-delete-all-gateways-targets.sh to delete gateways and targets"
echo " • Run 10-delete-mcp-tool-deployment.sh to delete MCP Lambda"
echo " • Run 11-delete-oauth-provider.sh for complete cleanup"
echo " • Run 12-delete-memory.sh to delete memory"