Amit Arora faa5672f84 Update SRE Agent with AgentCore package updates and documentation restructure
- Updated pyproject.toml to use latest versions of boto3, botocore, awscli, and agentcore packages
- Merged main branch changes with conflict resolution
- Restructured README.md to follow template format with overview, prerequisites, setup, and execution sections
- Created detailed documentation structure in docs/ folder with specialized content files
- Updated package dependencies to use version constraints instead of local wheel files
- Removed production-specific language and focused on demo/sample implementation
- Added comprehensive documentation covering agents, configuration, deployment, and development
2025-07-16 13:23:39 +00:00

Example Use Cases

Investigating Pod Failures

sre-agent --prompt "Our database pods are crash looping in production"

The agents collaborate to check pod status, analyze events, examine memory usage trends, and provide remediation steps.
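Because the prompt is an ordinary quoted string, it can be assembled from shell variables when the same kind of investigation recurs for different workloads. The following is a minimal sketch, not part of the project: `build_prompt`, `investigate_crash_loop`, and the `SRE_AGENT_BIN` variable are hypothetical names introduced here, and only the `--prompt` flag comes from the example above.

```shell
# Sketch: build the prompt text from arguments, then pass it to sre-agent.
# SRE_AGENT_BIN, build_prompt, and investigate_crash_loop are hypothetical
# names for this example; only --prompt is taken from the documentation.

SRE_AGENT_BIN="${SRE_AGENT_BIN:-sre-agent}"

build_prompt() {
    # $1 = workload description (e.g. "database"), $2 = environment name
    printf 'Our %s pods are crash looping in %s' "$1" "$2"
}

investigate_crash_loop() {
    # Quote the substitution so the whole prompt is passed as one argument.
    "$SRE_AGENT_BIN" --prompt "$(build_prompt "$1" "$2")"
}
```

Calling `investigate_crash_loop database production` would then issue the same command as the example above.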

Diagnosing Performance Issues

sre-agent --prompt "API response times have degraded 3x in the last hour"

The system correlates metrics across multiple dimensions to identify latency sources and configuration issues.

Interactive Troubleshooting Session

sre-agent --interactive

👤 You: We're seeing intermittent 502 errors from the payment service
🤖 Multi-Agent System: Investigating intermittent 502 errors...

👤 You: What's causing the queue buildup?
🤖 Multi-Agent System: Analyzing payment queue patterns...

Interactive mode supports multi-turn conversations, which suits complex investigations where each question follows from the previous answer.

Proactive Monitoring

# Morning health check
sre-agent --prompt "Perform a comprehensive health check of all production services"

# Capacity planning
sre-agent --prompt "Analyze resource utilization trends and predict when we'll need to scale"

# Security audit
sre-agent --prompt "Check for any suspicious patterns in authentication logs"

These prompts illustrate proactive monitoring and health-check queries.
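The proactive prompts above could be batched into a single script, for instance one run from a scheduler each morning. The sketch below is an illustration under stated assumptions rather than a documented workflow: it assumes `sre-agent` writes its response to standard output and exits non-zero on failure, and the `SRE_AGENT_BIN` and `REPORT_DIR` variables and the `run_check` and `run_all_checks` functions are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch: run a fixed set of proactive prompts and save each response to a file.
# Assumptions (not confirmed by the documentation): sre-agent prints its answer
# to stdout and exits non-zero on failure. SRE_AGENT_BIN and REPORT_DIR are
# hypothetical variables introduced for this example.

SRE_AGENT_BIN="${SRE_AGENT_BIN:-sre-agent}"
REPORT_DIR="${REPORT_DIR:-./sre-reports}"

run_check() {
    # $1 = short name used in the report file name, $2 = prompt text
    name="$1"
    prompt="$2"
    out="$REPORT_DIR/$(date +%Y-%m-%d)-$name.txt"
    mkdir -p "$REPORT_DIR"
    if "$SRE_AGENT_BIN" --prompt "$prompt" > "$out" 2>&1; then
        echo "ok: $name -> $out"
    else
        echo "failed: $name (see $out)" >&2
        return 1
    fi
}

run_all_checks() {
    # Keep going after a failure so one bad check does not hide the others.
    rc=0
    run_check health "Perform a comprehensive health check of all production services" || rc=1
    run_check capacity "Analyze resource utilization trends and predict when we'll need to scale" || rc=1
    run_check security "Check for any suspicious patterns in authentication logs" || rc=1
    return "$rc"
}
```

Adding a final `run_all_checks` line to the script executes all three checks and returns a non-zero status if any of them failed, which lets a scheduler or CI job flag the run.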