The SRE Agent is a multi-agent system for Site Reliability Engineers that helps investigate infrastructure issues. Built on the Model Context Protocol (MCP) and powered by Amazon Nova and Anthropic Claude models (Claude can be accessed through Amazon Bedrock or directly through Anthropic), this system uses specialized AI agents that collaborate to investigate issues, analyze logs, monitor performance metrics, and execute operational procedures. The AgentCore Gateway provides access to data sources and systems available as MCP tools. This example also demonstrates how to deploy the agent using the Amazon Bedrock AgentCore Runtime for production environments.
> **⚠️ IMPORTANT:** Amazon Bedrock AgentCore Gateway **only works with HTTPS endpoints**. You must have valid SSL certificates for your backend servers.
* Python 3.12+
*`uv` package manager for Python package management
* EC2 Instance (recommended: `t3.xlarge` or larger)
* Valid SSL certificates for HTTPS endpoints
* Either Anthropic API key or AWS credentials configured for Amazon Bedrock
* Updated OpenAPI specifications with your actual domain name
## Deploying Your Agent on Amazon Bedrock AgentCore Runtime
For production deployments, you can deploy the SRE Agent directly to Amazon Bedrock AgentCore Runtime. This provides a scalable, managed environment for running your agent with enterprise-grade security and monitoring.
The AgentCore Runtime deployment supports:
- **Container-based deployment** with automatic scaling
- **Multiple LLM providers** (Amazon Bedrock or Anthropic Claude)
- **Debug mode** for troubleshooting and development
- **Environment-based configuration** for different deployment stages
- **Secure credential management** through AWS IAM and environment variables
For complete step-by-step instructions including local testing, container building, and production deployment, see the **[Deployment Guide](docs/deployment-guide.md)**.
### Important: Domain Configuration for Development vs Git Commits
The OpenAPI specification files in `backend/openapi_specs/` use a placeholder domain `your-backend-domain.com` by default. For development, you'll need to replace this with your actual domain, but **you must revert these changes before committing to git**.
#### For Development Setup
```bash
# Replace placeholder domain with your actual domain
sed -i 's/your-backend-domain.com/your-actual-domain.com/g' backend/openapi_specs/*.yaml
```
#### Before Committing Changes
```bash
# Revert back to placeholder domain before git commit
sed -i 's/your-actual-domain.com/your-backend-domain.com/g' backend/openapi_specs/*.yaml
# Then commit your changes
git add .
git commit -m "Your commit message"
```
#### Pre-commit Hook Protection
A git pre-commit hook is installed that automatically prevents commits of OpenAPI spec files that don't contain the placeholder domain `your-backend-domain.com`. This ensures that custom domain configurations don't accidentally get committed to the repository.
If the pre-commit hook blocks your commit:
1. Check which OpenAPI spec files contain custom domains
2. Use the sed command above to revert them to the placeholder
The examples provided in this repository are for experimental and educational purposes only. They demonstrate concepts and techniques but are not intended for direct use in production environments. Make sure to have Amazon Bedrock Guardrails in place to protect against [prompt injection](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-injection.html).
**Important Note**: The data in [`backend/data`](backend/data) is synthetically generated, and the backend directory contains stub servers that showcase how a real SRE agent backend could work. In a production environment, these implementations would need to be replaced with real implementations that connect to actual systems, use vector databases, and integrate with other data sources.