Amit Arora faa5672f84 Update SRE Agent with AgentCore package updates and documentation restructure
- Updated pyproject.toml to use latest versions of boto3, botocore, awscli, and agentcore packages
- Merged main branch changes with conflict resolution
- Restructured README.md to follow template format with overview, prerequisites, setup, and execution sections
- Created detailed documentation structure in docs/ folder with specialized content files
- Updated package dependencies to use version constraints instead of local wheel files
- Removed production-specific language and focused on demo/sample implementation
- Added comprehensive documentation covering agents, configuration, deployment, and development
2025-07-16 13:23:39 +00:00


Verification of Results

The SRE Agent includes tools for verifying that investigation results are accurate and based on actual data rather than hallucinated information.

Ground Truth Verification

For result verification, we provide a data dump utility that builds a ground truth dataset from the backend data:

# Generate complete data dump for verification
cd backend/scripts
./dump_data_contents.sh

This script reads every file in the backend/data directory (including .json, .txt, and .log files) and writes their combined contents to backend/data/all_data_dump.txt. This file serves as the ground truth against which agent responses are checked for fabricated content.
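For readers who want to see what such a dump involves, here is a minimal Python sketch of the same idea. It is not the contents of dump_data_contents.sh: the function name `build_data_dump`, the `=====` header format, and the recursive directory walk are all assumptions made for illustration.

```python
from pathlib import Path

# Extensions named in the text above; the real script may handle more.
DUMP_EXTENSIONS = {".json", ".txt", ".log"}


def build_data_dump(data_dir: Path, output_file: Path) -> int:
    """Concatenate matching files under data_dir into output_file.

    Returns the number of source files written. The header format is an
    assumption for this sketch, not the actual script's format.
    """
    output_resolved = output_file.resolve()
    # Collect sources before opening the output so a stale dump from a
    # previous run is never read back into the new one.
    sources = sorted(
        p for p in data_dir.rglob("*")
        if p.is_file()
        and p.suffix.lower() in DUMP_EXTENSIONS
        and p.resolve() != output_resolved
    )
    with output_file.open("w", encoding="utf-8") as out:
        for path in sources:
            # Label each section with its path relative to the data directory.
            out.write(f"===== {path.relative_to(data_dir).as_posix()} =====\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n\n")
    return len(sources)
```

Labeling each section with its source path lets a reviewer, or a judge model, trace a claim in a report back to the file that supports it.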

Report Verification

The reports folder contains investigation reports for several example queries. You can verify these reports against the ground truth data using the LLM-as-a-judge verification system:

# Verify a specific report against ground truth
python verify_report.py --report reports/example_report.md --ground-truth backend/data/all_data_dump.txt
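The LLM-as-a-judge pattern can be sketched as follows. This is an illustrative outline only and does not reflect the internals of verify_report.py: the prompt wording, the `VERDICT:` reply convention, and the names `build_judge_prompt` and `is_report_supported` are hypothetical, and the model call is abstracted as a plain callable so that no particular LLM client is assumed.

```python
from typing import Callable

# Hypothetical prompt template; the real verifier's prompt is not shown here.
JUDGE_TEMPLATE = """You are verifying an SRE investigation report.
Flag every claim in the report that the ground truth data does not support.
Reply with "VERDICT: PASS" or "VERDICT: FAIL" on the first line,
then list any unsupported claims.

<ground_truth>
{ground_truth}
</ground_truth>

<report>
{report}
</report>
"""


def build_judge_prompt(report: str, ground_truth: str) -> str:
    """Embed the report and the ground truth dump in the judge prompt."""
    return JUDGE_TEMPLATE.format(report=report, ground_truth=ground_truth)


def is_report_supported(
    report: str,
    ground_truth: str,
    judge: Callable[[str], str],
) -> bool:
    """Ask the judge model for a verdict and parse its first line.

    `judge` is any function that sends a prompt to a model and returns
    the text reply (an assumption; plug in whichever client you use).
    """
    reply = judge(build_judge_prompt(report, ground_truth)).strip()
    first_line = reply.splitlines()[0] if reply else ""
    # Treat anything other than an explicit PASS as a failed verification.
    return first_line.upper().startswith("VERDICT: PASS")
```

Defaulting to failure when the reply is empty or malformed keeps the check conservative: a report passes only when the judge explicitly says so.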

Example Verification Workflow

# 1. Generate an investigation report
sre-agent --prompt "Why are the payment-service pods crash looping?"

# 2. Create ground truth data dump
cd backend/scripts && ./dump_data_contents.sh && cd ../..

# 3. Verify the report contains only factual information
python verify_report.py --report reports/your_report_.md --ground-truth backend/data/all_data_dump.txt

⚠️ Important Note: The system prompts and agent logic in sre_agent/agent_nodes.py require further refinement before production use. This implementation demonstrates the architectural approach and provides a foundation for building production-ready SRE agents, but the prompts, error handling, and agent coordination logic need additional tuning for real-world reliability.