amazon-bedrock-agentcore-sa.../01-tutorials/07-AgentCore-E2E/Optional-lab-agentcore-observability.ipynb
Akarsha Sehwag ce1e2d8367
Add Workshop E2E (#253)
* feat: e2e tutorial lab5

* docs: Add README.md for 05-AgentCore Observability lab

* feat: Add Lab 6 of E2E tutorial

* fix: Fix Agent ECR repository typo

* docs: Update Lab 6 Guidelines

* feat: cleanup guardrails

* docs: fix step name

* added lab4

* Add Lab 3 Identity Notebook and README

* added memory and updated lab 1

* pushing all of the helper files from original use case. Remove as needed

* feat: update lab1 helper file

* chore: restructure utils

* feat: update memory helper

* chore: restructure identity

* chore: append to agent definition from the helper

* Renamed agentcore identity to lab6

* Renamed Gateway notebook to Lab 3 and reviewed with fixes

* Fixed typo in delete_memory

* Lab 1: review and minor fixes

* Lab 1: cleanup

* Lab 2: refactored

* fix: change model to Claude 3.7

* added TODOs

* updated lab1 notebook

* update runtime intro

* refactor utils file

* minor_update to memory

* memory return client

* revert change.

* feat: update runtime lab

* feat: add helper for bedrock guardrails

* fix: fix typos

* docs: minor update

* update lab1 tools

* update memory

* update - runtime

* updated lab3 + lambda

* removed outputs

* changed sh

* removed zip

* added one missing piece

* chore: rm observability old lab

* Updates to Lab6 Identity

* Updates to Lab6 Identity

* updated arch. diagram

* update docs lab1

* rename-lab-5-6

* update arch doc

* lab 03

* fixed lab 3 docs

* Fix Lab 4

* Lab 7 frontend

* Fix lab7

* Fix prereq issues and update gitignore

* adding lab 3 tool removal

* removed checkpoints

* merged

* chore: Update Lab 4 documentation

* fix: Update AgentCore IAM Role to access memory

* Lab 7 fixed invoke to runtime

* minor changes

* removed guardrails + minor edits

* Deleting files and folders.

* Rename, Refactor and deletion

Added sagemaker_helper

* fixing Client

* Removing guardrails code

* remove unused arch

* remove unused files

* updating lab01

* remove policies

* updating lab02

* docs: Update lab 4 markdown

* chore: Update Lab 4

* update cleanup

* cleaning up DS_Store files

* frontend

* updates to lab1 notebook

* updating architectures

* Lab5: fixed response formatting in streamlit app

* updating lab3

* updated lab3

* Lab 5 and Lab 6 and Helper Scripts Updates

Lab 5: Added the architecture diagram
Lab 6: Updated the notebook
Utils: Added helper functions
Sagemaker_helper: Cosmetic Updates

* Updating lab 4

* removing clean up from lab 3

* added lab3 changes

* Streamlit Fixes, Cosmetic Updates, Notebook Updates

* add maira's changes

* update lab2+3

* minor updates

* sync labs

* fix runtime docs

* refactoring end-to-end tutorials

* remove guardrail ss

---------

Co-authored-by: Aleksei Iancheruk <aianch@amazon.fr>
Co-authored-by: EugeneSel <youdjin.sel15@gmail.com>
Co-authored-by: Aidan Ricci <riaidan@amazon.com>
Co-authored-by: Achintya <pinnintiachintya@gmail.com>
Co-authored-by: naresh rajaram <nareshrd@amazon.com>
Co-authored-by: Lorenzo Micheli <lorenzo.micheli@gmail.com>
Co-authored-by: Achintya <apinnint@amazon.com>
Co-authored-by: HT <hardikvt@amazon.com>
Co-authored-by: HT <hardik.thakkar00@gmail.com>
Co-authored-by: Maira Ladeira Tanke <mttanke@amazon.com>
2025-08-14 22:52:33 -04:00


{
"cells": [
{
"cell_type": "markdown",
"id": "6bf252a7",
"metadata": {},
"source": [
"# Lab 5: A Deeper Look at GenAI Observability for Your Customer Support Agent\n",
"\n",
"## Overview\n",
"\n",
"In this lab, you will learn how AgentCore Observability works and how to set it up for an agent that does not run on AgentCore Runtime.\n",
"\n",
"## What You'll Add\n",
"\n",
"🧠 **AgentCore Observability Features**:\n",
"- **Set up** AWS Distro for OpenTelemetry (ADOT) Python instrumentation\n",
"- **Visualize and analyze** agent traces in Amazon CloudWatch GenAI Observability\n",
"\n",
"## Tutorial Details\n",
"\n",
"| Information | Details |\n",
"|-------------|---------|\n",
"| **Tutorial type** | Incremental Enhancement |\n",
"| **Agent type** | Single Agent |\n",
"| **Agentic Framework** | Strands Agents |\n",
"| **LLM model** | Anthropic Claude 3.7 Sonnet |\n",
"| **Tutorial vertical** | Customer Support |\n",
"| **Complexity** | Easy to Moderate |\n",
"| **SDK used** | Strands SDK, AgentCore Observability, Amazon CloudWatch, Amazon Bedrock, boto3 |\n",
"\n",
"## Prerequisites\n",
"\n",
"- ✅ **Must complete Lab 1 first** - This lab builds directly on your Lab 1 agent\n",
"- ✅ **Enable Transaction Search on Amazon CloudWatch** - First-time users must enable CloudWatch Transaction Search to view Bedrock AgentCore spans and traces. To enable it, refer to the [documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Enable-TransactionSearch.html).\n",
"\n",
"## Learning Objectives\n",
"\n",
"By the end of this lab, you will:\n",
"- Use the official Amazon CloudWatch GenAI Observability Dashboard\n",
"\n",
"\n",
"---\n",
"\n",
"## 🚀 Let's Add Observability to Your Agent\n"
]
},
{
"cell_type": "markdown",
"id": "b3bead94",
"metadata": {},
"source": [
"# Step 1: Initialize Clients"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "90de7e84",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"from botocore.exceptions import ClientError\n",
"\n",
"session = boto3.Session()\n",
"region = session.region_name\n",
"\n",
"logs_client = boto3.client(\"logs\", region_name=region)\n",
"bedrock_client = boto3.client(\"bedrock\", region_name=region)\n",
"sts_client = boto3.client(\"sts\", region_name=region)\n",
"\n",
"account_id = sts_client.get_caller_identity()[\"Account\"]"
]
},
{
"cell_type": "markdown",
"id": "b31817c2",
"metadata": {},
"source": [
"Make sure `aws-opentelemetry-distro` is installed. `python-dotenv` is also needed to load the `.env` file created in Step 2."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bb3009b0",
"metadata": {},
"outputs": [],
"source": [
"%pip install strands-agents boto3 aws-opentelemetry-distro python-dotenv -q"
]
},
{
"cell_type": "markdown",
"id": "7eaca492",
"metadata": {},
"source": [
"# Step 2: Configure Environment for Observability\n",
"\n",
"To enable observability for your Strands agent and send telemetry data to Amazon CloudWatch, you need to set the following environment variables. We'll write them to a `.env` file, which keeps configuration separate from your code and makes it easy to switch between environments.\n",
"\n",
"Required Environment Variables:\n",
"\n",
"| Variable | Value | Purpose |\n",
"|----------|-------|---------|\n",
"| `OTEL_PYTHON_DISTRO` | `aws_distro` | Use AWS Distro for OpenTelemetry (ADOT) |\n",
"| `OTEL_PYTHON_CONFIGURATOR` | `aws_configurator` | Set AWS configurator for ADOT SDK |\n",
"| `OTEL_EXPORTER_OTLP_PROTOCOL` | `http/protobuf` | Configure export protocol |\n",
"| `OTEL_TRACES_EXPORTER` | `otlp` | Configure trace exporter |\n",
"| `OTEL_EXPORTER_OTLP_LOGS_HEADERS` | `x-aws-log-group=<YOUR-LOG-GROUP>,x-aws-log-stream=<YOUR-LOG-STREAM>,x-aws-metric-namespace=<YOUR-NAMESPACE>` | Direct logs to your CloudWatch log group and log stream |\n",
"| `OTEL_RESOURCE_ATTRIBUTES` | `service.name=<YOUR-AGENT-NAME>` | Identify your agent in observability data |\n",
"| `AGENT_OBSERVABILITY_ENABLED` | `true` | Activate ADOT pipeline |\n",
"\n",
"Also set the `AWS_REGION`, `AWS_DEFAULT_REGION`, and `AWS_ACCOUNT_ID` environment variables, as these are picked up by the `opentelemetry-instrument` script."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0c02eb1",
"metadata": {},
"outputs": [],
"source": [
"log_group_name = \"agents/customer-support-assistant-logs\"  # Your log group name\n",
"log_stream_name = \"default\"  # Your log stream name\n",
"\n",
"# Create log group\n",
"try:\n",
"    logs_client.create_log_group(logGroupName=log_group_name)\n",
"    print(f\"✅ Log group '{log_group_name}' created successfully\")\n",
"except ClientError as e:\n",
"    if e.response[\"Error\"][\"Code\"] == \"ResourceAlreadyExistsException\":\n",
"        print(f\" Log group '{log_group_name}' already exists\")\n",
"    else:\n",
"        print(f\"❌ Error creating log group: {e}\")\n",
"\n",
"# Create log stream\n",
"try:\n",
"    logs_client.create_log_stream(\n",
"        logGroupName=log_group_name, logStreamName=log_stream_name\n",
"    )\n",
"    print(f\"✅ Log stream '{log_stream_name}' created successfully\")\n",
"except ClientError as e:\n",
"    if e.response[\"Error\"][\"Code\"] == \"ResourceAlreadyExistsException\":\n",
"        print(f\" Log stream '{log_stream_name}' already exists\")\n",
"    else:\n",
"        print(f\"❌ Error creating log stream: {e}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bbe8e057",
"metadata": {},
"outputs": [],
"source": [
"# Create .env file\n",
"service_name = \"customer-support-assistant-strands\"\n",
"\n",
"with open(\".env\", \"w\") as f:\n",
"    # AWS Configuration\n",
"    f.write(f\"AWS_REGION={region}\\n\")\n",
"    f.write(f\"AWS_DEFAULT_REGION={region}\\n\")\n",
"    f.write(f\"AWS_ACCOUNT_ID={account_id}\\n\")\n",
"\n",
"    # OpenTelemetry Configuration for AWS CloudWatch GenAI Observability\n",
"    f.write(\"OTEL_PYTHON_DISTRO=aws_distro\\n\")\n",
"    f.write(\"OTEL_PYTHON_CONFIGURATOR=aws_configurator\\n\")\n",
"    f.write(\"OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf\\n\")\n",
"    f.write(\"OTEL_TRACES_EXPORTER=otlp\\n\")\n",
"    f.write(\n",
"        f\"OTEL_EXPORTER_OTLP_LOGS_HEADERS=x-aws-log-group={log_group_name},x-aws-log-stream={log_stream_name},x-aws-metric-namespace=agents\\n\"\n",
"    )\n",
"    f.write(f\"OTEL_RESOURCE_ATTRIBUTES=service.name={service_name}\\n\")\n",
"    f.write(\"AGENT_OBSERVABILITY_ENABLED=true\\n\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4e1a44ab",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from dotenv import load_dotenv\n",
"\n",
"# Load environment variables from .env file\n",
"load_dotenv()\n",
"\n",
"# Display the OTEL-related environment variables\n",
"otel_vars = [\n",
"    \"OTEL_PYTHON_DISTRO\",\n",
"    \"OTEL_PYTHON_CONFIGURATOR\",\n",
"    \"OTEL_EXPORTER_OTLP_PROTOCOL\",\n",
"    \"OTEL_EXPORTER_OTLP_LOGS_HEADERS\",\n",
"    \"OTEL_RESOURCE_ATTRIBUTES\",\n",
"    \"AGENT_OBSERVABILITY_ENABLED\",\n",
"    \"OTEL_TRACES_EXPORTER\",\n",
"]\n",
"\n",
"print(\"OpenTelemetry Configuration:\\n\")\n",
"for var in otel_vars:\n",
"    value = os.getenv(var)\n",
"    if value:\n",
"        print(f\"{var}={value}\")"
]
},
{
"cell_type": "markdown",
"id": "70d209db",
"metadata": {},
"source": [
"# Step 3: Define Strands Agent"
]
},
{
"cell_type": "markdown",
"id": "fd3f18d6",
"metadata": {},
"source": [
"Now, let's redefine the same agent as in Lab 1.\n",
"\n",
"To demonstrate that traces are created, we'll pass a simple greeting query to the agent.\n",
"\n",
"We'll also attach the session ID to the OpenTelemetry context so that traces can be grouped by session."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e1c0d2b3",
"metadata": {},
"outputs": [],
"source": [
"!cp lab_helpers/lab1_strands_agent.py customer_support_agent.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc72a7f8",
"metadata": {},
"outputs": [],
"source": [
"%%writefile -a customer_support_agent.py\n",
"\n",
"import argparse\n",
"from boto3.session import Session\n",
"from opentelemetry import baggage, context\n",
"\n",
"from strands import Agent\n",
"from strands.models import BedrockModel\n",
"\n",
"\n",
"def parse_arguments():\n",
"    parser = argparse.ArgumentParser(description=\"Customer Support Agent\")\n",
"    parser.add_argument(\n",
"        \"--session-id\",\n",
"        type=str,\n",
"        required=True,\n",
"        help=\"Session ID to associate with this agent run\",\n",
"    )\n",
"    return parser.parse_args()\n",
"\n",
"\n",
"def set_session_context(session_id):\n",
"    \"\"\"Set the session ID in OpenTelemetry baggage for trace correlation\"\"\"\n",
"    ctx = baggage.set_baggage(\"session.id\", session_id)\n",
"    token = context.attach(ctx)\n",
"    print(f\"Session ID '{session_id}' attached to telemetry context\")\n",
"    return token\n",
"\n",
"\n",
"def main():\n",
"    # Parse command line arguments\n",
"    args = parse_arguments()\n",
"\n",
"    # Set session context for telemetry\n",
"    context_token = set_session_context(args.session_id)\n",
"\n",
"    # Get region\n",
"    boto_session = Session()\n",
"    region = boto_session.region_name\n",
"\n",
"    try:\n",
"        # Create the same basic agent from Lab 1\n",
"        # (MODEL_ID, SYSTEM_PROMPT, and the tools come from the Lab 1 helper code above)\n",
"        MODEL = BedrockModel(\n",
"            model_id=MODEL_ID,\n",
"            temperature=0.3,\n",
"            region_name=region,\n",
"        )\n",
"\n",
"        basic_agent = Agent(\n",
"            model=MODEL,\n",
"            tools=[\n",
"                get_product_info,\n",
"                get_return_policy,\n",
"            ],\n",
"            system_prompt=SYSTEM_PROMPT,\n",
"        )\n",
"\n",
"        # Run a simple query so that a trace is generated\n",
"        query = \"\"\"Greet the user and provide financial advice.\"\"\"\n",
"\n",
"        result = basic_agent(query)\n",
"        print(\"Result:\", result)\n",
"\n",
"        print(\"✅ Agent executed successfully and trace was pushed to CloudWatch\")\n",
"    finally:\n",
"        # Detach context when done\n",
"        context.detach(context_token)\n",
"\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()"
]
},
{
"cell_type": "markdown",
"id": "af53c71b",
"metadata": {},
"source": [
"# Step 4: AWS OpenTelemetry Python Distro\n",
"\n",
"Now that your environment is configured and your agent is created, let's look at how observability works. The [AWS OpenTelemetry Python Distro](https://pypi.org/project/aws-opentelemetry-distro/) automatically instruments your Strands agent to capture telemetry data without requiring code changes.\n",
"\n",
"This distribution provides:\n",
"- **Auto-instrumentation** for your Strands agent hosted outside of AgentCore Runtime (e.g., on Amazon EC2 or AWS Lambda)\n",
"- **AWS-optimized configuration** for seamless CloudWatch integration\n",
"\n",
"### Running Your Instrumented Agent\n",
"\n",
"To capture traces from your Strands agent, use the `opentelemetry-instrument` command instead of running Python directly. This applies instrumentation automatically, using the environment variables you loaded from your `.env` file:\n",
"\n",
"```bash\n",
"opentelemetry-instrument python customer_support_agent.py --session-id \"session-1234\"\n",
"```\n",
"\n",
"This command will:\n",
"\n",
"- Read your OTEL configuration from the environment variables loaded from the `.env` file\n",
"- Automatically instrument Strands, Amazon Bedrock calls, agent tool and database calls, and other requests made by the agent\n",
"- Send traces to CloudWatch\n",
"- Enable you to visualize the agent's decision-making process in the GenAI Observability dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5441912",
"metadata": {},
"outputs": [],
"source": [
"!opentelemetry-instrument python customer_support_agent.py --session-id \"session-1234\""
]
},
{
"cell_type": "markdown",
"id": "47c23fa6",
"metadata": {},
"source": [
"# Step 5: Viewing Traces in GenAI Observability\n",
"\n",
"Now that we have configured observability, let's check the traces in the Amazon CloudWatch GenAI Observability dashboard. In the console, navigate to CloudWatch > GenAI Observability > Bedrock AgentCore.\n",
"\n",
"#### Sessions View Page:\n",
"\n",
"![sessions](images/sessions_lab5_observability.png)\n",
"\n",
"#### Traces View Page:\n",
"\n",
"![traces](images/traces_lab5_observability.png)\n"
]
},
{
"cell_type": "markdown",
"id": "b677cb17",
"metadata": {},
"source": [
"## Congratulations! 🎉\n",
"\n",
"You have successfully **implemented AgentCore Observability with a Strands agent** (without AgentCore Runtime)!\n",
"\n",
"### What You Accomplished:\n",
"\n",
"- ✅ **Observability**: Configured your Strands agent to send telemetry data to Amazon CloudWatch\n",
"- ✅ **Session management**: Ensured traces are grouped by session for easier debugging\n",
"\n",
"## Next Steps\n",
"\n",
"Ready to add more AgentCore capabilities? Continue with:\n",
"\n",
"- **Lab 6**: Securely authenticate with external services using AgentCore Identity\n",
"\n",
"## Resources\n",
"\n",
"- [AgentCore Observability Documentation](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html)\n",
"- [**Official AgentCore Observability Samples**](https://github.com/awslabs/amazon-bedrock-agentcore-samples/tree/main/01-tutorials/06-AgentCore-observability) ⭐\n",
"\n",
"---\n",
"\n",
"**Excellent work! You can now trace, debug, and monitor your customer support agent's performance in production environments! 🚀**\n"
]
},
{
"cell_type": "markdown",
"id": "36136b26",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}