amazon-bedrock-agentcore-sa.../01-tutorials/07-AgentCore-E2E/Optional-lab-agentcore-observability.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6bf252a7",
   "metadata": {},
   "source": [
    "# Lab 5: Deeper look at GenAI Observability for Your Customer Support Agent\n",
    "\n",
    "## Overview\n",
    "\n",
    "In this lab, you will understand how AgentCore Observability works and how to set it up without using AgentCore Runtime.\n",
    "\n",
    "## What You'll Add\n",
    "\n",
    "🧠 **AgentCore Observability Features**:\n",
    "- **Set up** Amazon OpenTelemetry Python Instrumentation  \n",
    "- **Visualize and analyze** agent traces in Amazon CloudWatch GenAI Observability\n",
    "\n",
    "## Tutorial Details\n",
    "\n",
    "| Information | Details |\n",
    "|-------------|---------|\n",
    "| **Tutorial type** | Incremental Enhancement |\n",
    "| **Agent type** | Single Agent |\n",
    "| **Agentic Framework** | Strands Agents |\n",
    "| **LLM model** | Anthropic Claude 3.7 Sonnet |\n",
    "| **Tutorial vertical** | Customer Support |\n",
    "| **Complexity** | Easy to Moderate |\n",
    "| **SDK used** | Strands SDK, AgentCore Observability, Cloudwatch, Bedrock, boto3 |\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "- ✅ **Must complete Lab 1 first** - This lab builds directly on your Lab 1 agent \n",
    "- ✅ **Enable transaction search on Amazon CloudWatch** - First-time users must enable CloudWatch Transaction Search to view Bedrock AgentCore spans and traces. To enable transaction search, please refer to the our [documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Enable-TransactionSearch.html).\n",
    "\n",
    "## Learning Objectives\n",
    "\n",
    "By the end of this lab, you will:\n",
    "- Use the official Amazon CloudWatch GenAI Observability Dashboard\n",
    "\n",
    "\n",
    "---\n",
    "\n",
    "## 🚀 Let's Add Observability to your agent\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b3bead94",
   "metadata": {},
   "source": [
    "Initialize clients"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "90de7e84",
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "from botocore.exceptions import ClientError\n",
    "\n",
    "session = boto3.Session()\n",
    "region = session.region_name\n",
    "\n",
    "logs_client = boto3.client(\"logs\", region_name=region)\n",
    "bedrock_client = boto3.client(\"bedrock\", region_name=region)\n",
    "sts_client = boto3.client(\"sts\", region_name=region)\n",
    "\n",
    "account_id = sts_client.get_caller_identity()[\"Account\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b31817c2",
   "metadata": {},
   "source": [
    "Make sure to have `aws-opentelemetry-distro` installed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb3009b0",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install strands-agents boto3 aws-opentelemetry-distro -q"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7eaca492",
   "metadata": {},
   "source": [
    "# Step 2: Configure Environment for Observability\n",
    "\n",
    "To enable observability for your Strands agent and send telemetry data to Amazon CloudWatch, you'll need to configure the following environment variables. We'll create a `.env` file to manage these settings securely, keeping sensitive AWS credentials separate from your code while making it easy to switch between different environments.\n",
    "\n",
    "Required Environment Variables:\n",
    "\n",
    "| Variable | Value | Purpose |\n",
    "|----------|-------|---------|\n",
    "| `OTEL_PYTHON_DISTRO` | `aws_distro` | Use AWS Distro for OpenTelemetry (ADOT) |\n",
    "| `OTEL_PYTHON_CONFIGURATOR` | `aws_configurator` | Set AWS configurator for ADOT SDK |\n",
    "| `OTEL_EXPORTER_OTLP_PROTOCOL` | `http/protobuf` | Configure export protocol |\n",
    "| `OTEL_TRACES_EXPORTER` | `otlp` | Configure trace exporter |\n",
    "| `OTEL_EXPORTER_OTLP_LOGS_HEADERS` | `x-aws-log-group=<YOUR-LOG-GROUP>,x-aws-log-stream=<YOUR-LOG-STREAM>,x-aws-metric-namespace=<YOUR-NAMESPACE>` | Direct logs to CloudWatch groups |\n",
    "| `OTEL_RESOURCE_ATTRIBUTES` | `service.name=<YOUR-AGENT-NAME>` | Identify your agent in observability data |\n",
    "| `AGENT_OBSERVABILITY_ENABLED` | `true` | Activate ADOT pipeline |\n",
    "\n",
    "Also, ensure you set `AWS_REGION`, `AWS_DEFAULT_REGION` and `AWS_ACCOUNT_ID` environment variables as these will be picked up by the opentelemetry instrument script."
   ]
  },
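  {
   "cell_type": "markdown",
   "id": "a4d8c1e2",
   "metadata": {},
   "source": [
    "The `OTEL_EXPORTER_OTLP_LOGS_HEADERS` value listed in the table is a comma-separated list of `key=value` pairs. As a minimal sketch, the string could be assembled as follows. The helper name `build_logs_headers` is hypothetical and not part of any SDK; the example values match the ones used later in this lab.\n",
    "\n",
    "```python\n",
    "def build_logs_headers(log_group: str, log_stream: str, namespace: str) -> str:\n",
    "    \"\"\"Join the three CloudWatch routing headers into one header string.\"\"\"\n",
    "    headers = {\n",
    "        \"x-aws-log-group\": log_group,\n",
    "        \"x-aws-log-stream\": log_stream,\n",
    "        \"x-aws-metric-namespace\": namespace,\n",
    "    }\n",
    "    # Emit comma-separated key=value pairs, in the order shown in the table\n",
    "    return \",\".join(f\"{key}={value}\" for key, value in headers.items())\n",
    "\n",
    "\n",
    "print(build_logs_headers(\"agents/customer-support-assistant-logs\", \"default\", \"agents\"))\n",
    "```"
   ]
  },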
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e0c02eb1",
   "metadata": {},
   "outputs": [],
   "source": [
    "log_group_name = \"agents/customer-support-assistant-logs\"  # Your log group name\n",
    "log_stream_name = \"default\"  # Your log stream name\n",
    "\n",
    "# Create log group\n",
    "try:\n",
    "    logs_client.create_log_group(logGroupName=log_group_name)\n",
    "    print(f\"✅ Log group '{log_group_name}' created successfully\")\n",
    "except ClientError as e:\n",
    "    if e.response[\"Error\"][\"Code\"] == \"ResourceAlreadyExistsException\":\n",
    "        print(f\"ℹ️  Log group '{log_group_name}' already exists\")\n",
    "    else:\n",
    "        print(f\"❌ Error creating log group: {e}\")\n",
    "\n",
    "# Create log stream\n",
    "try:\n",
    "    logs_client.create_log_stream(\n",
    "        logGroupName=log_group_name, logStreamName=log_stream_name\n",
    "    )\n",
    "    print(f\"✅ Log stream '{log_stream_name}' created successfully\")\n",
    "except ClientError as e:\n",
    "    if e.response[\"Error\"][\"Code\"] == \"ResourceAlreadyExistsException\":\n",
    "        print(f\"ℹ️  Log stream '{log_stream_name}' already exists\")\n",
    "    else:\n",
    "        print(f\"❌ Error creating log stream: {e}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bbe8e057",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create .env file\n",
    "service_name = \"customer-support-assistant-strands\"\n",
    "\n",
    "with open(\".env\", \"w\") as f:\n",
    "    # AWS Configuration\n",
    "    f.write(f\"AWS_REGION={region}\\n\")\n",
    "    f.write(f\"AWS_DEFAULT_REGION={region}\\n\")\n",
    "    f.write(f\"AWS_ACCOUNT_ID={account_id}\\n\")\n",
    "\n",
    "    # OpenTelemetry Configuration for AWS CloudWatch GenAI Observability\n",
    "    f.write(\"OTEL_PYTHON_DISTRO=aws_distro\\n\")\n",
    "    f.write(\"OTEL_PYTHON_CONFIGURATOR=aws_configurator\\n\")\n",
    "    f.write(\"OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf\\n\")\n",
    "    f.write(\"OTEL_TRACES_EXPORTER=otlp\\n\")\n",
    "    f.write(\n",
    "        f\"OTEL_EXPORTER_OTLP_LOGS_HEADERS=x-aws-log-group={log_group_name},x-aws-log-stream={log_stream_name},x-aws-metric-namespace=agents\\n\"\n",
    "    )\n",
    "    f.write(f\"OTEL_RESOURCE_ATTRIBUTES=service.name={service_name}\\n\")\n",
    "    f.write(\"AGENT_OBSERVABILITY_ENABLED=true\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4e1a44ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "# Load environment variables from .env file\n",
    "load_dotenv()\n",
    "\n",
    "# Display the OTEL-related environment variables\n",
    "otel_vars = [\n",
    "    \"OTEL_PYTHON_DISTRO\",\n",
    "    \"OTEL_PYTHON_CONFIGURATOR\",\n",
    "    \"OTEL_EXPORTER_OTLP_PROTOCOL\",\n",
    "    \"OTEL_EXPORTER_OTLP_LOGS_HEADERS\",\n",
    "    \"OTEL_RESOURCE_ATTRIBUTES\",\n",
    "    \"AGENT_OBSERVABILITY_ENABLED\",\n",
    "    \"OTEL_TRACES_EXPORTER\",\n",
    "]\n",
    "\n",
    "print(\"OpenTelemetry Configuration:\\n\")\n",
    "for var in otel_vars:\n",
    "    value = os.getenv(var)\n",
    "    if value:\n",
    "        print(f\"{var}={value}\")"
   ]
  },
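  {
   "cell_type": "markdown",
   "id": "b7e2f9a3",
   "metadata": {},
   "source": [
    "The loop in the previous cell silently skips any variable that is not set. A small check like the one below could make a missing setting visible before you run the agent. This is only a sketch: `REQUIRED_OTEL_VARS` and `missing_otel_vars` are hypothetical names, and the list simply mirrors the variables written to the `.env` file in this lab.\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "REQUIRED_OTEL_VARS = [\n",
    "    \"OTEL_PYTHON_DISTRO\",\n",
    "    \"OTEL_PYTHON_CONFIGURATOR\",\n",
    "    \"OTEL_EXPORTER_OTLP_PROTOCOL\",\n",
    "    \"OTEL_TRACES_EXPORTER\",\n",
    "    \"OTEL_EXPORTER_OTLP_LOGS_HEADERS\",\n",
    "    \"OTEL_RESOURCE_ATTRIBUTES\",\n",
    "    \"AGENT_OBSERVABILITY_ENABLED\",\n",
    "]\n",
    "\n",
    "\n",
    "def missing_otel_vars(env=None):\n",
    "    \"\"\"Return the required variable names that are unset or empty.\"\"\"\n",
    "    env = os.environ if env is None else env\n",
    "    return [name for name in REQUIRED_OTEL_VARS if not env.get(name)]\n",
    "\n",
    "\n",
    "missing = missing_otel_vars()\n",
    "if missing:\n",
    "    print(\"Missing variables:\", \", \".join(missing))\n",
    "else:\n",
    "    print(\"All required OpenTelemetry variables are set\")\n",
    "```"
   ]
  },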
  {
   "cell_type": "markdown",
   "id": "70d209db",
   "metadata": {},
   "source": [
    "# Step 3: Define Strands Agent"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd3f18d6",
   "metadata": {},
   "source": [
    "Now, let's redefine the same agent as before. \n",
    "\n",
    "To demonstrate that traces are created, we'll pass a simple greeting query to the agent.\n",
    "\n",
    "We'll also ensure that the session id is registered."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1c0d2b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "!cp lab_helpers/lab1_strands_agent.py customer_support_agent.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc72a7f8",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile -a customer_support_agent.py\n",
    "\n",
    "import os\n",
    "import argparse\n",
    "from boto3.session import Session\n",
    "from opentelemetry import baggage, context\n",
    "from scripts.utils import get_ssm_parameter\n",
    "\n",
    "from strands import Agent\n",
    "from strands.models import BedrockModel\n",
    "\n",
    "\n",
    "def parse_arguments():\n",
    "    parser = argparse.ArgumentParser(description=\"Customer Support Agent\")\n",
    "    parser.add_argument(\n",
    "        \"--session-id\",\n",
    "        type=str,\n",
    "        required=True,\n",
    "        help=\"Session ID to associate with this agent run\",\n",
    "    )\n",
    "    return parser.parse_args()\n",
    "\n",
    "\n",
    "def set_session_context(session_id):\n",
    "    \"\"\"Set the session ID in OpenTelemetry baggage for trace correlation\"\"\"\n",
    "    ctx = baggage.set_baggage(\"session.id\", session_id)\n",
    "    token = context.attach(ctx)\n",
    "    print(f\"Session ID '{session_id}' attached to telemetry context\")\n",
    "    return token\n",
    "\n",
    "\n",
    "def main():\n",
    "    # Parse command line arguments\n",
    "    args = parse_arguments()\n",
    "\n",
    "    # Set session context for telemetry\n",
    "    context_token = set_session_context(args.session_id)\n",
    "\n",
    "    # Get region\n",
    "    boto_session = Session()\n",
    "    region = boto_session.region_name\n",
    "\n",
    "    try:\n",
    "        # Create the same basic agent from Lab 1\n",
    "        MODEL = BedrockModel(\n",
    "            model_id=MODEL_ID,\n",
    "            temperature=0.3,\n",
    "            region_name=region,\n",
    "        )\n",
    "\n",
    "        basic_agent = Agent(\n",
    "            model=MODEL,\n",
    "            tools=[\n",
    "                get_product_info,\n",
    "                get_return_policy,\n",
    "            ],\n",
    "            system_prompt=SYSTEM_PROMPT,\n",
    "        )\n",
    "\n",
    "        # Execute the travel research task\n",
    "        query = \"\"\"Greet the user and provide a financial advice.\"\"\"\n",
    "\n",
    "        result = basic_agent(query)\n",
    "        print(\"Result:\", result)\n",
    "\n",
    "        print(\"✅ Agent executed successfully and trace was pushed to CloudWatch\")\n",
    "    finally:\n",
    "        # Detach context when done\n",
    "        context.detach(context_token)\n",
    "\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    main()"
   ]
  },
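  {
   "cell_type": "markdown",
   "id": "c9f3a1b4",
   "metadata": {},
   "source": [
    "The agent script attaches the session ID in `set_session_context` and detaches it in a `finally` block inside `main`. The same attach/detach pairing could also be wrapped in a context manager. The sketch below assumes the `opentelemetry` package is importable in your environment, and `session_baggage` is a hypothetical helper name, not part of the OpenTelemetry API.\n",
    "\n",
    "```python\n",
    "from contextlib import contextmanager\n",
    "\n",
    "from opentelemetry import baggage, context\n",
    "\n",
    "\n",
    "@contextmanager\n",
    "def session_baggage(session_id: str):\n",
    "    \"\"\"Attach session.id as baggage for the duration of the with-block.\"\"\"\n",
    "    token = context.attach(baggage.set_baggage(\"session.id\", session_id))\n",
    "    try:\n",
    "        yield\n",
    "    finally:\n",
    "        # Restore the previous context even if the block raises\n",
    "        context.detach(token)\n",
    "\n",
    "\n",
    "with session_baggage(\"session-1234\"):\n",
    "    # The baggage entry is readable from the current context inside the block\n",
    "    print(baggage.get_baggage(\"session.id\"))\n",
    "```"
   ]
  },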
  {
   "cell_type": "markdown",
   "id": "af53c71b",
   "metadata": {},
   "source": [
    "# Step 4: AWS OpenTelemetry Python Distro\n",
    "\n",
    "Now that your environment is configured and agent is created, let's understand how the observability happens. The [AWS OpenTelemetry Python Distro](https://pypi.org/project/aws-opentelemetry-distro/) automatically instruments your Strands agent to capture telemetry data without requiring code changes.\n",
    "\n",
    "This distribution provides:\n",
    "- **Auto-instrumentation** for your Strands Agent hosted outside of AgentCore Runtime (i.e. EC2, Lambda etc..)\n",
    "- **AWS-optimized configuration** for seamless CloudWatch integration  \n",
    "\n",
    "### Running Your Instrumented Agent\n",
    "\n",
    "To capture traces from your Strands agent, use the `opentelemetry-instrument` command instead of running Python directly. This automatically applies instrumentation using the environment variables from your `.env` file:\n",
    "\n",
    "```bash\n",
    "opentelemetry-instrument python customer_support_assistant_agent.py\n",
    "```\n",
    "\n",
    "This command will:\n",
    "\n",
    "- Load your OTEL configuration from the .env file\n",
    "- Automatically instrument Strands, Amazon Bedrock calls, agent tool and databases, and other requests made by agent\n",
    "- Send traces to CloudWatch\n",
    "- Enable you to visualize the agent's decision-making process in the GenAI Observability dashboard"
   ]
  },
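  {
   "cell_type": "markdown",
   "id": "d2a6b8c5",
   "metadata": {},
   "source": [
    "`opentelemetry-instrument` reads its settings from the process environment, not from the `.env` file itself. Outside a notebook, one way to bridge the two is to merge the file's values into the child process environment before launching the agent. The sketch below uses only the standard library; `parse_env_text`, `build_command`, and `run_instrumented` are hypothetical helper names, and the parser handles only plain `KEY=VALUE` lines.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import subprocess\n",
    "\n",
    "\n",
    "def parse_env_text(text: str) -> dict:\n",
    "    \"\"\"Parse simple KEY=VALUE lines, skipping blank lines and comments.\"\"\"\n",
    "    env = {}\n",
    "    for line in text.splitlines():\n",
    "        line = line.strip()\n",
    "        if not line or line.startswith(\"#\") or \"=\" not in line:\n",
    "            continue\n",
    "        key, value = line.split(\"=\", 1)  # split once: values may contain '='\n",
    "        env[key.strip()] = value.strip()\n",
    "    return env\n",
    "\n",
    "\n",
    "def build_command(script: str, session_id: str) -> list:\n",
    "    \"\"\"Build the argument list for an instrumented agent run.\"\"\"\n",
    "    return [\"opentelemetry-instrument\", \"python\", script, \"--session-id\", session_id]\n",
    "\n",
    "\n",
    "def run_instrumented(script: str, session_id: str, env_path: str = \".env\") -> int:\n",
    "    \"\"\"Run the agent with the .env values merged into the child environment.\"\"\"\n",
    "    with open(env_path) as f:\n",
    "        child_env = {**os.environ, **parse_env_text(f.read())}\n",
    "    return subprocess.run(build_command(script, session_id), env=child_env).returncode\n",
    "```\n",
    "\n",
    "Calling `run_instrumented(\"customer_support_agent.py\", \"session-1234\")` would then launch the agent script from this lab with the settings from `.env` applied and return the exit code of the child process."
   ]
  },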
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5441912",
   "metadata": {},
   "outputs": [],
   "source": [
    "!opentelemetry-instrument python customer_support_agent.py --session-id \"session-1234\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47c23fa6",
   "metadata": {},
   "source": [
    "# Step 5: Viewing on Gen AI Observability \n",
    "\n",
    "Now that we have configured Observability, let's check the traces in AWS CloudWatch's GenAI Observability dashboard. Navigate to Cloudwatch - GenAI Observability - Bedrock AgentCore.\n",
    "\n",
    "#### Sessions View Page:\n",
    "\n",
    "![sessions](images/sessions_lab5_observability.png)\n",
    "\n",
    "#### Traces View Page:\n",
    "![traces](images/traces_lab5_observability.png)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b677cb17",
   "metadata": {},
   "source": [
    "## Congratulations! 🎉\n",
    "\n",
    "You have successfully **implemented AgentCore Observability with a Strands agent** (without AgentCore Runtime)!\n",
    "\n",
    "### What You Accomplished:\n",
    "\n",
    "- ✅ **Observability**: Configured our Strands agent to send telemetry data to Amazon CloudWatch\n",
    "- ✅ **Session management**: Ensured traces are stored by session for easier debugging\n",
    "\n",
    "## Next Steps\n",
    "\n",
    "Ready to add more AgentCore capabilities? Continue with:\n",
    "\n",
    "- **Lab 6**: Securely authenticate with external services using AgentCore Identity\n",
    "\n",
    "## Resources\n",
    "\n",
    "- [AgentCore Observability Documentation](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html)\n",
    "- [**Official AgentCore Observability Samples**](https://github.com/awslabs/amazon-bedrock-agentcore-samples/tree/main/01-tutorials/06-AgentCore-observability) ⭐\n",
    "\n",
    "---\n",
    "\n",
    "**Excellent work! You can trace, debug, and monitor your customer support agent' performance in production environments! 🚀**\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36136b26",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}