Building your operations management with AI-Powered Operations at re:Invent 2025

As organizations continue to scale and evolve their cloud environments, effective operations management has become more critical than ever. Operations management under the Cloud Operations track at AWS re:Invent 2025 offers a comprehensive lineup of sessions designed to help you build resilient, secure, and efficient operational practices across your AWS environment. Whether you’re managing complex multicloud environments, implementing AI-powered automation, or strengthening your disaster recovery strategies, this track has something for everyone.

This blog post will guide you through the key themes of operations management and highlight sessions that will help you transform your cloud operations strategy.

Plan Your Operations Management Track Experience

Operations management under the Cloud Operations track at re:Invent 2025 showcases AWS’s commitment to simplifying cloud operations through intelligent automation. Whether you’re managing a single-cloud environment or complex multicloud infrastructure, these sessions will provide practical strategies to enhance operational efficiency, security, and reliability.

With 12 sessions spanning five key themes, operations management offers something for everyone, from hands-on workshops to expert-level discussions. To make the most of your re:Invent experience, we recommend:

  1. Focus on your priorities: Select sessions that align with your organization’s immediate operational challenges
  2. Mix formats: Combine lecture-style sessions with interactive workshops and builders’ sessions
  3. Plan for skill development: Choose sessions that match your current skill level and those that stretch your capabilities
  4. Reserve early: Popular sessions fill up quickly, so reserve your spot as soon as registration opens

Key Themes at re:Invent for Operations Management

The operations management sessions are organized around five core themes that address today’s most pressing operational challenges:

1. AI-Powered Operations

The integration of generative AI and machine learning into cloud operations represents one of the most transformative shifts in how organizations manage their infrastructure. Sessions in this theme showcase how Amazon Bedrock, Amazon Q, AWS Systems Manager, and other services can be leveraged to create intelligent operational workflows, from automated monitoring to predictive maintenance.

2. Resilience & Disaster Recovery

Building resilient systems that can withstand disruptions is essential for business continuity. The operations management track features sessions that demonstrate how to combine AWS Resilience Hub with generative AI to create sophisticated disaster recovery playbooks, conduct resilience testing, and implement automated recovery procedures.

3. Multicloud Management

As organizations adopt multiple cloud providers, managing diverse environments becomes significantly more complex. Learn how AWS provides tools and services that enable centralized visibility and control across your entire cloud estate, whether it’s AWS, on-premises, or other cloud providers.

4. Automation at Scale

Manual operations simply can’t keep pace with the scale and complexity of modern cloud environments. The operations management track offers practical guidance on implementing automation across your cloud operations, from patch management to security incident response.

5. Compliance & Security

Security and compliance remain top concerns for organizations of all sizes. Discover how to implement automated security controls, streamline compliance processes, and build governance frameworks that scale with your business.

Session Formats to Fit Your Learning Style

re:Invent offers a variety of session formats to accommodate different learning preferences. Here are some must-attend sessions by theme:

AI-Powered Operations

COP322 | Building AI-Powered operational insights and automated remediation | Builders’ Session
Location: Wednesday, Dec 3 1:30 PM – 2:30 PM PST | Mandalay Bay
This builders’ session demonstrates how to combine Amazon Q, Amazon OpenSearch Service, and AWS Systems Manager to create an intelligent operations platform that can detect anomalies and automatically remediate issues. You’ll learn to leverage Large Language Models through the Model Context Protocol (MCP) Server to analyze operational data and implement automated remediation workflows.

COP314 | Scale & automate patching with AI-powered visualization | Workshop
Location: Thursday, Dec 4 12:30 PM – 2:30 PM PST | MGM
In this hands-on workshop, discover how AWS Systems Manager and Amazon Q can transform your patch management processes. You’ll learn to implement automated patching solutions, configure compliance reporting, and create dynamic visualizations using natural language queries with Amazon Q.

COP407 | Building custom agents for intelligent AWS patch automation | Code Talk
Location: Wednesday, Dec 3 11:30 AM – 12:30 PM PST | Wynn
This expert-level code talk demonstrates how to build a custom Model Context Protocol (MCP) server that enforces organizational policies for patch management. You’ll see how to implement a policy engine that validates compliance requirements before authorizing patches, creating a scalable governance framework that reduces manual work.

Resilience & Disaster Recovery

COP303 | Automate disaster recovery playbooks using generative AI | Builders’ Session
Location: Thursday, Dec 4 11:00 AM – 12:00 PM PST | Wynn
This advanced builders’ session shows how to combine Amazon Bedrock, AWS Resilience Hub, and AWS Systems Manager to create automated disaster recovery plans. Learn to generate and validate recovery runbooks that align with your regulatory and compliance requirements.

COP420 | AI-powered resilience testing and disaster recovery | Breakout Session
Location: Tuesday, Dec 2 1:30 PM – 2:30 PM PST | Wynn
This expert-level breakout session demonstrates how to leverage Large Language Models with AWS Resilience Hub and AWS Systems Manager to modernize resilience testing. You’ll learn how to analyze infrastructure, generate targeted AWS Fault Injection Service experiments, and create comprehensive recovery runbooks.

Multicloud Management

COP313 | Multicloud & hybrid node operation at scale is easier than you think | Chalk Talk
Location: Monday, Dec 1 10:30 AM – 11:30 AM PST | Mandalay Bay
This chalk talk explores how AWS Systems Manager and Amazon CloudWatch can help you efficiently manage compute resources across multiple cloud environments and on-premises infrastructure. Learn how to implement consistent controls for patching, application deployment, and access management across your distributed compute landscape.

COP342 | Centralize Multicloud Management using AWS | Breakout Session
Location: Thursday, Dec 4 11:30 AM – 12:30 PM PST | MGM
This breakout session demonstrates how AWS Systems Manager, Amazon CloudWatch, and Amazon Managed Grafana can simplify operations in multicloud environments. Discover how to create unified dashboards that provide visibility into metrics and logs from any data source, whether your workloads run on AWS, on-premises, or multiple clouds.

Automation at Scale

COP340 | Building reliable operations, feat. Fannie Mae | Breakout Session
Location: Tuesday, Dec 2 5:30 PM – 6:30 PM PST | Caesars Forum
This breakout session features a real-world case study from Fannie Mae, showcasing how they built a cross-region observability platform on AWS to automate incident response and improve reliability. Learn practical strategies for implementing automated incident management and establishing effective on-call processes.

COP344 | Implementing Automated Security Controls for Zero-Day Defense | Chalk Talk
Location: Wednesday, Dec 3 1:30 PM – 2:30 PM PST | MGM
This chalk talk shows how to combine Amazon Inspector, AWS Systems Manager, and AWS Security Hub to implement automated security controls that respond to zero-day vulnerabilities. Learn how to build automated compliance monitoring and continuous control validation through infrastructure and policy as code.

COP343 | Streamline operations with automated health monitoring and response | Chalk Talk
Location: Wednesday, Dec 3 12:00 PM – 1:00 PM PST | Mandalay Bay
In this chalk talk, discover how to implement comprehensive health monitoring and automated incident response using AWS Health, Amazon CloudWatch, and AWS CloudTrail. You’ll learn to create effective monitoring patterns, transform metrics into actions, and implement automated remediation workflows.

Compliance & Security

COP310 | Automating compliance and auditing at scale | Workshop
Location: Wednesday, Dec 3 9:00 AM – 11:00 AM PST | Mandalay Bay
This workshop demonstrates how to build automated compliance controls using AWS Config, Systems Manager, and Audit Manager. Learn to implement automated security assessments and remediation workflows while leveraging Amazon Q CLI and CloudTrail Lake for intelligent investigation.

COP341 | Implement secure automated workflows with AWS Systems Manager | Chalk Talk
Location: Monday, Dec 1 11:30 AM – 12:30 PM PST | MGM
This chalk talk shows how AWS Systems Manager enables you to build automated responses to security and operational incidents while maintaining detailed audit trails. Learn how to implement controlled emergency access procedures and develop automated runbooks that reduce remediation time while maintaining governance.

Looking Forward

Operations management under the Cloud Operations track at AWS re:Invent 2025 offers a comprehensive look at how organizations can transform their operational practices using the latest AWS services and best practices. From AI-powered automation to multicloud management, the sessions in this track will equip you with the knowledge and skills needed to build resilient, secure, and efficient cloud operations.

We look forward to seeing you at re:Invent 2025. Don’t forget to visit the Cloud Operations kiosk in the Venetian!
