Post-Mortem: {incident_title}

Incident ID: {incident_id} Date of Incident: {incident_date} Date of Post-Mortem: {postmortem_date} Severity: {P0 | P1} Duration: {duration} (from detection to resolution) Incident Commander: {name} Post-Mortem Author: {name} Status: Draft | Reviewed | Published Classification: Confidential — Internal Operations


§1 — Executive Summary

One-paragraph summary of what happened, who was affected, and the business impact. Write this for a non-technical audience (Founder, legal, users).

{executive_summary}


§2 — Incident Timeline

All times in UTC. Reconstruct the exact sequence of events from logs, alerts, and team communications.

Time (UTC) Event Source
HH:MM {event_description} {log_file, alert, person}
HH:MM
HH:MM
HH:MM
HH:MM DETECTION — {how was the incident first detected?}
HH:MM ACKNOWLEDGMENT — {who acknowledged? when?}
HH:MM CONTAINMENT START — {what initial containment actions?}
HH:MM CONTAINMENT COMPLETE — {all affected systems isolated?}
HH:MM ROOT CAUSE IDENTIFIED — {what was the root cause?}
HH:MM FIX DEPLOYED — {what was deployed?}
HH:MM VERIFIED — {how was fix confirmed?}
HH:MM RESOLVED — {incident officially closed}

§3 — Detection

How was the incident detected?

{Describe the detection method: automated alert, manual observation, external report, user report, etc.}

Detection Source

Attribute Detail
Alert system {AI Firewall / PII spike / outbound anomaly / GCP Monitoring / external report / manual}
Alert time {timestamp}
Alert recipient {Founder / AIRO / Hermes / AINA}
Alert latency {time from incident start to alert firing}

Could we have detected it sooner?

{Analysis of detection gaps. Are there monitoring improvements that would reduce time-to-detect?}


§4 — Root Cause Analysis

Technical Root Cause

{Detailed technical explanation of what caused the incident. Include code paths, configuration, infrastructure, or process failures.}

Contributing Factors

Factor Category Description
{factor_1} {code / config / process / dependency / human_error} {description}
{factor_2}
{factor_3}

Five Whys

  1. Why did {immediate_cause} happen? → {answer}
  2. Why {answer_1}? → {answer}
  3. Why {answer_2}? → {answer}
  4. Why {answer_3}? → {answer}
  5. Why {answer_4}?ROOT CAUSE: {root_cause}

§5 — Impact Assessment

User Impact

Metric Value
Users affected {count}
API requests failed {count or percentage}
Data exposed {PII types and count, or "None"}
Services degraded {list of affected services}
Duration of impact {duration}

Business Impact

Metric Value
Revenue lost {estimated amount in CAD}
Provider costs incurred {cost of failed/retry requests}
Support tickets generated {count}
Reputational impact {assessment}
Compliance impact {PIPEDA breach report required? OPC notified? Users notified?}

PIPEDA Breach Assessment


§6 — Response Actions

Containment Actions

Action Performed By Time Effective?
{action_1} {person/ai} {timestamp} {Yes/No/Partial}
{action_2}

Remediation Actions

Action Performed By Time Permanent?
{fix_1} {person/ai} {timestamp} {Yes/No — follow-up needed}
{fix_2}

Communication Actions

Communication Channel Audience Time Template Used
{comm_1} {Telegram/Email/Status page} {Founder/Users/Public} {timestamp} {template_name}
{comm_2}

§7 — What Went Well

Identify things that worked correctly during the incident response. Celebrate successes.

  1. {positive_1} — {explanation}
  2. {positive_2} — {explanation}
  3. {positive_3} — {explanation}

§8 — What Went Wrong

Identify areas for improvement. Be specific and blameless — focus on systems, not individuals.

  1. {issue_1} — {explanation and impact}
  2. {issue_2} — {explanation and impact}
  3. {issue_3} — {explanation and impact}

§9 — Lessons Learned

  1. {lesson_1} — {what we learned and why it matters}
  2. {lesson_2} — {what we learned and why it matters}
  3. {lesson_3} — {what we learned and why it matters}

§10 — Action Items

Tracked in the development plan. Each action item must have an owner and a deadline.

# Action Item Priority Owner Deadline Status
AI-1 {action_description} P0 {owner} {date} ☐ Open
AI-2 {action_description} P1 {owner} {date} ☐ Open
AI-3 {action_description} P2 {owner} {date} ☐ Open

§11 — Appendices

A. Relevant Logs

Attach or reference relevant log entries. Sanitize PII and sensitive data.

{paste relevant log excerpts or reference Cloud Logging query}

B. Communication Records

C. Code Changes

Commit Description PR Link
{commit_hash} {commit_message} {pr_link}

D. External References


Post-Mortem Sign-Off:


Template Version: 1.0.0 Related: docs/incident-response-plan.md §7 — Post-Mortem Process