Post-Mortem: {incident_title}

Incident ID: {incident_id} Date of Incident: {incident_date} Date of Post-Mortem: {postmortem_date} Severity: {P0 | P1} Duration: {duration} (from detection to resolution) Incident Commander: {name} Post-Mortem Author: {name} Status: Draft | Reviewed | Published Classification: Confidential — Internal Operations

§1 — Executive Summary

One-paragraph summary of what happened, who was affected, and the business impact. Write this for a non-technical audience (Founder, legal, users).

{executive_summary}

§2 — Incident Timeline

All times in UTC. Reconstruct the exact sequence of events from logs, alerts, and team communications.

Time (UTC)	Event	Source
HH:MM	{event_description}	{log_file, alert, person}
HH:MM
HH:MM
HH:MM
HH:MM	DETECTION — {how was the incident first detected?}
HH:MM	ACKNOWLEDGMENT — {who acknowledged? when?}
HH:MM	CONTAINMENT START — {what initial containment actions?}
HH:MM	CONTAINMENT COMPLETE — {all affected systems isolated?}
HH:MM	ROOT CAUSE IDENTIFIED — {what was the root cause?}
HH:MM	FIX DEPLOYED — {what was deployed?}
HH:MM	VERIFIED — {how was fix confirmed?}
HH:MM	RESOLVED — {incident officially closed}

§3 — Detection

How was the incident detected?

{Describe the detection method: automated alert, manual observation, external report, user report, etc.}

Detection Source

Attribute	Detail
Alert system	{AI Firewall / PII spike / outbound anomaly / GCP Monitoring / external report / manual}
Alert time	{timestamp}
Alert recipient	{Founder / AIRO / Hermes / AINA}
Alert latency	{time from incident start to alert firing}

Could we have detected it sooner?

{Analysis of detection gaps. Are there monitoring improvements that would reduce time-to-detect?}

§4 — Root Cause Analysis

Technical Root Cause

{Detailed technical explanation of what caused the incident. Include code paths, configuration, infrastructure, or process failures.}

Contributing Factors

Factor	Category	Description
{factor_1}	{code / config / process / dependency / human_error}	{description}
{factor_2}
{factor_3}

Five Whys

Why did {immediate_cause} happen? → {answer}
Why {answer_1}? → {answer}
Why {answer_2}? → {answer}
Why {answer_3}? → {answer}
Why {answer_4}? → ROOT CAUSE: {root_cause}

§5 — Impact Assessment

User Impact

Metric	Value
Users affected	{count}
API requests failed	{count or percentage}
Data exposed	{PII types and count, or "None"}
Services degraded	{list of affected services}
Duration of impact	{duration}

Business Impact

Metric	Value
Revenue lost	{estimated amount in CAD}
Provider costs incurred	{cost of failed/retry requests}
Support tickets generated	{count}
Reputational impact	{assessment}
Compliance impact	{PIPEDA breach report required? OPC notified? Users notified?}

PIPEDA Breach Assessment

Did the incident involve personal information? → {Yes/No}
Does it pose a real risk of significant harm (RROSH)? → {Yes/No}
OPC notified? → {Yes (date) / No / Not applicable}
Affected users notified? → {Yes (date, count) / No / Not applicable}
Breach record stored? → {breach_record_id in breach_records table}

§6 — Response Actions

Containment Actions

Action	Performed By	Time	Effective?
{action_1}	{person/ai}	{timestamp}	{Yes/No/Partial}
{action_2}

Remediation Actions

Action	Performed By	Time	Permanent?
{fix_1}	{person/ai}	{timestamp}	{Yes/No — follow-up needed}
{fix_2}

Communication Actions

Communication	Channel	Audience	Time	Template Used
{comm_1}	{Telegram/Email/Status page}	{Founder/Users/Public}	{timestamp}	{template_name}
{comm_2}

§7 — What Went Well

Identify things that worked correctly during the incident response. Celebrate successes.

{positive_1} — {explanation}
{positive_2} — {explanation}
{positive_3} — {explanation}

§8 — What Went Wrong

Identify areas for improvement. Be specific and blameless — focus on systems, not individuals.

{issue_1} — {explanation and impact}
{issue_2} — {explanation and impact}
{issue_3} — {explanation and impact}

§9 — Lessons Learned

{lesson_1} — {what we learned and why it matters}
{lesson_2} — {what we learned and why it matters}
{lesson_3} — {what we learned and why it matters}

§10 — Action Items

Tracked in the development plan. Each action item must have an owner and a deadline.

#	Action Item	Priority	Owner	Deadline	Status
AI-1	{action_description}	P0	{owner}	{date}	☐ Open
AI-2	{action_description}	P1	{owner}	{date}	☐ Open
AI-3	{action_description}	P2	{owner}	{date}	☐ Open

§11 — Appendices

A. Relevant Logs

Attach or reference relevant log entries. Sanitize PII and sensitive data.

{paste relevant log excerpts or reference Cloud Logging query}

B. Communication Records

Founder Telegram alert: {reference or screenshot}
User notification email: {reference or copy}
Status page update: {link or copy}
OPC breach report: {reference or copy}

C. Code Changes

Commit	Description	PR Link
{commit_hash}	{commit_message}	{pr_link}

D. External References

{reference_1}
{reference_2}

Post-Mortem Sign-Off:

Incident Commander review: {name} — {date}

Technical review (if AIRO involved): {name} — {date}

All action items assigned with deadlines

Filed in ops/postmortems/YYYY-MM-DD-{incident_id}-{slug}.md

Template Version: 1.0.0 Related: docs/incident-response-plan.md §7 — Post-Mortem Process