Status: {draft|final}
Owners: {who drove the incident resolution}
Description: {brief description of symptoms and root cause}
Component: {affected area}
Date/time: {YYYY-MM-DD HH:MM}
Duration: {time from initial breakage to final resolution}
User impact: {who was affected by the incident}
14:44 - something happened
14:45 - next thing happened <START OF OUTAGE>
09:12 - another thing happened <END OF OUTAGE>
{summarize the problems that the outage caused}
{without blame, describe the root cause of the outage}
{list things where things worked as expected in a positive manner}
{list things that mitigated this incident but not because of our foresight}
{list things that failed, with github issues from the action items section}
{each item here should have an owner}
{link to github issues for things that would have prevented this failure from happening in the first place, such as input validation, pinning dependencies, etc}
{link to github issues for things that would have detected this failure before it became An Incident, such as better testing, monitoring, etc}
{link to github issues for things that would have made this failure less serious, such as graceful degradation, better exception handling, etc}
{link to github issues for things that would have helped us resolve this failure faster, such as documented processes and protocols, etc}
{link to github issues or PRs/commits for the actual fixes that were necessary to resolve this incident}
{any other useful information, such as relevant chat logs}