A backup retention policy answers one deceptively simple question: how long do we keep each backup, and why? Most teams never write the answer down. They keep whatever their tooling defaulted to (often 7 days), discover the gap during an incident or an audit, and then overcorrect by keeping everything forever, which creates its own compliance and cost problems. This guide shows you how to design a backup retention policy that actually protects you: one that covers realistic recovery scenarios, satisfies the compliance frameworks and contracts you are actually subject to, costs a predictable amount, and resists deletion by attackers and by your own automation.
What a backup retention policy is (and is not)
A retention policy is a written rule set that specifies, for each data source: how often backups run, how long each tier of backups is kept, where they are stored, when they are destroyed, and who can change any of that. It is the policy layer; the enforcement layer is your rotation engine, the promote and prune machinery we cover in how backup rotation works. Policy without enforcement is a wish. Enforcement without policy is whatever your scripts happen to do.
A retention policy is not a recovery plan. It tells you which restore points will exist; it says nothing about whether they restore cleanly or how fast. Pair it with verification, which we treat separately in backup verification.
The two failure modes a policy must defend against
Retention failures come in exactly two flavors, and a good policy is the balance point between them.
Keeping too little. The classic story: a billing bug has been writing wrong amounts for five weeks, your retention is 30 days, and the last clean backup aged out five days before anyone noticed. Slow-burning data corruption, insider fraud, gradually spreading ransomware, and disputes about historical data all share one property: the time between the damage and its discovery routinely exceeds short retention windows. Industry incident reports consistently find that breaches and corruption often go undetected for weeks or months. If your deepest backup is 7 or 30 days old, you are betting your company on fast detection.
Keeping too much. The opposite failure is quieter. Backups full of personal data retained for seven years "just in case" are a GDPR data minimization problem, a discovery liability in litigation (everything you keep is subpoenaable), a larger attack surface, and a growing bill. Deleted user data that lives on in a decade of monthly backups undermines your privacy commitments even if no regulator ever asks.
The policy design goal: retain long enough to cover slow discovery scenarios and your real obligations, and not one byte longer than you can justify in writing.
How many backups to keep: a tiered baseline
Flat retention ("keep 30 days of dailies") wastes money on near term density you do not need and starves you of long term reach. Tiered retention solves both. A defensible baseline for a production database:
- Hourly backups, kept 1 to 3 days. Covers the fat head of the risk curve: bad deploys, fat-fingered deletes, botched migrations. Teams shipping AI-generated code several times a day should treat hourly backups as checkpoints; see database checkpoints for AI-generated code for why this tier has become more important, not less.
- Daily backups, kept 14 to 30 days. Covers bugs that surface through support tickets, reconciliation jobs, and monthly reporting.
- Weekly backups, kept 8 to 12 weeks. Covers quarterly cycles: audits, investigations, end of quarter disputes.
- Monthly backups, kept 12 months. Covers annual cycles and the long tail of slow discovery. Extend this tier only when a specific obligation requires it.
That is roughly 80 retained backups at steady state, with recovery resolution that degrades gracefully: any hour this week, any day this month, any week this quarter, any month this year. The numbers flex with your business: an agency holding client WordPress databases has different exposure than a fintech, but the tier structure itself is close to universal.
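The tier structure above can be sketched as a selection rule: a backup survives if at least one tier still wants it. The following is a minimal sketch, not a production rotation engine. The `TIERS` table, the `select_keepers` name, and the choice to keep the earliest backup of each period are all illustrative assumptions; the windows use the 48 hourly, 14 daily, 8 weekly, 12 monthly variant of the baseline.

```python
from datetime import datetime, timedelta

# Hypothetical tier table: (name, function mapping a timestamp to its
# period, retention window). Adjust the windows to your own policy.
TIERS = [
    ("hourly", lambda t: (t.date(), t.hour), timedelta(days=2)),
    ("daily", lambda t: t.date(), timedelta(days=14)),
    ("weekly", lambda t: t.isocalendar()[:2], timedelta(weeks=8)),
    ("monthly", lambda t: (t.year, t.month), timedelta(days=365)),
]


def select_keepers(timestamps, now):
    """Return the set of backup timestamps that at least one tier retains."""
    ordered = sorted(timestamps)
    keep = set()
    for _name, period_of, window in TIERS:
        first_in_period = {}
        for ts in ordered:
            if now - ts > window:
                continue  # older than this tier's window
            # Keep the earliest surviving backup of each period.
            first_in_period.setdefault(period_of(ts), ts)
        keep.update(first_in_period.values())
    return keep


# Usage: one backup per hour for 400 days, then see what the policy retains.
now = datetime(2025, 6, 15, 12)
history = [now - timedelta(hours=h) for h in range(24 * 400)]
keepers = select_keepers(history, now)
to_prune = set(history) - keepers
```

Run against an hourly history like this one, the keeper set lands at about 80 backups, and everything in `to_prune` is safe to delete under the policy. A backup can satisfy several tiers at once, which is why the total is slightly lower than the sum of the per-tier counts plus their boundary periods.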
Compliance drivers: what actually requires what
Most retention anxiety comes from compliance, and most of it is misplaced, because the major frameworks are less prescriptive than people assume. Here is what each one actually demands. (This is engineering guidance, not legal advice; confirm specifics with counsel for your jurisdiction and industry.)
SOC 2
SOC 2 does not mandate any specific retention period. What auditors examine under the availability and processing integrity criteria is whether you have a documented backup and retention policy, whether your systems demonstrably enforce it, and whether you test recovery. A written policy, automated rotation that matches it, alerting on failures, and restore test records are the evidence package. An undocumented "we keep some backups in S3" is what fails the audit. Ottomatik is built with SOC 2 principles, certification in progress, and its enforced retention tiers plus alert history give you exactly the artifacts an auditor asks for.
GDPR and privacy regimes
GDPR pushes in the opposite direction: storage limitation says personal data should be kept no longer than necessary, and erasure requests apply in principle to backups too. The widely accepted operational compromise is to keep backup retention bounded and documented, ensure deleted data ages out of backups on a known schedule, and ensure that if an old backup is ever restored, re-deletion of erased subjects is part of the restore runbook. A 12-month maximum with automatic pruning is far easier to defend than indefinite retention. CCPA and similar laws follow the same logic.
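The runbook step that deletes erased subjects again after a restore can be very small. The sketch below uses SQLite only so that it runs anywhere; the `users` table, its `id` column, and the `reapply_erasures` function are hypothetical names. It assumes you keep a ledger of erased subject IDs somewhere outside the database being restored, since a ledger stored inside the backup would be rewound along with everything else.

```python
import sqlite3


def reapply_erasures(conn, erased_user_ids):
    """Delete every erased subject from a freshly restored database.

    Returns the number of rows removed.
    """
    before = conn.total_changes
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "DELETE FROM users WHERE id = ?",
            [(user_id,) for user_id in erased_user_ids],
        )
    return conn.total_changes - before
```

In a real schema the erasure would also have to follow the subject through related tables; the point of the sketch is only that the step is scripted and runs on every restore, not remembered by whoever is on call.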
Industry and contractual obligations
Real long retention requirements usually come from specific regulation or from your customers' contracts, not from generic frameworks. Financial records often carry multi year obligations (SOX related records, for example, are commonly kept seven years); HIPAA requires retaining certain documentation six years, and some states extend medical record rules; PCI DSS constrains how long you may keep cardholder data at all. Meanwhile, enterprise customers increasingly write backup terms directly into MSAs and DPAs: "daily backups retained 90 days" or "backups stored in EU regions only." Inventory these before designing your tiers, because a contract you signed two years ago may already define your floor. Where long retention applies, consider whether the obligation covers the records (exportable reports, ledgers) rather than full database backups; retaining a purpose built archive is often cleaner than holding entire database dumps for seven years.
Cost math: what retention actually costs
Run the numbers before debating them. The formula is simple: steady-state backup count × compressed backup size in GB × price per GB per month × number of destinations.
Take a 10 GB compressed dump on the baseline policy above (48 hourlies, 14 dailies, 8 weeklies, 12 monthlies, about 80 files): roughly 800 GB at steady state. Stored at two destinations for 3-2-1 compliance:
- Backblaze B2 or Wasabi (around $6 to $7 per TB per month): about $5 to $6 per month each.
- Cloudflare R2 (about $15 per TB per month, zero egress fees): about $12 per month, with free egress mattering on restore day.
- S3 Standard (about $23 per TB per month): about $18 per month, less if you tier monthlies into Glacier class storage.
Even generous retention on a mid sized database costs less per month than a single engineer hour. The expensive scenario is not deep retention; it is unbounded retention without rotation, which grows linearly forever, or shallow retention that fails to cover an incident, which costs you the incident. When extending a tier, price it: doubling monthlies from 12 to 24 on the example above adds about 120 GB, under a dollar a month at B2 rates. Retention depth is one of the cheapest insurance products in your stack. Ottomatik writes to 15+ destinations including S3, Wasabi, R2, DigitalOcean Spaces, Backblaze B2, GCS, and Azure Blob, so you can put the math to work at whichever provider prices best, with flat $79/month pricing for the automation itself.
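The cost formula is easy to put in a script so that tier changes get priced rather than argued about. This is a minimal sketch using the example figures from this section; the function names are illustrative, the prices are the approximate per-TB rates quoted above, and treating 1 TB as 1000 GB is an assumption you should check against your provider's billing units.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def steady_state_gb(tier_counts, backup_gb):
    """Total stored size once every tier is fully populated."""
    return sum(tier_counts.values()) * backup_gb


def monthly_cost(total_gb, price_per_tb):
    """Monthly storage cost in dollars at one destination."""
    return total_gb / GB_PER_TB * price_per_tb


BASELINE = {"hourly": 48, "daily": 14, "weekly": 8, "monthly": 12}
PRICES_PER_TB = {"B2": 6.0, "R2": 15.0, "S3 Standard": 23.0}

total = steady_state_gb(BASELINE, backup_gb=10)
for provider, price in PRICES_PER_TB.items():
    print(f"{provider}: ${monthly_cost(total, price):.2f}/month")

# Price a tier change before making it: 12 extra monthlies of 10 GB each.
extra = monthly_cost(12 * 10, PRICES_PER_TB["B2"])
print(f"Doubling monthlies at B2: +${extra:.2f}/month")
```

The script works from the exact count of 82 backups (820 GB), so its results come out slightly above the figures in the text, which round to 800 GB.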
Immutability: retention that survives an attacker
A retention policy enforced by credentials that can also delete backups is a policy that lasts exactly until those credentials leak. Modern ransomware operators deliberately locate and destroy backups before encrypting production, because victims with intact backups do not pay. Three layers of defense, in increasing strength:
- Separate credentials. Backup write credentials should be distinct from production credentials and scoped to a single bucket or prefix. Production being compromised should not imply backups being deletable.
- Versioning. S3 compatible versioning means an attacker's delete creates a delete marker rather than destroying data. Cheap, easy, and reversible, though an attacker with full permissions can still purge versions.
- Object lock (WORM). Compliance mode object lock on S3, B2, and other providers makes objects undeletable by anyone, including root, until the lock expires. Set the lock duration to match the tier's retention: 35 days on dailies, 13 months on monthlies. This is the strongest practical form of the second "1" (the immutable or offline copy) in 3-2-1-1-0, and our deep dive on the 3-2-1 backup rule for production databases shows where it fits in the larger architecture.
One operational note: object lock also binds you. You cannot shorten retention on locked objects, so apply locks tier by tier rather than blanket locking everything for a year, or your GDPR bounded retention promise becomes unkeepable.
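Applying locks tier by tier mostly comes down to computing a different retain-until date per tier at upload time. The sketch below builds the extra arguments for an upload and leaves unlisted tiers unlocked. The `LOCK_DAYS` table and `lock_params` function are hypothetical; 395 days is an approximation of 13 months. `ObjectLockMode` and `ObjectLockRetainUntilDate` are the parameter names boto3's `put_object` accepts, and they only take effect on a bucket that has Object Lock enabled.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lock durations in days. Tiers not listed stay unlocked.
LOCK_DAYS = {"daily": 35, "monthly": 395}


def lock_params(tier, now=None):
    """Return extra upload arguments that lock a backup for its tier."""
    days = LOCK_DAYS.get(tier)
    if days is None:
        return {}  # short tiers rely on versioning and scoped credentials
    now = now or datetime.now(timezone.utc)
    return {
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=days),
    }
```

With boto3 the result would be spread into the upload call, for example `s3.put_object(Bucket=bucket, Key=key, Body=data, **lock_params("monthly"))`, where `bucket`, `key`, and `data` are placeholders. Because compliance mode cannot be shortened afterward, it is worth testing the durations against a scratch bucket first.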
Writing the policy document
The document itself can be one page. For each data source (production Postgres, the MySQL reporting replica, the uploads bucket, the Mongo cluster), record:
- Scope: what is backed up, including whether file assets ride along with database dumps.
- Schedule and tiers: frequency per tier and retention count per tier.
- Destinations: where each copy lives, which one is off site and off account, and which tiers are immutable.
- Justification: one sentence per tier linking it to a risk or obligation ("monthlies kept 12 months per customer MSA section 4.2"). This sentence is what saves you in audits and in internal debates.
- Destruction: confirmation that pruning is automated, plus the rule for legal holds, which suspend deletion for affected data when litigation is reasonably anticipated.
- Ownership and review: who owns the policy and the annual review date. Policies rot; databases get added, contracts change.
Then make the enforcement match the document exactly. The fastest way to fail an audit is a policy that says 12 monthlies while the bucket contains 3, because the cron job changed and nobody updated either side.
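One way to keep the document and the bucket from drifting apart is to store the policy as data and compare it with what is actually there. The sketch below is illustrative: `TierRule`, `POLICY`, and `find_drift` are hypothetical names, the justifications are examples, and the actual counts would come from listing your storage destination, which is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRule:
    keep: int            # retention count for the tier
    justification: str   # the one sentence that links it to a risk


# Hypothetical policy for one data source.
POLICY = {
    "hourly": TierRule(48, "deploy and migration risk"),
    "daily": TierRule(14, "bugs surfaced by support and reconciliation"),
    "weekly": TierRule(8, "quarterly audits and disputes"),
    "monthly": TierRule(12, "customer MSA section 4.2"),
}


def find_drift(policy, actual_counts):
    """Return {tier: (expected, found)} for every tier that mismatches."""
    drift = {}
    for tier, rule in policy.items():
        found = actual_counts.get(tier, 0)
        if found != rule.keep:
            drift[tier] = (rule.keep, found)
    return drift


# Usage: the policy says 12 monthlies while the bucket contains 3.
print(find_drift(POLICY, {"hourly": 48, "daily": 14, "weekly": 8, "monthly": 3}))
```

A tier that was added recently will be legitimately underpopulated until it fills, so a real check would need to know each tier's start date before treating a shortfall as drift.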
Enforcement and the silent failure problem
The most dangerous retention gap is not a misdesigned tier; it is the policy silently not running. Backups stop (expired password, full disk, changed hostname), nothing alerts, and rotation keeps pruning old backups on schedule. Sixty days later your 14 day daily tier contains zero backups and your policy document describes a fantasy. Enforcement therefore needs three properties: backups run on schedule, failures alert immediately, and someone is told when a tier is underpopulated. Ottomatik handles all three: hourly through monthly schedules with tiered retention rotation, and alerts via email, Slack, or SMS that fire on state change only, so you hear about the first failure and the recovery rather than a daily wall of noise that trains everyone to ignore it. Heartbeat monitoring covers any remaining homegrown jobs. And because the self hosted Docker agent runs inside your network, database credentials never leave it, which keeps the security review short.
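For homegrown jobs, two small guards address the failure described above: refuse to prune when the newest backup is stale, and alert only when the state changes. This is a sketch under assumptions, not a description of how Ottomatik implements either behavior; the function names and the grace factor of 2 are illustrative.

```python
from datetime import datetime, timedelta


def safe_to_prune(newest_backup, now, interval, grace=2):
    """Allow pruning only if a backup succeeded recently.

    If the newest backup is older than `grace` intervals, backups have
    probably stopped, and pruning would eat into the remaining history.
    """
    return now - newest_backup <= interval * grace


def state_change_alert(previous_ok, current_ok):
    """Return an alert label on a state change, or None if nothing changed."""
    if previous_ok and not current_ok:
        return "failed"
    if not previous_ok and current_ok:
        return "recovered"
    return None


# Usage: daily backups stopped three days ago, so rotation should hold off.
now = datetime(2025, 6, 15, 12)
print(safe_to_prune(now - timedelta(days=3), now, timedelta(days=1)))
print(state_change_alert(previous_ok=True, current_ok=False))
```

The first guard turns a silent failure into a tier that stops shrinking instead of one that empties itself, which buys time for the alert to reach someone.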
Verification closes the loop: a retained backup that does not restore is a line item, not protection. Schedule restore tests against your oldest tiers too, not just last night's file; the procedure is in our hands-on guide to restore testing.
Worked example: a SaaS team's full policy
A growing SaaS company, Postgres on RDS plus an S3 uploads bucket, EU customers, two enterprise contracts requiring 90 day backup retention:
- Postgres: Ottomatik serverless dumps hourly. Tiers: 48 hourlies, 30 dailies (covers the 90 day contract together with weeklies), 13 weeklies, 12 monthlies. Primary destination: S3 in a separate AWS account, versioned, object lock on weeklies and monthlies. Secondary: Backblaze B2.
- Uploads bucket: daily file backup to B2, 30 dailies, 12 monthlies.
- Justifications: hourly tier for deploy risk; 30 dailies plus 13 weeklies for the contractual 90 days; 12 monthlies as GDPR bounded annual reach back; nothing beyond 13 months, documented in the privacy policy.
- Verification: automated alerts on any missed run, monthly scripted restore of the latest daily, quarterly drill restoring a 3 month old weekly.
Steady-state storage runs a few hundred gigabytes per destination, a few dollars per month, plus $79/month for Ottomatik (less than 2 hours of engineering time). Setup with the visual Backup Builder was quick: the first backup ran in about 3 minutes, and the full policy took an afternoon, most of it spent reading the two contracts.
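The claim that this policy satisfies the 90 day contracts can be checked with a few lines of arithmetic. The sketch below computes how far back each tier reaches for the Postgres tier counts in the example. The names are illustrative, and a month is approximated as 30 days.

```python
# Approximate span covered by one backup of each tier, in hours.
SPAN_HOURS = {"hourly": 1, "daily": 24, "weekly": 168, "monthly": 720}

POSTGRES_TIERS = {"hourly": 48, "daily": 30, "weekly": 13, "monthly": 12}


def reach_days(tier_counts):
    """Return how many days back each tier reaches when fully populated."""
    return {
        tier: count * SPAN_HOURS[tier] / 24
        for tier, count in tier_counts.items()
    }


reach = reach_days(POSTGRES_TIERS)
print(reach)
print("retained backups:", sum(POSTGRES_TIERS.values()))
print("weeklies cover 90 days:", reach["weekly"] >= 90)
```

The weekly tier reaches 91 days, which is what covers the contractual window. Note that resolution beyond day 30 is weekly, not daily, so if a contract is worded as daily backups retained for 90 days, the daily tier itself would need to be extended.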
Frequently asked questions
How long should I keep database backups?
For most production teams: hourly backups for 1 to 3 days, dailies for 2 to 4 weeks, weeklies for 2 to 3 months, monthlies for 12 months. Extend specific tiers only when a named regulation or contract requires it, and document the reason. Twelve months of monthly reach back covers the overwhelming majority of slow discovery incidents.
Does SOC 2 require a specific backup retention period?
No. SOC 2 requires that you have a documented retention policy, that your systems enforce it, and that you can demonstrate both, including recovery testing. Auditors fail undocumented or unenforced practices, not particular durations. Pick durations that match your risks and obligations, write them down, and automate them.
Do GDPR erasure requests apply to backups?
In principle, yes: personal data in backups is still personal data. The accepted operational approach is bounded retention (so erased data ages out on a documented schedule) plus a restore runbook step that reapplies erasures if an old backup is ever restored. Unbounded backup retention makes honest GDPR compliance effectively impossible.
Should every backup tier be immutable?
Lock the tiers that matter most for disaster scenarios, typically weeklies and monthlies, with object lock durations matching their retention. Hourly tiers churn too fast for locking to add much, and blanket long locks conflict with bounded retention promises. Versioning plus separate credentials covers the short tiers well.
Turn the policy into running infrastructure
A retention policy protects you only when something enforces it every hour of every day and screams when it cannot. Ottomatik turns the policy in this article into running infrastructure in one sitting: MySQL, PostgreSQL, MongoDB, and file backups on hourly through monthly schedules, tiered retention rotation enforced automatically, 15+ storage destinations for clean off site copies, serverless support for RDS, Supabase, Neon, and PlanetScale, and state change alerts so silent failure is off the table, all for $79/month. Sign up and have your retention policy enforced by tonight.

