Background
UBSi-Commvault is our current backup service, based on Commvault software. UBI was our previous backup service, based on IBM software (variously named Storage Protect, Spectrum Protect, and Tivoli Storage Manager / TSM).
Users who have experienced both services may wish to understand the differences between the retention methods used by each service and the technical reasons behind those differences.
Summary of retention rules
In UBI, we offered two main retention rules (also described here):
- Our default retention, named “STANDARD”, retained unlimited versions of a file for 30 days after each version became inactive.
- Our “EXTENDED” retention retained up to 5 versions of a file for 180 days, with the last version being retained for 365 days.
The “EXTENDED” retention in UBI had the limitation that a file which changed frequently would tend to be limited by the maximum version count, and thus might not benefit from the full retention duration.
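The interaction between the version cap and the retention duration can be illustrated with a rough calculation. This is only a sketch, not how the IBM software computes expiration: the function name and parameters are hypothetical, and it assumes that the cap of 5 counts the active version as well as the inactive ones and that the file changes at a fixed interval.

```python
def inactive_days_kept(change_interval_days, max_versions=5, retention_days=180):
    """Approximate number of days an inactive version survives.

    Assumptions (not taken from the product documentation):
    - the file is modified and backed up every `change_interval_days` days;
    - `max_versions` counts the active version plus all inactive versions.
    """
    # A version becomes inactive when the next version arrives, and is pushed
    # out once enough newer versions exist to exceed the cap.
    days_until_pushed_out = (max_versions - 1) * change_interval_days
    # The version is removed by whichever limit is reached first.
    return min(retention_days, days_until_pushed_out)


print(inactive_days_kept(1))   # file changes daily -> 4
print(inactive_days_kept(60))  # file changes every 60 days -> 180
```

Under these assumptions, a file that changes daily keeps each old version for only a few days, while a file that changes rarely gets the full 180 days.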
In UBSi-Commvault, we offer two main retention rules (also described here):
- Our recommended retention plan, named “31 Days (Default)”, retains unlimited backup jobs for 31 days after the backup occurs.
- Our “Extended” retention plan retains unlimited backup jobs for 31 days after the backup occurs, and retains one “full” backup job from each quarter for 1 year after the backup occurs.
The “Extended” retention in UBSi-Commvault has the limitation that a file which is created and then deleted within the same quarter might not be caught by the quarterly full, and thus might not be retained for the full retention duration.
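The quarterly-full limitation can be sketched as a simple date check. This is a simplified model with hypothetical names and dates: it assumes a file appears in a retained full only if it existed on the client on the day that full was taken, and it does not model which full Commvault selects for each quarter.

```python
from datetime import date


def captured_by_a_full(created, deleted, full_dates):
    """Return True if the file existed on the date of any retained full.

    `full_dates` is a hypothetical list of dates of the retained quarterly fulls.
    """
    return any(created <= full_date < deleted for full_date in full_dates)


# Hypothetical retained quarterly fulls.
quarterly_fulls = [date(2024, 1, 1), date(2024, 4, 1)]

# Created and deleted between two quarterly fulls: covered only by the 31-day jobs.
print(captured_by_a_full(date(2024, 2, 10), date(2024, 3, 5), quarterly_fulls))  # False

# Still present when the next quarterly full ran: retained for the longer period.
print(captured_by_a_full(date(2024, 2, 10), date(2024, 5, 1), quarterly_fulls))  # True
```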
In both services, we may occasionally create custom retention rules for specific business cases. When we do so, we need to be careful to balance the business needs with our resource constraints and the potential impact on the rest of the environment and the backups of other users.
More details
Three main factors, described below, contribute to the differences between the two services.
1. Retention granularity
UBI applied retention rules to individual files. During a backup job, each new file version was added as an independent record with its own backup timestamp. Previous versions of that file would be marked as inactive automatically, but records for other files would not be affected.
In UBSi-Commvault, retention applies to entire backup jobs. When a backup runs, all changed files are backed up into a combined record for that backup job. All files within that job share the same backup time and retention behavior.
Due to the file-level granularity, UBI allowed us to limit the maximum number of versions stored for each file. In UBSi-Commvault, we are unable to limit the maximum number of versions of a file.
2. Retention timing
Both services allow us to limit how long a copy of a file is retained, but the implementation differs somewhat.
In UBI, the most recent (active) version of a file was retained for an unlimited duration. When a newer version of the file was backed up, or a backup job found that the file had been deleted, then the previous version would be marked inactive. The timer for retention duration of the previous version would begin at that time. If backups stopped occurring, all active files would continue to be retained indefinitely.
In UBSi-Commvault, the retention duration applies to the entire backup job. The timer for retention duration begins at the point of backup. If backups stop occurring, all backup jobs will eventually expire and be removed.
For example, if…
- I am using a 31-day retention for my files.
- I created a file and then backed it up yesterday.
- I modified the file and then backed it up today.
Then…
In UBI, I would have:
- One active version of the file with a backup date of today and no expiration date.
- One inactive version of the file with a backup date of yesterday and an expiration 31 days from today.
In UBSi-Commvault, I would have:
- One backup job from today, with an expiration 31 days from today.
- One backup job from yesterday, with an expiration 31 days from yesterday.
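The same example can be worked through with concrete dates. The sketch below is illustrative only: the function names and the dates are hypothetical, and it models just the timer rules described in this section (per-version timers that start when a version becomes inactive, versus per-job timers that start at backup time).

```python
from datetime import date, timedelta

RETENTION = timedelta(days=31)


def ubi_style_expirations(backup_dates):
    """Per-version model: a version's timer starts when the next version is
    backed up; the newest (active) version has no expiration (None)."""
    ordered = sorted(backup_dates)
    result = []
    for i, backed_up in enumerate(ordered):
        if i + 1 < len(ordered):
            became_inactive = ordered[i + 1]
            result.append((backed_up, became_inactive + RETENTION))
        else:
            result.append((backed_up, None))
    return result


def job_style_expirations(backup_dates):
    """Per-job model: every job's timer starts at its own backup time."""
    return [(backed_up, backed_up + RETENTION) for backed_up in sorted(backup_dates)]


yesterday, today = date(2024, 5, 1), date(2024, 5, 2)

for backed_up, expires in ubi_style_expirations([yesterday, today]):
    print("UBI version", backed_up, "expires", expires)
for backed_up, expires in job_style_expirations([yesterday, today]):
    print("Commvault job", backed_up, "expires", expires)
```

With these dates, the per-version model gives yesterday's version an expiration of 2024-06-02 and today's version none, while the per-job model gives expirations of 2024-06-01 and 2024-06-02.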
3. Backup scheme
UBI used an “incremental forever” backup scheme. All backups were “incremental” in the sense that they backed up only new and modified files and stored individual file versions. There was no separate “full” backup behavior; a “full” backup was just an “incremental” backup of all files on the device. Expired file versions could be deleted independently of any other files or retention.
UBSi-Commvault uses a more traditional “full” and “incremental” backup scheme. This is required due to the job-based retention. The first backup for a client must be a “full” backup job, which stores a copy of all of the client’s data. Subsequent backups can be “incremental” backup jobs, which store only the files that are new or modified since the previous job (whether full or incremental). These incremental copies depend on the presence of the previous full. Periodically, the Commvault software will automatically create additional full copies to aid in data expiration. Typically these will be “synthetic fulls”, created by merging the previous backup jobs into a new, updated full without needing to re-read all the data from the client. When deleting expired backup data, a full can only be deleted if all of the incrementals which depend on it are also ready to be deleted.
As a result, UBSi-Commvault will typically store multiple full backups of a client in addition to the daily incremental copies, and may occasionally retain jobs for longer than configured. This would otherwise result in significant storage usage by redundant data, but our storage pools are configured for deduplication, which mostly compensates for this effect.
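The deletion rule for fulls and their dependent incrementals, and why it can keep jobs past their configured retention, can be sketched as follows. This is a simplified model rather than Commvault's actual pruning logic: the `Job` class and `prunable_jobs` function are hypothetical, and the sketch treats each full together with the incrementals that follow it as a single unit that is removed all at once.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    kind: str      # "full" or "incr"
    expired: bool  # True once the job's own retention period has passed


def prunable_jobs(jobs):
    """Return the names of jobs that can be deleted, oldest first.

    `jobs` must be in backup order. A full and its following incrementals
    form one group, and the group is prunable only when every job in it
    has expired.
    """
    groups = []
    for job in jobs:
        if job.kind == "full" or not groups:
            groups.append([job])    # a full starts a new group
        else:
            groups[-1].append(job)  # an incremental depends on the current group
    prunable = []
    for group in groups:
        if all(job.expired for job in group):
            prunable.extend(job.name for job in group)
    return prunable


history = [
    Job("F1", "full", True),
    Job("I1", "incr", True),
    Job("I2", "incr", False),  # not yet expired, so F1 and I1 must be kept too
    Job("F2", "full", False),
    Job("I3", "incr", False),
]
print(prunable_jobs(history))  # []
```

In this model, F1 and I1 have passed their retention but remain on storage until I2 also expires, which corresponds to jobs occasionally being retained for longer than configured.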