Volume Backup and Replication – The advanced features

Oracle Cloud (OCI) gives administrators the ability to back up the underlying virtual disks (called Volumes) using BV backups. Whether it’s a BOOT volume (the main drive in any compute instance) or an attached BLOCK volume (additional storage), the process is almost identical, with just a small syntax difference. Note that you cannot back up the volumes which make up PaaS services like an Oracle Database VM. I’ll cover doing clever things with these in a later blog.

The OCI console allows you to define a backup policy, which sets the backup frequency, the retention period, and whether each backup is FULL or INCREMENTAL. A backup policy can include multiple schedules (for example a daily backup retained for 2 weeks, a weekly retained for 6 weeks, and a monthly retained for 5 years) and can specify whether to replicate the backups to another region. Each volume, however, can only have a single backup policy.

Firstly, it’s worth noting the difference between FULL and INCREMENTAL backups. The first backup of any volume is always a FULL. A full backup is great as it’s very quick to restore from, but it is also more costly, as it is the same size as the original volume it’s based on. An INCREMENTAL is the difference (in blocks) between the last backup and the current position. As a result, incremental backups can take longer to restore from. Even after restoring, Oracle performs “rehydration” activities in the background, which use both the INCREMENTAL and its preceding FULL backup. Until all this is done, there are some things you can’t do with a restored volume, and performance may be slightly reduced. Interestingly, if you have a FULL backup and a subsequent INCREMENTAL, and then you delete the FULL, the INCREMENTAL (invisibly) becomes a FULL, and the deletion of the original FULL is not complete until all the blocks have been copied into the INCREMENTAL.

Anyway, volume backups are great because they provide a consistent point-in-time backup for any of your block volumes, and by using volume groups you can make the backups consistent across multiple volumes. This means that if the worst happens while you’re applying a Windows patch and your OS becomes unusable, you can flash your C: back to last night’s backup and get your system back (as long as you observed good drive policy and kept your data on an attached D: or E: rather than on C:). The same goes for Linux.

Oracle also provides the ability to perform a cross-region replica, so any backup you take is automatically (although asynchronously) replicated to a second region of your choice.

This all sounds like exactly what any administrator wants, but what are the drawbacks?

  1. The GUI only allows you to take daily backups (at most)
  2. Cross-region replication copies EVERY backup. You can only apply a single backup policy to a volume, and replication is simply turned on or off for that policy. This can double your storage costs for backups (because every cross-region copy is a FULL, irrespective of the type of backup at the origin), and it can also introduce cross-region transfer costs.

So what are the options?

There is an API for everything in OCI. In my first blog I talked about how to use them, so check that one out if you’re not sure, but the APIs allow you to do a lot more than the GUI. One example is that you could take an incremental backup every hour if you wanted (although I wouldn’t recommend anything more frequent than that, as you cannot perform 2 backups of a single volume at the same time).
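If you do schedule something like an hourly incremental, it’s worth guarding against a run starting while the previous backup of the same volume is still going. Here is a minimal sketch of one way to do that with a lock directory. The names run_exclusive and LOCK_ROOT are my own illustrations, not anything provided by OCI, and the exit code 75 is an arbitrary choice.

```shell
# Run a command only if no other run holds the lock for this volume.
# LOCK_ROOT and run_exclusive are illustrative names, not OCI features.
LOCK_ROOT="${LOCK_ROOT:-/tmp/bv-backup-locks}"

run_exclusive() {
    # $1 = volume OCID, remaining arguments = the command to run
    vol_id="$1"
    shift
    mkdir -p "$LOCK_ROOT" || return 1
    lock_dir="$LOCK_ROOT/$vol_id.lock"

    # mkdir is atomic: it fails if the directory already exists.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "Backup already running for $vol_id, skipping" >&2
        return 75
    fi

    "$@"
    rc=$?
    rmdir "$lock_dir"
    return $rc
}
```

An hourly job could then call run_exclusive "$ID" followed by the backup command for that volume. One limitation of this sketch: if the script is killed mid-run, the lock directory is left behind and has to be removed by hand.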

The syntax for taking a manual backup of a block volume using the OCI CLI is as follows:

oci bv backup create --volume-id <ID> --display-name "ad-hoc backup" --type INCREMENTAL

Or for a boot volume

oci bv boot-volume-backup create --boot-volume-id <ID> --display-name "ad-hoc backup" --type INCREMENTAL

(I’m going to stick with block volumes for now, because as you can see the syntax is almost identical, but you can get the full syntax here: https://docs.oracle.com/en-us/iaas/tools/oci-cli/3.30.2/oci_cli_docs/oci.html, and just look for the syntax you want)
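Since the only differences are the subcommand and the ID parameter (the boot volume form takes --boot-volume-id), a script can pick the right form from the OCID itself. This is a rough sketch, assuming block volume OCIDs start with ocid1.volume. and boot volume OCIDs start with ocid1.bootvolume. The helper name backup_command is hypothetical, and it only prints the command rather than running it.

```shell
# Print the backup command for a volume, based on its OCID prefix.
# backup_command is an illustrative helper; it prints, it does not call OCI.
backup_command() {
    vol_id="$1"
    name="$2"
    case "$vol_id" in
        ocid1.bootvolume.*)
            echo "oci bv boot-volume-backup create --boot-volume-id $vol_id --display-name \"$name\" --type INCREMENTAL" ;;
        ocid1.volume.*)
            echo "oci bv backup create --volume-id $vol_id --display-name \"$name\" --type INCREMENTAL" ;;
        *)
            # Anything else (instances, backups, typos) is rejected.
            echo "Not a block or boot volume OCID: $vol_id" >&2
            return 1 ;;
    esac
}
```

Printing the command first is handy for a dry run before letting a script loose on real volumes.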

What I wanted to achieve was to take a weekly backup of certain volumes, and replicate these to a second region (which we could then use as DR restores, cloning sources etc). My criteria were:

  • Not every BOOT or BLOCK volume was needed
  • I didn’t want to set an expiry on the replicated volumes (OCI by default sets the same expiry on a replicated volume as the source)
  • I did want to maintain a specific number of copies (e.g. only the most recent 3), to self-manage the storage costs

In order to achieve this, I first created a tag to mark the volumes I wanted to replicate. Tags are super useful in OCI for a whole bunch of different reasons, like searching, but in this case it allowed me to add or remove a specific volume from the schedule without actually changing the script, just by adding or removing a tag.
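As a rough sketch of how a tag can drive the selection, the helper below builds a structured-search query for block and boot volumes carrying a given tag. I’m assuming a freeform tag here; the key and value (replicate / yes) in the usage are made-up examples, and the function names are my own.

```shell
# Build a structured-search query for volumes carrying a freeform tag.
# $1 = tag key, $2 = tag value (both are whatever you chose for your tag).
tag_query() {
    printf "query volume, bootvolume resources where (freeformTags.key = '%s' && freeformTags.value = '%s')" "$1" "$2"
}

# Return the OCIDs of the matching volumes as a JSON array.
find_tagged_volumes() {
    oci search resource structured-search \
        --query-text "$(tag_query "$1" "$2")" \
        --query 'data.items[*].identifier'
}
```

Calling find_tagged_volumes replicate yes would then return the OCIDs to loop over. A defined (namespaced) tag would need a different where clause.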

I then wrote a script which used this flow:

  • Find all volumes, in all compartments which had the specific tag, and build an array of these
  • Loop round this array of volumes
    • Perform an ad-hoc backup for the volume, grabbing its backup ID in the process (specifying no expiry)
    • Wait until the backup had completed
    • Once completed, replicate this backup to the second site
    • Wait until the replication had completed
    • Delete the ad-hoc backup you just took
    • Prune the number of backups at the target based on an agreed retention
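The pruning step at the end of that flow is the easiest one to get wrong, so here is a minimal sketch of the selection logic on its own. It assumes you already have the backup IDs for one volume, one per line and newest first. The function name backups_to_prune and the helper mentioned in the usage comment are hypothetical.

```shell
# Print the backup IDs that fall outside the retention window.
# Reads one ID per line on stdin, newest first; $1 = number of copies to keep.
backups_to_prune() {
    keep="$1"
    # tail -n +K starts output at line K, so this skips the first $keep lines.
    tail -n "+$((keep + 1))"
}

# Hypothetical usage, assuming list_copies_newest_first (not shown) prints the
# OCIDs of one volume's replicated backups in the target region, newest first:
#
#   list_copies_newest_first "$ID" | backups_to_prune 3 | while read -r old; do
#       oci bv backup delete --region "$TARGET_REGION" --volume-backup-id "$old" --force
#   done
```

Keeping the selection separate from the delete means you can print the list and eyeball it before anything is removed.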

I’m not going to cover all the syntax I used for all of these processes (I will eventually in subsequent blogs), however there were a few interesting ways of accessing / processing the data. The first was the structured-search, and I’ll cover this in the next blog, as it’s exceptionally useful for getting access to data in a whole raft of different ways (the syntax can also be used both in the console and in the CLI, which makes it very versatile).

The snippet of code I wanted to include here is how to initiate a backup, then wait for it to complete.

When you initiate a backup like this, you have an option to delay returning to the command prompt:

oci bv backup create --volume-id <OCID> --display-name "ad-hoc for replication" --type INCREMENTAL

There are 2 useful flags, “--wait-for-state” and “--wait-interval-seconds”, which can be used to delay this statement completing. However, these don’t give you a lot of control. You can wait until the backup is completed, but how long do you wait? If your time expires, you never know whether the backup actually finished.
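For reference, this is roughly what the built-in wait looks like, as a sketch rather than what I ended up using. It adds --max-wait-seconds to cap the wait. MAX_WAIT, POLL and the function name backup_and_wait are my own illustrative choices.

```shell
# Create a backup and block until it is AVAILABLE or the wait gives up.
# MAX_WAIT and POLL are illustrative defaults, in seconds.
MAX_WAIT="${MAX_WAIT:-1800}"
POLL="${POLL:-30}"

backup_and_wait() {
    # $1 = block volume OCID
    if oci bv backup create --volume-id "$1" \
            --display-name "ad-hoc for replication" --type INCREMENTAL \
            --wait-for-state AVAILABLE \
            --max-wait-seconds "$MAX_WAIT" \
            --wait-interval-seconds "$POLL" >/dev/null; then
        echo "Backup available"
    else
        # Non-zero exit: the wait timed out or the call failed.
        # From here we cannot tell which, or what state the backup is in.
        echo "Backup not confirmed (timed out after $MAX_WAIT seconds, or the call failed)" >&2
        return 1
    fi
}
```

The else branch is exactly the problem: all the script knows is that the wait did not succeed.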

Instead, this command also returns the ID of the running backup, so I captured it and then looped, repeatedly checking the status of that backup. This meant I could react accurately to the status, and I didn’t waste time on small backups, because a fixed wait has to be coded for the worst case.

So my script now looks like this:

export OCI_CLI_PROFILE=LONDON
# ID (the volume OCID) and BKUP_SLEEP (seconds between checks) are assumed to be set earlier in the script
# Start the backup and pull its OCID out of the JSON response
BACKUP=`oci bv backup create --volume-id "$ID" --display-name "ad-hoc for replication" --type INCREMENTAL \
    | grep "ocid1.volumebackup" | sed -e "s?\"id\":??g" | sed -e "s? ??g" | sed -e "s?,??g" | sed -e "s?\"??g"`
echo "Initiating backup, waiting to complete"
sleep "$BKUP_SLEEP"
# Read the lifecycle-state field from the backup's details
STATUS=`oci bv backup get --volume-backup-id "$BACKUP" | grep "lifecycle-state" | awk -F: '{print $2}' \
    | sed -e "s/\"//g" | sed -e "s/ //g" | sed -e "s/,//g"`
while [ "$STATUS" = "CREATING" ]
do
    echo "Backup still CREATING.. Sleeping for $BKUP_SLEEP seconds `date +'%Y-%m-%d %H:%M:%S'`"
    sleep "$BKUP_SLEEP"
    STATUS=`oci bv backup get --volume-backup-id "$BACKUP" | grep "lifecycle-state" | awk -F: '{print $2}' \
        | sed -e "s/\"//g" | sed -e "s/ //g" | sed -e "s/,//g"`
done
echo "Backup finished with status $STATUS"

The script does the following:

  1. Set environment (see my first blog)
  2. Initiate backup and record backup ID
  3. Go into a loop checking the status of the backup (with a variable so you can adjust how long between loops)
  4. Exit loop with a status when the backup is no longer “CREATING”
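A slightly tidier variant of the same loop is sketched below. It uses the CLI’s --query and --raw-output options to pull out single fields instead of grep and sed, caps the number of checks, and only reports success if the backup ends up AVAILABLE. The function names, MAX_POLLS and the default values are my own assumptions, not part of the original script.

```shell
# Start an incremental backup and print just its OCID.
start_backup() {
    oci bv backup create --volume-id "$1" \
        --display-name "ad-hoc for replication" --type INCREMENTAL \
        --query 'data.id' --raw-output
}

# Print the lifecycle state of a backup as a bare string.
backup_state() {
    oci bv backup get --volume-backup-id "$1" \
        --query 'data."lifecycle-state"' --raw-output
}

# Poll until the backup leaves its in-progress states, or give up after
# MAX_POLLS checks. Returns 0 only if the final state is AVAILABLE.
wait_for_backup() {
    backup_id="$1"
    polls=0
    state="$(backup_state "$backup_id")"
    while [ "$state" = "CREATING" ] || [ "$state" = "REQUEST_RECEIVED" ]; do
        polls=$((polls + 1))
        if [ "$polls" -ge "${MAX_POLLS:-120}" ]; then
            echo "Gave up waiting for $backup_id" >&2
            return 2
        fi
        sleep "${BKUP_SLEEP:-60}"
        state="$(backup_state "$backup_id")"
    done
    echo "Backup finished in state $state"
    [ "$state" = "AVAILABLE" ]
}
```

Used together, something like BACKUP=$(start_backup "$ID") && wait_for_backup "$BACKUP" would only carry on to the replication step when the backup is actually usable.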
