Understanding the difference between a backup and a snapshot

Our support team often gets asked, “Can I just use a snapshot as my backup?” Even though snapshots seem like an easy way to back up a virtual machine, you will learn in this post that they are not the most effective solution and can lead to problems later.

What exactly is a backup?

A backup is the process of creating a consistent copy of data and moving it to a different location. Traditionally, backups were written to a removable storage device, such as a tape or removable drive, and then taken to another location. Today, backing up to another location requires only a secure network connection and a backup software tool, along with available storage on the receiving end.
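To make the "independent copy in a different location" idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular backup product works: the function names `sha256_of` and `back_up` are hypothetical, and a second directory stands in for the remote storage that a real backup would reach over a secure network connection.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destination_dir: Path) -> Path:
    """Copy `source` into `destination_dir` and verify the copy by checksum.

    `destination_dir` is an assumption standing in for a separate location;
    in practice it would be storage at another site.
    """
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.copy2(source, target)  # a full, independent copy of the data
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} failed verification")
    return target
```

Because the copy does not depend on the source file, deleting the original afterwards leaves the backup fully readable.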

What is a snapshot?

What is the difference between a snapshot and a backup? A snapshot is the state, or image, of a storage device or virtual machine at a specific point in time. Although a snapshot does carry some traits of a backup, it is NOT a full-blown backup. Snapshots are stored in the same location as the original data, so if that location is lost, the snapshot is lost along with the original VM and data.

Backup vs Snapshots

Snapshots are helpful in restoring data locally in the event of data loss or corruption, because you can roll back to the most recent snapshot. But just like your VM, the storage that holds the snapshot(s) runs the risk of corruption.

When snapshots are taken, they are added to a snapshot tree, which shows all of the snapshots and the relationships between them. If a parent snapshot, or VDI (virtual disk image), in that tree has been corrupted, all of its child images are corrupted as well. On top of that, a corrupt VDI may prevent you from taking additional snapshots in the future.
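The way corruption flows down a snapshot tree can be sketched with a small Python model. This is a simplified illustration, not any hypervisor's actual data structure: the `Snapshot` class and its `take_child` and `is_usable` methods are hypothetical names, and the only rule modeled is that a snapshot is usable when neither it nor any of its ancestors is corrupted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Snapshot:
    """One node (VDI) in a toy snapshot tree."""
    name: str
    parent: Optional["Snapshot"] = None
    corrupted: bool = False
    children: List["Snapshot"] = field(default_factory=list)

    def take_child(self, name: str) -> "Snapshot":
        """Take a new snapshot that depends on this one."""
        child = Snapshot(name, parent=self)
        self.children.append(child)
        return child

    def is_usable(self) -> bool:
        """A snapshot is usable only if it and every ancestor are intact."""
        node: Optional["Snapshot"] = self
        while node is not None:
            if node.corrupted:
                return False
            node = node.parent
        return True


base = Snapshot("base")
monday = base.take_child("monday")
tuesday = monday.take_child("tuesday")
side_branch = base.take_child("side-branch")

monday.corrupted = True  # damage a single parent image

print(tuesday.is_usable())      # child of the corrupted parent
print(side_branch.is_usable())  # sibling branch that does not depend on it
```

In this model, damaging `monday` takes `tuesday` down with it, while `side_branch` survives only because it does not descend from the corrupted image.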

A true backup solution must also not be reliant on the original data disk format. Just about every snapshot implementation (VMware, KVM, XenServer, Hyper-V) relies on the original source data in order to restore to a certain point in time. This means that if a virtual machine has disappeared or has been removed or deleted, its snapshots are completely worthless. A snapshot does not create a new copy of the data; it only preserves the original file.
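The dependency on the source data can be illustrated with a toy copy-on-write model in Python. This is a sketch under stated assumptions, not a description of how VMware, KVM, XenServer, or Hyper-V implement snapshots: the classes `VirtualDisk`, `Snapshot`, and `Backup` are hypothetical, the disk has a fixed set of blocks, and the snapshot keeps only the old contents of blocks that were overwritten after it was taken.

```python
from typing import Dict, List, Optional


class VirtualDisk:
    """Toy disk: a fixed set of numbered blocks."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.blocks: Optional[Dict[int, bytes]] = dict(blocks)
        self.snapshots: List["Snapshot"] = []

    def write(self, number: int, data: bytes) -> None:
        """Overwrite an existing block, letting snapshots save the old data."""
        if self.blocks is None or number not in self.blocks:
            raise KeyError(f"no such block: {number}")
        for snapshot in self.snapshots:
            snapshot.preserve(number, self.blocks[number])
        self.blocks[number] = data

    def delete(self) -> None:
        """Simulate the VM and its disk being removed."""
        self.blocks = None


class Snapshot:
    """Holds only pre-change blocks; everything else is read from the disk."""

    def __init__(self, disk: VirtualDisk) -> None:
        self.disk = disk
        self.preserved: Dict[int, bytes] = {}
        disk.snapshots.append(self)

    def preserve(self, number: int, old_data: bytes) -> None:
        # Keep only the first pre-write version of each block.
        self.preserved.setdefault(number, old_data)

    def restore(self) -> Dict[int, bytes]:
        if self.disk.blocks is None:
            raise RuntimeError("source disk is gone; snapshot cannot be restored")
        state = dict(self.disk.blocks)
        state.update(self.preserved)  # roll changed blocks back
        return state


class Backup:
    """Holds a full, independent copy of every block."""

    def __init__(self, disk: VirtualDisk) -> None:
        if disk.blocks is None:
            raise RuntimeError("nothing to back up")
        self.blocks = dict(disk.blocks)

    def restore(self) -> Dict[int, bytes]:
        return dict(self.blocks)
```

While the disk exists, both `Snapshot.restore()` and `Backup.restore()` return the point-in-time state. Once `delete()` is called on the disk, only the backup can still be restored, which is the distinction the paragraph above draws.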

Virtual machine performance degrades as more snapshots are taken. The degradation depends on how long the snapshot or snapshot tree has been in place, the depth of the tree, and how much the virtual machine and its guest operating system have changed since you originally took the snapshot. NOTE: Depending on your hypervisor, this performance issue can be resolved by shortening the snapshot tree, for example through coalescing in XenServer or by merging snapshots with lvconvert on KVM.

So, what does this all mean to you as a service provider? A snapshot is an easy way to roll back a virtual machine or storage device to a specific point in time, but it relies 100% on the original data. A snapshot is not a solution for backing up your cloud. Instead, consider using a true backup solution that keeps a separate backup in a separate cloud zone for complete redundancy and to ensure recovery in the event of a disaster.