Ceph Backup: Don’t Lose Your HCI Data
This is more of a public service announcement than a technical blog post, but we will take a high-level look at a warning for anyone using Ceph, or CephFS on top of Ceph, as HCI storage for virtual machines or files: don’t just trust your hypervisor-based virtual machine backups. Why? Well, let’s take a look.
What is Ceph? Brief overview
Ceph is software-defined storage that takes the local disks on server hosts and aggregates them across the servers in a cluster so that the storage pool looks like a single shared storage volume. You can also enable CephFS on top of Ceph to get an HCI solution where you can store your files.
Why backups matter
Backups are extremely important even with a software-defined solution like Ceph. Why? Well, even though your data is protected from a hardware failure, you still need backups to protect you from things like accidental data deletion or ransomware. Let’s say you are running CephFS on top of Ceph and a user has files stored in CephFS. Their workstation then gets infected with ransomware that encrypts all their files stored in the CephFS share.
The HCI data protection doesn’t prevent the files from being encrypted, nor does it automatically recover those files. To Ceph, these are just normal file changes. So, you need a backup solution for that.
Traditional VM backups don’t protect CephFS
Here is something I discovered recently. I’m glad I did, although I should have thought about it beforehand: the CephFS mount isn’t backed up the way you might expect in a regular virtual machine backup. In fact, when you try a “File level restore” in a modern backup solution, you won’t see the files listed in your mount directory. Why?
Well, because the files don’t actually reside on the virtual machine’s disk. CephFS presents the files at the mount point so users can see and interact with them, but the data itself lives in the Ceph cluster. An image-level backup of the VM’s disks therefore captures only an empty mount directory.
Case in point. Let me show you two views of the same server: the live file-level view, and what a modern backup solution actually sees when you try to perform a file-level restore.
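To see which directories on a Linux guest are affected, you can look at the file system type of each mount. The following is a minimal sketch, not part of any backup product. It assumes a Linux guest that exposes `/proc/mounts`, and it assumes the type names `ceph` (kernel CephFS client), `fuse.ceph-fuse`, and `fuse.glusterfs` are how the mounts appear on your system. The function names are hypothetical.

```python
# Assumption: these type names match what your clients report in /proc/mounts.
# "ceph" is the kernel CephFS client; the "fuse." names are userspace clients.
CLUSTER_FS_TYPES = {"ceph", "fuse.ceph-fuse", "fuse.glusterfs"}


def find_cluster_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs for cluster file system mounts."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device mount_point fs_type options dump pass
        if len(fields) >= 3 and fields[2] in CLUSTER_FS_TYPES:
            found.append((fields[1], fields[2]))
    return found


def report_cluster_mounts(mounts_path="/proc/mounts"):
    """Print every mount whose files live in the cluster, not on the VM disk."""
    with open(mounts_path) as handle:
        mounts = find_cluster_mounts(handle.read())
    for mount_point, fs_type in mounts:
        print(f"{mount_point} ({fs_type}): not stored on this VM's disk")
    return mounts
```

Calling `report_cluster_mounts()` inside the guest lists the directories whose contents an image-level VM backup would be expected to miss.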
Below is the file-level view of the server. As you can see there are definitely files mounted in this directory.
Below is a backup I took of one of the VMs participating in the CephFS-enabled cluster. As you can see, when I mount the backup of the virtual machine, no files are present. Keep this in mind if you are using Ceph or another software-defined storage solution, such as GlusterFS. Both behave the same way. Don’t be caught unawares if you have a disaster and need to recover. Also, this has nothing to do with the backup solution itself. The screenshot is from NAKIVO Backup & Replication, but the same test in Veeam produced the same result.
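The comparison above can also be automated as part of a restore test. This is a rough sketch under stated assumptions: it assumes you have already mounted or restored the backup to a directory you can read, and the function names and example paths are hypothetical. It simply lists files that exist in the live directory but are absent from the restored copy.

```python
import os


def list_relative_files(root):
    """Return the set of file paths under root, relative to root."""
    files = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            files.add(os.path.relpath(full_path, root))
    return files


def missing_from_backup(live_dir, restored_dir):
    """Return sorted relative paths present live but absent from the restore."""
    live = list_relative_files(live_dir)
    restored = list_relative_files(restored_dir)
    return sorted(live - restored)
```

For example, `missing_from_backup("/mnt/cephfs", "/restore/mnt/cephfs")` (both paths are placeholders) would return every file in the share if the backup captured only the empty mount directory.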
Solution for backing up Ceph and CephFS
So, what is the solution for backing up something like Ceph and CephFS? Well, there are a couple of scenarios:
- Something like Virtual Machines running on Ceph
- Files residing in CephFS
With Ceph storage that is backing something like virtual machine hypervisors, the traditional backups you would take of VMs are fine. They will capture the data, since the hypervisor is aware of the storage and “sees” it, so the backup solution will be able to as well.
However, CephFS is a little more dangerous here as it is a file system running on top of Ceph. So, there is the additional abstraction layer to think about. How did I work around this?
I used physical machine backup agents running inside the virtual machine. With a backup solution like Veeam that has agents (most other solutions do as well), you can install the agent inside the virtual machine, even though most of these are called physical machine backup agents. The agent then sees the mounted files as a user would see them and can back up your data from there.
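The reason this works is that any process inside the guest reads the files through the mount, just like a user does. The sketch below illustrates that idea with a plain file-level archive. It is not how Veeam or any other agent is implemented, and the function name and paths are hypothetical. It assumes the archive is written somewhere outside the mounted directory.

```python
import os
import tarfile


def archive_mounted_dir(mount_dir, archive_path):
    """Write a gzip-compressed tar of everything visible under mount_dir."""
    # Store paths relative to the directory name rather than the full path.
    top_level = os.path.basename(os.path.normpath(mount_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        # Files are read through the mount, so CephFS content is included.
        archive.add(mount_dir, arcname=top_level)
    return archive_path
```

A call such as `archive_mounted_dir("/mnt/cephfs", "/backup/cephfs.tar.gz")` (placeholder paths) captures the files as the guest sees them, which is exactly what an image-level VM backup cannot do.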
Guest files restore
Below, I will walk you through what you then see with a guest files restore in something like Veeam. Here I have created an agent-based backup of the Ubuntu Server virtual machine. Now I am choosing Restore > Guest files restore.
Choose your operating system type.
Choose the virtual machine backup you want to mount.
Choose the restore point.
Choose the helper host (the Linux machine that will be used to mount the backup).
Enter a reason (optional).
Finally, click Browse.
You will then get a file browser that will enable you to browse your backed-up data and restore it.
Wrapping up
Even though software-defined storage is awesome and allows us to do so many things in the enterprise data center, be sure to understand the implications for backing up and recovering your data. Don’t assume that because you have a backup of all the virtual machine hosts in the cluster, you will be able to recover your data. The abstraction layer used by storage like CephFS prevents traditional VM-level backups from capturing the data in the mounted folders. You will need a backup agent running inside the guest to capture the files mounted from your HCI storage in something like CephFS.