Secure Backup Concepts in a Private Cloud
Creating a backup procedure is very important for a company, because a loss of data can cause great damage, possibly even bankruptcy. Proper backup mechanisms are often missing or poorly designed, which only becomes evident when it’s already too late, that is, once a disk failure or data loss has happened and restoration is needed. At that point, the company frequently discovers that the backup which was supposedly in place doesn’t exist or was never working. When these problems are not addressed, data loss is imminent and is just waiting to happen.
Imagine a scenario where we buy a new NAS system with brand new hard drives and throw in a couple of extra bucks for an SSD, which is used for caching to speed up disk operations. Such a system works only as long as the hard drives are working and not malfunctioning. The life span of the backup media is an important factor to consider when planning a backup system; different backup media are available, and the most common ones are presented below.
- SSDs (Solid State Drives): solid state drives have no moving parts, which makes them resistant to the mechanical failures that are common in HDDs. However, flash-based SSDs can only be erased and written a limited number of times, which is the main cause of SSD wear-out.
- HDDs (Hard Disk Drives): hard drives contain moving parts that are subject to mechanical failure, but they store data on magnetic platters, which don’t degrade from read and write operations. Therefore, the life expectancy of an HDD is not limited by the number of writes the way an SSD’s is.
- Tape Drives: magnetic tapes are used to store data offline in an archive, which makes them a good choice for storing data over longer periods of time. In contrast to HDDs, which offer random access, tape drives use sequential access: a disk drive can move to any position on the disk in a few milliseconds, but a tape drive must wind the tape to the chosen position. Tape drives therefore have slow seek times, but once the tape is positioned, reads and writes are very fast.
Backup media can be connected to the computer via the following interfaces:
- Fibre Channel
- ATA over Ethernet (AoE)
There’s another thing we must take into consideration when deciding on a backup solution. So far, we have decided that we want to back up the data to a server in the internal network. Imagine what would happen if the drive (SSD, HDD, or tape) were to suddenly fail. In that case, the backup is worthless, since we essentially don’t have a backup anymore.
If that happened, we would first have to detect that the drive is malfunctioning and replace it with another. There are a lot of monitoring tools that can alert us when a drive has failed, but many companies still don’t use them and are therefore not even aware of the disk failure. But even if we have proper monitoring in place and know the disk has failed the minute it happens, we still have to replace it with a brand new disk. Since the new disk is blank, we need to run all the backup jobs again to rebuild the backups, and that’s easier said than done.
Rather than relying on one disk to store backups, we should use RAID mirroring (RAID 1), where at least two drives hold the same information. If one disk fails, all we need to do is buy a new disk and insert it into the system, and the new copy of the data will be built automatically. This takes a couple of hours, but at the end of the process we should be left with the same information on both drives, without any loss. That’s an improved version of the backup system, and it must be in place for a proper backup solution.
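To make the mirroring idea concrete, here is a minimal toy model in Python. It is only a sketch under simplifying assumptions: the `MirroredVolume` class and its methods are hypothetical names invented for illustration, each "disk" is just an in-memory dictionary, and a real RAID controller or software RAID layer works at the block-device level with far more care.

```python
class MirroredVolume:
    """Toy model of a RAID 1 style mirror (illustrative only, not a RAID driver)."""

    def __init__(self, disks=2):
        # Each "disk" maps a block number to its bytes; None marks a failed disk.
        self.disks = [{} for _ in range(disks)]

    def write(self, block, data):
        # Every write goes to all healthy disks, so they hold identical copies.
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def read(self, block):
        # Any healthy disk can serve the read.
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("all disks in the mirror have failed")

    def fail(self, index):
        # Simulate a drive failure.
        self.disks[index] = None

    def replace(self, index):
        # Rebuild: copy every block from a surviving disk onto the blank one.
        survivor = next((d for d in self.disks if d is not None), None)
        if survivor is None:
            raise IOError("no surviving disk to rebuild from")
        self.disks[index] = dict(survivor)
```

After `fail(0)` the data can still be read from the second disk, and `replace(0)` brings both disks back to the same contents. Note that the rebuild only works while at least one disk survives, which is why mirroring alone does not protect against events that destroy the whole system.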
There’s another kind of disaster we need to prepare for. Imagine there’s a storm and lightning strikes our system, causing a power outage as well as the failure of multiple hardware components. All the hardware can easily be replaced by buying new parts, but the data stored on it is harder to restore. In the worst-case scenario, both mirrored drives have failed and we don’t have a backup anymore. To protect ourselves from such events, we must keep backup copies in at least two separate geographical locations, because the chance of lightning striking both of our systems at the same time is very slim. Data of the utmost importance should be spread across three separate locations, but normally two are enough.
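Keeping copies in several locations only helps if each copy is intact, so it is worth verifying them after copying. The sketch below assumes the remote sites are reachable as ordinary directories (for example, mounted network shares); the function names `sha256_of` and `replicate` and the directory layout are hypothetical choices made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(archive, destinations):
    """Copy `archive` into each destination directory and verify every copy.

    Returns the destinations whose copy matches the checksum of the source.
    """
    archive = Path(archive)
    expected = sha256_of(archive)
    verified = []
    for destination in destinations:
        target_dir = Path(destination)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / archive.name
        shutil.copy2(archive, target)
        # Re-read the copy so a truncated or corrupted transfer is noticed.
        if sha256_of(target) == expected:
            verified.append(str(target_dir))
    return verified
```

If the returned list is shorter than the list of destinations, at least one location does not hold a usable copy and the job should be reported as failed.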
We must also differentiate between storing data in a public cloud and storing it internally (in a private cloud). In the previous article, I discussed how there aren’t many companies or solutions out there that store backup data in the cloud securely. There are many companies that synchronize your data to the cloud, but are they doing it in a secure manner? Is all the data encrypted before being transmitted to the cloud servers, and is it kept encrypted in the cloud itself? The real question we should be asking is whether the cloud provider has access to our data, or whether we are the only ones who can decrypt it, by first downloading the encrypted version and decrypting it locally. If the latter is the case, then we can use a public cloud to store our sensitive information, but there aren’t many secure public backup solutions available. Rather than relying on a cloud service provider to securely synchronize our data to its data center, we can look at running our own solution. By storing the data locally in a private cloud, we can ensure the data is encrypted at all times and can be accessed only by trustworthy people.
There are a number of backup solutions available on the market, some commercial and others open-source. This article focuses on the open-source backup solutions, which can be compared to commercial ones on reliability, extensibility, support, and other factors. A list of a few open-source backup programs can be seen below. If you want to see a larger list of open-source as well as commercial backup products, you can find one on Wikipedia.
- ARCserve Backup
The picture below compares two open-source tools, Amanda and Bacula, with two proprietary tools, ARCserve Backup and Backup Exec. The comparison only covers the platforms each solution runs on and the language in which it was written. We also have to look at the features of each backup solution before choosing the one that would work best in our case.
If we would like a detailed comparison matrix, we can take a look at “Bacula vs Other Backup Solutions” (listed at the end of this article), where different backup solutions are compared. I’ve presented the relevant features below and left out the database support features, since MySQL, PostgreSQL, and Oracle are all supported by the chosen backup solutions. This was done merely for brevity, to include only the relevant options and the chosen backup solutions without the clutter. The features presented below are mostly supported by all the chosen backup solutions. The feature that stands out, and is supported only by CA ARCserve, is virus scanning.
Usually, the infrastructure is built from two kinds of servers:
- Virtual Machines: virtual servers are commonly used today, as virtualization has flourished in recent years. We’ll specifically take a look at the VMware ESX virtualization architecture. Virtual servers can run web servers, application servers, and database servers, which are independent of one another. Usually every virtual machine stores its data on network storage over a NAS/SAN connection. The virtualization architecture must also support HA (High Availability), where two ESX servers are used: if one ESX server fails, the virtual machines are migrated to the other ESX server to ensure every virtual machine is up and running at all times.
- Physical Servers: some of the servers in our environment are still physical servers, which are used to support virtualized servers. A physical server is usually used for backup data and data archiving.
Before using any backup solution, we should decide what data we would like to back up. There are three high-level concepts that we have to take into account when implementing a backup solution in our own private cloud:
- Physical System: an imaging program can be used to create complete PC backups, which include the OS and all the data. When creating an image, we run an imaging program directly from the running operating system and select a drive or partition that we would like to back up; usually there are also options that let us exclude certain files or folders from the image, for example large files that we don’t need. When we would like to restore the previous version of the operating system, we can simply choose the image we would like to load, and everything will be taken care of for us automatically.
- Virtual Machine: if we’re using virtual machines, we can back up whole virtual machines, which can easily be deployed later on another ESX server. With this approach, all the important data inside the virtual machine is copied to the backup system, but the operating system data is included in the image as well. This results in a quick virtual machine restoration, but takes up more space than backing up only the data itself. Whole-VM backups like this can be made with the VDDK (VMware Virtual Disk Development Kit) in ESX. Some backup agents are able to restore individual files from the whole backup image, which is sometimes needed when we only want certain files: this saves time, because otherwise we would have to restore the whole virtual machine, power it up, and copy out the needed files.
- Data Only: instead of creating an image of the whole system, we can back up just the data of individual applications that are important, like MySQL databases, Wiki files, pictures, configuration files, etc. When backing up just the data itself, the restoration process is more difficult, because we first have to install the operating system and the application whose data we’ve backed up, and then restore the data by importing it back into the application. The mysqldump utility is a great example of a tool that can be used to back up MySQL databases. It’s possible that the application doesn’t have a backup/restore feature, so we have to write our own script that does the backup on a predefined schedule and is also able to restore the data. Usually the backup software we’re using provides an agent, which listens for incoming connections from the backup manager that distributes the backup tasks to each agent. It’s a good idea for the backup data to be sent over a separate VLAN, which can easily be assigned to each and every virtual machine to secure the file transmission.
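For the data-only case where an application has no backup/restore feature of its own, a home-grown script could look roughly like the sketch below. It is an assumption-laden example rather than a recommended tool: the function names `backup_paths` and `restore_archive`, the archive naming scheme, and the choice of gzip-compressed tar archives are all made up for illustration. Database content would first be exported to a file (for instance with a dump utility such as mysqldump) and that file would then be included in the list of paths.

```python
import tarfile
import time
from pathlib import Path


def backup_paths(paths, backup_dir, label="data"):
    """Pack the given files or directories into a timestamped .tar.gz archive."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"{label}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for path in paths:
            path = Path(path)
            # Store each item under its own name instead of its absolute path.
            tar.add(str(path), arcname=path.name)
    return archive


def restore_archive(archive, target_dir):
    """Unpack an archive created by backup_paths into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:gz") as tar:
        # Only unpack archives that we created ourselves and therefore trust.
        tar.extractall(str(target_dir))
```

Running the script on a predefined schedule is left to the system scheduler (cron or similar). Restoring into a scratch directory and comparing the result with the original is a cheap way to check that the archive is actually usable.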
The list below presents the basic concepts we need to take into consideration when deciding on a backup solution to be used in an internal network.
- Backup VM: allows backing up a virtual machine snapshot that can be used to restore the VM at any time.
- Backup Data: allows installing an agent into the operating system, which can be used to back up certain files and directories of special importance. It doesn’t matter whether a physical server or a virtual machine is used; the backup agent can be installed in the operating system and used by the backup manager.
- Allow File Level Restore: a feature that allows only certain files from a snapshot to be restored, while not needing a restoration of a whole snapshot. This can be useful to save time when only certain files from a snapshot are needed.
- Iterative Backup Schedule: an iterative backup should be done every x hours/days/weeks at a specific time, normally at night. It’s important to realize that the time between backups depends upon the data we’re backing up. If we’re a small company, we might get by after losing the emails from the last 12 hours, while that’s unacceptable in large organizations. Therefore, there is no single answer to when the backup needs to run; the schedule needs to depend upon the client and the data.
- Full Backup Schedule: a full backup schedule is needed so that a complete copy of everything is kept in one place. A full backup is recommended once a month, normally over the weekend.
- Encryption: there are times when we would like to additionally encrypt the data during the backup process. In that case, we have to remember the password, since a restore won’t be possible without providing the correct password.
- Backup Configuration Files: sometimes the configuration files need to be backed up in order to be able to restore the server. This is particularly important with the ESX server itself, where, if restoration is needed, it’s simpler to reinstall ESX and restore only the configuration files.
- Status Messages: the backup solution should inform the customer about the status of backup jobs. Usually the messages are sent by email, using an internal SMTP server and an email address such as firstname.lastname@example.org or something similar.
- Hypervisor Backup: when using a virtualized environment, we’re usually running a hypervisor like ESX. Since the configuration of the ESX server doesn’t change frequently, a manual backup is viable, where the specific command/program is run after major upgrades. To restore the system, we can reinstall it and apply the backup configuration files.
- Restoration/Recovery: a procedure to restore/recover the data is just as important as its backup counterpart, and is usually done over a separate VLAN to separate the backup/restore traffic from the rest of the traffic. The backup solution should let us choose which data to restore, which version of it, and the destination of the restore. If the data is encrypted, the correct password must be supplied during the restoration, otherwise the restoration will not be possible. Data restoration differs depending on the backup method used: the files can be restored individually, or by first restoring the whole virtual machine and then copying the files over. When a CSP (cloud service provider) is used as the backup provider, it must have a DRP (Disaster Recovery Plan), which is used when something goes wrong. The DRP is necessary to ensure the customer can obtain the backup data; the process should be quick in order to minimize downtime.
- Trust: we should be able to trust our backup solution completely, in order to feel confident that data can be brought back when something goes haywire. This can be achieved only by regularly testing the restoration procedure, at least once or twice a year. Every so often we should actually restore a system to a previous state on a testing system; this verifies that data restoration works reliably and gives us confidence in our own backup solution.
- Data Retention Period: the amount of time the data will still be available in the backup even if it’s deleted from its original place. The retention period depends upon the data, the rate of data changes, the importance of the data, etc., and could be days, weeks, months, or even years.
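The scheduling and retention items above can be expressed as a small policy. The following sketch assumes one possible policy, which is an illustration and not a recommendation: a full backup on the first Saturday of each month, an iterative backup on every other day, and a fixed retention period counted in days. The function names `backup_type_for` and `expired_backups` are hypothetical.

```python
from datetime import date, timedelta


def backup_type_for(day):
    """Pick the kind of backup to run on `day`.

    Assumed policy: a full backup on the first Saturday of each month,
    an iterative backup on every other day.
    """
    is_saturday = day.weekday() == 5  # Monday is 0, so Saturday is 5
    if is_saturday and day.day <= 7:
        return "full"
    return "iterative"


def expired_backups(backups, today, retention_days):
    """Return the names of backups that are older than the retention period.

    `backups` maps a backup name to the date on which it was taken.
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(name for name, taken in backups.items() if taken < cutoff)
```

For example, `backup_type_for(date(2024, 6, 1))` falls on the first Saturday of June 2024 and selects a full backup. A real pruning job would also need to keep any full backup that newer iterative backups still depend on, which this sketch deliberately leaves out.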
In this article we’ve looked at different aspects of backup solutions that we must take into consideration when implementing our own backup solution. Remember that having a proper backup solution in place decides whether data restoration is possible in case of a hard drive failure or any other kind of disaster. Losing an important part of the data that’s vital to our organization can lead to our business going bankrupt, or at least suffering an immense financial loss.
Data backups are something that we must think about before we actually need them, since otherwise it will probably be too late to restore the needed data. We need a proper backup solution that we trust in place in order to concentrate on the business at hand. Remember that keeping a data backup is important, and if you’re shrugging your shoulders, just ask yourself what consequences your business would suffer if all your data were lost at this moment.
 Bacula vs Other Backup Solutions,