Our clinical trials section (CTS) needs off-site backup. The traditional way FDA Part 11 compliant studies seem to do this is to write encrypted backups to tape and ship them off with Iron Mountain. I cringed at that idea. It’s cumbersome and much less useful than sending a stream of ZFS snapshots to another site.
After burning way too many hours trying to get OpenIndiana running on Amazon EC2, I started looking for alternatives. It turns out the ZFS on Linux project is almost in lock step with the Illumos kernel in OpenIndiana.
I started with a t1.micro instance of Amazon Linux. This is for backup purposes, so I’m not worried about performance here. Should there arise a need to do a large restoration from this server, or some actual work on it, I could easily shut it down and start it back up as a larger instance.
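That resize can itself be scripted. Here’s a rough sketch using the modern AWS CLI; the instance ID and target type are placeholders, and the steps are printed for review rather than executed, since you’ll want to sanity-check them against your own setup:

```shell
#!/bin/sh
# Emit the AWS CLI steps to resize a stopped instance.
# The instance ID and instance type passed in are placeholders.
resize_instance_plan() {
  instance_id="$1"
  new_type="$2"
  echo "aws ec2 stop-instances --instance-ids $instance_id"
  echo "aws ec2 wait instance-stopped --instance-ids $instance_id"
  echo "aws ec2 modify-instance-attribute --instance-id $instance_id --instance-type Value=$new_type"
  echo "aws ec2 start-instances --instance-ids $instance_id"
}

# Print the plan for a hypothetical instance.
resize_instance_plan i-0123456789abcdef0 m1.large
```

The instance type can only be changed while the instance is stopped, which is why the stop/wait steps come first.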
Amazon claims redundancy for its EBS storage, but provides no specifics beyond an annual failure rate of between 0.1% and 0.5%. I decided to use ZFS to improve on that reliability by building RAIDZ1 VDEVs of 8 EBS volumes each, then creating my ZPOOL out of 5 of these VDEVs. Since I don’t want to pay for a bunch of unused storage, my EBS volumes all started at 1GB each, so the pool stripes across 40 EBS volumes at 1GB apiece. I’ll cover later how I grow this storage and keep my ZPOOL balanced.
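To give a feel for the layout, here’s a sketch that assembles the vdev list for that 5 × 8 arrangement and prints the resulting zpool create command for review. The /dev/ebsN device names and the pool name “backup” are placeholders; use whatever device names your volumes were actually attached under (e.g. /dev/xvdf and friends on Amazon Linux):

```shell
#!/bin/sh
# Build the argument list for 5 raidz1 vdevs of 8 EBS volumes each.
# Device names here are placeholders, not real attachment points.
build_vdev_args() {
  args=""
  disk=0
  vdev=0
  while [ "$vdev" -lt 5 ]; do
    args="$args raidz1"
    d=0
    while [ "$d" -lt 8 ]; do
      args="$args /dev/ebs$disk"
      disk=$((disk + 1))
      d=$((d + 1))
    done
    vdev=$((vdev + 1))
  done
  echo "$args"
}

# Print the full command for review before running it on the instance.
echo "zpool create backup$(build_vdev_args)"
```

With one volume per vdev going to parity, usable capacity is 5 × 7 = 35 of the 40 GB, and each vdev can lose one volume without data loss.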
In the process of setting this up I’ve created a Bitbucket repository of the scripts I use to set up and grow this ZFS pool in the cloud: https://bitbucket.org/ctsadmin/aws_zfs_tools
Pool creation and maintenance are all handled externally to the EC2 instance; the scripts reside on the primary ZFS server.
To perform a backup, the primary ZFS server kicks off scripts that start the instance, supply the encryption key needed to mount the ZFS pool, push a zfs send, and shut the instance down. Since the encryption key exists only in memory on the EC2 instance, once it is shut down the data on the EBS storage is encrypted and safe.
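The sequence the driver script runs might look roughly like this. The host alias, snapshot names, and the unlock-and-import step are all placeholders (the scripts in the aws_zfs_tools repository are the authoritative version), so this sketch prints each step for review rather than executing it:

```shell
#!/bin/sh
# Sketch of the backup driver run on the primary ZFS server.
# Host, instance ID, pool/snapshot names, and the key-delivery
# mechanism are placeholders standing in for the real scripts.
backup_plan() {
  host="$1"; prev="$2"; now="$3"
  echo "aws ec2 start-instances --instance-ids \$INSTANCE_ID"
  echo "ssh $host unlock-and-import-pool   # key sent over ssh, held only in memory"
  echo "zfs send -i $prev $now | ssh $host zfs receive -F backup/cts"
  echo "ssh $host zpool export backup"
  echo "aws ec2 stop-instances --instance-ids \$INSTANCE_ID"
}

# Hypothetical incremental send between two daily snapshots.
backup_plan ec2-backup cts@day1 cts@day2
```

The incremental zfs send keeps the daily transfer down to just the blocks that changed since the previous snapshot, which matters when it’s all going over the WAN.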
Since this is for an FDA Part 11 project, I am taking the security of this data even further. The primary ZFS server will hold two pools of data: the production pool and a file-level encrypted copy, which it maintains with additional scripts. It is the encrypted copy that gets backed up to the cloud, so retrieving data from the cloud backup requires two encryption keys.
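One way to maintain a file-level encrypted copy like that is to walk the production pool and write an encrypted twin of each file. The scripts I actually use are the authority here; the GnuPG recipient, paths, and helper name below are assumptions for illustration, and the commands are printed rather than run:

```shell
#!/bin/sh
# Sketch: emit a gpg command for every file in the production pool,
# writing its encrypted twin into the copy pool. SRC, DST, and the
# recipient key are placeholders; directories under DST would need
# to be created first (omitted here).
encrypt_copy_plan() {
  src="$1"; dst="$2"; recipient="$3"
  find "$src" -type f | while read -r f; do
    rel="${f#$src/}"
    echo "gpg --batch --yes -r $recipient -o $dst/$rel.gpg -e $f"
  done
}
```

A real version would also skip files whose encrypted twin is already up to date, so the nightly pass only touches what changed.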
After some kernel panics I’ve switched to an m1.small instance instead of the t1.micro. I’ve also added some swap space for the few times, while growing the pool, that I’ve seen memory use cause kernel panics on the small instance. Since the instance is on for less than an hour a day to receive backups, the 8-cent instance-hour charge per day is reasonable.
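For reference, adding swap is just a few commands. The 1 GB size here is my assumption, not necessarily what the instance needs; the sketch prints the steps, which you’d run as root on the instance:

```shell
#!/bin/sh
# Emit the commands to add a swap file of the given size in MB.
# The size is an assumption; adjust to the instance's workload.
swap_plan() {
  mb="$1"
  echo "dd if=/dev/zero of=/swapfile bs=1M count=$mb"
  echo "chmod 600 /swapfile"
  echo "mkswap /swapfile"
  echo "swapon /swapfile"
}

swap_plan 1024
```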
I may later play with deploying it as a spot instance, or with tuning ZFS further for the small-memory environment of a micro instance. But it comes down to how much work I want to put into pinching those last few pennies.