Controlled Server Demolition

Today I was tasked with one of those jobs that don't come up very often: shutting down a service for good. This particular retirement posed an interesting challenge, though. As usual, I wanted to make sure that all traces of customer data were wiped from the systems, along with any of our own keys, users, passwords and so on. The catch was that I only had ssh access to the systems. No console, no KVM or anything like that. These were some pretty basic VPS machines located very far away.

I looked around to see what approaches others have taken for securely wiping machines remotely under these conditions. The simple, brute-force approach seems to be to just dd the disks. This isn't too bad, but what if the system panics before the dd completes and you're left with a system which is both unbootable and still has customer data on it? That doesn't give a strong enough sense that the data is actually gone for my liking.

Another common theme is to boot the machine into a live image somehow and dd the disks from there. This approach is a bit better. You can at least be sure that dd completes. However, if you make a mistake setting the system up to boot into a live image, you could, again, end up with an unbootable system with the customer data still intact.

Yet another suggestion is to create a ramdisk, pivot_root into it and wipe from there. This way the running OS has a copy of the filesystem in memory and won't freak out when the filesystem on disk disappears. This sounded like an interesting strategy to me. Here is the approach I took.

The OS was an Ubuntu LTS release. There were a dozen or so hosts. Each box had more or less the same disk setup - a root volume plus a separate /data volume. For some hosts, the sensitive information (e.g., DB content) lived in the /data partition, but, not in all cases. Being a bit paranoid, I wanted to make sure that the root volume was thoroughly wiped as well.

someone@somehost:/data$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda2        18G  3.9G   14G  23% /
udev            2.0G  4.0K  2.0G   1% /dev
tmpfs           792M  216K  791M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm
/dev/vdb         79G  1.7G   74G   3% /data

So, first things first, the swap partition could contain some sensitive information. We need to wipe it out. I disabled it and used the shred tool to write over the volume. By default, shred will make three passes over the device or file you give it. I also gave it the -f flag (force permission change), the -v flag (verbose output), and the -z flag (do a pass writing all zeroes at the end).

someone@somewhere:/$ sudo fdisk -l | grep swap
/dev/vda5        38037504    41940991     1951744   82  Linux swap / Solaris
someone@somewhere:/$ sudo swapoff /dev/vda5
someone@somewhere:/$ free -m
             total       used       free     shared    buffers     cached
Mem:          3955       1498       2456          0        297       1033
-/+ buffers/cache:        167       3787
Swap:            0          0          0
someone@somewhere:/$ sudo shred -fvz /dev/vda5
shred: /dev/vda5: pass 1/4 (random)...
shred: /dev/vda5: pass 1/4 (random)...827MiB/1.9GiB 43%
shred: /dev/vda5: pass 1/4 (random)...1.5GiB/1.9GiB 81%
shred: /dev/vda5: pass 1/4 (random)...1.9GiB/1.9GiB 100%
shred: /dev/vda5: pass 2/4 (random)...
shred: /dev/vda5: pass 3/4 (random)...
shred: /dev/vda5: pass 4/4 (000000)...
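
As a hedged aside, another way to double-check that nothing is still swapping before shredding is to read /proc/swaps, which lists each active swap area on its own line after a header. The helper name active_swap_devices and its optional file argument are made up for illustration; this is a minimal sketch, not part of the original run.

```shell
#!/bin/sh
# active_swap_devices: print the name of every active swap area.
# Reads /proc/swaps by default; the optional argument (an assumption, for
# illustration) lets you point it at a saved copy of that file instead.
active_swap_devices() {
    # The first line of /proc/swaps is a column header, so skip it.
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Usage: empty output means no swap area is active.
#   active_swap_devices
```

If the function prints nothing after swapoff, no swap area is active and the partition can be overwritten.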

Next, let's deal with the /data volume. This can be unmounted and wiped directly with little risk. First, I edited /etc/fstab and commented out the /data mount on the off chance that the system freaks out and reboots. We don't want the OS to get stuck trying to mount a volume that has been half-wiped. Next we need to unmount the volume. But, before we can do that, we need to check to see what's using it.

someone@somehost:/$ sudo find /data -exec fuser {} \;
/data/mysql:         29890c
/data/mysql/ibdata1: 29890
/data/mysql/relay-log.info: 29890
/data/mysql/mysqld-relay-bin.003148: 29890
... snip ...
/data/mysql/ib_logfile1: 29890
/data/mysql/ib_logfile0: 29890
/data/mysql/mysqld-relay-bin.index: 29890
/data/squid3:        31128c
/data/squid3/swap.state: 31128

OK, in this particular case, mysql and squid are accessing this volume so they need to be shut down before we can proceed.

sudo service mysql stop
sudo service squid3 stop

With those services stopped, the volume can be unmounted (sudo umount /data). Let's get to shredding!

someone@somehost:/$ time sudo shred -vfz /dev/vdb
shred: /dev/vdb: pass 1/4 (random)...
shred: /dev/vdb: pass 1/4 (random)...491MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...507MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...527MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...564MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...601MiB/40GiB 1%
... snip ...
shred: /dev/vdb: pass 4/4 (000000)...37GiB/40GiB 92%
shred: /dev/vdb: pass 4/4 (000000)...38GiB/40GiB 96%
shred: /dev/vdb: pass 4/4 (000000)...40GiB/40GiB 100%

real    12m13.907s
user    1m42.318s
sys     4m34.889s

I prefixed the shred commands with 'time' so I could see how long the process took. Interestingly, the speed of these shreds varied quite a lot. In the above case, there were two machines in that class and their shred times were 12 minutes and 19 minutes respectively. However, there was another pair of hosts, with slightly larger /data volumes, whose times were radically different: 25 minutes vs. 331 minutes. Like I say, these were some simple VPS servers, so I guess the storage backend that the second volume was sitting on was pretty bottlenecked.
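
Since the final -z pass writes zeroes, one possible sanity check after a shred finishes is to confirm that the device now reads back as nothing but zero bytes. The sketch below is an assumption-laden illustration (the helper name is_zeroed is invented here), and it reads the whole target, so expect it to take a while on a large volume.

```shell
#!/bin/sh
# is_zeroed: succeed if the given file or block device holds only zero bytes.
# (Hypothetical helper name; reading a raw device such as /dev/vdb needs root.)
is_zeroed() {
    # tr strips every NUL byte; head stops at the first byte that survives,
    # so a fully zeroed target yields a count of 0.
    leftover=$(tr -d '\000' < "$1" | head -c 1 | wc -c | tr -d ' ')
    [ "$leftover" -eq 0 ]
}

# Usage:
#   is_zeroed /dev/vdb && echo "only zeroes found"
```

This only tells you what the device returns when read through the VPS storage layer. It says nothing about what the provider's backend keeps underneath.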

So, now we've got the easier, non-root volume wiped to a pretty satisfactory degree. How do we go about wiping the root volume? First of all, we need to disable any remaining non-critical services - even things like syslog, cron, atd and so on. We want to quiesce the system as much as possible. Then you want to systematically go through your filesystem and delete anything that you think is sensitive. For example, things like:

  • All of your logs from /var/log
  • All users' data from /home, even directories that look empty, but leave in place the directory for the user you're using to wipe the system. Watch out for dotfiles.
  • Crontabs: /etc/crontab, /var/cron/ or /var/spool/cron
  • Root's home directory contents: /root
  • Config files in /etc that might contain passwords or keys
  • Everything in /tmp

I basically tried to remove everything I could without killing the system yet. So, I've deleted everything I think might be sensitive from the root volume, but there is still a small risk that it could be recovered by a determined individual. I wanted to scrub the blocks where these files used to live. So, I used dd to create one big file which filled up all of the free space on the volume. Since it's consuming all free space, I can be sure that any blocks that used to belong to sensitive files now belong to this new file. Then I used shred on this file to write over it, thereby thoroughly overwriting the old sensitive data.

dd if=/dev/zero of=/junk

This filled up the disk in a couple of minutes. I now have a big file sitting on the blocks that used to be the home of my files. This alone may be enough for you as you've written over the blocks once. Just to be on the safe side, I ran shred on this file to make sure I had taken a few passes over the blocks.

sudo shred -vfz /junk

So, at this point, I should be in pretty good shape. The swap space has been scrubbed. The non-root partitions have been scrubbed. All of the sensitive data from the root partition should be gone. Buuuut, I'm still a little paranoid. So, I want to try one last thing. I want to really burn the hosts to the ground, just to be on the safe side. I had to try this process out on a vagrant box a few times to get all of the steps right. I created a ramdisk and set up a basic environment in it using debootstrap. You'll need more than 1G of RAM to do this as the image is a little over that. Then, I chroot'd into the ramdisk and shredded the disk from there. The idea is that we have a temporary filesystem in RAM which should allow the system to keep going while the filesystem on disk disappears.

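# Create a mount point and mount a RAM-backed tmpfs on it
mkdir /root/tmpfs/
mount -t tmpfs tmpfs /root/tmpfs/
# Install a minimal Ubuntu (trusty) userland into the ramdisk
debootstrap --variant=buildd --arch amd64 trusty /root/tmpfs/
# Expose the real device nodes (including /dev/vda) inside the ramdisk
mount --bind /dev /root/tmpfs/dev
# Switch into the ramdisk; the remaining commands run inside the chroot
chroot /root/tmpfs/
mount -t proc proc /proc
# Overwrite the whole disk: three random passes, then a final pass of zeroes (-z)
shred -vz /dev/vda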

Now you're left with a system with a (hopefully) totally wiped disk. None of the usual commands to halt the system were working at this point (halt/shutdown/poweroff), so I had to do a 'kill -SEGV 1' to force the system to core and finally completely retreat from the machine.
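
Another option that may be worth trying in the same situation (it was not tested in the run described here) is the kernel's magic SysRq interface. Writing the letter o to /proc/sysrq-trigger as root asks the kernel to power off directly, without needing any of the usual userland tools. The function name and the optional path argument below are illustrative assumptions; the argument exists only so the helper can be dry-run against an ordinary file.

```shell
#!/bin/sh
# force_poweroff: ask the kernel to power off immediately via magic SysRq.
# Writing 'o' to /proc/sysrq-trigger powers the machine off; 'b' would
# reboot it instead. Neither shuts services down cleanly first.
force_poweroff() {
    # Uses only the shell's built-in echo, so no binaries on disk are needed.
    echo o > "${1:-/proc/sysrq-trigger}"
}

# Dry run against a scratch file (nothing is powered off):
#   force_poweroff /tmp/sysrq-test && cat /tmp/sysrq-test
# Real use, as root, with /proc mounted:
#   force_poweroff
```

Whether a VPS host honours the power-off request depends on the hypervisor, so treat this as something to rehearse on a throwaway box first.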

So, did this work? Mostly. There was only one case where the ssh connection dropped early. It was about 50% through the first shred pass. I feel reasonably happy that I didn't leave anything important behind.

NOTE: When using shred to wipe files, make sure to pay attention to this warning from the manpage.

CAUTION: Note that shred relies on a very important assumption: that the file system overwrites data in place. This is the traditional way to do things, but many modern file system designs do not satisfy this assumption. The following are examples of file systems on which shred is not effective, or is not guaranteed to be effective in all file system modes:

* log-structured or journaled file systems, such as those supplied with AIX and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)

* file systems that write redundant data and carry on even if some writes fail, such as RAID-based file systems

* file systems that make snapshots, such as Network Appliance's NFS server

* file systems that cache in temporary locations, such as NFS version 3 clients

* compressed file systems

In the case of ext3 file systems, the above disclaimer applies (and shred is thus of limited effectiveness) only in data=journal mode, which journals file data in addition to just metadata. In both the data=ordered (default) and data=writeback modes, shred works as usual. Ext3 journaling modes can be changed by adding the data=something option to the mount options for a particular file system in the /etc/fstab file, as documented in the mount man page (man mount).

3rd Party Services And SLAs

First, a disclaimer. I'm a bit of a control freak. Given the choice, I'd usually lean toward running a service myself rather than offloading it to a 3rd party provider. This way, I have full visibility into how it works. I can be sure of how well (or poorly) it's set up. I know exactly what's going on with it at all times and, when things inevitably go wrong, I can dive in and debug it myself. Sitting, helpless, at your desk when a service you depend on is down is an odd and frustrating feeling. Doing it yourself, of course, isn't always the best choice. There are only so many pieces of infrastructure that one person can and should maintain.

GMail is an excellent example of this. Running a bunch of mail servers isn't a very interesting proposition. Frankly, I think running mail servers is really, really boring. If you're into that, more power to you, but it's very low on my list of fun activities. If you can offload that onto a 3rd party provider, then, why not?

Well, the decision of whether to run something yourself or have a 3rd party run it for you usually boils down to cost and complexity. What's the cost to build it yourself (hardware && colo || cloud servers, licences, sysadmin time, etc.) and what is the supplier charging? The fact that a 3rd party can usually do it for less money, because they're doing it at a larger scale, can be pretty attractive. However, there's another selection criterion that occurred to me recently that I'm sure many people overlook. It relates to the SLAs that your suppliers provide to you and the SLA that you provide to your customers.

Say you have a web application that sends emails (think mailgun), uses a workqueue to process jobs asynchronously (think elastic beanstalk) and serves its static assets from a CDN (think cloudfront). These are all very common components that you could find in many, many applications. You depend on each of these services to do business and each one comes with its own SLA. If one of these services goes down, then it's going to impact your users too. Now, suppose that each of these services promises three nines of availability, i.e., 99.9%. The problem is, these services are very unlikely to all fail at the same time, so in the worst case their outages don't overlap and each one costs you uptime separately. So, in order to figure out the SLA that you can promise to your customers, you need to sum the potential downtimes of all of the components upon which you depend.

0.1% (mailgun) + 0.1% (beanstalk) + 0.1% (cloudfront) = 0.3%

In this case, the best uptime you can honestly promise to your customers is 99.7%, and, that's not even factoring in the uptime that you can realistically provide from the parts of the application that you're running yourself.

What's more, as you add more 3rd party services, the situation gets worse and worse, because, for each supplier, you need to subtract their potential downtime from your promised uptime. With the same SLAs, if you use 4 external services you can only promise 99.6% uptime, with 5 external services 99.5% and so on and so on.
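
The arithmetic above can be sketched in a few lines of shell. The function names here are invented for illustration. summed_uptime reproduces the worst-case sum used in this post, while independent_uptime shows the slightly more forgiving figure you get if you assume that supplier outages are statistically independent, an assumption that may not hold in practice.

```shell
#!/bin/sh
# summed_uptime: the worst-case figure used above; every supplier's allowed
# downtime is subtracted from 100%. Arguments are SLA percentages.
summed_uptime() {
    printf '%s\n' "$@" | awk '{ down += 100 - $1 } END { printf "%.1f\n", 100 - down }'
}

# independent_uptime: the product of the availabilities, which applies only
# if outages are assumed to be statistically independent.
independent_uptime() {
    printf '%s\n' "$@" | awk 'BEGIN { up = 1 } { up *= $1 / 100 } END { printf "%.4f\n", up * 100 }'
}

summed_uptime 99.9 99.9 99.9            # prints 99.7
summed_uptime 99.9 99.9 99.9 99.9 99.9  # prints 99.5
independent_uptime 99.9 99.9 99.9       # prints 99.7003
```

At three nines the two figures barely differ, so the simple sum is a reasonable rule of thumb.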

So, ask yourself, do you know the SLAs of all of the suppliers upon which you depend? Are you promising an uptime that you may not be able to meet? Maybe you can't offload service X to a 3rd party because to do so would drop your achievable uptime to a level lower than what you've already contractually promised your customers. Also, all of this only considers unscheduled downtime. Do any of your suppliers allow for scheduled downtime in their SLAs? What sort of promises around scheduled downtime have you made to your customers?

Ultimately, this comes down to what you might call the supplier as a single point of failure. If uptime is important to you, then you need to have redundancy at the supplier level. Your application should be able to send mails to any service: mailgun, SES, postfix, whatever. Not only that, but your application should know when one of these services is down and route around it. Suddenly, offloading that bit of infrastructure doesn't seem as cheap and simple.
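
To make the "route around it" idea concrete, here is a minimal sketch of supplier-level failover. Everything in it is hypothetical: the provider functions are stand-ins that just return success or failure, where a real version would call each supplier's API or a local MTA and would need timeouts and health checks.

```shell
#!/bin/sh
# Stand-in providers (hypothetical): each takes a message and returns 0 on
# success. Real ones would call the supplier's API or a local MTA.
send_via_mailgun() { return 1; }   # simulates an outage
send_via_ses()     { return 0; }   # simulates a healthy supplier

# send_with_failover MESSAGE PROVIDER...
# Try each provider in order and stop at the first one that succeeds.
send_with_failover() {
    message=$1
    shift
    for provider in "$@"; do
        if "$provider" "$message"; then
            echo "sent via $provider"
            return 0
        fi
        echo "$provider failed, trying the next supplier" >&2
    done
    echo "all suppliers failed" >&2
    return 1
}

send_with_failover "hello" send_via_mailgun send_via_ses   # prints: sent via send_via_ses
```

Even this toy version hints at the extra work involved: every supplier needs its own adapter, and someone has to decide what counts as "down".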