Controlled Server Demolition

Today I was tasked with one of those jobs which don’t come up very often - shutting down a service for good. This particular retirement posed an interesting challenge though. As per usual, I wanted to make sure that all traces of customer data were wiped from the systems, and, indeed, any of our own keys/users/passwords and so on. The challenge was that I only had ssh access to the systems. No console, no KVM or anything like that. These were some pretty basic VPS machines which were located very far away.

I looked around to see what approaches others have taken for securely wiping remote machines under these conditions. The simple, brute-force approach seems to be to just dd the disks. This isn’t too bad, but what if the system panics before the dd completes and you’re left with a system which is both unbootable and still has customer data on it? That doesn’t give a strong enough sense that the data is actually gone for my liking. Another common theme is to boot the machine into a live image somehow and dd the disks. This approach is a bit better. You can at least be sure that dd completes. However, if you make a mistake setting the system up to boot into a live image, you could, again, end up with an unbootable system and the customer data remains intact.

Yet another suggestion is to create a ramdisk, pivot_root into it and wipe from there. This way the running OS has a copy of the filesystem in memory and won’t freak out when the filesystem disappears. This sounded like an interesting strategy to me. Here is the approach I took.

The OS was an Ubuntu LTS release. There were a dozen or so hosts. Each box had more or less the same disk setup - a root volume plus a separate /data volume. For some hosts, the sensitive information (e.g., DB content) lived in the /data partition, but, not in all cases. Being a bit paranoid, I wanted to make sure that the root volume was thoroughly wiped as well.

  someone@somehost:/data$ df -h
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/vda2        18G  3.9G   14G  23% /
  udev            2.0G  4.0K  2.0G   1% /dev
  tmpfs           792M  216K  791M   1% /run
  none            5.0M     0  5.0M   0% /run/lock
  none            2.0G     0  2.0G   0% /run/shm
  /dev/vdb         79G  1.7G   74G   3% /data

So, first things first, the swap partition could contain some sensitive information, so we need to wipe it out. I disabled it and used the shred tool to write over the volume. By default, shred will make three passes over the device or file you give it. I also gave it the -f flag (force permission change), the -v flag (verbose output), and the -z flag (do a final pass writing all zeroes), which is why the output below shows four passes.

  someone@somewhere:/$ sudo fdisk -l | grep swap
  /dev/vda5        38037504    41940991     1951744   82  Linux swap / Solaris
  someone@somewhere:/$ sudo swapoff /dev/vda5
  someone@somewhere:/$ free -m
               total       used       free     shared    buffers     cached
  Mem:          3955       1498       2456          0        297       1033
  -/+ buffers/cache:        167       3787
  Swap:            0          0          0
  someone@somewhere:/$ sudo shred -fvz /dev/vda5
  shred: /dev/vda5: pass 1/4 (random)...
  shred: /dev/vda5: pass 1/4 (random)...827MiB/1.9GiB 43%
  shred: /dev/vda5: pass 1/4 (random)...1.5GiB/1.9GiB 81%
  shred: /dev/vda5: pass 1/4 (random)...1.9GiB/1.9GiB 100%
  shred: /dev/vda5: pass 2/4 (random)...
  shred: /dev/vda5: pass 3/4 (random)...
  shred: /dev/vda5: pass 4/4 (000000)...
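
With a dozen hosts, it can help to generate the swap commands rather than type them per box. The following is a minimal sketch, assuming the standard /proc/swaps layout (one header line, device path in the first column). The function names are my own, and the script only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# List active swap devices from a file in /proc/swaps format.
# $1 is the file to read (defaults to /proc/swaps).
list_swap_devices() {
    # Skip the header line; the first column is the device or file path.
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Print, rather than run, the swapoff and shred commands for each device.
print_swap_wipe_commands() {
    list_swap_devices "${1:-}" | while read -r dev; do
        echo "swapoff $dev"
        echo "shred -fvz $dev"
    done
}
```

Calling print_swap_wipe_commands with no argument reads the live /proc/swaps. Note that a swap file (as opposed to a swap partition) would also show up here, and shredding a file is subject to the filesystem caveats in the shred manpage.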

Next, let’s deal with the /data volume - this can be unmounted and wiped directly with little risk. First, I edited /etc/fstab and commented out the /data mount on the off chance that the system freaks out and reboots. We don’t want the OS to get stuck trying to mount a volume that has been half-wiped. Next we need to unmount the volume. But, before we can do that, we need to check to see what’s using it.

  someone@somehost:/$ sudo find /data -exec fuser {} \;
  /data/mysql:         29890c
  /data/mysql/ibdata1: 29890
  /data/mysql/relay-log.info: 29890
  /data/mysql/mysqld-relay-bin.003148: 29890
  ... snip ...
  /data/mysql/ib_logfile1: 29890
  /data/mysql/ib_logfile0: 29890
  /data/mysql/mysqld-relay-bin.index: 29890
  /data/squid3:        31128c
  /data/squid3/swap.state: 31128

OK, in this particular case, mysql and squid are accessing this volume so they need to be shut down before we can proceed.

  sudo service mysql stop
  sudo service squid3 stop
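
The fstab edit mentioned earlier was done by hand, but it could be scripted across hosts. Here is a rough sketch, assuming a conventional whitespace-separated fstab where the second field is the mount point. The function name and the .bak backup file are my own choices, not part of the original procedure.

```shell
#!/bin/sh
# Comment out the fstab entry for a mount point, keeping a backup copy.
# $1 = mount point, $2 = fstab path (defaults to /etc/fstab).
disable_fstab_entry() {
    mount_point=$1
    fstab=${2:-/etc/fstab}
    cp "$fstab" "$fstab.bak"
    # Prefix only uncommented lines whose second field is the mount point.
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp { print "#" $0; next }
        { print }
    ' "$fstab.bak" > "$fstab"
}
```

Run as root, something like `disable_fstab_entry /data` followed by `umount /data` should leave the volume detached. `mountpoint -q /data` returns non-zero once the unmount has really happened.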

Let’s get to shredding!

  someone@somehost:/$ time sudo shred -vfz /dev/vdb
  shred: /dev/vdb: pass 1/4 (random)...
  shred: /dev/vdb: pass 1/4 (random)...491MiB/40GiB 1%
  shred: /dev/vdb: pass 1/4 (random)...507MiB/40GiB 1%
  shred: /dev/vdb: pass 1/4 (random)...527MiB/40GiB 1%
  shred: /dev/vdb: pass 1/4 (random)...564MiB/40GiB 1%
  shred: /dev/vdb: pass 1/4 (random)...601MiB/40GiB 1%
  ... snip ...
  shred: /dev/vdb: pass 4/4 (000000)...37GiB/40GiB 92%
  shred: /dev/vdb: pass 4/4 (000000)...38GiB/40GiB 96%
  shred: /dev/vdb: pass 4/4 (000000)...40GiB/40GiB 100%

  real    12m13.907s
  user    1m42.318s
  sys     4m34.889s

I prefixed the shred commands with ‘time’ so I could see how long the process took. Interestingly, the speed of these shreds varied quite a lot. In the above case, there were two machines in that class and their shred times were 12 minutes and 19 minutes respectively. However, there was another pair of hosts, with slightly larger /data volumes, whose times were radically different - 25 minutes vs. 331 minutes. Like I say, these were some simple VPS servers, so I guess the storage backend that the second volume was sitting on was pretty bottlenecked.
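
Because the -z flag leaves a final pass of zeroes, one way to sanity-check a finished shred is to read the device back and confirm that nothing but zeroes comes out. This is a sketch of that idea rather than something from the original run. It assumes GNU cmp (for the -n option) and blockdev from util-linux, and the function name is made up.

```shell
#!/bin/sh
# Succeeds if every byte of a file or block device is zero.
is_all_zero() {
    target=$1
    if [ -b "$target" ]; then
        # Block devices report their size through blockdev (needs root).
        size=$(blockdev --getsize64 "$target")
    else
        size=$(wc -c < "$target" | tr -d ' ')
    fi
    # Compare the first $size bytes against the endless zeroes of /dev/zero.
    cmp -s -n "$size" "$target" /dev/zero
}
```

For example, `sudo sh -c '. ./check.sh; is_all_zero /dev/vdb && echo clean'`, where check.sh is a hypothetical file holding the function. On a VPS this only tells you what the guest sees; it says nothing about what the storage backend keeps underneath.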

So, now we’ve got the easier, non-root volume wiped to a pretty satisfactory degree. How do we go about wiping the root volume? First of all, we need to disable any remaining non-critical services - even things like syslog, cron, atd and so on. We want to quiesce the system as much as possible. Then you want to systematically go through your filesystem and delete anything that you think is sensitive. For example, things like:

I basically tried to remove everything I could without killing the system yet. So, I’ve deleted everything I think might be sensitive from the root volume, but there is still a small risk the files could be recovered by a determined individual. I wanted to scrub the blocks where these files used to live. So, I used dd to create one big file which filled up all of the free space on the volume. Since it’s consuming all free space, I can be sure that any blocks that used to belong to sensitive files now belong to this new file. Then I used shred on this file to write over it, thereby thoroughly overwriting the old sensitive data.

  dd if=/dev/zero of=/junk

This filled up the disk in a couple of minutes. I now have a big file sitting on the blocks that used to be the home of my files. This alone may be enough for you as you’ve written over the blocks once. Just to be on the safe side, I ran shred on this file to make sure I had taken a few passes over the blocks.

  sudo shred -vfz /junk
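
The fill-then-shred step can be wrapped up so it is easy to rehearse on a test box first. The sketch below is my own packaging of the two commands above. The bs=1M block size, the sync, and the optional size cap for trial runs are all additions that were not in the original commands.

```shell
#!/bin/sh
# Fill free space under a directory with zeroes, then shred the filler file.
# $1 = directory on the volume to scrub
# $2 = optional cap in MiB (for trial runs); omit to fill the whole volume
scrub_free_space() {
    junk="$1/junk"
    if [ -n "${2:-}" ]; then
        dd if=/dev/zero of="$junk" bs=1M count="$2"
    else
        # dd stops with "No space left on device", which is the goal here.
        dd if=/dev/zero of="$junk" bs=1M || true
    fi
    sync    # flush the zeroes to disk before the shred passes
    shred -vfz "$junk"
}
```

A trial such as `scrub_free_space /tmp/scratch 16` writes and shreds only 16 MiB. Running it as root against `/` with no cap mirrors the real thing and will leave the volume full until /junk is removed.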

So, at this point, I should be in pretty good shape. The swap space has been scrubbed. The non-root partitions have been scrubbed. All of the sensitive data from the root partition should be gone. Buuuut, I’m still a little paranoid. So, I want to try one last thing. I want to really burn the hosts to the ground - just to be on the safe side. I had to try this process out on a Vagrant box a few times to get all of the steps right. I created a ramdisk and set up a basic environment in it using debootstrap. You’ll need more than 1G of RAM to do this as the image is a little over that. Then, I chroot’d into the ramdisk and shredded the disk from there. The idea is that we have a temporary filesystem in RAM which should allow the system to keep going while the filesystem on disk disappears.

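  # Build a minimal Ubuntu userland in a RAM-backed filesystem
  mkdir /root/tmpfs/
  mount -t tmpfs tmpfs /root/tmpfs/
  debootstrap --variant=buildd --arch amd64 trusty /root/tmpfs/
  # Expose the real device nodes, then switch into the ramdisk
  mount --bind /dev /root/tmpfs/dev
  chroot /root/tmpfs/
  # From here on we are inside the chroot
  mount -t proc proc /proc
  shred -vz /dev/vda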
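
Since the debootstrap image has to fit in memory, a quick pre-flight check before building the ramdisk can save a failed attempt. Bear in mind that a tmpfs mounted without a size= option is capped at half of physical RAM. The sketch below just compares MemTotal against a threshold. The function name is hypothetical, and the 2048 MiB figure in the usage note is my assumption, not a measured requirement.

```shell
#!/bin/sh
# Succeeds if total RAM meets a threshold.
# $1 = required MiB, $2 = meminfo file (defaults to /proc/meminfo)
has_enough_ram() {
    required_mib=$1
    # MemTotal is reported in kB in /proc/meminfo.
    total_kib=$(awk '/^MemTotal:/ { print $2 }' "${2:-/proc/meminfo}")
    [ $((total_kib / 1024)) -ge "$required_mib" ]
}
```

Something like `has_enough_ram 2048 || echo "not enough RAM for the ramdisk"` would flag the smaller hosts before any setup work starts.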

Now you’re left with a system with a (hopefully) totally wiped disk. None of the usual commands to halt the system were working at this point (halt/shutdown/poweroff), so I had to do a ‘kill -SEGV 1’ to force the system to core and finally completely retreat from the machine.
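
An alternative I did not try on these hosts is the kernel’s magic SysRq trigger, which does not depend on any binaries from the wiped disk. This is only a sketch under the assumption that /proc is mounted in the chroot (as above) and that the kernel was built with SysRq support. Whether the VPS hypervisor actually honours the power-off is another open question. The function name is made up.

```shell
#!/bin/sh
# Power off immediately via the kernel's magic SysRq interface.
# Needs root and a mounted /proc; nothing is synced or unmounted first.
force_poweroff() {
    echo o > /proc/sysrq-trigger    # 'o' asks the kernel to power off
}
```

Writing `b` instead of `o` requests an immediate reboot, which on a wiped disk should simply fail to boot.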

So, did this work? Mostly. There was only one case where the ssh connection dropped early. It was about 50% through the first shred pass. I feel reasonably happy that I didn’t leave anything important behind.
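
To reduce the impact of a dropped ssh session, the final shred could be started detached from the terminal so that a hangup does not take it down. I can’t say whether this would have changed the outcome on the host above, so treat it as a sketch. It relies only on nohup from coreutils, which should be present in the debootstrap environment, and the function and variable names are mine.

```shell
#!/bin/sh
# Start a command immune to hangups, with its output captured in a log file.
# $1 = log file; remaining arguments = the command to run.
# The background PID is left in DETACHED_PID.
run_detached() {
    log=$1
    shift
    nohup "$@" > "$log" 2>&1 < /dev/null &
    DETACHED_PID=$!
}
```

Inside the chroot this might look like `run_detached /shred.log shred -vz /dev/vda`, followed by `tail -f /shred.log` to watch progress. The log lives in the ramdisk, so it survives the disk being wiped.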

NOTE: When using shred to wipe files, make sure to pay attention to this warning from the manpage.

  CAUTION: Note that shred relies on a very important assumption: that the
  file system overwrites data in place. This is the traditional way to do
  things, but many modern file system designs do not satisfy this assumption.
  The following are examples of file systems on which shred is not effective,
  or is not guaranteed to be effective in all file system modes:

  * log-structured or journaled file systems, such as those supplied with AIX
    and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)

  * file systems that write redundant data and carry on even if some writes
    fail, such as RAID-based file systems

  * file systems that make snapshots, such as Network Appliance's NFS server

  * file systems that cache in temporary locations, such as NFS version 3
    clients

  * compressed file systems

  In the case of ext3 file systems, the above disclaimer applies (and shred is
  thus of limited effectiveness) only in data=journal mode, which journals file
  data in addition to just metadata. In both the data=ordered (default) and
  data=writeback modes, shred works as usual. Ext3 journaling modes can be
  changed by adding the data=something option to the mount options for a
  particular file system in the /etc/fstab file, as documented in the mount man
  page (man mount).
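
This caveat matters for the /junk file, which was shredded through the filesystem, and not for whole devices like /dev/vdb, which shred overwrites directly. To see which case applies, the filesystem type and mount options can be pulled from /proc/mounts. A minimal sketch, with a made-up function name, assuming the usual layout of device, mount point, type and options:

```shell
#!/bin/sh
# Print the filesystem type and mount options for a mount point.
# $1 = mount point, $2 = mounts file (defaults to /proc/mounts)
fs_info() {
    awk -v mp="$1" '$2 == mp { print $3, $4 }' "${2:-/proc/mounts}"
}
```

Running `fs_info /` may print more than one line if the root mount appears twice in /proc/mounts. A data= entry in the options is the thing to look for, though it is not always listed when the default mode is in use.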