Controller Server Demolition

2016-04-28

Today I was tasked with one of those jobs which don’t come up very often - shutting down a service for good. This particular retirement posed an interesting challenge though. As per usual, I wanted to make sure that all traces of customer data were wiped from the systems, along with any of our own keys/users/passwords and so on. The challenge was that I only had ssh access to the systems. No console, no KVM or anything like that. These were some pretty basic VPS machines which were located very far away.

I looked around to see what approaches others have taken for securely wiping machines remotely under these conditions. The simple, brute-force approach seems to be to just dd the disks. This isn’t too bad, but what if the system panics before the dd completes and you’re left with a system which is both unbootable and still has customer data on it? That doesn’t give a strong enough sense that the data is actually gone for my liking.

Another common theme is to boot the machine into a live image somehow and dd the disks from there. This approach is a bit better. You can at least be sure that dd completes. However, if you make a mistake setting the system up to boot into a live image, you could, again, end up with an unbootable system and the customer data still intact.

Yet another suggestion is to create a ramdisk, pivot_root into it and wipe from there. This way the running OS has a copy of the filesystem in memory and won’t freak out when the filesystem on disk disappears. This sounded like an interesting strategy to me. Here is the approach I took.

The OS was an Ubuntu LTS release. There were a dozen or so hosts. Each box had more or less the same disk setup - a root volume plus a separate /data volume. For some hosts, the sensitive information (e.g., DB content) lived in the /data partition, but, not in all cases. Being a bit paranoid, I wanted to make sure that the root volume was thoroughly wiped as well.

someone@somehost:/data$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda2        18G  3.9G   14G  23% /
udev            2.0G  4.0K  2.0G   1% /dev
tmpfs           792M  216K  791M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm
/dev/vdb         79G  1.7G   74G   3% /data

So, first things first, the swap partition could contain some sensitive information. We need to wipe it out. I disabled it and used the shred tool to write over the volume. By default, shred will make three passes over the device or file you give it. I also gave it the -f flag (force permission change), the -v flag (verbose output), and the -z flag (do a pass writing all zeroes at the end).

someone@somewhere:/$ sudo fdisk -l | grep swap
/dev/vda5        38037504    41940991     1951744   82  Linux swap / Solaris
someone@somewhere:/$ sudo swapoff /dev/vda5
someone@somewhere:/$ free -m
             total       used       free     shared    buffers     cached
Mem:          3955       1498       2456          0        297       1033
-/+ buffers/cache:        167       3787
Swap:            0          0          0
someone@somewhere:/$ sudo shred -fvz /dev/vda5
shred: /dev/vda5: pass 1/4 (random)...
shred: /dev/vda5: pass 1/4 (random)...827MiB/1.9GiB 43%
shred: /dev/vda5: pass 1/4 (random)...1.5GiB/1.9GiB 81%
shred: /dev/vda5: pass 1/4 (random)...1.9GiB/1.9GiB 100%
shred: /dev/vda5: pass 2/4 (random)...
shred: /dev/vda5: pass 3/4 (random)...
shred: /dev/vda5: pass 4/4 (000000)...
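
If a host has more than one swap area, the same disable-then-shred steps can be looped over all of them. What follows is only a sketch: list_swap_devices is a helper name I made up for illustration, and it assumes the kernel’s /proc/swaps table, whose first line is a header.

```shell
# Hypothetical helper: print the device column of a /proc/swaps-style table.
# The first line of that table is a header, so it is skipped.
list_swap_devices() {
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Intended use (destructive, so left commented out here):
# for dev in $(list_swap_devices); do
#     sudo swapoff "$dev" && sudo shred -fvz "$dev"
# done
```

The list is read once, before the loop starts, so turning swap off inside the loop doesn’t change what gets iterated over.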

Next, let’s deal with the /data volume - this can be unmounted and wiped directly with little risk. First, I edited /etc/fstab and commented out the /data mount on the off chance that the system freaks out and reboots. We don’t want the OS to get stuck trying to mount a volume that has been half-wiped. Next we need to unmount the volume. But, before we can do that, we need to check to see what’s using it.

someone@somehost:/$ sudo find /data -exec fuser {} \;
/data/mysql:         29890c
/data/mysql/ibdata1: 29890
/data/mysql/relay-log.info: 29890
/data/mysql/mysqld-relay-bin.003148: 29890
... snip ...
/data/mysql/ib_logfile1: 29890
/data/mysql/ib_logfile0: 29890
/data/mysql/mysqld-relay-bin.index: 29890
/data/squid3:        31128c
/data/squid3/swap.state: 31128

OK, in this particular case, mysql and squid are accessing this volume so they need to be shut down before we can proceed.

sudo service mysql stop
sudo service squid3 stop
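
With those services stopped, the volume should be free to unmount. The fstab edit mentioned earlier can also be scripted rather than done by hand. This is a rough sketch, not what I actually ran: comment_out_mount is a made-up helper, and it assumes the usual whitespace-separated fstab layout with the mount point in the second field.

```shell
# Hypothetical helper: print an fstab with the entry for one mount point
# commented out. It writes to stdout and leaves the input file untouched.
comment_out_mount() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { print "#" $0; next } { print }' "$1"
}

# Intended use (left commented out here):
# comment_out_mount /etc/fstab /data > /tmp/fstab.new   # review, then copy into place
# sudo umount /data
```

Writing to a separate file first means a typo can’t leave you with a broken /etc/fstab on a box you can only reach over ssh.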

Let’s get to shredding!

someone@somehost:/$ time sudo shred -vfz /dev/vdb
shred: /dev/vdb: pass 1/4 (random)...
shred: /dev/vdb: pass 1/4 (random)...491MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...507MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...527MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...564MiB/40GiB 1%
shred: /dev/vdb: pass 1/4 (random)...601MiB/40GiB 1%
... snip ...
shred: /dev/vdb: pass 4/4 (000000)...37GiB/40GiB 92%
shred: /dev/vdb: pass 4/4 (000000)...38GiB/40GiB 96%
shred: /dev/vdb: pass 4/4 (000000)...40GiB/40GiB 100%

real    12m13.907s
user    1m42.318s
sys     4m34.889s

I prefixed the shred commands with ’time’ so I could see how long the process took. Interestingly, the speed of these shreds varied quite a lot. In the above case, there were two machines in that class and their shred times were 12 minutes and 19 minutes respectively. However, there was another pair of hosts, with slightly larger /data volumes, whose times were radically different - 25 minutes vs. 331 minutes. Like I say, these were some simple VPS servers, so I guess the storage backend that the slower volume was sitting on was pretty bottlenecked.
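
Since the -z flag makes the last pass write zeroes, one way to double-check a finished wipe is to read the volume back and confirm nothing but zero bytes comes out. This is a sketch of that idea, not something from my original session: is_all_zero is a hypothetical helper, and it only tells you what the OS reads back, not what the storage backend keeps underneath.

```shell
# Hypothetical helper: succeed only if FILE contains nothing but zero bytes.
# tr deletes every NUL byte; if even one byte survives, the file is not all zeroes.
is_all_zero() {
    leftover=$(tr -d '\000' < "$1" | head -c 1 | wc -c)
    [ "$leftover" -eq 0 ]
}

# Intended use, from a root shell (reads the whole device, so it takes a while):
# is_all_zero /dev/vdb && echo "reads back as all zeroes"
```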

So, now we’ve got the easier, non-root volume wiped to a pretty satisfactory degree. How do we go about wiping the root volume? First of all, we need to disable any remaining non-critical services - even things like syslog, cron, atd and so on. We want to quiesce the system as much as possible. Then you want to systematically go through your filesystem and delete anything that you think is sensitive. For example, things like:

All of your logs from /var/log
All users’ data from /home, even empty directories, but leave in place the home directory of the user you’re using to wipe the system. Watch out for dotfiles.
Crontabs: /etc/crontab, /var/cron/ or /var/spool/cron
Root’s home directory contents - /root
Config files in /etc might have passwords or keys.
Everything in /tmp
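
Before deleting anything, it can help to print the candidates and eyeball them first. The following dry-run sketch only lists files. The function name, the exact path list and the keep-user argument are my own assumptions for illustration, and config files under /etc still need to be reviewed by hand.

```shell
# Hypothetical dry-run helper: print the files that would be removed under ROOT,
# skipping the home directory of KEEP_USER (the account doing the wipe).
list_cleanup_targets() {
    root="${1%/}"
    keep_user="$2"
    for rel in var/log home root tmp var/spool/cron etc/crontab; do
        [ -e "$root/$rel" ] || continue
        # Prune the kept user's home; print every other regular file, dotfiles included.
        find "$root/$rel" -path "$root/home/$keep_user" -prune -o -type f -print
    done
}

# Intended use, from a root shell so /root is readable:
# list_cleanup_targets / someone | less
```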

I basically tried to remove everything I could without killing the system yet. So, I’ve deleted everything I think might be sensitive from the root volume, but there is still a small risk that the deleted files could be recovered by a determined individual. I wanted to scrub the blocks where these files used to live. So, I used dd to create one big file which filled up all of the free space on the volume. Since it’s consuming all free space, I can be sure that any blocks that used to belong to sensitive files now belong to this new file. Then I used shred on this file to write over it, thereby thoroughly overwriting the old sensitive data.

dd if=/dev/zero of=/junk

This filled up the disk in a couple of minutes. I now have a big file sitting on the blocks that used to be the home of my files. This alone may be enough for you as you’ve written over the blocks once. Just to be on the safe side, I ran shred on this file to make sure I had taken a few passes over the blocks.

sudo shred -vfz /junk
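
The two steps above can be wrapped into one function. This is a sketch under a few assumptions: fill_and_shred is a made-up name, bs=1M assumes GNU dd, and the optional size cap is my own addition so the function can be tried on a scratch file without actually filling a disk.

```shell
# Hypothetical helper: fill TARGET with zeroes, then shred it in place.
# MAX_MIB is optional and only exists for trying this out safely.
fill_and_shred() {
    target="$1"
    max_mib="$2"
    if [ -n "$max_mib" ]; then
        dd if=/dev/zero of="$target" bs=1M count="$max_mib"
    else
        # No cap: dd stops with "No space left on device", which is the goal here.
        dd if=/dev/zero of="$target" bs=1M || true
    fi
    sync
    # Three random passes by default, then a final pass of zeroes (-z).
    shred -fz "$target"
}

# Safe trial run on a 2 MiB scratch file:
# fill_and_shred /tmp/junk-test 2
```

Without the second argument it behaves like the dd and shred commands above and runs until the volume is full.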

So, at this point, I should be in pretty good shape. The swap space has been scrubbed. The non-root partitions have been scrubbed. All of the sensitive data from the root partition should be gone. Buuuut, I’m still a little paranoid. So, I want to try one last thing. I want to really burn the hosts to the ground - just to be on the safe side. I had to try this process out on a vagrant box a few times to get all of the steps right. I created a ramdisk, and set up a basic environment in it using debootstrap. You’ll need more than 1G of RAM to do this as the image is a little over that. Then, I chroot’d into the ramdisk and shredded the disk from there. The idea is that we have a temporary filesystem in RAM which should allow the system to keep going while the filesystem on disk disappears.

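# Create a RAM-backed filesystem to hold the temporary environment
mkdir /root/tmpfs/
mount -t tmpfs tmpfs /root/tmpfs/
# Install a minimal Ubuntu (trusty) userland into it
debootstrap --variant=buildd --arch amd64 trusty /root/tmpfs/
# Expose the real device nodes inside the chroot
mount --bind /dev /root/tmpfs/dev
chroot /root/tmpfs/
# From here on we are inside the chroot
mount -t proc proc /proc
# Overwrite the whole disk, finishing with a pass of zeroes
shred -vz /dev/vda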
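
Because the debootstrap image has to fit in RAM, it may be worth checking free memory before starting the steps above. This is a hypothetical pre-flight check, not part of what I ran: the function name and the 1.5 GiB threshold in the example are assumptions, and MemFree is a deliberately conservative figure since it ignores memory the kernel could reclaim from caches.

```shell
# Hypothetical pre-flight check: is there at least MIN_KIB of free memory?
# Reads a /proc/meminfo-style file, where MemFree is reported in kB.
has_free_ram() {
    min_kib="$1"
    free_kib=$(awk '$1 == "MemFree:" { print $2 }' "${2:-/proc/meminfo}")
    [ -n "$free_kib" ] && [ "$free_kib" -ge "$min_kib" ]
}

# Example: insist on 1.5 GiB free (an assumed margin) before running debootstrap.
# has_free_ram 1572864 || echo "not enough free RAM for the ramdisk" >&2
```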

Now you’re left with a system with a (hopefully) totally wiped disk. None of the usual commands to halt the system were working at this point (halt/shutdown/poweroff), so I had to do a ‘kill -SEGV 1’ to force the system to core and finally completely retreat from the machine.

So, did this work? Mostly. There was only one case where the ssh connection dropped early. It was about 50% through the first shred pass. I feel reasonably happy that I didn’t leave anything important behind.

NOTE: When using shred to wipe files, make sure to pay attention to this warning from the manpage.

CAUTION: Note that shred relies on a very important assumption: that the
file system overwrites data in place. This is the traditional way to do
things, but many modern file system designs do not satisfy this assumption.
The following are examples of file systems on which shred is not effective,
or is not guaranteed to be effective in all file system modes:

* log-structured or journaled file systems, such as those supplied with AIX
  and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)

* file systems that write redundant data and carry on even if some writes
  fail, such as RAID-based file systems

* file systems that make snapshots, such as Network Appliance's NFS server

* file systems that cache in temporary locations, such as NFS version 3
  clients

* compressed file systems

In the case of ext3 file systems, the above disclaimer applies (and shred is
thus of limited effectiveness) only in data=journal mode, which journals file
data in addition to just metadata. In both the data=ordered (default) and
data=writeback modes, shred works as usual. Ext3 journaling modes can be
changed by adding the data=something option to the mount options for a
particular file system in the /etc/fstab file, as documented in the mount man
page (man mount).
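
This caution matters for the step where a file on the root filesystem gets shredded. Shredding a whole device bypasses the filesystem entirely. To see which journaling mode a mounted filesystem reports, something like the following sketch can help. journal_mode is a made-up helper, and it assumes the kernel lists a data= option in /proc/mounts, which is not guaranteed on every setup.

```shell
# Hypothetical helper: print the data= journaling mode listed for a mount point
# in a /proc/mounts-style table, or "unlisted" if no data= option is shown.
# If the mount point appears more than once, the last entry wins.
journal_mode() {
    awk -v mp="$1" '
        $2 == mp {
            mode = "unlisted"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^data=/) mode = substr(opts[i], 6)
        }
        END { if (mode != "") print mode }' "${2:-/proc/mounts}"
}

# Example:
# journal_mode /
```

An answer of "journal" is the case the manpage warns about. "unlisted" just means the option wasn’t reported, so check the mount options by hand.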