Thursday, October 26, 2006

Lock Up Your Valuables

If you're going to keep backups of your important information, it only makes sense to protect those backups. This is doubly true if you're storing your backups off site. If you have your backups on the internet, this is a no-brainer. The best way to do this is to put the data in some kind of container that is locked away digitally. No one can see through the container, and nobody can open it without the key. In the digital world, this is accomplished by encryption.

There are several types of stored data encryption software, from FOSS to Top Secret; from mobile phone software to hardened enterprise appliances; from file-by-file to whole disk. Each of these types has its place in the world of Information Security. I will attempt to treat the most relevant ones here. Hopefully by the end of this post you'll know what encryption is, why it's important to encrypt your valuable data, and what the best method is for you.

Encryption and cryptography are much too broad to cover in depth here, but if you'd like to learn more about their history, details, and uses, I recommend you start with the Wikipedia page and with Bruce Schneier's best known books, Applied Cryptography and Practical Cryptography. I haven't read either of these, but I have a decent idea of the principal ideas behind cryptography and encryption. I have neither the aptitude nor the desire to learn more about these fields. Here is a very brief explanation and history of cryptography and encryption, which may or may not be technically accurate (but it's close enough).

Cryptography is the use of codes or ciphers to make the meaning of a message incomprehensible to outsiders, even when it is transmitted between two parties in clear view. Both parties must have a key to decrypt the code. This can be done by memorizing a substitution pattern, by using a physical device, by using a computer to keep track of the encryption and decryption code, by making use of a one-time pad, etc. Each of these has its advantages and disadvantages. As a general rule, usability comes at the cost of security. Almost all practical cryptographic techniques can be broken by modern computers given enough time (a correctly used one-time pad is the notable exception), but some are easier to break than others due to flawed implementation.
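To make the idea of a shared key concrete, here is a minimal sketch of a one-time pad in Python. The function names (make_pad, xor_bytes) and the sample message are illustrative assumptions, not something taken from a particular product. The sketch assumes the pad is truly random, at least as long as the message, shared secretly ahead of time, and never reused.

```python
import secrets


def make_pad(length: int) -> bytes:
    """Generate a random pad; it must be as long as the message and used only once."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR each data byte with the matching pad byte.

    The same operation both encrypts and decrypts.
    """
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"MEET AT NOON"
pad = make_pad(len(message))           # both parties need a copy of this key
ciphertext = xor_bytes(message, pad)   # what an eavesdropper would see
recovered = xor_bytes(ciphertext, pad) # the receiver applies the same pad
```

Anyone holding the pad can recover the message, which is why distributing and protecting the key is the hard part in practice.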

The earliest ciphers were simple letter or word substitution ciphers, such as replacing each character with a number or letter. Julius Caesar used a cipher, now named after him, that shifted each letter a fixed number of places down the alphabet. The Spartan scytale relied on both parties having a cylinder of equal size -- a physical decryption key of sorts. Not a whole lot happened until the advent of basic computers -- in the mid 1800s by Charles Babbage! But during World War II, the use of cryptography (and cryptanalysis) really took off. The most famous bits of cryptography during this era were the Enigma machine and the Polish mathematicians' breaking of this (by hand, no less), the American decoding of the Japanese diplomatic and, after Pearl Harbor, tactical encryption, and the American Marines' use of Navajo "Code Talkers" to relay messages to and from the front lines. Modern powerful multipurpose computing machines have ushered in the age of Modern Cryptography and its various methods and techniques for encryption.
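A simple letter substitution of the kind described above can be sketched in a few lines of Python. This is a shift cipher, the scheme usually called the Caesar cipher; the function name shift_cipher and the sample message are illustrative assumptions. It is trivially breakable and is shown only to illustrate the idea.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_cipher(text: str, shift: int) -> str:
    """Replace each letter with the one `shift` places along the alphabet.

    Characters that are not letters (spaces, punctuation) pass through unchanged.
    """
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


secret = shift_cipher("ATTACK AT DAWN", 3)   # "DWWDFN DW GDZQ"
plain = shift_cipher(secret, -3)             # shifting back recovers the text
```

Here the "key" is just the shift amount, which both parties have to know.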

Now that the obligatory background information is out of the way, we can start on the meat of the post. I find that it is best to think of encryption software by its functionality. What does the software do and how can that be useful? In this sense, there are three categories of stored data encryption: file level encryption, file vault encryption, and whole disk encryption. Note that I will not be discussing cryptographic protocols, such as SSL/TLS, for securing data as it crosses a network.

File level encryption or filesystem level encryption is a method of encrypting individual files on a disk. Usually this requires the user to manually select to encrypt a file. Some software allows the user to specify that a directory in its entirety is encrypted, including new documents created or put into this directory. Windows uses the Encrypting File System (EFS), and OS X uses its FileVault. Each of these automates decryption when the user logs into the computer. However, this means that anyone who has access to this login has access to the sensitive files. It also makes transporting the files encrypted a challenge: they are decrypted in transit, but are difficult to copy when encrypted (or rather, they are difficult to decrypt after they have been moved when encrypted). Other programs can be used which can overcome the latter difficulty, but which do not solve the first one and may not provide the same ease of use as the integrated products.

What I call "file vault encryption" others call "disk encryption". I think this is easily confused with "full disk encryption" so I will continue to use my terminology, despite the possible confusion with Apple's FileVault. Whatever you want to call it, file vault encryption creates a single file in which all data is stored encrypted. Typically the software will mount this file as an additional hard drive in your computer, making access to the data easy. This type of encryption is very easy to transfer to another computer or to other media -- you just copy the single file. However, it typically requires entering a secondary password after logging into the computer.

Full disk encryption or whole disk encryption usually refers to encrypting the entire boot device. This ensures that all of the data on the disk will be encrypted, including temporary files, working files like the ones Microsoft Word creates, and the scratch disk or virtual memory. Encrypting all of this data is most appropriate for mobile computers which are likely to be lost or stolen. However, this security costs performance. Also, once the user logs into the computer, all files are copied and transmitted unencrypted. In addition to the fact that transporting the data requires additional encryption, if the hard drive is damaged or if the boot sector is overwritten, the data is essentially irretrievable.

Of these three types, each has its proper use. The least useful type of stored data encryption of the three is the file level encryption. It offers the fewest benefits with the highest risks. In fact, I would argue that it is completely useless in comparison with file vault encryption, which performs many of the same functions with the added bonus of transportability. In addition, the fact that the vault is mounted to a drive letter clearly delineates which data is encrypted and which data is not encrypted. Full disk encryption should be used anywhere the risk of computer theft or loss is moderate, in addition to some high security environments. And some form of encryption should be used on all backed up data.

Of the many dozens of attacks where personal information has been lost, it is unclear how many were preventable by encrypting the data. However, it is a good bet that every lost or stolen laptop or backup tape would have yielded no data if proper encryption methods had been used. And many of the hacking incidents may have been preventable if the sensitive information had been encrypted properly. While it may seem costly for a company to implement, the encryption software and practices cost hardly anything compared to an incident like the one the Department of Veterans Affairs suffered.

The takeaway lesson here is to keep your important stuff protected. It's not enough to just keep it in a safe place; you should keep it in a secure place. Whether that is a safety deposit box at your bank, a safe in your home, or a vault at Ft. Knox, you can't afford to let your valuables just sit around unprotected. How much cheaper it would seem in retrospect to buy a safe than to try replacing a family heirloom after it is stolen.

Thursday, October 19, 2006

Backups

This tip will either be a waste of time or it will save you more grief than you can imagine. Backing up your important information can make the difference between taking 10 minutes to restore your data and spending weeks and hundreds of dollars to get back anywhere from none to all of it. I lost my data once and didn't have the money to spend restoring, so I spent over a year and a half trying out different software and techniques before I was finally able to rebuild the data I lost -- a lot of irreplaceable pictures.

So now that you know you should be backing up your data, how do you do that? The first step is to identify what you want to back up. This isn't as easy as it might sound at first. Things tend to get scattered all across your hard drive, floppies, CDs, etc. The only thing worse than not backing up anything is backing up everything but a key document -- by the time you realize you've lost it, it may be too late to recover. Once you've got it all collected, find a spot on your hard drive where you can store everything.

Now that the first step is completed, it's time to look at your backup options. Which backup method you choose is largely a matter of personal preference. The four general ways to back up data are online, nearline, offline, and offsite. There are benefits to each, as well as drawbacks. Here are some brief descriptions.

Online storage backups are not really backups; they are redundancies in the way the data is stored, meaning that a single dead hard drive does not lead to data loss. However, for the purposes of our discussion, they can be considered a method of backup. Typical online storage would be something like an internal RAID with fault tolerance, NAS/SAN, or some other method of keeping data instantly accessible and current in the event of a failure. Also, you don't have to think about performing backups, because data is automatically backed up whenever you change or update it. However, in the event of a complete system failure, all information will be lost. This could be due to theft, lightning and other natural disasters, structure failure, fire, etc.

Nearline storage allows you to keep data close at hand, but not fully current or instantly accessible. This would be a true replication of data, so that it exists both on the computer and on another device. Typical nearline storage devices are USB flash drives, external hard drives, secondary internal hard drives, or any other type of storage usually connected to the computer or across a network. The backed up data is quick and easy to access in the event of a primary storage failure. This type of backup is probably most common in home environments.
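As a rough illustration of a nearline backup, here is a minimal Python sketch that copies one folder into a dated folder on another drive. The function name backup_folder and the paths in the usage comment are hypothetical; substitute your own folder and your own external drive or network share.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def backup_folder(source: Path, backup_root: Path, day: Optional[date] = None) -> Path:
    """Copy `source` into a new dated folder under `backup_root` and return its path."""
    day = day or date.today()
    target = backup_root / f"{source.name}-{day.isoformat()}"
    # copytree refuses to overwrite: it raises FileExistsError if that
    # day's copy is already there, so an older backup is never clobbered.
    shutil.copytree(source, target)
    return target


# Example usage (hypothetical paths):
# backup_folder(Path.home() / "Documents", Path("/mnt/external/backups"))
```

Keeping each copy in its own dated folder uses more space than overwriting a single copy, but it means a file that was damaged last week does not silently replace the good one.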

Offline storage is data backed up to removable media such as blank CDs or DVDs (optical media), floppy disks, zip disks, storage tapes, etc. These media are easily stored elsewhere, since they are typically much cheaper and more portable than the other solutions. Offline storage requires that you locate the media and put it in a reader attached to your computer. One of the biggest problems with this type of storage is that sometimes the media goes bad. This is especially true for optical media.

Offsite storage is typically an offline storage system where some or all of the media is kept in another physical location. For example, if you back up your home computer's data to DVD and store the DVD in your desk drawer at work, you have an offsite backup. This may accomplish your goals just fine, or you may want to look at a more secure solution, such as a safety deposit box or a professional service which will pick up and store your media.

Another form of offsite storage is internet-based storage. There are plenty of sites out there that will give you free storage, from free web hosts, to file sharing sites, to dedicated backup sites, to jumbo sized email hosts. Some of these are better than others for keeping backups of sensitive information. For example, the backup sites linked all claim to encrypt your data so that only you can retrieve it. In general, I don't trust proprietary encryption and I don't trust somebody else to encrypt the data for me. So you'll probably want to encrypt it before uploading (that's a topic for another day...).

While any data backup is better than none at all, I recommend keeping a few different backups using different methods. My important data resides in several locations. First, it is on my local hard drive. Once a week or so I copy this to a file server running a RAID. Every once in a while I'll copy the backup to an internet-based offsite storage system. This ensures that I can survive several failures without loss of data.

Don't forget how critical your backups are! Don't store the backups where they may get stolen, lost, damaged, or otherwise be useless. Also don't forget to keep this data secured and/or encrypted. And it might be handy to test your backups regularly to make sure you can restore the information. Many businesses learn these lessons the hard way by losing their only copy of data, by having the information leak out because they treated their backups as if they were blank, or by not being able to get their data back when they really needed it. I warned you.
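One simple way to test a backup is to compare checksums of the original files against the copies. The sketch below is a minimal example in Python under that assumption; the function names file_digest and verify_backup are illustrative. It only confirms that the backup matches the source folder as it is right now. It does not replace actually restoring a file now and then.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source: Path, backup: Path) -> List[str]:
    """List source files that are missing from the backup or differ from it."""
    problems = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = backup / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel}")
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append(f"differs: {rel}")
    return problems
```

An empty list means every file in the source folder has an identical copy in the backup folder.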

Businesses practice "Risk Management," determining an acceptable amount of risk to allow as a tradeoff for cost. But they're only protecting their money; you have to protect much more valuable property. Whether you're backing up your Great Grandmother's cookie recipe, your college thesis paper, or your pictures of your kids' first Christmas, these things are irreplaceable. With the free tools outlined here, the only cost to you is your time.

The final lesson in data backup is trust. Backups are an insurance policy and the most important part of insuring against loss is trust. So don't listen to the lizard or the duck when they tell you that cheap insurance is better. The truth is that if you ever have to cash in one of these things, they'd better pay off. If you don't have 110% confidence that you can recover quickly and easily after a disaster, then it's time to start looking for somebody that you can trust to make that happen.

Monday, October 09, 2006

Free Software Advantage

I was pretty busy last week and didn't get the post up on Wednesday like I'd planned. I decided that rather than rush something out that isn't quite done, I'd hold off. So I'll post it sometime this week.

One problem that I'm having with these posts is that they tend to get drawn out and give way too much information. Also, I know that they tend to jump around and be less understandable than I mean them to be. These are both problems that result from a lack of good planning. I haven't really got a plan for the blog, and I don't really have a plan for each topic. I just start writing and whatever related stuff I think of, I put in there. So I'm thinking I might revisit some of these first ones sometime down the road and do them right. Maybe I'll start doing a week on a theme and posting something short every day for some of the bigger themes.

Something that I wanted to point out is that most of the links and suggested programs are free. I like free, because it lets me test and compare them before deciding on something. Many commercial programs have a trial period, but I find that I usually can't fit my testing and comparisons into 14 or 30 days. Free software usually works out fairly well, though the commercial ones are much more polished and have better features and support.

I really like to use and recommend FOSS whenever I can. This type of software has very few restrictions, is quickly patched and fixed, and can be just as good as commercial stuff sometimes. A mature software product is the same, whether a large company created it or whether it was made by a coordinated group of unpaid volunteers. Most of the time the people who contribute to the software packages are professional programmers, anyway.

This isn't to say that you should stay away from commercial products. Some of them are very good with no real FOSS alternative. Sometimes the commercial tools will have features that make them well worth the cost, especially if they have better stability, a better interface, save time, or have some feature that you really need. In businesses, support and accountability are also crucial in a software product. This is why companies tend to shy away from using free software -- someone on the payroll would end up providing that support, and that can get expensive. Sometimes it ends up costing less to buy software than to use a free or lower priced alternative.

Oftentimes with non-software, the same is true. Is it worth saving $25 on a $250 purchase? What does that 10% price difference really buy? Sometimes these things aren't quantifiable, sometimes they are. If the $25 buys a much better experience, then you're more likely to use the product. The more expensive product may end up costing you less per use than the cheaper one. Many times I don't consider price at all, and just buy whatever will be the best product for me. I have never found myself wishing I'd saved the money, but I often find myself wishing that I'd spent the little bit extra for a better product.