Linux Literacy Session 7: Backups and Security

Backups and security are huge topics, and both require you to look at your situation, values and needs.

Backups

Ideally, you would like your backups to be:

  • Automated so you don’t have to remember to do them
  • Frequent so if your system does go down then you will not lose much data you care about
  • Stored offsite so that if your computer gets damaged your backups will not be damaged at the same time
  • Robust so that a small error in your backup does not mean you lose all your data
  • Secure so that data you would like to keep confidential will not be “leaked” via your backups
  • Historically thorough so that you can restore files created long ago
  • Recoverable even if you lose your computer system. Ideally you should be able to restore files on a completely different system.

In general, the closer your backups come to these ideals, the more they cost. Some of these ideals also conflict with each other.

Here are some questions to consider before setting up backups:

  • What do you want to protect against? A dead hard drive? An electrical surge? Computer theft? Your house burning down? Other events?
  • What data would you miss the most if it disappeared? Generally people value their documents and creations over program files or downloaded files, but you should figure out what data is most valuable to you.
  • How much data do you have to back up?
  • Do you want a historical record (so you can restore a file you accidentally deleted 6 months ago) or is a mirror good enough?
  • How much money are you willing to spend?
  • How confidential is the data you want to back up?

Backup Media

Part of choosing a backup strategy involves deciding where to back your data up. There are many different options, all of which have their advantages and disadvantages.

CDs/DVDs

  • Low upfront cost
  • Not as reliable as you would hope: burned discs can become unreadable after only a few years
  • Awkward to automate
  • Fairly low capacity

USB Keys/Drives

  • Portable
  • Somewhat awkward to automate unless you keep the key plugged into your computer at all times
  • USB drives have lots of capacity; USB keys less so
  • The quality of USB keys varies a lot
  • Can be encrypted if you want (but this makes automation harder)

Multiple Hard Drives

  • Can be local or on your local network
  • Not portable
  • Fairly reliable (programs like smartmontools can give you some indication when a drive is about to die)
  • Makes it easy to automate backups
  • Can have as much space as you are willing to pay for
  • Can be encrypted
  • Vulnerable to theft or catastrophic hardware failure (e.g. fire, lightning strikes)

Online Services/“The Cloud”

  • Can be relatively cheap for small amounts of data (some services offer a few GiB free; Ubuntu One, which offered 5GiB, has since shut down)
  • Offsite (as long as you have your credentials)
  • Easily automatable
  • Puts your data into another organization’s control (encrypt your own data!)
  • Doing full backups gets expensive

Backup Types

In addition to a multitude of backup media, there are different types of backup to consider.

You may find yourself using a combination of backup types and media to give different types of data different levels of protection.

Mirroring

Idea: keep a full copy of your data in another location. RAID is a technique where multiple hard drives can keep an “instant copy” for you automatically, but you can also mirror your data periodically.

  • Keeps a full copy of your data, usually in a usable format.
  • Takes up a lot of space (as much as your data).
  • Does not keep much historical context.
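  • Only protects you up to the last synchronization, and mistakes are mirrored too: a file deleted or corrupted in the original is deleted or corrupted in the copy at the next sync (which is why RAID is good for redundancy but is not a backup).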

The rsync program can be used to set up mirrored backups.
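
As a rough sketch of what a periodic mirror might look like, the Python snippet below builds an rsync command line. The function names and paths are made up for illustration. The options used are standard rsync options: -a (archive mode, which preserves permissions and timestamps) and --delete (remove files from the mirror that no longer exist in the source). The first function only builds the command, so you can inspect it before running anything.

```python
import subprocess

def mirror_command(source, destination):
    """Build an rsync command that makes destination a mirror of source."""
    # A trailing slash tells rsync to copy the contents of source,
    # rather than creating source as a subdirectory of destination.
    if not source.endswith("/"):
        source += "/"
    return ["rsync", "-a", "--delete", source, destination]

def run_mirror(source, destination):
    """Run the mirror; raises CalledProcessError if rsync reports failure."""
    subprocess.run(mirror_command(source, destination), check=True)

# Hypothetical paths, for illustration only.
print(" ".join(mirror_command("/home/alice/Documents", "/media/backup/Documents")))
```

Because --delete removes files from the mirror, a file you delete by accident also disappears from the mirror at the next run. That is the "does not keep much historical context" limitation in practice.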

Full Backups

Idea: copy all of the data you want to keep to an archive. If you need to restore your data, restore the one archive file.

  • Keeps a full copy of your data.
  • Takes up a fair amount of space (although compression is possible). If you keep many full backups then you can easily use more space than you use for your data.
  • Keeping a historical record is possible, but takes up lots of space.
  • There is only one file archive to worry about.
  • If your full backup gets corrupted you might lose your entire backup.

Many programs can be used to make full backups. One of the most common command-line tools is tar.
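
To make the idea concrete, here is a minimal sketch of a full backup using Python's built-in tarfile module, which reads and writes the same archive format as the tar program. The function names are illustrative, and the sketch assumes the archive is written somewhere outside the directory being backed up.

```python
import os
import tarfile

def full_backup(source_dir, archive_path):
    """Write everything under source_dir into one gzip-compressed tar archive."""
    # Store paths relative to the directory's own name (e.g. "Documents/..."),
    # so the archive can be unpacked anywhere, including on another system.
    top_level_name = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=top_level_name)
    return archive_path

def list_backup(archive_path):
    """Return the names stored in the archive, to check what was saved."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

Listing the archive right after creating it is a cheap sanity check that the files you care about were actually included.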

Incremental Backups

Idea: Make an archive containing only the changes since the last backup.

  • Incremental backup archives are smaller than full backups.
  • Good for keeping historical archives of data.
  • Restoring to the latest data may mean restoring many archives.

The Deja Dup program provides an easy frontend for incremental backups. This program automatically encrypts your backups.
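
The following Python sketch illustrates the basic idea of an incremental backup. It is not how Deja Dup works internally. It assumes, for simplicity, that a file has changed if its modification time is later than the time of the previous backup; the function names are hypothetical.

```python
import os
import tarfile

def changed_since(source_dir, since_timestamp):
    """Return paths of files under source_dir modified after since_timestamp."""
    changed = []
    for folder, _subdirs, filenames in os.walk(source_dir):
        for filename in filenames:
            path = os.path.join(folder, filename)
            if os.path.getmtime(path) > since_timestamp:
                changed.append(path)
    return sorted(changed)

def incremental_backup(source_dir, archive_path, last_backup_timestamp):
    """Archive only the files changed since the previous backup."""
    changed = changed_since(source_dir, last_backup_timestamp)
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in changed:
            # Store each file under its path relative to source_dir.
            archive.add(path, arcname=os.path.relpath(path, source_dir))
    return changed
```

Note that this simple version does not record deleted files, so restoring from it would bring back files you had removed on purpose. Real incremental backup tools keep track of deletions as well.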

Differential Backups

Idea: Periodically do a full backup. In between, back up all the files that have changed since the last full backup.

  • Takes more space than an incremental backup, but less than a full backup.
  • Restoring to the latest data requires touching two archives.
  • Can be used to keep a fair amount of historical data.

Some programs like Backup My Files are a hybrid between mirroring and differential backups. Differential backups are more common in the Windows world (which is a shame).
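
The difference between incremental and differential backups shows up most clearly at restore time. The small Python sketch below works out which archives you would need in order to restore the latest data under each scheme. The schedule and the function name are made up for illustration.

```python
def archives_to_restore(backups, scheme):
    """List the archives needed to restore the most recent state.

    backups is a chronological list of (name, kind) pairs, where kind is
    "full" or "partial". scheme is "incremental" or "differential".
    """
    kinds = [kind for _name, kind in backups]
    if "full" not in kinds:
        raise ValueError("at least one full backup is required")
    # Index of the most recent full backup.
    last_full = len(kinds) - 1 - kinds[::-1].index("full")
    chain = [name for name, _kind in backups[last_full:]]
    if scheme == "incremental":
        # Every archive since the last full backup is needed, in order.
        return chain
    if scheme == "differential":
        # Only the last full backup and the newest differential are needed.
        if len(chain) > 1:
            return [chain[0], chain[-1]]
        return chain
    raise ValueError("unknown scheme: " + scheme)

week = [("sun", "full"), ("mon", "partial"), ("tue", "partial"), ("wed", "partial")]
print(archives_to_restore(week, "incremental"))   # ['sun', 'mon', 'tue', 'wed']
print(archives_to_restore(week, "differential"))  # ['sun', 'wed']
```

The incremental chain grows with every backup, while the differential restore never needs more than two archives. The price is that each differential archive repeats everything changed since the last full backup.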

Version Control

Idea: Check files into a version control system, which tracks changes to each file you want to back up. Then back up the version control files into an archive.

  • Good verification of data.
  • Awkward to remember to check in files.
  • Often binary files (movies, music, pictures) are difficult to deal with, because version control is usually concerned with text files.
  • Keeps a complete historical archive.
  • Thanks to compression, the full history can be relatively small (roughly three times the size of your original data).
  • Version Control systems are intended to be robust (but a bug could jeopardize all your backed up data).

Three version control systems that are currently trendy are git, mercurial, and bazaar. They are all command-line programs, but frontends are available. Of these, git may have the most momentum for backups (git-annex, git-bundle and bup).
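
Part of what "good verification of data" means is being able to tell whether a stored copy still matches the original. Version control systems such as git get this by identifying content by its hash. The Python sketch below applies the same idea outside version control, using SHA-256 checksums to compare a backup against a record made earlier. The function names are illustrative, not part of any tool mentioned here.

```python
import hashlib
import os

def file_checksum(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    manifest = {}
    for folder, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(folder, filename)
            manifest[os.path.relpath(path, root)] = file_checksum(path)
    return manifest

def damaged_files(original_manifest, backup_root):
    """Return relative paths that are missing or different in the backup."""
    current = build_manifest(backup_root)
    return sorted(
        name for name, checksum in original_manifest.items()
        if current.get(name) != checksum
    )
```

You would build the manifest when you make the backup, keep it somewhere safe, and run the comparison before you rely on the backup for a restore.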

Security

What Resources are at Stake?

  • Personal information/identity
  • Computing power
  • Administrative access (the “root user”) on your computer
  • Internet bandwidth
  • Data on your hard drive

Primary Threats

  • Intruders breaking into your online accounts (e.g. Gmail accounts)
  • Intruders breaking into your computer online (e.g. SSH worms)
  • Attacks on end-user software such as web browsers and Flash players
  • Attacks on your internet router or wireless
  • Physical theft (especially of laptops)
  • Hardware failure

Lesser Threats

  • Viruses (they exist, but there are not many of them)
  • Software installations containing malware (most software from the official repositories is safe. Other third-party software may not be.)

Safer Computing Practices

  • Keep your packages up to date with security updates. Don’t run a version of Ubuntu that no longer has security updates.

  • Avoid running non-administrative tasks as the administrative user.

  • Firefox add-ons such as HTTPS Everywhere and NoScript can reduce the ways your online accounts can be compromised.

  • Passwords should be long and complicated. (Long matters more than complicated.) Passwords for different applications should not be identical.

    Some people use password managers such as KeePassX to manage passwords. A password manager can also generate completely random passwords for you.

  • Expose as little of your computer to the Internet as possible. If you do expose your computer to the Internet, learn about the security implications of doing so.

    If you run your computer on untrusted networks, you might want to use a firewall on your computer, such as Gufw.

  • Consider encrypting data that you feel is particularly sensitive. One popular tool for this has been TrueCrypt http://www.truecrypt.org , although its development ended in 2014. (For licensing reasons TrueCrypt is not available in the Ubuntu repositories.)
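
Returning to the password advice above, the claim that length matters more than complexity can be checked with a little arithmetic. A password of n symbols, each chosen at random from an alphabet of k symbols, has n × log2(k) bits of entropy. The Python sketch below compares a short password drawn from all 94 printable ASCII symbols with a longer one drawn from lowercase letters only, and shows one way a random password might be generated. The estimate applies only to randomly generated passwords; passwords that people make up are usually much more predictable.

```python
import math
import secrets
import string

def entropy_bits(length, alphabet_size):
    """Bits of entropy for a password of randomly chosen symbols."""
    return length * math.log2(alphabet_size)

def random_password(length, alphabet=string.ascii_letters + string.digits):
    """Generate a password using a cryptographically secure random source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

short_complex = entropy_bits(8, 94)   # 8 symbols: letters, digits, punctuation
long_simple = entropy_bits(16, 26)    # 16 lowercase letters

print(round(short_complex, 1))  # 52.4
print(round(long_simple, 1))    # 75.2
print(random_password(20))
```

Every extra bit doubles the number of guesses an attacker needs, so the longer lowercase password is far harder to guess than the shorter complicated one.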

Homework

  1. Find two (additional) ways in which the backup ideals conflict with each other.

  2. Run at least one full backup of your data. What did you back up? Why? What backup method did you use? Why?

  3. Describe an effective and practical automated backup solution for your data. Answer the questions in the “Backups” section to evaluate your choice.

  4. Perform a password audit of your accounts. How many passwords do you use? How strong are they? How do you manage them? What passwords protect the resources most important to you?

Creative Commons Licence
This work by KW Freeskool is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
