Bash Course: Lesson 1

What is a Shell Script?
The shell (what you run when you use a terminal) is a command interpreter. Used by itself, on the command line, it acts as a way to interact with a computer, but it also has the features of a programming language, one that gives you access to all the unix tools and utilities you might normally use by themselves on the command line.

It's possible to program directly on the command line, using semicolons to separate commands (e.g. cd /home/paul; ls), but it is normally done in a shell program, usually called a shell script. These take the form of a text file. Shell scripts are commonly used in systems administration, and a knowledge of shell scripting is essential for it. Many aspects of a Linux/UNIX system involve shell scripts. For instance, on many traditional systems the boot procedure is controlled by the shell scripts in /etc/rc.d. Sysadmins often use shell scripts to automate tasks or put together simple tools. While Bash scripting does not have all the power of more fully featured programming languages, that drawback is often outweighed by the advantages of being easy to use and providing access to powerful UNIX tools you might already be familiar with.

Starting with a Bang
At its most basic, a shell script is just a list of commands that are executed in order, one by one. All shell scripts start the same way: with a #!, followed by the path to a command. This is known as a shebang (or sh-bang). It tells the system that the file is a list of commands to be passed to that interpreter. It's a form of file magic (see man 5 magic for the gory details).

The commonest form of shell script is the Bash Script. Not only is Bash (Bourne Again SHell) the default shell on most modern GNU/Linux systems, it is also particularly suitable for shell scripting. It is what we will be using on this course.

The first line of a Bash script always starts like this:

#!/bin/bash

Other examples are a Perl script:

#!/usr/bin/perl

or this, which is a self-destructing script:

#!/bin/rm

Hello World - the First Script
By long tradition the first thing you do when you learn to program is to write a simple example that outputs "Hello World!" This is how you do it with a Bash script.


#!/bin/bash

echo "Hello World!"

N.B. The script is saved as hello_world.sh. The .sh extension is a convention to say it is a shell script. It's not absolutely necessary but it's useful to remind you of what it is. If I'm writing a quick script that I might only use once or twice, I always use this extension so I can quickly figure out what it is later. If I write a script that I will be using a lot, just like a regular program, then I won't use it, but I make sure that these are stored in a special place on my file system that's used for storing programs (see Running the Script below).

Before You Begin
Before we can run the script there is one more thing we need to do --- set the right file permissions so it can run; specifically, set the "executable" bit. You do this with the chmod (short for CHange MODe) command. There are various ways of setting the permissions (see the man page) but at the very least you need to make sure that you, the owner, have "executable" permissions:

chmod u+x hello_world.sh

If you want everyone on your system to be able to run it you can use chmod a+x hello_world.sh or just chmod +x hello_world.sh as a shorthand.

Running the Script
To run the script you have to specify the path to it. If you are in the same directory as the script, you can type

./hello_world.sh

in a terminal. Otherwise give the full path, e.g.

/home/paul/hello_world.sh

It should say:

Hello World!

If it does, congratulations. Otherwise go back and check the file permissions (look for an x on the left if you do ls -l) and make sure it is all typed correctly. You can also use the shortcut

~/hello_world.sh

~/ is the shorthand for /home/[USER] --- /home/paul in my case.

You don't need to specify the path if the script is in a directory that's in your $PATH. You can type echo $PATH to find out what it is. When you type something on the command line the interpreter (e.g. Bash) scans this list and looks for executable files in these directories that match what you have just typed. It runs the **first** one it comes across (so you need to be careful what you name your scripts). Here's my path:

/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games

so first my system looks in /usr/local/bin, then /usr/bin, etc.
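The lookup described above can be watched in action. This is only a sketch: ls is an arbitrary example command, and FIRST_MATCH is a name made up for the illustration. command -v prints what Bash would run for a given name, and type -a lists every match in search order.

```shell
#!/bin/bash
# Print the search path one directory per line, in the order Bash searches it.
echo "$PATH" | tr ':' '\n'

# command -v prints what Bash would run if you typed the name (ls is just an example).
FIRST_MATCH=$(command -v ls)
echo "Typing ls runs: $FIRST_MATCH"

# type -a lists every match for the name, first one first.
type -a ls
```

If a script of yours shares its name with an existing command, type -a will show both, and the one listed first is the one that wins.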

If you want everyone on your system to be able to use your script you can put it in /usr/local/bin, as long as you have the right permissions. Then you can run it by just typing its name on the command line.

Another thing to do is create your own bin directory (mkdir ~/bin) and add it to $PATH. If you look in your .bash_profile file (cat ~/.bash_profile) you might see something that looks like this:
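The snippet this paragraph points to is not reproduced here. Going by the description that follows it (~/bin goes at the front of the path, but only if the directory exists), it was probably close to the fragment many distributions ship by default. Treat this as a reconstruction, not a copy of any particular system's file:

```shell
# Reconstructed fragment (an assumption, not taken from the original page):
# put ~/bin at the front of PATH, but only if that directory exists.
if [ -d "$HOME/bin" ]; then
    PATH="$HOME/bin:$PATH"
fi
```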

With the above added, you will now have ~/bin at the front of your path if that directory exists. This doesn't work for non-login shells however, such as terminal windows you open up in a GUI. Add the above code to ~/.bashrc and it will work in this case as well.

Comments
//"Good comments don't repeat the code or explain it. They clarify its intent. Comments should explain, at a higher level of abstraction than the code, what you're trying to do."// Steve McConnell, Code Complete

Comments are there to help you understand what is going on in a script or program. They are lines that will be ignored by the computer but can be read by humans. They are especially important if anyone else is going to read your code. Often, if you are dealing with somebody else's code and they have not put any comments in, or just bad ones, it can be quicker to start from scratch than to try to figure out what they were trying to do. In practice you should assume that somebody else will always read your code. Even if you are the only person that ever looks at it, I can guarantee there will come a point when you come back to some code, look at it and ask yourself, "What the hell was I trying to do here?" From experience I can tell you that this can happen in less than a week. The only ways to counter this are good programming style (clear program structure, good use of white space, etc) and the most important part of it: comment, and comment well.

The three common beginner's mistakes when commenting are:
- Not enough, or no comments.
- Too many, i.e. superfluous comments.
- Comments that just echo the code they are commenting on.

To clarify the last point: comments shouldn't just tell you what the code does. The code itself should do this. If you can't look at a snippet of code and see what it does, you should try rewriting it so it is clear. In scripting, unclear code can come about from sticking together messy command line hacks or writing 'spaghetti code' -- hard to understand shortcuts or long regular expressions compounded together. Some people like to do the latter to show how clever or obscure they can be, but the point of code is not to boost your ego. The point is to work -- and if other people can't understand it, it's not doing its job. Similarly, Bash scripting isn't fast, so there is no point in sacrificing clarity for speed.

Comments should be there to help you understand the structure and intent of the code --- what it should be doing, and how. You can also use temporary comments in your code to remind yourself to do something later on.

Comment lines in Bash always start with a #. You can put them on a line of their own (above the code) or append them to a line of code if they are short.

Here is the Hello World Script with comments:


#!/bin/bash
# says hello to the world by outputting a string to STDOUT

echo "Hello World!"

A Note on Output
We used echo above to produce some output. By default it goes to STDOUT (short for STandarD OUT; normally the terminal you are in when you run a command on the command line), as does the output from any other command. You can also use Bash to output to a file or another program just as you can on the command line.

A pipe ( | ) redirects output from one command to the input of another.

cat hello_world.sh | grep echo

> (angle bracket, or "arrow" for short) overwrites a file and >> (double arrow) appends output to the end. Here is a script that logs uptime with the date.

#!/bin/bash
# logs uptime prepended by today's date

date +%Y/%m/%d >> log.txt  # uses year/month/day (ISO-style) order, not US, for easy sorting
uptime >> log.txt
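The difference between > and >> is easiest to see side by side. The sketch below works on a scratch file created with mktemp so that nothing real is overwritten; DEMO_FILE and LINE_COUNT are names invented for the example.

```shell
#!/bin/bash
# Scratch file for the demonstration (hypothetical; removed again at the end).
DEMO_FILE=$(mktemp)

echo "first"  > "$DEMO_FILE"     # > creates or overwrites: the file holds "first"
echo "second" > "$DEMO_FILE"     # > overwrites again: the file holds only "second"
echo "third" >> "$DEMO_FILE"     # >> appends: the file now holds two lines

cat "$DEMO_FILE"
LINE_COUNT=$(wc -l < "$DEMO_FILE")
echo "Lines in file: $LINE_COUNT"

rm "$DEMO_FILE"
```

The word "first" never appears in the output, because the second echo replaced the whole file before anything was appended.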

Extended Output with Echo
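echo -n will avoid putting a newline on the end so you can string things together.

echo -e will print 'escaped' characters. You can also use hexadecimal character codes, e.g. echo -e "\xA9" outputs the © sign on a terminal that uses the Latin-1 (ISO-8859-1) encoding; on a UTF-8 terminal the same sign is echo -e "\xC2\xA9".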
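A short sketch of both options follows. The messages and the STATUS_LINE variable are made up for the illustration.

```shell
#!/bin/bash
echo -n "Checking disk"     # no newline yet
echo -n "..."               # still on the same line
echo " done"                # prints the rest and ends the line

echo -e "Name:\tPaul"       # -e turns \t into a tab character

# Capture the same three echoes to show they really form a single line.
STATUS_LINE=$(echo -n "Checking disk"; echo -n "..."; echo " done")
echo "$STATUS_LINE"
```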

What is a Variable?
You can think of a variable as a container like a box. It can hold things (it can also be empty). While a box in real life can contain objects, a variable contains data. The variable name is like a label on the box. It never changes, while the contents of the box or variable might. So for example, there might be a box on my desk labeled 'Inbox' that has a gas bill in it. We could have an equivalent variable called $INBOX whose value equals "gas bill" (value equals is another way of saying what data it contains).

Just like the contents of a box, the contents of a variable can change. This means we can keep track of something that changes over time and refer to it by the same name. We can also use a variable to define the contents once and then refer to it multiple times using this name. This helps prevent mistakes, and makes it **much** easier to change things at a later date.

For instance, in the uptime.sh script, it would be a good idea to use a variable to hold the name of the log file. That way if we want to change the name at a later date, we would only have to change the name of the file in one place, not in all the places we had used it. For this reason it is also useful to put variables like this right at the beginning of a script.

The contents of a variable in Bash can be an integer (a number), a string (a string of characters such as letters, numbers and punctuation e.g. a name or a sentence) or even another variable.

You don't have to do anything special before you use a variable in Bash. Some programming languages do require this (you must declare a variable and its type first); these are called 'typed' languages because the type of a variable matters. Bash, by contrast, is 'untyped' because it doesn't really matter. This makes it easier to use, but it does allow you to get yourself in trouble on occasion. Bash doesn't really care whether a variable contains a string or a number; it just makes a guess depending on what you are doing with it.

N.B. Unlike my Inbox, a variable can only hold one thing at a time. There are other data types that can hold more than one thing that we will learn about in another lesson.

Variable Names
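When you refer to a variable in Bash, its name begins with a dollar sign. A variable name can be pretty much anything you want it to be, as long as it is made up of letters, digits and underscores and does not start with a digit, but there are good variable names and bad ones. There are also conventions on how to name variables.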

N.B. **Variable names can not include spaces**. If you want to join words together in a variable the convention is to use underscores.

$A_LONG_VARIABLE_NAME

Using ALL_CAPS for variable names is another Bash convention. It helps them stand out in a script. It is not always necessary or desirable to do this. When to use them is partly a matter of personal preference but there are a number of factors that can help you decide.

There is a special class of variables that always use ALL_CAPS: external variables that you do not set in the script but inherit when you use Bash. These include "environment" variables such as $PATH (use env on the command line to get a list) and Bash built-ins such as $HOSTNAME (see the Shell Variables section of man bash).

When To Use All Caps
- For clarity, such as when a variable is going to be used with a command and needs to stand out
- When a variable is going to be preserved, i.e. kept and referred to over and over again
- When a variable will be unchanged, e.g. if it refers to the name of something
- When it refers to something outside the script
- When you might export it (pass it to another script or program)

With the uptime.sh script, it is clear that the log file variable meets most, or all of these criteria so it would be a good candidate to keep capitalized. We might choose to call it $LOG_FILE for example.

Good and Bad Variable Names
Beginners and bad programmers typically make two common mistakes when it comes to naming a variable: names that are too short, and names that are nonsensical (eg $L and $FRED). A good programmer, on the other hand, does not hesitate to use a long variable name. She aims to make a variable name as specific and as clear as possible, so that the next programmer can tell exactly what it does merely by looking at it. She might choose $UPTIME_LOG_FILE or $DAILY_UPTIME_LOG_FILE so it is clear exactly what it is for, and so the next programmer can easily add a reference to another log file if he chooses to. A good variable name is a useful form of comment in its own right.

Assigning Variables
Assigning a variable means giving a variable its value or contents. We do it like this:

UPTIME_LOG_FILE="uptime.log"

Note the lack of spaces around the equals sign.

N.B. When you assign a variable you do not use a dollar sign. As a rule, the dollar sign is left off when you set a variable and used when you read its value.

Referring to Variables
"Referring to variables" is another way of saying "use variables." It is more technically accurate to say that "I referred to variable $X" than "I used variable $X." Sometimes it is also referred to as "calling a variable." It is done by using them with the dollar sign.

# outputs the name of the log file
echo $UPTIME_LOG_FILE

Putting Them Together

#!/bin/bash

UPTIME_LOG_FILE="uptime.log"

date +%Y/%m/%d >> $UPTIME_LOG_FILE # uses year/month/day (ISO-style) order, not US, for easy sorting
uptime >> $UPTIME_LOG_FILE

Variable Substitution
Variable substitution means assigning the contents of one variable to another.

FOO=$BAR

$FOO now has the same contents as $BAR (foo, bar and foobar are the names typically used for example variables; they aren't good names themselves).

Getting Input
The read command is a handy way of getting user input from the command line. Used by itself in the form read FOO it waits for you to type a string, followed by ENTER, then sets $FOO to that string.


#!/bin/bash
# asks you your name then says hello

echo "What's your name?"
read NAME
echo "Hello $NAME"

There are a number of handy options to read you might want to use. -p sets a prompt. -n number reads only that many characters without waiting for you to press Enter. It is normally used in the form -n 1 to read only one character for instant feedback. It is often useful to use -s with this; it stops the terminal echoing back what you just typed. You can also use -t seconds to set a timeout.

 # captures a single key stroke (without return)
 read -s -n 1 key_stroke
 echo $key_stroke

You can also look into the zenity command to get fancy GUI input and output using the GTK toolkit (it's not too hard either).

===== Quotes, Brackets and Other Special Things =====

Double Quotes
Whenever we have used a string so far it has been surrounded by double quotes like this "this is a string." It is not always necessary but it is generally what you want, so it's a good habit to get into.

Quoting with double quotes has two effects:

1. It allows variable names to be 'interpreted' (or expanded), i.e. the variable name will be replaced by its contents.

2. It escapes most special characters (except \, ` and $) --- it stops them being interpreted, so they are treated literally.

It also stops echo 'eating' newlines --- useful if you want output to appear just as if you typed it on the command line rather than as one continuous line.

Single Quotes
Quoting with single quotes has a different effect. It is very literal: it stops everything being interpreted.
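The two kinds of quotes are easiest to compare in a small sketch. All variable names here are invented for the illustration.

```shell
#!/bin/bash
NAME="Paul"

DOUBLE_QUOTED="Hello $NAME"    # $NAME is expanded
SINGLE_QUOTED='Hello $NAME'    # everything is taken literally

echo "$DOUBLE_QUOTED"    # prints: Hello Paul
echo "$SINGLE_QUOTED"    # prints: Hello $NAME

# Double quotes around a variable also keep its newlines intact.
TWO_LINES="line one
line two"
echo $TWO_LINES      # prints: line one line two
echo "$TWO_LINES"    # prints the two lines separately
```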

Escaping Characters
A backslash can be used to 'escape' characters. Within double quotes this will stop $ (and \ itself) being interpreted, so you can use it to stop variable names being expanded.

It can also be used to output certain special characters. With echo -e and double quotes
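The command that belongs at this point is not shown. A plausible reconstruction, consistent with the output that follows and assuming a hypothetical NAME variable set to Paul, would be:

```shell
#!/bin/bash
NAME="Paul"    # assumed value, chosen to match the output shown
echo -e "Hello World\n\nHow are you $NAME"
```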

produces

Hello World

How are you Paul

You can also use $'...' quoting (for example $'\n') without the -e.
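As a sketch of the $'...' form (a Bash feature; TABBED and SPLIT are names made up for the example):

```shell
#!/bin/bash
TABBED=$'Hello\tWorld'    # \t is turned into a tab when the string is read
SPLIT=$'Hello\nWorld'     # \n is turned into a newline

echo "$TABBED"
echo "$SPLIT"
```

Because the escapes are converted when the string is quoted, and not by echo, the variables themselves contain the real tab and newline characters.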

Here is a list of the most useful ones:

^ Character ^ Produces ^
| \n | newline |
| \r | return |
| \t | tab |
| \v | vertical tab |
| \b | backspace |
| \a | alert (beep or flash) |

\0nnn outputs the character whose octal code is nnn. A full list of ASCII codes is available from man ascii.


 * Debugging is working out where the mistakes are in a program, then getting rid of them. The story goes that insects were attracted to the warmth given off by electrical components in the first computers, causing short circuits and then errors in calculations, so the term was once meant literally.

Curly Brackets
Curly brackets {} are used to 'protect' variable names. This allows us to combine variable names with other strings in useful ways.
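A minimal sketch of what 'protecting' a name means, using a hypothetical variable FILE:

```shell
#!/bin/bash
FILE="report"

WITH_BRACES="${FILE}_backup"    # the name ends at the closing bracket
WITHOUT_BRACES="$FILE_backup"   # Bash looks for a variable called FILE_backup, which is not set

echo "$WITH_BRACES"       # prints: report_backup
echo "$WITHOUT_BRACES"    # prints an empty line
```

Without the brackets, Bash has no way to tell where the variable name stops, because an underscore is a legal character in a name.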

There are other neat tricks you can do with curly brackets:
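The examples that belong here are not shown, so which tricks were meant is unknown. The following are standard Bash features offered as a sketch; SCRIPT, MY_LOG_DIR and the other names are made up for the illustration.

```shell
#!/bin/bash
SCRIPT="uptime_logger.sh"

NAME_LENGTH=${#SCRIPT}          # number of characters in the value
BASE_NAME=${SCRIPT%.sh}         # the value with a trailing .sh removed
LOG_DIR=${MY_LOG_DIR:-/tmp}     # falls back to /tmp if MY_LOG_DIR is unset or empty

echo "$NAME_LENGTH"    # prints: 16
echo "$BASE_NAME"      # prints: uptime_logger
echo "$LOG_DIR"

# Used on their own, curly brackets expand into a list of words.
echo uptime.{log,bak}  # prints: uptime.log uptime.bak
```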

Parentheses
With variables, parentheses are used to capture the output of a command in a variable like this:
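The example itself is not shown at this point. Judging by the back-tick form quoted later in this section, it was probably close to the sketch below; the echo line is added only so the result is visible.

```shell
#!/bin/bash
UPTIME=$(uptime)    # run uptime and keep whatever it printed
echo "$UPTIME"
```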

This is also known as command substitution. You can also use it directly on the command line, such as with echo:
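Again the example is not shown. Based on the back-tick version quoted in the next paragraph, it was probably something like this sketch (the surrounding wording is an addition):

```shell
echo "The date is $(date)"
```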

An alternative way of doing this is with back ticks a.k.a. back quotes (UPTIME=`uptime`; echo `date`). While you might see this in other people's code, it is not really a good idea. Back ticks are much easier to miss and it is less clear what is going on. It is mostly there as a historical artifact.

Special Variables
You have already come across some special variables: environment variables such as $PATH (use env on the command line to get a list) and Bash built-ins such as $HOSTNAME (see the Shell Variables section of man bash). These are typically things it is useful to have access to in your script without having to go through lots of extra effort to find them out.

Command Line Arguments
You will already be familiar with command line arguments: in ls -l /var/log, ls is the command, and -l and /var/log are the command arguments.

In bash these are provided for your use within a script through the special variables $0 $1 $2

$0 gives you the name (and path) of the command (such as your script) as it was called.

#!/bin/bash
# outputs the name and path of itself when it's run.

echo $0    # outputs ./whatami.sh or /home/paul/bin/whatami.sh etc depending on how you ran it

$1 $2 $3... give you the first, second and third command line arguments, and so on.

#!/bin/bash

FIRST_NAME=$1
LAST_NAME=$2
echo "Hello $FIRST_NAME $LAST_NAME"

./hello.sh Paul M would output "Hello Paul M"

There is also:
- $# which gives you the number of command line arguments
- $* which gives you all the command line arguments as a single string
- $@ which works like $* but treats each argument as a separate string (the difference will become clear later on when we look at loops)

The last two **always** need to be quoted with double quotes ("$*" "$@").


#!/bin/bash

echo "Hello $* you have $# words in your name"
# echo "Hello $@ you have $# words in your name" would work in the same way here
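The difference between "$*" and "$@" shows up as soon as the arguments are passed on to something else. In this sketch count_args is a hypothetical helper function that simply reports how many arguments it received, and set -- is used to simulate command line arguments so the example runs on its own.

```shell
#!/bin/bash
# count_args is a made-up helper: it prints the number of arguments it was given.
count_args() {
    echo $#
}

set -- Paul "Van Der" M          # pretend the script was called with three arguments

STAR_COUNT=$(count_args "$*")    # one single string: "Paul Van Der M"
AT_COUNT=$(count_args "$@")      # three separate strings, with "Van Der" kept whole

echo "STAR_COUNT=$STAR_COUNT AT_COUNT=$AT_COUNT"    # prints: STAR_COUNT=1 AT_COUNT=3
```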

Other Special Parameters
$? is really useful in shell scripting. It contains the exit value of the last command (or function) run. If a command is successful it will be equal to 0, otherwise a positive integer (number) between 1 and 255; 1 and 2 are the most common values when an error occurred, but the exact meaning depends on the command. You can use this to see if something worked properly and maybe what went wrong if it didn't.
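A short sketch of reading $?. The directory /no/such/directory is made up so that ls fails, and LS_STATUS is a name invented for the example.

```shell
#!/bin/bash
ls / > /dev/null                       # succeeds
echo "ls / exited with $?"             # prints: ls / exited with 0

ls /no/such/directory 2> /dev/null     # fails, because the directory does not exist
LS_STATUS=$?                           # save it at once: every later command overwrites $?
echo "ls on a missing directory exited with $LS_STATUS"
```

Saving the value in a variable straight away matters, because even a plain echo replaces $? with its own exit value.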

$! is the Process ID (PID) of the last job run in the background. You could use this to monitor or otherwise control a job you put in the background with & (this leaves a process/command running, but allows you to use other commands in the mean time without waiting for it to finish). You might want to use this for logging purposes.
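As a sketch, with sleep standing in for a long-running job and BACKGROUND_PID as a made-up variable name:

```shell
#!/bin/bash
sleep 1 &                   # start a job in the background
BACKGROUND_PID=$!           # remember its process ID

echo "Background job is running as process $BACKGROUND_PID"

wait "$BACKGROUND_PID"      # pause here until that job has finished
echo "Background job finished with exit value $?"
```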

$$ is the Process ID (PID) of the script itself. This can be used to construct unique file names for temporary files so they don't overwrite each other, or to make them easily identifiable, for example.
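A sketch of the temporary file idea follows. The file name is hypothetical, and date merely stands in for whatever data the script needs to store.

```shell
#!/bin/bash
# Hypothetical temporary file name; $$ makes it different for each running copy of the script.
TEMP_FILE="${TMPDIR:-/tmp}/uptime_logger.$$.tmp"

date > "$TEMP_FILE"         # write something to the temporary file
echo "Wrote $(wc -l < "$TEMP_FILE") line(s) to $TEMP_FILE"

rm "$TEMP_FILE"             # clean up afterwards
```

Process IDs are easy to predict, so for anything security sensitive the mktemp command is a sturdier way to get a unique file name.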

Putting It All Together
An updated version of the uptime logger. You need to give the name of the log file on the command line and make sure that you have a  directory called log in your home directory. With just what we have learned so far we can already produce a useful system tool. If you were to add it to your crontab you could use this to continuously monitor system uptime and load average on your computer.

Crontab entry:

# logs uptime and load average every ten minutes
# m h dom mon dow    command
*/10 * * * *      /home/paul/bin/uptime_logger.sh uptime.log

The script:


#!/bin/bash

# set the log file name and the directory to store it in
UPTIME_LOG_FILE=$1
UPTIME_LOG_PATH="${HOME}/log"

ISO_DATE=$(date +%Y/%m/%d)

# write hostname, date and uptime to log file in a pretty format
echo -e "${HOSTNAME}: \t $ISO_DATE $(uptime)" >> "${UPTIME_LOG_PATH}/${UPTIME_LOG_FILE}"

This will result in log files that look something like this:

prairie:        2010/02/24  02:18:35 up 5 days, 15:44,  5 users,  load average: 0.71, 0.99, 0.88

As you can see, my computer is in danger of getting overloaded!