Command Line 2

From FreekiWiki
Revision as of 12:55, 11 January 2006 by Jeff

Introduction

You've taken the Command Line class. You've QC'd a couple of boxen, and now you've built one. In the meantime, you've forgotten everything you learned in Command Line except (I hope) tab completion. It's time to learn how it all works and fits together, and how to make sense of it. Then you can go on to all the other topics in From $ to #.

Lexing a command line

guest@freekbox:~$ ls -la --sort=size -r -w 100 /home/lab* /home/class*

Scared? It's simpler when you know how to lex. To lex means to break a flow of information into meaningful symbols. It's a term specific to linguistics and computer science, so you may not know it, but you do it all the time. When reading this sentence, you recognize the spaces and turn the letters between them into separate words. You could lex on commas instead, and,then,the,sentence,would,look,like,this. We lex in spoken conversation as well, though there the spaces aren't visible, so finding them is something we have to be taught. This is why the last step of fluency in a foreign language is the most difficult: knowing when they're saying "heat it" as opposed to "He did."

Despite all the other strange characters in a regular command line (the /, the -, the *, etc.), we lex command line arguments on spaces. So let's look at the different pieces of the above command:

ls
-la
--sort=size
-r
-w
100
/home/lab*
/home/class*
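
If you'd like to watch the lexing happen, here is a small sketch. show_words is a made-up helper (not a standard command) that prints each word it is handed inside brackets. I've left the two /home pieces off, because the * changes things in a way that comes up later.

```shell
# show_words is a hypothetical helper: it prints every word it receives,
# one per line, wrapped in brackets so you can see where each word stops.
show_words() {
    printf '[%s]\n' "$@"
}

# The shell splits this line on spaces before show_words ever sees it.
show_words ls -la --sort=size -r -w 100
```

Six words go in, so six bracketed lines come out, starting with [ls].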

The most important of these is always the first one. Here's why.

Programs are a field of machines

Imagine, for a moment, that you are outside the Matrix, looking out at the field of machines that runs Zion. It is overwhelming. They seem to all interact and interrelate in bizarre ways. People are going through bizarre motions as they tend to machines because this is the way they have been taught.

You discover a simpler way. It turns out that, though they seem to interrelate, each machine is its own entity, and when you are looking at the machine, it has its directions for how to use it printed on the side. Though now you realize you need to remember where all the machines are (in unix: what they're called), you understand that when you get to the machine you need, the instructions are in front of you.

There's a field of machines in Unix. They're called programs, executables, or binaries (all the same thing). Some coder put together a project in which he, she, it (he from now on) figured out what he wanted the program to do and wrote that. Then he compiled it into a binary and installed it somewhere. This binary can have its own system of rules and way of thinking, but that will be documented in the man page. Soon you'll know enough to start reading them.
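
To see where a couple of these machines are installed, you can ask the shell. This is only a sketch: the paths printed will differ from box to box, and if your shell has an alias set up for ls you may see the alias instead of a path.

```shell
# Ask the shell where it finds the ls and cat programs.
command -v ls
command -v cat
```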

Parsing a command, part 1

Imagine for a moment that we put the verb first in English grammar. Talk like Yoda, we would. Know immediately, however, we would, what action the sentence was meant to convey.

Command line talks like Yoda. The first word is the command, the action: the binary, or the machine from the section above. So in this command line:

guest@freekbox:~$ ls -la --sort=size -r -w 100 /home/lab* /home/class*

The only piece we care about immediately is ls. ls is a command to list files in a directory (we'll get there in a moment). To know anything else about the command line above, we have to read man ls to find out how ls expects its arguments. When we read man ls, we find out that -l means one thing, -a means another, --sort=size means sort files by size, -r means reverse the sort order (it's the capital -R that means recursive), -w gets passed the 100 and means format the output to 100 columns, and the last two pieces are the directories/files we want listed. The *s I'll explain in a bit, in Parsing Part 2.
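
Here is a small sketch of two of those options at work. It assumes GNU ls (the one on a Linux box), and the scratch directory and file names are made up for the demonstration.

```shell
# Build a scratch directory with three files of different sizes.
demo=$(mktemp -d)
printf 'x' > "$demo/small"              # 1 byte
printf 'xxxxxxxxxx' > "$demo/medium"    # 10 bytes
printf '%0100d' 0 > "$demo/large"       # 100 bytes

# --sort=size lists the largest file first.
ls --sort=size "$demo"

# Adding -r reverses the order, so the smallest file comes first.
ls --sort=size -r "$demo"
```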

Paths/Filesystem introduction

What the heck are those directories? Honestly, how does that all work?

Why is a tree with lots of branches better for climbing than a tree with few branches? It's not because it's easier to climb; it's because when you bring along a whole class full of students, you still have room for all the students to sit on their own branches. The Linux filesystem is like this too. At the bottom there is the /, appropriately called the root. It's hard to realize that / is two things in the context of a file location or directory name: it's the root, the bottom of the tree, and it's also the lexical separator (like the space is for written English). So /var/lib has two different characters other than the letters: the / that is the root and the / that separates var from lib. They're different. One is the root at the bottom of the tree; the other is the place where the lib directory branches off from the var directory.

So / is a place. So is /var (or /var/ if you want). It's a directory called var located in / (the root). /home is a directory which holds directories for users, like /home/joe holds all the files for joe. And there's one other critically important location: your current location. It starts off, if you're joe, as /home/joe (there's a special way of referring to this, which is ~). But if you don't know where you are, the command pwd will tell you where you are. It's probably also in your prompt, which is that strange thing to the left of where you type commands.
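
A quick sketch of moving around and asking where you are. It assumes your box has a /var directory, which a Linux box will.

```shell
cd /           # go to the root of the tree
pwd            # prints /

cd /var        # the directory var, which sits in the root
pwd            # prints /var

cd ~ && pwd    # ~ takes you home, and pwd prints your home directory
```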

Life is more complex than that, but that's a subject for relative paths. Which is another lesson.

Man, oh man am I tired of remembering

You'll remember commands after using them. For now, you should learn to read man pages. Let's read man ls.

man ls

First, the SYNOPSIS. This tells us what, in a very abstract sense, the command will look like.

ls [OPTION]... [FILE]...

This says that we should expect ls, followed by, optionally (that's what the [] means), a set of options (the ... means again, again, again), and then an optional set of files.
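
As a sketch, every one of these lines fits that SYNOPSIS. The directories / and /tmp are just convenient examples.

```shell
ls                 # no options, no files
ls -l              # one option
ls -l -a           # several options (that's the ... after [OPTION])
ls -l -a / /tmp    # options first, then several files (the ... after [FILE])
```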

Now the description. Please read aloud. We'll analyze it together. *pause for analysis*

Parsing, Part 2

Knowing the way most commands work, the way they expect their input, will give you a much better sense of how to parse a command, i.e. what it means.

Generally, you're going to see a couple different types of things that a command can take. Here they are:

Arguments

Pretty simply, these are just words: each singly lexed entity after the command that does not start with a hyphen. Commands are usually expected to do something, and the arguments are what the command does that something to, like the object of a sentence.

 eat apple pineapple pomme_de_terre

Arguments can be yummy.

A location in the filesystem

/usr/lib/libdvd-read2
/lib/modules/2.6.10/kernel/driver/usb/pci/usb-hotplug/
file1

Many arguments will look like these - filesystem locations. If it's got a / in it, especially if it starts with a /, that's probably what it is. But note the third choice: file1. This is also a location in the filesystem. It's a location relative to your current location in the filesystem. See more about the filesystem in the Where is Everything class.
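
To see the difference between the two kinds of location, here is a sketch using a scratch directory (the demo variable is made up for this example).

```shell
demo=$(mktemp -d)
cd "$demo"
touch file1

ls file1             # relative: looked up from where you are right now
ls "$demo/file1"     # absolute: starts at the root, so it works from anywhere

cd /
ls "$demo/file1"     # still found
ls file1 2> /dev/null || echo "no file1 here"    # the relative name no longer points at it
```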

Options

These buggers start with some hyphens, and have some number of letters: "-a", "--sort", "--reverse". If arguments are the objects of our command sentences, options are more like adverbs and prepositions.

 dance --frantically --without=inhibitions

Option Arguments

Usually options are just switches: either you leave one out and it's off, or you add it in to turn it on. Some options, however, take an argument all for themselves.

 ls --sort=size -w 100

Both of those options took an argument and kept it. Sometimes this is spelled out with an equals sign, as in "--sort=size". Other times you just have to know that the option in question is the kind that takes possession of the argument immediately after it. "-w" is a good example of this: it sets how wide to format the output.

 ls -w

The above makes no sense, because -w requires a number of columns to tell ls how to format the output. So it would look like this:

 ls -w 100
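
You can try both forms yourself. This sketch assumes GNU ls, where -w also has the long spelling --width.

```shell
# -w with a number after it: ls accepts the line.
ls -w 100 / > /dev/null && echo "ls -w 100 worked"

# -w with nothing after it: ls complains that the option needs an argument.
ls -w > /dev/null 2>&1 || echo "ls -w by itself failed"

# The long spelling takes its argument with an equals sign.
ls --width=100 / > /dev/null && echo "ls --width=100 worked"
```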

Long and Short

Most options come in two forms: one that makes sense as a word, and one that is easy to type and can be quickly aggregated with other options.

Generally one dash (-) means the short version, as in "-p", and two dashes (--) mean the long version. This is a generalization from the patterns that people have written into programs, and is not always true. It is often the case, as well, that short options can be combined together with only one preceding hyphen, and each will be treated as its own option.

 ls -lap

would be treated as:

 ls -l -a -p

"--file-type" is the long version. That makes a lot more sense as a word. (Note that the middle hyphen is not preceded by a space, so it is not lexed as a separate word.)

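If you don't trust me that the combined form is the same thing, here is a sketch that checks. The scratch directory and its contents are made up for the test.

```shell
demo=$(mktemp -d)
touch "$demo/file1" "$demo/.hidden"
mkdir "$demo/subdir"

# Capture the output of both spellings and compare them.
combined=$(ls -lap "$demo")
separate=$(ls -l -a -p "$demo")

if [ "$combined" = "$separate" ]; then
    echo "same output"
fi
```
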
The pipe

It's time to introduce you to a second separator. In English, we have the space to separate symbols and allow us to lex, and then punctuation, to help us parse grammar. Without the period, we wouldn't really know where one idea ended and another began. Bash has something similar: the |. The |, called the Pipe, tells us that we have passed one idea to another. Technically what |ing does is send the output of the first command to be the input of the second command. So imagine you wrote out a command that would give you a lot of output, say, cat /dev/urandom and what you really wanted to do was look at only the first four lines of it. There's a command to look at the first four lines: head -n 4. Now we can put them together:

cat /dev/urandom | head -n 4

One idea blends into another.
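
Here is a sketch of the same idea with output you can actually read. printf stands in for the command that produces a lot of output.

```shell
# printf writes six lines; head keeps only the first four of them.
printf '%s\n' one two three four five six | head -n 4

# You can keep piping: tail -n 1 keeps only the last of those four lines.
printf '%s\n' one two three four five six | head -n 4 | tail -n 1
```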


A handful of commands

It's time to see if you've gotten this. This is a test. I'm going to give you a few commands, and then ask you to do something in the most abstract way possible. It's your job to translate what I've asked you to do into commands.

  • cat is a command which outputs a file
  • ls shows the files in the current directory, or a directory you ask of it.
  • cd changes your working directory.
  • mv moves a file.
  • head is a command which reads the top of a file or its input
    • head has an option to limit how many lines to read. It is -n
  • tail is a command which reads the bottom, the same way head reads the top
  • true reports true.
  • false reports false.
  • cut is a command with many many uses
  • ifconfig shows you the network status
  • grep will show lines that have the characters it's given as an argument.
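
A sketch of a few of the commands from the list, tried out on made-up input. Here "reports true" means the command succeeds, which is what && and || react to.

```shell
true  && echo "true succeeded"    # the echo runs because true succeeded
false || echo "false failed"      # the echo runs because false failed

# grep prints only the lines that contain its argument.
printf '%s\n' apple banana cherry | grep an
```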



How do you do this?

Please determine whether the 4th line of the interfaces file in the network directory in etc is commented out. (Commented out means begins with a #). Try to do it without regular expressions. If you simply print out the # or print blank, that is what I'd prefer.

cat /etc/network/interfaces | head -n 4 | tail -n 1 | cut -c 1 | grep '#'
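
It helps to build a pipeline like this one step at a time. The sketch below uses a made-up sample file in place of /etc/network/interfaces, so it runs on any box. Note that the # is quoted: bash treats a bare # at the start of a word as the beginning of a comment.

```shell
# A made-up stand-in for /etc/network/interfaces.
sample=$(mktemp)
printf '%s\n' 'auto lo' 'iface lo inet loopback' '' '# auto eth0' 'iface eth0 inet dhcp' > "$sample"

cat "$sample" | head -n 4                          # the first four lines
cat "$sample" | head -n 4 | tail -n 1              # only the fourth line
cat "$sample" | head -n 4 | tail -n 1 | cut -c 1   # its first character

# grep prints the character only if it is a #, and prints nothing otherwise.
result=$(cat "$sample" | head -n 4 | tail -n 1 | cut -c 1 | grep '#')
echo "$result"
```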


Bash is a shell

Bash is a program, like the others. It's a machine that's intended to sit, wait for input from you, lex and then parse that input, and then run programs using that input. Which means that the stuff I've just taught you is Bash's expected input. We use bash exclusively, and the rest of the world uses it mostly, but there are other shells. I'm saying this not so that you expect to learn them, or me to teach them, but to warn you that sometimes it won't work quite like I've just said because you're not in Bash.

I'm also saying it because bash, as a program, isn't quite so straightforward as the above. For example:

echo=blue

The above does not go looking for the echo command. It sets a variable called echo (properly, $echo) to the value blue. This is why we're going to have a more detailed introduction to bash later, but for now let me say this: bash is complex. Read man bash if you want to know more.
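
A short sketch of the difference the spaces make:

```shell
echo=blue       # no spaces around the =, so bash stores blue in a variable
echo "$echo"    # the echo command still works, and this prints blue

echo = blue     # with spaces, bash runs the echo command, which prints: = blue
```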

There is one thing I want to teach you about how bash parses things. The *, aka the glob, is a very special symbol. Part of the problem of understanding the command line is understanding that * is special to bash, but it's also special to other things. When I type * into a bash command, bash replaces it with the names of all the files and directories in the current directory (except hidden ones, whose names start with a dot). When I type /home/* bash replaces it with all the files and directories in /home/.

If another command wants *, for example find or egrep, I will have to pass it to them escaped, so that they can decide what to do with it. That looks like this:

\*
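
You can watch bash do the replacing with echo. This sketch uses a scratch directory with two made-up files in it.

```shell
demo=$(mktemp -d)
cd "$demo"
touch apple banana

echo *     # bash expands the * first, so echo prints: apple banana
echo \*    # escaped, so echo receives the * itself and prints: *
```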

Another thing that might be important is that Bash lets quotes override the usual lexing: spaces inside a pair of quotes do not separate words. In other words:

ls "/var/Jane's special directory"

will be lexed as two symbols: [ls], and [/var/Jane's special directory] (the brackets just show where it stops).
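
A sketch to check the count. count_words is a made-up helper that reports how many words it was handed, and the second directory name is invented for contrast.

```shell
# count_words is a hypothetical helper: it prints the number of words it received.
count_words() {
    echo "$#"
}

count_words ls "/var/Jane's special directory"    # prints 2
count_words ls /var/my special directory          # prints 4
```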