Unix for BaBar
This section of the Workbook is a primer for those just starting out
with Unix. Actually, there's probably a lot more here than you want
to know if you're just starting out... so go through it, and try to
assimilate twenty or so of the most useful commands. If you've used
Unix for a while, but don't consider yourself an expert, you may find
some useful information below, beyond the basic commands. If you are
completely new to Unix, first consult the Prompt information in the
Workbook Introduction.
One nice feature of unix is that nearly all commands are documented in
the so-called "man pages." ("man" is short for "manual.") Whenever you
want to know more about a given command, just enter:
> man <command>
at the command line. This should bring up the man page for that
command. To exit the man page and return to the command line, just type
"q".
If the system responds:
No manual entry for <command>
then don't give up hope until you have also tried:
<command> --help
For example, "BbkDatasetTcl" is more of a BaBar command than a unix command,
so it does not have a man entry. However, it does have a "--help" page.
BaBar code is supported on several different Unix operating systems.
In principle, the code will run on all supported platforms using the
existing tools. (In fact, that's what supported platform means.)
The BaBar environment is organized so that usually you are taken care
of no matter which flavor of Unix you are using. If you want to
check your flavor, issue the following command:
> uname -srv
Linux 2.4.21-27.0.2.ELSDRsmp #1 SMP Wed Mar 16 11:27:43 CST 2005 (yakut06)
...
SunOS 5.8 Generic_105128-15 (shire04)
...
SunOS 5.8 Generic_108528-12 (tersk10)
...
Linux 2.2.19-6.2.12smp #1 SMP Fri Oct 26 13:31:09 EDT 2001 (noric02)
...
Linux 2.4.18-5smp #1 SMP Mon Jun 10 15:19:40 EDT 2002 (noric07)
Note the different versions of Linux in use on noric02 and noric07! As
a general rule, you should log into "yakut", "shire",
etc, rather than specifying a machine number. That way you make sure
you get the architecture you expect, and you allow the system to
direct you to the least loaded machine, which keeps things fast for
you and the other active users.
All collaborators may use the SLAC systems. However, for reasons of
convenience and optimization of the computer power and disk space
distributed among the collaborating institutions, Regional
Centers have been established. Currently these are:
- SLAC
- IN2P3 in France
- Rutherford Lab (RAL) in the UK
- CASPUR in Italy.
SLAC is the main collaboration center, and serves as the regional
center for the USA and Canada. It is the primary repository for all
BaBar data, although certain skims and Monte Carlo samples are
stored at other sites. IN2P3 is the primary European center, and has
some BaBar data, as do Rutherford and CASPUR. Rutherford also
has a mirror of the BaBar Web and code releases.
When you are logged onto a SLAC machine, you will notice that
complete filenames always begin with "/nfs/" or "/afs/". NFS
(Network File System), and AFS (Andrew File System) are file systems
that organize and provide access to all files. AFS is a more advanced
system, and most BaBar files are now in AFS.
A distributed file system is needed because computing nowadays is
spread over hundreds of processors and servers, and users and
resources may be very widespread geographically. Without such a
system, it would be difficult to provide consistent access to all
files and resources.
There is a single AFS tree for the whole world and the root of the AFS
directory is named /afs. All the immediate
subdirectories (there are currently about 150) of the root are called
cells. All the AFS files at SLAC are in subdirectories of the
cell named /afs/slac.stanford.edu/.
(CAUTION: Do NOT type "ls /afs", because this will cause the system
to search not only /afs, but all of the cells of afs, and that can
take a VERY long time. ls does not normally do this, but sometimes
people alias "ls" to mean "ls -l." More about aliases later.)
Some useful features of /afs/ include direct access to the files of
other institutions running AFS, such as CERN and DESY, and
automatic daily backups of your home directory. (So if you ever
accidentally erase a file and regret it the next day, you can
recover it from the backup system.) AFS also makes it harder for
one user in a group to use up all of the disk space.
AFS tokens
When you established your AFS account, you supplied a new AFS
password, which you probably chose to be the same as that of your Unix
account. Now, whenever you log in to Unix, AFS checks the password you
supply to make sure that you are who you say you are. If you are, AFS
grants you a token and thereby authenticates you as
a valid AFS user and logs you into Unix.
A SLAC token expires after 25 hours of being
continuously logged in to Unix and may be renewed with the
klog command.
> klog
The system will then prompt you for your password. Once you
enter it, you will get a new token. The tokens
command shows you if you have tokens and when they expire (see below).
**Note**: If you start getting messages that you have read-only access
to files which you know you should be able to write to, it is likely
that you have no token or that it has expired. In this case, issuing
the klog command should fix things. (Another possibility
is that your disk quota has been exceeded. See below.)
You will have only one token on each machine at SLAC to which you log
on, but may have additional tokens for foreign (that is, non-SLAC)
machines that run AFS. When you connect to another machine, you may
or may not have a token. The most reliable way of finding out is to
issue the tokens command.
Note that while you get a token when you log in to SLAC, this is not
the case at Rutherford Lab (RAL) where you need to log in and then create a
token for yourself.
Authentication commands
To see the tokens you are currently holding, use the
tokens command. For example:
> tokens
Tokens held by the Cache Manager:
User's (AFS ID 1616) tokens for afs@slac.stanford.edu [Expires Jun 13 09:12]
--End of list--
If you do not have a token, you will see:
Tokens held by the Cache Manager:
--End of list--
or
User's tokens for afs@slac.stanford.edu [*EXPIRED*]
To obtain or renew a token, type:
> klog
Password: (your AFS password)
>
To destroy your token, type:
> unlog
All AFS data are stored in volumes, which are sections of AFS server
disks. Each home directory is in a separate AFS volume. You need to be
aware that each volume has a size limit called a quota. Therefore,
each user has his or her personal quota of disk space. If you try to
exceed the quota, which is given as a number of 1K blocks, AFS will
respond with an error message. The command: fs listquota
(or: fs lq) tells you the size of your quota and how much
of it you have used for your files and directories:
> fs listquota
Volume Name Quota Used % Used Partition
u.kiwi 200000 148173 74% 58%
> cd bb1
bb1> fs lq
Volume Name Quota Used %Used Partition
u.kiwi.bb1 200000 76325 38% 62%
There are good reasons to keep individual volumes under 100-200MB.
AFS volumes are moved around from time to time, to "level the load."
It's easier to move smaller volumes around, and the amount of time
they are not accessible to the user is shorter. If your home
directory is larger than this, you may want to split it into multiple
volumes. Other than finding a convenient subset into which to split
your subdirectories, this will have no effect on the use of your home
directory. (See also the disk
space policy page.)
Should you need to increase your AFS quota after filling your initial
AFS space, use the AFS disk space request form to request the
increase.
A very useful command to track down space-hogging directories is:
du | sort -n
This will give a listing of the directories in ascending order of
size. Further, du | sort -n | tail -10 will display the top
10 disk usage directories. To see sizes of directories, disks and
files in GB and MB rather than in the more obscure unit of
"blocks", use
du -h // size of directories
df -h // disk size
ls -alh // file size
Within a given directory, you can also list files in ascending size
order by sorting the long listing numerically on its size field (the
fifth column):
ls -al | sort -k5 -n
To obtain information about your local disk, enter
cd /
df -h
The first command takes you to the root of the file system on your
computer, from which the df command shows you the hard disk
mount directories and their available space.
Finally, to view your disk quota on a standard unix system, enter
quota -v
The file system in Unix is a hierarchy of directories and files.
Every file and directory in the file system can be identified by a
complete list of the names of the directories that are on the path
from the root directory to that file or directory. The root
directory, represented by a "/" (forward slash) is the
directory at the top of the file system. Each directory on the route
is separated by a "/". For example:
/usr/local/bin/emacs
gives the full pathname for the file emacs, the program for the
emacs editor at SLAC. You can picture the path as looking like this:
/(root)
|
|
---------------------
| |
tmp/ usr/
|
----------------------------
| |
local/ spool/
|
---------------------
| |
bin/ lib/
|
---------
|
emacs
In fact, a directory is just another kind of file, which contains
links to its parent directory and to the files and directories which
it "contains."
Your home directory is the directory in which you are placed
when you logon, your initial working directory. If you change
to another directory, this becomes your current directory.
Files may be specified by a relative pathname, which is
the pathname with respect to the current working directory. This is a
pathname given without the initial "/".
There are several pathnames which have special meaning in most
systems:
.          (dot)        the current directory
..         (dot dot)    the directory "above" the current directory (the parent)
~          (tilde)      the home directory of the logged-on user
~userid                 the home directory of the user userid
Here are some commands to manipulate directories. In each command,
the directory may be given either as a full or relative pathname.
- To display the current working directory use the command:
> pwd
- To change the working directory, use the command:
> cd [directory_name]
where directory_name specifies the directory that you
want to move to, absolute or relative.
If absent, you are moved to your own home directory.
(Actually, it's a bit more complicated... if a relative path
[but not "." or ".."] is
specified, cd tries to find it in one of the directories
in the cdpath shell variable. Try:
echo $cdpath to examine this variable.
You can learn about shell variables below.)
One more useful case:
> cd -
takes you back to the previous directory.
- To make a directory, use the command:
> mkdir directory_name
If the directory_name is relative, it is created starting
from the current directory. The other directory commands behave
similarly.
- To remove a directory, use the command:
> rmdir directory_name
The directory must be empty before you can remove it. You
will need to remove any files and subdirectories that it
contains. To remove a directory and all of its
files and subdirectories, use the command:
> rm -r directory_name
**Caution**: If you remove a directory that still holds
files, there is no way to retrieve it, or them,
unless a backup has been made. To avoid
frustration, always use the -r option together with the
-i option:
> rm -ir directory_name
This will warn you and allow you to review the files to be removed.
- To move or rename a directory, use the command:
> mv [option] directory1 directory2
This moves directory1 and all its contents to directory2.
If directory2 does not exist, directory1 is simply renamed
to directory2. Otherwise directory1 is moved into it
as a subdirectory.
- To copy a directory, use the command:
> cp -r directory1 directory2
This command is like mv above, except that the
original directory stays around. It copies
directory1 and all its contents to
directory2. If directory2 does not
exist, it is created as a copy of directory1. Otherwise a copy
of directory1 is created as a subdirectory within it.
- To list the contents of a directory, use the command:
> ls [option] [pathname]
If pathname is absent, you get a listing of the
files in the current directory.
If pathname is a directory name, you get a
listing of the files in that directory.
Pathname may be a filename with
wildcard characters. Two useful wildcard characters
are "*", which matches any set of characters
(including the null set), and "?", which matches
any single character. So:
> ls /foo/*.txt
would list all the files with extension .txt in
the directory foo.
Two useful options are -l, which gives a long listing
with details about each file, and -a, which includes
hidden files (those whose names begin with a "dot") in the listing.
- To create a symbolic link to a directory, use the command:
> ln -s my_directory link_name
The path link_name can now be used anywhere that you
would use my_directory. In a directory listing, symbolic
links are followed by the symbol "@". There's
more to know about links, but this is enough to get us through the
next section.
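The directory commands above can be rehearsed in a scratch area. The
names below (dir_demo, my_directory, link_name) are invented for this
sketch; nothing here depends on the BaBar setup:

```shell
# Scratch demo of mkdir, ln -s and ls -F (all names here are made up)
mkdir -p dir_demo/my_directory      # -p also creates missing parent directories
cd dir_demo
ln -s my_directory link_name        # create a symbolic link to the directory
ls -F                               # link_name@  my_directory/
cd link_name                        # the link behaves like the directory itself
cd ../..                            # back out of the demo area
rm -r dir_demo                      # clean up
```

Note the "@" that ls -F appends to symbolic links, matching the
convention described above.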
Now that you know how to move around a directory structure, you may
find it instructive to practice your skills on the file structure that
you created when you ran the Quicktour.
As you navigate around directories, remember that if you get lost, you
can always get back to your own home directory with:
> cd
Start from your home directory, or your dedicated AFS directory, in
which you will find the subdirectory generated by the example job:
> cd (assuming you put your release in your home directory)
> pwd
/afs/slac.stanford.edu/u/br/penguin/
> ls
...
ana31/
...
> cd ana31
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31
>
> ls
BetaMiniUser all.log database include lib results shtmp tmp
GNUmakefile bin doc java man shlib test workdir
Move down to the BetaMiniUser directory:
> cd BetaMiniUser
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/BetaMiniUser
Now try the workdir directory:
> cd ../workdir
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
> ls
CVS README RooLogon.C kumac shlib
GNUmakefile RELEASE SP-1237-Run4.tcl myHistogram.root snippet.tcl
PARENT RooAlias.C bin pawlogon.kumac
Now follow the symbolic link PARENT:
> cd PARENT
> pwd
/afs/slac.stanford.edu/g/babar/dist/releases/18.6.4
> cd ..
> pwd
/afs/slac.stanford.edu/g/babar/dist/releases
The symbolic link takes us to a different branch of the directory
tree, so the command to go back up one level doesn't return us to
where we started. This link points to the source of the code you have
run, which is useful because now you don't have to remember it. But it
can also be a source of confusion!
To get back, we start over:
> cd ~/ana31/workdir
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
>
When you start a Unix session, you are placed in a shell,
which is a user interface program that accepts terminal input and acts
upon that input. A variety of shells exist, but as a new user, you
will probably find that you are using tcsh, which is an
extension of one of the original shells, csh, the C shell, so
named because its syntax is based on the C language. Other shells you
might encounter are the Bourne shell, and its extensions, the Korn
shell (ksh), and the Bourne Again shell (bash).
To find out what shell you are using, type:
> echo $shell
/bin/tcsh
(The command echo prints its argument to the
terminal. The argument $shell is the string contained in
the variable shell, in this case the name of the shell
being used.)
Much of the Unix documentation in this workbook applies to all shells,
but certain features (command completion, for example) are specific to
individual shells. The syntax and features of tcsh will be
assumed in the remainder of this discussion. It is recommended that
new users accept the default shell and learn to use it, as it is much
easier for people to share shell scripts and generally help each other
out if they are working off the same setup.
You may have already encountered the .cshrc and
.login files when you set up your unix account. The file
.cshrc is intended, among other things, to set
shell and environment variables and
aliases, and is executed when a new shell is opened, say
as a result of an xterm or ssh command.
Shell and environment variables store values that the shell uses in
executing your commands. Shell variables help determine how the shell
program interacts with you. You can create your own shell variables
and assign them values with the set command, and you can insert the
variable value in a command line by prefixing the variable name with
$, as demonstrated above.
> set mywork=~/ana31/workdir
> echo $mywork
/u/br/penguin/ana31/workdir
> cd $mywork
> pwd
/u/br/penguin/ana31/workdir
> unset mywork
> cd $mywork
mywork: Undefined variable.
A few shell variables have special meaning to the shell program. By
assigning values to these variables, you customize the manner in which
the shell executes your commands. path, as well as others
such as prompt, history (see below), and
term, are set in your default .cshrc file,
and, of course, you can change them if you like.
Some shell variables that you might want to set are:
set rmstar # to ask for verification when you type rm *
set noclobber # to disallow inadvertently overwriting a file with
# output redirection
set ignoreeof # to disallow CTRL-d from exiting shell
set nostat = (/afs /nfs)
# to disable command completion for /nfs and /afs
These settings help to keep you safe from small mistakes that can cause big
problems.
One of the most important shell variables is the path. The value for
the path is a list of directories, called your search path. Unix
searches these directories when it looks for a program or command
specified on the command line. If a program is in your search path,
you can just type the name of the program; you don't have to type its
absolute or relative pathname. The BaBar .cshrc file
defines a path with a couple of dozen directories in it. To see which
they are type:
> echo $PATH
/usr/local/bin:/usr/afsws/bin:/usr/afsws/etc:/bin:/usr/bin:/sbin:/usr/sbin:
/usr/etc:/usr/bin/X11:/cern/pro/bin:.:/afs/slac.stanford.edu/g/babar/bin:
/afs/slac.stanford.edu/g/babar/package/objy8.0.9/babar/linux86gcc3/bin:.
/bin/Linux24SL3_i386_gcc323:/afs/slac.stanford.edu/g/babar/dist/releases/
18.6.4/bin/Linux24SL3_i386_gcc323
Suppose you execute programs in another user's directory. You may add
this directory to your path with a set command:
> set path=($path ~penguin/bin)
> echo $PATH
/usr/local/bin:/usr/afsws/bin:/usr/afsws/etc:/bin:/usr/bin:/sbin:/usr/sbin:
/usr/etc:/usr/bin/X11:/cern/pro/bin:.:/afs/slac.stanford.edu/g/babar/bin:
/afs/slac.stanford.edu/g/babar/package/objy8.0.9/babar/linux86gcc3/bin:.
/bin/Linux24SL3_i386_gcc323:/afs/slac.stanford.edu/g/babar/dist/releases/
18.6.4/bin/Linux24SL3_i386_gcc323 /u/br/penguin/bin
Note the parentheses around the value to the right of the equals sign,
necessary because of the blank in the string, and the use of $path to
include all the directories in the current path. Also note the
absence of the $-sign on the left side of the equals sign!
To add new directories to your PATH, the recommended way is to put
an "addpath2" command in your .cshrc file
somewhere after the basic initialization is done, and certainly after
the hepix script is called. The correct syntax is:
addpath2 PATH /whatever/dir/you/want/to/add
"addpath2" is an alias set up somewhere in the
hepix setup. This adds the directory to both the
environment variable PATH and the shell variable "path".
You can also add new directories to your path the
"old-fashioned" way:
setenv PATH /what/ever/directory/you/want/:$PATH
This prepends the directory to the environment variable PATH without
discarding the rest of the PATH list.
To see all of the shell variables currently set, type:
> set
...
(a long list of variables)
>
To see the path of a given program, like emacs for example, use the "which"
command:
> which emacs
/usr/bin/emacs
So emacs is actually located in /usr/bin. But the system is set up so that
whenever you type "emacs", the system responds as if you had typed the full
path name "/usr/bin/emacs".
Environment variables are like shell variables, except that they are
accessible to programs that you run as well as to the shell. By
convention, they are all upper case. Some variables are both shell
and environment. There is a PATH variable, for example,
which contains the same directories as does the path
variable. Some important environment variables are
PRINTER and DISPLAY. Some useful variables
set for you by the BaBar scripts are BFROOT and
BFDIST, which point to the root of all BaBar files, and
to the root of BaBar code, respectively.
> echo $BFROOT
/afs/slac.stanford.edu/g/babar
> echo $BFDIST
/afs/slac.stanford.edu/g/babar/dist
You can set and unset environment variables:
> setenv PRINTER puffin
> echo $PRINTER
puffin
> unsetenv PRINTER
> echo $PRINTER
PRINTER: Undefined variable.
> setenv PATH ${PATH}:~kiwi/bin
and as before, setenv without an argument lists all
environment variables currently set. Note that this command is typed
without the equals sign.
If you modify the commands in .cshrc and you would like
them to take effect, you might think that you could just type:
> .cshrc
But that doesn't work. When executing a program (and a shell script
is essentially a program) the shell starts up a new version of
itself, which disappears after the command is executed. Thus all of
the commands in .cshrc occur in another shell. To make
the file run in the current shell, type:
> source .cshrc
(The command "source script.job" executes each line in the file
script.job as if it had been entered at the command line.)
The tcsh shell has the ability to complete the typing of a name on the
command line, given a unique abbreviation. This feature works with
command names, filenames, references to shell and environment
variables, and the ~username convention.
It is triggered by typing part of the name and pressing the TAB key.
If the part of the name you type does not uniquely identify a complete
name within the appropriate class of names, any additional unambiguous
characters are added to what you typed and the terminal bell is rung.
What happens next depends on the settings of certain shell variables.
If autolist is set, you will see a list of matching names. If not, you
can type CTRL-d to get the list. You can then type a few
more characters and try pressing TAB again. This feature also works
in the middle of a command line.
Give it a try:
> cd
> cd a[TAB]na31/[TAB]
BetaMiniUser/ doc/ lib@ shlib@ tmp/
bin@ include/ man/ shtmp/ workdir/
database/ java/ results/ test/
> cd ana31/w[TAB]orkdir
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
Other modifications of this behavior are possible; for details, see the
SLAC man page or enter "man tcsh" at the
Unix prompt.
Unix has an elaborate system of recalling previously typed commands which
is based on the command history. The number of commands saved in
the history is determined by the shell variable history.
You can look at the command history by typing:
> set history=10 (the number of commands kept in the history)
> echo $history
10 (it's usually set to 100)
> history
236 17:11 echo $BFDIST
237 17:14 echo $PRINTER
238 17:15 unsetenv PRINTER
239 17:17 echo $PRINTER
240 17:21 cd
241 17:22 cd ana31/workdir/
242 17:22 pwd
243 17:22 set history=10
244 17:22 echo $history
245 17:22 history
>
There are a variety of commands which you can use to review and repeat
your saved commands, but the easiest way to use this feature is to
scroll back and forward using the up and down arrow keys, possibly
edit the command you want, using the left and right arrow keys, and
re-execute the command by typing a carriage return (enter).
Another way to use the history is to ask for a specific command to be repeated:
> !242 (repeats command number 242)
pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
> !! (repeats the last command)
pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
> !echo
echo $BFDIST (repeats the most recent command starting with "echo")
/afs/slac.stanford.edu/g/babar/dist
The command is executed immediately, but a complicated set of
history substitutions allow a lot of flexibility, if you like
that sort of thing. Go to the
SLAC man page, or enter "man history" at the
Unix prompt, for more details.
Another feature of the shell allows you to define shorthand names for frequently
used or lengthy commands. When the alias appears in the command line
that the shell reads, its text is replaced by the definition of the alias.
...
alias cp cp -i (protects copy command)
alias ll ls -l \!* (generates long listing... the "\!*" stands for the
arguments given to the alias itself: ll args)
...
Of course, alias commands can also be entered at the command line.
> unalias ll
As usual, issuing the alias command without arguments gives you a
list of all current aliases.
Normally Unix commands take their input from the keyboard and display
output on the terminal screen. The keyboard is the standard input and
the screen is the standard output. A third i/o stream called standard
error also exists, and is ordinarily sent to the screen. However, the
shell provides a means of redirecting the input and output to
come from and go to a file, respectively.
> ls * > dir.list (sends the output of the ls command to the file
dir.list but not if the file already exists and
the shell variable noclobber is set)
>
> ls *.txt >! dir.list (overwrites dir.list even if the shell variable
noclobber is set)
>
> ls *.dat >> dir.list (appends output to the file dir.list)
> rm dir.list
> ls *.txt >>! dir.list (appends output even if the file doesn't exist)
>
> ls foobar >& error.msg (redirects output and errors to the file error.msg)
>
> myprog < input.dat > output.dat
(takes input from the file input.dat and sends it
to output.dat)
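The create/append operators can be tried out on any scratch file; the
file name dir.list below is arbitrary, and only the portable ">" and
">>" forms are used (the ">!" variant is csh-specific):

```shell
# Quick round trip of output redirection (file name is arbitrary)
echo one  > dir.list      # creates (or overwrites) dir.list
echo two >> dir.list      # appends a second line
wc -l < dir.list          # reads standard input from the file: reports 2 lines
rm dir.list               # clean up
```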
Another handy feature of the shell is the pipe. This allows
output from one command to be redirected to another without creating
intermediate files. A sequence of commands connected by pipes is
called a pipeline. Pipes are indicated by a vertical bar.
> who | grep "Jan 21" | sort | lpr
The command who produces as output a list of all the
users on the system, which becomes the input to the grep
command, which outputs all lines that contain the string "Jan
21". This list becomes the input to the command
sort whose output is a list sorted alphabetically on the
first character (which happens, in this case, to be the userid), which
becomes the input to the lpr, which causes the list to be
printed. The net result of all this is to print an alphabetical list
of all users on the system who logged on on January 21.
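The same pattern can be rehearsed on reproducible input. The user
names and dates below are invented, printf stands in for who, and the
lpr stage is dropped so nothing is sent to a printer:

```shell
# A who-like list piped through the same grep/sort stages (names invented)
printf 'carol Jan21\nalice Jan21\nbob Jan22\n' \
    | grep 'Jan21' \
    | sort
# alice Jan21
# carol Jan21
```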
Combining simple commands like those above with command-line editors
like sed and awk allows for extremely
powerful, if complicated, command strings. This use of pipelines is
characteristic of Unix, and gives it great versatility. An explanation
of the use of these features is beyond the scope of the workbook, but
remember, you heard it here!
If the screen you are working in gets annoyingly cluttered, and you
start wishing for a way to just clear it off and start over, your wish
has already been answered. Just type:
> clear
A method of tidying up files and directories that are not going to be
used for some time but which aren't ready to be deleted is to make an
archive file. This is also a very useful way to save space, and to
quickly transfer several files and directories.
tar
"tar" originally meant tape archive. It is a useful
utility to pack multiple files and directories into a single file.
For example, if we have a directory structure such as:
/
cpp_files/
sort.cpp
header.cpp
print.cpp
address.cpp
old_cpp_files/
helloworld.cpp
name.cpp
test_files/
hellow_1.cpp
me.cpp
and want to pack up everything from old_cpp_files on down, an
archive file can be created from the directory '/' with the command:
tar cvf myoldfiles.tar old_cpp_files
This means create an archive, give verbose output
(i.e. list all the files that are being packed up), and store it to a
file called myoldfiles.tar. This command will store
everything from the directory 'old_cpp_files' downwards (hence will
include the directory 'test_files' and its contents also).
The directories and files that have been archived will still exist
as separate files, but can be removed with a command such as:
rm -rf old_cpp_files
(Note: the command rm -rf is a very powerful command which
will delete all files and directories below the directory
old_cpp_files, so should be used very carefully!)
You can compress this one file using the Unix zipping package
gzip:
gzip myoldfiles.tar
which will create a file called myoldfiles.tar.gz.
Similarly, if you obtain a tarred and gzipped file, you can unpack it
with the commands:
gunzip myoldfiles.tar.gz //only need this if the file is actually
zipped
tar xvf myoldfiles.tar
(and you will probably want to save space by deleting the archive file
after this with rm myoldfiles.tar). Here the 'x' in the
options for tar means extract. The original directory structure
will be preserved, so the output will look like:
directory_used_tar_xvf_in/
old_cpp_files/
helloworld.cpp
name.cpp
test_files/
hellow_1.cpp
me.cpp
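The whole cycle can be run end to end in a scratch directory; the
names tar_demo and helloworld.cpp below are made up for this sketch:

```shell
# Round trip of the archive commands above (scratch names, hypothetical file)
mkdir -p tar_demo/old_cpp_files/test_files
echo 'int main() { return 0; }' > tar_demo/old_cpp_files/helloworld.cpp
cd tar_demo
tar cf myoldfiles.tar old_cpp_files   # c = create, f = archive file name
gzip myoldfiles.tar                   # produces myoldfiles.tar.gz
rm -rf old_cpp_files                  # the originals can now be removed
gunzip myoldfiles.tar.gz
tar xf myoldfiles.tar                 # x = extract; the tree is restored
ls old_cpp_files                      # helloworld.cpp and test_files are back
cd .. && rm -r tar_demo               # clean up
```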
Before we discuss individual commands affecting files, there is
something you need to be aware of if you're coming to Unix from VMS.
In that operating system, each file carries with it a version
number, like foo.txt;16. Copying or renaming a file
to one which already exists doesn't destroy that file; it just creates
a file with a higher version number: foo.txt;17. Unix
has no such versions, so renaming or copying in this situation will
replace the existing file. If you mean to save a version of
a file, you must generally do it by hand. This issue also comes up
when editing.
One more thing for you VMS refugees: although many of the filenames
you will encounter bear a superficial resemblance to VMS names, in
that they have the form name.extension, and that these
extensions may have meaning for some programs, it is
important to remember that the "dot" means nothing to
Unix: foo._#_.abcd.text.. is a perfectly legal
(albeit silly) Unix file name.
- To copy a file to another file or directory, use the command:
> cp source_file target
>
Examples:
> cp myfile myfile.old (makes a copy of myfile called myfile.old)
> cp myfile myarchive/ (makes a copy of myfile in the directory
myarchive)
> cp myfile.junk junk/myfile (makes a copy of myfile.junk called myfile
in the directory junk)
> cp myarchive/myfile . (makes a copy in the current directory of
the file myfile in the directory
myarchive, keeping the same name)
**Caution** Copying a file to an existing file will replace
the existing file. To avoid this, issue the copy command with the
option -i:
> ls foo*
foo1 foo2
> cp -i foo1 foo2
cp: overwrite foo2 (yes/no)? ("y" overwrites, anything else prevents
overwriting)
In fact, you may wish to define an alias for the cp command
in your .cshrc file:
> alias cp cp -i
so that any copy will automatically ask for verification before
destroying an existing file.
- To delete a file, use the command:
> rm myfile
Once again, the -i option is available if you would like to
confirm each deletion.
- To move or rename a file, use the command:
> mv source_file target
The usual caution applies to this command, and again the option
-i is available.
Examples:
> mv myfile myfile.old
> mv myfile archive/
> mv myfile.junk junk/myfile
> mv archive/myfile .
- To find a file, use the command:
> find start_directory -name filename -print
This will initiate a search through the directory
start_directory and all its subdirectories, for the file
filename. Wildcards may be used in the filename, but in
this case, the filename must be quoted.
Examples:
> find . -name myfile.txt -print
> find .. -name "*.txt" -print
You may wish to put an alias in your .cshrc file to
handle the simplest use of this command:
> alias findfile "find . -name \!^ -print"
There are lots of other options associated with this command. You can select
files based on date, user, size, etc., and can also perform actions on the
found files. As usual, see the
SLAC man page, or enter "man find" at the
Unix prompt, for more details.
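As a taste of those extra options, here is a small sketch in a scratch
directory (the file and directory names are invented for the example):

```shell
# Sketch: find with a quoted wildcard, and -exec to act on each match
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
echo hello > "$tmp/a.txt"
echo hi    > "$tmp/sub/b.txt"
touch "$tmp/notes.log"
# All .txt files under $tmp; quote the pattern so the shell doesn't expand it:
find "$tmp" -name "*.txt" -print
# Run a command on every match with -exec ({} stands for the found file):
find "$tmp" -name "*.txt" -exec wc -l {} \;
```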
If you aren't fluent with any Unix editor, you might try
nedit just to get started. It is a straightforward
mouse-driven editor which will remind you of your favorite PC or Mac
editor.
emacs:
You should probably start learning emacs. It
is almost guaranteed to be found on any Unix system, and recent
versions of the program have most of the features that you would
expect in a modern editor (and in fact, many more, but you may never
discover them!).
First make sure your xwindow
client is turned on so that the editor window can be created.
Then start the editor:
> emacs &
or
> emacs filename & (to edit or create a particular file)
and when the screen comes up find the Tutorial under
Help in the menu, or type: "CTRL-h t".
Two other important Unix editors are sed and
awk, which are used in conjunction with other shell
commands to generate powerful functions which would generally be coded
in a high-level language in other operating systems. Their use is
outside the scope of this primer, but very characteristic of Unix.
Related documents:
Access permissions control who may read, write or execute a file, and
who may perform operations on a directory. All files and directories
in Unix have permission bits which can be set and displayed. These
bits determine permissions for files in NFS, but a new mechanism
supplants them for AFS files.
You may examine the permission bits for a file or directory
by issuing the command:
> ls -l foo
-rw-r--r-- 1 kiwi users 20 Jan 28 13:41 foo
The first character in the string "-rw-r--r--" tells you the
type of the file: "-" for an ordinary file; "d"
for a directory; and "l" for a link, among other
possibilities. The next three sets of three characters give the read,
write and execute permissions for the owner of the file (u), the
members of the group owning the file (g), and all others (o),
respectively. In this case, the owner may read or write to the file,
and the group and all others may read, but not write to, the file.
Nobody is allowed to execute the file.
If this file is a shell script that you just created, you will need to
allow execution. To allow everyone to execute this file, issue the
command:
> chmod +x foo
> ls -l foo
-rwxr-xr-x 1 kiwi users 20 Jan 28 13:41 foo
For web files you'll want to make the files, and the directories
containing them, readable for everyone (chmod a+r mywebfile.html), but not
writable. Similarly, if you have a .plan file in your home directory,
you need to set read permission on the file, and execute (search)
permission on the directory, for everyone so that it can be read.
To remove execute permission for group and others:
> chmod go-x foo
> ls -l foo
-rwxr--r-- 1 kiwi users 20 Jan 28 13:41 foo
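If you want to try this yourself, here is a minimal sketch on a scratch file
(the "foo" here is a throwaway script, not the file from the listing above):

```shell
# Sketch: symbolic and numeric chmod on a scratch shell script
tmp=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$tmp/foo"
chmod u+rwx,go-rwx "$tmp/foo"   # symbolic: owner gets everything, others nothing
ls -l "$tmp/foo"                # -rwx------ ...
chmod 755 "$tmp/foo"            # numeric equivalent of u=rwx,go=rx
ls -l "$tmp/foo"                # -rwxr-xr-x ...
"$tmp/foo"                      # prints "hello"
```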
The default permissions for a newly created file are controlled by the
file-creation mode mask. This is usually set to a
reasonable value; with the typical mask of 022, new files come out
-rw-r--r-- and new directories drwxr-xr-x. For more
information see the SLAC man
page, or enter "man umask" at the Unix prompt.
The permission bits have slightly different meanings when they apply
to directories, but again, the default is generally reasonable.
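The effect of the mask is easy to see in a scratch directory (022 and 077
here are just common example values, not a statement of SLAC policy):

```shell
# Sketch: how umask shapes the default permissions of new files
tmp=$(mktemp -d)
cd "$tmp"
umask 022          # clear the write bits for group and others
touch shared
ls -l shared       # -rw-r--r-- ...
umask 077          # owner-only: nothing for group or others
touch private
ls -l private      # -rw------- ...
```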
For AFS files, permissions are controlled by access control
lists (ACL) and protection groups. A file uses the ACL
of the directory in which it resides; it does not have an ACL of its
own. If the file is moved to a new directory, it assumes the ACL of
that directory. A newly created subdirectory assumes the ACL of the
parent directory, but its ACL may be changed afterwards.
Unix permission bits are ignored in AFS, except for the user
(owner) bits for files. The user r bit
must be on in order for anyone, even the owner, to read the file, even
if that user has ACL read and lookup rights in the directory;
similarly for the user w bit and ACL write and lookup
rights. Although there is no ACL right directly corresponding to the
user x bit, this bit must be on, and the user have ACL
read and lookup rights, for that user to execute a file.
In most cases, the files you create will have appropriate ACL's, so
you should read the following section to get an idea of how things
work, but don't feel obligated to master the details at this time.
Instead review the documentation when you find you need to use this
system.
Here's what an ACL looks like:
> fs listacl ~kiwi
Access list for /u/br/kiwi is
Normal rights:
system:slac rl
system:administrators rlidwka
system:authuser rl
kiwi rlidwka
Access permissions for various groups are listed. The permissions are:
- Directory access rights, which apply to the directory itself:
-
l (lookup): allows the user to list the names of files
and subdirectories in the directory, to examine the ACL for that directory, to
access the directory's subdirectories, and to look at the contents of the
directory (with ls -l).
-
i (insert): allows the user to add new files and subdirectories
to the directory.
-
d (delete): allows the user to remove or move files and subdirectories.
-
a (administrator): allows the user to change the ACL for
the directory.
- File access rights apply only to files in the directory:
-
r (read): allows the user to read and copy the contents of files.
-
w (write): allows the user to modify the contents of
files in the directory and to change their permissions with the chmod
command.
-
k (lock): allows the user to run programs which need to
lock (flock) the file.
The following shorthand forms exist for setting access rights:
-
write: includes rlidwk
-
read: includes rl
-
all: includes rlidwka
-
none: remove the entry
A protection group is similar to a Unix group, but you can establish
and maintain the protection group yourself. There are four
system-wide protection groups at SLAC:
-
system:administrators - AFS system administrators
-
system:authuser - any user authenticated in the SLAC
cell (that is, anyone who has a SLAC token)
-
system:anyuser - all users anywhere in the world,
authenticated or not
-
system:slac - any user of a host connected to the SLAC
local network
The standard BaBar groups and their access rights may be examined
by clicking here.
You may add individual users to your ACL, or create your own
protection groups, and change the rights of any user or group in your
ACL. Since you probably won't be needing to do this for some time,
I'll skip the discussion here, and refer you to the
SLAC AFS Users' Guide. Also,
> fs help (gives list of fs commands)
> fs [command] -help (gives syntax for fs command [command])
> man fs_[command] (gets man page for the fs command [command])
and similarly for pts commands, which manipulate protection groups.
At SLAC, the NFS and AFS file systems are backed up in different ways, with
different schedules and capabilities. In either case,
backup should be viewed primarily as a disaster recovery
mechanism, not as an archival system. Because of the huge volume of disk
files in the Unix system, it is not feasible to keep old copies of files for
decades. In fact files are kept for no more than one year.
NFS backup is not very interesting to most BaBar users, since the user
directories in the NFS file system are mostly for scratch or temporary
files.
The AFS backup is provided by the native AFS backup system. The unit
of AFS file storage and backup is the volume. Typically, each user
home directory is a single volume. For the first level of backup, AFS
creates a copy of each volume at midnight each night. You can find
this backup copy in the .backup subdirectory in your home
directory. If you have just deleted or damaged a file that existed at
midnight, check the .backup directory for a copy of it from the
previous day.
If you have deleted or damaged a file on a group directory,
things are a bit more complicated, but don't give up hope! Click here for more details.
The AFS backup is a series of full and incremental backups, designed
to provide complete coverage of recent changes, and sparser and
sparser coverage going back in time. A level 0 backup is a full
backup of the AFS file system. A level 1 backup is an incremental
backup of all changes since the previous level 0 backup. A level 2
backup is an incremental backup of all changes since the previous
level 1 backup. The schedule of AFS backups is as follows:
- A full level 0 backup is performed on the first of each month.
This backup is retained for six months.
After six months, only the quarterly (January, April, July, October)
backups are kept. The quarterly
backups are kept for one year.
- A level 1 incremental backup is performed starting at midnight
of each Sunday morning. This backup
is kept for two months.
- A level 2 incremental backup is performed starting at midnight
Monday through Saturday. This
backup is kept for two weeks.
The result of that schedule is that a volume can be retrieved from the
daily backups for the first two weeks, then from the weeklies for the
first two months, then from the monthlies for the first six months,
and then from the quarterlies for one year.
AFS backups are not yet retrievable by users. However, you can use
a web-based form to request damaged/deleted files to be restored from
earlier versions. The web form for general afs restore requests is here,
and such requests can often be dealt with within the hour. If all
else fails, send an email to unix-admin@slac.stanford.edu to request
the retrieval of a file from backup.
To print a simple text file
(or a PostScript file to a PostScript printer), issue the command:
> lpr myfile.txt [another_file ...]
> lpr myfile.ps [...]
The file(s) will be sent to your default printer, which can be found in
the environment variable PRINTER:
> echo $PRINTER
puffin
If it's not defined, you may want to set it in your .cshrc
file or at the command prompt with
> setenv PRINTER my_printer
To print a file on another printer, just type:
> lpr -Pprinter_name myfile
To find the name of a printer near you, look at the file
/etc/printcap (or /etc/qconfig for AIX
machines). If this doesn't help you, ask someone who's been through
this already.
The enscript command turns a text file into a PostScript file and
gives you a wide range of options with respect to format. For example, the
command:
> enscript -2rG myfile.txt
produces a 2-up printout in landscape (rotated) mode with a fancy (Gaudy)
header at the top.
To remove a job from the print queue, you first need to get the job number
for that item:
> lpq [-Pprinter_name]
will give you this information for the jobs in the queue. Then just issue the
command:
> lprm [-Pprinter_name] job# (remove job job#)
> lprm [-Pprinter_name] user (remove all jobs queued by user)
You can't stop a job which is already printing with the lprm
command. You may be able to stop it at the printer by taking the printer
offline or pressing the reset button.
The most frequently used Unix commands for displaying the contents of a file
are cat and more. The major difference between the two
is that more pauses between each screen of text.
cat has several useful options:
> cat myfile (displays the file)
> cat -n myfile (displays file with line numbers)
> cat -tv myfile (displays non-printing characters, including tabs)
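A quick sketch of these options on a two-line sample file (the file name
and contents are invented):

```shell
# Sketch: cat's -n and -t options
tmp=$(mktemp)
printf 'one\ttab\nplain line\n' > "$tmp"
cat -n "$tmp"     # numbered output
cat -t "$tmp"     # the tab character shows up as ^I
```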
more is mainly used to scroll through a file:
> more myfile
The listing stops at the end of each screenful. The usual action at
this point is to hit the SPACE-bar, which causes the next
screenful to appear, but other responses are possible. For example,
type "b" to go back 1 screen;
"5b" to go back 5 screens;
"3f" to go forward 3 screens;
"q" to return to the shell prompt. If you come
to the end of the file, and you aren't automatically returned to the
shell prompt, type "q".
head and tail allow you to look at the first
and last few lines of a file.
> head myfile (look at the first 10 lines of myfile)
> head -20 myfile (the first 20 lines of myfile)
> tail myfile (the last 10 lines of myfile)
> tail -20 myfile (the last 20 lines of myfile)
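The two commands also combine nicely in a pipe; here is a sketch that pulls
lines 3 through 5 out of a sample file:

```shell
# Sketch: head piped into tail to extract a line range
tmp=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 7 > "$tmp"
head -5 "$tmp" | tail -3     # prints "line 3" through "line 5"
```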
grep and fgrep are used to search in a file
for lines containing certain strings. fgrep searches for
literal strings, while grep interprets strings as
regular expressions.
> grep hello myfile (find lines containing "hello" in myfile)
> grep "hi there" myfile (quoted because of the space)
> grep "Hi there" myfile (grep is case sensitive, except if used
with "-i" option)
> grep '[0-9]' myfile (lines containing numbers)
> grep '^I' myfile (lines beginning with "I")
> fgrep '^I' myfile (lines containing a literal "^I")
> cat -tv myfile | fgrep '^I' (find lines with tab characters in myfile)
Regular expressions are another characteristic feature of Unix.
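The grep/fgrep distinction is easy to see on a small sample file (contents
invented for the example):

```shell
# Sketch: grep (regular expressions) vs fgrep (literal strings)
tmp=$(mktemp)
printf 'Hi there\nhello world\nroom 42\n^I marks a tab\n' > "$tmp"
grep -i "hi there" "$tmp"   # -i ignores case: matches "Hi there"
grep '[0-9]' "$tmp"         # regular expression: lines containing a digit
grep '^room' "$tmp"         # ^ anchors the match to the start of a line
fgrep '^I' "$tmp"           # literal "^I" characters, no anchoring
```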
diff and sdiff allow you to compare two
files. diff produces terse, machine-oriented output which is
designed to be read by other programs (for
example, patch). sdiff generates human-readable
output. However, diff only shows those lines
which have been changed, while sdiff prints all
the lines of both files, adding a mark next to the lines which
differ:
> cat ss1
Bert
Ernie
Kermit
Grover
Cookie Monster
Snuffleupagus
Elmo
>
> cat ss2
Bert
Grover
Ernie
Cookie Monster
Snuffy
Elmo
>
>
> diff ss1 ss2 (you won't like this, but here goes anyway)
2,3d1
< Ernie
< Kermit
4a3
> Ernie
6c5
< Snuffleupagus
---
> Snuffy
>
>
>sdiff -w40 ss1 ss2 (set the output width to 40 columns)
Bert Bert
Ernie <
Kermit <
Grover Grover
> Ernie
Cookie Monster Cookie Monster
Snuffleupagus | Snuffy
Elmo Elmo
>
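One more diff habit worth picking up for scripts: its exit status tells you
whether the files match at all. A minimal sketch (file contents invented):

```shell
# Sketch: using diff's exit status, and the unified output format
tmp=$(mktemp -d)
printf 'Bert\nErnie\n'  > "$tmp/ss1"
printf 'Bert\nGrover\n' > "$tmp/ss2"
# diff exits 0 when the files match, 1 when they differ:
if diff -q "$tmp/ss1" "$tmp/ss2" > /dev/null; then
    echo "files identical"
else
    echo "files differ"
fi
# -u produces the "unified" format that tools such as patch accept:
diff -u "$tmp/ss1" "$tmp/ss2" || true
```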
Unix supports background processing. A background process is
a separate task, called a job, that may either be running or
stopped. You may have one foreground job and several
background jobs running simultaneously, constituting a
multitasking environment. Only the foreground job can receive input
from the terminal, but jobs can be moved from background to foreground
and vice-versa.
To execute a command in background, end the command with an ampersand
(&). You can also stop a command which is running in foreground by
typing CTRL-z. In both cases, you receive a new copy of
the shell for foreground processing. Jobs stopped with a
CTRL-z remain inactive until you restart them in the
foreground, or start them in background.
> long_command > cmd.out &
[1] 26162
>... more commands
[1] Done long_command > cmd.out
>
This command, which will presumably take a while, is issued in background so
that you can do other things while waiting for it to finish. The output is
directed to a file, so that it doesn't interfere with output from the
foreground. The system returns a job number (in square brackets) and a
process number. Eventually it finishes, and you get a message to that
effect.
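In a shell script you can do the same thing non-interactively; here is a
sketch in which a short sleep stands in for long_command:

```shell
# Sketch: scripted background execution with & and wait
tmp=$(mktemp -d)
( sleep 1; echo finished ) > "$tmp/cmd.out" &
jobpid=$!                          # process number of the background job
echo "free to do other work while job $jobpid runs"
wait "$jobpid"                     # block until the background job completes
cat "$tmp/cmd.out"                 # prints "finished"
```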
Now suppose you issue the command without the ampersand and
realize after a while that you would rather have it running in
background. One possibility is to type a CTRL-c, and
start over. Or you can move it into the background by first stopping
it, and starting it in background.
> long_command > cmd.out
CTRL-z
[1] 8531 Running emacs public_html/unix/unix.html
[3] 8708 Running xbiff
[4] - 2424 Suspended netscape-4.08
[5] + 7886 Suspended long_command > cmd.out
> ...
> bg %5 ( or " %5 & " ... both change the status of the background job
from suspended (stopped) to running)
[5] long_command > cmd.out
> ...
After typing the CTRL-z, you get a list of all background
jobs and their status. (You can also get this list at any time by
typing jobs at the shell prompt.)
> jobs
[1] 8531 Running emacs public_html/unix/unix.html
[3] 8708 Running xbiff
[4] - 2424 Suspended netscape-4.08
[5] + 7886 Suspended long_command > cmd.out
>
To bring a background job to the foreground, type:
> fg %5 (or " %5 " )
(job continues in foreground)
The commands bg and fg issued without
argument affect the current job, indicated by "+". When you change the
status of a background job, that job becomes the current job (+).
Note that a background job keeps the working directory from which it
was started: changing directories in the shell afterwards does not
affect the job, even when you later bring it to the foreground.
To suspend a background job, type:
> stop %5
and to remove it entirely, type:
> kill %5
You're done for now!
By completing this tutorial, you have
read about and tried out many of the commands that you will need in
the course of your daily use of the Unix system. In addition, you
should now be vaguely aware of some of the features of Unix, to the
extent that you will not be totally surprised at what you encounter.
If you're a new user, it would probably be worth your time to revisit
this tutorial in a few months. Some of the items which may have been
opaque or confusing this time around may be much easier to understand,
and will be much more relevant.
In any event, the list of documents below should be helpful if you
need more information on any of the subjects covered here.
General related documents:
Page maintained by Adam Edwards
Last modified: March 2009