Unix for BaBar
This section of the Workbook is a primer for those just starting out
with Unix. Actually, there's probably a lot more here than you want
to know if you're just starting out... so go through it, and try to
assimilate twenty or so of the most useful commands. If you've used
Unix for a while, but don't consider yourself an expert, you may find
some useful information below, beyond the basic commands. If you are
completely new to Unix, first consult the Prompt information in the
Workbook Introduction.
One nice feature of unix is that nearly all commands are documented in
the so-called "man pages." ("man" is short for "manual.") Whenever you
want to know more about a given command, just enter:
> man <command>
at the command line. This should bring up the man page for that
command. To exit the man page and return to the command line, just type
"q".
If the system responds:
No manual entry for <command>
then don't give up hope until you have also tried:
<command> --help
For example, "BbkDatasetTcl" is more of a BaBar command than a unix command,
so it does not have a man entry. However, it does have a "--help" page.
BaBar code is supported on several different Unix operating systems.
In principle, the code will run on all supported platforms using the
existing tools. (In fact, that's what supported platform means.)
The BaBar environment is organized so that usually you are taken care
of no matter which flavor of Unix you are using. If you want to
check your flavor, issue the following command:
> uname -srv
Linux 2.4.21-27.0.2.ELSDRsmp #1 SMP Wed Mar 16 11:27:43 CST 2005 (yakut06)
...
SunOS 5.8 Generic_105128-15 (shire04)
...
SunOS 5.8 Generic_108528-12 (tersk10)
...
Linux 2.2.19-6.2.12smp #1 SMP Fri Oct 26 13:31:09 EDT 2001 (noric02)
...
Linux 2.4.18-5smp #1 SMP Mon Jun 10 15:19:40 EDT 2002 (noric07)
Note the different versions of Linux in use on noric02 and noric07! As
a general rule, you should log into "yakut", "shire",
etc, rather than specifying a machine number. That way you make sure
you get the architecture you expect, and you allow the system to
direct you to the least loaded machine, which keeps things fast for
you and the other active users.
All collaborators may use the SLAC systems. However, for reasons of
convenience and optimization of the computer power and disk space
distributed among the collaborating institutions, Regional
Centers have been established. Currently these are:
- SLAC
- IN2P3 in France
- Rutherford Lab (RAL) in the UK
- CASPUR in Italy.
SLAC is the main collaboration center, and serves as the regional
center for the USA and Canada. It is the primary repository for all
BaBar data, although certain skims and Monte Carlo samples are
stored at other sites. IN2P3 is the primary European center, and has
some BaBar data, as do Rutherford and CASPUR. Rutherford also
has a mirror of the BaBar Web and code releases.
When you are logged onto a SLAC machine, you will notice that
complete filenames always begin with "/nfs/" or "/afs/". NFS
(Network File System), and AFS (Andrew File System) are file systems
that organize and provide access to all files. AFS is a more advanced
system, and most BaBar files are now in AFS.
A distributed file system is needed because computing nowadays is
spread over hundreds of processors and servers, and users and
resources may be very widespread geographically. Without such a
system, it would be difficult to provide consistent access to all
files and resources.
There is a single AFS tree for the whole world and the root of the AFS
directory is named /afs. All the immediate
subdirectories (there are currently about 150) of the root are called
cells. All the AFS files at SLAC are in subdirectories of the
cell named /afs/slac.stanford.edu/.
(CAUTION: Do NOT type "ls /afs", because this will cause the system
to search not only /afs, but all of the cells of afs, and that can
take a VERY long time. ls does not normally do this, but sometimes
people alias "ls" to mean "ls -l." More about aliases later.)
Some useful features of /afs/ include direct access to the files of
other institutions running AFS, such as CERN and DESY, and
automatic daily backups of your home directory. (So if you ever
accidentally erase a file and regret it the next day, you can
recover it from the backup system.) AFS also makes it harder for
one user in a group to use up all of the disk space.
AFS tokens
When you established your AFS account, you supplied a new AFS
password, which you probably chose to be the same as that of your Unix
account. Now, whenever you log in to Unix, AFS checks the password you
supply to make sure that you are who you say you are. If you are, AFS
grants you a token and thereby authenticates you as
a valid AFS user and logs you into Unix.
A SLAC token expires after 25 hours of being
continuously logged in to Unix and may be renewed with the
klog command.
> klog
The system will then prompt you for your password. Once you
enter it, you will get a new token. The tokens
command shows you if you have tokens and when they expire (see below).
**Note**: If you start getting messages that you have read-only access
to files which you know you should be able to write to, it is likely
that you have no token or that it has expired. In this case, issuing
the klog command should fix things. (Another possibility
is that your disk quota has been exceeded. See below.)
You will have only one token on each machine at SLAC to which you log
on, but may have additional tokens for foreign (that is, non-SLAC)
machines that run AFS. When you connect to another machine, you may
or may not have a token. The most reliable way of finding out is to
issue the tokens command.
Note that while you get a token when you log in to SLAC, this is not
the case at Rutherford Lab (RAL) where you need to log in and then create a
token for yourself.
Authentication commands
To see the tokens you are currently holding, use the
tokens command. For example:
> tokens
Tokens held by the Cache Manager:
User's (AFS ID 1616) tokens for afs@slac.stanford.edu [Expires Jun 13 09:12]
--End of list--
If you do not have a token, you will see:
Tokens held by the Cache Manager:
--End of list--
or
User's tokens for afs@slac.stanford.edu [*EXPIRED*]
To obtain or renew a token, type:
> klog
Password: (your AFS password)
>
To destroy your token, type:
> unlog
All AFS data are stored in volumes, which are sections of AFS server
disks. Each home directory is in a separate AFS volume. You need to be
aware that each volume has a size limit called a quota. Therefore,
each user has his or her personal quota of disk space. If you try to
exceed the quota, which is given as a number of 1K blocks, AFS will
respond with an error message. The command: fs listquota
(or: fs lq) tells you the size of your quota and how much
of it you have used for your files and directories:
> fs listquota
Volume Name Quota Used % Used Partition
u.kiwi 200000 148173 74% 58%
> cd bb1
bb1> fs lq
Volume Name Quota Used %Used Partition
u.kiwi.bb1 200000 76325 38% 62%
There are good reasons to keep individual volumes under 100-200MB.
AFS volumes are moved around from time to time, to "level the load."
It's easier to move smaller volumes around, and the amount of time
they are not accessible to the user is shorter. If your home
directory is larger than this, you may want to split it into multiple
volumes. Other than finding a convenient subset into which to split
your subdirectories, this will have no effect on the use of your home
directory. (See also the disk
space policy page.)
Should you need to increase your AFS quota after filling your initial
AFS space, use the AFS disk space request form to request the
increase.
A very useful command to track down space-hogging directories is:
du | sort -n
This will give a listing of the directories in ascending order of
size. Further, du | sort -n | tail -10 will display the top
10 disk usage directories. To see sizes of directories, disks and
files in GB and MB rather than in the more obscure unit of
"blocks", use
du -h // size of directories
df -h // disk size
ls -alh // file size
Within a given directory, you can also list files in ascending size
order by sorting the long listing numerically on its size field (the
fifth column):
ls -al | sort -k5 -n
To obtain information about your local disk, enter
cd /
df -h
The first command takes you to the root of the file system on your
computer, from which the df command shows you the hard disk
mount directories and their available space.
Finally, to view your disk quota on a standard unix system, enter
quota -v
The file system in Unix is a hierarchy of directories and files.
Every file and directory in the file system can be identified by a
complete list of the names of the directories that are on the path
from the root directory to that file or directory. The root
directory, represented by a "/" (forward slash) is the
directory at the top of the file system. Each directory on the route
is separated by a "/". For example:
/usr/local/bin/emacs
gives the full pathname for the file emacs, the program for the
emacs editor at SLAC. You can picture the path as looking like this:
/(root)
|
|
---------------------
| |
tmp/ usr/
|
----------------------------
| |
local/ spool/
|
---------------------
| |
bin/ lib/
|
---------
|
emacs
In fact, a directory is just another kind of file, which contains
links to its parent directory and to the files and directories which
it "contains."
Your home directory is the directory in which you are placed
when you logon, your initial working directory. If you change
to another directory, this becomes your current directory.
Files may be specified by a relative pathname, which is
the pathname with respect to the current working directory. This is a
pathname given without the initial "/".
There are several pathnames which have special meaning in most
systems:
.          (dot)        the current directory
..         (dot dot)    the directory "above" the current directory (the parent)
~          (tilde)      the home directory of the logged-on user
~userid                 the home directory of the user userid
Here are some commands to manipulate directories. In each command,
the directory may be given either as a full or relative pathname.
- To display the current working directory use the command:
> pwd
- To change the working directory, use the command:
> cd [directory_name]
where directory_name specifies the directory that you
want to move to, absolute or relative.
If absent, you are moved to your own home directory.
(Actually, it's a bit more complicated... if a relative path
[but not "." or ".."] is
specified, cd tries to find it in one of the directories
in the cdpath shell variable. Try:
echo $cdpath to examine this variable.
You can learn about shell variables below.)
One more useful case:
> cd -
takes you back to the previous directory.
- To make a directory, use the command:
> mkdir directory_name
If the directory_name is relative, it is created starting
from the current directory. The other directory commands behave
similarly.
- To remove a directory, use the command:
> rmdir directory_name
The directory must be empty before you can remove it. You
will need to remove any files and subdirectories that it
contains. To remove a directory and all of its
files and subdirectories, use the command:
> rm -r directory_name
**Caution**: If you remove a directory that still holds
files, there is no way to retrieve it, or them,
unless a backup has been made. To avoid
frustration, always use the -r option together with the
-i option:
> rm -ir directory_name
This will warn you and allow you to review the files to be removed.
- To move or rename a directory, use the command:
> mv [option] directory1 directory2
This moves directory1 and all its contents to directory2.
If directory2 does not exist, directory1 is simply renamed
to directory2. Otherwise directory1 is moved into it
as a subdirectory.
- To copy a directory, use the command:
> cp -r directory1 directory2
This command is like mv above, except that the
original directory stays around. It copies
directory1 and all its contents to
directory2. If directory2 does not
exist, it is created as a copy of directory1. Otherwise a copy
of directory1 is created as a subdirectory within it.
- To list the contents of a directory, use the command:
> ls [option] [pathname]
If pathname is absent, you get a listing of the
files in the current directory.
If pathname is a directory name, you get a
listing of the files in that directory.
Pathname may be a filename with
wildcard characters. Two useful wildcard characters
are "*", which matches any set of characters
(including the null set), and "?", which matches
any single character. So:
> ls /foo/*.txt
would list all the files with extension .txt in
the directory foo.
Two useful options are -l, which gives a long listing
with details about each file, and -a, which includes
hidden files (those whose names begin with a "dot") in the listing.
- To create a symbolic link to a directory, use the command:
> ln -s my_directory link_name
The path link_name can now be used anywhere that you
would use my_directory. In a directory listing, symbolic
links are followed by the symbol "@". There's
more to know about links, but this is enough to get us through the
next section.
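The directory commands above can be rehearsed in a scratch area. The
names below (dir_demo, my_directory, link_name) are invented for this
sketch; nothing here depends on the BaBar setup:

```shell
# Scratch demo of mkdir, ln -s and ls -F (all names here are made up)
mkdir -p dir_demo/my_directory      # -p also creates missing parent directories
cd dir_demo
ln -s my_directory link_name        # create a symbolic link to the directory
ls -F                               # link_name@  my_directory/
cd link_name                        # the link behaves like the directory itself
cd ../..                            # back out of the demo area
rm -r dir_demo                      # clean up
```

Note the "@" that ls -F appends to symbolic links, matching the
convention described above.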
Now that you know how to move around a directory structure, you may
find it instructive to practice your skills on the file structure that
you created when you ran the Quicktour.
As you navigate around directories, remember that if you get lost, you
can always get back to your own home directory with:
> cd
Start from your home directory, or your dedicated AFS directory, in
which you will find the subdirectory generated by the example job:
> cd (assuming you put your release in your home directory)
> pwd
/afs/slac.stanford.edu/u/br/penguin/
> ls
...
ana31/
...
> cd ana31
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31
>
> ls
BetaMiniUser all.log database include lib results shtmp tmp
GNUmakefile bin doc java man shlib test workdir
Move down to the BetaMiniUser directory:
> cd BetaMiniUser
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/BetaMiniUser
Now try the workdir directory:
> cd ../workdir
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
> ls
CVS README RooLogon.C kumac shlib
GNUmakefile RELEASE SP-1237-Run4.tcl myHistogram.root snippet.tcl
PARENT RooAlias.C bin pawlogon.kumac
Now follow the symbolic link PARENT:
> cd PARENT
> pwd
/afs/slac.stanford.edu/g/babar/dist/releases/18.6.4
> cd ..
> pwd
/afs/slac.stanford.edu/g/babar/dist/releases
The symbolic link takes us to a different branch of the directory
tree, so the command to go back up one level doesn't return us to
where we started. This link points to the source of the code you have
run, which is useful because now you don't have to remember it. But it
can also be a source of confusion!
To get back, we start over:
> cd ~/ana31/workdir
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
>
When you start a Unix session, you are placed in a shell,
which is a user interface program that accepts terminal input and acts
upon that input. A variety of shells exist, but as a new user, you
will probably find that you are using tcsh, which is an
extension of one of the original shells, csh, the C shell, so
named because its syntax is based on the C language. Other shells you
might encounter are the Bourne shell, and its extensions, the Korn
shell (ksh), and the Bourne Again shell (bash).
To find out what shell you are using, type:
> echo $shell
/bin/tcsh
(The command echo prints its argument to the
terminal. The argument $shell is the string contained in
the variable shell, in this case the name of the shell
being used.)
Much of the Unix documentation in this workbook applies to all shells,
but certain features (command completion, for example) are specific to
individual shells. The syntax and features of tcsh will be
assumed in the remainder of this discussion. It is recommended that
new users accept the default shell and learn to use it, as it is much
easier for people to share shell scripts and generally help each other
out if they are working off the same setup.
You may have already encountered the .cshrc and
.login files when you set up your unix account. The file
.cshrc is intended, among other things, to set
shell and environment variables and
aliases, and is executed when a new shell is opened, say
as a result of an xterm or ssh command.
Shell and environment variables store values that the shell uses in
executing your commands. Shell variables help determine how the shell
program interacts with you. You can create your own shell variables
and assign them values with the set command, and you can insert the
variable value in a command line by prefixing the variable name with
$, as demonstrated above.
> set mywork=~/ana31/workdir
> echo $mywork
/u/br/penguin/ana31/workdir
> cd $mywork
> pwd
/u/br/penguin/ana31/workdir
> unset mywork
> cd $mywork
mywork: Undefined variable.
A few shell variables have special meaning to the shell program. By
assigning values to these variables, you customize the manner in which
the shell executes your commands. path, as well as others
such as prompt, history (see below), and
term, are set in your default .cshrc file,
and, of course, you can change them if you like.
Some shell variables that you might want to set are:
set rmstar # to ask for verification when you type rm *
set noclobber # to disallow inadvertently overwriting a file with
# output redirection
set ignoreeof # to disallow CTRL-d from exiting shell
set nostat = (/afs /nfs)
# to disable command completion for /nfs and /afs
These settings help to keep you safe from small mistakes that can cause big
problems.
One of the most important shell variables is the path. The value for
the path is a list of directories, called your search path. Unix
searches these directories when it looks for a program or command
specified on the command line. If a program is in your search path,
you can just type the name of the program; you don't have to type its
absolute or relative pathname. The BaBar .cshrc file
defines a path with a couple of dozen directories in it. To see which
they are type:
> echo $PATH
/usr/local/bin:/usr/afsws/bin:/usr/afsws/etc:/bin:/usr/bin:/sbin:/usr/sbin:
/usr/etc:/usr/bin/X11:/cern/pro/bin:.:/afs/slac.stanford.edu/g/babar/bin:
/afs/slac.stanford.edu/g/babar/package/objy8.0.9/babar/linux86gcc3/bin:.
/bin/Linux24SL3_i386_gcc323:/afs/slac.stanford.edu/g/babar/dist/releases/
18.6.4/bin/Linux24SL3_i386_gcc323
Suppose you execute programs in another user's directory. You may add
this directory to your path with a set command:
> set path=($path ~penguin/bin)
> echo $PATH
/usr/local/bin:/usr/afsws/bin:/usr/afsws/etc:/bin:/usr/bin:/sbin:/usr/sbin:
/usr/etc:/usr/bin/X11:/cern/pro/bin:.:/afs/slac.stanford.edu/g/babar/bin:
/afs/slac.stanford.edu/g/babar/package/objy8.0.9/babar/linux86gcc3/bin:.
/bin/Linux24SL3_i386_gcc323:/afs/slac.stanford.edu/g/babar/dist/releases/
18.6.4/bin/Linux24SL3_i386_gcc323 /u/br/penguin/bin
Note the parentheses around the value to the right of the equals sign,
necessary because of the blank in the string, and the use of $path to
include all the directories in the current path. Also note the
absence of the $-sign on the left side of the equals sign!
To add new directories to your PATH, the recommended way is to put
an "addpath2" command in your .cshrc file
somewhere after the basic initialization is done, and certainly after
the hepix script is called. The correct syntax is:
addpath2 PATH /whatever/dir/you/want/to/add
"addpath2" is an alias set up somewhere in the
hepix setup. This adds the directory to both the
environment variable PATH and the shell variable "path".
You can also add new directories to your path the
"old-fashioned" way:
setenv PATH /what/ever/directory/you/want/:$PATH
This prepends the directory to the environment variable PATH without
discarding the rest of the PATH list.
To see all of the shell variables currently set, type:
> set
...
(a long list of variables)
>
To see the path of a given program, like emacs for example, use the "which"
command:
> which emacs
/usr/bin/emacs
So emacs is actually located in /usr/bin. But the system is set up so that
whenever you type "emacs", the system responds as if you had typed the full
path name "/usr/bin/emacs".
Environment variables are like shell variables, except that they are
accessible to programs that you run as well as to the shell. By
convention, they are all upper case. Some variables are both shell
and environment. There is a PATH variable, for example,
which contains the same directories as does the path
variable. Some important environment variables are
PRINTER and DISPLAY. Some useful variables
set for you by the BaBar scripts are BFROOT and
BFDIST, which point to the root of all BaBar files, and
to the root of BaBar code, respectively.
> echo $BFROOT
/afs/slac.stanford.edu/g/babar
> echo $BFDIST
/afs/slac.stanford.edu/g/babar/dist
You can set and unset environment variables:
> setenv PRINTER puffin
> echo $PRINTER
puffin
> unsetenv PRINTER
> echo $PRINTER
PRINTER: Undefined variable.
> setenv PATH ${PATH}:~kiwi/bin
and as before, setenv without an argument lists all
environment variables currently set. Note that this command is typed
without the equals sign.
If you modify the commands in .cshrc and you would like
them to take effect, you might think that you could just type:
> .cshrc
But that doesn't work. When executing a program (and a shell script
is essentially a program) the shell starts up a new version of
itself, which disappears after the command is executed. Thus all of
the commands in .cshrc occur in another shell. To make
the file run in the current shell, type:
> source .cshrc
(The command "source script.job" executes each line in the file
script.job as if it had been entered at the command line.)
The tcsh shell has the ability to complete the typing of a name on the
command line, given a unique abbreviation. This feature works with
command names, filenames, references to shell and environment
variables, and the ~username convention.
It is triggered by typing part of the name and pressing the TAB key.
If the part of the name you type does not uniquely identify a complete
name within the appropriate class of names, any additional unambiguous
characters are added to what you typed and the terminal bell is rung.
What happens next depends on the settings of certain shell variables.
If autolist is set, you will see a list of matching names. If not, you
can type CTRL-d to get the list. You can then type a few
more characters and try pressing TAB again. This feature also works
in the middle of a command line.
Give it a try:
> cd
> cd a[TAB]na31/[TAB]
BetaMiniUser/ doc/ lib@ shlib@ tmp/
bin@ include/ man/ shtmp/ workdir/
database/ java/ results/ test/
> cd ana31/w[TAB]orkdir
> pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
Other modifications of this behavior are possible; for details, see the
SLAC man page or enter "man tcsh" at the
Unix prompt.
Unix has an elaborate system of recalling previously typed commands which
is based on the command history. The number of commands saved in
the history is determined by the shell variable history.
You can look at the command history by typing:
> set history=10 (the number of commands kept in the history)
> echo $history
10 (it's usually set to 100)
> history
236 17:11 echo $BFDIST
237 17:14 echo $PRINTER
238 17:15 unsetenv PRINTER
239 17:17 echo $PRINTER
240 17:21 cd
241 17:22 cd ana31/workdir/
242 17:22 pwd
243 17:22 set history=10
244 17:22 echo $history
245 17:22 history
>
There are a variety of commands which you can use to review and repeat
your saved commands, but the easiest way to use this feature is to
scroll back and forward using the up and down arrow keys, possibly
edit the command you want, using the left and right arrow keys, and
re-execute the command by typing a carriage return (enter).
Another way to use the history is to ask for a specific command to be repeated:
> !242 (repeats command number 242)
pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
> !! (repeats the last command)
pwd
/afs/slac.stanford.edu/u/br/penguin/ana31/workdir
> !echo
echo $BFDIST (repeats the most recent command starting with "echo")
/afs/slac.stanford.edu/g/babar/dist
The command is executed immediately, but a complicated set of
history substitutions allow a lot of flexibility, if you like
that sort of thing. Go to the
SLAC man page, or enter "man history" at the
Unix prompt, for more details.
Another feature of the shell allows you to define shorthand names for frequently
used or lengthy commands. When the alias appears in the command line
that the shell reads, its text is replaced by the definition of the alias.
...
alias cp cp -i (protects copy command)
alias ll ls -l \!* (generates long listing... the "\!*" stands for the
arguments given to the alias itself: ll args)
...
Of course, alias commands can also be entered at the command line.
> unalias ll
As usual, issuing the alias command without arguments gives you a
list of all current aliases.
Normally Unix commands take their input from the keyboard and display
output on the terminal screen. The keyboard is the standard input and
the screen is the standard output. A third i/o stream called standard
error also exists, and is ordinarily sent to the screen. However, the
shell provides a means of redirecting the input and output to
come from and go to a file, respectively.
> ls * > dir.list (sends the output of the ls command to the file
dir.list but not if the file already exists and
the shell variable noclobber is set)
>
> ls *.txt >! dir.list (overwrites dir.list even if the shell variable
noclobber is set)
>
> ls *.dat >> dir.list (appends output to the file dir.list)
> rm dir.list
> ls *.txt >>! dir.list (appends output even if the file doesn't exist)
>
> ls foobar >& error.msg (redirects output and errors to the file error.msg)
>
> myprog < input.dat > output.dat
(takes input from the file input.dat and sends it
to output.dat)
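The create/append operators can be tried out on any scratch file; the
file name dir.list below is arbitrary, and only the portable ">" and
">>" forms are used (the ">!" variant is csh-specific):

```shell
# Quick round trip of output redirection (file name is arbitrary)
echo one  > dir.list      # creates (or overwrites) dir.list
echo two >> dir.list      # appends a second line
wc -l < dir.list          # reads standard input from the file: reports 2 lines
rm dir.list               # clean up
```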
Another handy feature of the shell is the pipe. This allows
output from one command to be redirected to another without creating
intermediate files. A sequence of commands connected by pipes is
called a pipeline. Pipes are indicated by a vertical bar.
> who | grep "Jan 21" | sort | lpr
The command who produces as output a list of all the
users on the system, which becomes the input to the grep
command, which outputs all lines that contain the string "Jan
21". This list becomes the input to the command
sort whose output is a list sorted alphabetically on the
first character (which happens, in this case, to be the userid), which
becomes the input to the lpr, which causes the list to be
printed. The net result of all this is to print an alphabetical list
of all users on the system who logged on on January 21.
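The same pattern can be rehearsed on reproducible input. The user
names and dates below are invented, printf stands in for who, and the
lpr stage is dropped so nothing is sent to a printer:

```shell
# A who-like list piped through the same grep/sort stages (names invented)
printf 'carol Jan21\nalice Jan21\nbob Jan22\n' \
    | grep 'Jan21' \
    | sort
# alice Jan21
# carol Jan21
```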
Combining simple commands like those above with command-line editors
like sed and awk allows for extremely
powerful, if complicated, command strings. This use of pipelines is
characteristic of Unix, and gives it great versatility. An explanation
of the use of these features is beyond the scope of the workbook, but
remember, you heard it here!
If the screen you are working in gets annoyingly cluttered, and you
start wishing for a way to just clear it off and start over, your wish
has already been answered. Just type:
> clear
A method of tidying up files and directories that are not going to be
used for some time but which aren't ready to be deleted is to make an
archive file. This is also a very useful way to save space, and to
quickly transfer several files and directories.
tar
"tar" originally meant tape archive. It is a useful
utility to pack multiple files and directories into a single file.
For example, if we have a directory structure such as:
/
cpp_files/
sort.cpp
header.cpp
print.cpp
address.cpp
old_cpp_files/
helloworld.cpp
name.cpp
test_files/
hellow_1.cpp
me.cpp
and want to pack up everything from old_cpp_files on down, an
archive file can be created from the directory '/' with the command:
tar cvf myoldfiles.tar old_cpp_files
This means create an archive, give verbose output
(i.e. list all the files that are being packed up), and store it to a
file called myoldfiles.tar. This command will store
everything from the directory 'old_cpp_files' downwards (hence will
include the directory 'test_files' and its contents also).
The directories and files that have been archived will still exist
as separate files, but can be removed with a command such as:
rm -rf old_cpp_files
(Note: the command rm -rf is a very powerful command which
will delete all files and directories below the directory
old_cpp_files, so should be used very carefully!)
You can compress this one file using the Unix zipping package
gzip:
gzip myoldfiles.tar
which will create a file called myoldfiles.tar.gz.
Similarly, if you obtain a tarred and gzipped file, you can unpack it
with the commands:
gunzip myoldfiles.tar.gz //only need this if the file is actually
zipped
tar xvf myoldfiles.tar
(and you will probably want to save space by deleting the archive file
after this with rm myoldfiles.tar). Here the 'x' in the
options for tar means extract. The original directory structure
will be preserved, so the output will look like:
directory_used_tar_xvf_in/
old_cpp_files/
helloworld.cpp
name.cpp
test_files/
hellow_1.cpp
me.cpp
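The whole cycle can be run end to end in a scratch directory; the
names tar_demo and helloworld.cpp below are made up for this sketch:

```shell
# Round trip of the archive commands above (scratch names, hypothetical file)
mkdir -p tar_demo/old_cpp_files/test_files
echo 'int main() { return 0; }' > tar_demo/old_cpp_files/helloworld.cpp
cd tar_demo
tar cf myoldfiles.tar old_cpp_files   # c = create, f = archive file name
gzip myoldfiles.tar                   # produces myoldfiles.tar.gz
rm -rf old_cpp_files                  # the originals can now be removed
gunzip myoldfiles.tar.gz
tar xf myoldfiles.tar                 # x = extract; the tree is restored
ls old_cpp_files                      # helloworld.cpp and test_files are back
cd .. && rm -r tar_demo               # clean up
```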
Before we discuss individual commands affecting files, there is
something you need to be aware of if you're coming to Unix from VMS.
In that operating system, each file carries with it a version
number, like foo.txt;16. Copying or renaming a file
to one which already exists doesn't destroy that file; it just creates
a file with a higher version number: foo.txt;17. Unix
has no such versions, so renaming or copying in this situation will
replace the existing file. If you mean to save a version of
a file, you must generally do it by hand. This issue also comes up
when editing.
One more thing for you VMS refugees: although many of the filenames
you will encounter bear a superficial resemblance to VMS names, in
that they have the form name.extension, and that these
extensions may have meaning for some programs, it is
important to remember that the "dot" means nothing to
Unix: foo._#_.abcd.text.. is a perfectly legal
(albeit silly) Unix file name.
- To copy a file to another file or directory, use the command:
> cp source_file target
>
Examples:
> cp myfile myfile.old (makes a copy of myfile called myfile.old)
> cp myfile myarchive/ (makes a copy of myfile in the directory
myarchive)
> cp myfile.junk junk/myfile (makes a copy of myfile.junk called myfile
in the directory junk)
> cp myarchive/myfile . (makes a copy in the current directory of
the file myfile in the directory
myarchive, keeping the same name)
**Caution** Copying a file to an existing file will replace
the existing file. To avoid this, issue the copy command with the
option -i:
> ls foo*
foo1 foo2
> cp -i foo1 foo2
cp: overwrite foo2 (yes/no)? ("y" overwrites, anything else prevents
overwriting)
In fact, you may wish to define an alias for the cp command
in your .cshrc file:
> alias cp cp -i
so that any copy will automatically ask for verification before
destroying an existing file.
- To delete a file, use the command:
> rm myfile
Once again, the -i option is available if you would like to
confirm each deletion.
- To move or rename a file, use the command:
> mv source_file target
The usual caution applies to this command, and again the option
-i is available.
Examples:
> mv myfile myfile.old
> mv myfile archive/
> mv myfile.junk junk/myfile
> mv archive/myfile .
- To find a file, use the command:
> find start_directory -name filename -print
This will initiate a search through the directory
start_directory and all its subdirectories, for the file
filename. Wildcards may be used in the filename, but in
this case, the filename must be quoted.
Examples:
> find . -name myfile.txt -print
> find .. -name "*.txt" -print
You may wish to put an alias in your .cshrc file to
handle the simplest use of this command:
> alias findfile "find . -name \!^ -print"
There are lots of other options associated with this command. You can select
files based on date, user, size, etc., and can also perform actions on the
found files. As usual, see the
SLAC man page, or enter "man find" at the
Unix prompt, for more details.
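As a taste of those extra options, here is a small sketch in a scratch
directory (the file and directory names are invented for the example):

```shell
# Sketch: find with a quoted wildcard, and -exec to act on each match
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
echo hello > "$tmp/a.txt"
echo hi    > "$tmp/sub/b.txt"
touch "$tmp/notes.log"
# All .txt files under $tmp; quote the pattern so the shell doesn't expand it:
find "$tmp" -name "*.txt" -print
# Run a command on every match with -exec ({} stands for the found file):
find "$tmp" -name "*.txt" -exec wc -l {} \;
```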
If you aren't fluent with any Unix editor, you might try
nedit just to get started. It is a straightforward
mouse-driven editor which will remind you of your favorite PC or Mac
editor.
emacs:
You should probably start learning emacs. It
is almost guaranteed to be found on any Unix system, and recent
versions of the program have most of the features that you would
expect in a modern editor (and in fact, many more, but you may never
discover them!).
First make sure your xwindow
client is turned on so that the editor window can be created.
Then start the editor:
> emacs &
or
> emacs filename & (to edit or create a particular file)
and when the screen comes up find the Tutorial under
Help in the menu, or type: "CTRL-h t".
Two other important Unix editors are sed and
awk, which are used in conjunction with other shell
commands to generate powerful functions which would generally be coded
in a high-level language in other operating systems. Their use is
outside the scope of this primer, but very characteristic of Unix.
Related documents:
Access permissions control who may read, write or execute a file, and
who may perform operations on a directory. All files and directories
in Unix have permission bits which can be set and displayed. These
bits determine permissions for files in NFS, but a new mechanism
supplants them for AFS files.
You may examine the permission bits for a file or directory
by issuing the command:
> ls -l foo
-rw-r--r-- 1 kiwi users 20 Jan 28 13:41 foo
The first character in the string "-rw-r--r--" tells you the
type of the file: "-" for an ordinary file; "d"
for a directory; and "l" for a link, among other
possibilities. The next three sets of three characters give the read,
write and execute permissions for the owner of the file (u), the
members of the group owning the file (g), and all others (o),
respectively. In this case, the owner may read or write to the file,
and the group and all others may read, but not write to, the file.
Nobody is allowed to execute the file.
If this file is a shell script that you just created, you will need to
allow execution. To allow everyone to execute this file, issue the
command:
> chmod +x foo
> ls -l foo
-rwxr-xr-x 1 kiwi users 20 Jan 28 13:41 foo
For web files you'll want to make the files, and the directories
containing them, readable for everyone (chmod a+r mywebfile.html), but not
writable. Similarly, if you have a .plan file in your home directory,
you need to set read permission on the file, and execute (search)
permission on the directory, for everyone so that it can be read.
To remove execute permission for group and others:
> chmod go-x foo
> ls -l foo
-rwxr--r-- 1 kiwi users 20 Jan 28 13:41 foo
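If you want to try this yourself, here is a minimal sketch on a scratch file
(the "foo" here is a throwaway script, not the file from the listing above):

```shell
# Sketch: symbolic and numeric chmod on a scratch shell script
tmp=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$tmp/foo"
chmod u+rwx,go-rwx "$tmp/foo"   # symbolic: owner gets everything, others nothing
ls -l "$tmp/foo"                # -rwx------ ...
chmod 755 "$tmp/foo"            # numeric equivalent of u=rwx,go=rx
ls -l "$tmp/foo"                # -rwxr-xr-x ...
"$tmp/foo"                      # prints "hello"
```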
The default permissions for a newly created file are controlled by the
file-creation mode mask. This is usually set to a
reasonable value; with the typical mask of 022, new files come out
-rw-r--r-- and new directories drwxr-xr-x. For more
information see the SLAC man
page, or enter "man umask" at the Unix prompt.
The permission bits have slightly different meanings when they apply
to directories, but again, the default is generally reasonable.
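The effect of the mask is easy to see in a scratch directory (022 and 077
here are just common example values, not a statement of SLAC policy):

```shell
# Sketch: how umask shapes the default permissions of new files
tmp=$(mktemp -d)
cd "$tmp"
umask 022          # clear the write bits for group and others
touch shared
ls -l shared       # -rw-r--r-- ...
umask 077          # owner-only: nothing for group or others
touch private
ls -l private      # -rw------- ...
```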
For AFS files, permissions are controlled by access control
lists (ACL) and protection groups. A file uses the ACL
of the directory in which it resides; it does not have an ACL of its
own. If the file is moved to a new directory, it assumes the ACL of
that directory. A newly created subdirectory assumes the ACL of the
parent directory, but its ACL may be changed afterwards.
Unix permission bits are ignored in AFS, except for the user
(owner) bits for files. The user r bit
must be on in order for anyone, even the owner, to read the file, even
if that user has ACL read and lookup rights in the directory;
similarly for the user w bit and ACL write and lookup
rights. Although there is no ACL right directly corresponding to the
user x bit, this bit must be on, and the user have ACL
read and lookup rights, for that user to execute a file.
In most cases, the files you create will have appropriate ACL's, so
you should read the following section to get an idea of how things
work, but don't feel obligated to master the details at this time.
Instead review the documentation when you find you need to use this
system.
Here's what an ACL looks like:
> fs listacl ~kiwi
Access list for /u/br/kiwi is
Normal rights:
system:slac rl
system:administrators rlidwka
system:authuser rl
kiwi rlidwka
Access permissions for various groups are listed. The permissions are:
- Directory access rights, which apply to the directory itself:
-
l (lookup): allows the user to list the names of files
and subdirectories in the directory, to examine the ACL for that directory, to
access the directory's subdirectories, and to look at the contents of the
directory (with ls -l).
-
i (insert): allows the user to add new files and subdirectories
to the directory.
-
d (delete): allows the user to remove or move files and subdirectories.
-
a (administrator): allows the user to change the ACL for
the directory.
- File access rights apply only to files in the directory:
-
r (read): allows the user to read and copy the contents of files.
-
w (write): allows the user to modify the contents of
files in the directory and to change their permissions with the chmod
command.
-
k (lock): allows the user to run programs which need to
lock (flock) the file.
The following shorthand forms exist for setting access rights:
-
write: includes rlidwk
-
read: includes rl
-
all: includes rlidwka
-
none: remove the entry
A protection group is similar to a Unix group, but you can establish
and maintain the protection group yourself. There are four
system-wide protection groups at SLAC:
-
system:administrators - AFS system administrators
-
system:authuser - any user authenticated in the SLAC
cell (that is, anyone who has a SLAC token)
-
system:anyuser - all users anywhere in the world,
authenticated or not
-
system:slac - any user of a host connected to the SLAC
local network
The standard BaBar groups and their access rights may be examined
by clicking here.
You may add individual users to your ACL, or create your own
protection groups, and change the rights of any user or group in your
ACL. Since you probably won't be needing to do this for some time,
I'll skip the discussion here, and refer you to the
SLAC AFS Users' Guide. Also,
> fs help (gives list of fs commands)
> fs [command] -help (gives syntax for fs command [command])
> man fs_[command] (gets man page for the fs command [command])
and similarly for pts commands, which manipulate protection groups.
At SLAC, the NFS and AFS file systems are backed up in different ways, with
different schedules and capabilities. In either case,
backup should be viewed primarily as a disaster recovery
mechanism, not as an archival system. Because of the huge volume of disk
files in the Unix system, it is not feasible to keep old copies of files for
decades. In fact files are kept for no more than one year.
NFS backup is not very interesting to most BaBar users, since the user
directories in the NFS file system are mostly for scratch or temporary
files.
The AFS backup is provided by the native AFS backup system. The unit
of AFS file storage and backup is the volume. Typically, each user
home directory is a single volume. For the first level of backup, AFS
creates a copy of each volume at midnight each night. You can find
this backup copy in the .backup subdirectory in your home
directory. If you have just deleted or damaged a file that existed at
midnight, check the .backup directory for a copy of it from the
previous day.
If you have deleted or damaged a file on a group directory,
things are a bit more complicated, but don't give up hope! Click here for more details.
The AFS backup is a series of full and incremental backups, designed
to provide complete coverage of recent changes, and sparser and
sparser coverage going back in time. A level 0 backup is a full
backup of the AFS file system. A level 1 backup is an incremental
backup of all changes since the previous level 0 backup. A level 2
backup is an incremental backup of all changes since the previous
level 1 backup. The schedule of AFS backups is as follows:
- A full level 0 backup is performed on the first of each month.
This backup is retained for six months.
After six months, only the quarterly (January, April, July, October)
backups are kept. The quarterly
backups are kept for one year.
- A level 1 incremental backup is performed starting at midnight
of each Sunday morning. This backup
is kept for two months.
- A level 2 incremental backup is performed starting at midnight
Monday through Saturday. This
backup is kept for two weeks.
The result of that schedule is that a volume can be retrieved from the
daily backups for the first two weeks, then from the weeklies for the
first two months, then from the monthlies for the first six months,
and then from the quarterlies for one year.
AFS backups are not yet retrievable by users. However, you can use
a web-based form to request damaged/deleted files to be restored from
earlier versions. The web form for general afs restore requests is here,
and such requests can often be dealt with within the hour. If all
else fails, send an email to unix-admin@slac.stanford.edu to request
the retrieval of a file from backup.
To print a simple text file
(or a PostScript file to a PostScript printer), issue the command:
> lpr myfile.txt [another_file ...]
> lpr myfile.ps [...]
The file(s) will be sent to your default printer, which can be found in
the environment variable PRINTER:
> echo $PRINTER
puffin
If it's not defined, you may want to set it in your .cshrc
file or at the command prompt with
> setenv PRINTER my_printer
To print a file on another printer, just type:
> lpr -Pprinter_name myfile
To find the name of a printer near you, look at the file
/etc/printcap (or /etc/qconfig for AIX
machines). If this doesn't help you, ask someone who's been through
this already.
The enscript command turns a text file into a PostScript file and
gives you a wide range of options with respect to format. For example, the
command:
> enscript -2rG myfile.txt
produces a 2-up printout in landscape (rotated) mode with a fancy (Gaudy)
header at the top.
To remove a job from the print queue, you first need to get the job number
for that item:
> lpq [-Pprinter_name]
will give you this information for the jobs in the queue. Then just issue the
command:
> lprm [-Pprinter_name] job# (remove job job#)
> lprm [-Pprinter_name] user (remove all jobs queued by user)
You can't stop a job which is already printing with the lprm
command. You may be able to stop it at the printer by taking the printer
offline or pressing the reset button.
The most frequently used Unix commands for displaying the contents of a file
are cat and more. The major difference between the two
is that more pauses between each screen of text.
cat has several useful options:
> cat myfile (displays the file)
> cat -n myfile (displays file with line numbers)
> cat -tv myfile (displays non-printing characters, including tabs)
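A quick sketch of these options on a two-line sample file (the file name
and contents are invented):

```shell
# Sketch: cat's -n and -t options
tmp=$(mktemp)
printf 'one\ttab\nplain line\n' > "$tmp"
cat -n "$tmp"     # numbered output
cat -t "$tmp"     # the tab character shows up as ^I
```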
more is mainly used to scroll through a file:
> more myfile
The listing stops at the end of each screenful. The usual action at
this point is to hit the SPACE-bar, which causes the next
screenful to appear, but other responses are possible. For example,
type "b" to go back 1 screen;
"5b" to go back 5 screens;
"3f" to go forward 3 screens;
"q" to return to the shell prompt. If you come
to the end of the file, and you aren't automatically returned to the
shell prompt, type "q".
head and tail allow you to look at the first
and last few lines of a file.
> head myfile (look at the first 10 lines of myfile)
> head -20 myfile (the first 20 lines of myfile)
> tail myfile (the last 10 lines of myfile)
> tail -20 myfile (the last 20 lines of myfile)
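The two commands also combine nicely in a pipe; here is a sketch that pulls
lines 3 through 5 out of a sample file:

```shell
# Sketch: head piped into tail to extract a line range
tmp=$(mktemp)
printf 'line %s\n' 1 2 3 4 5 6 7 > "$tmp"
head -5 "$tmp" | tail -3     # prints "line 3" through "line 5"
```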
grep and fgrep are used to search in a file
for lines containing certain strings. fgrep searches for
literal strings, while grep interprets strings as
regular expressions.
> grep hello myfile (find lines containing "hello" in myfile)
> grep "hi there" myfile (quoted because of the space)
> grep "Hi there" myfile (grep is case sensitive, except if used
with "-i" option)
> grep '[0-9]' myfile (lines containing numbers)
> grep '^I' myfile (lines beginning with "I")
> fgrep '^I' myfile (lines containing a literal "^I")
> cat -tv myfile | fgrep '^I' (find lines with tab characters in myfile)
Regular expressions are another characteristic feature of Unix.
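The grep/fgrep distinction is easy to see on a small sample file (contents
invented for the example):

```shell
# Sketch: grep (regular expressions) vs fgrep (literal strings)
tmp=$(mktemp)
printf 'Hi there\nhello world\nroom 42\n^I marks a tab\n' > "$tmp"
grep -i "hi there" "$tmp"   # -i ignores case: matches "Hi there"
grep '[0-9]' "$tmp"         # regular expression: lines containing a digit
grep '^room' "$tmp"         # ^ anchors the match to the start of a line
fgrep '^I' "$tmp"           # literal "^I" characters, no anchoring
```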
diff and sdiff allow you to compare two
files. diff produces terse, machine-oriented output which is
designed to be read by other programs (for
example, patch). sdiff generates human-readable
output. However, diff only shows those lines
which have been changed, while sdiff prints all
the lines of both files, adding a mark next to the lines which
differ:
> cat ss1
Bert
Ernie
Kermit
Grover
Cookie Monster
Snuffleupagus
Elmo
>
> cat ss2
Bert
Grover
Ernie
Cookie Monster
Snuffy
Elmo
>
>
> diff ss1 ss2 (you won't like this, but here goes anyway)
2,3d1
< Ernie
< Kermit
4a3
> Ernie
6c5
< Snuffleupagus
---
> Snuffy
>
>
>sdiff -w40 ss1 ss2 (set the output width to 40 columns)
Bert Bert
Ernie <
Kermit <
Grover Grover
> Ernie
Cookie Monster Cookie Monster
Snuffleupagus | Snuffy
Elmo Elmo
>
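One more diff habit worth picking up for scripts: its exit status tells you
whether the files match at all. A minimal sketch (file contents invented):

```shell
# Sketch: using diff's exit status, and the unified output format
tmp=$(mktemp -d)
printf 'Bert\nErnie\n'  > "$tmp/ss1"
printf 'Bert\nGrover\n' > "$tmp/ss2"
# diff exits 0 when the files match, 1 when they differ:
if diff -q "$tmp/ss1" "$tmp/ss2" > /dev/null; then
    echo "files identical"
else
    echo "files differ"
fi
# -u produces the "unified" format that tools such as patch accept:
diff -u "$tmp/ss1" "$tmp/ss2" || true
```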
Unix supports background processing. A background process is
a separate task, called a job, that may either be running or
stopped. You may have one foreground job and several
background jobs running simultaneously, constituting a
multitasking environment. Only the foreground job can receive input
from the terminal, but jobs can be moved from background to foreground
and vice-versa.
To execute a command in background, end the command with an ampersand
(&). You can also stop a command which is running in foreground by
typing CTRL-z. In both cases, you receive a new copy of
the shell for foreground processing. Jobs stopped with a
CTRL-z remain inactive until you restart them in the
foreground, or start them in background.
> long_command > cmd.out &
[1] 26162
>... more commands
[1] Done long_command > cmd.out
>
This command, which will presumably take a while, is issued in background so
that you can do other things while waiting for it to finish. The output is
directed to a file, so that it doesn't interfere with output from the
foreground. The system returns a job number (in square brackets) and a
process number. Eventually it finishes, and you get a message to that
effect.
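In a shell script you can do the same thing non-interactively; here is a
sketch in which a short sleep stands in for long_command:

```shell
# Sketch: scripted background execution with & and wait
tmp=$(mktemp -d)
( sleep 1; echo finished ) > "$tmp/cmd.out" &
jobpid=$!                          # process number of the background job
echo "free to do other work while job $jobpid runs"
wait "$jobpid"                     # block until the background job completes
cat "$tmp/cmd.out"                 # prints "finished"
```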
Now suppose you issue the command without the ampersand and
realize after a while that you would rather have it running in
background. One possibility is to type a CTRL-c, and
start over. Or you can move it into the background by first stopping
it, and starting it in background.
> long_command > cmd.out
CTRL-z
[1] 8531 Running emacs public_html/unix/unix.html
[3] 8708 Running xbiff
[4] - 2424 Suspended netscape-4.08
[5] + 7886 Suspended long_command > cmd.out
> ...
> bg %5 ( or " %5 & " ... both change the status of the background job
from suspended (stopped) to running)
[5] long_command > cmd.out
> ...
After typing the CTRL-z, you get a list of all background
jobs and their status. (You can also get this list at any time by
typing jobs at the shell prompt.)
> jobs
[1] 8531 Running emacs public_html/unix/unix.html
[3] 8708 Running xbiff
[4] - 2424 Suspended netscape-4.08
[5] + 7886 Suspended long_command > cmd.out
>
To bring a background job to the foreground, type:
> fg %5 (or " %5 " )
(job continues in foreground)
The commands bg and fg issued without
argument affect the current job, indicated by "+". When you change the
status of a background job, that job becomes the current job (+).
Note that a background job keeps the working directory from which it
was started: changing directories in the shell afterwards does not
affect the job, even when you later bring it to the foreground.
To suspend a background job, type:
> stop %5
and to remove it entirely, type:
> kill %5
You're done for now!
By completing this tutorial, you have
read about and tried out many of the commands that you will need in
the course of your daily use of the Unix system. In addition, you
should now be vaguely aware of some of the features of Unix, to the
extent that you will not be totally surprised at what you encounter.
If you're a new user, it would probably be worth your time to revisit
this tutorial in a few months. Some of the items which may have been
opaque or confusing this time around may be much easier to understand,
and will be much more relevant.
In any event, the list of documents below should be helpful if you
need more information on any of the subjects covered here.
General related documents:
Page maintained by Adam Edwards
Last modified: March 2009