4.2. Back to command line.
Section quotes:
- "If there's more than one way to do a job, and one of those ways
will result in disaster, then somebody will do it that way".
-- Murphy's law (Edward A. Murphy)
4.2.1. Working with files.
We have already discussed how to create a new directory with 'mkdir',
remove an empty one with 'rmdir', and how to copy, move (or rename)
and remove files with 'cp', 'mv' and 'rm' respectively, all in section
'3.2. Commandline does NOT scares',
but we did not give examples, so here they go.
Let's assume that you have a file called 'test.txt'
on your desktop, then do the following:
bash$ pwd
/home/ahmad
bash$ ls Desktop/*.txt
test.txt
bash$ mv Desktop/*.txt ./
bash$ ls *.txt
test.txt
bash$ ls Desktop/*.txt
ls: Desktop/*.txt: No such file or directory
bash$ mv test.txt temp.txt
bash$ ls *.txt
temp.txt
bash$ rm -i *.txt
rm: remove 'temp.txt'? y
Tip
Recall that '.'
means the current working directory, '..'
the parent directory, '-' the previous directory (with 'cd')
and finally '~' the home directory.
bash$ pwd
/home/ahmad
bash$ mkdir test1 test2
bash$ ls -F
Desktop/ test1/ test2/ oldfile link@ exec.sh*
bash$ cd test1
bash$ echo "file one" > test1.txt
bash$ less test1.txt
bash$ cp -vi test1.txt test2.txt
test1.txt -> test2.txt
bash$ cp -vi test2.txt test1.txt
overwrite test1.txt [yn] ? y
test2.txt -> test1.txt
bash$ ls *.txt
test1.txt test2.txt
bash$ cd ..
bash$ rm -R test1
Warning
When you execute a command as the root user, even an extra space could
cause catastrophic results. For example, if 'rm -R test1/test2'
is written as 'rm -R test1 / test2'
it will remove 'test1', everything under '/' and 'test2'.
Read twice what you type before you press ENTER.
To know which program is executed when you type some command,
type 'which' followed by the command name, for example
'which lilo' could give '/sbin/lilo'.
This is useful if you have two or more programs with the same
name. But if you are searching for the location of a program,
a manual page or a library, use 'whereis'
followed by the file name without any prefixes or suffixes;
it searches for it only in specific directories, for example:
bash$ whereis lilo
/sbin/lilo
/usr/share/man/man8/lilo.8.gz
/usr/share/man/man5/lilo.conf.5.gz
To do a fast search for files in a cached database of
the entire file system or part of it, use 'locate'
followed by a pattern for the files you want. Patterns could contain
wildcards like '*' and '?' (but you have to use strong quotes so that Bash won't expand them).
If you want a case insensitive search (for Latin or similar languages)
add the option '-i' (or '--ignore-case').
The option '-d' selects a different database file, not a directory;
to limit the search to a directory, include its name in the pattern. The following examples show how it works:
bash$ locate '*lilo*'
/sbin/lilo
/etc/lilo.conf
/usr/share/man/man8/lilo.8.gz
/usr/share/man/man5/lilo.conf.5.gz
/usr/share/doc/Lilo-Doc/lilo-readme.html
/home/ahmad/using lilo.html
bash$ locate -i '/mnt/win_c/*.swp'
/mnt/win_c/windows/win386.swp
This tool may give inaccurate results because its
database could be outdated, but it is fast and efficient for searching
for files that come with the system. Its database should be created
and then updated by the 'updatedb' tool; your distribution usually calls
it regularly (once each day, week, month, etc.) with 'cron'.
The environment variables PRUNEFS and PRUNEPATHS
specify a space-separated list of file systems or directories (respectively)
excluded when updating the database.
To modify it once use something like:
bash# export PRUNEPATHS="/mnt /tmp /proc $PRUNEPATHS"
bash# updatedb
To do it permanently, edit the related 'cron' files.
To search for files directly on a device (not in a cached database of it)
use 'find' followed by the directories to be searched,
followed by some specifiers (or constraints) like a filename pattern:
'-name PAT', age of last modification in days:
'-mtime DAYS', last change of file status such as permissions:
'-ctime DAYS' and the last access to it (eg. reading):
'-atime DAYS'. If you want to execute a command
on the resulting files add '-exec COMMAND {} \;'
where 'find' replaces '{}' with the name of each file and the escaped ';' ends the command.
For example "find ~/tmp/ -atime +3 -exec rm {} \;"
will search '~/tmp/' for all files
that have not been accessed for more than 3 days and remove them.
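The 'find' options described here could be tried safely in a scratch directory. The following is only a sketch: the file names are made up for illustration, and it assumes GNU 'touch' (for the '-d' option used to backdate a file).

```shell
# Scratch directory; every file name here is hypothetical.
dir=$(mktemp -d)
touch "$dir/new.log" "$dir/notes.txt"
# Backdate one file by 10 days (GNU touch syntax, an assumption).
touch -d '10 days ago' "$dir/old.log"

# Files named *.log that were modified more than 3 days ago.
old_logs=$(find "$dir" -name '*.log' -mtime +3)
echo "$old_logs"

# Remove them: find replaces {} with each file name, \; ends -exec.
find "$dir" -name '*.log' -mtime +3 -exec rm {} \;
remaining=$(ls "$dir" | sort | tr '\n' ' ')
echo "$remaining"

rm -r "$dir"
```

Running 'find' once without '-exec' first, as done here, is a cheap way to check what a destructive command is going to touch.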
The following tools change some file properties:
| chown USER[:GROUP] FILE | change the owner of the file |
| chgrp GROUP FILE | change the group owning the file |
| chmod MODE FILE | change file permissions (also called file mode) |
As we have discussed before, file permissions are the execution permission
'x' (being able to run the file or search the directory contents),
the writing permission 'w' and the reading permission 'r'.
Those permissions are controlled by the 'chmod' tool,
for example 'chmod +x file1'
causes the file 'file1' to be executable by everybody,
while 'chmod -x file1'
will deny execution to everybody (including its owner).
You could specify the change to be for the owning user 'u',
the owning group 'g' and others 'o' (comma separated),
for example 'chmod g-w,o-w file1'
will deny the group and others from writing, leaving
the permission for the user unchanged.
Another method of specifying the permissions of a file
is to use the octal (0-7) notation instead of the 'rwx' notation,
where we give 1 for 'x', 2 for 'w' and 4 for 'r' (from right to left)
and the sum (logical OR) for others, owning group and owning user (respectively from right to left)
forms the mode digits. For example, if we want the owner to
read+write+execute then that digit will be 4+2+1=7, the group
to read and execute then that digit will be 4+1=5, and others to do nothing
then their digit will be 0; put them all together to have '750', so
the command for that is 'chmod 750 file1'.
There are more attributes discussed later.
It's much simpler than it looks.
user  group  others
rwx   rwx    rwx
421   421    421
\ /   \ /    \ /
 7     7      7
chmod 777 file1
user  group  others
rwx   r-x    r-x
421   4 1    4 1
\ /   \ /    \ /
 7     5      5
chmod 755 file2
user  group  others
rw-   r--    r--
42    4      4
\ /   \ /    \ /
 6     4      4
chmod 644 file3
The sticky bit 't' has the value 1 in an extra leading (fourth) digit, for example 'chmod 1755 mydir'.
To change the permissions of a directory and its contents
recursively use '-R', for example 'chmod -R 750 myfolder'.
Tip
With '-R' it's very useful to use 'X' instead of 'x': '+X' gives the
execute (search) permission only to directories and to files that are already
executable, so to deny execution of the non-directory
items only, use eg. 'chmod -R a-x,a+X myfolder'.
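The octal arithmetic described in this subsection could be sketched as a pair of small Bash functions. The names 'perm_digit' and 'perm_mode' are made up for this illustration; they are not standard tools.

```shell
# Convert one permission triplet such as "r-x" to its octal digit.
perm_digit() {
    local triplet=$1 digit=0
    case $triplet in r??) digit=$((digit + 4)) ;; esac   # read    = 4
    case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write   = 2
    case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute = 1
    echo "$digit"
}

# Build the full mode from the user, group and others triplets.
perm_mode() {
    echo "$(perm_digit "$1")$(perm_digit "$2")$(perm_digit "$3")"
}

perm_mode rwx r-x ---    # prints 750
perm_mode rw- r-- r--    # prints 644
```

The result could be passed directly to 'chmod', for example 'chmod "$(perm_mode rwx r-x r-x)" file2'.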
4.2.2. Working with file systems.
We have discussed the 'mount' tool in section
'4.1. Hardware configuration.',
which maps devices into the Linux root file system.
We have mentioned that this process does not copy anything;
it's just a virtual process where an empty directory
(actually it need not be empty, its previous contents will be hidden)
appears to contain the files on that device.
The unmounting tool 'umount' cancels this mapping.
What you will learn in this section is that this is not only about real devices,
but also about pseudo file systems like 'proc' (refer to section '3.1 File System Hierarchy.')
which contains information provided by the kernel as files like
'cpuinfo', 'uptime', 'loadavg', 'mounts', 'filesystems'.
We have said that such devices and more are automatically mounted
when you boot Linux; the configuration file that specifies this is
'/etc/fstab', its name is short for file system table.
When you want to mount a device listed in that table you don't
have to provide any details, just type 'mount' followed by
the device or the mount point directory.
Removable devices that should not be mounted automatically on each boot
should have the 'noauto' option.
bash$ cat /proc/cpuinfo
..you see it yourself
bash$ mount
..tells you what is mounted now
bash$ man mount
..tells about mount
bash$ man fstab
..tells about after boot mount table
bash$ cat /etc/fstab
The tool 'eject' will unmount and open the door of the CD drive.
If called without arguments it will eject the first CD;
you could specify which CD you like to eject by passing
the device filename with or without the '/dev' prefix.
# try this joke
bash$ eject ; mount /mnt/cdrom ; eject ; mount /mnt/cdrom
# or this, and wait a minute
bash$ { sleep 60; eject; } &
If you try to unmount a device (directly or through its mount point directory)
and get a 'device is busy' error then it could be because you are
in that directory, or there is a program trying to read from it or
write to it, but the least obvious case is that you have mounted
another device on a mount point somewhere inside that directory
(this is common in 'chroot' environments).
bash$ pwd
/mnt/floppy
bash$ umount /mnt/floppy
umount: /mnt/floppy: device is busy
Do you have two installed Linux distributions?
Or maybe you have a Linux rescue floppy or a Linux live-CD distribution?
Can you run an application from a distribution other than the one
you are running? Let's assume that the other distribution
is mounted on '/mnt/linux' (the one you are running is of course mounted
on '/'). Let's try to run '/mnt/linux/usr/games/gnuchess',
it may work but more likely it won't, because of missing files, missing libraries,
a bad library version, or files that do not exist where it expects them to be,
for example '/usr/share' becomes '/mnt/linux/usr/share'.
What is the solution? Maybe you are not yet interested, it does not look
necessary, but when it's about maintaining a broken system from a limited
rescue environment, like fixing LiLo, it becomes very important.
What we want is to create a virtual environment
with a different root file system; in the example we want to change it
from '/' to '/mnt/linux' (without copying files).
The change root tool 'chroot' does that. It takes the new root directory
as its first argument, for example 'chroot /mnt/linux'
will try to execute '/bin/sh' from the new environment,
which gives you a shell to work in the new environment
until the shell ends with 'exit'.
This means that the new environment should have a working 'sh' and its dependencies.
bash# ls -F
bin/ sbin/ etc/ mnt/ lib/ tmp/
bash# mount /dev/hda5 /mnt/linux
bash# chroot /mnt/linux
sh# ls -F
bin/ sbin/ etc/ mnt/ lib/ usr/
tmp/ var/
sh# lilo -vv
sh# exit
bash#
You could specify a different program to run by passing its name
(in the new environment) after the directory name.
bash# ls /mnt/linux/usr/games/
fortune gnuchess
bash# /mnt/linux/usr/games/fortune
file not found
bash# chroot /mnt/linux /usr/games/fortune
Error float point unit overflow
bash#
In the last step of the previous example session we have executed
'/mnt/linux/usr/games/fortune' from
the new environment and we got a fake error. Don't let it fool you,
it's one of the fortune-mod jokes.
4.2.3. Working with archives.
GNU ZIP 'gzip' is the most used compression method in GNU/Linux
systems, because it's very fast and effective. This tool compresses files
one by one giving each one a '.gz' suffix (the original file disappears);
to decompress you have to pass the '-d' option or just use
the 'gunzip' tool.
You could specify the compression level for 'gzip' from 1 to 9 after '-',
use '-1' for best speed and '-9' for best compression.
Another compression method is BZip2 'bzip2',
it's based on a newer algorithm called block-sorting;
it usually compresses better to give smaller files but takes some more time.
It works just like 'gzip' and has similar options but uses the '.bz2' suffix.
Both tools are programmed to compress standard input and send the result
to standard output if no files are given, look at the following examples:
bash$ ls -lh delme.*
-rw-r--r-- 1 ali users 1M May 28 19:20 delme.bmp
bash$ gzip -9 delme.bmp
bash$ ls -lh delme.*
-rw-r--r-- 1 ali users 12K May 28 19:20 delme.bmp.gz
# notice that there is no delme.bmp any more
bash$ gzip -d delme.bmp.gz
bash$ ls -lh delme.*
-rw-r--r-- 1 ali users 1M May 28 19:20 delme.bmp
bash$ gzip -9 < delme.bmp > delme.bmp.gz
bash$ ls -lh delme.*
-rw-r--r-- 1 ali users 1M May 28 19:20 delme.bmp
-rw-r--r-- 1 ali users 12K May 28 19:20 delme.bmp.gz
# notice that the delme.bmp is not deleted
bash$ bzip2 -9 < delme.bmp > delme.bmp.bz2
bash$ ls -lh delme.*
-rw-r--r-- 1 ali users 1M May 28 19:20 delme.bmp
-rw-r--r-- 1 ali users 12K May 28 19:20 delme.bmp.gz
-rw-r--r-- 1 ali users 9K May 28 19:20 delme.bmp.bz2
This is not archiving, this is compressing.
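To convince yourself that nothing is lost on the way, you could compress through standard input and output and compare the result with the original. This is only a sketch; 'sample.txt' is a made-up file generated on the spot.

```shell
dir=$(mktemp -d)
# A very compressible sample: the same line repeated 1000 times.
yes "the same line again" | head -n 1000 > "$dir/sample.txt"

# Compress through standard input/output; the original file is kept.
gzip -9 < "$dir/sample.txt" > "$dir/sample.txt.gz"

# Decompress to standard output and compare with the original.
if gzip -d < "$dir/sample.txt.gz" | cmp -s - "$dir/sample.txt"; then
    roundtrip=ok
else
    roundtrip=failed
fi

orig_size=$(wc -c < "$dir/sample.txt")
gz_size=$(wc -c < "$dir/sample.txt.gz")
echo "round trip: $roundtrip, $orig_size bytes became $gz_size bytes"

rm -r "$dir"
```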
The most common archiving tool in GNU/Linux systems is the Tape Archive (TAR);
this is just a historical name, it's not only about tapes.
TAR creates uncompressed archives, a TAR archive could then be compressed
with 'gzip' or 'bzip2'.
Tip
Archiving then compressing (general compression) is more efficient than
compressing (file by file) then archiving (like WinZip),
because the tool in the first case could find relations between segments
in different files.
The 'tar' tool could extract, create or list
a TAR archive with the '-x', '-c' and '-t'
options respectively. The option '-p' is used to preserve permissions when extracting
files, and the option '-f' is used to specify the archive name instead of standard
input or output; if '-' is used as the file name then it means standard input or output.
bash$ ls
delme.tar.bz2
bash$ bzip2 -d < delme.tar.bz2 | tar -xvf -
# here we have decompressed it to have uncompressed TAR archive
# then we have extracted files from it.
bash$ ls
delme1.bmp delme2.bmp delme3.bmp delme.tar.bz2
bash$ rm *.tar* && ls
delme1.bmp delme2.bmp delme3.bmp
# to archive and compress all non-hidden files:
bash$ tar -cvf - * | gzip -9 > delme.tar.gz
bash$ ls
delme1.bmp delme2.bmp delme3.bmp delme.tar.gz
The GNU version of TAR could automatically compress/decompress
with 'gzip' by passing the '-z' option, and with 'bzip2' by passing the '-j' option,
so one could use:
bash$ tar -cvzf delme.tar.gz *
instead of 'tar -cvf - * | gzip -9 > delme.tar.gz' (except that '-z' uses the default compression level).
And one could use:
bash$ tar -xvjf delme.tar.bz2
instead of 'bzip2 -d < delme.tar.bz2 | tar -xvf -'.
Compressed listing is also supported, for example
'tar -tvzf delme.tar.gz'.
If you want to extract specific files rather than all files,
you could specify their names after the archive name.
You may also send the extracted file to standard output
with '-O' then pipe the output to another tool,
for example 'tar -xOf proj-2001.tar june/myfile.dat | md5sum'.
You could extract files to a directory other than the current one
with '-C', for example:
bash$ tar -xjvf delme.tar.bz2 -C ~/mynew/dir
Many more options are provided by GNU TAR, like
appending to and removing from an archive or even appending only newer files;
you should refer to the TAR manual page, type 'man 1 tar'.
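The create, list and extract steps could be put together in one short session. This is a sketch in scratch directories; the names 'myfolder', 'one.txt' and 'two.txt' are made up, and the '-z' option assumes GNU TAR (or another 'tar' that supports it).

```shell
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/myfolder"
echo "first"  > "$src/myfolder/one.txt"
echo "second" > "$src/myfolder/two.txt"

# Create a gzip-compressed archive of myfolder
# (-C makes tar change to that directory first).
tar -czf "$src/delme.tar.gz" -C "$src" myfolder

# List the archive contents without extracting anything.
listing=$(tar -tzf "$src/delme.tar.gz" | sort)
echo "$listing"

# Extract a single member into another directory.
tar -xzf "$src/delme.tar.gz" -C "$dest" myfolder/two.txt
extracted=$(cat "$dest/myfolder/two.txt")
others=$(ls "$dest/myfolder" | grep -c 'one.txt')
echo "extracted: $extracted, copies of one.txt: $others"

rm -r "$src" "$dest"
```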
Tip
Wildcard ('*' and '?') substitution by the shell
does not match the leading dot of hidden files. This is very useful to exclude
temporary (or backup) files, configuration files and file manager cache files.
To create an archive of all files in the directory including hidden files
use the directory name, for example: tar -cvzf delme.tar.gz ./
Tip
When you create an archive of a directory from inside it with
'tar -cvzf delme.tar.gz ./'
files will be added directly in the archive root, but by typing the following from its parent:
'tar -cvzf delme.tar.gz myfolder/'
the archive will contain 'myfolder' and its contents.
You could pass many folders and many files to be included:
'tar -cvzf delme.tar.gz myfile1.txt myfolder1/ myfolder2/'.
There are many other archiving tools for GNU/Linux, like 'cpio', 'afio' and 'ar',
but they are less common for general purpose archiving.
And there are many tools for archives that are common but not standard on GNU/Linux;
they could come with your distribution, like
'unarc', 'unarj', 'unrar', 'zip', 'unzip' and 'cabextract'.
For example to extract 'file.zip' in the current directory type:
'unzip /path/to/file.zip'
You could split a file into many smaller files with a specific
size limit (to put each on a floppy) with the 'split' tool,
for example 'split -b 1440k mybigfile' which will produce
many files named something like 'xaa', 'xab', 'xac', etc.
Then you could join them together with 'cat'.
There are many tools that provide an almost unique number (called a checksum) for different
files. This number changes a lot if a single byte in the file is changed;
it's used to detect modifications (eg. due to transmission interference)
just by comparing the checksums of the two files (before and after).
Those tools are (from the weakest to the strongest): 'sum', 'cksum', 'md5sum' and 'sha1sum'.
The most common is 'md5sum' but now most sites are moving to GPG digital signatures,
see section '4.10. Digital signatures and privacy.'.
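Splitting, joining and checksums fit together nicely: after joining the pieces, the checksum tells you whether the copy is identical to the original. The following sketch uses a made-up 5000-byte file and an explicit 'part.' prefix instead of the default 'x'; it assumes 'md5sum' from GNU coreutils is installed.

```shell
dir=$(mktemp -d)
# A 5000-byte sample file (hypothetical name and content).
head -c 5000 /dev/zero | tr '\0' 'x' > "$dir/mybigfile"

# Split into pieces of at most 2000 bytes: part.aa, part.ab, part.ac
split -b 2000 "$dir/mybigfile" "$dir/part."
pieces=$(ls "$dir" | grep '^part\.' | tr '\n' ' ')
echo "$pieces"

# Join the pieces again (the shell expands the glob in sorted order).
cat "$dir"/part.* > "$dir/rejoined"

# Compare the checksums of the original and of the rejoined copy.
sum_before=$(md5sum < "$dir/mybigfile")
sum_after=$(md5sum < "$dir/rejoined")
if [ "$sum_before" = "$sum_after" ]; then echo "checksums match"; fi

rm -r "$dir"
```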
To do a dumb comparison between the corresponding lines of two files
use the 'comm' tool (it expects sorted input), but GNU provides a much smarter tool called 'diff'
to find the differences between two files. For example, it detects
a line added at the beginning of a file, while other tools
think that all the lines after it differ too.
It could be used to create a script called a patch file that automatically
applies the changes to unchanged (old) files, go back
to subsection '3.5.3. Patching.'.
4.2.4. Managing processes.
Each running program is called a 'process'. When a program calls another
we say that the parent process forks a child process.
The child process gets the privileges of its parent,
so when you run a program as the root user and it forks another program,
the latter runs as root too.
If you close the parent process the child process will also be terminated.
This is very useful but becomes annoying when you run a GUI
program like 'gedit' from an X terminal: by closing the
terminal, the child process ('gedit' in the example) will be closed.
If you don't want the child process to be closed use 'nohup'
to call the child; then if the parent is closed the child keeps running
as an orphan.
There are many tools to manage processes in many ways like
terminating or killing a process and raising or lowering its priority.
To make it easy a unique process identifier (PID) is given to each process.
The tool 'ps' lists the processes currently running on the same terminal by the same user
with their corresponding PIDs; pass the '-a' argument to
list those of all terminals. There are many more options to specify
what details to show like the user, memory and CPU usage.
A tool called 'top' (a friend of mine told me that it looks better when called from 'xterm')
shows an automatically updated list of system wide processes ordered by
load and resource consumption. It's an interactive tool that waits to be exited
by pressing 'q' (or CTRL+C); other keys are used for
killing a process and raising or lowering its priority.
There are some similar GUI tools like 'gnome-system-monitor'.
To terminate a process use the 'killall' tool followed by
the name of the program, for example 'killall wine'.
You could specify how it would be killed by specifying a signal
like the normal termination signal '-TERM'
(or by number '-15');
this signal gives the process a chance to handle it, for example by
asking the user to save his work,
or like the less polite signal '-KILL' (or by number '-9'),
which forces the termination of a process. It should be used
after trying '-TERM' with obstinate processes that refuse to
terminate.
Tip
Many GNU documents suggest that you should use symbolic signal
names instead of numbers because the numbers could differ in other Unices.
On the other hand the 'kill' tool uses PIDs instead of program names,
so you have to use 'ps' or 'pidof' to know the PID of
the process you want to terminate, for example type 'kill -TERM `pidof wine`'
and then 'kill -KILL `pidof wine`' to kill all instances of WINE.
Those tools are very useful even though it's unlikely to face a program that
needs to be "killed", because they could send many other signals to processes,
like STOP and CONT to suspend and resume a process.
'kill' has a major role when you halt or restart the system safely:
it's called to terminate processes to free devices in order to be
able to unmount them, because busy devices won't be unmounted and this
causes file system inconsistency that will be checked on the next boot.
You could specify different priorities for different processes.
The priority is denoted by a number from -20 to 19, the more negative the higher
the priority. To run a program with a non default priority type 'nice -n'
followed by the priority followed by the command and its arguments;
the default priority is 0 and regular users could not set higher priorities.
To change the priority of a running process type 'renice'
followed by the new priority then the PID of the process.
Here is a summary:
| killall [-SIGNAL] PROCNAME | terminate a running process by name (or send it a signal) |
| kill [-SIGNAL] PID | terminate a running process by PID (or send it a signal) |
| nice -n PRIORITY COMMAND | execute COMMAND with a specific priority |
| renice PRIORITY PID | change the priority of a running process by its PID |
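Sending signals by PID could be tried without any risk on a harmless 'sleep' process. This is a sketch for Bash; the exit status of 128 plus the signal number is how Bash reports a child ended by a signal, and the number 15 for TERM is the usual Linux value.

```shell
# Start a long-running process in the background and remember its PID.
sleep 30 &
pid=$!

# Suspend and resume it with the STOP and CONT signals.
kill -STOP "$pid"
kill -CONT "$pid"

# Ask it politely to terminate, then collect its exit status.
kill -TERM "$pid"
status=0
wait "$pid" || status=$?

# Bash reports 128 + signal number for a child ended by a signal
# (TERM is 15 on Linux, so 143 here).
echo "exit status: $status"
```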
Many people think that multitasking is for GUI systems only.
This is not true, even when you are running Linux without X
you have many virtual terminals, you could use
CTRL+ALT+F1, CTRL+ALT+F2... and so on
to move between them, and even with one terminal you could run many programs.
When you are using the GUI, you minimize a window and activate another;
you could do a similar thing with console programs, press
CTRL+Z to suspend the program you are running.
Let's assume you are running VIM on a terminal to edit some source code
and you want to compile it. Press CTRL+Z, VIM will be
suspended and you will see something like '[1]+ Stopped vim'
where [1] is the job number (its PID, say 171, is shown by 'jobs -l'),
now type the command you like, for example 'make'. Let's assume that
you have just remembered that you have to copy some files, so you used 'cp',
those files were larger than you expected and you have not
used '&' after the 'cp' command. Do you have to wait?
No, press CTRL+Z to suspend the copying and
type 'jobs' to list the suspended jobs. Let's switch back
to VIM, type 'fg %1' which means put job number 1 in the foreground
(like when you click on the taskbar button of a minimized window in a GUI),
but there is a problem, the copying is suspended (not "minimized").
Press CTRL+Z to suspend VIM; if we
type 'fg %2' the copying will be brought back to the front of the terminal
(you won't see anything, unless you use '-v' with 'cp').
This is not what we want, we don't want to wait for the copying.
We want the copying to keep working but in the background,
so type 'bg %2', now type 'fg %1'
to bring VIM to the foreground while 'cp' is working in the background
at the same time.
If 'cp' is running in verbose mode '-v' it will be impossible to work with
VIM because of the messages telling you that this is copied there,
so when you run programs in the background with the '&' operator
or with 'bg', make sure to redirect their output to
'/dev/null', to a file to be reviewed later or even
to another terminal '/dev/ttyN'.
| jobs | list jobs in the current terminal, use '-l' to display PIDs too |
| bg %N | send job number N (not PID) to the background |
| fg %N | bring job number N (not PID) to the foreground |
| kill %N | kill job number N (not PID) |
To know which process is using a certain file type 'fuser'
followed by the filename(s). You may use the '-v' option
(eg. 'fuser -v myfile')
to have output in the following format:
FILE USER PID ACCESS COMMAND
where ACCESS could be c, e, f, r or m to indicate how the process is using the file:
it is the current directory of the process, the process is executing
that file, has it open, it is the root directory of the process (useful in case of chroot),
or it is mapped (memory mapped or loaded as a shared library) respectively.
The option '-m' is useful when the name is a mounted file system or a device,
so that every process accessing any file on it is listed (even files in its subdirectories).
With '-k' you could send all matched processes a signal (KILL by default),
it's very useful, for example if you want to unmount the floppy
but it's busy, then you could terminate all processes that use it
with 'fuser -km /mnt/floppy'; after that
all you have to do is to use 'umount'.
This tool could also work on network sockets, for example to list all
processes accessing some IP address or port number, see
section '5.2. Working in networks'.
4.2.5. Text filters.
As you have noticed before, you could edit files without a text editor,
simply with 'cat' and output redirection, for example:
bash$ echo "create a file" > delme.txt
bash$ echo "add this line to the end" >> delme.txt
bash$ { echo "add this line at the top" ; cat delme.txt ; } > delme.tmp && mv delme.tmp delme.txt
bash$ cat delme.txt
add this line at the top
create a file
add this line to the end
bash$ rm delme.txt
bash$ cat <<EOF > delme.txt
blah blah
blah
EOF
bash$ cat delme.txt
blah blah
blah
bash$
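A word of caution about adding a line at the top: redirecting the output of a command onto a file that the same command is still reading empties the file before it is read, so a temporary file is the safer way. The helper name 'prepend_line' below is made up for this sketch.

```shell
# Prepend a line to a file through a temporary file.
# prepend_line is a hypothetical helper name.
prepend_line() {
    local line=$1 file=$2 tmp
    tmp=$(mktemp) || return 1
    # Write the new line plus the old content, then replace the file.
    { echo "$line"; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
}

file=$(mktemp)
echo "create a file" > "$file"
echo "add this line to the end" >> "$file"
prepend_line "add this line at the top" "$file"

result=$(cat "$file")
echo "$result"
rm "$file"
```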
You could display the first few lines of a file with 'head',
and the last few with 'tail'.
To make text look better for printing use 'pr', which splits it into pages
with headers and margins, for example 'pr -o 5 FILE | less'
will add a 5 character left margin (note that '--width' only applies to multi-column output).
The word counter 'wc' tells how many lines, words and characters are in a file.
'tr' replaces members of the first string with the corresponding ones from the second,
for example to convert all upper case letters to lower case one could use:
'tr A-Z a-z'. Numbering lines could be done with 'nl'
(which skips empty lines), while 'sort' is used to sort
the lines of a file alphabetically in ascending order; with '-n' it
sorts them as numbers (alphabetically, '10' comes before '9'), with
'-r' the order is reversed. 'uniq' is used to remove
repeated consecutive lines; with '-c' it prefixes lines with
the number of occurrences (besides the removing).
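These small filters become really useful when chained with pipes. As a sketch, the following counts how often each word occurs in a made-up word list and then shows the difference between alphabetical and numeric sorting.

```shell
words=$(printf '%s\n' Apple banana apple Cherry banana apple)

# Lower-case everything, sort, count repeated consecutive lines,
# then put the most frequent word first.
counts=$(echo "$words" | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn)
echo "$counts"

# The most frequent word is the second field of the first line.
top=$(echo "$counts" | head -n 1 | awk '{print $2}')
echo "most frequent: $top"

# Alphabetical versus numeric ordering of the same three lines.
alpha=$(printf '%s\n' 10 9 2 | sort | tr '\n' ' ')
numeric=$(printf '%s\n' 10 9 2 | sort -n | tr '\n' ' ')
echo "sort: $alpha"
echo "sort -n: $numeric"
```

Notice that 'sort' comes before 'uniq', because 'uniq' only merges repeated lines that are next to each other.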
There are tools capable of much more complicated things, like
'grep', 'sed', 'cut' and 'awk'.
Those are relatively complicated tools, read about them quickly for now
and revisit this section when writing Bash shell scripts.
'grep' for example, searches for lines matching its argument
in the given files (or else on the standard input) and prints them,
for example 'grep vfat /etc/fstab' will print all lines
containing 'vfat' in 'fstab', in other words it tells you about Windows partitions;
with '-i' it becomes case insensitive, for example 'grep -i "VfAt" /etc/fstab'.
Tip
In the last example it could match a line like this '# NON vfat goes here'
which we don't want, so we need to learn more.
What if you were searching for a pattern like a line that starts with
a number, or similar general searches? That's what Regular Expressions (RE for short)
are used for. Many programs use REs to represent text patterns, such as
'grep', 'sed', 'awk', 'perl' and others.
Some characters in an RE have a special meaning,
those are '. * ^ $ [ ] \', they are called POSIX standard REs;
other characters have a literal meaning (remain as is). This table shows
the meaning of them:
| RE | meaning | example | example meaning |
| . | any single character | '.m' | a single character followed by 'm' like 'am', 'pm' and '1m' |
| * | the previous RE repeated zero or more times | '.*' | anything of any length |
| ^ | match at the beginning of a line | '^ali' | a line that starts with 'ali' |
| $ | match at the end of a line | '^$' | an empty line (starts then ends directly) |
| [SET] | any single character from the SET | '[a-zA-Z]' | any English letter |
| [^SET] | a single character that does not belong to the SET | '[^0-9]' | a single non numeric character |
| \X | X itself, as is | '1\*1' | '1*1' but not repeated 1s |
If you need to use one of the POSIX RE special characters literally, prefix it with '\'.
POSIX REs are extended with more special characters,
those are '? + { } ( ) |', and for compatibility with
the standard POSIX REs (where those characters represent themselves)
they need to be escaped with '\' to get the new special meaning.
Tip
Some programs (like 'mc' and 'perl') do not follow this rule, they use unescaped extended
REs (without '\') for the special meaning, and escaped ones to represent
the characters themselves.
GNU 'grep' follows the rule above by default; if the '-E' option is used
(or it is called as 'egrep') it works the other way round, like the programs in the tip.
Note that '-e' (lower case) only marks the argument that follows as the pattern.
The following table shows the meaning of those extended REs.
| RE | meaning | example | example meaning |
| ? | the previous RE at most once (zero or one) | 'e\?xtra' | matches 'extra' or 'xtra' |
| + | the previous RE at least once (one or more) | 'lo\+k' | matches 'lok', 'look', 'loook', etc. |
| {n} | the previous RE exactly n times | '[a-ce-z]\{6\}' | exactly 6 lower case English letters, none of them is 'd' |
| {n,m} | the previous RE from n to m times inclusive | '[a-wy-z]\{3,6\}' | 3 up to 6 lower case letters, none of them is 'x' |
| {n,} | the previous RE at least n times | 'f\{5,\}' | five or more 'f' letters |
| < > | match at a word boundary | '\<[a-zA-Z]\+\>' | a single separate English word |
| ( ) | treat what is enclosed as a single RE | '\(foo\)\+' | repeated 'foo', not 'fo' then repeated 'o' |
| | | this or that | '[0-9A-Za-z]*\.\(com\|net\|org\)' | alphanumeric characters followed by '.com', '.net' or '.org' |
| 1 2 .. 9 | the result of matching the n-th parenthesis | '\([a-z]\+\)-\1' | the same word repeated twice, separated with '-', like 'foo-foo' |
For example "grep 'a{3}'" will search for the literal 'a{3}'
while "grep 'a\{3\}'" will match 'a' repeated 3 times; with 'egrep' (or 'grep -E') it is the other way round.
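The difference between the escaped and the unescaped braces is easy to check for yourself. The following sketch counts matching lines with 'grep -c' (GNU 'grep' is assumed); each of the four counts should come out as 1.

```shell
# Basic REs (plain grep): braces are special only when escaped.
basic_repeat=$(echo 'aaa'  | grep -c 'a\{3\}')    # 'a' three times
basic_literal=$(echo 'a{3}' | grep -c 'a{3}')     # the literal text a{3}

# Extended REs (grep -E): the unescaped braces are the special ones.
ext_repeat=$(echo 'aaa'  | grep -Ec 'a{3}')       # 'a' three times
ext_literal=$(echo 'a{3}' | grep -Ec 'a\{3\}')    # the literal text a{3}

echo "$basic_repeat $basic_literal $ext_repeat $ext_literal"
```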
POSIX RE classes have the form '[[:ClassName:]]'
where ClassName is one of 'alpha', 'upper', 'lower', 'digit', 'alnum'
and 'space', defined for English as:
alpha [A-Za-z]
upper [A-Z]
lower [a-z]
digit [0-9]
alnum [A-Za-z0-9]
space any whitespace (space, tab, new-line)
for example "cat text.txt | grep -e '^[[:alpha:]]\+$'"
matches lines formed entirely of consecutive letters (lines having a single word).
Perl REs contain more extensions like '\d' to match a single digit,
'\s' for a whitespace and '\S' for a non-whitespace.
'grep' could be used to learn English,
for example to list all words that end with 'tion'
type "grep -ie 'tion$' /usr/dict/words /usr/share/dict/words | less".
You will notice how useful REs are when you apply them in real life.
Tip
You could test your RE skills with something like
"echo 'SOMETHING' | egrep 'PAT'",
if it's printed then it matches.
Tip
You could search for files containing some REs by passing '-l' to 'grep',
which causes 'grep' to print filenames only.
In the following examples we practice "grepping" REs
on input consisting of the lines "bo", "bb", "bob", "bobob" and "boob".
# lines starting with "b" followed with zero or more "o" followed by ending "b"
bash$ echo -e "bo\nbb\nbob\nbobob\nboob" | grep '^bo*b$'
bb
bob
boob
# lines starting with zero or more "bo" followed by ending "b"
bash$ echo -e "bo\nbb\nbob\nbobob\nboob" | grep -e '^\(bo\)*b$'
bob
bobob
# lines containing "b" followed with zero or more "o" followed by ending "b"
bash$ echo -e "bo\nbb\nbob\nbobob\nboob" | grep -e 'bo*b'
bb
bob
bobob
boob
Sometimes you need a deeper look. The same input lines are tested
with an RE that you may think means ending with "o" or "x",
but strange results appear:
bash$ echo -e "bo\nbb\nbob\nbobob\nboob" | grep -e 'o\|x$'
bo
bob
bobob
boob
it really matches lines containing "o" or ending with "x",
the right way to do that is
bash$ echo -e "bo\nbb\nbob\nbobob\nboob" | grep -e '\(o\|x\)$'
bo
also "grep -e 'o$\|x$'" will work.
To deal with fields we use the 'cut' tool; fields are parts of a line
separated by a given delimiter (TAB by default).
The syntax of 'cut' is like:
'cut -d Delimiter -f FieldNo[,FieldNo]...[FieldNo-FieldNo]'
which means
"give me those fields of each line, considering 'Delimiter' to be the field separator".
For example Delimiter could be '|' and the fields could be
'2-5' or '1,6'. Let's take an example from real life: the file
'/etc/passwd' contains information about users
in the form: the login name, then the password (usually omitted), then the UID ...
then the default shell; fields are separated by ':'.
The commands needed to find the UID of 'ali' (the UID field is the third) are:
# let's look at the file
bash# cat /etc/passwd
...
ali:x:1000:100:users:/home/ali:/bin/bash
ahmad:x:1001:100:users:/home/ahmad:/bin/bash
...
bash# cat /etc/passwd | grep ^ali | cut -d ':' -f 3
1000
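If you prefer not to touch the real '/etc/passwd' while experimenting, the same thing could be tried on a small sample. This sketch reuses the two lines of the listing above; anchoring the pattern as '^ali:' avoids matching other login names that merely start with 'ali'.

```shell
# Two sample lines in /etc/passwd format (taken from the listing above).
sample='ali:x:1000:100:users:/home/ali:/bin/bash
ahmad:x:1001:100:users:/home/ahmad:/bin/bash'

# Third field (the UID) of the line whose login name is exactly "ali".
uid=$(echo "$sample" | grep '^ali:' | cut -d ':' -f 3)
echo "$uid"

# Login name and default shell (fields 1 and 7) of every line.
shells=$(echo "$sample" | cut -d ':' -f 1,7)
echo "$shells"
```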
Exercise
Type the commands needed to find the device(s) that the CDROM is connected to
(assuming that 'fstab' is outdated), you could use 'dmesg'.
'awk' is a very powerful tool; it's a complete programming language
that natively supports REs. 'awk' could also be used as a field filter,
but unlike 'cut' its delimiter is not limited to one single character. By default
it uses whitespace (multiple spaces, TABs, newlines or any combination of them).
The delimiter is held in awk's own FS variable (it is not taken from the shell's
IFS variable) and is usually set with the '-F' option.
The simple syntax to do so is like: "awk '{print $FieldNo [$FieldNo]...}'",
for example "awk -F ':' '{print $3}' /etc/passwd" prints the
third field of each line of the 'passwd' file.
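A minimal sketch of awk as a field filter, passing the separator with the
'-F' option. The sample lines and the helper name 'passwd_sample' are made
up for this example:

```shell
# passwd_sample: helper introduced for this example
passwd_sample() {
  printf '%s\n' \
    'ali:x:1000:100:users:/home/ali:/bin/bash' \
    'ahmad:x:1001:100:users:/home/ahmad:/bin/bash'
}

# print "login UID" for every line, using ':' as the field separator
passwd_sample | awk -F ':' '{ print $1, $3 }'

# a condition in front of the braces filters lines:
# only logins whose UID (field 3) is above 1000 (prints: ahmad)
passwd_sample | awk -F ':' '$3 > 1000 { print $1 }'
```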
Awk supports loops, arrays and much more complicated things. The following
awk script finds the longest English word:
bash$ awk '
{ if (length($0) > max) { max = length($0); r = $0 } }
END { print "the longest word is " r }
' /usr/share/dict/words
Refer to section '6.4. Programming with Awk and Perl'.
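As a small taste of awk arrays and loops, here is a sketch that counts how
many users have each login shell. The sample lines and the helper name
'count_shells' are made up for this example; the final 'sort' is there only
because awk does not promise any order when looping over an array.

```shell
# count_shells: hypothetical helper, reads passwd-format lines on stdin
count_shells() {
  awk -F ':' '
    # field 7 is the login shell; use it as the array index
    { count[$7]++ }
    # after the last line, print one "count shell" line per shell
    END { for (sh in count) print count[sh], sh }
  ' | sort
}

printf '%s\n' \
  'ali:x:1000:100:users:/home/ali:/bin/bash' \
  'ahmad:x:1001:100:users:/home/ahmad:/bin/bash' \
  'guest:x:1002:100:users:/home/guest:/bin/sh' | count_shells
```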
Stream Text Editor (SED) is a very useful tool. It's a non-interactive text editor
that acts on the specified range (by default the whole file), which could be given
as line numbers or as lines that match some RE. By default SED prints all
input lines after the change unless the -n option is used.
The following table shows a very small part of SED's capabilities:
| p | print the range |
| d | delete the range |
| s/PAT1/PAT2/ | substitute, replace PAT1 with PAT2 once per line |
| s/PAT1/PAT2/g | global substitution, all occurrences of PAT1 |
| y/SET1/SET2/ | replace members of SET1 with the corresponding member of SET2 (SET1 and SET2 should have the same size) |
where PAT1 is a Regular Expression and PAT2 is the replacement text.
A range is a prefix made of line numbers or an RE enclosed between two '/'.
For example "cat file | sed -e '/^Ali/s/ahmad/Ahmad/g'"
means to replace all 'ahmad' with 'Ahmad' only on lines starting with 'Ali'.
Another example, "cat file | sed -e '/^$/d'",
will remove empty lines.
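A short sketch of the other commands with ranges. A range can also be a pair
'FIRST,LAST' where each end is a line number or an RE. The sample lines and
the helper name 'lines' are made up for this example:

```shell
# lines: helper introduced for this example (the third line is empty)
lines() { printf 'one\ntwo\n\nfour\nfive\n'; }

# print only lines 2 to 4 (-n turns off the automatic printing)
lines | sed -n -e '2,4p'

# delete from line 1 up to the first empty line (prints: four, five)
lines | sed -e '1,/^$/d'

# y works on single characters: every "o" becomes "0", every "e" becomes "3"
lines | sed -e 'y/oe/03/'
```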
Notice the following trivial session:
# replace 1st occurrence of "el" with "le" in each line
bash$ echo "elelel" | sed -e 's/el/le/'
leelel
# replace every occurrence of "el" with "le" in each line
bash$ echo "elelel" | sed -e 's/el/le/g'
lelele
bash$ echo "angel" | sed -e 's/el/le/g'
angle
# replace every occurrence of ending "el" with "le" in each line
bash$ echo -e "angel\nelelel" | sed -e 's/el$/le/g'
angle
elelle
Things could get much more complicated. Study the following session,
which was supposed to replace consecutive "o" letters with a single "x":
bash$ echo "foobar" | sed -e 's/o*/x/g'
xfxbxaxrx
bash$ echo "foobar" | sed -e 's/\(o*\)/[\1]/g'
[]f[oo]b[]a[]r[]
bash$ echo "foobar" | sed -e 's/o\+/x/g'
fxbar
It goes wrong because '*' also matches zero-length strings
(try echo -e "bo\nbb\nbob\nbobob\nboob" | sed -e 's/bo*b/X/g').
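The '\+' operator is a GNU extension to basic REs. A sketch of the same
substitution written so that any sed should accept it, spelling
"one or more o" as 'oo*':

```shell
# "oo*" is one "o" followed by zero or more "o",
# so it can never match an empty string (prints: fxbar)
echo "foobar" | sed -e 's/oo*/x/g'

# the same idea on a longer, made-up line (prints: fx bx zxm)
echo "foo boo zoooom" | sed -e 's/oo*/x/g'
```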
Another source of confusion is longest (greedy) matching. Study
how replacing an image HTML tag with [IMG] succeeded with the first input
but failed with the second:
bash$ echo '<image src="x.png">foobar' | sed -e 's/<image.*>/[IMG]/g'
[IMG]foobar
bash$ echo '<image src="x.png"> <b>foobar</b>' | sed -e 's/<image.*>/[IMG]/g'
[IMG]
bash$ echo '<image src="x.png"> <b>foobar</b>' | sed -e 's/<image[^>]*>/[IMG]/g'
[IMG] <b>foobar</b>
If you are not bored of examples yet, study the following ones:
# flip user and domain
bash$ echo "<ahmad@somewhere.com>" |
> sed -e 's/<\([^>]*\)@\([^>]*\)>/<\2@\1>/g'
<somewhere.com@ahmad>
# SPAM protection
bash$ echo "<someone@somewhere.com>" |
> sed -e 's/\(<[^>]*\)@\([^>]*\)\.\([^>]*>\)/\1[AT]\2[DOT]\3/g'
<someone[AT]somewhere[DOT]com>
# email to web
bash$ echo "<ahmad@somewhere.com>" |
> sed -e 's/<\([^>]*\)@\([^>]*\)>/www.\2\/~\1\//g'
www.somewhere.com/~ahmad/
# web to email
bash$ echo "www.somewhere.com/~ahmad/" |
> sed -e 's/www\.\([^/ ]*\)\/~\([^/ ]*\)\/\?/<\2@\1>/g'
<ahmad@somewhere.com>
# first name goes first
bash$ cat <<EOS | sed -e 's/ *\([^,]\+\), *\([^.<]\+\)\.\?/\2 \1/'
> Lastname, Firstname. <someone@somewhere.com>
> Ali, Muhammad
> Alsadi, Muayyad.
> Hacker, Jack Random.
> EOS
Firstname Lastname <someone@somewhere.com>
Muhammad Ali
Muayyad Alsadi
Jack Random Hacker
To convert from one encoding (aka. character set) to another
you could use the 'iconv' conversion tool. It has a very simple syntax:
'iconv -f FROMCODE -t TOCODE INPUT > OUTPUT'
for example
bash$ iconv -f WINDOWS-1256 -t UTF-8 arabic.html > arabic-u.html
4.2.6. Miscellaneous tools.
To display a calendar of the current month type 'cal';
it has options to display the whole year's calendar or to jump to a specific month.
The date and time can be displayed by typing 'date'.
If you don't like the default format you could specify
any format you like by using a + sign followed by the format
in 'strftime' style; refer to subsection 7.2.6 of section '7.2. Using standard C library (libc)'
(or the date manual), for example "date '+%Y-%m-%d'" or with time "date '+%Y-%m-%d %I:%M:%S%p'".
Date can do complex time arithmetic with the '-d' option, for example to find the date
some hours ago or some days ahead (it seems simple, but take the case of the day before the first of January).
For example, to display the date two days ago use "date '+%Y-%m-%d' -d '-2 days'",
or two days from now "date '+%Y-%m-%d' -d '+2 days'".
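The relative part can also follow a fixed date, which makes the tricky cases
easy to check. A minimal sketch, assuming GNU 'date' (other implementations
may not accept these '-d' strings); '-u' is added only to keep time zones
and daylight saving out of the picture:

```shell
# the day before the first of January 2004 (prints: 2003-12-31)
date -u -d '2004-01-01 -1 day' '+%Y-%m-%d'

# two days after 28 February in a leap year (prints: 2004-03-01)
date -u -d '2004-02-28 +2 days' '+%Y-%m-%d'
```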
You could also change the date and time (as root). The compact syntax,
used without any option, is 'date MMDDhhmmYYYY', for example
'date 072820452004' sets the date
to be 2004-07-28 and sets the time to be '20:45' (ie. '08:45pm').
It is more readable to use the '-s' option, which takes the new date as a string, like
"date -s '2004-07-28 20:45:00'".
The 'uptime' tool displays some useful system statistics:
the current time, followed by the uptime
(how long the system has been running since the last boot), the number of
logged-in users and the load averages for the past 1, 5, and 15 minutes.
The higher the load average, the busier the system is.
Similarly 'cat /proc/loadavg' prints the same three load averages
followed by the ratio of the number of currently runnable processes to the number of all processes,
and then the PID of the most recently created process. It could be something like '0.20 0.18 0.12 1/80 1120'.
'cat /proc/uptime' will display two times in seconds:
the first is the uptime and the other is the idle time (the sum of all the
time the system was doing nothing). The closer those numbers,
the smoother the system goes. You could calculate the ratio as a percentage with
bash$ ( cat /proc/uptime; echo "2 k 100 * r / p q" ) | dc
91.05
If you don't run X it's almost 100! The difference between uptime
and idle time is the time (in seconds) in which the system was busy. You could use it
to calculate the overall load (the busy fraction since boot):
bash$ ( cat /proc/uptime; echo "- p q" ) | dc
1052
bash$ ( cat /proc/uptime; echo "2 k r / 1 r - p q" ) | dc
.12
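The same figure can be sketched with 'awk' instead of 'dc', which some find
easier to read. 'busy_percent' is a hypothetical helper name introduced here;
it reads a line in '/proc/uptime' format from the files given as arguments,
or from standard input when there are none. Note that on machines with several
CPUs the idle time is summed over all of them, so the result is only
meaningful on a single-CPU system.

```shell
# busy_percent: hypothetical helper, prints the share of the uptime
# (first field) that was not idle (second field), as a percentage
busy_percent() {
  awk '{ printf "%.2f\n", 100 * ($1 - $2) / $1 }' "$@"
}

# made-up sample values: 1000 seconds up, 900 of them idle (prints: 10.00)
echo "1000.00 900.00" | busy_percent
```

On a Linux system 'busy_percent /proc/uptime' gives the figure for the running machine.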
To send a message that shows up on all terminals
(pseudo terminals, VTs and real serial terminals),
the same way you get 'Message to all users: the system is going down'
when you type 'halt' or when a UPS unit signals a power failure,
type 'wall', then type the message at the prompt and press
'CTRL+D'. To send it to a specific user
use the 'write' tool similarly. If those messages
are annoying, disable them with 'mesg n';
to enable them again type 'mesg y'.
Those messages are not mail and they are not saved; they are displayed only
at the time they arrive. You may use 'talk' (like chatting) or usual
email to send messages to users locally or through a network.