How to LPI-101



Goal: The objective is to show you how to pass the LPI exams. This guide covers the main topics of the LPI-101 exam, but you will still need to read the official study guide alongside it.

Exploring Linux Command-Line Tools

The following exam objectives are covered in this chapter:

 103.1 Work on the command line.
 103.4 Use streams, pipes, and redirects. 
 103.2 Process text streams using filters. 
 103.7 Search text files using regular expressions. 

Work on the command line (103.1)


When you work in command-line mode, you can choose a shell type. Many exist, so the most widely known shells are listed here:

 bash The GNU Bourne Again Shell (bash) is based on the earlier Bourne shell for Unix but extends it in several ways. In Linux, bash is the most common default shell for user 
      accounts (this is the one emphasized on the exam).
 sh   The Bourne shell, upon which bash is based, goes by the name sh. It's not often used in Linux, and the sh command is often a pointer to the bash shell or another shell.
 tcsh This shell is based on the earlier C shell (csh). It's a fairly popular shell in some circles, but no major Linux distribution makes it the default shell. Although it's similar to 
      bash in many respects, some operational details differ. For instance, you don't assign environment variables the same way in tcsh as in bash. 
 csh  The original C shell isn't used much on Linux, but if a user is familiar with csh, tcsh makes a good substitute. 
 ksh  The Korn shell (ksh) was designed to take the best features of the Bourne shell and the C shell and extend them. It has a small but dedicated following among Linux users. 
 zsh  The Z shell (zsh) takes shell evolution further than the Korn shell, incorporating features from earlier shells and adding still more. 

Be aware that there are two types of default shells. The default interactive shell is the shell program a user uses to enter commands, run programs from the command line, run shell scripts, and so on. The other default shell type is a default system shell. The default system shell is used by the Linux system to run system shell scripts, typically at startup.

The file /bin/sh is a pointer to the system's default shell, normally /bin/bash on Linux. However, be aware that on some distributions /bin/sh points to a different shell. For example, on Ubuntu and Debian, /bin/sh points to the dash shell, /bin/dash.
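You can inspect both defaults on your own system. The output varies by distribution, so treat the comments below as typical values rather than guarantees:

```shell
# Inspect the system default shell and your interactive shell.
ls -l /bin/sh        # shows which shell /bin/sh points to
echo "$SHELL"        # your default interactive shell, often /bin/bash
```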

Start your terminal, and remember that the shell is a program providing you with an interface to the Linux system. A good first command to try, uname, shows what OS is being run:

 $ uname

That's not too interesting, so try the -a option:

 $ uname -a 
 Linux game 4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64 GNU/Linux

The uname -a command provides more information, including the current Linux kernel being used (4.9.0-4) as well as the system's hostname (game). The uname command is an external command. The shell also provides internal commands. It's important to know the difference between the two command types.

Using Internal and External Commands

Internal commands are built into the shell program. Thus they are also called built-in commands. Most shells offer a similar set of internal commands, but shell-to-shell differences do exist. Internal commands that you're likely to use enable you to perform some common tasks:

 cd                    - Change the Working Directory
 pwd                   - Print the Working Directory
 echo                  - Display a Line of Text
 time                  - Time a command's execution
 set                   - Set Options
 exit & logout         - The exit and logout commands both terminate the shell

You can quickly determine if a command is a built-in command by using the type command:

 $ type pwd 
 pwd is a shell builtin
 $ type cd
 cd is a shell builtin
 $ type bash
 bash is /bin/bash

Some of these internal commands are duplicated by external commands that do the same thing. You can see whether an internal command has a duplicate external command installed by using the -a option of the type command:

 $ type -a pwd
 pwd is a shell builtin
 pwd is /bin/pwd

Performing Some Shell Command Tricks

 history            - GNU History Library

The history command provides an interface to view and manage the history. Typing history alone displays all of the commands in the history (typically the latest 500 commands). To retrieve the last command in your shell history, type !! and press Enter. This will not only show you the command you recalled but execute it as well:

 $ !!
 type -a pwd
 pwd is a shell builtin
 pwd is /bin/pwd

You can execute a command by number by typing an exclamation mark followed by its number, as in !210. Typing history -c clears the history, which can be handy if you've recently typed commands you'd rather not have discovered by others (such as a command containing a password). The bash history is stored in the .bash_history file in your home directory.

 [Retrieve a command]
 Press the UP arrow key on your keyboard. 
 [Search for a command]
 Ctrl+R to enter reverse search mode.
 Ctrl+S to search forward while in search mode. 
 [Terminate search]
 Ctrl+Q to resume terminal operations.
 Ctrl+G to terminate the search. 
 [moving within the line]
 Ctrl+A to move the cursor to the start. 
 Ctrl+E to move the cursor to the end. 
 Ctrl+B to move one character left (same as ←).
 Ctrl+F to move one character right (same as →).
 Ctrl+← to move one word at a time left (or ESC+B).
 Ctrl+→ to move one word at a time right (or ESC+F).
 [Delete text]
 Ctrl+D (or the Delete key) deletes the character under the cursor. 
 The Backspace key deletes the character to the left of the cursor.  
 Ctrl+K deletes all text from the cursor to the end of the line. 
 Ctrl+X and then backspace deletes all of the text from the cursor to the beginning of the line. 
 [Transpose Text]
 Ctrl+T transposes the character before the cursor with the character under the cursor. Pressing ESC and then T transposes the two words immediately before (or under) the cursor. 
 [Change case] 
 Pressing ESC and then U converts text from the cursor to the end of the word to uppercase. 
 Pressing ESC and then L converts text from the cursor to the end of the word to lowercase. 
 Pressing ESC and then C converts the letter under the cursor (or the first letter of the next word) to uppercase.  
 [Invoke an Editor] 
 Ctrl+X followed by Ctrl+E makes the bash shell attempt to launch the editor defined by the $FCEDIT or $EDITOR environment variable, or it launches Emacs as a last resort.

Exploring Shell Configuration

Shells, like many programs, are configured through files that hold configuration options in a plain-text format. The bash configuration files are actually bash shell scripts.

The main files are ~/.bashrc (per user) and /etc/profile (system-wide).

 $ vim ~/.bashrc
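As a taste of what such a script contains, here is a minimal ~/.bashrc fragment. The alias, editor, and directory names are illustrative examples, not defaults:

```shell
# Illustrative ~/.bashrc fragment (the names here are examples; adapt to taste).
alias ll='ls -l'            # define a shorthand for long directory listings
export EDITOR=vim           # editor used by programs that honor $EDITOR
PATH="$HOME/bin:$PATH"      # prepend a personal bin directory to the search path
```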

Using Environment Variables

Environment variables are like variables in programming languages: they hold data to be referred to by the variable name. Environment variables differ from a program's internal variables in that they're part of the program's environment, and other programs, such as the shell, can modify this environment. Programs can rely on environment variables to set information that can apply to many different programs. For instance, many text-based programs need to know the capabilities of the terminal program you use. This information is conveyed in the $TERM environment variable, which is likely to hold a value such as xterm or linux. Programs that need to position the cursor, display colored text, or perform other tasks that depend on terminal-specific capabilities can customize their output based on this information.

For example, modifying the environment variable PS1 changes your shell prompt:

 $ echo $PS1 
 [\u@\h \W]\$
 $ PS1='\[\e[1;32m\]\u@\h:\w${text}$\[\e[m\] '
 $ export PS1

You can combine these two commands into a single form:

 $ export PS1='\[\e[1;32m\]\u@\h:\w${text}$\[\e[m\] '

You can display an environment variable, for example $PATH, with echo:

 $ echo $PATH

That's a little better. Remember, the $PATH environment variable provides the shell with a directory list to search when you're entering command or program names.

You can also view the entire environment by typing env. The result is likely to be several dozen lines of environment variables and their values. To delete an environment variable, use the unset command. The command takes the name of an environment variable (without the leading $ symbol) as an argument. For instance, unset PS1 removes the $PS1 environment variable. But if you do so, you will have no shell prompt!
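The full life cycle of an environment variable can be sketched like this (MYVAR is an arbitrary example name):

```shell
# Create, export, inspect, and remove an environment variable.
MYVAR="hello"              # a shell variable; not yet in the environment
export MYVAR               # now visible to child processes
env | grep '^MYVAR='       # prints MYVAR=hello
unset MYVAR                # remove the variable entirely
echo "${MYVAR:-gone}"      # prints "gone" because MYVAR no longer exists
```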

Getting Help

Linux provides a text-based help system known as man (manual).

 $ man man     ## to see the manual page of man. 
 $ man export  ## to read more about the export command. 
 $ man builtin ## to learn more about internal commands.  

The man utility uses the less pager by default to display information.

 spacebar            to move forward a page. 
 ESC followed by v   to move back a page. 
 arrow keys          to move up or down a line at a time.
 /                   to search for text. 
 Q                   to exit. 

You aren't stuck with the less pager when using the man utility. You can change the pager by using the -P option. For example, if you decide to use the more pager instead to look up information on the uname command:

 $ man -P /bin/more uname

Occasionally, the problem arises where you can't remember the exact name of a command to look up. The man utility has an option to help you here. You can use the -k option along with a keyword or two to search through the man pages:

 $ man -k "system information"
 dumpe2fs (8)         - dump ext2/ext3/ext4 filesystem information
 neofetch (1)         - simple system information script
 sysinfo (2)          - return system information
 uname (1)            - print system information

Be aware that poor keyword choices may not produce the results you seek (this is not Google).

Note: On some older Linux distributions, you may get no results from a man utility keyword search. This is most likely due to a missing whatis database. The whatis database contains a short description of each man page, and it is necessary for keyword searches. To create or update it, type makewhatis at the prompt (as root).

Linux man pages are organized into several sections, which are summarized below. Sometimes a single keyword has entries in multiple sections. For instance, passwd has entries under both section 1 and section 5. In most cases, man returns the entry in the lowest-numbered section, but you can force the issue by preceding the keyword with the section number. For instance:

 $ man 5 passwd

Manual sections:

 Section number:          Description:
        1                 Executable programs and shell commands
        2                 System calls provided by the kernel
        3                 Library calls provided by program libraries
        4                 Device files (usually stored in /dev)
        5                 File formats
        6                 Games
        7                 Miscellaneous (macro packages, conventions, and so on)
        8                 System administration commands (programs run mostly or exclusively by root)
        9                 Kernel routines

Some programs have moved away from man pages to info pages. The basic purpose of info pages is the same as that for man pages. However, info pages use a hypertext format so that you can move from section to section of the documentation for a program.

 $ info info

There are also pages specifically for the built-in commands, called the help pages.

 $ help help

The man, info, and help utilities are intended as reference tools, not tutorials! They frequently assume basic familiarity with the command, or at least with Linux in general. For more tutorial information, you must look elsewhere, such as in books or on the Web.

Use streams, pipes, and redirects (103.4)

Streams, redirection, and pipes are some of the more powerful command-line tools in Linux. Linux treats the input to and output from programs as a stream, which is a data entity that can be manipulated. Ordinarily, input comes from the keyboard and output goes to the screen. You can redirect these input and output streams to come from or go to other sources, such as files. Similarly, you can pipe the output of one program as input into another program. These facilities can be great tools to tie together multiple programs.

Exploring File Descriptors

Linux uses file descriptors:

 Standard Input (STDIN)   :Programs accept keyboard input via STDIN. Standard input's file descriptor is 0 (zero). In most cases, this is the data that comes into the computer from a keyboard. 
 Standard Output (STDOUT) :Text-mode programs send most data to their users via STDOUT. Standard output is normally displayed on the screen, either in a full-screen text-mode session or in a 
                           GUI terminal, such as xterm. STDOUT's file descriptor is 1 (one).
 Standard Error (STDERR)  :Linux provides a second type of output stream, known as STDERR. STDERR's file descriptor is 2 (two). This output stream is intended to carry high-priority information such
                           as error messages. 

Internally, programs treat STDIN, STDOUT, and STDERR just like data files: they open them, read from or write to them, and close them when they're done. This is why the file descriptors are necessary and why they can be used in redirection.

Redirecting Input and Output

To redirect input or output, you use operators following the command, including any options it takes. For instance, to redirect the STDOUT of the echo command, you would type something like this:

 $ echo $PATH 1> path.txt
 $ cat path.txt 

The result is that the file path.txt contains the output of the command (in this case, the value of the $PATH environment variable). The operator used to perform this redirection was > and the file descriptor used to redirect STDOUT was 1 (one).

A nice feature of redirecting STDOUT is that you do not have to use its file descriptor, only the operator. Here's an example:

 $ echo $PATH > path.txt

You can see that even without the STDOUT file descriptor, the output was redirected to a file. However, the redirection operator > was still needed.

Redirection operators exist to achieve several effects:

 Redirection operator:         Effect:                                                                                                   File Descriptor needed?
 >                             Creates a new file containing STDOUT. If the specified file exists, it's overwritten.                     No  
 >>                            Appends STDOUT to the existing file. If the specified file doesn't exist, it's created.                   No
 2>                            Creates a new file containing STDERR. If the specified file exists, it's overwritten.                     Yes
 2>>                           Appends STDERR to the existing file. If the specified file doesn't exist, it's created.                   Yes
 &>                            Creates a new file containing both STDOUT and STDERR. If the specified file exists, it's overwritten.     No   
 <                             Sends the contents of the specified file to be used as STDIN.                                             No
 <<                            Accepts text on the following lines as STDIN.                                                             No
 <>                            Causes the specified file to be used for both STDIN and STDOUT.                                           No

Most of these re-directors deal with output, both STDOUT and STDERR. The most important input re-director is <, which takes the specified file's contents as STDIN.
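A quick sketch of the most common operators from the table above (the file names under /tmp are arbitrary choices for the demonstration):

```shell
# Demonstrate the main redirection operators.
echo "first"  >  /tmp/out.txt     # > creates/overwrites a file with STDOUT
echo "second" >> /tmp/out.txt     # >> appends STDOUT to the file
ls /nosuchdir 2> /tmp/err.txt     # 2> captures only STDERR
ls /nosuchdir &> /tmp/both.txt    # &> captures STDOUT and STDERR together
wc -l < /tmp/out.txt              # < feeds the file to STDIN; prints 2
```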

Tip: A common trick is to redirect STDOUT or STDERR to /dev/null. This file is a device that's connected to nothing; it's used when you want to get rid of data. For instance, if a program is generating too many unimportant error messages, you can type:

 $ program 2> /dev/null

One redirection operator that requires elaboration is the << operator. This operator implements something called a here document. A here document takes text from subsequent lines as STDIN. You might use this command in a script to pass data to an interactive program. Unlike with most redirection operators, the text immediately following the << code isn't a filename; instead, it's a word that's used to mark the end of input. For instance, typing

 $ someprog << EOF

causes someprog to accept input until it sees a line that contains only the string EOF (without even a space following it).

Note: Some programs that take input from the command line expect you to terminate input by pressing Ctrl+D. This keystroke corresponds to an end-of-file marker in the American Standard Code for Information Interchange (ASCII).
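A minimal here document, with cat standing in for an interactive program:

```shell
# Feed two lines to cat's STDIN; input stops at the line containing only EOF.
cat << EOF
first line
second line
EOF
```

Running this prints the two lines between the << operator's line and the EOF marker.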

Piping Data between Programs

Programs can frequently operate on other programs' output. For instance, you might use a text-filtering command to manipulate text output by another program. You can do this with the help of redirection operators: send the first program's STDOUT to a file, and then redirect the second program's STDIN to read from that file. This method is awkward, though, and it involves the creation of a file that you might easily overlook, leading to unnecessary clutter on your system.

The solution is to use data pipes (aka pipelines). A pipe redirects the first program's STDOUT to the second program's STDIN, and it's denoted by a vertical bar (|):

 ## Example with redirection operators
 $ cat /etc/passwd > passwd.txt
 $ grep 'root' passwd.txt
 ## Example using pipe redirection
 $ cat /etc/passwd | grep root
 ## Example for the explanation here under
 $ first | second

For instance, suppose that first generates some system statistics, such as system uptime, CPU use, number of users logged in, and so on. This output might be lengthy, so you want to trim it a bit. You might therefore use second, which could be a script or command that echoes from its STDIN only the information in which you're interested. (The grep command is often used in this role.)

Pipes can be used in sequences of arbitrary length:

 $ first | second | third | fourth | [...]

Another redirection tool often used with pipes is the tee command. This command splits STDIN so that it's displayed on STDOUT and in as many files as you specify. Typically, tee is used in conjunction with data pipes so that a program's output can be both stored and viewed immediately. For instance, to view and store the output of the echo $PATH command, you might type this:

 $ echo $PATH | tee path.txt

Notice that not only were the results of the command displayed to STDOUT, but they were also redirected to the path.txt file by the tee command. Ordinarily, tee overwrites any files whose names you specify. If you want to append data to these files, pass the -a option to tee.

 $ echo $PATH | tee -a path.txt

Generating Command Lines

Sometimes you'll find yourself needing to conduct an unusual operation on your Linux server. For instance, suppose you want to remove every file in a directory tree that belongs to a certain user. With a large directory tree, this task can be daunting! The usual file-deletion command, rm, doesn't provide an option to search for and delete every file that matches a specific criterion. One command that can do the search portion is find. This command displays all of the files that match the criteria you provide. If you could combine the output of find to create a series of command lines using rm, the task would be solved. This is precisely the purpose of the xargs command.

The xargs command builds a command from its STDIN. The basic syntax for this command is as follows:

 xargs [option] [command [initial-arguments]]

The command is the command you want to execute, and initial-arguments is a list of arguments you want to pass to the command. The options are xargs options; they aren't passed to command. When you run xargs, it runs command once for every word passed to it on STDIN, adding that word to the argument list for command. If you want to pass multiple options to the command, you can protect them by enclosing the group in quotation marks.

For instance, consider the task of deleting several files that belong to a particular user. You can do this by piping the output of find to xargs, which then calls rm:

 # find / -user Christine | xargs -d "\n" rm 

The first part of this command (find / -user Christine) finds all of the files in the directory tree / and its sub-directories that belong to user Christine. This list is then piped to xargs, which adds each input value to its own rm command. Problems can arise if filenames contain spaces because by default xargs uses both spaces and newlines as item delimiters. The -d "\n" option tells xargs to use only newlines as delimiters, thus avoiding this problem in this context. (The find command separates each found filename with a newline.)

A tool that's similar to xargs in many ways is the back-tick (`). The back-tick is not the same as the single quote character ('). Text within back-ticks is treated as a separate command whose results are substituted on the command line. For instance, to delete those user files, you can type the following command:

 # rm `find ./ -user Christine`

The back-tick solution works fine in some cases, but it breaks down in more complex situations. The reason is that the output of the back-tick-contained command is passed to the command it precedes as if it had been typed at the shell. By contrast, when you use xargs, it runs the command you specify (rm in these examples) once for each of the input items. What's more, you can't pass options such as -d "\n" to a back-tick. Thus these two examples will work the same in many cases, but not in all of them.

Note: In several shells, you can use $(). For instance, the back-tick example would be changed to:

 # rm $(find ./ -user Christine)

This command works just as well, and it is much easier to read and understand.

Process text streams using filters (103.2)

In keeping with Linux's philosophy of providing small tools that can be tied together via pipes and redirection to accomplish more complex tasks, many simple commands to manipulate text are available. These commands accomplish tasks of various types, such as combining files, transforming the data in files, formatting text, displaying text, and summarizing data.

File-Combining Commands

Combining Files with cat

The cat command's name is short for concatenate, and this tool does just that: It links together an arbitrary number of files end to end and sends the result to STDOUT. By combining cat with output redirection, you can quickly combine two files into one:

 $ cat first.txt second.txt > combined.txt
 $ cat first.txt
 Data from first file.
 $ cat second.txt  
 Data from second file. 
 $ cat combined.txt
 Data from first file.
 Data from second file.

Although cat is officially a tool for combining files, it's also commonly used to display the contents of a short file to STDOUT. If you type only one filename as an option, cat displays that file. This is a great way to review short files; but for longer files, you're better off using a full-fledged pager command, such as more or less.

You can add options to have cat perform minor modifications to the files as it combines them:

 -E or --show-ends option:        The result is a dollar sign ($) at the end of each line. 
 -n or --number option:           Adds line numbers to the beginning of every line. 
 -b or --number-nonblank option:  Is similar, but it numbers only lines that contain text. 
 -s or --squeeze-blank option:    Compresses groups of blank lines down to a single blank line. 
 -T or --show-tabs option:        Displays tab characters as ^I. 
 -v or --show-nonprinting option: Displays most control and other special characters using caret (^) and M- notations. 
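A short demonstration of a few of these options, using printf to build a small sample file (the file name is arbitrary):

```shell
# Build a three-line sample (second line blank) and view it with cat options.
printf 'alpha\n\nbeta\n' > /tmp/demo.txt
cat -n /tmp/demo.txt   # numbers all three lines, including the blank one
cat -b /tmp/demo.txt   # numbers only the two non-blank lines
cat -E /tmp/demo.txt   # marks every line end with a $ sign
```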

The tac command is similar to cat, but it reverses the order of lines in the output:

 $ cat combined.txt  
 Data from first file.
 Data from second file.
 $ tac combined.txt
 Data from second file.
 Data from first file.

Joining Files by Field with join

The join command combines two files by matching the contents of specified fields within the files. Fields are typically space-separated entries on a line. However, you can specify another character as the field separator with the -t char option, where char is the character you want to use. You can cause join to ignore case when performing comparisons by using the -i option.

The effect of join may best be understood through a demonstration:

 $ cat listing_1.txt
 555-2397 Becket, Barry
 555-5116 Carter, Gertrude
 555-7929 Jones, Theresa
 555-9871 Orwell, Samuel
 $ cat listing_2.txt
 555-2397 listed
 555-5116 unlisted
 555-7929 unlisted
 555-9871 listed
 $ join listing_1.txt listing_2.txt
 555-2397 Becket, Barry listed
 555-5116 Carter, Gertrude unlisted
 555-7929 Jones, Theresa unlisted
 555-9871 Orwell, Samuel listed

By default, join uses the first field as the one to match across files. Because listing_1.txt and listing_2.txt both place the phone number in this field, it's the key field in the output.

You can specify another field by using the -1 or -2 option to indicate the join field for the first or second file, respectively. For example:

 $ join -1 3 -2 2 cameras.txt lenses.txt  

The -o FORMAT option enables more complex specifications for the output file's format. You can consult the man page for join for even more details.

Merging Lines with paste

The paste command merges files line by line, separating the lines from each file with tabs, as shown in the following example, using listing_1.txt and listing_2.txt again:

 $ paste listing_1.txt listing_2.txt
 555-2397 Becket, Barry	555-2397 listed
 555-5116 Carter, Gertrude	555-5116 unlisted
 555-7929 Jones, Theresa	555-7929 unlisted
 555-9871 Orwell, Samuel	555-9871 listed

You can use paste to combine data from files that aren't keyed with fields suitable for use by join. Of course, to be meaningful, the files' line numbers must be exactly equivalent. Alternatively, you can use paste as a quick way to create a two-column output of textual data; however, the alignment of the second column may not be exact if the first column's line lengths aren't exactly even.

File-Transforming Commands

Many of Linux's text-manipulation commands are aimed at transforming the contents of files. These commands don't actually change a file's contents but instead send the changed contents to STDOUT. You can then pipe this output to another command or redirect it into a new file.

Note: An important file-transforming command is sed. This command is very complex.

Converting Tabs to Spaces with expand

Sometimes text files contain tabs, but the programs that need to process the files don't cope well with tabs. In such cases, you want to convert tabs to spaces. The expand command does this. By default, expand assumes a tab stop every eight characters. You can change this spacing with the -t num or --tabs=num option, where num is the tab spacing value.
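For example (the file name is an arbitrary choice):

```shell
# Convert a tab to spaces at the default and a custom tab stop.
printf 'a\tb\n' > /tmp/tabbed.txt
expand /tmp/tabbed.txt        # the tab becomes spaces up to column 9 (8-column stops)
expand -t 4 /tmp/tabbed.txt   # with 4-column tab stops instead
```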

Displaying Files in Octal with od

Some files aren't easily displayed in ASCII. For example, most graphics files, audio files, and so on use non-ASCII characters that look like gibberish. Nonetheless, you may sometimes want to display such files, particularly if you want to investigate the structure of a data file.

In such cases, od (whose name stands for octal dump) can help. For instance, consider listing_1.txt as parsed by od:

 $  od listing_1.txt 
 0000000 032465 026465 031462 033471 041040 061545 062553 026164
 0000020 041040 071141 074562 032412 032465 032455 030461 020066
 0000040 060503 072162 071145 020054 062507 072162 072562 062544
 0000060 032412 032465 033455 031071 020071 067512 062556 026163
 0000100 052040 062550 062562 060563 032412 032465 034455 033470
 0000120 020061 071117 062567 066154 020054 060523 072555 066145
 0000140 000012

The first field on each line is an index into the file in octal. For instance, the second line begins at octal 20 (16 base 10) bytes into the file. The remaining numbers on each line represent the bytes in the file. This type of output can be difficult to interpret unless you're well versed in octal notation and perhaps in the ASCII code. Although od is nominally a tool for generating octal output, it can generate many other output formats, such as hexadecimal (base 16), decimal (base 10), and even ASCII with escaped control characters. Consult the man page for od for details on creating these variants.

Sorting Files with sort

Sometimes you'll create an output file that you want sorted. To do so, you can use a command that's called, appropriately enough, sort. This command can sort in several ways, including the following:

 -f or --ignore-case option causes sort to ignore case. 
 -M or --month-sort option causes the program to sort by three-letter month abbreviation (JAN through DEC). 
 -n or --numeric-sort option sorts by number. 
 -r or --reverse option sorts in reverse order. 
 -k field or --key=field option specifies the sort field; by default, sort uses the first field. 

As an example, suppose you want to sort listing_1.txt by first name. You could do so:

 $ sort -k 3 listing_1.txt 
 555-2397 Becket, Barry
 555-5116 Carter, Gertrude
 555-9871 Orwell, Samuel
 555-7929 Jones, Theresa

The sort command supports a large number of additional options, many of them quite exotic. Consult sort's man page for details.
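The difference between lexical and numeric sorting is easy to see with a few numbers:

```shell
# Lexical vs. numeric sorting of the same input.
printf '10\n2\n33\n' | sort      # lexical order: 10, 2, 33
printf '10\n2\n33\n' | sort -n   # numeric order: 2, 10, 33
printf '10\n2\n33\n' | sort -nr  # numeric order, reversed: 33, 10, 2
```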

Breaking a File into Pieces with split

The split command can split a file into two or more files. Unlike most of the text-manipulation commands described here, this command requires you to enter an output filename, or, more precisely, an output filename prefix, to which an alphabetic code is added. You must also normally specify how large you want the individual files to be:

 -b size or --bytes=size option breaks the input file into pieces of size bytes. This option can have the usually undesirable consequence of splitting the file mid-line.
 -C size or --line-bytes=size option breaks a file into pieces of no more than a specified size without breaking lines across files.
 -l lines or --lines=lines option splits the file into chunks with no more than the specified number of lines. 

As an example, consider breaking listing_1.txt into two parts by number of lines:

 $ split -l 2 listing_1.txt numbers

The result is two files, numbersaa and numbersab, which together hold the original contents of listing_1.txt. If you don't specify an input filename, split uses standard input.

Translating Characters with tr

The tr command changes individual characters from STDIN. Its syntax is as follows:

 tr [option] SET1 [SET2]

You specify the characters you want replaced as one group (SET1) and the characters to replace them with as a second group (SET2). Each character in SET1 is replaced with the one at the equivalent position in SET2. Here's an example using listing_1.txt:

 $ tr BCJ bc < listing_1.txt  
 555-2397 becket, barry
 555-5116 carter, Gertrude
 555-7929 cones, Theresa
 555-9871 Orwell, Samuel

Note: the tr command relies on STDIN, which is the reason for the input redirection (<) in this example. This is the only way to pass the command a file. Also notice that SET2 (bc) is shorter than SET1 (BCJ); tr pads SET2 by repeating its last character, which is why J is replaced by c in the output.

You can use the -d option, which causes the program to delete the characters from SET1. When using -d, you omit SET2 entirely. The tr command also accepts a number of shortcuts, such as [:alnum:] (all numbers and letters), [:upper:] (all uppercase letters), [:lower:] (all lowercase letters), and [:digit:] (all digits). You can specify a range of characters by separating them with dashes (-), as in A-M. Consult the tr man pages for a complete list of these shortcuts.
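A few of these shortcuts in action:

```shell
# Character classes, deletion, and ranges with tr.
echo 'Hello 123' | tr '[:lower:]' '[:upper:]'   # prints: HELLO 123
echo 'Hello 123' | tr -d '[:digit:]'            # digits removed; the space remains
echo 'abcdef' | tr 'a-c' 'x'                    # prints: xxxdef
```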

Converting Spaces to Tabs with unexpand

The unexpand command is the logical opposite of expand; it converts multiple spaces to tabs. This can help compress the size of files that contain many spaces and can be helpful if a file is to be processed by a utility that expects tabs in certain locations.

Like expand, unexpand accepts the -t num or --tabs=num option, which sets the tab spacing to once every num characters. If you omit this option, unexpand assumes a tab stop every eight characters.

Deleting Duplicate Lines with uniq

The uniq command removes duplicate lines. It's most likely to be useful if you've sorted a file and don't want duplicate items. For instance, suppose you want to summarize Shakespeare's vocabulary. You might create a file with all of the Bard's works, one word per line. You can then sort this file using sort and pass it through uniq. Using a shorter example file containing the text to be or not to be, that is the question (one word per line), the result looks like this:

 $ sort shakespeare.txt | uniq

Note that the words to and be, which appeared in the original file twice, appear only once in the uniq-processed version.
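
The example can be reproduced with a short word list; shakespeare.txt here contains only six words (punctuation omitted for simplicity):

```shell
cd "$(mktemp -d)"

# One word per line, duplicates included.
printf 'to\nbe\nor\nnot\nto\nbe\n' > shakespeare.txt

# sort groups the duplicates; uniq then removes adjacent repeats.
sort shakespeare.txt | uniq      # be, not, or, to (one per line)
```

Because uniq removes only adjacent duplicate lines, the sort step is essential.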

File-Formatting Commands

The next three commands —— fmt, nl, and pr —— reformat the text in a file. The first of these is designed to reformat text files, such as when a program's README documentation file uses lines that are too long for your display. The nl command numbers the lines of a file, which can be helpful in referring to lines in documentation or correspondence. Finally, pr is a print-processing tool; it formats a document in pages suitable for printing.

Reformatting Paragraphs with fmt

Sometimes text files arrive with outrageously long line lengths, irregular line lengths, or other problems. Depending on the difficulty, you may be able to cope simply by using an appropriate text editor or viewer to read the file. If you want to clean up the file a bit, though, you can do so with fmt. If called with no options (other than the input filename, if you're not having it work on STDIN), the program attempts to clean up paragraphs, which it assumes are delimited by two or more blank lines or by changes in indentation. The new paragraph formatting defaults to paragraphs that are no more than 75 characters wide. You can change this with the -width, -w width, or --width=width options, which set the line length to width characters.
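
As a quick sketch, fmt can rewrap a long line to a chosen width (the exact line breaks depend on fmt's fill algorithm, so none are shown here):

```shell
# Rewrap a long "paragraph" to lines of at most 20 characters.
echo "the quick brown fox jumps over the lazy dog" | fmt -w 20
```

Every output line stays within the 20-character limit, and the word order is preserved.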

Numbering Lines with nl

As described earlier, in "Combining Files with cat", you can number the lines of a file with that command. The cat line-numbering options are limited, though, if you need to do complex line numbering. The nl command is the tool to use in this case. In its simplest form, you can use nl alone to accomplish much the same goal as cat -b achieves: numbering all the non-blank lines in a file. You can add many options to nl to achieve various special effects:

 -b style or --body-numbering=style You can set the numbering style for the bulk of the lines, where style is a style format code. 
 -h style or --header-numbering=style / -f style or --footer-numbering=style If the text is formatted for printing and has headers or footers, you can set the style for these elements. 
 -d code or --section-delimiter=code Some numbering schemes reset the line numbers for each page or section. You can tell nl how to identify a new section with this option. 
 -n format or --number-format=format You can specify the numbering format, where format is ln (left justified, no leading zeros), rn (right justified, no leading zeros), 
                                             or rz (right justified with leading zeros).

Styles used by nl:

 Style code       Description
 t                The default behavior is to number lines that aren't empty. You can make this default explicit by using a style code of t.
 a                This style code causes all lines to be numbered, including empty lines. 
 n                This style code causes all line numbers to be omitted, which may be desirable for headers and footers. 
 pREGEXP          This option causes only lines that match the specified regular expression (REGEXP) to be numbered.   

As an example, suppose you've created a script, buggy, but you find that it's not working as you expect. When you run it, you get an error message that refers to line numbers, so you want to create a version of the script with lines that are numbered for easy reference. You can do so by calling nl with the option to number all lines, including blank lines (-b a):

 $ nl -b a buggy > numbered-buggy.txt
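
The numbering styles are easy to compare on a three-line sample containing a blank line:

```shell
# Default (-b t): the blank line is skipped when numbering.
printf 'first\n\nthird\n' | nl

# -b a numbers every line, including the blank one.
printf 'first\n\nthird\n' | nl -b a
```

With -b a, the blank line receives number 2, so the numbering matches what an editor or error message reports.
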

Preparing a File for Printing with pr

If you want to print a plain-text file, you may want to prepare it with headers, footers, page breaks, and so on. The pr command was designed to do this. In its most basic form, you pass the command a file:

 $ pr myfile.txt

The result is text formatted for printing on a line printer —— that is, pr assumes an 80-character line length in a monospaced font. Of course, you can also use pr in a pipe, either to accept input piped from another program or to pipe its output to another program. (The recipient program might be lpr, which is used to print files.) By default, pr creates output that includes the original text with headers, which list the current date and time, the original filename, and the page number. You can tweak the output format in a variety of ways, including the following:

 -numcols or --columns=numcols option creates output with numcols columns. For ex.: if you typed pr -3 myfile.txt, the output would be displayed in three columns. 
 -d or --double-space option causes double-spaced output from a single-spaced file. 
 -F, -f or --form-feed option causes pr to output a form-feed character between pages. This works better with some printers.
 -l lines  or --length=lines  option sets the length of the page in lines. 
 -h text or --header=text option sets the text to be displayed in the header, replacing the filename. To specify a multi-word string, enclose it in quotes ("").
 -t or --omit-header option omits the header entirely.
 -o chars or --indent=chars option sets the left margin to chars characters. This margin size is added to the page width, which defaults to 72 characters. 
 -w chars or --width=chars option sets the page width (default is 72 characters).

These options are just the beginning; pr supports many more, which are described in its man page. As an example of pr in action, consider printing a double-spaced and numbered version of a configuration file (say /etc/profile) for your reference. You can do this by piping together cat and its -n option to generate numbered output, pr and its -d option to double-space the result, and lpr to print the file:

 $ cat -n /etc/profile | pr -d | lpr 

The result should be a printout that might be handy for taking notes on the configuration file. One caveat, though: if the file contains lines that approach or exceed 80 characters in length, the result can be single lines that spill across two lines, disrupting the page boundaries. As a workaround, you can set a somewhat short page length with -l and use -f to ensure that the printer receives form feeds after each page:

 $ cat -n /etc/profile | pr -dfl 50 | lpr

Tip: the pr command was built for the printer capabilities of the 1980s. It is still useful, but genscript (GNU Enscript) can take better advantage of modern printer features.

File-Viewing Commands

Sometimes you just want to view a file or part of a file. A few commands can help you accomplish this goal without loading the file into a full-fledged editor.

Viewing the Starts of Files with head

Sometimes all you need to do is see the first few lines of a file. This may be enough to identify what a mystery file is, for instance; or you may want to see the first few entries of a log file to determine when that file was started. You can accomplish this goal with the head command, which echoes the first 10 lines of a file to STDOUT.

 -c num or --bytes=num option tells head to display num bytes from the file rather than the default 10 lines. 
 -n num or --lines=num option changes the number of lines displayed. 

Viewing the Ends of Files with tail

The tail command works just like head, except that tail displays the last 10 lines of a file. The tail command supports several options that aren't present in head and that enable the program to handle additional duties, including the following:

 -f or --follow option tells tail to keep the file open and to display new lines as they're added (this feature is helpful for tracking log activity). 
 --pid=pid option tells tail to terminate tracking (as initiated by -f) once the process with the specified process ID (PID) terminates. 
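
Both commands are easy to demonstrate with seq, which prints a sequence of numbers:

```shell
cd "$(mktemp -d)"
seq 1 20 > sample.txt     # twenty lines: 1 through 20

head -n 3 sample.txt      # 1, 2, 3 (one per line)
tail -n 2 sample.txt      # 19, 20
```

The -f option is best tried interactively on a live log file, such as one under /var/log, since it waits for new lines to arrive.
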
Paging through Files with less

The less command's name is a joke; it's a reference to the more command, which was an early file pager. The idea was to create a better version of more, so the developer called it less ("less is more").

The idea behind less (and more, for that matter) is to enable you to read a file a screen at a time. When you type less filename, the program displays the first few lines of filename. You can then page back and forth through the file:

 * Pressing the spacebar moves forward through the file a screen at a time. 
 * Pressing Esc followed by V moves backward through the file a screen at a time. 
 * The up and down arrow keys move up and down through the file a line at a time. 
 * The / key followed by a search term searches forward in the file; pressing Enter moves to the first occurrence of the term (typing n alone repeats the search forward, while typing N repeats the search backward). 
 * The ? key searches backward in the file. 
 * A line number followed by g moves you to the specified line.
 * The q key exits the program. 

Unlike most of the programs described here, less can't readily be used in a pipe, except as the final command in the pipe.

File-Summarizing Commands

The final text-filtering commands described here are used to summarize text in one way or another. The cut command takes segments of an input file and sends them to STDOUT, while the wc command displays some basic statistics on the file.

Extracting Text with cut

The cut command extracts portions of input lines and displays them on STDOUT. You can specify what to cut from input lines in several ways:

 -b list or --bytes=list option cuts the specified list of bytes from the input file. 
 -c list or --characters=list option cuts the specified list of characters from the input file. In practice, this method and the by-byte method usually produce identical results. 
 -f list or --fields=list option cuts the specified list of fields from the input file. By default, fields are delimited by tabs.
 -d char or --delimiter=char option changes the field delimiter used by the -f option from the default tab to char.
 -s or --only-delimited option tells cut to print only lines that contain the delimiter character, rather than echoing lines without it unchanged. 

Many of these options take a list, which is a way to specify multiple bytes, characters, or fields. You make this specification by number. It can be a single number (such as 4), a closed range of numbers (such as 2-4), or an open range of numbers (such as -4 or 4-). In this final case, all bytes, characters, or fields from the beginning of the line to the specified number (or from the specified number to the end of the line) are included in the list. The cut command is frequently used in scripts to extract data from some other command's output.
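
A couple of self-contained examples show the field and character forms of list (the /etc/passwd-style line is just sample input):

```shell
# Fields 1 and 3 of a colon-delimited line; cut keeps the delimiter between them.
echo "root:x:0:0:root:/root:/bin/bash" | cut -d ":" -f 1,3     # root:0

# A closed range of characters.
echo "abcdef" | cut -c 2-4                                     # bcd
```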

For ex.: suppose you're writing a script and the script needs to know the hardware address of your Ethernet adapter. This information can be obtained from the ip a s command:

 $ ip a s enp0s31f6
 2: enp0s31f6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
     link/ether 38:d5:47:1b:ae:b4 brd ff:ff:ff:ff:ff:ff
     inet brd scope global dynamic enp0s31f6
        valid_lft 4733sec preferred_lft 4733sec
     inet6 fe80::3ad5:47ff:fe1b:aeb4/64 scope link 
        valid_lft forever preferred_lft forever

Unfortunately, most of this information is extraneous for the desired purpose. The hardware address is the 6-byte hexadecimal number following link/ether. To extract that data, you can combine grep with cut in a pipe:

 $ ip a s enp0s31f6 | grep "link/ether" | cut -d " " -f 6

Of course, in a script, you would probably assign this value to a variable or otherwise process it through additional pipes.

Obtaining a Word Count with wc

The wc command produces a word count, as well as line and byte counts, for a file:

 $ wc file.txt
 308  2343  15534 file.txt

This file contains 308 lines, 2343 words, and 15534 bytes. You can limit the output to the newline count, the word count, the byte count, or the character count with the --lines (-l), --words (-w), --bytes (-c), or --chars (-m) option, respectively. You can also learn the maximum line length with the --max-line-length (-L) option.
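
The per-count options can be checked on a small sample:

```shell
cd "$(mktemp -d)"
printf 'one two\nthree four five\n' > count.txt

wc count.txt          # lines, words, bytes, then the filename
wc -l < count.txt     # 2
wc -w < count.txt     # 5
```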

Search text files using regular expressions (103.7)

Many Linux programs employ regular expressions, which are tools for describing or matching patterns in text. Regular expressions are similar in principle to the wildcards that can be used to specify multiple filenames. At their simplest, regular expressions can be plain text without adornment. However, certain characters are used to denote patterns. Because of their importance, regular expressions are described in the following section.

Two programs that make heavy use of regular expressions, grep and sed, are also covered. These programs search for text within files and permit editing of files from the command line, respectively.

Understanding Regular Expressions

Two forms of regular expression are common: basic and extended. Which form you must use depends on the program. Some accept one form or the other, but others can use either type, depending on the options passed to the program. (Some programs use their own minor or major variants on either of these classes of regular expression.) The differences between basic and extended regular expressions are complex and subtle, but the fundamental principles of both are similar. The simplest type of regular expression is an alphabetic string, such as Linux or HWaddr. These regular expressions match any string of the same size or longer that contains the regular expression. For instance, the HWaddr regular expression matches HWaddr, this is the HWaddr, and the HWaddr is unknown. The real strength of regular expressions comes in the use of non-alphabetic characters, which activate advanced matching rules:

 Bracket Expressions:       Characters enclosed in square brackets ([]) constitute bracket expressions, which match any one character within the brackets. For instance, the regular expression
                            b[aeiou]g matches the words bag, beg, big, bog, and bug. 
 Range Expressions:         A range expression is a variant of a bracket expression. Instead of listing every character that matches, range expressions list the start and end points separated by 
                            a dash (-), as in a[2-4]. This regular expression matches a2z, a3z, and a4z. 
 Any single Character:      The dot (.) represents any single character except a newline. For instance, a.z matches a2z, abz, aQz, or any other three-character string that begins with a and ends with z. 
 Start and End of Line:     The caret (^) represents the start of a line, and the dollar sign ($) denotes the end of a line. 
 Repetition Operators:      A full or partial regular expression may be followed by a special symbol to denote how many times a matching item must exist. Specifically, an asterisk (*) denotes zero or 
                            more occurrences, a plus sign (+) matches one or more occurrences, and a question mark (?) specifies zero or one match. The asterisk is often combined with the dot (as in .*) 
                            to specify a match with any sub-string. For instance, A.*Lincoln matches any string that contains A and Lincoln, in that order —— Abe Lincoln and Abraham Lincoln are 
                            just two possible matches.
 Multiple Possible Strings: The vertical bar (|) separates two possible matches; for instance, car|truck matches either car or truck. 
 Parentheses:               Ordinary parentheses (()) surround sub-expressions. Parentheses are often used to specify how operators are to be applied; for ex.: you can put parentheses around a group of words
                            that are concatenated with the vertical bar to ensure that the words are treated as a group, any one of which may match, without involving surrounding parts of the regular expression.  
 Escaping:                  If you want to match one of the special characters, such as a dot, you must escape it —— that is, precede it with a backslash (\). For instance, to match a computer hostname
                            (say, twain.example.com), you must escape the dots, as in twain\.example\.com. 

The preceding descriptions apply to extended regular expressions. Some details are different for basic regular expressions. In particular, the ?, +, |, and () symbols lose their special meanings. To perform the tasks handled by these characters, some programs, such as grep, enable you to recover the functions of these characters by escaping them (say, using \| instead of |). Whether you use basic or extended regular expressions depends on which form the program supports. For programs such as grep, which support both, you can use either. Which form you choose is mostly a matter of personal preference.

Regular expression rules can be confusing, particularly when you're first introduced to them. Some examples of their use, in the context of the programs that use them, will help. The next couple of sections provide such examples.
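
A few of the rules above can be tried directly with grep (basic form) and grep -E (extended form):

```shell
# Bracket expression: any one of the listed vowels.
printf 'bag\nbeg\nbxg\n' | grep 'b[aeiou]g'        # bag, beg

# Extended operators such as | need grep -E.
printf 'car\ntruck\nbus\n' | grep -E 'car|truck'   # car, truck

# Anchors plus a repetition operator: only lines made entirely of digits.
printf '123\n12a\n' | grep -E '^[0-9]+$'           # 123
```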

Using grep

The grep command is extremely useful. It searches for files that contain a specified string and returns the name of the file and (if it's a text file) a line of context for that string. The basic grep syntax is as follows:

 grep [options] regexp [files]

The regexp is a regular expression, as just described. The grep command supports a large number of options. Some of the common options enable you to modify the way the program searches files:

 -c or --count option displays the number of lines that match the specified pattern instead of displaying the matching lines themselves. 
 -f file or --file=file option takes pattern input from the specified file rather than from the command line. 
 -i or --ignore-case option performs a search that isn't case sensitive. 
 -r or --recursive option searches in the specified directory and all its sub-directories rather than simply the specified directory. You can use rgrep rather than specifying this option.
 -F or --fixed-strings option turns off the grep command's use of regular expressions and treats the pattern as a fixed string instead. Alternatively, you can use fgrep rather than grep.
 -E or --extended-regexp option turns on extended regular expressions. Alternatively, you can use egrep rather than grep.  

A simple example of grep uses a regular expression with no special components:

 $ grep -r eth0 /etc/*

This example finds all the files in /etc that contain the string eth0 (the identifier for the first wired Ethernet device on most Linux distributions). Because the example includes the -r option, it searches recursively, so files in sub-directories of /etc are examined in addition to those in /etc itself. For each matching text file, the line that contains the string is printed.

Suppose you want to locate all the files in /etc that contain the string eth0 or eth1. You can enter the following command, which uses a bracket expression to specify both variant devices:

 $ grep eth[01] /etc/*

A still more complex example searches all files in /etc that contain the hostname twain.example.com or bronto.pangaea.edu and, later on the same line, the number 127. This task requires using several of the regular expression features. Expressed using extended regular expression notation, the command looks like this:

 $ grep -E "(twain\.example\.com|bronto\.pangaea\.edu).*127" /etc/*

This command illustrates another feature you may need to use: shell quoting. Because the shell uses certain characters, such as the vertical bar (|) and the asterisk (*), for its own purposes, you must enclose certain regular expressions in quotes lest the shell attempt to parse the regular expression and pass a modified version of what you type to grep. You can use grep in conjunction with commands that produce a lot of output in order to sift through that output for the material that's important to you. For ex.: suppose you want to find the process ID (PID) of a running xterm. You can use a pipe to send the result of a ps command through grep:

 $ ps ax | grep xterm

The result is a list of all running processes called xterm, along with their PIDs. You can even do this in series, using grep to restrict the output further on some other criterion, which can be useful if the initial pass still produces too much output.
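
Both techniques (counting matches with -c, and chaining greps to narrow output) can be sketched with printf standing in for real command output:

```shell
# -c reports how many lines match instead of printing them.
printf 'xterm\nbash\nxterm\n' | grep -c xterm               # 2

# Chained greps narrow the output step by step.
printf 'eth0 up\neth1 down\nlo up\n' | grep eth | grep up   # eth0 up
```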

Using sed

The sed command directly modifies a file's contents, sending the changed file to STDOUT. Its syntax can take one of two forms:

 sed [options] -f script-file [input-file]
 sed [options] script-text [input-file] 

In either case, input-file is the name of the file you want to modify. (Modifications are temporary unless you save them in some way, as illustrated shortly.) The script (script-text or the contents of script-file) is the set of commands you want sed to perform. When you pass a script directly on the command line, the script-text is typically enclosed in single quote marks (''). The following table summarizes a few sed commands that you can use in its scripts:

 Command                Addresses    Meaning
 =                        0 or 1     Display the current line number.
 a\text                   0 or 1     Append text to the file. 
 i\text                   0 or 1     Insert text into the file. 
 r filename               0 or 1     Append text from filename into the file. 
 c\text                   Range      Replace the selected range of lines with the provided text. 
 s/regexp/replacement     Range      Replace text that matches the regular expression (regexp) with the replacement. 
 w filename               Range      Write the current pattern space to the specified file.
 q                        0 or 1     Immediately quit the script, but print the current pattern space. 
 Q                        0 or 1     Immediately quit the script. 

This is of course incomplete; sed is quite complex, and this section merely introduces this tool.

The Addresses column requires elaboration: sed commands operate on addresses, which are line numbers. Commands may take no addresses, in which case they operate on the entire file. If one address is specified, they operate on the specified line. If two addresses (a range) are specified, the commands operate on that range of lines, inclusive.

In operation, sed looks something like this:

 $ sed 's/2012/2013/' cal-2012.txt > cal-2013.txt

This command processes the input file, cal-2012.txt, using the sed s command to replace the first occurrence of 2012 on each line with 2013. (If a single line may have more than one instance of the search string, you must perform a global search by appending g to the command string, as in s/2012/2013/g.) By default, sed sends the modified file to STDOUT, so this example uses redirection to send the output to cal-2013.txt. The idea in this example is to convert a file created for the year 2012 quickly so that it can be used in 2013. If you don't specify an input filename, sed works from STDIN, so it can accept the output of another command as its input.
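
The first-occurrence behavior, the g flag, and line addressing can all be seen on small inputs:

```shell
# Without g, only the first match on each line is replaced.
echo "2012 was fine, but 2012 ended" | sed 's/2012/2013/'    # 2013 was fine, but 2012 ended

# With g, every match on the line is replaced.
echo "2012 was fine, but 2012 ended" | sed 's/2012/2013/g'   # 2013 was fine, but 2013 ended

# An address restricts the command to a specific line (here, line 2).
printf 'one\ntwo\nthree\n' | sed '2s/two/TWO/'
```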

Although it's conceptually simple, sed is a very complex tool; even a modest summary of its capabilities would fill a chapter. You can consult its man page for basic information, but to understand sed fully, you may want to consult a book that tackles this tough subject, such as Linux Command Line and Shell Scripting Bible 3rd Edition.

Managing Software

The following exam objectives are covered in this chapter:

 102.5 Use RPM and Yum package management. 
 102.4 Use Debian package management.
 102.3 Manage shared libraries.
 103.5 Create, monitor, and kill processes.
 103.6 Modify process execution priorities. 

Package Concepts

Packages: The most basic information that package systems maintain is information about software packages —— that is, collections of files that are installed on the computer. Packages are usually distributed as single files that are similar to tarballs (archives created with the tar utility and usually compressed with gzip or bzip2) or zip files. Once installed, most packages consist of dozens or hundreds of files, and the package system tracks them all. Packages include additional information that aids in the subsequent duties of package management systems.

Installed File Database: Package systems maintain a database of installed files. The database includes information about every file installed via the package system, the name of the package to which each of those files belongs, and associated additional information.

Dependencies: One of the most important types of information maintained by the package system is dependency information —— that is, the requirements of packages for one another. For instance, if SuperProg relies on UltraLib to do its work, the package database records this information. If you attempt to install SuperProg when UltraLib isn't installed, the package system won't let you do so. Similarly, if you try to uninstall UltraLib when SuperProg is installed, the package system won't let you. (You can override these prohibitions. Doing so is usually inadvisable, though.)

Checksums: The package system maintains checksums and assorted ancillary information about files. This information can be used to verify the validity of the installed software. This feature has its limits, though; it's intended to help you spot disk errors, accidental overwriting of files, or other non-sinister problems. It's of limited use in detecting intrusions because an intruder could use the package system to install altered system software.

Upgrades and Uninstallation: By tracking files and dependencies, package systems permit easy upgrades and uninstallation: Tell the package system to upgrade or remove a package, and it will replace or remove every file in the package. Of course, this assumes that the upgrade or uninstallation doesn't cause dependency problems; if it does, the package system will block the operation unless you override it.

Binary Package Creation: Both the RPM and Debian package systems provide tools to help create binary packages (those that are installed directly) from source code. This feature is particularly helpful if you're running Linux on a peculiar CPU: you can download source code and create a binary package, even if the developers didn't provide explicit support for your CPU. Creating a binary package from source has advantages over compiling software from source in more conventional ways, because you can use the package management system to track dependencies, attend to individual files, and so on.

Both RPM and Debian package systems provide all of these basic features, although the details of their operation differ. These two package systems are incompatible with one another in the sense that their package files and their installed file databases are different; that is, you can't directly install an RPM package on a Debian-based system or vice versa. (Tools to convert between formats do exist, and developers are working on ways to integrate the two package formats better.)

Warning: Most distributions install just one package system. It's possible to install more than one, though, and some programs (such as alien) require both for full functionality. Actually using both systems to install software is inadvisable because their databases are separate.

Use RPM and Yum package management (102.5)

Note: Red Hat has splintered into three distributions: Fedora is the downloadable version favored by home users, students, and businesses on a tight budget. The Red Hat name is now reserved for the for-pay version of the distribution, known more formally as Red Hat Enterprise Linux (RHEL). CentOS is a freely redistributable version intended for enterprise users.

The convention for naming RPM packages is as follows:

 packagename-a.b.c-x.arch.rpm

Each of the filename components has a specific meaning:

Package Name: The first component (packagename) is the name of the package, such as samba or samba-server for the Samba file and printer server. Note that the same program may be given different package names by different distribution maintainers.

Version Number: The second component (a.b.c) is the package version number, such as 3.6.5. The version number doesn't have to be three numbers separated by periods, but that's the most common form. The program author assigns the version number.

Build Number: The number following the version (x) is the build number (also known as the release number). This number represents minor changes made by the package maintainer, not by the program author. These changes may represent altered startup scripts or configuration files, changed file locations, added documentation, or patches appended to the original program to fix bugs or to make the program more compatible with the target Linux distribution. Many distribution maintainers add a letter code to the build number to distinguish their packages from those of others. Note that these numbers are not comparable across package maintainers.

Architecture: The final component preceding the .rpm extension (arch) is a code for the package's architecture. The i386 architecture code is common; it represents a file compiled for any x86 CPU from the 80386 onward. Some packages include optimizations for Pentiums or newer (i586 or i686), and non-x86 binary packages use codes for their CPUs, such as ppc for PowerPC CPUs or x86_64 for the x86_64 platform. Scripts, documentation, and other CPU-independent packages generally use the noarch architecture code. The main exception to this rule is source RPMs, which use the src architecture code.

  • Distributions may use different versions of the RPM utilities. This problem can completely prevent an RPM from one distribution from being used on another.
  • An RPM package designed for one distribution may have dependencies that are unmet in another distribution. A package may require a newer version of a library than is present on the distribution you're using, for instance. You can usually overcome this problem by installing or upgrading the package dependencies, but sometimes doing so causes problems because the upgrade may break other packages. By rebuilding the package you want to install from a source RPM, you can often work around these problems, but sometimes the underlying source code also needs the upgraded libraries.
  • An RPM package may be built to depend on a package of a particular name, such as samba-client depending on samba-common, but if the distribution you're using has named the package differently, the rpm utility will object. You can override this objection by using the --nodeps switch, but sometimes the package won't work once installed. Rebuilding from a source RPM may or may not fix this problem.
  • Even when a dependency appears to be met, different distributions may include slightly different files in their packages. For this reason, a package meant for one distribution may not run correctly when installed on another distribution. Sometimes installing an additional package will fix this problem.
  • Some programs include distribution-specific scripts or configuration files. This problem is particularly acute for servers, which may include startup scripts that go in /etc/rc.d/init.d or elsewhere. Overcoming this problem usually requires that you remove the offending script after installing the RPM and either start the server in some other way or write a new startup script, perhaps modeled after one that came with some other server for your distribution.

The rpm Command Set

The main RPM utility is known as rpm. Use this program to install or upgrade a package at the shell prompt. The rpm command has the following syntax:

 rpm [operation] [options] [package-file|package-names]

The most common rpm operations are:

 -i           Installs a package; system must not contain a package of the same name. 
 -U           Installs a new package or upgrades an existing one
 -F           Or --freshen. Upgrades a package only if an earlier version already exists. 
 -q           Queries a package: finds whether a package is installed, what files it contains, and so on.  
 -V           Or --verify. Verifies a package: checks that its files are present and unchanged since installation. 
 -e           Uninstalls a package. 
 -b           Builds a binary package, given source code and configuration files; moved to the rpmbuild program with RPM version 4.2.
 --rebuild    Builds a binary package, given a source RPM file; moved to the rpmbuild program with RPM version 4.2.
 --rebuilddb  Rebuilds the RPM database to fix errors.  

The most common rpm options are:

 Option                 Used with operations  Description
 --root dir             Any                   Modifies the Linux system having a root directory located at dir. This option can be used to maintain one Linux installation discrete from another one.
 --force                -i, -U, -F            Forces installation of a package even when it means overwriting existing files or packages.
 -h or --hash           -i, -U, -F            Displays a series of hash marks (#) to indicate the progress of the operation. 
 -v                     -i, -U, -F            Used in conjunction with the -h option to produce a uniform number of hash marks for each package. 
 --nodeps               -i, -U, -F, -e        Specifies that no dependency checks be performed. Installs or removes the package even if it relies on a package or file that's not present or is required by a 
                                              package that's not being uninstalled.
 --test                 -i, -U, -F            Checks for dependencies, conflicts, and other problems without actually installing the package. 
 --prefix path          -i, -U, -F            Sets the installation directory path (works only for some packages). 
 -a or --all            -q, -V                Queries or verifies all packages. 
 -f file or --file file -q, -V                Queries or verifies the package that owns the file. 
 -p package-file        -q                    Queries the uninstalled RPM package-file. 
 -i                     -q                    Displays package information, including the package maintainer, a short description, and so on. 
 -R or --requires       -q                    Displays the packages and files on which this one depends. 
 -l or --list           -q                    Displays the files contained in the package. 

To use rpm, you combine one operation with one or more options. In most cases, you include one or more package names or package filenames as well. You can issue the rpm command once for each package, or you can list multiple packages, separated by spaces, on the command line. The latter is often preferable when you're installing or removing several packages, some of which depend on others in the group. Issuing separate commands in this situation requires that you install the depended-on package first or remove it last, whereas issuing a single command allows you to list the packages on the command line in any order.

Some operations require that you give a package filename, and others require a package name. In particular, -i, -U, -F, and the rebuild operations require package filenames; -q, -V, and -e normally take a package name, although the -p option can modify a query (-q) operation to work on a package filename.
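The name-versus-filename distinction can be captured in a tiny helper. This is an illustrative sketch (the helper name query_cmd is made up); it only prints the rpm command it would run, so it is safe to execute on any system:

```shell
# Pick the right rpm query form: -q expects an installed package NAME,
# while adding -p makes it inspect an uninstalled package FILE.
query_cmd() {
  case "$1" in
    *.rpm) echo "rpm -qpi $1" ;;   # package file: query with -p
    *)     echo "rpm -qi $1"  ;;   # installed package name
  esac
}
query_cmd samba
query_cmd samba-4.1.9-4.fc20.x86_64.rpm
```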

When you're installing or upgrading a package, the -U operation is generally the most useful because it enables you to install the package without manually uninstalling the old one. This one-step operation is particularly helpful when packages contain many dependencies; rpm detects these and can perform the operation should the new package fulfill the dependencies provided by the old one.

To use rpm to install or upgrade a package from an RPM file that you have already downloaded to your local system, issue a command similar to the following:

 # rpm -Uvh samba-4.1.9-4.fc20.x86_64.rpm

You can also use rpm -ivh in place of rpm -Uvh if you don't already have the samba package installed.

Verify that the package is installed with the rpm -qi command, which displays information such as when and on what computer the binary package was built:

 $ rpm -qi samba
 Name        : samba
 Epoch       : 2
 Version     : 4.7.0
 Release     : 12.fc27
 Architecture: x86_64
 Install Date: Sat 09 Dec 2017 01:19:41 PM CET
 Group       : Unspecified
 Size        : 1966169
 License     : GPLv3+ and LGPLv3+
 Signature   : RSA/SHA256, Thu 21 Sep 2017 08:53:08 PM CEST, Key ID f55e7430f5282ee4
 Source RPM  : samba-4.7.0-12.fc27.src.rpm
 Build Date  : Thu 21 Sep 2017 07:05:27 PM CEST
 Build Host  :
 Relocations : (not relocatable)
 Packager    : Fedora Project
 Vendor      : Fedora Project
 URL         :
 Summary     : Server and Client software to interoperate with Windows machines
 Description : Samba is the standard Windows interoperability suite of programs for Linux and Unix.

Extracting Data from RPMs

Occasionally, you may want to extract data from RPMs without installing the package. For instance, this can be a good way to retrieve the original source code from a source RPM for compiling the software without the help of the RPM tools or to retrieve fonts or other non-program data for use on a non-RPM system. RPM files are actually modified cpio archives. Thus, converting the files into cpio files is relatively straightforward, whereupon you can use cpio to retrieve the individual files. To do this job, you need to use the rpm2cpio program that ships with most Linux distributions.

 $ rpm2cpio samba-4.1.9-4.fc20.src.rpm > samba-4.1.9-4.fc20.src.cpio

You can then extract the data using cpio, which takes the -i option to extract an archive and --make-directories to create directories:

 $ cpio -i --make-directories < samba-4.1.9-4.fc20.src.cpio

Alternatively, you can use a pipe to link these two commands together without creating an intermediary file:

 $ rpm2cpio samba-4.1.9-4.fc20.src.rpm  |  cpio -i --make-directories

In either case, the result is an extraction of the files in the archive into the current directory. In the case of binary packages, this is likely to be a series of subdirectories that mimic the layout of the Linux root directory (that is, /usr, /lib, /etc, and so on), although precisely which directories are included depends on the package. For a source package, the result of the extraction process is likely to be a source code tarball, a .spec file (which holds information that RPM uses to build the package), and perhaps some patch files.

Another option for extracting data from RPMs is to use alien, which is described later. This program can convert an RPM into a Debian package or tarball.

Using Yum

Yum, mentioned earlier, is one of several meta-packagers: it enables you to install a package and all of its dependencies easily using a single command line. When using Yum, you don't even need to locate and download the package file, because Yum does this for you by searching in one or more repositories: Internet sites that host RPM files for a particular distribution.

Yum is used by Red Hat, CentOS, Fedora, and some other RPM-based distributions, but not by all of them; SUSE and Mandriva, to name just two, each use their own meta-packager (SUSE uses zypper).

Debian-based distributions generally employ the Advanced Package Tool (APT), as described later. Nonetheless, because of the popularity of Red Hat, CentOS, and Fedora, knowing Yum can be valuable. The most basic way to use Yum is with the yum command, which has the following syntax:

 yum [options] [command] [package]

Which options are available depends on the command you use. Here is a description of the most commonly used yum commands:

 Command                   Description
 install                   Installs one or more packages by package name. Also installs dependencies of the specified package or packages. 
 update                    Updates the specified package or packages to the latest available version. If no packages are specified, yum updates every installed package. 
 check-update              Checks to see whether updates are available. If they are, yum displays their names, versions, and repository area (updates or extras, for instance).
 upgrade                   Works like update with the --obsoletes flag set, which handles obsolete packages in a way that's superior when performing a distribution version upgrade. 
 remove or erase           Deletes a package from the system, similar to rpm -e, but yum also removes depended-on packages. 
 list                      Displays information about a package, such as the installed version and whether an update is available.
 provides or whatprovides  Displays information about packages that provide a specified program or feature. For instance, typing yum provides samba lists all the Samba-related packages, including every
                           available update. Note that the output can be copious. 
 search                    Searches package names, summaries, packagers, and descriptions for a specified keyword. This is useful if you don't know a package's name but can think of a word that's likely to
                           appear in one of these fields but not in those fields for other packages.
 info                      Displays information about a package, similar to rpm -qi command. 
 clean                     Cleans up the Yum cache directory. Running this command from time to time is advisable. 
 shell                     Enters the Yum shell mode, in which you can enter multiple Yum commands one after another.
 resolvedep                Displays packages matching the specified dependency. 
 localinstall              Installs the specified local RPM files, using your Yum repositories to resolve dependencies.
 localupdate               Updates the system using the specified local RPM files, using your Yum repositories to resolve dependencies. Packages other than those updated by local files and their dependencies 
                           are not updated. 
 deplist                   Displays dependencies of the specified package.

In most cases, using Yum is easier than using RPM directly to manage packages because Yum finds the latest available package, downloads it, and installs any required dependencies. Yum has its limits, though; it's only as good as its repositories, so it can't install software that's not stored in those repositories.

If you don't want to install the package but merely want to obtain it, you can use yumdownloader:

 $ yumdownloader samba

This can be handy if you need to update a system that's not connected to the Internet.
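For instance, you might stage a package and its dependencies for transfer to such a system. The sketch below only composes and prints the yumdownloader invocation (using its documented --resolve and --destdir options), so it runs anywhere; the package name and staging directory are arbitrary examples:

```shell
# Compose a yumdownloader command that fetches a package plus its
# dependencies into a staging directory for an offline host.
pkg=samba
dest=/tmp/offline-rpms
cmd="yumdownloader --resolve --destdir=$dest $pkg"
echo "$cmd"    # on a real system you would run this as-is
```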

 Command:                                          Description: 
 $ su                                              # or you can use sudo if installed.
 # rpm -q zsh                                      # check if it is installed.
 # wget                                            # download the package into your current directory (URL omitted here).
 # rpm -qpi zsh-5.4.1-1.fc27.x86_64.rpm            # Show information about the RPM package.
 # rpm -ivh zsh-5.4.1-1.fc27.x86_64.rpm            # Install the RPM package.
 # rpm -q zsh                                      # check if it is installed.
 # zsh                                             # launches zsh shell. 
 # rpm -V zsh                                      # system should not produce any output. 
 # rpm -e zsh                                      # system should not produce any output. This command removes the packages from the system. 
 # exit                                            # or Ctrl+D. 
 # rpm -q zsh                                      # The system should respond zsh is not installed.
 # yum install zsh
 # rpm -q zsh                                      # check if it is installed.
 # rpm -e zsh                                      # This removes again zsh from the system but produces no STDOUT. 

RPM Configuration Files

The main RPM configuration file is /usr/lib/rpm/rpmrc. This file sets a variety of options, mostly related to the CPU optimizations used when compiling source packages. You shouldn't edit this file, though; instead, you should create and edit /etc/rpmrc (to make global changes) or ~/.rpmrc (to make changes on a per-user basis). The main reason to create such a file is to implement architecture optimizations: for instance, to optimize your code for your CPU model by passing appropriate compiler options when building a source RPM into a binary RPM. This is done with the optflags line:

 optflags: athlon -O2 -g -march=i686

This line tells RPM to pass the -O2 -g -march=i686 options to the compiler whenever it's building for the athlon platform. Although RPM can determine your system's architecture, the optflags line by itself isn't likely to be enough to set the correct flags. Most default rpmrc files include a series of buildarchtranslate lines that cause rpmbuild (or rpm for older versions of RPM) to use one set of optimizations for a whole family of CPUs. For x86 systems, these lines typically look like this:

 buildarchtranslate: athlon: i386
 buildarchtranslate: i686: i386  
 buildarchtranslate: i586: i386
 buildarchtranslate: i486: i386
 buildarchtranslate: i386: i386

These lines tell RPM to translate the athlon, i686, i586, i486, and i386 CPU codes to use the i386 optimizations. This effectively defeats the purpose of any CPU-specific optimizations you create on the optflags line for your architecture, but it guarantees that the RPMs you build will be maximally portable. To change matters, you must alter the line for your CPU type, as returned when you type:

 $ uname -p

For instance, on an Athlon-based system, you might enter the following line:

 buildarchtranslate: athlon: athlon

Thereafter, when you rebuild a source RPM, the system will use the appropriate Athlon optimizations. The result can be a slight performance boost on your own system but reduced portability; depending on the precise optimizations you choose, such packages may not run on non-Athlon CPUs (you may not even be able to install them on non-Athlon CPUs).
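A small sketch of generating such an override line for the local CPU. It writes to a scratch file rather than a real rpmrc so it is safe to run anywhere; on a real system you would append the line to ~/.rpmrc instead:

```shell
# Build a buildarchtranslate override for this machine's CPU type.
arch=$(uname -p 2>/dev/null)
[ -n "$arch" ] || arch=$(uname -m)   # fall back if -p prints nothing
printf 'buildarchtranslate: %s: %s\n' "$arch" "$arch" > /tmp/rpmrc.fragment
cat /tmp/rpmrc.fragment              # e.g. buildarchtranslate: athlon: athlon
```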

Yum Configuration Files

Yum is configured via the /etc/yum.conf file, with additional configuration files in the /etc/yum.repos.d directory. The yum.conf file holds basic options, such as the directory to which Yum downloads RPMs and where Yum logs its activities. Chances are that you won't need to modify this file. The /etc/yum.repos.d directory, on the other hand, potentially holds several files, each of which describes a Yum repository: that is, a site that holds RPMs that may be installed via Yum. You probably shouldn't directly edit these files; instead, if you want to add a repository, you should manually download the RPM that includes the repository configuration and install it using rpm. The next time you use Yum, it will access your new repository along with the old ones. Several Yum repositories exist, mostly for Red Hat, CentOS, and Fedora, such as the following:

 Livna: This repository hosts multimedia tools, such as additional codecs and video drivers.  
 KDE Red Hat: Red Hat, CentOS, and Fedora favor the GNU Network Object Model Environment (GNOME) desktop environment, although they ship with the K Desktop Environment (KDE) too.
 Fresh RPMs: This repository provides additional RPMs, mostly focusing on multimedia applications and drivers. 

Many additional repositories exist. Try a web search on terms such as yum repository, or check the web page of any site that hosts unusual software that you want to run to see whether it provides a Yum repository. If so, it should provide an RPM or other instructions on adding its site to your Yum repository list.
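For reference, a repository file in /etc/yum.repos.d is a short INI-style definition; every section name, URL, and key value below is hypothetical:

```ini
[example-repo]
name=Hypothetical Example Repository
baseurl=http://repo.example.com/fedora/$releasever/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example
```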

Use Debian package management (102.4)

In their overall features, Debian packages are similar to RPMs, but the details of operation for each differ, and Debian packages are used on different distributions than are RPMs. Because each system uses its own database format, RPMs and Debian packages aren't interchangeable without converting formats. Using Debian packages requires knowing how to use the dpkg, dselect, and apt-get commands. A few other commands can also be helpful.

The dpkg Command Set

Debian packages are incompatible with RPM packages, but the basic principles of operation are the same across both package types. Like RPMs, Debian packages include dependency information, and the Debian package utilities maintain a database of installed packages, files, and so on. You use the dpkg command to install a Debian package. This command's syntax is similar to that of rpm:

 dpkg [options] [action] [package-files|package-name]

action is the action to be taken, and the options modify the behavior of the action.

Most actions are described here:

 Action                         Description
 -i or --install                Installs a package.
 --configure                    Reconfigures an installed package: runs the post-installation script to set site-specific options. 
 -r or --remove                 Removes a package but leaves configuration files intact. 
 -P or --purge                  Removes a package, including configuration files. 
 --get-selections               Displays currently installed packages.
 -p or --print-avail            Displays information about an installed package. 
 -I or --info                   Displays information about an uninstalled package file. 
 -l pattern or --list pattern   Lists all installed packages whose names match pattern.
 -L or --listfiles              Lists the installed files associated with a package. 
 -S pattern or --search pattern Locates the package(s) that own the file(s) specified by pattern.
 -C or --audit                  Searches for partially installed packages and suggests what to do with them. 

The most common options are described here:

 Option                         Used with actions      Description
 --root=dir                     all                    Modifies the Linux system using a root directory located at dir. Can be used to maintain one Linux installation discrete from another one. 
 -B or --auto-deconfigure       -r                     Disables packages that rely on one that is being removed. 
 --force-things                 Assorted               Overrides defaults that would ordinarily cause dpkg to abort. Consult the dpkg man page for details of what this option does. 
 --ignore-depends=package       -i, -r                 Ignores dependency information for the specified package. 
 --no-act                       -i, -r                 Checks for dependencies, conflicts, and other problems without actually installing or removing the package. 
 --recursive                    -i                     Installs all packages that match the package-name wildcard in the specified directory and all sub-directories. 
 -G                             -i                     Doesn't install the package if a newer version of the same package is already installed. 
 -E or --skip-same-version      -i                     Doesn't install the package if the same version of the package is already installed. 

As with rpm, dpkg expects a package name in some cases and a package filename in others. Specifically, --install (-i) and --info (-I) both require the package filename, but the other commands take the shorter package name. As an example, consider the following command, which installs the samba_4.1.6+dfsg-1ubuntu2.1404.3_amd64.deb package:

 # dpkg -i samba_4.1.6+dfsg-1ubuntu2.1404.3_amd64.deb
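The long filename in that command follows the Debian convention name_version_architecture.deb. A minimal sketch of pulling it apart with POSIX shell parameter expansion (the filename is the one from the text):

```shell
# Split a Debian package filename on its underscore separators.
deb="samba_4.1.6+dfsg-1ubuntu2.1404.3_amd64.deb"
base="${deb%.deb}"             # strip the .deb extension
name="${base%%_*}"             # samba
arch="${base##*_}"             # amd64
version="${base#${name}_}"     # drop the leading name_
version="${version%_${arch}}"  # drop the trailing _arch
echo "$name $version $arch"
```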

If you're upgrading a package, you may need to remove an old package before installing the new one. To do this, use the -r option to dpkg, as in the following:

 # dpkg -r samba

To find information about an installed package, use the -p parameter to dpkg, as shown here:

 $ dpkg -p samba

Debian-based systems often use a pair of somewhat higher-level utilities, apt-get and dselect, to handle package installation and removal. dpkg is often more convenient when you're manipulating just one or two packages. Because dpkg can take package filenames as input, it's often also the preferred method of installing a package that you download from an unusual source or create yourself.

Using apt-cache

The APT suite of tools includes a program, apt-cache, that's intended solely to provide information about the Debian package database (known in Debian terminology as the package cache). You may be interested in several features of this tool:

Display Package Information: Using the showpkg subcommand, as in apt-cache showpkg samba, displays information about the package. The information displayed is different from that returned by dpkg's informational actions.

Display Package Statistics: You can learn how many packages you've installed, how many dependencies are recorded, and various other statistics about the package database by passing the stats subcommand, as in apt-cache stats.

Find Unmet Dependencies: If a program is reporting missing libraries or files, typing apt-cache unmet may help; this function of apt-cache returns information about unmet dependencies, which may help you track down the source of missing-file problems.

Display Dependencies: Using the depends subcommand, as in apt-cache depends samba, shows all of the specified package's dependencies. This information can be helpful in tracking down dependency-related problems. The rdepends subcommand finds reverse dependencies: packages that depend on the one you specify.

Locate All Packages: The pkgnames subcommand displays the names of all the packages installed on the system. If you include a second parameter, as in apt-cache pkgnames sa, the program returns only those packages that begin with the specified string.

Several more sub-commands and options exist, but these are the ones you're most likely to use. Several apt-cache sub-commands are intended for package maintainers and debugging serious database problems rather than day-to-day system administration.

Using apt-get

APT, with the apt-get utility, is Debian's equivalent to Yum on certain RPM-based distributions. This meta-packaging tool lets you perform easy upgrades of packages, especially if you have a fast Internet connection. Debian-based systems include a file, /etc/apt/sources.list, that specifies locations from which important packages can be obtained.

Warning: Don't add a site to /etc/apt/sources.list unless you're sure it can be trusted. The apt-get utility does automatic and semiautomatic upgrades, so if you add a network source to sources.list and that source contains unreliable programs or programs with security holes, your system will become vulnerable after upgrading.
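For reference, each sources.list line names an archive type (deb for binary packages, deb-src for source packages), a mirror URL, a suite, and one or more components; the mirror and suite shown here are only illustrative, as real values vary by distribution and release:

```
deb http://deb.debian.org/debian stable main contrib
deb-src http://deb.debian.org/debian stable main contrib
```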

Although APT is most strongly associated with Debian systems, a port to RPM-based systems is also available. The syntax is similar to that of dpkg:

 apt-get [options] [command] [package-names]

Here is a list of commonly used apt-get commands:

 Command                  Description
 update                   Obtains updated information about packages available from the installation sources listed in the /etc/apt/sources.list file.
 upgrade                  Upgrades all installed packages to the newest version available, based on locally stored information about available packages.
 dselect-upgrade          Performs any changes in package status (installation, removal, and so on) left undone after running dselect.
 dist-upgrade             Similar to upgrade, but performs "smart" conflict resolution to avoid upgrading a package if doing so would break a dependency. 
 install                  Installs a package by package name, obtaining the package from the source that contains the most up-to-date version. 
 remove                   Removes a specified package by package name.
 source                   Retrieves the newest available source package file by package filename, using information about available packages and installation archives listed in /etc/apt/sources.list.
 check                    Checks the package database for consistency and broken package installations.
 clean                    Performs housekeeping to help clear out information about retrieved files from the Debian package database. If you don't use dselect for package management, run this from time to 
                          time in order to save disk space.
 autoclean                Similar to clean, but removes information only about packages that can no longer be downloaded. 

Most-useful apt-get options:

 Option                                      Used with                                  Description
 -d or --download-only                       upgrade, dselect-upgrade, install, source  Downloads package files but doesn't install them.
 -f or --fix-broken                          install, remove                            Attempts to fix a system on which dependencies are unsatisfied.
 -m, --ignore-missing, or --fix-missing      upgrade, dselect-upgrade, install,         Ignores all package files that can't be retrieved (because of network errors, missing files, or the like). 
                                             remove, source
 -q or --quiet                               All                                        Omits some progress indicator information. May be doubled (for instance, -qq) to produce still less progress information.
 -s, --simulate, --just-print, --dry-run,    All                                        Performs a simulation of the action without actually modifying, installing, or removing files.
 --recon, or --no-act
 -y, --yes, or --assume-yes                  All                                        Produces a "yes" response to any yes/no prompt in installation scripts.
 -b, --compile, or --build                   source                                     Compiles a source package after retrieving it. 
 --no-upgrade                                install                                    Causes apt-get to not upgrade a package if an older version is already installed. 

You can search for a .deb package on the Debian packages website.

 Command:                                          Description: 
 $ su                                              # or you can use sudo if installed.
 # dpkg -L zsh                                     # check if it is installed.
 #                                                 # download the .deb package into your current directory (URL omitted here).
 # dpkg -I zsh_5.3.1-4+b1_amd64.deb                # Show information about the Debian package.
 # dpkg -i zsh_5.3.1-4+b1_amd64.deb                # Install the Debian package.
 # dpkg -L zsh                                     # check if it is installed.
 # zsh                                             # launches zsh shell. 
 # dpkg -P zsh                                     # system should not produce any output. This command removes the package from the system.               
 # exit                                            # or Ctrl+D. 
 # dpkg -L zsh                                     # The system should respond zsh is not installed.
 # apt-get install zsh
 # dpkg -p zsh                                     # display information about the installed package.
 # dpkg -P zsh                                     # This removes again zsh from the system but produces no STDOUT.

Using dselect, aptitude, and Synaptic

The dselect program is a high-level package browser. Using it, you can select packages to install on your system from the APT archives defined in /etc/apt/sources.list, review the packages that are already installed on your system, uninstall packages, and upgrade packages. Overall, dselect is a powerful tool, but it can be intimidating to the uninitiated because it presents a lot of options that aren't obvious, using a text-mode interactive user interface. Because of that, most Linux distributions don't install it by default. Nonetheless, it's well worth taking the time to install it and getting to know how to use it. Although dselect supports a few command-line options, they're mostly obscure or minor (such as options to set the color scheme). To use the program, type dselect.

 $ dselect
 Debian 'dselect' package handling frontend version 1.18.24 (amd64).
  * 0. [A]ccess    Choose the access method to use.                                         
    1. [U]pdate    Update list of available packages, if possible.
    2. [S]elect    Request which packages you want on your system.
    3. [I]nstall   Install and upgrade wanted packages.
    4. [C]onfig    Configure any packages that are unconfigured.
    5. [R]emove    Remove unwanted software.
    6. [Q]uit      Quit dselect.
 Move around with ^P and ^N, cursor keys, initial letters, or digits;
 Press <enter> to confirm selection.   ^L redraws screen.
 Copyright (C) 1994-1996 Ian Jackson.
 Copyright (C) 2000,2001 Wichert Akkerman.
 This is free software; see the GNU General Public License version 2 or
 later for copying conditions. There is NO warranty.
 Read-only access: only preview of selections is available!

Another text-based Debian package manager is aptitude. In interactive mode, aptitude is roughly similar to dselect, but aptitude adds menus accessed by pressing Ctrl+T and rearranges some features. You can also pass various commands to aptitude on the command line, as in aptitude search samba, which searches for packages related to Samba. Features accessible from the command line (or the interactive interface) include the following:

Update Package Lists: You can update package lists from the APT repositories by typing aptitude update.

Install Software: The install command-line option installs a named package. This command has several variant names and syntaxes that modify its action. For instance, typing aptitude install zsh installs the zsh package, but typing aptitude install zsh- or aptitude remove zsh uninstalls zsh.

Upgrade Software: The full-upgrade and safe-upgrade options both upgrade all installed packages. The safe-upgrade option is conservative about removing packages or installing new ones and so may fail; full-upgrade is less conservative about these actions and so is more likely to complete its tasks, but it may break software in the process.

Search for Packages: The search option, noted earlier, searches the database for packages matching the specified name. The result is a list of packages, one per line, with summary codes for each package's install status, its name, and a brief description.

Clean Up the Database: The autoclean option removes already-downloaded packages that are no longer available, and clean removes all downloaded packages.

Obtain Help: Typing aptitude help results in a complete list of options.

Broadly speaking, aptitude combines the interactive features of dselect with the command-line options of apt-get. All three programs provide similar functionality, so you can use whichever one you prefer. A GUI-based tool that's similar to dselect is Synaptic.

Re-configuring Packages

Debian packages often provide more-extensive initial setup options than do their RPM counterparts. Frequently, the install script included in the package asks a handful of questions, such as querying for the name of an outgoing mail relay system for a mail server program. These questions help the package system set up a standardized configuration that has nonetheless been customized for your computer. In the course of your system administration, you may alter configuration files for a package. If you do this and find that you've made a mess of things, you may want to revert to the initial standard configuration. To do so, you can use the dpkg-reconfigure program, which runs the initial configuration script for the package you specify:

 # dpkg-reconfigure samba

This command reconfigures the samba package, asking the package's initial installation questions and restarting the Samba daemons. Once this is done, the package should be in something close to its initial state.

Configuring Debian Package Tools

With the exception of the APT sources list mentioned earlier, Debian package tools don't usually require configuration. Debian installs reasonable defaults (as do its derivative distributions). On rare occasions, though, you may want to adjust some of these defaults. Doing so requires that you know where to look for them.

The main configuration file for dpkg is /etc/dpkg/dpkg.cfg or ~/.dpkg.cfg. This file contains dpkg options. For instance, to have dpkg always perform a test run rather than actually install a package, you'd create a dpkg.cfg file that contains one line:

 no-act

For APT, the main configuration file you're likely to modify is /etc/apt/sources.list, which was described earlier. Beyond this file is /etc/apt/apt.conf, which controls APT and dselect options. As with dpkg.cfg, chances are you won't need to modify apt.conf. If you do need to make changes, the format is more complex and is modeled after those of the Internet Software Consortium's (ISC's) Dynamic Host Configuration Protocol (DHCP) and Berkeley Internet Name Domain (BIND) server configuration files. Options are grouped together by open and close curly braces ({}):

 APT
 {
   Get
   {
     Download-Only "true";
   };
 };

These lines are equivalent to setting the --download-only option permanently. You can, of course, set many more options. For details, consult the apt.conf man page.

You may also want to retrieve the sample configuration file, /usr/share/doc/apt/examples/apt.conf.

You should be aware that Debian's package tools rely on various files in the /var/lib/dpkg directory tree. These files maintain lists of available packages, lists of installed packages, and so on. In other words, this directory tree is effectively the Debian installed-file database. As such, you should be sure to back up this directory when you perform system backups and be careful about modifying its contents.

Converting between Package Formats

You can convert between package formats with a tool called alien. To install it, type:

 $ sudo apt-get install alien # you will also need rpm installed on a Debian distribution, or dpkg on a Red Hat distribution. 

The basic syntax of alien is as follows:

 alien [options] file[...]

The most important options are --to-deb, --to-rpm, --to-slp, and --to-tgz, which convert to Debian, RPM, Stampede, and tarball formats, respectively. (If you omit the destination format, alien assumes that you want a Debian package.)

The --install option installs the converted package and removes the converted file. Consult the alien man page for additional options.

For example, to convert a Debian (.deb) package to RPM format, do the following:

 # alien --to-rpm someprogram-1.2.3-4_i386.deb

If you use a Debian-based system and want to install a tarball but keep a record of the files it contains in your Debian package database, use the following command:

 # alien --install binary-tarball.tar.gz

It's important to remember that converting a tarball converts the files in the directory structure of the original tarball using the system's root directory as the base. Therefore, you may need to unpack the tarball, juggle files around, and repack it to get the desired results prior to installing the tarball with alien. For instance, suppose you have a binary tarball that creates a directory called program-files, with bin, man, and lib directories under this. The intent may have been to unpack the tarball in /usr or /usr/local and create links for critical files. To convert this tarball to an RPM, you can issue the following commands:

 # tar xvfz program.tar.gz
 # mv program-files usr
 # tar cvfz program.tar.gz usr
 # rm -r usr
 # alien --to-rpm program.tgz

By renaming the program-files directory to usr and creating a new tarball, you've created a tarball that, when converted to RPM format, will have files in the locations you want: /usr/bin, /usr/man, and /usr/lib. You might need to perform more extensive modifications, depending on the contents of the original tarball.
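
The repacking steps above can be sketched end to end in a scratch directory, stopping short of the final alien call (which requires alien itself to be installed). All of the file and directory names here are hypothetical stand-ins:

```shell
# Work in a scratch directory so nothing touches the real filesystem.
tmp=$(mktemp -d)
cd "$tmp"

# Simulate a binary tarball that unpacks into program-files/{bin,man,lib}.
mkdir -p program-files/bin program-files/man program-files/lib
echo '#!/bin/sh' > program-files/bin/someprogram
tar czf program.tar.gz program-files

# Repack so the paths are rooted at usr/ instead of program-files/.
tar xzf program.tar.gz
mv program-files usr
tar czf program-repacked.tar.gz usr
rm -r usr

# The repacked tarball now lists usr/bin, usr/man, and usr/lib paths,
# ready for: alien --to-rpm program-repacked.tar.gz
tar tzf program-repacked.tar.gz
```
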

Package Dependencies and Conflicts

Although package installation often proceeds smoothly, sometimes it doesn't. The usual sources of problems relate to unsatisfied dependencies or conflicts between packages. The RPM and DPKG systems are intended to help you locate and resolve such problems. However, on occasion (particularly when mixing packages from different vendors), they can actually cause problems. In either event, it pays to recognize these errors and know how to resolve them.

Real and Imagined Package Dependency Problems

Package dependencies and conflicts can arise for a variety of reasons, including the following:

Missing Libraries or Support Programs: One of the most common dependency problems is caused by a missing support package. For instance, all KDE programs rely on Qt, a widget set that provides assorted GUI tools. If Qt isn't installed, you won't be able to install any KDE packages using RPM or DPKG. Libraries (support code that can be used by many different programs as if it were part of the program itself) are particularly common sources of problems in this respect.

Incompatible Libraries or Support Programs: Even if a library or support program is installed on your system, it may be the wrong version. For instance, if a program requires Qt 4.8, the presence of Qt 3.3 won't do much good. Fortunately, Linux library-naming conventions enable you to install multiple versions of a library in case you have programs with competing requirements.

Duplicate Files or Features: Conflicts arise when one package includes files that are already installed and that belong to another package. Occasionally, broad features can conflict as well, as in two web server packages. Feature conflicts are usually accompanied by name conflicts. Conflicts are most common when mixing packages intended for different distributions, because distributions may split files across packages in different ways.

Mismatched Names: RPM and DPKG systems give names to their packages. These names don't always match across distributions. For this reason, if one package checks for another package by name, the first package may not install on another distribution, even if the appropriate package is installed, because that target package has a different name.

Some of these problems are very real and serious. Missing libraries, for instance, must be installed. Others, like mismatched package names, are artifacts of the packaging system. Unfortunately, it's not always easy to tell into which category a conflict fits. When using a package management system, you may be able to use the error message returned by the packaging system, along with your own experience with and knowledge of specific packages, to make a judgment. For instance, if RPM reports that you're missing a slew of libraries with which you're unfamiliar, you'll probably have to track down at least one package —— unless you know you've installed the libraries in some other way, in which case you may want to force the installation.

Workarounds for Package Dependency Problems

Forcing the installation

One approach is to ignore the issue. Although this sounds risky, it's appropriate in some cases involving failed RPM or Debian dependencies. For instance, if the dependency is on a package that you installed by compiling the source code yourself, you can safely ignore the dependency. When using rpm, you can tell the program to ignore failed dependencies by using the --nodeps parameter:

 # rpm -i apackage.rpm --nodeps

You can force the installation over some other errors, such as conflicts with existing packages, by using the --force parameter:

 # rpm -i apackage.rpm --force

If you are using dpkg, you can use the --ignore-depends=package, --force-depends, and --force-conflicts parameters to overcome dependency and conflict problems in Debian-based systems. Because there's less deviation in package names and requirements among Debian-based systems, these options are less often needed on such systems.

Upgrading or Replacing the Depended-on Package

Officially, the proper way to overcome a package dependency problem is to install, upgrade, or replace the depended-on package. If a program requires, say, Qt 4.8 or greater, you should upgrade an older version (such as 4.4) to 4.8. To perform such an upgrade, you'll need to track down and install the appropriate package. This usually isn't too difficult if the new package you want comes from a Linux distribution, especially if you use a meta-packager such as Yum or APT; the appropriate depended-on package should come with the same distribution.

One problem with this approach is that packages intended for different distributions sometimes have differing requirements. If you run Distribution A and install a package that was built for Distribution B, the package will express dependencies in terms of Distribution B's files and versions. The appropriate versions may not be available in a form intended for Distribution A, and by installing Distribution B's versions, you can sometimes cause conflicts with other Distribution A packages. Even if you install the upgraded package and it works, you may run into problems in the future when it comes time to install some other program or upgrade the distribution as a whole —— the upgrade installer may not recognize Distribution B's package or may not be able to upgrade to its own newer version.

Rebuilding the Problem Package

Some dependencies result from the libraries and other support utilities installed on the computer that compiled the package, not from requirements in the underlying source code. If the software is recompiled on a system that has different packages, the dependencies will change. Therefore, rebuilding a package from source code can overcome at least some dependencies. Most developer-oriented RPM-based systems, such as Fedora, include a command to rebuild an RPM package: you call rpmbuild (or rpm with old versions of RPM) with the name of the source package and use --rebuild, as follows:

 # rpmbuild --rebuild packagename-version.src.rpm

Of course, to do this you must have the source RPM for the package. This can usually be obtained from the same location as the binary RPM. When you execute this command, rpmbuild extracts the source code and executes whatever commands are required to build a new package (or sometimes several new packages). The compilation process can take anywhere from a few seconds to several hours, depending on the size of the package and the speed of your computer. The result should be one or more new binary RPMs in /usr/src/distname/RPMS/arch, where distname is a distribution-specific name and arch is your CPU architecture (such as i386 or i586 for x86 or ppc for PowerPC). You can move these RPMs to any convenient location and install them just as you would any others.

Be aware that compiling a source package typically requires you to have appropriate development tools installed on your system, such as the GNU Compiler Collection (GCC) and assorted development libraries. Development libraries are the parts of a library that enable programs to be written for the library. Many Linux installations lack development libraries even when the matching binary libraries are installed. Thus, you may need to install quite a few packages to recompile a source package. The error messages you receive when you attempt but fail to build a source package can help you track down the necessary software, but you may need to read several lines of error messages and use your package system to search for appropriate tools and development libraries.

Locating Another Version of the Problem Package

Frequently, the simplest way to fix a dependency problem or package conflict is to use a different version of the package that you want to install. This could be a newer or older official version (4.2.3 rather than 4.4.7), or it might be the same official version but built for your distribution rather than for another distribution. Sites like RPMfind and Debian's package listing can be very useful in tracking down alternative versions of a package. Your own distribution's website or FTP site can also be a good place to locate packages.

The main problem with locating another version of the package is that sometimes you really need the version that's not installing correctly. It may have features that you need, or it may fix important bugs. On occasion, other versions may not be available, or you may be unable to locate another version of the package in your preferred package format.

Startup Script Problems

One particularly common problem when trying to install servers from one distribution on another is getting startup scripts to work. In the past, most major Linux distributions used SysV startup scripts, but these scripts weren't always transportable across distributions. Today, alternatives to SysV are common, such as the systemd startup method, which further complicates this problem. The result is that the server you installed may not start up. Possible workarounds include modifying the startup script that came with the server, building a new script based on another one from your distribution, and starting the server through a local startup script like /etc/rc.d/rc.local or /etc/rc.d/boot.local.

Managing Shared Libraries (102.3)

Library Principles

The idea behind a library is to simplify programmers' lives by providing commonly used program fragments. For instance, one of the most important libraries is the C library (libc), which provides many of the higher-level features associated with the C programming language. Another common type of library is associated with GUIs. These libraries are often called widget sets because they provide the onscreen widgets used by programs: buttons, scroll bars, menu bars, and so on. The GIMP Tool Kit (GTK+) and Qt are the most popular Linux widget sets, and both ship largely as libraries. Programmers, not users, choose libraries; users usually can't substitute one library for another. (The main exceptions are minor version upgrades.)

Note: Linux uses the GNU C library (glibc) version of the C library. Package-manager dependencies and other library references are to glibc specifically. As of glibc 2.15, for historical reasons the main glibc file is usually called /lib/libc.so.6 or /lib64/libc.so.6, but this file is sometimes a symbolic link to a file of another name, such as /lib/libc-2.15.so.

In principle, the routines in a library can be linked into a program's main file, just like all of the object code files created by the compiler. This approach, however, has certain problems:

 * The resulting program file is huge. This means it takes up a lot of disk space, and it consumes a lot of RAM when loaded. 
 * If multiple programs use the library, as is common, the program-size issue is multiplied several times; the library is effectively stored multiple times on disk and in RAM. 
 * The program can't take advantage of improvements in the library without being recompiled (or at least relinked). 

For these reasons, most programs use their libraries as shared libraries (aka dynamic libraries). In this form, the main program executable omits most of the library routines. Instead, the executable includes references to shared library files, which can then be loaded along with the main program file. This approach helps keep program file size down, enables sharing of the memory consumed by libraries across programs, and enables programs to take advantage of improvements in libraries by upgrading the library.

Note: Linux shared libraries are similar to the dynamic link libraries (DLLs) of Windows. Windows DLLs are usually identified by .dll filename extensions. In Linux, however, shared libraries usually have a .so (so stands for shared object) or .so.version filename extension.

On the downside, shared libraries can degrade program load time slightly if the library isn't already in use by another program, and they can create software management complications:

 * Shared library changes can be incompatible with some or all programs that use the library. Linux uses library-numbering schemes to enable you to keep multiple versions of a library installed at once.
   Upgrades that shouldn't cause problems can overwrite older versions, whereas major upgrades get installed side by side with their older counterparts. This approach minimizes the chance of problems, but 
   sometimes changes that shouldn't cause problems do cause them. 
 * Programs must be able to locate shared libraries. This task requires adjusting configuration files and environment variables. If it's done wrong, or if a program overrides the defaults and looks in the 
   wrong place, the result is usually that the program won't run at all.
 * The number of libraries for Linux has risen dramatically over time. When they're used in shared form, the result can be a tangled mess of package dependencies, particularly if you use programs that rely 
   on many or obscure libraries. In most cases, this issue boils down to a package problem that can be handled by your package management tools.    
 * If an important shared library becomes inaccessible because it was accidentally overwritten due to disk error or for any other reason, the result can be severe system problems. In a worst-case scenario, the system might not even boot. 

In most cases, these drawbacks are manageable and are much less important than the problems associated with using static libraries. Thus, dynamic libraries are very popular.
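
The library-numbering scheme mentioned above is implemented with symbolic links. Here's a minimal sketch using a hypothetical libdemo library in a scratch directory (real libraries live in /lib or /usr/lib and are managed by your package tools):

```shell
# Scratch directory standing in for /usr/lib, so we don't touch the real one.
libdir=$(mktemp -d)

# The real file carries the full version number...
touch "$libdir/libdemo.so.1.2.3"
# ...the major-version link is what programs load at run time...
ln -s libdemo.so.1.2.3 "$libdir/libdemo.so.1"
# ...and the unversioned link is used when compiling new programs.
ln -s libdemo.so.1 "$libdir/libdemo.so"

# A future libdemo.so.2.0.0 would get its own libdemo.so.2 link,
# so both major versions can coexist side by side.
ls -l "$libdir"
```
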

Locating Library Files

The major administrative challenge of handling shared libraries involves enabling programs to locate those shared libraries. Binary program files can point to libraries either by name alone (as in libc.so.6) or by providing a complete path (as in /lib/libc.so.6). In the first case, you must configure a library path: a set of directories in which programs should search for libraries. This can be done both through a global configuration file and through an environment variable. If a static path to a library is wrong, you must find a way to correct the problem. In all of these cases, after making a change, you may need to use a special command to get the system to recognize the change, as described later in "Library Management Commands."

Setting the Path System Wide

The first way to set the library path is to edit the /etc/ld.so.conf file. This file consists of a series of lines, each of which lists one directory in which shared library files may be found. Typically, this file lists between half a dozen and a couple dozen directories. Some distributions have an additional type of line in this file. These lines begin with the include directive; they list files that are to be included as if they were part of the main file. For example, Ubuntu's ld.so.conf begins with this line:

 include /etc/ld.so.conf.d/*.conf

This line tells the system to load all files in /etc/ld.so.conf.d whose names end in .conf as if they were part of the main /etc/ld.so.conf file. This mechanism enables package maintainers to add their unique library directories to the search list by placing a .conf file in the appropriate directory. Some distributions, such as Gentoo, use a mechanism with a similar goal but different details. With these distributions, the env-update utility reads files in /etc/env.d to create the final form of several /etc configuration files, including /etc/ld.so.conf.

In particular, the LDPATH variables in these files are read, and their values make up the lines in ld.so.conf. Thus, to change ld.so.conf in Gentoo or other distributions that use this mechanism, you should add or edit files in /etc/env.d and then type env-update to do the job.

Generally speaking, there's seldom a need to change the library path system wide. Library package files usually install themselves in directories that are already on the path or add their paths automatically. The main reason to make such changes would be if you installed a library package, or a program that creates its own libraries, in an unusual location via a mechanism other than your distribution's main package utility. For example, you might compile a library from source code and then need to update your library path in this way.

After you change your library path, you must use ldconfig to have your programs use the new path.

Note: in addition to the directories specified in /etc/ld.so.conf, Linux refers to the trusted library directories, /lib and /usr/lib. These directories are always on the library path, even if they aren't listed in ld.so.conf.

Temporarily Changing the Path

You can temporarily change the library path for your current session by setting the LD_LIBRARY_PATH environment variable. The directories listed in this colon-delimited variable are searched before the standard library directories:

 $ export LD_LIBRARY_PATH=/usr/local/testlib:/opt/newlib

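
Because an exported variable affects your whole session, a common alternative is to set LD_LIBRARY_PATH for a single command only. A quick sketch (the directories are hypothetical; they needn't exist for the variable itself to be set):

```shell
# Set the variable only for the one command that follows it; here a
# subshell echoes the value it sees.
out=$(LD_LIBRARY_PATH=/usr/local/testlib:/opt/newlib sh -c 'echo "$LD_LIBRARY_PATH"')
echo "$out"

# The current shell is unaffected afterward.
echo "${LD_LIBRARY_PATH:-unset}"
```
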
Correcting Problems

Library path problems usually manifest as a program's inability to locate a library. If you launch the program from a shell, you'll see an error message like this:

 $ gimp
 gimp: error while loading shared libraries: libXt.so.6: cannot open shared object file: No such file or directory

This message indicates that the system couldn't find the library file. The usual cause of such problems is that the library isn't installed, so you should look for it using a command such as find. If the file isn't installed, try to track down the package to which it should belong (a Web search can work wonders for this task) and install it.

If, on the other hand, the library is available, you may need to add its directory globally or to LD_LIBRARY_PATH. Sometimes, the library's path is hard-coded in the program's binary file. (You can discover this using ldd.) When this happens, you may need to create a symbolic link from the location of the library on your system to the location the program expects. A similar problem can occur when the program expects a library to have one name but the library has another name on your system. For instance, the program may link to biglib.so.5, but your system has biglib.so.5.2 installed. Minor version-number changes like this are usually inconsequential, so creating a symbolic link will correct the problem:

 # ln -s biglib.so.5.2 biglib.so.5

You must type this command as root in the directory in which the library resides. You must then run ldconfig.

Library Management Commands

Linux provides a pair of commands that you're likely to use for library management. The ldd program displays a program's shared library dependencies; that is, the libraries that a program uses. The ldconfig program updates caches and links used by the system for locating libraries; that is, it reads /etc/ld.so.conf and implements any changes in that file or in the directories to which it refers. Both of these tools are invaluable in managing libraries.

Displaying Shared Library Dependencies

If you run into programs that won't launch because of missing libraries, the first step is to check which libraries the program file uses. You can do this with the ldd command:

 $ ldd /bin/ls
         linux-vdso.so.1 (0x00007ffd5e54b000)
         libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fb05a220000)
         libcap.so.2 => /lib64/libcap.so.2 (0x00007fb05a01b000)
         libc.so.6 => /lib64/libc.so.6 (0x00007fb059c36000)
         libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fb0599b2000)
         libdl.so.2 => /lib64/libdl.so.2 (0x00007fb0597ae000)
         /lib64/ld-linux-x86-64.so.2 (0x00007fb05a66a000)
         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb05958f000)

Each line of STDOUT begins with a library name, such as linux-vdso.so.1 or libselinux.so.1. If the library name doesn't contain a complete path, ldd attempts to find the true library and displays the complete path following the => symbol. You needn't be concerned about the long hexadecimal number following the complete path to the library file. The preceding example shows one library (/lib64/ld-linux-x86-64.so.2) that's referred to with a complete path in the executable file. It lacks the initial directory-less library name and => symbol.

The ldd command accepts a few options. The most notable of these is probably -v, which displays a long list of version information following the main entry. This information may be helpful in tracking down which version of a library a program is using, in case you have multiple versions installed.

Keep in mind that libraries can themselves depend on other libraries, thus you can use ldd to discover what libraries are used by a library. Because of this potential for a dependency chain, it's possible that a program will fail to run even though all of its libraries are present. When using ldd to track down problems, be sure to check the needs of all the libraries of the program, and all of the libraries used by the first tier of libraries, and so on, until you've exhausted the chain.

The ldd utility can be run by ordinary users as well as by root. You may need to run it as root if you can't read the program file as an ordinary user.

Rebuilding the Library Cache

You need to use the ldconfig command to update the library cache. Ordinarily, it's called without any options:

 # ldconfig

This program does, though, take options to modify its behavior:

 -v option causes the program to summarize the directories and files it's registering as it goes about its business. 
 -N option causes ldconfig not to perform its primary duty of updating the library cache. It will, though, update symbolic links to libraries, which is a secondary duty of this program. 
 -n option causes ldconfig to update the links contained in the directories specified on the command line. 
 -f configfile option causes ldconfig to use the specified configuration file rather than /etc/ld.so.conf. 
 -C cachefile option causes ldconfig to use the specified cache file rather than the default. 
 -r dir option causes ldconfig to treat dir as if it were the root (/) directory. This option is helpful when you're recovering a badly corrupted system or installing a new OS. 
 -p option causes ldconfig to display the current cache: all of the library directories and the libraries they contain. 

Both RPM and DPKG library packages typically run ldconfig automatically after installing or removing the package. The same thing happens as part of the installation process for many packages compiled from source. Thus, you may well be running ldconfig more than you realize in the process of software management. You may need to run the program yourself if you manually modify your library configuration in any way.

Managing Processes and Managing Process Priorities (103.5 & 103.6)

When you type a command name, that program is run and a process is created for it. Knowing how to manage these processes is critical to using Linux. Key details in this task include identifying processes, manipulating foreground and background processes, killing processes, and adjusting process priorities.

Understanding the Kernel: The First Process

The Linux kernel is at the heart of every Linux system. Although you can't manage the kernel process in quite the way you can manage other processes, short of rebooting the computer, you can learn about it. To do so, you can use the uname command, which takes several options to display information:

 -n or --nodename           option displays the system's node name; that is, its network hostname. 
 -s or --kernel-name        option displays the kernel name, which is Linux on a Linux system.  
 -v or --kernel-version     option displays the kernel version. Ordinarily, this holds the kernel build date and time, not an actual version number.
 -r or --kernel-release     option displays the kernel version number. 
 -m or --machine            option returns information about your machine. This is likely to be a CPU code, such as x86_64. 
 -p or --processor          option returns information about your CPU such as the manufacturer, model, and clock speed; in practice, it returns unknown on many systems.
 -i or --hardware-platform  option returns hardware platform information, but this option often returns unknown. 
 -o or --operating-system   option returns the OS name (normally GNU/Linux for a Linux system). 

In practice, you're most likely to use uname -a at the command line to learn some of the basics about the kernel and system. The other options are most useful in multi-platform scripts, which can use these options to obtain critical information quickly in order to help them adjust their actions for the system on which they're running.
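
A short sketch of the scripting case described above: pulling individual uname fields into variables and branching on them (the branch bodies are placeholders):

```shell
# Gather the pieces a multi-platform script typically needs.
kernel=$(uname -s)     # kernel name, "Linux" on a Linux system
release=$(uname -r)    # kernel version number, e.g. 5.15.0
machine=$(uname -m)    # CPU architecture, e.g. x86_64

echo "Running $kernel $release on $machine"

# A script can then adjust its actions for the system it's running on:
case "$kernel" in
    Linux)  echo "Linux-specific setup would go here" ;;
    *)      echo "Unsupported kernel: $kernel" ;;
esac
```
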

Examining Process Lists

One of the most important tools in process management is ps. This program displays process status (hence the name ps). It sports many helpful options, and it's useful in monitoring what's happening on a system. This can be particularly critical when the computer isn't working as it should be, for instance, if it's unusually slow. The ps program supports an unusual number of options, but just a few of them will take you a long way. Likewise, interpreting ps output can be tricky because so many options modify the program's output. Some ps-like programs, most notably top, also deserve attention.

Using Useful ps Options

see How Linux Works.

top: A Dynamic ps Variant

see How Linux Works.

jobs: Processes Associated with Your Session

The jobs command displays minimal information about the processes associated with the current session. In practice, jobs is usually of limited value, but it does have a few uses. One of these is to provide job ID numbers. These numbers are conceptually similar to PID numbers, but they're not the same. Jobs are numbered starting from 1 for each session and, in most cases, a single shell has only a few associated jobs. The job ID numbers are used by a handful of utilities in place of PIDs, so you may need this information. A second use of jobs is to ensure that all of your programs have terminated prior to logging out. Under some circumstances, logging out of a remote login session can cause the client program to freeze up if you've left programs running. A quick check with jobs will inform you of any forgotten processes and enable you to shut them down.

pgrep: Finding Processes

The pgrep command was introduced in the Solaris OS, but it has been ported to the open-source world and is becoming more popular in Linux. It allows you to perform simple searches within the process list, similar to piping the ps command's STDOUT to the grep command. The format of the pgrep command is as follows:

 pgrep [options] pattern

You can search for processes based on the username, user ID, or group ID as well as any type of regular expression pattern:

 $ pgrep -u your_user firefox 

This example searches for processes named firefox that are run by the specified user (substitute your own username).
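
A self-contained sketch of the same idea that you can run anywhere: start a throwaway background process, find it by exact name with pgrep, and clean up. pgrep prints one PID per matching process.

```shell
# Launch a dummy long-running process to search for.
sleep 300 &
target=$!

# -x requires an exact name match, avoiding partial matches
# such as a hypothetical "sleepd" daemon.
matches=$(pgrep -x sleep)
echo "$matches"

# Clean up the dummy process.
kill "$target"
```
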

Understanding Foreground and Background Processes

see How Linux Works.

Managing Process Priorities

see How Linux Works.

Killing Processes

see How Linux Works.

Configuring Hardware

The following exam objectives are covered in this chapter:

 101.1 Determine and configure hardware settings.
 102.1 Design hard disk layout. 
 104.1 Create partitions and filesystems. 
 104.2 Maintain the integrity of filesystems. 
 104.3 Control mounting and unmounting of filesystems.

Determine and configure hardware settings (101.1)