Introduction

1. Why this guide?

The primary reason for writing this document is that a lot of readers feel the existing HOWTO to be too short and incomplete, while the Bash Scripting guide is too much of a reference work. There is nothing in between these two extremes. I also wrote this guide on the general principle that not enough free basic courses are available, though they should be.

This is a practical guide which, while not always being too serious, tries to give real-life instead of theoretical examples. I partly wrote it because I don't get excited with stripped down and over-simplified examples written by people who know what they are talking about, showing some really cool Bash feature so much out of its context that you cannot ever use it in practical circumstances. You can read that sort of stuff after finishing this book, which contains exercises and examples that will help you survive in the real world.

From my experience as a UNIX/Linux user, system administrator and trainer, I know that people can have years of daily interaction with their systems without having the slightest knowledge of task automation. Thus they often think that UNIX is not user-friendly, and even worse, they get the impression that it is slow and old-fashioned. This problem is another one that can be remedied by this guide.

2. Who should read this book?

Everybody working on a UNIX or UNIX-like system who wants to make life easier on themselves, power users and sysadmins alike, can benefit from reading this book. Readers who already have a grasp of working the system using the command line will learn the ins and outs of shell scripting that ease execution of daily tasks. System administration relies a great deal on shell scripting; common tasks are often automated using simple scripts. This document is full of examples that will encourage you to write your own and that will inspire you to improve on existing scripts.

Prerequisites/not in this course:
See Introduction to Linux (or your local TLDP mirror) if you haven't mastered one or more of these topics. Additional information can be found in your system documentation (man and info pages), or at the Linux Documentation Project.

3. New versions, translations and availability

The most recent edition can be found at http://tille.xalasys.com/training/bash/. You should find the same version at http://tldp.org/LDP/Bash-Beginners-Guide/html/index.html.

This guide is available in print from Fultus.com.

This guide has been translated:
A French translation is in the making and will be linked to as soon as it is finished.

4. Revision History

5. Contributions

Thanks to all the friends who helped (or tried to) and to my husband; your encouraging words made this work possible. Thanks to all the people who submitted bug reports, examples and remarks - among many, many others:
Special thanks to Tabatha Marshall, who volunteered to do a complete review and spell and grammar check. We make a great team: she works when I sleep. And vice versa ;-)

6. Feedback

Missing information, missing links, missing characters, remarks? Mail it to the maintainer of this document.

7. Copyright information

Copyright © 2003-2005 Machtelt Garrels.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being "New versions of this document", "Contributions", "Feedback" and "Copyright information", with no Front-Cover Texts and no Back-Cover Texts. A copy of the license is included in Appendix B entitled "GNU Free Documentation License".

The author and publisher have made every effort in the preparation of this book to ensure the accuracy of the information. However, the information contained in this book is offered without warranty, either express or implied. Neither the author nor the publisher nor any dealer or distributor will be held liable for any damages caused or alleged to be caused either directly or indirectly by this book.

The logos, trademarks and symbols used in this book are the properties of their respective owners.

8. What do you need?

bash, available from http://www.gnu.org/directory/GNU/. The Bash shell is available on nearly every Linux system, and can these days be found on a wide variety of UNIX systems. It compiles easily if you need to build your own, and has been tested on a wide variety of UNIX, Linux, MS Windows and other systems.

9. Conventions used in this document

The following typographic and usage conventions occur in this text:

Table 1. Typographic and usage conventions

10. Organization of this document

This guide discusses concepts useful in the daily life of the serious Bash user. While a basic knowledge of the usage of the shell is required, we start with a discussion of the basic shell components and practices in the first three chapters. Chapters four to six are discussions of basic tools that are commonly used in shell scripts. Chapters eight to twelve discuss the most common constructs in shell scripts. All chapters come with exercises that will test your preparedness for the next chapter.
Chapter 1. Bash and Bash scripts

1.1. Common shell programs

1.1.1. General shell functions

The UNIX shell program interprets user commands, which are either directly entered by the user, or which can be read from a file called the shell script or shell program. Shell scripts are interpreted, not compiled. The shell reads commands from the script line by line and searches for those commands on the system (see Section 1.2), while a compiler converts a program into machine readable form, an executable file - which may then be used in a shell script.

Apart from passing commands to the kernel, the main task of a shell is providing a user environment, which can be configured individually using shell resource configuration files.

1.1.2. Shell types

Just like people know different languages and dialects, your UNIX system will usually offer a variety of shell types:
The file /etc/shells gives an overview of known shells on a Linux system:
Your default shell is set in the /etc/passwd file, like this line for user mia:
To switch from one shell to another, just enter the name of the new shell in the active terminal. The system finds the directory where the name occurs using the PATH settings, and since a shell is an executable file (program), the current shell activates it and it gets executed. A new prompt is usually shown, because each shell has its typical appearance:
1.2. Advantages of the Bourne Again SHell

1.2.1. Bash is the GNU shell

The GNU project (GNU's Not UNIX) provides tools for UNIX-like system administration which are free software and comply with UNIX standards.

Bash is an sh-compatible shell that incorporates useful features from the Korn shell (ksh) and C shell (csh). It is intended to conform to the IEEE POSIX P1003.2/ISO 9945.2 Shell and Tools standard. It offers functional improvements over sh for both programming and interactive use; these include command line editing, unlimited size command history, job control, shell functions and aliases, indexed arrays of unlimited size, and integer arithmetic in any base from two to sixty-four. Bash can run most sh scripts without modification.

Like the other GNU projects, the bash initiative was started to preserve, protect and promote the freedom to use, study, copy, modify and redistribute software. It is generally known that such conditions stimulate creativity. This was also the case with the bash program, which has a lot of extra features that other shells can't offer.

1.2.2. Features only found in bash

1.2.2.1. Invocation

In addition to the single-character shell command line options which can generally be configured using the set shell built-in command, there are several multi-character options that you can use. We will come across a couple of the more popular options in this and the following chapters; the complete list can be found in the Bash info pages.

1.2.2.2. Bash startup files

Startup files are scripts that are read and executed by Bash when it starts. The following subsections describe different ways to start the shell, and the startup files that are read consequently.

1.2.2.2.1. Invoked as an interactive login shell, or with `--login'

Interactive means you can enter commands. The shell is not running because a script has been activated.
A login shell means that you got the shell after authenticating to the system, usually by giving your user name and password. Files read:
Error messages are printed if configuration files exist but are not readable. If a file does not exist, bash searches for the next.

1.2.2.2.2. Invoked as an interactive non-login shell

A non-login shell means that you did not have to authenticate to the system. For instance, when you open a terminal using an icon, or a menu item, that is a non-login shell. Files read:
This file is usually referred to in ~/.bash_profile:

    if [ -f ~/.bashrc ]; then . ~/.bashrc; fi

See Chapter 7 for more information on the if construct.

1.2.2.2.3. Invoked non-interactively

All scripts use non-interactive shells. They are programmed to do certain tasks and cannot be instructed to do other jobs than those for which they are programmed. Files read:
PATH is not used to search for this file, so if you want to use it, best refer to it by giving the full path and file name.

1.2.2.2.4. Invoked with the sh command

Bash tries to behave as the historical Bourne sh program while conforming to the POSIX standard as well. Files read:
When invoked interactively, the ENV variable can point to extra startup information.

1.2.2.2.5. POSIX mode

This option is enabled either using the set built-in:

    set -o posix

or by calling the bash program with the --posix option. Bash will then try to comply as closely as possible with the POSIX standard for shells. Setting the POSIXLY_CORRECT variable does the same. Files read:

1.2.2.2.6. Invoked remotely

Files read when invoked by rshd:
1.2.2.3. Interactive shells

1.2.2.3.1. What is an interactive shell?

An interactive shell generally reads from, and writes to, a user's terminal: input and output are connected to a terminal. Bash interactive behavior is started when the bash command is called upon without non-option arguments, except when the option is a string to read from or when the shell is invoked to read from standard input, which allows for positional parameters to be set (see Chapter 3).

1.2.2.3.2. Is this shell interactive?

Test by looking at the content of the special parameter -, which contains an 'i' when the shell is interactive:
In non-interactive shells, the prompt, PS1, is unset.

1.2.2.3.3. Interactive shell behavior

Differences in interactive mode:
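A minimal sketch of such a test, assuming a helper function named shell_mode (a hypothetical name, not part of Bash). It inspects the special parameter - with a case pattern:

```bash
#!/bin/bash
# shell_mode is a hypothetical helper used only for this sketch.
# It reports whether the shell running it is interactive by looking
# for the letter i in the special parameter $-.
shell_mode() {
    case "$-" in
        *i*) echo "interactive" ;;
        *)   echo "non-interactive" ;;
    esac
}

shell_mode
```

Typed at a prompt, the function should report an interactive shell; run from a script file, it should report a non-interactive one.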
More information:
1.2.2.4. Conditionals

Conditional expressions are used by the [[ compound command and by the test and [ built-in commands.

Expressions may be unary or binary. Unary expressions are often used to examine the status of a file. You only need one object, for instance a file, to do the operation on.

There are string operators and numeric comparison operators as well; these are binary operators, requiring two objects to do the operation on. If the FILE argument to one of the primaries is in the form /dev/fd/N, then file descriptor N is checked. If the FILE argument to one of the primaries is one of /dev/stdin, /dev/stdout or /dev/stderr, then file descriptor 0, 1 or 2 respectively is checked.

Conditionals are discussed in detail in Chapter 7. More information about the file descriptors in Section 8.2.3.

1.2.2.5. Shell arithmetic

The shell allows arithmetic expressions to be evaluated, as one of the shell expansions or by the let built-in.

Evaluation is done in fixed-width integers with no check for overflow, though division by 0 is trapped and flagged as an error. The operators and their precedence and associativity are the same as in the C language, see Chapter 3.

1.2.2.6. Aliases

Aliases allow a string to be substituted for a word when it is used as the first word of a simple command. The shell maintains a list of aliases that may be set and unset with the alias and unalias commands.

Bash always reads at least one complete line of input before executing any of the commands on that line. Aliases are expanded when a command is read, not when it is executed. Therefore, an alias definition appearing on the same line as another command does not take effect until the next line of input is read. The commands following the alias definition on that line are not affected by the new alias.

Aliases are expanded when a function definition is read, not when the function is executed, because a function definition is itself a compound command. As a consequence, aliases defined in a function are not available until after that function is executed.

We will discuss aliases in detail in Section 3.5.

1.2.2.7. Arrays

Bash provides one-dimensional array variables. Any variable may be used as an array; the declare built-in will explicitly declare an array. There is no maximum limit on the size of an array, nor any requirement that members be indexed or assigned contiguously. Arrays are zero-based. See Chapter 10.

1.2.2.8. Directory stack

The directory stack is a list of recently-visited directories. The pushd built-in adds directories to the stack as it changes the current directory, and the popd built-in removes specified directories from the stack and changes the current directory to the directory removed.

Content can be displayed issuing the dirs command or by checking the content of the DIRSTACK variable.

More information about the workings of this mechanism can be found in the Bash info pages.

1.2.2.9. The prompt

Bash makes playing with the prompt even more fun. See the section Controlling the Prompt in the Bash info pages.

1.2.2.10. The restricted shell

When invoked as rbash or with the --restricted or -r option, the following happens:
When a command that is found to be a shell script is executed, rbash turns off any restrictions in the shell spawned to execute the script. More information:
1.3. Executing commands

1.3.1. General

Bash determines the type of program that is to be executed. Normal programs are system commands that exist in compiled form on your system. When such a program is executed, a new process is created because Bash makes an exact copy of itself. This child process has the same environment as its parent, only the process ID number is different. This procedure is called forking.

After the forking process, the address space of the child process is overwritten with the new process data. This is done through an exec call to the system.

The fork-and-exec mechanism thus switches an old command with a new, while the environment in which the new program is executed remains the same, including configuration of input and output devices, environment variables and priority. This mechanism is used to create all UNIX processes, so it also applies to the Linux operating system. Even the first process, init, with process ID 1, is forked during the boot procedure in the so-called bootstrapping procedure.

1.3.2. Shell built-in commands

Built-in commands are contained within the shell itself. When the name of a built-in command is used as the first word of a simple command, the shell executes the command directly, without creating a new process. Built-in commands are necessary to implement functionality impossible or inconvenient to obtain with separate utilities.

Bash supports 3 types of built-in commands:
Most of these built-ins will be discussed in the next chapters. For those commands for which this is not the case, we refer to the Info pages.

1.3.3. Executing programs from a script

When the program being executed is a shell script, bash will create a new bash process using a fork. This subshell reads the lines from the shell script one line at a time. Commands on each line are read, interpreted and executed as if they had come directly from the keyboard.

While the subshell processes each line of the script, the parent shell waits for its child process to finish. When there are no more lines in the shell script to read, the subshell terminates. The parent shell awakes and displays a new prompt.

1.4. Building blocks

1.4.1. Shell building blocks

1.4.1.1. Shell syntax

If input is not commented, the shell reads it and divides it into words and operators, employing quoting rules to define the meaning of each character of input. Then these words and operators are translated into commands and other constructs, which return an exit status available for inspection or processing. The above fork-and-exec scheme is only applied after the shell has analyzed input in the following way:
1.4.1.2. Shell commands

A simple shell command such as touch file1 file2 file3 consists of the command itself followed by arguments, separated by spaces.

More complex shell commands are composed of simple commands arranged together in a variety of ways: in a pipeline in which the output of one command becomes the input of a second, in a loop or conditional construct, or in some other grouping. A couple of examples:

    ls | more
    gunzip -c file.tar.gz | tar xvf -

1.4.1.3. Shell functions

Shell functions are a way to group commands for later execution using a single name for the group. They are executed just like a "regular" command. When the name of a shell function is used as a simple command name, the list of commands associated with that function name is executed.

Shell functions are executed in the current shell context; no new process is created to interpret them.

Functions are explained in Chapter 11.

1.4.1.4. Shell parameters

A parameter is an entity that stores values. It can be a name, a number or a special value. For the shell's purpose, a variable is a parameter denoted by a name. A variable has a value and zero or more attributes. Variables are created by assignment or with the declare shell built-in command.

If no value is given, a variable is assigned the null string. Variables can only be removed with the unset built-in.

Assigning variables is discussed in Section 3.2, advanced use of variables in Chapter 10.

1.4.1.5. Shell expansions

Shell expansion is performed after each command line has been split into tokens. These are the expansions performed:
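As a small illustration, the sketch below exercises three common expansions: parameter expansion, arithmetic expansion and command substitution. It is not a complete list of the expansions the shell performs, and the variable names are chosen only for this example.

```bash
#!/bin/bash
# Illustrative variable; not taken from the guide's own examples.
name="world"

# Parameter expansion: ${name} is replaced by the value of the variable.
greeting="Hello, ${name}"

# Arithmetic expansion: the expression is evaluated with C-like precedence.
sum=$(( 2 + 3 * 4 ))

# Command substitution: the output of the pipeline replaces $( ... ).
upper=$(echo "$name" | tr '[:lower:]' '[:upper:]')

echo "$greeting"
echo "$sum"
echo "$upper"
```

Each echo shows the result after the shell has performed the expansion, which is what the command itself eventually receives as its arguments.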
We'll discuss these expansion types in detail in Section 3.4.

1.4.1.6. Redirections

Before a command is executed, its input and output may be redirected using a special notation interpreted by the shell. Redirection may also be used to open and close files for the current shell execution environment.

1.4.1.7. Executing commands

When executing a command, the words that the parser has marked as variable assignments (preceding the command name) and redirections are saved for later reference. Words that are not variable assignments or redirections are expanded; the first remaining word after expansion is taken to be the name of the command and the rest are arguments to that command. Then redirections are performed, then strings assigned to variables are expanded. If no command name results, variables will affect the current shell environment.

An important part of the tasks of the shell is to search for commands. Bash does this as follows:

1.4.1.8. Shell scripts

When a file containing shell commands is used as the first non-option argument when invoking Bash (without -c or -s), this will create a non-interactive shell. This shell first searches for the script file in the current directory, then looks in PATH if the file cannot be found there.

1.5. Developing good scripts

1.5.1. Properties of good scripts

This guide is mainly about the last shell building block, scripts. Some general considerations before we continue:
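One way to observe the outcome of this search is the command -v built-in, which reports how a name would be resolved. The sketch below assumes a hypothetical function named greet and relies on ls being installed as an ordinary program in one of the PATH directories:

```bash
#!/bin/bash
# greet is a hypothetical function defined only for this sketch.
greet() { echo "hello"; }

command -v greet    # a shell function: its name is printed
command -v cd       # a shell built-in: its name is printed
command -v ls       # an external program: its full path is printed
```

For a name without slashes, functions are considered first, then built-ins, and only then the directories listed in PATH, so a function called ls would take precedence over the program of the same name.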
1.5.2. Structure

The structure of a shell script is very flexible. Even though in Bash a lot of freedom is granted, you must ensure correct logic, flow control and efficiency so that users executing the script can do so easily and correctly.

When starting on a new script, ask yourself the following questions:

1.5.3. Terminology

The table below gives an overview of programming terms that you need to be familiar with:

Table 1-1. Overview of programming terms

1.5.4. A word on order and logic

In order to speed up the development process, the logical order of a program should be thought over in advance. This is your first step when developing a script.

A number of methods can be used; one of the most common is working with lists. Itemizing the list of tasks involved in a program allows you to describe each process. Individual tasks can be referenced by their item number.

Using your own spoken language to pin down the tasks to be executed by your program will help you to create an understandable form of your program. Later, you can replace the everyday language statements with shell language words and constructs.

The example below shows such a logic flow design. It describes the rotation of log files. This example shows a possible repetitive loop, controlled by the number of base log files you want to rotate:
The user should provide information for the program to do something. Input from the user must be obtained and stored. The user should be notified that his crontab will change.

1.5.5. An example Bash script: mysystem.sh

The mysystem.sh script below executes some well-known commands (date, w, uname, uptime) to display information about you and your machine.
A script always starts with the same two characters, "#!". After that, the shell that will execute the commands following the first line is defined. This script starts with clearing the screen on line 2. Line 3 makes it print a message, informing the user about what is going to happen. Line 5 greets the user. Lines 6, 9, 13, 16 and 20 are only there for orderly output display purposes. Line 8 prints the current date and the number of the week. Line 11 is again an informative message, like lines 3, 18 and 22. Line 12 formats the output of the w command; line 15 shows operating system and CPU information. Line 19 gives the uptime and load information.

Both echo and printf are Bash built-in commands. The first always exits with a 0 status, and simply prints arguments followed by an end of line character on the standard output, while the latter allows for definition of a formatting string and gives a non-zero exit status code upon failure.

This is the same script using the printf built-in:
Creating user friendly scripts by means of inserting messages is treated in Chapter 8.
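To make the difference between the two built-ins concrete, here is a minimal sketch. The values are illustrative and are not taken from mysystem.sh:

```bash
#!/bin/bash
# Illustrative values only.
user="mia"
count=3

# echo prints its arguments and appends the end of line character itself.
echo "User: $user"

# printf takes a format string first; the newline has to be written out.
printf "User: %s, sessions: %d\n" "$user" "$count"

# Field widths: %-10s left-aligns in 10 columns, %5d right-aligns in 5.
printf "%-10s|%5d|\n" "$user" "$count"
```

The format string is what makes printf the better choice for aligned, table-like output, while echo is sufficient for plain messages.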
The following chapters will discuss the details of the above scripts.

1.5.6. Example init script

An init script starts system services on UNIX and Linux machines. The system log daemon, the power management daemon, the name and mail daemons are common examples. These scripts, also known as startup scripts, are stored in a specific location on your system, such as /etc/rc.d/init.d or /etc/init.d. Init, the initial process, reads its configuration files and decides which services to start or stop in each run level. A run level is a configuration of processes; each system has a single user run level, for instance, for performing administrative tasks, for which the system has to be in an unused state as much as possible, such as recovering a critical file system from a backup. Reboot and shutdown run levels are usually also configured.

The tasks to be executed upon starting a service or stopping it are listed in the startup scripts. It is one of the system administrator's tasks to configure init, so that services are started and stopped at the correct moment. When confronted with this task, you need a good understanding of the startup and shutdown procedures on your system. We therefore advise that you read the man pages for init and inittab before starting on your own initialization scripts.

Here is a very simple example, that will play a sound upon starting and stopping your machine:
The case statement often used in this kind of script is described in Section 7.2.5.

1.6. Summary

Bash is the GNU shell, compatible with the Bourne shell and incorporating many useful features from other shells. When the shell is started, it reads its configuration files. The most important are:
Bash behaves differently when in interactive mode and also has a POSIX compliant and a restricted mode.

Shell commands can be split up in three groups: the shell functions, shell built-ins and existing commands in a directory on your system. Bash supports additional built-ins not found in the plain Bourne shell.

Shell scripts consist of these commands arranged as shell syntax dictates. Scripts are read and executed line by line and should have a logical structure.

1.7. Exercises

These are some exercises to warm you up for the next chapter:
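The general shape of such a script can be sketched as follows. This is only an outline under stated assumptions: the function name service_action is hypothetical, and the messages stand in for whatever a real script would do at these points, such as playing a sound or launching a daemon.

```bash
#!/bin/bash
# Hypothetical init-style sketch: it only prints messages where a real
# init script would start or stop a service.
service_action() {
    case "$1" in
        start)
            echo "Starting example service"
            ;;
        stop)
            echo "Stopping example service"
            ;;
        *)
            # Unknown argument: explain the usage and signal failure.
            echo "Usage: $0 {start|stop}" >&2
            return 1
            ;;
    esac
}

# Use the first script argument, defaulting to "start" when none is given.
service_action "${1:-start}"
```

Init calls such a script with an argument like start or stop, and the case statement selects the matching branch.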
Chapter 2. Writing and debugging scripts

2.1. Creating and running a script

2.1.1. Writing and naming

A shell script is a sequence of commands for which you have a repeated use. This sequence is typically executed by entering the name of the script on the command line. Alternatively, you can use scripts to automate tasks using the cron facility. Another use for scripts is in the UNIX boot and shutdown procedure, where the operation of daemons and services is defined in init scripts.

To create a shell script, open a new empty file in your editor. Any text editor will do: vim, emacs, gedit, dtpad et cetera are all valid. You might want to choose a more advanced editor like vim or emacs, however, because these can be configured to recognize shell and Bash syntax and can be a great help in preventing those errors that beginners frequently make, such as forgetting brackets and semi-colons.

Put UNIX commands in the new empty file, like you would enter them on the command line. As discussed in the previous chapter (see Section 1.3), commands can be shell functions, shell built-ins, UNIX commands and other scripts.

Give your script a sensible name that gives a hint about what the script does. Make sure that your script name does not conflict with existing commands. In order to ensure that no confusion can arise, script names often end in .sh; even so, there might be other scripts on your system with the same name as the one you chose. Check using which, whereis and other commands for finding information about programs and files:

    which -a script_name
    whereis script_name
    locate script_name

2.1.2. script1.sh

In this example we use the echo Bash built-in to inform the user about what is going to happen, before the task that will create the output is executed. It is strongly advised to inform users about what a script is doing, in order to prevent them from becoming nervous because the script is not doing anything. We will return to the subject of notifying users in Chapter 8.
Write this script for yourself as well. It might be a good idea to create a directory ~/scripts to hold your scripts. Add the directory to the contents of the PATH variable:

    export PATH="$PATH:$HOME/scripts"

Note that $HOME is used here rather than ~, because a tilde inside double quotes is not expanded when the value is assigned.

If you are just getting started with Bash, use a text editor that uses different colours for different shell constructs. Syntax highlighting is supported by vim, gvim, (x)emacs, kwrite and many other editors; check the documentation of your favorite editor.
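If the export line ends up in a startup file that is read more than once, the same directory can be appended to PATH repeatedly. A minimal sketch of one way to avoid that, assuming a helper function named add_to_path (a hypothetical name):

```bash
#!/bin/bash
# add_to_path is a hypothetical helper: it appends a directory to PATH
# only when that directory is not already listed there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path "$HOME/scripts"
export PATH
echo "$PATH"
```

Wrapping PATH in colons before matching means the first and last entries are compared in the same way as the ones in the middle.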
2.1.3. Executing the script

The script should have execute permissions for the correct owners in order to be runnable. When setting permissions, check that you really obtained the permissions that you want. When this is done, the script can run like any other command:
This is the most common way to execute a script. It is preferred to execute the script like this in a subshell. The variables, functions and aliases created in this subshell are only known to the particular bash session of that subshell. When that shell exits and the parent regains control, everything is cleaned up and all changes to the state of the shell made by the script are forgotten.

If you did not put the scripts directory in your PATH, and . (the current directory) is not in the PATH either, you can activate the script like this:

    ./script_name.sh

A script can also explicitly be executed by a given shell, but generally we only do this if we want to obtain special behavior, such as checking if the script works with another shell or printing traces for debugging:

    rbash script_name.sh
    sh script_name.sh
    bash -x script_name.sh

The specified shell will start as a subshell of your current shell and execute the script. This is done when you want the script to start up with specific options or under specific conditions which are not specified in the script.

If you don't want to start a new shell but execute the script in the current shell, you source it:

    source script_name.sh
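The difference between running a script in a child shell and sourcing it can be checked with a throwaway script. The sketch below is only an illustration: the file is created with mktemp, DEMO_VAR is a made-up variable name, and the portable . command is used, which Bash treats the same as source.

```bash
#!/bin/bash
# Create a throwaway script that does nothing but set a variable.
demo_script=$(mktemp)
echo 'DEMO_VAR="set by script"' > "$demo_script"

# Run it in a child shell (sh here; bash behaves the same way):
# the assignment is lost as soon as the child exits.
sh "$demo_script"
echo "after child shell: '${DEMO_VAR:-}'"

# Source it: the assignment is made in the current shell and persists.
. "$demo_script"
echo "after sourcing:    '${DEMO_VAR:-}'"

rm -f "$demo_script"
```

The first message shows an empty value, the second shows the value assigned by the script, because only the sourced run changed the current shell.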
The script does not need execute permission in this case. Commands are executed in the current shell context, so any changes made to your environment will be visible when the script finishes execution:
2.2. Script basics

2.2.1. Which shell will run the script?

When running a script in a subshell, you should define which shell should run the script. The shell type in which you wrote the script might not be the default on your system, so commands you entered might result in errors when executed by the wrong shell.

The first line of the script determines the shell to start. The first two characters of the first line should be #!, then follows the path to the shell that should interpret the commands that follow. Blank lines are also considered to be lines, so don't start your script with an empty line.

For the purpose of this course, all scripts will start with the line

    #!/bin/bash

As noted before, this implies that the Bash executable can be found in /bin.

2.2.2. Adding comments

You should be aware of the fact that you might not be the only person reading your code. A lot of users and system administrators run scripts that were written by other people. If they want to see how you did it, comments are useful to enlighten the reader.

Comments also make your own life easier. Say that you had to read a lot of man pages in order to achieve a particular result with some command that you used in your script. You won't remember how it worked if you need to change your script after a few weeks or months, unless you have commented what you did, how you did it and/or why you did it.

Take the script1.sh example and copy it to commented-script1.sh, which we edit so that the comments reflect what the script does. Everything the shell encounters after a hash mark on a line is ignored and only visible upon opening the shell script file:
In a decent script, the first lines are usually comments about what to expect. Then each big chunk of commands will be commented as needed for clarity's sake. Linux init scripts, as an example, in your system's init.d directory, are usually well commented since they have to be readable and editable by everyone running Linux.

2.3. Debugging Bash scripts

2.3.1. Debugging on the entire script

When things don't go according to plan, you need to determine what exactly causes the script to fail. Bash provides extensive debugging features. The most common is to start up the subshell with the -x option, which will run the entire script in debug mode. Traces of each command plus its arguments are printed to standard error after the commands have been expanded but before they are executed.

This is the commented-script1.sh script run in debug mode. Note again that the added comments are not visible in the output of the script.

2.3.2. Debugging on part(s) of the script

Using the set Bash built-in you can run in normal mode those portions of the script of which you are sure they are without fault, and display debugging information only for troublesome zones. Say we are not sure what the w command will do in the example commented-script1.sh, then we could enclose it in the script like this:
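A minimal sketch of the technique, assuming the who command as a stand-in for whichever command is under suspicion (w in the discussion above):

```bash
#!/bin/bash
# Only the command between set -x and set +x is traced;
# everything else runs in normal mode.
echo "Checking who is logged in:"

set -x          # activate tracing from here on
who             # stand-in for the w command discussed above
set +x          # deactivate tracing again

echo "Tracing is off again."
```

The traced lines are prefixed with a plus sign by default, and the set +x command itself also appears in the trace, since tracing is still active at the moment it is read.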
Output then looks like this:
You can switch debugging mode on and off as many times as you want within the same script. The table below gives an overview of other useful Bash options: Table 2-1. Overview of set debugging options
The dash is used to activate a shell option and a plus to deactivate it. Don't let this confuse you! In the example below, we demonstrate these options on the command line:
Alternatively, these modes can be specified in the script itself, by adding the desired options to the shell declaration on the first line. Options can be combined, as is usually the case with UNIX commands:

#!/bin/bash -xv

Once you have found the buggy part of your script, you can add echo statements before each command of which you are unsure, so that you will see exactly where and why things don't work. In the commented-script1.sh example, it could be done like this, still assuming that displaying the users gives us problems:
In more advanced scripts, the echo can be inserted to display the content of variables at different stages in the script, so that flaws can be detected:
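As a rough sketch of this technique, the fragment below prints each variable right after it is assigned. The variable names and values are hypothetical and only serve to show where the echo statements go.

```shell
#!/bin/bash
# Illustrative sketch: variable names and values are hypothetical.
basedir="/var/tmp"
echo "debug: basedir is now '$basedir'"

logdir="$basedir/logs"
echo "debug: logdir is now '$logdir'"

logfile="$logdir/run.log"
echo "debug: logfile is now '$logfile'"
```

Putting the value between quotes in the message makes empty values and stray spaces easy to spot.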
2.4. SummaryA shell script is a reusable series of commands put in an executable text file. Any text editor can be used to write scripts. Scripts start with #! followed by the path to the shell executing the commands from the script. Comments are added to a script for your own future reference, and also to make it understandable for other users. It is better to have too many explanations than not enough. Debugging a script can be done using shell options. Shell options can be used for partial debugging or for analyzing the entire script. Inserting echo commands at strategic locations is also a common troubleshooting technique. 2.5. ExercisesThis exercise will help you to create your first script.
Chapter 3. The Bash environment3.1. Shell initialization files3.1.1. System-wide configuration files3.1.1.1. /etc/profileWhen invoked interactively with the --login option or when invoked as sh, Bash reads the /etc/profile instructions. These usually set the shell variables PATH, USER, MAIL, HOSTNAME and HISTSIZE. On some systems, the umask value is configured in /etc/profile; on other systems this file holds pointers to other configuration files such as:
All settings that you want to apply to all your users' environments should be in this file. It might look like this:
This configuration file sets some basic shell environment variables as well as some variables required by users running Java and/or Java applications in their web browser. See Section 3.2. See Chapter 7 for more on the conditional if used in this file; Chapter 9 discusses loops such as the for construct. The Bash source contains sample profile files for general or individual use. These and the one in the example above need changes in order for them to work in your environment! 3.1.1.2. /etc/bashrcOn systems offering multiple types of shells, it might be better to put Bash-specific configurations in this file, since /etc/profile is also read by other shells, such as the Bourne shell. Errors generated by shells that don't understand the Bash syntax are prevented by splitting the configuration files for the different types of shells. In such cases, the user's ~/.bashrc might point to /etc/bashrc in order to include it in the shell initialization process upon login. You might also find that /etc/profile on your system only holds shell environment and program startup settings, while /etc/bashrc contains system-wide definitions for shell functions and aliases. The /etc/bashrc file might be referred to in /etc/profile or in individual user shell initialization files. The source contains sample bashrc files, or you might find a copy in /usr/share/doc/bash-2.05b/startup-files. This is part of the bashrc that comes with the Bash documentation:
Apart from general aliases, it contains useful aliases which make commands work even if you misspell them. We will discuss aliases in Section 3.5.2. This file contains a function, pskill; functions will be studied in detail in Chapter 11. 3.1.2. Individual user configuration files
3.1.2.1. ~/.bash_profile

This is the preferred configuration file for configuring user environments individually. In this file, users can add extra configuration options or change default settings:
This user configures the backspace character for login on different operating systems. Apart from that, the user's .bashrc and .bash_login are read. 3.1.2.2. ~/.bash_loginThis file contains specific settings that are normally only executed when you log in to the system. In the example, we use it to configure the umask value and to show a list of connected users upon login. This user also gets the calendar for the current month:
In the absence of ~/.bash_profile, this file will be read. 3.1.2.3. ~/.profileIn the absence of ~/.bash_profile and ~/.bash_login, ~/.profile is read. It can hold the same configurations, which are then also accessible by other shells. Mind that other shells might not understand the Bash syntax. 3.1.2.4. ~/.bashrcToday, it is more common to use a non-login shell, for instance when logged in graphically using X terminal windows. Upon opening such a window, the user does not have to provide a user name or password; no authentication is done. Bash searches for ~/.bashrc when this happens, so it is referred to in the files read upon login as well, which means you don't have to enter the same settings in multiple files. In this user's .bashrc a couple of aliases are defined and variables for specific programs are set after the system-wide /etc/bashrc is read:
More examples can be found in the Bash package. Remember that sample files might need changes in order to work in your environment. Aliases are discussed in Section 3.5. 3.1.2.5. ~/.bash_logoutThis file contains specific instructions for the logout procedure. In the example, the terminal window is cleared upon logout. This is useful for remote connections, which will leave a clean window after closing them.
3.1.3. Changing shell configuration filesWhen making changes to any of the above files, users have to either reconnect to the system or source the altered file for the changes to take effect. By interpreting the script this way, changes are applied to the current shell session: Most shell scripts execute in a private environment: variables are not inherited by child processes unless they are exported by the parent shell. Sourcing a file containing shell commands is a way of applying changes to your own environment and setting variables in the current shell. This example also demonstrates the use of different prompt settings by different users. In this case, red means danger. When you have a green prompt, don't worry too much. Note that source resourcefile is the same as . resourcefile. Should you get lost in all these configuration files, and find yourself confronted with settings of which the origin is not clear, use echo statements, just like for debugging scripts; see Section 2.3.2. You might add lines like this:
or like this:
3.2. Variables

3.2.1. Types of variables

As seen in the examples above, shell variables are in uppercase characters by convention. Bash keeps a list of two types of variables:

3.2.1.1. Global variables

Global variables or environment variables are available in all shells. The env or printenv commands can be used to display environment variables. These programs come with the sh-utils package. Below is a typical output:
3.2.1.2. Local variables

Local variables are only available in the current shell. Using the set built-in command without any options will display a list of all variables (including environment variables) and functions. The output will be sorted according to the current locale and displayed in a reusable format. Below is a diff file made by comparing printenv and set output, after leaving out the functions which are also displayed by the set command:
3.2.1.3. Variables by content

Apart from dividing variables into local and global variables, we can also divide them into categories according to the sort of content the variable contains. In this respect, variables come in 4 types:
We'll discuss these types in Chapter 10. For now, we will work with integer and string values for our variables. 3.2.2. Creating variablesVariables are case sensitive and capitalized by default. Giving local variables a lowercase name is a convention which is sometimes applied. However, you are free to use the names you want or to mix cases. Variables can also contain digits, but a name starting with a digit is not allowed:
To set a variable in the shell, use VARNAME="value" Putting spaces around the equal sign will cause errors. It is a good habit to quote content strings when assigning values to variables: this will reduce the chance that you make errors. Some examples using upper and lower cases, numbers and spaces:
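The assignments below are a small sketch of these rules; the names and values are chosen only for illustration. The two failing forms are left as comments so that the fragment can be pasted and run as is.

```shell
MYVAR1="2"
first_name="Franky"
full_name="Franky M. Singh"

echo "$MYVAR1"        # prints 2
echo "$first_name"    # prints Franky
echo "$full_name"     # prints Franky M. Singh

# Spaces around the equal sign make Bash look for a command named MYVAR2:
# MYVAR2 = "2"

# Without quotes, only Franky is assigned and Bash tries to run M. as a command:
# full_name=Franky M. Singh
```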
3.2.3. Exporting variablesA variable created like the ones in the example above is only available to the current shell. It is a local variable: child processes of the current shell will not be aware of this variable. In order to pass variables to a subshell, we need to export them using the export built-in command. Variables that are exported are referred to as environment variables. Setting and exporting is usually done in one step: export VARNAME="value" A subshell can change variables it inherited from the parent, but the changes made by the child don't affect the parent. This is demonstrated in the example:
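The behavior can be sketched non-interactively by starting child shells with bash -c. The variable name and its values are assumptions for the sake of the demonstration.

```shell
full_name="Franky M. Singh"

# Not exported yet: a child shell sees an empty value.
before=$(bash -c 'echo "child sees: [$full_name]"')

export full_name

# Now the child inherits the value; changing it there only changes the child's copy.
after=$(bash -c 'echo "child sees: [$full_name]"; full_name="Charles the Great"')

echo "$before"
echo "$after"
echo "parent still has: [$full_name]"
```

The last line shows that the assignment made inside the second child shell never reaches the parent.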
When first trying to read the value of full_name in a subshell, it is not there (echo shows a null string). The subshell quits, and full_name is exported in the parent - a variable can be exported after it has been assigned a value. Then a new subshell is started, in which the variable exported from the parent is visible. The variable is changed to hold another name, but the value for this variable in the parent stays the same. 3.2.4. Reserved variables3.2.4.1. Bourne shell reserved variablesBash uses certain shell variables in the same way as the Bourne shell. In some cases, Bash assigns a default value to the variable. The table below gives an overview of these plain shell variables: Table 3-1. Reserved Bourne shell variables
3.2.4.2. Bash reserved variablesThese variables are set or used by Bash, but other shells do not normally treat them specially. Table 3-2. Reserved Bash variables
Check the Bash man, info or doc pages for extended information. Some variables are read-only, some are set automatically and some lose their meaning when set to a different value than the default. 3.2.5. Special parametersThe shell treats several parameters specially. These parameters may only be referenced; assignment to them is not allowed. Table 3-3. Special bash variables
The positional parameters are the words following the name of a shell script. They are put into the variables $1, $2, $3 and so on. As long as needed, variables are added to an internal array. $# holds the total number of parameters, as is demonstrated with this simple script:
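A minimal sketch of such a script is shown below. The file name positional.sh is hypothetical, and the set -- line merely stands in for arguments typed on the command line so that the fragment can be run on its own.

```shell
#!/bin/bash
# Illustrative sketch: 'set --' simulates starting the script as
#   ./positional.sh one two three
set -- one two three

echo "First parameter:  $1"
echo "Second parameter: $2"
echo "Third parameter:  $3"
echo "Total number of parameters: $#"

count=$#    # keep the count for later use
```

In a real script you would drop the set -- line and let the caller supply the arguments.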
Upon execution one could give any number of arguments:
More on evaluating these parameters is in Chapter 7 and Section 9.7. Some examples on the other special parameters:
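The fragment below is a short sketch of how these parameters can be inspected; the commands used (true, false, sleep) are picked only because they behave predictably.

```shell
true
echo "Exit status after true: $?"      # 0 means success

status=0
false || status=$?                     # capture the non-zero exit status
echo "Exit status after false: $status"

echo "Process ID of this shell: $$"

sleep 1 &                              # start a job in the background
bgpid=$!                               # process ID of that job
echo "Process ID of the background job: $bgpid"
wait "$bgpid"

echo "Name of the shell or script: $0"
```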
User franky starts by entering the grep command, which results in the assignment of the _ variable. The process ID of his shell is 10662. After putting a job in the background, the ! holds the process ID of the backgrounded job. The shell running is bash. When a mistake is made, ? holds an exit code different from 0 (zero).

3.2.6. Script recycling with variables

Apart from making the script more readable, variables also enable you to apply a script faster in another environment or for another purpose. Consider the following example, a very simple script that makes a backup of franky's home directory to a remote server:
First of all, you are more likely to make errors if you name files and directories manually each time you need them. Secondly, suppose franky wants to give this script to carol, then carol will have to do quite some editing before she can use the script to back up her home directory. The same is true if franky wants to use this script for backing up other directories. For easy recycling, make all files, directories, usernames, servernames etcetera variable. Thus, you only need to edit a value once, without having to go through the entire script to check where a parameter occurs. This is an example:
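What follows is a sketch of the idea rather than the original listing. It assumes tar is available, uses temporary directories as stand-ins for the directory to back up, and leaves the copy to the remote server commented out because the host name and remote path are made up.

```shell
#!/bin/bash
# Illustrative sketch: every site-specific value lives in one variable.
SOURCEDIR=$(mktemp -d)            # stand-in for the directory to back up
TARGETDIR=$(mktemp -d)            # stand-in for local scratch space
ARCHIVE="$TARGETDIR/backup.tar"
SERVER="backup.example.com"       # hypothetical remote host
REMOTEDIR="/opt/backup"           # hypothetical remote directory

echo "sample data" > "$SOURCEDIR/notes.txt"

# Pack the contents of the source directory.
tar -cf "$ARCHIVE" -C "$SOURCEDIR" .

# The copy to the server is commented out because the host is made up:
# scp "$ARCHIVE" "$SERVER:$REMOTEDIR/"

echo "Backup of $SOURCEDIR written to $ARCHIVE"
```

To reuse the script for another directory or another user, only the assignments at the top need to change.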
3.3. Quoting characters

3.3.1. Why?

A lot of keys have special meanings in some context or other. Quoting is used to remove the special meaning of characters or words: quotes can disable special treatment for special characters, they can prevent reserved words from being recognized as such and they can disable parameter expansion.

3.3.2. Escape characters

Escape characters are used to remove the special meaning from a single character. A non-quoted backslash, \, is used as an escape character in Bash. It preserves the literal value of the next character that follows, with the exception of newline. If a newline character appears immediately after the backslash, it marks the continuation of a line when it is longer than the width of the terminal; the backslash-newline pair is removed from the input stream and effectively ignored.
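A small sketch of both uses of the backslash, with an assumed variable name and value:

```shell
date=20021226
echo $date      # prints 20021226
echo \$date     # prints $date: the backslash removes the special meaning of $

# A backslash at the very end of a line continues the command on the next line.
echo one \
     two        # prints one two
```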
In this example, the variable date is created and set to hold a value. The first echo displays the value of the variable, but for the second, the dollar sign is escaped. 3.3.3. Single quotesSingle quotes ('') are used to preserve the literal value of each character enclosed within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash. We continue with the previous example:
3.3.4. Double quotesUsing double quotes the literal value of all characters enclosed is preserved, except for the dollar sign, the backticks (backward single quotes, ``) and the backslash. The dollar sign and the backticks retain their special meaning within the double quotes. The backslash retains its meaning only when followed by dollar, backtick, double quote, backslash or newline. Within double quotes, the backslashes are removed from the input stream when followed by one of these characters. Backslashes preceding characters that don't have a special meaning are left unmodified for processing by the shell interpreter. A double quote may be quoted within double quotes by preceding it with a backslash.
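The difference between single and double quotes can be sketched as follows. The variable and the strings are assumptions; printf is used for the last line so that the result does not depend on how echo treats backslashes.

```shell
date=20021226

echo '$date'                    # single quotes: prints $date
echo "$date"                    # double quotes: prints 20021226
echo "\$date"                   # escaped dollar sign: prints $date
echo "He said \"hi\""           # escaped double quotes: prints He said "hi"
printf '%s\n' "back\slash"      # backslash before an ordinary character stays: prints back\slash
```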
3.4. Shell expansion3.4.1. GeneralAfter the command has been split into tokens (see Section 1.4.1.1), these tokens or words are expanded or resolved. There are eight kinds of expansion performed, which we will discuss in the next sections, in the order that they are expanded. After all expansions, quote removal is performed. 3.4.2. Brace expansionBrace expansion is a mechanism by which arbitrary strings may be generated. Patterns to be brace-expanded take the form of an optional PREAMBLE, followed by a series of comma-separated strings between a pair of braces, followed by an optional POSTSCRIPT. The preamble is prefixed to each string contained within the braces, and the postscript is then appended to each resulting string, expanding left to right. Brace expansions may be nested. The results of each expanded string are not sorted; left to right order is preserved:
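A few illustrative cases, using made-up strings:

```shell
echo sp{el,il,al}l       # prints spell spill spall
echo file{1,2,3}.txt     # prints file1.txt file2.txt file3.txt
echo {a,b}{1,2}          # prints a1 a2 b1 b2
echo a{b,c{d,e}}f        # nested braces: prints abf acdf acef
echo "{x,y}"             # quoted braces are not expanded: prints {x,y}
echo {lonely}            # no comma, so the word is left unchanged: prints {lonely}
```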
Brace expansion is performed before any other expansions, and any characters special to other expansions are preserved in the result. It is strictly textual. Bash does not apply any syntactic interpretation to the context of the expansion or the text between the braces. To avoid conflicts with parameter expansion, the string "${" is not considered eligible for brace expansion. A correctly-formed brace expansion must contain unquoted opening and closing braces, and at least one unquoted comma. Any incorrectly formed brace expansion is left unchanged. 3.4.3. Tilde expansionIf a word begins with an unquoted tilde character ("~"), all of the characters up to the first unquoted slash (or all characters, if there is no unquoted slash) are considered a tilde-prefix. If none of the characters in the tilde-prefix are quoted, the characters in the tilde-prefix following the tilde are treated as a possible login name. If this login name is the null string, the tilde is replaced with the value of the HOME shell variable. If HOME is unset, the home directory of the user executing the shell is substituted instead. Otherwise, the tilde-prefix is replaced with the home directory associated with the specified login name. If the tilde-prefix is "~+", the value of the shell variable PWD replaces the tilde-prefix. If the tilde-prefix is "~-", the value of the shell variable OLDPWD, if it is set, is substituted. If the characters following the tilde in the tilde-prefix consist of a number N, optionally prefixed by a "+" or a "-", the tilde-prefix is replaced with the corresponding element from the directory stack, as it would be displayed by the dirs built-in invoked with the characters following tilde in the tilde-prefix as an argument. If the tilde-prefix, without the tilde, consists of a number without a leading "+" or "-", "+" is assumed. If the login name is invalid, or the tilde expansion fails, the word is left unchanged. 
Each variable assignment is checked for unquoted tilde-prefixes immediately following a ":" or "=". In these cases, tilde expansion is also performed. Consequently, one may use file names with tildes in assignments to PATH, MAILPATH, and CDPATH, and the shell assigns the expanded value. Example:
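The sketch below uses a hypothetical variable, search_path, instead of PATH so that it can be run without changing your environment. It also shows that a tilde inside quotes is not expanded at assignment time.

```shell
# An unquoted tilde directly after a colon is expanded during the assignment.
search_path=/usr/local/bin:~/testdir
echo "$search_path"      # for example /usr/local/bin:/var/home/franky/testdir

# Inside quotes the tilde stays a literal character.
quoted_path="/usr/local/bin:~/testdir"
echo "$quoted_path"      # prints /usr/local/bin:~/testdir
```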
~/testdir will be expanded to $HOME/testdir, so if $HOME is /var/home/franky, the directory /var/home/franky/testdir will be added to the content of the PATH variable. 3.4.4. Shell parameter and variable expansionThe "$" character introduces parameter expansion, command substitution, or arithmetic expansion. The parameter name or symbol to be expanded may be enclosed in braces, which are optional but serve to protect the variable to be expanded from characters immediately following it which could be interpreted as part of the name. When braces are used, the matching ending brace is the first "}" not escaped by a backslash or within a quoted string, and not within an embedded arithmetic expansion, command substitution, or parameter expansion. The basic form of parameter expansion is "${PARAMETER}". The value of "PARAMETER" is substituted. The braces are required when "PARAMETER" is a positional parameter with more than one digit, or when "PARAMETER" is followed by a character that is not to be interpreted as part of its name. If the first character of "PARAMETER" is an exclamation point, Bash uses the value of the variable formed from the rest of "PARAMETER" as the name of the variable; this variable is then expanded and that value is used in the rest of the substitution, rather than the value of "PARAMETER" itself. This is known as indirect expansion. You are certainly familiar with straight parameter expansion, since it happens all the time, even in the simplest of cases, such as the one above or the following:
The following is an example of indirect expansion:
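As a sketch with made-up variable names: the exclamation point makes Bash look up the variable whose name is stored in another variable. The related form with a trailing asterisk lists variable names that start with a given prefix.

```shell
fruit="apple"
pointer="fruit"             # holds the NAME of another variable

echo "$pointer"             # prints fruit
resolved=${!pointer}        # indirect expansion
echo "$resolved"            # prints apple

# ${!PREFIX*} expands to the names of all variables beginning with PREFIX.
gr_demo_one=1
gr_demo_two=2
names="${!gr_demo_*}"
echo "$names"               # lists gr_demo_one and gr_demo_two
```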
Note that this is not the same as echo $N*. The following construct allows for creation of the named variable if it does not yet exist: ${VAR:=value} Example:
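A minimal sketch, with a hypothetical variable name:

```shell
unset greeting
echo "${greeting:=hello}"     # greeting was unset, so it is created: prints hello
echo "$greeting"              # prints hello

greeting="good morning"
echo "${greeting:=hello}"     # already set and not empty: prints good morning
```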
Special parameters, among others the positional parameters, may not be assigned this way, however. We will further discuss the use of the curly braces for treatment of variables in Chapter 10. More information can also be found in the Bash info pages. 3.4.5. Command substitutionCommand substitution allows the output of a command to replace the command itself. Command substitution occurs when a command is enclosed like this: $(command) or like this using backticks: `command` Bash performs the expansion by executing COMMAND and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting.
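The following sketch shows both notations and the handling of newlines; the commands and strings are chosen only for illustration.

```shell
today=$(date +%Y-%m-%d)                   # $(command) form
echo "Today is $today"

legacy=`echo old style`                   # backtick form
echo "$legacy"                            # prints old style

lines=$(printf 'first\nsecond\n\n\n')     # trailing newlines are deleted
echo "$lines"                             # quoted: first and second on separate lines
echo $lines                               # unquoted: word splitting gives first second
```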
When the old-style backquoted form of substitution is used, backslash retains its literal meaning except when followed by "$", "`", or "\". The first backtick not preceded by a backslash terminates the command substitution. When using the "$(COMMAND)" form, all characters between the parentheses make up the command; none are treated specially. Command substitutions may be nested. To nest when using the backquoted form, escape the inner backticks with backslashes. If the substitution appears within double quotes, word splitting and file name expansion are not performed on the results.

3.4.6. Arithmetic expansion

Arithmetic expansion allows the evaluation of an arithmetic expression and the substitution of the result. The format for arithmetic expansion is:

$(( EXPRESSION ))

The expression is treated as if it were within double quotes, but a double quote inside the parentheses is not treated specially. All tokens in the expression undergo parameter expansion, command substitution, and quote removal. Arithmetic substitutions may be nested. Evaluation of arithmetic expressions is done in fixed-width integers with no check for overflow, although division by zero is trapped and recognized as an error. The operators are roughly the same as in the C programming language. In order of decreasing precedence, the list looks like this:

Table 3-4. Arithmetic operators
Shell variables are allowed as operands; parameter expansion is performed before the expression is evaluated. Within an expression, shell variables may also be referenced by name without using the parameter expansion syntax. The value of a variable is evaluated as an arithmetic expression when it is referenced. A shell variable need not have its integer attribute turned on to be used in an expression.

Constants with a leading 0 (zero) are interpreted as octal numbers. A leading "0x" or "0X" denotes hexadecimal. Otherwise, numbers take the form "[BASE#]N", where "BASE" is a decimal number between 2 and 64 representing the arithmetic base, and N is a number in that base. If "BASE#" is omitted, then base 10 is used. The digits greater than 9 are represented by the lowercase letters, the uppercase letters, "@", and "_", in that order. If "BASE" is less than or equal to 36, lowercase and uppercase letters may be used interchangeably to represent numbers between 10 and 35.

Operators are evaluated in order of precedence. Sub-expressions in parentheses are evaluated first and may override the precedence rules above.

An older syntax with square brackets also exists:

$[ EXPRESSION ]

The Bash documentation marks this form as deprecated, so prefer $(( EXPRESSION )) in new scripts. Either way, the expansion only calculates the result of EXPRESSION and does no tests:
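A few illustrative calculations with the $(( EXPRESSION )) form; the variable name hours is an assumption.

```shell
echo $(( 365 * 24 ))        # prints 8760

hours=24
echo $(( hours * 60 ))      # variables may be named without $: prints 1440

echo $(( 7 / 2 ))           # integer division: prints 3
echo $(( 7 % 2 ))           # remainder: prints 1
echo $(( 010 ))             # leading 0 means octal: prints 8
echo $(( 0x1F ))            # hexadecimal: prints 31
echo $(( 2#1010 ))          # base 2: prints 10
echo $(( (1 + 2) * 3 ))     # parentheses override precedence: prints 9
```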
See Section 7.1.2.2, among others, for practical examples in scripts.

3.4.7. Process substitution

Process substitution is supported on systems that support named pipes (FIFOs) or the /dev/fd method of naming open files. It takes the form of

<(LIST)

or

>(LIST)

The process LIST is run with its input or output connected to a FIFO or some file in /dev/fd. The name of this file is passed as an argument to the current command as the result of the expansion. If the ">(LIST)" form is used, writing to the file will provide input for LIST. If the "<(LIST)" form is used, the file passed as an argument should be read to obtain the output of LIST. Note that no space may appear between the < or > signs and the left parenthesis, otherwise the construct would be interpreted as a redirection. When available, process substitution is performed simultaneously with parameter and variable expansion, command substitution, and arithmetic expansion. More information is in Section 8.2.3.

3.4.8. Word splitting

The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting. The shell treats each character of $IFS as a delimiter, and splits the results of the other expansions into words on these characters. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then any sequence of IFS characters serves to delimit words. If IFS has a value other than the default, then sequences of the whitespace characters "space" and "Tab" are ignored at the beginning and end of the word, as long as the whitespace character is in the value of IFS (an IFS whitespace character). Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field. A sequence of IFS whitespace characters is also treated as a delimiter. If the value of IFS is null, no word splitting occurs. Explicit null arguments ("" or '') are retained.
Unquoted implicit null arguments, resulting from the expansion of parameters that have no values, are removed. If a parameter with no value is expanded within double quotes, a null argument results and is retained.
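The rules above can be sketched by counting the words that result from an expansion. The strings are made up, and set -- is used here only as a convenient way to count words through $#.

```shell
words="one   two three"
set -- $words
unquoted_count=$#          # 3: runs of spaces act as one delimiter
set -- "$words"
quoted_count=$#            # 1: no splitting inside double quotes

record="franky:x:500"
old_ifs=$IFS               # remember the current IFS
IFS=:
set -- $record
IFS=$old_ifs               # restore it right away
field_count=$#             # 3
first_field=$1             # franky

empty=""
set -- $empty
removed_count=$#           # 0: the unquoted null argument is removed
set -- "$empty"
retained_count=$#          # 1: the quoted null argument is retained

echo "$unquoted_count $quoted_count $field_count $first_field $removed_count $retained_count"
```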
3.4.9. File name expansionAfter word splitting, unless the -f option has been set (see Section 2.3.2), Bash scans each word for the characters "*", "?", and "[". If one of these characters appears, then the word is regarded as a PATTERN, and replaced with an alphabetically sorted list of file names matching the pattern. If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged. If the nullglob option is set, and no matches are found, the word is removed. If the shell option nocaseglob is enabled, the match is performed without regard to the case of alphabetic characters. When a pattern is used for file name generation, the character "." at the start of a file name or immediately following a slash must be matched explicitly, unless the shell option dotglob is set. When matching a file name, the slash character must always be matched explicitly. In other cases, the "." character is not treated specially. The GLOBIGNORE shell variable may be used to restrict the set of file names matching a pattern. If GLOBIGNORE is set, each matching file name that also matches one of the patterns in GLOBIGNORE is removed from the list of matches. The file names . and .. are always ignored, even when GLOBIGNORE is set. However, setting GLOBIGNORE has the effect of enabling the dotglob shell option, so all other file names beginning with a "." will match. To get the old behavior of ignoring file names beginning with a ".", make ".*" one of the patterns in GLOBIGNORE. The dotglob option is disabled when GLOBIGNORE is unset. 3.5. Aliases3.5.1. What are aliases?An alias allows a string to be substituted for a word when it is used as the first word of a simple command. The shell maintains a list of aliases that may be set and unset with the alias and unalias built-in commands. Issue the alias without options to display a list of aliases known to the current shell.
Aliases are useful for specifying the default version of a command that exists in several versions on your system, or to specify default options to a command. Another use for aliases is for correcting incorrect spelling. The first word of each simple command, if unquoted, is checked to see if it has an alias. If so, that word is replaced by the text of the alias. The alias name and the replacement text may contain any valid shell input, including shell metacharacters, with the exception that the alias name may not contain "=". The first word of the replacement text is tested for aliases, but a word that is identical to an alias being expanded is not expanded a second time. This means that one may alias ls to ls -F, for instance, and Bash will not try to recursively expand the replacement text. If the last character of the alias value is a space or tab character, then the next command word following the alias is also checked for alias expansion. Aliases are not expanded when the shell is not interactive, unless the expand_aliases option is set using the shopt shell built-in. 3.5.2. Creating and removing aliasesAliases are created using the alias shell built-in. For permanent use, enter the alias in one of your shell initialization files; if you just enter the alias on the command line, it is only recognized within the current shell.
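A short sketch of creating, displaying and removing an alias; the alias name ll is only an example. The fragment does not call the alias itself, since aliases are not expanded in non-interactive shells unless expand_aliases is set.

```shell
alias ll='ls -l'            # create the alias
definition=$(alias ll)      # ask for the definition of a single alias
echo "$definition"          # prints alias ll='ls -l'

unalias ll                  # remove it again
alias ll 2>/dev/null || echo "ll is no longer defined"
```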
Bash always reads at least one complete line of input before executing any of the commands on that line. Aliases are expanded when a command is read, not when it is executed. Therefore, an alias definition appearing on the same line as another command does not take effect until the next line of input is read. The commands following the alias definition on that line are not affected by the new alias. This behavior is also an issue when functions are executed. Aliases are expanded when a function definition is read, not when the function is executed, because a function definition is itself a compound command. As a consequence, aliases defined in a function are not available until after that function is executed. To be safe, always put alias definitions on a separate line, and do not use alias in compound commands. Aliases are not inherited by child processes. Bourne shell (sh) does not recognize aliases. More about functions is in Chapter 11.
3.6. More Bash options

3.6.1. Displaying options

We already discussed a couple of Bash options that are useful for debugging your scripts. In this section, we will take a more in-depth view of the Bash options. Use the -o option to the set built-in to display all shell options:
See the Bash Info pages, section -> for a description of each option. A lot of options have one-character shorthands: the xtrace option, for instance, is equal to specifying set -x. 3.6.2. Changing optionsShell options can either be set different from the default upon calling the shell, or be set during shell operation. They may also be included in the shell resource configuration files. The following command executes a script in POSIX-compatible mode:
For changing the current environment temporarily, or for use in a script, we would rather use set. Use - (dash) for enabling an option, + for disabling:
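As an illustration, the sketch below switches the noclobber option on and off around a redirection. The temporary file is created with mktemp purely for the demonstration.

```shell
scratch=$(mktemp)                     # temporary file for the demonstration
echo "original" > "$scratch"

set -o noclobber                      # same as: set -C
echo "overwrite" > "$scratch" || echo "Bash refused to overwrite the file"
kept=$(cat "$scratch")                # still holds: original

set +o noclobber                      # back to the default
echo "overwrite" > "$scratch"         # now the redirection succeeds
content=$(cat "$scratch")             # holds: overwrite

rm -f "$scratch"
```

While noclobber is active, Bash prints an error for the refused redirection and the command returns a non-zero exit status.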
The above example demonstrates the noclobber option, which prevents existing files from being overwritten by redirection operations. The same goes for one-character options, for instance -u, which will treat unset variables as an error when set, and exits a non-interactive shell upon encountering such errors:
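The effect of -u can be sketched safely by running the commands in a child shell, so that the error ends the child and not your current session. The variable name is made up.

```shell
status=0
bash -c 'set -u; echo "value: $no_such_variable"; echo "never printed"' || status=$?
echo "child shell exited with status $status"

# Supplying a default value keeps the expansion valid under set -u.
safe=$(bash -c 'set -u; echo "${no_such_variable:-fallback}"')
echo "$safe"                # prints fallback
```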
This option is also useful for detecting incorrect content assignment to variables: the same error will also occur, for instance, when assigning a character string to a variable that was declared explicitly as one holding only integer values. One last example follows, demonstrating the noglob option, which prevents special characters from being expanded:
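A minimal sketch, using a scratch directory with two made-up file names so that the result is predictable:

```shell
workdir=$(mktemp -d)                  # empty scratch directory
touch "$workdir/a.txt" "$workdir/b.txt"
cd "$workdir"

expanded=$(echo *.txt)                # globbing on
set -o noglob                         # same as: set -f
literal=$(echo *.txt)                 # globbing off
set +o noglob                         # same as: set +f

echo "with globbing:    $expanded"    # a.txt b.txt
echo "without globbing: $literal"     # *.txt

cd - > /dev/null                      # return to the previous directory
rm -r "$workdir"
```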
3.7. SummaryThe Bash environment can be configured globally and on a per user basis. Various configuration files are used to fine-tune the behavior of the shell. These files contain shell options, settings for variables, function definitions and various other building blocks for creating ourselves a cosy environment. Except for the reserved Bourne shell, Bash and special parameters, variable names can be chosen more or less freely. Because a lot of characters have double or even triple meanings, depending on the environment, Bash uses a system of quoting to take away special meaning from one or multiple characters when special treatment is not wanted. Bash uses various methods of expanding command line entries in order to determine which commands to execute. 3.8. ExercisesFor this exercise, you will need to read the useradd man pages, because we are going to use the /etc/skel directory to hold default shell configuration files, which are copied to the home directory of each newly added user. First we will do some general exercises on setting and displaying variables.
Don't forget to chmod your scripts! Chapter 4. Regular expressions4.1. Regular expressions4.1.1. What are regular expressions?A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions by using various operators to combine smaller expressions. The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash. 4.1.2. Regular expression metacharactersA regular expression may be followed by one of several repetition operators (metacharacters): Table 4-1. Regular expression operators
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions. Two regular expressions may be joined by the infix operator "|"; the resulting regular expression matches any string matching either subexpression. Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
4.1.3. Basic versus extended regular expressions
In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)". Check in your system documentation whether commands using regular expressions support extended expressions.
4.2. Examples using grep
4.2.1. What is grep?
grep searches the input files for lines containing a match to a given pattern list. When it finds a match in a line, it copies the line to standard output (by default), or whatever other sort of output you have requested with options. Though grep expects to do the matching on text, it has no limits on input line length other than available memory, and it can match arbitrary characters within a line. If the final byte of an input file is not a newline, grep silently supplies one. Since newline is also a separator for the list of patterns, there is no way to match newline characters in a text. Some examples:
With the first command, user cathy displays the lines from /etc/passwd containing the string root. Then she displays the matching lines together with their line numbers. With the third command she checks which users are not using bash, but accounts with the nologin shell are not displayed. Then she counts the number of accounts that have /bin/false as the shell. The last command displays the lines from all the files in her home directory starting with ~/.bash, excluding matches containing the string history, so as to exclude matches from ~/.bash_history which might contain the same string, in upper or lower case. Note that the search is for the string "ps", and not for the command ps. Now let's see what else we can do with grep, using regular expressions.
4.2.2. Grep and regular expressions
4.2.2.1. Line and word anchors
From the previous example, we now exclusively want to display lines starting with the string "root":
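A minimal sketch of such an anchored search follows. To keep it reproducible it runs against a small made-up sample instead of the real /etc/passwd; the sample lines and the variable names are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical sample data standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
cathy:x:500:500:Cathy:/home/cathy:/bin/bash
EOF

# Without an anchor, every line containing "root" anywhere is counted.
any_root=$(grep -c root "$sample")

# With the caret, only lines that start with "root" are selected.
anchored=$(grep '^root' "$sample")

echo "lines containing root: $any_root"
echo "lines starting with root: $anchored"
rm -f "$sample"
```

The operator account mentions /root as its home directory, so it shows up in the unanchored count but not in the anchored result.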
If we want to see which accounts have no shell assigned whatsoever, we search for lines ending in ":":
To check that PATH is exported in ~/.bashrc, first select "export" lines and then search for lines starting with the string "PATH", so as not to display MANPATH and other possible paths:
Similarly, \> matches the end of a word. If you want to find a string that is a separate word (enclosed by spaces), it is better to use the -w option, as in this example where we are displaying information for the root partition:
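As a hedged illustration of -w, the sketch below searches a made-up file system table for the word "/". The three sample lines are assumptions, and GNU grep behaviour is assumed for the word-boundary rules.

```shell
#!/bin/bash
# Hypothetical excerpt standing in for /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
LABEL=/home /home ext3 defaults 1 2
EOF

# A plain search for "/" matches every line of the table.
all_lines=$(grep -c / "$sample")

# With -w the slash must stand on its own, not be glued to a word character.
root_line=$(grep -w / "$sample")

echo "$all_lines lines contain a slash"
echo "root partition: $root_line"
rm -f "$sample"
```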
If this option is not used, all the lines from the file system table will be displayed.
4.2.2.2. Character classes
A bracket expression is a list of characters enclosed by "[" and "]". It matches any single character in that list; if the first character of the list is the caret, "^", then it matches any character NOT in the list. For example, the regular expression "[0123456789]" matches any single digit. Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, "[a-d]" is equivalent to "[abcd]". Many locales sort characters in dictionary order, and in these locales "[a-d]" is typically not equivalent to "[abcd]"; it might be equivalent to "[aBbCcDd]", for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value "C". Finally, certain named classes of characters are predefined within bracket expressions. See the grep man or info pages for more information about these predefined expressions.
In the example, all the lines containing either a "y" or "f" character are first displayed, followed by an example of using a range with the ls command.
4.2.2.3. Wildcards
Use the "." for a single character match. If you want to get a list of all five-character English dictionary words starting with "c" and ending in "h" (handy for solving crosswords):
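The sketch below shows the idea on a tiny made-up word list with one word per line, so whole-line anchors are used here; against a real dictionary file, word anchors would work as well. The word list and variable names are assumptions.

```shell
#!/bin/bash
# Hypothetical word list; a real system would use its dictionary file.
words=$(mktemp)
printf '%s\n' cash catch church clash cloth couch dish > "$words"

# "c", exactly three arbitrary characters, then "h": five characters in total.
five_letters=$(grep '^c...h$' "$words" | paste -sd' ' -)

echo "$five_letters"
rm -f "$words"
```

Here "cash" is too short and "church" too long, so neither is selected.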
If you want to display lines containing the literal dot character, use the -F option to grep. For matching multiple characters, use the asterisk. This example selects all words starting with "c" and ending in "h" from the system's dictionary:
If you want to find the literal asterisk character in a file or output, use grep -F:
4.3. Pattern matching using Bash features
4.3.1. Character ranges
Apart from grep and regular expressions, there's a good deal of pattern matching that you can do directly in the shell, without having to use an external program. As you already know, the asterisk (*) and the question mark (?) match any string or any single character, respectively. Quote these special characters to match them literally:
But you can also use square brackets to match any enclosed character or range of characters, if pairs of characters are separated by a hyphen. An example:
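A small sketch of such a range pattern, run in a scratch directory with made-up file names (all names are assumptions). The C locale is forced so that the ranges are interpreted in the traditional way.

```shell
#!/bin/bash
# Scratch directory with hypothetical file names.
workdir=$(mktemp -d)
touch "$workdir"/app.conf "$workdir"/bar "$workdir"/cat \
      "$workdir"/dog "$workdir"/xen "$workdir"/zip

matches=$(
  cd "$workdir" || exit 1
  export LC_ALL=C        # traditional range interpretation
  echo [a-cx-z]*         # names starting with a, b, c, x, y or z
)

echo "$matches"
rm -rf "$workdir"
```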
This lists all files in cathy's home directory, starting with "a", "b", "c", "x", "y" or "z". If the first character within the brackets is "!" or "^", any character not enclosed will be matched. To match the dash ("-"), include it as the first or last character in the set. The sorting depends on the current locale and on the value of the LC_COLLATE variable, if it is set. Mind that other locales might interpret "[a-cx-z]" as "[aBbCcXxYyZz]" if sorting is done in dictionary order. If you want to be sure to have the traditional interpretation of ranges, force this behavior by setting LC_COLLATE or LC_ALL to "C".
4.3.2. Character classes
Character classes can be specified within the square brackets, using the syntax [:CLASS:], where CLASS is defined in the POSIX standard and has one of the values "alnum", "alpha", "ascii", "blank", "cntrl", "digit", "graph", "lower", "print", "punct", "space", "upper", "word" or "xdigit". Some examples:
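A minimal sketch of class-based matching in the shell, again using a scratch directory with made-up file names (assumptions for illustration):

```shell
#!/bin/bash
# Scratch directory with hypothetical file names.
workdir=$(mktemp -d)
touch "$workdir"/1file "$workdir"/Readme "$workdir"/notes

# The class goes inside a bracket expression, hence the doubled brackets.
starts_with_digit=$(cd "$workdir" && echo [[:digit:]]*)
starts_with_upper=$(cd "$workdir" && echo [[:upper:]]*)

echo "digit first: $starts_with_digit"
echo "upper first: $starts_with_upper"
rm -rf "$workdir"
```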
When the extglob shell option is enabled (using the shopt built-in), several extended pattern matching operators are recognized. Read more in the Bash info pages, in the section on pattern matching.
4.4. Summary
Regular expressions are powerful tools for selecting particular lines from files or output. A lot of UNIX commands use regular expressions: vim, perl, the PostgreSQL database and so on. They can be made available in any language or application using external libraries, and they even found their way to non-UNIX systems. For instance, regular expressions are used in the Excel spreadsheet that comes with the Microsoft Office suite. In this chapter we got the feel of the grep command, which is indispensable in any UNIX environment.
Bash has built-in features for matching patterns and can recognize character classes and ranges.
4.5. Exercises
These exercises will help you master regular expressions.
Chapter 5. The GNU sed stream editor
5.1. Introduction
5.1.1. What is sed?
A Stream EDitor is used to perform basic transformations on text read from a file or a pipe. The result is sent to standard output. The syntax for the sed command has no output file specification, but results can be saved to a file using output redirection. The editor does not modify the original input. What distinguishes sed from other editors, such as vi and ed, is its ability to filter text that it gets from a pipeline feed. You do not need to interact with the editor while it is running; that is why sed is sometimes called a batch editor. This feature allows use of editing commands in scripts, greatly easing repetitive editing tasks. When facing replacement of text in a large number of files, sed is a great help.
5.1.2. sed commands
The sed program can perform text pattern substitutions and deletions using regular expressions, like the ones used with the grep command; see Section 4.2. The editing commands are similar to the ones used in the vi editor:
Table 5-1. Sed editing commands
Apart from editing commands, you can give options to sed. An overview is in the table below:
Table 5-2. Sed options
The sed info pages contain more information; we only list the most frequently used commands and options here.
5.2. Interactive editing
5.2.1. Printing lines containing a pattern
This is something you can do with grep, of course, but you can't do a "find and replace" using that command. This is just to get you started. This is our example text file:
We want sed to find all the lines containing our search pattern, in this case "erors". We use the p to obtain the result:
As you notice, sed prints the entire file, but the lines containing the search string are printed twice. This is not what we want. In order to only print those lines matching our pattern, use the -n option:
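The difference between the two invocations can be sketched as follows. The sample file is a stand-in written for this sketch (its exact wording is an assumption); it only needs the misspelling "erors" on lines 2 to 4.

```shell
#!/bin/bash
# Stand-in for the example text file; contents are assumed for illustration.
example=$(mktemp)
printf '%s\n' \
  'This is the first line of an example text.' \
  'It is a text with erors.' \
  'Lots of erors.' \
  'So much erors, all these erors are making me sick.' \
  'This is a line not containing any errors.' \
  'This is the last line.' > "$example"

# Without -n every line is printed once, and matching lines a second time.
with_p=$(sed '/erors/p' "$example" | wc -l)

# With -n only the lines explicitly printed by p appear.
with_n=$(sed -n '/erors/p' "$example" | wc -l)

echo "lines printed without -n: $with_p"
echo "lines printed with -n: $with_n"
rm -f "$example"
```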
5.2.2. Deleting lines of input containing a pattern
We use the same example text file. Now we only want to see the lines not containing the search string:
The d command results in excluding lines from being displayed. Matching lines starting with a given pattern and ending in a second pattern are shown like this:
5.2.3. Ranges of lines
This time we want to take out the lines containing the errors. In the example these are lines 2 to 4. Specify this range as the address, together with the d command:
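A sketch of a line-range deletion, using a stand-in for the example file (its contents are an assumption; only the position of the faulty lines matters):

```shell
#!/bin/bash
# Stand-in for the example text file; contents are assumed for illustration.
example=$(mktemp)
printf '%s\n' \
  'This is the first line of an example text.' \
  'It is a text with erors.' \
  'Lots of erors.' \
  'So much erors, all these erors are making me sick.' \
  'This is a line not containing any errors.' \
  'This is the last line.' > "$example"

# "2,4" is the address (lines 2 through 4), "d" the command applied to it.
remaining=$(sed '2,4d' "$example")

echo "$remaining"
remaining_count=$(printf '%s\n' "$remaining" | wc -l)
rm -f "$example"
```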
To delete everything from a certain line until the end of the file, so that only the lines before it are printed, use a command similar to this:
This only prints the first two lines of the example file. The following command prints the first line containing the pattern "a text", up to and including the next line containing the pattern "a line":
5.2.4. Find and replace with sed
In the example file, we will now search and replace the errors instead of only (de)selecting the lines containing the search string.
As you can see, this is not exactly the desired effect: in line 4, only the first occurrence of the search string has been replaced, and there is still an 'eror' left. Use the g command to indicate to sed that it should examine the entire line instead of stopping at the first occurrence of your string:
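The effect of the g flag can be checked with a short sketch. The sample file is a stand-in with assumed contents; what matters is that one line holds the misspelling twice.

```shell
#!/bin/bash
# Stand-in for the example text file; contents are assumed for illustration.
example=$(mktemp)
printf '%s\n' \
  'This is the first line of an example text.' \
  'It is a text with erors.' \
  'Lots of erors.' \
  'So much erors, all these erors are making me sick.' \
  'This is a line not containing any errors.' \
  'This is the last line.' > "$example"

# Without g, only the first match on each line is replaced.
left_without_g=$(sed 's/erors/errors/' "$example" | grep -c erors)

# With g, every match on each line is replaced.
left_with_g=$(sed 's/erors/errors/g' "$example" | grep -c erors)

echo "lines still misspelled without g: $left_without_g"
echo "lines still misspelled with g: $left_with_g"
rm -f "$example"
```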
To insert a string at the beginning of each line of a file, for instance for quoting:
Insert some string at the end of each line:
Multiple find and replace commands are separated with individual -e options:
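A hedged sketch with two substitutions in one call; the input lines and the second replacement ("last" to "final") are made up for illustration.

```shell
#!/bin/bash
# Two hypothetical input lines fed through a pipe.
result=$(printf '%s\n' 'Lots of erors.' 'This is the last line.' |
  sed -e 's/erors/errors/g' -e 's/last/final/g')

echo "$result"
first_line=$(printf '%s\n' "$result" | head -n 1)
second_line=$(printf '%s\n' "$result" | tail -n 1)
```

Each -e expression is applied in order to every input line.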
Keep in mind that by default sed prints its results to the standard output, most likely your terminal window. If you want to save the output to a file, redirect it:
sed option 'some/expression' file_to_process > sed_output_in_a_file
5.3. Non-interactive editing
5.3.1. Reading sed commands from a file
Multiple sed commands can be put in a file and executed using the -f option. When creating such a file, make sure that:
5.3.2. Writing output files
Writing output is done using the output redirection operator >. This is an example script used to create very simple HTML files from plain text files.
$1 holds the first argument to a given command, in this case the name of the file to convert:
More on positional parameters in Chapter 7.
This is not really how it is done; this example just demonstrates sed capabilities. See Section 6.3 for a more decent solution to this problem, using awk BEGIN and END constructs.
5.4. Summary
The sed stream editor is a powerful command line tool, which can handle streams of data: it can take input lines from a pipe. This makes it fit for non-interactive use. The sed editor uses vi-like commands and accepts regular expressions. The sed tool can read commands from the command line or from a script. It is often used to perform find-and-replace actions on lines containing a pattern.
5.5. Exercises
These exercises are meant to further demonstrate what sed can do.
Chapter 6. The GNU awk programming language
6.1. Getting started with gawk
6.1.1. What is gawk?
Gawk is the GNU version of the commonly available UNIX awk program, another popular stream editor. Since the awk program is often just a link to gawk, we will refer to it as awk. The basic function of awk is to search files for lines or other text units containing one or more patterns. When a line matches one of the patterns, special actions are performed on that line. Programs in awk are different from programs in most other languages, because awk programs are "data-driven": you describe the data you want to work with and then what to do when you find it. Most other languages are "procedural." You have to describe, in great detail, every step the program is to take. When working with procedural languages, it is usually much harder to clearly describe the data your program will process. For this reason, awk programs are often refreshingly easy to read and write.
6.1.2. Gawk commands
When you run awk, you specify an awk program that tells awk what to do. The program consists of a series of rules. (It may also contain function definitions, loops, conditions and other programming constructs, advanced features that we will ignore for now.) Each rule specifies one pattern to search for and one action to perform upon finding the pattern. There are several ways to run awk. If the program is short, it is easiest to run it on the command line:
awk PROGRAM inputfile(s)
If multiple changes have to be made, possibly regularly and on multiple files, it is easier to put the awk commands in a script. This is read like this:
awk -f PROGRAM-FILE inputfile(s)
6.2. The print program
6.2.1. Printing selected fields
The print command in awk outputs selected data from the input file. When awk reads a line of a file, it divides the line into fields based on the specified input field separator, FS, which is an awk variable (see Section 6.3.1). This variable is predefined to be one or more spaces or tabs. The variables $1, $2, $3, ..., $N hold the values of the first, second, third until the last field of an input line. The variable $0 (zero) holds the value of the entire line. This is depicted in the image below, where we see six columns in the output of the df command:
In the output of ls -l, there are 9 columns. The print statement uses these fields as follows:
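A minimal sketch of field selection. Instead of live ls -l output, it feeds awk one fixed listing line (made up for this sketch) so the result is predictable; the second command shows the comma-separated form for comparison.

```shell
#!/bin/bash
# One hypothetical line in "ls -l" format: field 5 is the size, field 9 the name.
listing='-rw-r--r-- 1 cathy cathy 160 Jan 1 12:00 orig'

# Fields placed side by side are concatenated without any separator.
glued=$(printf '%s\n' "$listing" | awk '{ print $5 $9 }')

# Fields separated by a comma are joined with the output field separator.
spaced=$(printf '%s\n' "$listing" | awk '{ print $5, $9 }')

echo "$glued"
echo "$spaced"
```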
This command printed the fifth column of a long file listing, which contains the file size, and the last column, the name of the file. This output is not very readable unless you use the official way of referring to columns, which is to separate the ones that you want to print with a comma. In that case, the default output separator character, usually a space, will be put in between each output field.
6.2.2. Formatting fields
Without formatting, using only the output separator, the output looks rather poor. Inserting a couple of tabs and a string to indicate what output this is will make it look a lot better:
Note the use of the backslash, which makes long input continue on the next line without the shell interpreting this as a separate command. While your command line input can be of virtually unlimited length, your monitor is not, and printed paper certainly isn't. Using the backslash also allows for copying and pasting of the above lines into a terminal window. The -h option to ls is used for supplying human-readable size formats for bigger files. When a directory is the argument, a long listing starts with a line giving the total number of blocks in the directory. This line is useless to us, so we add an asterisk. We also add the -d option for the same reason, in case the asterisk expands to a directory. The backslash in this example marks the continuation of a line. See Section 3.3.2. You can take out any number of columns and even reverse the order. In the example below this is demonstrated for showing the most critical partitions:
The table below gives an overview of special formatting characters:
Quotes, dollar signs and other meta-characters should be escaped with a backslash.
6.2.3. The print command and regular expressions
A regular expression can be used as a pattern by enclosing it in slashes. The regular expression is then tested against the entire text of each record. The syntax is as follows:
awk 'EXPRESSION { PROGRAM }' file(s)
The following example displays only local disk device information; networked file systems are not shown:
Slashes need to be escaped, because they have a special meaning to the awk program. Below is another example, where we search the /etc directory for files ending in ".conf" and starting with either "a" or "x", using extended regular expressions:
This example illustrates the special meaning of the dot in regular expressions: the first one indicates that we want to search for any character after the first search string, the second is escaped because it is part of a string to find (the end of the file name).
6.2.4. Special patterns
In order to precede output with comments, use the BEGIN statement:
The END statement can be added for inserting text after the entire input is processed:
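BEGIN and END can be sketched together as below. The three file names and both message strings are assumptions; the pattern selects names that start with "a" or "x" and end in ".conf".

```shell
#!/bin/bash
# Hypothetical list of file names, one per line.
report=$(printf '%s\n' amd.conf resolv.conf xinetd.conf |
  awk 'BEGIN { print "Files found:" }
       /^[ax].*\.conf$/ { print }
       END { print "End of list" }')

echo "$report"
```

The BEGIN action runs before the first record is read, the END action after the last one.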
6.2.5. Gawk scripts
As commands tend to get a little longer, you might want to put them in a script, so they are reusable. An awk script contains awk statements defining patterns and actions. As an illustration, we will build a report that displays our most loaded partitions. See Section 6.2.2.
awk first prints a begin message, then formats all the lines that contain an eight or a nine at the beginning of a word, followed by one other number and a percentage sign. An end message is added.
6.3. Gawk variables
As awk is processing the input file, it uses several variables. Some are editable, some are read-only.
6.3.1. The input field separator
The field separator, which is either a single character or a regular expression, controls the way awk splits up an input record into fields. The input record is scanned for character sequences that match the separator definition; the fields themselves are the text between the matches. The field separator is represented by the built-in variable FS. Note that this is something different from the IFS variable used by POSIX-compliant shells. The value of the field separator variable can be changed in the awk program with the assignment operator =. Often the right time to do this is at the beginning of execution before any input has been processed, so that the very first record is read with the proper separator. To do this, use the special BEGIN pattern. In the example below, we build a command that displays all the users on your system with a description:
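A sketch of setting FS in a BEGIN block. Two made-up lines in /etc/passwd format are used instead of the real file, and the " - " between the fields is an arbitrary choice for this illustration.

```shell
#!/bin/bash
# Hypothetical records in /etc/passwd format: field 1 is the login, field 5 the description.
users=$(printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'cathy:x:500:500:Cathy Smith:/home/cathy:/bin/bash' |
  awk 'BEGIN { FS = ":" } { print $1 " - " $5 }')

echo "$users"
```

Because FS is set before any input is read, the very first record is already split on colons.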
In an awk script, it would look like this:
Choose input field separators carefully to prevent problems. An example to illustrate this: say you get input in the form of lines that look like this:
"Sandy L. Wong, 64 Zoo St., Antwerp, 2000X"
You write a command line or a script, which prints out the name of the person in that record:
awk 'BEGIN { FS="," } { print $1, $2, $3 }' inputfile
But a person might have a PhD, and it might be written like this:
"Sandy L. Wong, PhD, 64 Zoo St., Antwerp, 2000X"
Your awk will give the wrong output for this line. If needed, use an extra awk or sed to make data input formats uniform. The default input field separator is one or more spaces or tabs.
6.3.2. The output separators
6.3.2.1. The output field separator
Fields are normally separated by spaces in the output. This becomes apparent when you use the correct syntax for the print command, where arguments are separated by commas:
If you don't put in the commas, print will treat the items to output as one argument, thus omitting the use of the default output separator, OFS. Any character string may be used as the output field separator by setting this built-in variable.
6.3.2.2. The output record separator
The output from an entire print statement is called an output record. Each print command results in one output record, and then outputs a string called the output record separator, ORS. The default value for this variable is "\n", a newline character. Thus, each print statement generates a separate line. To change the way output fields and records are separated, assign new values to OFS and ORS:
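A minimal sketch of both separators at work; the two input records and the chosen separator strings are assumptions for illustration.

```shell
#!/bin/bash
# Two hypothetical input records with two fields each.
out=$(printf '%s\n' 'record1 data1' 'record2 data2' |
  awk 'BEGIN { OFS = ";"; ORS = "\n-->\n" } { print $1, $2 }')

echo "$out"
```

The comma in the print statement is replaced by OFS, and every record is followed by ORS instead of a plain newline.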
If the value of ORS does not contain a newline, the program's output is run together on a single line.
6.3.3. The number of records
The built-in NR holds the number of records that are processed. It is incremented after reading a new input line. You can use it at the end to count the total number of records, or in each output record:
6.3.4. User defined variables
Apart from the built-in variables, you can define your own. When awk encounters a reference to a variable which does not exist (which is not predefined), the variable is created and initialized to a null string. For all subsequent references, the value of the variable is whatever value was assigned last. Variables can be a string or a numeric value. Content of input fields can also be assigned to variables. Values can be assigned directly using the = operator, or you can use the current value of the variable in combination with other operators:
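As a small sketch, the command below keeps a running sum in a user-defined variable; the variable name total and the three input numbers are assumptions.

```shell
#!/bin/bash
# "total" is created on first use and starts out empty, which counts as 0 in arithmetic.
sum=$(printf '%s\n' 10 20 30 |
  awk '{ total = total + $1 } END { print "total: " total }')

echo "$sum"
```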
C-like shorthands like VAR+= value are also accepted.
6.3.5. More examples
The example from Section 5.3.2 becomes much easier when we use an awk script:
And the command to execute is also much more straightforward when using awk instead of sed:
6.3.6. The printf program
For more precise control over the output format than what is normally provided by print, use printf. The printf command can be used to specify the field width to use for each item, as well as various formatting choices for numbers (such as what output base to use, whether to print an exponent, whether to print a sign, and how many digits to print after the decimal point). This is done by supplying a string, called the format string, that controls how and where to print the other arguments. The syntax is the same as for the C-language printf statement; see your C introduction guide. The gawk info pages contain full explanations.
6.4. Summary
The gawk utility interprets a special-purpose programming language, handling simple data-reformatting jobs with just a few lines of code. It is the free version of the general UNIX awk command. This tool reads lines of input data and can easily recognize columned output. The print program is the most common for filtering and formatting defined fields. On-the-fly variable declaration is straightforward and allows for simple calculation of sums, statistics and other operations on the processed input stream. Variables and commands can be put in awk scripts for background processing. Other things you should know about awk:
6.5. Exercises
These are some practical examples where awk can be useful.
Chapter 7. Conditional statements
7.1. Introduction to if
7.1.1. General
At times you need to specify different courses of action to be taken in a shell script, depending on the success or failure of a command. The if construction allows you to specify such conditions. The most compact syntax of the if command is:
if TEST-COMMANDS; then CONSEQUENT-COMMANDS; fi
The TEST-COMMANDS list is executed, and if its return status is zero, the CONSEQUENT-COMMANDS list is executed. The return status is the exit status of the last command executed, or zero if no condition tested true. The TEST-COMMANDS list often involves numerical or string comparison tests, but it can also be any command that returns a status of zero when it succeeds and some other status when it fails. Unary expressions are often used to examine the status of a file. If the FILE argument to one of the primaries is of the form /dev/fd/N, then file descriptor "N" is checked. stdin, stdout and stderr and their respective file descriptors may also be used for tests.
7.1.1.1. Expressions used with if
The table below contains an overview of the so-called "primaries" that make up the TEST-COMMANDS command or list of commands. These primaries are put between square brackets to indicate the test of a conditional expression.
Table 7-1. Primary expressions
Expressions may be combined using the following operators, listed in decreasing order of precedence:
Table 7-2. Combining expressions
The [ (or test) built-in evaluates conditional expressions using a set of rules based on the number of arguments. More information about this subject can be found in the Bash documentation. Just like the if is closed with fi, the opening square bracket should be closed after the conditions have been listed.
7.1.1.2. Commands following the then statement
The CONSEQUENT-COMMANDS list that follows the then statement can be any valid UNIX command, any executable program, any executable shell script or any shell statement, with the exception of the closing fi. It is important to remember that the then and fi are considered to be separate statements in the shell. Therefore, when issued on the command line, they are separated by a semi-colon. In a script, the different parts of the if statement are usually well-separated. Below are a couple of simple examples.
7.1.1.3. Checking files
The first example checks for the existence of a file:
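A minimal sketch of such a test, wrapped in a hypothetical helper function check_file so that it can be tried on any path; a temporary file stands in for a real one.

```shell
#!/bin/bash
# check_file is a made-up helper name; -f is true for an existing regular file.
check_file() {
  if [ -f "$1" ]; then
    echo "$1 exists."
  fi
}

tmp=$(mktemp)
present=$(check_file "$tmp")   # file exists: a message is printed
rm -f "$tmp"
absent=$(check_file "$tmp")    # file is gone: nothing is printed

echo "$present"
```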
7.1.1.4. Checking shell options
To add in your Bash configuration files:
7.1.2. Simple applications of if
7.1.2.1. Testing exit status
The ? variable holds the exit status of the previously executed command (the most recently completed foreground process). The following example shows a simple test:
The following example demonstrates that TEST-COMMANDS might be any UNIX command that returns an exit status, and that if again returns an exit status of zero:
The same result can be obtained as follows:
7.1.2.2. Numeric comparisons
The examples below use numerical comparisons:
This script is executed by cron every Sunday. If the week number is even, it reminds you to put out the garbage cans:
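The even-week test could be sketched as follows. The function name is_even_week is an assumption, and so is the use of the 10# prefix: it is added here because week numbers such as "08" and "09" would otherwise be rejected as invalid octal numbers in shell arithmetic.

```shell
#!/bin/bash
# is_even_week is a hypothetical helper; it succeeds when the week number is even.
is_even_week() {
  # 10# forces base 10, so a leading zero is not read as octal.
  if [ $(( 10#$1 % 2 )) -eq 0 ]; then
    return 0
  fi
  return 1
}

week=$(date +%V)   # ISO week number, 01 to 53
if is_even_week "$week"; then
  echo "Week $week: put out the garbage cans."
fi
```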
7.1.2.3. String comparisons
An example of comparing strings for testing the user ID:
With Bash, you can shorten this type of construct. The compact equivalent of the above test is as follows:
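A sketch of the compact form. To make it easy to try without switching accounts, the user name is passed as an argument to a hypothetical function warn_unless_root instead of being taken from whoami.

```shell
#!/bin/bash
# warn_unless_root is a made-up helper; the echo only runs when the test succeeds.
warn_unless_root() {
  [ "$1" != "root" ] && echo "you are using a non-privileged account"
}

as_cathy=$(warn_unless_root cathy)
as_root=$(warn_unless_root root)

echo "cathy: $as_cathy"
echo "root: $as_root"
```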
Similar to the "&&" expression which indicates what to do if the test proves true, "||" specifies what to do if the test is false. Regular expressions may also be used in comparisons:
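Two forms of matching inside "[[ ]]" are sketched below, under the assumption of Bash 3.0 or later: an unquoted right-hand side of == is treated as a shell pattern, while =~ matches an extended regular expression. The function name classify and the test strings are made up.

```shell
#!/bin/bash
# classify is a hypothetical helper showing both kinds of matching.
classify() {
  if [[ "$1" == f* ]]; then          # shell pattern: starts with "f"
    echo "starts with f"
  fi
  if [[ "$1" =~ ^[0-9]+$ ]]; then    # regular expression: digits only
    echo "all digits"
  fi
}

word_result=$(classify female)
number_result=$(classify 2024)

echo "$word_result"
echo "$number_result"
```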
See the info pages for Bash for more information on pattern matching with the "(( EXPRESSION ))" and "[[ EXPRESSION ]]" constructs.
7.2. More advanced if usage
7.2.1. if/then/else constructs
7.2.1.1. Dummy example
This is the construct to use to take one course of action if the if commands test true, and another if it tests false. An example:
Like the CONSEQUENT-COMMANDS list following the then statement, the ALTERNATE-CONSEQUENT-COMMANDS list following the else statement can hold any UNIX-style command that returns an exit status. Another example, extending the one from Section 7.1.2.1:
We switch to the root account to demonstrate the effect of the else statement - your root is usually a local account while your own user account might be managed by a central system, such as an LDAP server.
7.2.1.2. Checking command line arguments
Instead of setting a variable and then executing a script, it is frequently more elegant to put the values for the variables on the command line. We use the positional parameters $1, $2, ..., $N for this purpose. $# refers to the number of command line arguments. $0 refers to the name of the script. The following is a simple example:
Here's another example, using two arguments:
7.2.1.3. Testing the number of arguments
The following example shows how to change the previous script so that it prints a message if more or less than 2 arguments are given:
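The argument count check can be sketched with a hypothetical function compare_args. Inside a function, return plays the role that exit plays in a standalone script; the usage text is made up.

```shell
#!/bin/bash
# compare_args is a made-up name; a real script would test "$#" at top level and use exit.
compare_args() {
  if [ "$#" -ne 2 ]; then
    echo "Usage: compare_args ARG1 ARG2"
    return 1
  fi
  echo "first: $1, second: $2"
}

good=$(compare_args a b)
bad=$(compare_args a)

echo "$good"
echo "$bad"
```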
The first argument is referred to as $1, the second as $2 and so on. The total number of arguments is stored in $#. Check out Section 7.2.5 for a more elegant way to print usage messages.
7.2.1.4. Testing that a file exists
This test is done in a lot of scripts, because there's no use in starting a lot of programs if you know they're not going to work:
Note that the file is referred to using a variable; in this case it is the first argument to the script. Alternatively, when no arguments are given, file locations are usually stored in variables at the beginning of a script, and their content is referred to using these variables. Thus, when you want to change a file name in a script, you only need to do it once.
7.2.2. if/then/elif/else constructs
7.2.2.1. General
This is the full form of the if statement:
if TEST-COMMANDS; then CONSEQUENT-COMMANDS; elif MORE-TEST-COMMANDS; then MORE-CONSEQUENT-COMMANDS; else ALTERNATE-CONSEQUENT-COMMANDS; fi
The TEST-COMMANDS list is executed, and if its return status is zero, the CONSEQUENT-COMMANDS list is executed. If TEST-COMMANDS returns a non-zero status, each elif list is executed in turn, and if its exit status is zero, the corresponding MORE-CONSEQUENT-COMMANDS is executed and the command completes. If else is followed by an ALTERNATE-CONSEQUENT-COMMANDS list, and the final command in the final if or elif clause has a non-zero exit status, then ALTERNATE-CONSEQUENT-COMMANDS is executed. The return status is the exit status of the last command executed, or zero if no condition tested true.
7.2.2.2. Example
This is an example that you can put in your crontab for daily execution:
7.2.3. Nested if statements
Inside the if statement, you can use another if statement. You may use as many levels of nested ifs as you can logically manage. This is an example testing leap years:
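One possible way to write the leap year test with nested ifs is sketched below. The function name is_leap_year is an assumption, and the year is passed as an argument so that the logic can be tried for any year.

```shell
#!/bin/bash
# is_leap_year is a hypothetical helper; it succeeds for leap years.
is_leap_year() {
  local year=$1
  if [ $(( year % 400 )) -eq 0 ]; then
    return 0                              # divisible by 400: leap year
  elif [ $(( year % 4 )) -eq 0 ]; then
    if [ $(( year % 100 )) -ne 0 ]; then
      return 0                            # divisible by 4 but not by 100: leap year
    fi
  fi
  return 1
}

year=$(date +%Y)
if is_leap_year "$year"; then
  echo "$year is a leap year."
else
  echo "$year is not a leap year."
fi
```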
7.2.4. Boolean operations
The above script can be shortened using the Boolean operators "AND" (&&) and "OR" (||). We use double parentheses for testing an arithmetic expression; see Section 3.4.6. This is equivalent to the let statement. You will get stuck using square brackets here, if you try something like $[$year % 400], because here, the square brackets don't represent an actual command by themselves. Among other editors, gvim is one of those supporting colour schemes according to the file format; such editors are useful for detecting errors in your code.
7.2.5. Using the exit statement and if
We already briefly met the exit statement in Section 7.2.1.3. It terminates execution of the entire script. It is most often used if the input requested from the user is incorrect, if a statement did not run successfully or if some other error occurred. The exit statement takes an optional argument. This argument is the integer exit status code, which is passed back to the parent and stored in the $? variable. A zero argument means that the script ran successfully. Any other value may be used by programmers to pass back different messages to the parent, so that different actions can be taken according to failure or success of the child process. If no argument is given to the exit command, the parent shell uses the current value of the $? variable. Below is an example with a slightly adapted penguin.sh script, which sends its exit status back to the parent, feed.sh:
This script is called upon in the next one, which therefore exports its variables menu and animal:
As you can see, exit status codes can be chosen freely. Existing commands usually have a series of defined codes; see the programmer's manual for each command for more information.
7.3. Using case statements
7.3.1. Simplified conditions
Nested if statements might be nice, but as soon as you are confronted with a couple of different possible actions to take, they tend to confuse. For the more complex conditionals, use the case syntax:
case EXPRESSION in CASE1) COMMAND-LIST;; CASE2) COMMAND-LIST;; ... CASEN) COMMAND-LIST;; esac
Each case is an expression matching a pattern. The commands in the COMMAND-LIST for the first match are executed. The "|" symbol is used for separating multiple patterns, and the ")" operator terminates a pattern list. Each case plus its according commands are called a clause. Each clause must be terminated with ";;". Each case statement is ended with the esac statement. In the example, we demonstrate use of cases for sending a more selective warning message with the disktest.sh script:
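The sketch below is not the disktest.sh script itself but a simplified stand-in for the same idea: a hypothetical function disk_alert maps a usage percentage to a message. The thresholds and the message texts are assumptions.

```shell
#!/bin/bash
# disk_alert is a made-up helper; its argument is a usage percentage without the % sign.
disk_alert() {
  case "$1" in
    [0-9]|[1-6][0-9]) echo "All is quiet." ;;
    [78][0-9])        echo "Start thinking about cleaning out some stuff." ;;
    9[0-8])           echo "Better hurry with that new disk." ;;
    99|100)           echo "The disk is as good as full!" ;;
    *)                echo "Unexpected value: $1" ;;
  esac
}

disk_alert 45
disk_alert 85
disk_alert 97
disk_alert 99
```

Only the first matching clause runs, so the catch-all "*" pattern has to come last.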
Of course you could have opened your mail program to check the results; this is just to demonstrate that the script sends a decent mail with "To:", "Subject:" and "From:" header lines. Many more examples using case statements can be found in your system's init script directory. The startup scripts use start and stop cases to run or stop system processes. A theoretical example can be found in the next section. 7.3.2. Initscript exampleInitscripts often make use of case statements for starting, stopping and querying system services. This is an excerpt of the script that starts Anacron, a daemon that runs commands periodically with a frequency specified in days.
The tasks to execute in each case, such as stopping and starting the daemon, are defined in functions, which are partially sourced from the /etc/rc.d/init.d/functions file. See Chapter 11 for more explanation.

7.4. Summary

In this chapter we learned how to build conditions into our scripts so that different actions can be undertaken upon success or failure of a command. The actions can be determined using the if statement. This allows you to perform arithmetic and string comparisons, and testing of exit code, input and files needed by the script. A simple if/then/fi test often precedes commands in a shell script in order to prevent output generation, so that the script can easily be run in the background or through the cron facility. More complex definitions of conditions are usually put in a case statement. Upon successful condition testing, the script can explicitly inform the parent using the exit 0 status. Upon failure, any other number may be returned. Based on the return code, the parent program can take appropriate action.

7.5. Exercises

Here are some ideas to get you started using if in scripts:
Chapter 8. Writing interactive scripts

8.1. Displaying user messages

8.1.1. Interactive or not?

Some scripts run without any interaction from the user at all. Advantages of non-interactive scripts include:
Many scripts, however, require input from the user, or give output to the user as the script is running. The advantages of interactive scripts are, among others:
When writing interactive scripts, never hold back on comments. A script that prints appropriate messages is much more user-friendly and can be more easily debugged. A script might do a perfect job, but you will get a whole lot of support calls if it does not inform the user about what it is doing. So include messages that tell the user to wait for output because a calculation is being done. If possible, try to give an indication of how long the user will have to wait. If the waiting should regularly take a long time when executing a certain task, you might want to consider integrating some processing indication in the output of your script. When prompting the user for input, it is also better to give too much than too little information about the kind of data to be entered. This applies to the checking of arguments and the accompanying usage message as well. Bash has the echo and printf commands to provide comments for users, and although you should be familiar with at least the use of echo by now, we will discuss some more examples in the next sections. 8.1.2. Using the echo built-in commandThe echo built-in command outputs its arguments, separated by spaces and terminated with a newline character. The return status is always zero. echo takes a couple of options:
As an example of adding comments, we will make the feed.sh and penguin.sh from Section 7.2.1.2 a bit better:
More about escape characters can be found in Section 3.3.2. The following table gives an overview of sequences recognized by the echo command: Table 8-1. Escape sequences used by the echo command
For more information about the printf command and the way it allows you to format output, see the Bash info pages. 8.2. Catching user input8.2.1. Using the read built-in commandThe read built-in command is the counterpart of the echo and printf commands. The syntax of the read command is as follows: read [options] NAME1 NAME2 ... NAMEN One line is read from the standard input, or from the file descriptor supplied as an argument to the -u option. The first word of the line is assigned to the first name, NAME1, the second word to the second name, and so on, with leftover words and their intervening separators assigned to the last name, NAMEN. If there are fewer words read from the input stream than there are names, the remaining names are assigned empty values. The characters in the value of the IFS variable are used to split the input line into words or tokens; see Section 3.4.8. The backslash character may be used to remove any special meaning for the next character read and for line continuation. If no names are supplied, the line read is assigned to the variable REPLY. The return code of the read command is zero, unless an end-of-file character is encountered, if read times out or if an invalid file descriptor is supplied as the argument to the -u option. The following options are supported by the Bash read built-in: Table 8-2. Options to the read built-in
This is a straightforward example, improving on the leaptest.sh script from the previous chapter:
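What such a script might look like is sketched below. This is not the guide's own listing: the function names is_leap and check_year are assumptions, and the input is taken from a here string so that the sketch runs unattended. It is assumed that the input is a plain non-negative number without leading zeros.

```bash
# Hypothetical helper: succeeds (exit status 0) when the year is a leap year.
is_leap() {
    local year=$1
    (( (year % 400 == 0) || (year % 4 == 0 && year % 100 != 0) ))
}

# Read one line from standard input into the variable year and report.
check_year() {
    local year
    read -r year
    if is_leap "$year"; then
        echo "$year is a leap year."
    else
        echo "$year is not a leap year."
    fi
}

# A here string supplies the input that a user would otherwise type.
check_year <<< "2024"
check_year <<< "1900"
```

Run interactively, check_year simply waits for you to type a year and press Enter.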
8.2.2. Prompting for user input

The following example shows how you can use prompts to explain what the user should enter.
Note that no output is omitted here. The script only stores information about the people Michel is interested in, but it will always say you are added to the list, unless you are already in it. Other people can now start executing the script:
After a while, the friends list begins to look like this:
Of course, this situation is not ideal, since everybody can edit (but not delete) Michel's files. You can solve this problem using special access modes on the script file, see SUID and SGID in the Introduction to Linux guide. 8.2.3. Redirection and file descriptors8.2.3.1. GeneralAs you know from basic shell usage, input and output of a command may be redirected before it is executed, using a special notation - the redirection operators - interpreted by the shell. Redirection may also be used to open and close files for the current shell execution environment. Redirection can also occur in a script, so that it can receive input from a file, for instance, or send output to a file. Later, the user can review the output file, or it may be used by another script as input. File input and output are accomplished by integer handles that track all open files for a given process. These numeric values are known as file descriptors. The best known file descriptors are stdin, stdout and stderr, with file descriptor numbers 0, 1 and 2, respectively. These numbers and respective devices are reserved. Bash can take TCP or UDP ports on networked hosts as file descriptors as well. The output below shows how the reserved file descriptors point to actual devices:
You might want to check info MAKEDEV and info proc for more information about /proc subdirectories and the way your system handles standard file descriptors for each running process. When you run a script from the command line, nothing much changes because the child shell process will use the same file descriptors as the parent. When no such parent is available, for instance when you run a script using the cron facility, the standard file descriptors are pipes or other (temporary) files, unless some form of redirection is used. This is demonstrated in the example below, which shows output from a simple at script:
And one with cron:
8.2.3.2. Redirection of errors

From the previous examples, it is clear that you can provide input and output files for a script (see Section 8.2.4 for more), but some tend to forget about redirecting errors - output which might be depended upon later on. Also, if you are lucky, errors will be mailed to you and possible causes of failure might get revealed. If you are not as lucky, errors will cause your script to fail and won't be caught or sent anywhere, so that you can't start to do any worthwhile debugging. When redirecting errors, note that the order of the redirections is significant. For example, this command, issued in /var/spool
will redirect the error output of the ls command to the file unaccessible-in-spool in /var/tmp. The command
will direct both standard output and standard error to the file spoollist. The command
directs only the standard output to the destination file, because the standard error is copied to standard output before the standard output is redirected. For convenience, errors are often redirected to /dev/null, if it is sure they will not be needed. Hundreds of examples can be found in the startup scripts for your system. Bash allows for both standard output and standard error to be redirected to the file whose name is the result of the expansion of FILE with this construct: &> FILE This is the equivalent of > FILE 2>&1, the construct used in the previous set of examples. It is also often combined with redirection to /dev/null, for instance when you just want a command to execute, no matter what output or errors it gives. 8.2.4. File input and output8.2.4.1. Using /dev/fdThe /dev/fd directory contains entries named 0, 1, 2, and so on. Opening the file /dev/fd/N is equivalent to duplicating file descriptor N. If your system provides /dev/stdin, /dev/stdout and /dev/stderr, you will see that these are equivalent to /dev/fd/0, /dev/fd/1 and /dev/fd/2, respectively. The main use of the /dev/fd files is from the shell. This mechanism allows for programs that use pathname arguments to handle standard input and standard output in the same way as other pathnames. If /dev/fd is not available on a system, you'll have to find a way to bypass the problem. This can be done for instance using a hyphen (-) to indicate that a program should read from a pipe. An example:
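A minimal sketch of the hyphen convention follows. The file names header.txt and footer.txt are taken from the text; the scratch directory, the file contents and the use of tr as a stand-in for the filter command are assumptions made for illustration.

```bash
# Work in a throwaway directory.
workdir=$(mktemp -d)
echo "HEADER" > "$workdir/header.txt"
echo "FOOTER" > "$workdir/footer.txt"

# tr plays the role of the filter. The hyphen tells cat to read its
# standard input at that position in the argument list.
result=$(echo "body text" | tr 'a-z' 'A-Z' | cat "$workdir/header.txt" - "$workdir/footer.txt")
echo "$result"

rm -r "$workdir"
```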
The cat command first reads the file header.txt, next its standard input which is the output of the filter command, and last the footer.txt file. The special meaning of the hyphen as a command-line argument to refer to the standard input or standard output is a misconception that has crept into many programs. There might also be problems when specifying hyphen as the first argument, since it might be interpreted as an option to the preceding command. Using /dev/fd allows for uniformity and prevents confusion:
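The same idea with /dev/fd can be sketched as follows, assuming that your system provides /dev/fd. As before, tr merely stands in for the filter command and the scratch files are made up for the occasion; the lp step is left out so that nothing is sent to a printer.

```bash
workdir=$(mktemp -d)
echo "HEADER" > "$workdir/header.txt"
echo "FOOTER" > "$workdir/footer.txt"

# /dev/fd/0 is an ordinary pathname that refers to the standard input of cat,
# which here is the output of tr.
result=$(echo "body text" | tr 'a-z' 'A-Z' | cat "$workdir/header.txt" /dev/fd/0 "$workdir/footer.txt")
echo "$result"

rm -r "$workdir"
```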
In this clean example, all output is additionally piped through lp to send it to the default printer. 8.2.4.2. Read and exec8.2.4.2.1. Assigning file descriptors to filesAnother way of looking at file descriptors is thinking of them as a way to assign a numeric value to a file. Instead of using the file name, you can use the file descriptor number. The exec built-in command is used to assign a file descriptor to a file. Use exec fdN> file for assigning file descriptor N to file for output, and exec fdN< file for assigning file descriptor N to file for input. After a file descriptor has been assigned to a file, it can be used with the shell redirection operators, as is demonstrated in the following example:
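A minimal sketch of assigning descriptors with exec is given below; the descriptor numbers 7 and 8 and the scratch file are arbitrary choices for illustration. Note that N is written as a bare number directly in front of the redirection operator.

```bash
logfile=$(mktemp)       # hypothetical scratch file

exec 7> "$logfile"      # open file descriptor 7 for writing to the file
echo "first line"  >&7  # write through the descriptor instead of the file name
echo "second line" >&7
exec 7>&-               # close file descriptor 7 again

exec 8< "$logfile"      # open file descriptor 8 for reading from the same file
read -r first  <&8      # each read consumes the next line from descriptor 8
read -r second <&8
exec 8<&-               # close file descriptor 8

echo "read back: $first / $second"
rm "$logfile"
```

Because the descriptor stays open between the two read commands, the second read continues where the first one stopped.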
8.2.4.2.2. Read in scripts

The following is an example that shows how you can alternate between file input and command line input:
8.2.4.3. Closing file descriptors

Since child processes inherit open file descriptors, it is good practice to close a file descriptor when it is no longer needed. This is done using the exec fd<&- syntax. In the above example, file descriptor 7, which has been assigned to standard input, is closed each time the user needs to have access to the actual standard input device, usually the keyboard. The following is a simple example redirecting only standard error to a pipe:
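One way to do this is sketched below. The functions emit and filter_errors are hypothetical, and sed merely stands in for whatever command should process the errors.

```bash
# Hypothetical helper: writes one line to stdout and one line to stderr.
emit() {
    echo "normal output"
    echo "error output" >&2
}

filter_errors() {
    exec 3>&1       # save the current standard output in file descriptor 3
    # In the pipeline, stderr is sent into the pipe first (2>&1), then stdout
    # is pointed back at the saved descriptor (>&3). Neither command needs
    # descriptor 3 itself, so it is closed for both (3>&-).
    emit 2>&1 >&3 3>&- | sed 's/^/filtered: /' 3>&-
    exec 3>&-       # close the saved descriptor when we are done
}

filter_errors
```

Only the line written to standard error passes through sed; the normal output goes straight to the original standard output.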
8.2.4.4. Here documents

Frequently, your script might call on another program or script that requires input. The here document provides a way of instructing the shell to read input from the current source until a line containing only the search string is found (no trailing blanks). All of the lines read up to that point are then used as the standard input for a command. The result is that you don't need to call on separate files; you can use shell-special characters, and it looks nicer than a bunch of echo's:
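A minimal sketch of a here document that prints a small menu; the function name print_menu, the default user name and the menu entries are made up for illustration.

```bash
print_menu() {
    local user=${1:-guest}   # hypothetical default when no name is given
    # Everything up to the line containing only MENU becomes the input of cat.
    # Variables such as $user are expanded inside the here document.
    cat << MENU
Hello $user, choose an action:
  1) show disk usage
  2) show logged-in users
  0) quit
MENU
}

print_menu "tux"
```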
Although we talk about a here document, it is supposed to be a construct within the same script. This is an example that installs a package automatically, even though you should normally confirm:
And this is how the script runs. When prompted with the "Is this ok [y/N]" string, the script answers "y" automatically:
8.3. Summary

In this chapter, we learned how to provide user comments and how to prompt for user input. This is usually done using the echo/read combination. We also discussed how files can be used as input and output using file descriptors and redirection, and how this can be combined with getting input from the user. We stressed the importance of providing ample messages for the users of our scripts. As always when others use your scripts, it is better to give too much information than not enough. Here documents are a type of shell construct that allows creation of lists, holding choices for the users. This construct can also be used to execute otherwise interactive tasks in the background, without intervention.

8.4. Exercises

These exercises are practical applications of the constructs discussed in this chapter. When writing the scripts, you may test by using a test directory that does not contain too much data. Write each step, then test that portion of code, rather than writing everything at once.
Chapter 9. Repetitive tasks9.1. The for loop9.1.1. How does it work?The for loop is the first of the three shell looping constructs. This loop allows for specification of a list of values. A list of commands is executed for each value in the list. The syntax for this loop is: for NAME [in LIST ]; do COMMANDS; done If [in LIST] is not present, it is replaced with in $@ and for executes the COMMANDS once for each positional parameter that is set (see Section 3.2.5 and Section 7.2.1.2). The return status is the exit status of the last command that executes. If no commands are executed because LIST does not expand to any items, the return status is zero. NAME can be any variable name, although i is used very often. LIST can be any list of words, strings or numbers, which can be literal or generated by any command. The COMMANDS to execute can also be any operating system commands, script, program or shell statement. The first time through the loop, NAME is set to the first item in LIST. The second time, its value is set to the second item in the list, and so on. The loop terminates when NAME has taken on each of the values from LIST and no items are left in LIST. 9.1.2. Examples9.1.2.1. Using command substitution for specifying LIST itemsThe first is a command line example, demonstrating the use of a for loop that makes a backup copy of each .xml file. After issuing the command, it is safe to start working on your sources:
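A minimal sketch of such a backup loop is shown here. The function name backup_xml and the scratch directory are assumptions. Note also that this sketch specifies LIST with a shell pattern rather than with command substitution such as $(ls *.xml), because a pattern keeps file names containing spaces intact.

```bash
# Hypothetical helper: copy every .xml file in the given directory to FILE.bak.
backup_xml() {
    local dir=$1 file
    for file in "$dir"/*.xml; do
        # If the pattern matched nothing, it is passed on literally; skip it.
        if [ -e "$file" ]; then
            cp "$file" "$file.bak"
        fi
    done
}

# Try it out in a scratch directory.
demo=$(mktemp -d)
touch "$demo/ch1.xml" "$demo/ch2.xml" "$demo/notes.txt"
backup_xml "$demo"
ls "$demo"
```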
This one lists the files in /sbin that are just plain text files, and possibly scripts:
9.1.2.2. Using the content of a variable to specify LIST items

The following is a specific application script for converting HTML files, compliant with a certain scheme, to PHP files. The conversion is done by taking out the first 25 and the last 21 lines, replacing these with two PHP tags that provide header and footer lines:
Since we don't do a line count here, there is no way of knowing the line number from which to start deleting lines until reaching the end. The problem is solved using tac, which reverses the lines in a file. 9.2. The while loop9.2.1. What is it?The while construct allows for repetitive execution of a list of commands, as long as the command controlling the while loop executes successfully (exit status of zero). The syntax is: while CONTROL-COMMAND; do CONSEQUENT-COMMANDS; done CONTROL-COMMAND can be any command(s) that can exit with a success or failure status. The CONSEQUENT-COMMANDS can be any program, script or shell construct. As soon as the CONTROL-COMMAND fails, the loop exits. In a script, the command following the done statement is executed. The return status is the exit status of the last CONSEQUENT-COMMANDS command, or zero if none was executed. 9.2.2. Examples9.2.2.1. Simple example using whileHere is an example for the impatient:
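As a minimal sketch, a countdown shows the mechanism; the variable name and the start value are arbitrary.

```bash
i=3
# The test command [ ... ] is the CONTROL-COMMAND: the loop runs as long as
# it exits with status zero.
while [ "$i" -gt 0 ]; do
    echo "countdown: $i"
    i=$((i - 1))
done
echo "done, i is now $i"
```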
9.2.2.2. Nested while loops

The example below was written to copy pictures that are made with a webcam to a web directory. Every five minutes a picture is taken. Every hour, a new directory is created, holding the images for that hour. Every day, a new directory is created containing 24 subdirectories. The script runs in the background.
Note the use of the true statement. This means: continue execution until we are forcibly interrupted (with kill or Ctrl+C). This small script can be used for simulation testing; it generates files:
Note the use of the date command to generate all kinds of file and directory names. See the man page for more.
9.2.2.3. Using keyboard input to control the while loop

This script can be interrupted by the user when a Ctrl+C sequence is entered:
A here document is used to present the user with possible choices. And again, the true test repeats the commands from the CONSEQUENT-COMMANDS list over and over again. 9.2.2.4. Calculating an averageThis script calculates the average of user input, which is tested before it is processed: if input is not within range, a message is printed. If q is pressed, the loop exits:
Note how the variables in the last lines are left unquoted in order to do arithmetic. 9.3. The until loop9.3.1. What is it?The until loop is very similar to the while loop, except that the loop executes until the TEST-COMMAND executes successfully. As long as this command fails, the loop continues. The syntax is the same as for the while loop: until TEST-COMMAND; do CONSEQUENT-COMMANDS; done The return status is the exit status of the last command executed in the CONSEQUENT-COMMANDS list, or zero if none was executed. TEST-COMMAND can, again, be any command that can exit with a success or failure status, and CONSEQUENT-COMMANDS can be any UNIX command, script or shell construct. As we already explained previously, the ";" may be replaced with one or more newlines wherever it appears. 9.3.2. ExampleAn improved picturesort.sh script (see Section 9.2.2.2), which tests for available disk space. If not enough disk space is available, remove pictures from the previous months:
Note the initialization of the HOUR and DISKFULL variables and the use of options with ls and date in order to obtain a correct listing for TOREMOVE.

9.4. I/O redirection and loops

9.4.1. Input redirection

Instead of controlling a loop by testing the result of a command or by user input, you can specify a file from which to read input that controls the loop. In such cases, read is often the controlling command. As long as input lines are fed into the loop, execution of the loop commands continues. As soon as all the input lines are read the loop exits. Since the loop construct is considered to be one command structure (such as while TEST-COMMAND; do CONSEQUENT-COMMANDS; done), the redirection should occur after the done statement, so that it complies with the form

command < file

This kind of redirection also works with other kinds of loops.

9.4.2. Output redirection

In the example below, output of the find command is used as input for the read command controlling a while loop:
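Both forms can be sketched as follows. This is not the archiving script described in the text: the function names list_files and count_lines and the scratch directory are assumptions made for illustration.

```bash
# Hypothetical helper: the output of find is piped into the while loop, and
# read takes one path per iteration.
list_files() {
    local dir=$1 path
    find "$dir" -type f | sort | while IFS= read -r path; do
        echo "found: $(basename "$path")"
    done
}

# The same loop shape with input redirection: the file is named after "done".
count_lines() {
    local n=0 line
    while IFS= read -r line; do
        n=$((n + 1))
    done < "$1"
    echo "$n"
}

demo=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$demo/list.txt"
list_files "$demo"
count_lines "$demo/list.txt"
```

Keep in mind that Bash normally runs each part of a pipeline in a subshell, so variables changed inside a piped loop are not visible after it; the redirected form in count_lines does not have this limitation.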
Files are compressed before they are moved into the archive directory. 9.5. Break and continue9.5.1. The break built-inThe break statement is used to exit the current loop before its normal ending. This is done when you don't know in advance how many times the loop will have to execute, for instance because it is dependent on user input. The example below demonstrates a while loop that can be interrupted. This is a slightly improved version of the wisdom.sh script from Section 9.2.2.3.
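The effect of break can be sketched as follows. This is not the wisdom.sh script: the function menu_loop is hypothetical and, for simplicity, reads its choices from standard input instead of presenting a menu.

```bash
# Hypothetical loop: handle choices until the user enters "0".
menu_loop() {
    local choice
    while read -r choice; do
        case $choice in
            0) echo "leaving the loop"
               break;;
            *) echo "you chose $choice";;
        esac
    done
    # break only left the loop, so this command is still executed.
    echo "after the loop"
}

# The "3" is never read, because the loop ends at "0".
printf '1\n2\n0\n3\n' | menu_loop
```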
Mind that break exits the loop, not the script. This can be demonstrated by adding an echo command at the end of the script. This echo will also be executed upon input that causes break to be executed (when the user types "0"). In nested loops, break allows for specification of which loop to exit. See the Bash info pages for more. 9.5.2. The continue built-inThe continue statement resumes iteration of an enclosing for, while, until or select loop. When used in a for loop, the controlling variable takes on the value of the next element in the list. When used in a while or until construct, on the other hand, execution resumes with TEST-COMMAND at the top of the loop. 9.5.3. ExamplesIn the following example, file names are converted to lower case. If no conversion needs to be done, a continue statement restarts execution of the loop. These commands don't eat much system resources, and most likely, similar problems can be solved using sed and awk. However, it is useful to know about this kind of construction when executing heavy jobs, that might not even be necessary when tests are inserted at the correct locations in a script, sparing system resources.
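A minimal sketch of such a conversion loop follows; the function name lowercase_names and the scratch directory are assumptions made for illustration.

```bash
# Hypothetical helper: rename the files in DIR so that their names are lower case.
lowercase_names() {
    local dir=$1 path name lower
    for path in "$dir"/*; do
        name=$(basename "$path")
        lower=$(echo "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$name" = "$lower" ]; then
            continue        # nothing to convert; go on with the next file
        fi
        mv "$path" "$dir/$lower"
        echo "renamed $name to $lower"
    done
}

demo=$(mktemp -d)
touch "$demo/README.TXT" "$demo/notes.txt"
lowercase_names "$demo"
```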
This script has at least one disadvantage: it overwrites existing files. The noclobber option to Bash is only useful when redirection occurs. The -b option to the mv command provides more security, but is only safe in case of one accidental overwrite, as is demonstrated in this test:
The tr is part of the textutils package; it can perform all kinds of character transformations. 9.6. Making menus with the select built-in9.6.1. General9.6.1.1. Use of selectThe select construct allows easy menu generation. The syntax is quite similar to that of the for loop: select WORD [in LIST]; do RESPECTIVE-COMMANDS; done LIST is expanded, generating a list of items. The expansion is printed to standard error; each item is preceded by a number. If in LIST is not present, the positional parameters are printed, as if in $@ would have been specified. LIST is only printed once. Upon printing all the items, the PS3 prompt is printed and one line from standard input is read. If this line consists of a number corresponding to one of the items, the value of WORD is set to the name of that item. If the line is empty, the items and the PS3 prompt are displayed again. If an EOF (End Of File) character is read, the loop exits. Since most users don't have a clue which key combination is used for the EOF sequence, it is more user-friendly to have a break command as one of the items. Any other value of the read line will set WORD to be a null string. The read line is saved in the REPLY variable. The RESPECTIVE-COMMANDS are executed after each selection until the number representing the break is read. This exits the loop. 9.6.1.2. ExamplesThis is a very simple example, but as you can see, it is not very user-friendly:
Setting the PS3 prompt and adding a possibility to quit makes it better:
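A minimal sketch of such a menu is given below; the function name pick_item, the prompt text and the menu items are assumptions. The menu and the prompt go to standard error, the answers to standard output.

```bash
# Hypothetical menu: offer the arguments as choices, plus a "Quit" entry.
pick_item() {
    local PS3="Your choice: "
    local name
    select name in "$@" "Quit"; do
        case $name in
            Quit) echo "Exiting."
                  break;;
            "")   echo "Invalid choice: $REPLY";;    # not one of the numbers
            *)    echo "You picked $name ($REPLY)"
                  break;;
        esac
    done
}

# A here string answers the prompt, so the sketch runs unattended.
pick_item apple pear <<< "2"
```

When run from a terminal without the here string, the function keeps prompting until a valid number is entered.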
9.7. The shift built-in

9.7.1. What does it do?

The shift command is one of the Bourne shell built-ins that comes with Bash. This command takes one argument, a number. The positional parameters are shifted to the left by this number, N. The positional parameters from N+1 to $# are renamed to variable names from $1 to $#-N. Say you have a command that takes 10 arguments, and N is 4, then $5 becomes $1, $6 becomes $2 and so on. $10 becomes $6 and the original $1, $2, $3 and $4 are thrown away. If N is zero or greater than $# (the total number of arguments, see Section 7.2.1.2), the positional parameters are not changed and the command has no effect. If N is not present, it is assumed to be 1. The return status is zero unless N is greater than $# or less than zero, in which case it is non-zero.

9.7.2. Examples

A shift statement is typically used when the number of arguments to a command is not known in advance, for instance when users can give as many arguments as they like. In such cases, the arguments are usually processed in a while loop with a test condition of (( $# )). This condition is true as long as the number of arguments is greater than zero. The $1 variable and the shift statement process each argument. The number of arguments is reduced each time shift is executed and eventually becomes zero, upon which the while loop exits. The example below, cleanup.sh, uses shift statements to process each file in the list generated by find:
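The loop pattern can be sketched as follows. This is not the cleanup.sh script; the function names process_args and shift_four and their arguments are made up for illustration.

```bash
# Hypothetical helper: handle every argument in turn, however many there are.
process_args() {
    while (( $# )); do
        echo "processing $1 ($# left)"
        shift               # drop $1; the old $2 becomes the new $1
    done
}

# shift N discards the first N positional parameters at once.
shift_four() {
    shift 4
    echo "first is now $1, $# remain"
}

process_args alpha beta gamma
shift_four a b c d e f g h i j
```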
In the next example, we modified the script from Section 8.2.4.4 so that it accepts multiple packages to install at once:
9.8. Summary

In this chapter, we discussed how repetitive commands can be incorporated in loop constructs. Most common loops are built using the for, while or until statements, or a combination of these commands. The for loop executes a task a defined number of times. If you don't know how many times a command should execute, use either until or while to specify when the loop should end. Loops can be interrupted or reiterated using the break and continue statements. A file can be used as input for a loop using the input redirection operator; loops can also read output from commands that is fed into the loop using a pipe. The select construct is used for printing menus in interactive scripts. Looping through the command line arguments to a script can be done using the shift statement.

9.9. Exercises

Remember: when building scripts, work in steps and test each step before incorporating it in your script.
Chapter 10. More on variables

10.1. Types of variables

10.1.1. General assignment of values

As we already saw, Bash understands many different kinds of variables or parameters. Thus far, we haven't bothered much with what kind of variables we assigned, so our variables could hold any value that we assigned to them. A simple command line example demonstrates this:
There are cases when you want to avoid this kind of behavior, for instance when handling telephone and other numbers. Apart from integers and variables, you may also want to specify a variable that is a constant. This is often done at the beginning of a script, when the value of the constant is declared. After that, there are only references to the constant variable name, so that when the constant needs to be changed, it only has to be done once. A variable may also be a series of variables of any type, a so-called array of variables (VAR0, VAR1, VAR2, ... VARN).

10.1.2. Using the declare built-in

Using a declare statement, we can limit the value assignment to variables. The syntax for declare is the following:

declare OPTION(s) VARIABLE=value

The following options are used to determine the type of data the variable can hold and to assign it attributes:

Table 10-1. Options to the declare built-in
Using + instead of - turns off the attribute instead. When used in a function, declare creates local variables. The following example shows how assignment of a type to a variable influences the value.
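A minimal sketch with the integer attribute, declare -i, is shown here. The function and variable names are arbitrary, and it is assumed that no variable called string is set in your environment.

```bash
show_integer_attribute() {
    declare -i number=12     # number can only hold integer values from now on
    echo "$number"
    number=string            # evaluated as arithmetic: the unset name "string" counts as 0
    echo "$number"
    number=6*7               # arithmetic expressions are evaluated upon assignment
    echo "$number"
}

show_integer_attribute
```

Since declare is used inside a function here, number is a local variable and disappears when the function returns.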
Note that Bash has an option to declare a numeric value, but none for declaring string values. This is because, by default, if no specifications are given, a variable can hold any type of data:
As soon as you restrict assignment of values to a variable, it can only hold that type of data. Possible restrictions are either integer, constant or array. See the Bash info pages for information on return status. 10.1.3. ConstantsIn Bash, constants are created by making a variable read-only. The readonly built-in marks each specified variable as unchangeable. The syntax is: readonly OPTION VARIABLE(s) The values of these variables can then no longer be changed by subsequent assignment. If the -f option is given, each variable refers to a shell function; see Chapter 11. If -a is specified, each variable refers to an array of variables. If no arguments are given, or if -p is supplied, a list of all read-only variables is displayed. Using the -p option, the output can be reused as input. The return status is zero, unless an invalid option was specified, one of the variables or functions does not exist, or -f was supplied for a variable name instead of for a function name.
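A minimal sketch of a constant; the variable name and its values are made up. The attempt to change the constant is made in a subshell, so that the failed assignment cannot disturb the rest of the script.

```bash
readonly TUX=penguinpower      # hypothetical constant

# The assignment fails inside the subshell; its error message is discarded.
if ( TUX=mickeysoft ) 2>/dev/null; then
    echo "assignment succeeded"
else
    echo "assignment refused"
fi

echo "TUX is still $TUX"
```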
10.2. Array variables

10.2.1. Creating arrays

An array is a variable containing multiple values. Any variable may be used as an array. There is no maximum limit to the size of an array, nor any requirement that member variables be indexed or assigned contiguously. Arrays are zero-based: the first element is indexed with the number 0. Indirect declaration is done using the following syntax to declare a variable:

ARRAY[INDEXNR]=value

The INDEXNR is treated as an arithmetic expression that must evaluate to a number greater than or equal to zero. Explicit declaration of an array is done using the declare built-in:

declare -a ARRAYNAME

A declaration with an index number will also be accepted, but the index number will be ignored. Attributes to the array may be specified using the declare and readonly built-ins. Attributes apply to all variables in the array; you can't have mixed arrays. Array variables may also be created using compound assignments in this format:

ARRAY=(value1 value2 ... valueN)

Each value is then in the form of [indexnumber=]string. The index number is optional. If it is supplied, that index is assigned to it; otherwise the index of the element assigned is the number of the last index that was assigned, plus one. This format is accepted by declare as well. If no index numbers are supplied, indexing starts at zero. Adding missing or extra members in an array is done using the syntax:

ARRAYNAME[indexnumber]=value

Remember that the read built-in provides the -a option, which allows for reading and assigning values for member variables of an array.

10.2.2. Dereferencing the variables in an array

In order to refer to the content of an item in an array, use curly braces. This is necessary, as you can see from the following example, to bypass the shell interpretation of expansion operators. If the index number is @ or *, all members of an array are referenced.
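A minimal sketch, with arbitrary array contents, shows what happens with and without the curly braces:

```bash
ARRAY=(one two three)

echo "${ARRAY[0]}"      # the first element
echo "${ARRAY[*]}"      # all elements
echo "$ARRAY[*]"        # no braces: $ARRAY (element 0) followed by a literal "[*]"

ARRAY[5]=six            # index numbers need not be contiguous
echo "${ARRAY[*]}"
echo "${#ARRAY[@]}"     # the number of elements that are actually set
```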
Referring to the content of a member variable of an array without providing an index number is the same as referring to the content of the first element, the one referenced with index number zero. 10.2.3. Deleting array variablesThe unset built-in is used to destroy arrays or member variables of an array:
10.2.4. Examples of arraysPractical examples of the usage of arrays are hard to find. You will find plenty of scripts that don't really do anything on your system but that do use arrays to calculate mathematical series, for instance. And that would be one of the more interesting examples...most scripts just show what you can do with an array in an oversimplified and theoretical way. The reason for this dullness is that arrays are rather complex structures. You will find that most practical examples for which arrays could be used are already implemented on your system using arrays, however on a lower level, in the C programming language in which most UNIX commands are written. A good example is the Bash history built-in command. Those readers who are interested might check the built-ins directory in the Bash source tree and take a look at fc.def, which is processed when compiling the built-ins. Another reason good examples are hard to find is that not all shells support arrays, so they break compatibility. After long days of searching, I finally found this example operating at an Internet provider. It distributes Apache web server configuration files onto hosts in a web farm:
First two tests are performed to check whether the correct user is running the script with the correct arguments. The names of the hosts that need to be configured are listed in the array farm_hosts. Then all these hosts are provided with the Apache configuration file, after which the daemon is restarted. Note the use of commands from the Secure Shell suite, encrypting the connections to remote hosts. Thanks, Eugene and colleague, for this contribution. Dan Richter contributed the following example. This is the problem he was confronted with: "...In my company, we have demos on our web site, and every week someone has to test all of them. So I have a cron job that fills an array with the possible candidates, uses date +%W to find the week of the year, and does a modulo operation to find the correct index. The lucky person gets notified by e-mail." And this was his way of solving it:
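The idea behind it can be sketched as follows. This is not Dan Richter's script: the names in the array and the function pick_tester are hypothetical, and the e-mail notification is left out.

```bash
testers=(alice bob carol)    # hypothetical candidates

# Hypothetical helper: print the tester for a given week number (00 to 53).
pick_tester() {
    local week=$1
    # 10# forces base 10, so that "08" and "09" are not rejected as invalid
    # octal numbers.
    local index=$(( 10#$week % ${#testers[@]} ))
    echo "${testers[$index]}"
}

pick_tester "$(date +%W)"    # this week's tester
pick_tester 08
```

Since date +%W pads the week number with a leading zero, the base prefix matters in weeks 08 and 09.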
This script is then used in other scripts, such as this one, which uses a here document:
10.3. Operations on variables

10.3.1. Arithmetic on variables

We discussed this already in Section 3.4.6.

10.3.2. Length of a variable

Using the ${#VAR} syntax will calculate the number of characters in a variable. If VAR is "*" or "@", this value is substituted with the number of positional parameters or number of elements in an array in general. This is demonstrated in the example below:
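A short sketch of the three cases, using made-up variable names and values:

```bash
#!/bin/bash
greeting="hello world"
echo "${#greeting}"          # number of characters in the string

fruits=(apple banana cherry)
echo "${#fruits[@]}"         # number of elements in the array
echo "${#fruits[1]}"         # length of the second element, "banana"

set -- one two three four    # set positional parameters for the demonstration
echo "${#@}"                 # number of positional parameters, same as $#
```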
10.3.3. Transformations of variables

10.3.3.1. Substitution

${VAR:-WORD}

If VAR is not defined or null, the expansion of WORD is substituted; otherwise the value of VAR is substituted:
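A minimal sketch with a hypothetical variable TEST_EDITOR, covering the unset, null and set cases:

```bash
#!/bin/bash
unset TEST_EDITOR
first=${TEST_EDITOR:-vi}      # not defined: the default is substituted

TEST_EDITOR=
second=${TEST_EDITOR:-vi}     # null: the default is substituted as well

TEST_EDITOR=emacs
third=${TEST_EDITOR:-vi}      # set: the value of the variable is used

echo "$first $second $third"
```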
This form is often used in conditional tests, for instance in this one:
It is a shorter notation for
See Section 7.1.2.3 for more information about this type of condition testing. If the hyphen (-) is replaced with the equal sign (=), the value is assigned to the parameter if it does not exist:
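The difference between the two forms can be sketched as follows; DEMO_DIR and its value are assumptions made for the illustration:

```bash
#!/bin/bash
unset DEMO_DIR

: "${DEMO_DIR:-/tmp/demo}"             # ":-" only substitutes, DEMO_DIR stays unset
after_dash=${DEMO_DIR-"not set"}

: "${DEMO_DIR:=/tmp/demo}"             # ":=" substitutes and assigns
after_equals=$DEMO_DIR

echo "$after_dash -> $after_equals"
```

The colon built-in does nothing with its arguments; it is used here only so that the expansion takes place.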
The ${VAR:?WORD} syntax tests the existence of a variable. If VAR is not set or null, the expansion of WORD is written to standard error and non-interactive shells exit. A demonstration:
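As a sketch, assuming the ${VAR:?WORD} form and a hypothetical variable TESTVAR. The failing expansion runs in a subshell so that the demonstration itself keeps going:

```bash
#!/bin/bash
unset TESTVAR
status=0

# The subshell exits at the failing expansion; its error message is captured.
message=$( ( : "${TESTVAR:?is not set}" ) 2>&1 ) || status=$?
echo "subshell exited with status $status and said: $message"

TESTVAR=present
echo "${TESTVAR:?is not set}"      # the variable is set, so its value is used
```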
Using "+" instead of the question mark, as in ${VAR:+WORD}, substitutes the expansion of WORD if the variable is set and not null; if it is not set, nothing is substituted. The variable itself is not changed.

10.3.3.2. Removing substrings

To strip a number of characters, equal to OFFSET, from a variable, use this syntax:

${VAR:OFFSET:LENGTH}

The LENGTH parameter defines how many characters to keep, starting from the first character after the offset point. If LENGTH is omitted, the remainder of the variable content is taken:
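A minimal sketch with a made-up string:

```bash
#!/bin/bash
STRING="01234567890abcdefgh"

rest=${STRING:7}          # everything from offset 7 to the end
part=${STRING:7:3}        # three characters, starting at offset 7

echo "$rest"
echo "$part"
```

Offsets count from zero, so offset 7 skips the first seven characters.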
${VAR#WORD} and ${VAR##WORD}

These constructs are used for deleting the pattern matching the expansion of WORD in VAR. WORD is expanded to produce a pattern just as in file name expansion. If the pattern matches the beginning of the expanded value of VAR, then the result of the expansion is the expanded value of VAR with the shortest matching pattern ("#") or the longest matching pattern ("##") deleted.

If VAR is * or @, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list.

If VAR is an array variable subscripted with "*" or "@", the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list. This is shown in the examples below:
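A short sketch; the path and array contents are made up for the illustration:

```bash
#!/bin/bash
path="/usr/local/share/doc/README.txt"

shortest=${path#*/}       # removes the shortest leading match of "*/"
longest=${path##*/}       # removes the longest leading match, leaving the file name

names=(old_alpha old_beta)
stripped=("${names[@]#old_}")   # the removal is applied to every array member

echo "$shortest"
echo "$longest"
echo "${stripped[@]}"
```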
The opposite effect is obtained using "%" and "%%", as in this example below. WORD should match a trailing portion of string:
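For instance, with a hypothetical file name:

```bash
#!/bin/bash
file="archive.tar.gz"

short=${file%.*}       # removes the shortest trailing match of ".*"
long=${file%%.*}       # removes the longest trailing match of ".*"

echo "$short"
echo "$long"
```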
10.3.3.3. Replacing parts of variable names

This is done using the ${VAR/PATTERN/STRING} or ${VAR//PATTERN/STRING} syntax. The first form replaces only the first match, the second replaces all matches of PATTERN with STRING:
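A minimal sketch with a made-up string:

```bash
#!/bin/bash
text="one fish two fish"

first_only=${text/fish/cat}     # only the first match is replaced
all=${text//fish/cat}           # every match is replaced

echo "$first_only"
echo "$all"
```

The variable itself is left untouched; only the expanded value changes.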
More information can be found in the Bash info pages.

10.4. Summary

Normally, a variable can hold any type of data, unless variables are declared explicitly. Constant variables are set using the readonly built-in command.

An array holds a set of variables. If a type of data is declared, then all elements in the array will be set to hold only this type of data.

Bash features allow for substitution and transformation of variables "on the fly". Standard operations include calculating the length of a variable, arithmetic on variables, substituting variable content and substituting part of the content.

10.5. Exercises

Here are some brain crackers:
Chapter 11. Functions

11.1. Introduction

11.1.1. What are functions?

Shell functions are a way to group commands for later execution, using a single name for this group, or routine. The name of the routine must be unique within the shell or script. All the commands that make up a function are executed like regular commands. When calling on a function as a simple command name, the list of commands associated with that function name is executed. A function is executed within the shell in which it has been declared: no new process is created to interpret the commands.

Special built-in commands are found before shell functions during command lookup. The special built-ins are: break, :, ., continue, eval, exec, exit, export, readonly, return, set, shift, trap and unset.

11.1.2. Function syntax

Functions either use the syntax

function FUNCTION { COMMANDS; }

or

FUNCTION () { COMMANDS; }

Both define a shell function FUNCTION. The use of the built-in command function is optional; however, if it is not used, parentheses are needed.

The commands listed between curly braces make up the body of the function. These commands are executed whenever FUNCTION is specified as the name of a command. The exit status is the exit status of the last command executed in the body.
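Both notations, and the rule about the exit status, can be sketched as follows. The function names are made up for the illustration:

```bash
#!/bin/bash
function greet_keyword { echo "defined with the function keyword"; }
greet_parens () { echo "defined with parentheses"; }

# The exit status of a function is that of its last command, here "false".
last_status () { true; false; }

greet_keyword
greet_parens

status=0
last_status || status=$?
echo "last_status returned $status"
```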
11.1.3. Positional parameters in functions

Functions are like mini-scripts: they can accept parameters, they can use variables only known within the function (using the local shell built-in) and they can return values to the calling shell.

A function also has a system for interpreting positional parameters. However, the positional parameters passed to a function are not the same as the ones passed to a command or script.

When a function is executed, the arguments to the function become the positional parameters during its execution. The special parameter # that expands to the number of positional parameters is updated to reflect the change. Positional parameter 0 is unchanged. The Bash variable FUNCNAME is set to the name of the function, while it is executing.

If the return built-in is executed in a function, the function completes and execution resumes with the next command after the function call. When a function completes, the values of the positional parameters and the special parameter # are restored to the values they had prior to the function's execution. If a numeric argument is given to return, that status is returned. A simple example:
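A minimal sketch of this behavior; the function name and the arguments are made up:

```bash
#!/bin/bash
show_args () {
    echo "function $FUNCNAME got $# argument(s), first is: $1"
    return 3
}

set -- script_arg1 script_arg2     # positional parameters of the calling script

rc=0
show_args only_one || rc=$?

# After the call, the script's own positional parameters are back in place.
echo "return status $rc; script still has $# arguments, first is: $1"
```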
Note that the return value or exit code of the function is often stored in a variable, so that it can be probed at a later point. The init scripts on your system often use the technique of probing the RETVAL variable in a conditional test, like this one:
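The pattern can be sketched as follows. The start_service function is a hypothetical stand-in for whatever starts the daemon in a real init script:

```bash
#!/bin/bash
start_service () {
    # Hypothetical stand-in: a real init script would start a daemon here.
    return 0
}

start_service
RETVAL=$?          # store the status right away, before another command overwrites $?

if [ "$RETVAL" -eq 0 ]; then
    echo "service started"
else
    echo "service failed with status $RETVAL"
fi
```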
Or like this example from the /etc/init.d/amd script, where Bash's optimization features are used:
The commands after && are only executed when the test proves to be true; this is a shorter way to represent an if/then/fi structure.

The return code of the function is often used as exit code of the entire script. You'll see a lot of initscripts ending in something like exit $RETVAL.

11.1.4. Displaying functions

All functions known by the current shell can be displayed using the set built-in without options. Functions are retained after they are used, unless they are unset after use. The which command also displays functions:
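Whether which shows functions depends on how it is set up on your system. As a sketch that relies only on Bash built-ins, declare -f prints the definition of a function and type -t reports what kind of command a name refers to. The function demo_lsl is made up for the illustration:

```bash
#!/bin/bash
demo_lsl () { ls -l "$@"; }       # hypothetical helper function

declare -f demo_lsl               # prints the definition of demo_lsl
kind=$(type -t demo_lsl)          # "function" for a shell function
echo "demo_lsl is a $kind"

unset -f demo_lsl                 # remove the function again
type -t demo_lsl || echo "demo_lsl is no longer defined"
```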
This is the sort of function that is typically configured in the user's shell resource configuration files. Functions are more flexible than aliases and provide a simple and easy way of adapting the user environment. Here's one for DOS users:
11.2. Examples of functions in scripts

11.2.1. Recycling

There are plenty of scripts on your system that use functions as a structured way of handling series of commands. On some Linux systems, for instance, you will find the /etc/rc.d/init.d/functions definition file, which is sourced in all init scripts. Using this method, common tasks such as checking if a process runs, starting or stopping a daemon and so on, only have to be written once, in a general way. If the same task is needed again, the code is recycled. From this functions file the checkpid function:
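The function from that file is not reproduced here. As a rough sketch of the idea, checking whether any of a list of process IDs is alive, the hypothetical function is_running below uses kill -0, which sends no signal but reports whether the process can be signalled. The real checkpid may well be implemented differently:

```bash
#!/bin/bash
# is_running PID...: succeed as soon as one of the given PIDs exists.
# Hypothetical helper, written for this illustration only.
is_running () {
    local pid
    for pid in "$@"; do
        kill -0 "$pid" 2>/dev/null && return 0
    done
    return 1
}

if is_running "$$"; then
    echo "shell $$ is running"
fi
```

Keeping the check in one function means that every script sourcing the file gets the same behavior.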
This function is reused in the same script in other functions, which are reused in other scripts. The daemon function, for instance, is used in the majority of the startup scripts for starting a server process (on machines that use this system).

11.2.2. Setting the path

This section might be found in your /etc/profile file. The function pathmunge is defined and then used to set the path for the root and other users:
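The /etc/profile on your system may differ. A minimal sketch of such a function, with made-up directories and with the original PATH saved and restored so the demonstration leaves no traces, might look like this:

```bash
#!/bin/bash
# Sketch only: add directory $1 to PATH unless it is already there.
# With "after" as second argument it is appended, otherwise prepended.
pathmunge () {
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present, nothing to do
        *)
            if [ "$2" = "after" ]; then
                PATH=$PATH:$1
            else
                PATH=$1:$PATH
            fi
            ;;
    esac
}

saved_PATH=$PATH                          # keep the real PATH for later
PATH=/usr/bin:/bin
pathmunge /usr/local/bin
pathmunge /usr/X11R6/bin after
pathmunge /bin                            # already present, PATH is unchanged
result=$PATH
PATH=$saved_PATH

echo "$result"
unset -f pathmunge
```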
The function takes its first argument to be a path name. If this path name is not yet in the current path, it is added. The second argument to the function defines whether the path is added in front of or after the current PATH definition.

Normal users only get /usr/X11R6/bin added to their paths, while root gets a couple of extra directories containing system commands. After being used, the function is unset so that it is not retained.

11.2.3. Remote backups

The following example is one that I use for making backups of the files for my books. It uses SSH keys for enabling the remote connection. Two functions are defined, buplinux and bupbash, that each make a .tar file, which is then compressed and sent to a remote server. After that, the local copy is cleaned up.

On Sunday, only bupbash is executed.
This script runs from cron, meaning without user interaction, so we redirect standard error from the scp command to /dev/null.

It might be argued that all the separate steps can be combined in a command such as

tar c dir_to_backup/ | bzip2 | ssh server "cat > backup.tar.bz2"

However, if you are interested in intermediate results, which might be recovered upon failure of the script, this is not what you want.

The expression

command &> file

is equivalent to

command > file 2>&1

11.3. Summary

Functions provide an easy way of grouping commands that you need to execute repetitively. When a function is running, the positional parameters are changed to those of the function. When it stops, they are reset to those of the calling program. Functions are like mini-scripts, and just like a script, they generate exit or return codes.

While this was a short chapter, it contains important knowledge needed for achieving the ultimate state of laziness that is the typical goal of any system administrator.

11.4. Exercises

Here are some useful things you can do using functions:
Chapter 12. Catching signals

12.1. Signals

12.1.1. Introduction

12.1.1.1. Finding the signal man page

Your system contains a man page listing all the available signals, but depending on your operating system, it might be opened in a different way. On most Linux systems, this will be man 7 signal. When in doubt, locate the exact man page and section using commands like

man -k signal | grep list

or

apropos signal | grep list

Signal names can be found using kill -l.

12.1.1.2. Signals to your Bash shell

In the absence of any traps, an interactive Bash shell ignores SIGTERM and SIGQUIT. SIGINT is caught and handled, and if job control is active, SIGTTIN, SIGTTOU and SIGTSTP are also ignored. Commands that are run as the result of a command substitution also ignore these signals, when keyboard generated.

SIGHUP by default exits a shell. Before exiting, an interactive shell will send a SIGHUP to all jobs, running or stopped; see the documentation on the disown built-in if you want to disable this default behavior for a particular process. If the huponexit option has been set using the shopt built-in, Bash sends a SIGHUP to all jobs when an interactive login shell exits.

12.1.1.3. Sending signals using the shell

The following signals can be sent using the Bash shell:

Table 12-1. Control signals in Bash
12.1.2. Usage of signals with kill

Most modern shells, Bash included, have a built-in kill command. In Bash, both signal names and numbers are accepted as options, and arguments may be job or process IDs. The -l option lists the signal names; given an exit status as argument, it prints the name of the signal that terminated the process. The exit status of kill itself is zero when at least one signal was successfully sent, non-zero if an error occurred.

Using the kill command from /usr/bin, your system might enable extra options, such as the ability to kill processes from other than your own user ID and specifying processes by name, like with pgrep and pkill.

Both kill commands send the TERM signal if none is given.

This is a list of the most common signals:

Table 12-2. Common kill signals
When killing a process or series of processes, it is common sense to start trying with the least dangerous signal, SIGTERM. That way, programs that care about an orderly shutdown get the chance to follow the procedures that they have been designed to execute when getting the SIGTERM signal, such as cleaning up and closing open files. If you send a SIGKILL to a process, you remove any chance for the process to do a tidy cleanup and shutdown, which might have unfortunate consequences.

But if a clean termination does not work, the INT or KILL signals might be the only way. For instance, when a process does not die using Ctrl+C, it is best to use kill -9 on that process ID:
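A minimal sketch of this escalation, using a background sleep as a hypothetical stand-in for the process that has to go:

```bash
#!/bin/bash
sleep 300 &                 # stand-in for the process to be stopped
pid=$!

kill -TERM "$pid"           # polite request first
sleep 1                     # give the process a moment to shut down

# Escalate only if the process is still there.
if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid"
fi

status=0
wait "$pid" || status=$?
echo "process $pid ended with status $status"
```

For a process terminated by a signal, Bash reports an exit status of 128 plus the signal number, so the status shows which of the two signals ended the process.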
When a process starts up several instances, killall might be easier. It takes the same options as the kill command, but applies to all instances of a given process. Test this command before using it in a production environment, since it might not work as expected on some of the commercial Unices.

12.2. Traps

12.2.1. General

There might be situations when you don't want users of your scripts to exit untimely using keyboard abort sequences, for example because input has to be provided or cleanup has to be done. The trap statement catches these sequences and can be programmed to execute a list of commands upon catching those signals.

The syntax for the trap statement is straightforward:

trap [COMMANDS] [SIGNALS]

This instructs the trap command to catch the listed SIGNALS, which may be signal names with or without the SIG prefix, or signal numbers. If a signal is 0 or EXIT, the COMMANDS are executed when the shell exits. If one of the signals is DEBUG, the list of COMMANDS is executed before every simple command (Bash versions older than 3.0 execute it after every simple command). A signal may also be specified as ERR; in that case COMMANDS are executed each time a simple command exits with a non-zero status. Note that these commands will not be executed when the non-zero exit status comes from part of an if statement, or from a while or until loop. Neither will they be executed if a logical AND (&&) or OR (||) results in a non-zero exit code, or when a command's return status is inverted using the ! operator.

The return status of the trap command itself is zero unless an invalid signal specification is encountered. The trap command takes a couple of options, which are documented in the Bash info pages.

Here is a very simple example, catching Ctrl+C from the user, upon which a message is printed. When you try to kill this program without specifying the KILL signal, nothing will happen:
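As a minimal sketch of such a trap, the child shell below installs a handler for SIGINT and then sends that signal to itself, which stands in for a user pressing Ctrl+C. The messages are made up for the illustration:

```bash
#!/bin/bash
# Run the demonstration in a child shell so the trap does not affect this one.
output=$(bash -c '
    trap "echo caught SIGINT, still running" INT
    kill -INT $$          # simulate Ctrl+C by signalling ourselves
    echo "script continues"
')

echo "$output"
```

Without the trap line, the child shell would be terminated by the signal and the last echo would never run.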
12.2.2. How Bash interprets traps

When Bash receives a signal for which a trap has been set while waiting for a command to complete, the trap will not be executed until the command completes. When Bash is waiting for an asynchronous command via the wait built-in, the reception of a signal for which a trap has been set will cause the wait built-in to return immediately with an exit status greater than 128, immediately after which the trap is executed.

12.2.3. More examples

12.2.3.1. Detecting when a variable is used

When debugging longer scripts, you might want to give a variable the trace attribute and trap DEBUG messages for that variable. Normally you would just declare a variable using an assignment like VARIABLE=value. Replacing the declaration of the variable with the following lines might provide valuable information about what your script is doing:
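A minimal sketch of this technique follows. Note that in current Bash versions the DEBUG trap fires for every simple command, whether or not it touches the variable, so the sketch records the value of VARIABLE each time. The array trace_log is an assumption made for the illustration:

```bash
#!/bin/bash
declare -t VARIABLE=value       # -t gives the variable the trace attribute
trace_log=()

# Record the line number and the current value whenever the trap fires.
trap 'trace_log+=("line $LINENO: VARIABLE=$VARIABLE")' DEBUG

VARIABLE=changed
: "$VARIABLE"

trap - DEBUG                    # switch tracing off again

printf '%s\n' "${trace_log[@]}"
```

Resetting the trap with trap - DEBUG keeps the tracing limited to the part of the script under investigation.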
12.2.3.2. Removing rubbish upon exit

The whatis command relies on a database which is regularly built using the makewhatis.cron script with cron:
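The makewhatis.cron script itself is not reproduced here. The general pattern of removing a lock file upon exit can be sketched like this; the lock file name is made up, and a subshell stands in for the script doing the actual work:

```bash
#!/bin/bash
lockfile="${TMPDIR:-/tmp}/demo-lock.$$"    # hypothetical lock file

(
    # Whatever way this subshell ends, the lock file is removed.
    trap 'rm -f "$lockfile"' EXIT
    touch "$lockfile"
    [ -f "$lockfile" ] && echo "working while holding $lockfile"
)

[ -e "$lockfile" ] || echo "lock file removed on exit"
```

Setting the trap before creating the file ensures there is no moment at which the file exists without a cleanup handler.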
12.3. Summary

Signals can be sent to your programs using the kill command or keyboard shortcuts. These signals can be caught, upon which action can be performed, using the trap statement.

Some programs ignore signals. The only signal that no program can ignore is the KILL signal.

12.4. Exercises

A couple of practical examples:
Appendix A. Shell Features

A.1. Common features

The following features are standard in every shell. Note that the stop, suspend, jobs, bg and fg commands are only available on systems that support job control.

Table A-1. Common Shell Features

A.2. Differing features

The table below shows major differences between the standard shell (sh), Bourne Again SHell (bash), Korn shell (ksh) and the C shell (csh).
Table A-2. Differing Shell Features
The Bourne Again SHell has many more features not listed here. This table is just to give you an idea of how this shell incorporates all useful ideas from other shells: there are no blanks in the column for bash. More information on features found only in Bash can be retrieved from the Bash info pages, in the "Bash Features" section.

More information:

You should at least read one manual, being the manual of your shell. The preferred choice would be info bash, bash being the GNU shell and easiest for beginners. Print it out and take it home, study it whenever you have 5 minutes.

Appendix B. GNU Free Documentation License

Version 1.1, March 2000
B.1. PreambleThe purpose of this License is to make a manual, textbook, or other written document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others. This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software. We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference. B.2. Applicability and definitionsThis License applies to any manual or other work that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language. A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. 
(For example, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them. The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, whose contents can be viewed and edited directly and straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup has been designed to thwart or discourage subsequent modification by readers is not Transparent. A copy that is not "Transparent" is called "Opaque". Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML designed for human modification. Opaque formats include PostScript, PDF, proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML produced by some word processors for output purposes only. 
The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text. B.3. Verbatim copyingYou may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3. You may also lend copies, under the same conditions stated above, and you may publicly display copies. B.4. Copying in quantityIf you publish printed copies of the Document numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects. 
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public. It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document. B.5. ModificationsYou may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles. You may add a section entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version. B.6. Combining documentsYou may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. 
If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work. In the combination, you must combine any sections entitled "History" in the various original documents, forming one section entitled "History"; likewise combine any sections entitled "Acknowledgements", and any sections entitled "Dedications". You must delete all sections entitled "Endorsements." B.7. Collections of documentsYou may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects. You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document. B.8. Aggregation with independent worksA compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an "aggregate", and this License does not apply to the other self-contained works thus compiled with the Document, on account of their being thus compiled, if they are not themselves derivative works of the Document. 
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one quarter of the entire aggregate, the Document's Cover Texts may be placed on covers that surround only the Document within the aggregate. Otherwise they must appear on covers around the whole aggregate. B.9. TranslationTranslation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License provided that you also include the original English version of this License. In case of a disagreement between the translation and the original English version of this License, the original English version will prevail. B.10. TerminationYou may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. B.11. Future revisions of this licenseThe Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/. Each version of the License is given a distinguishing version number. 
If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. B.12. How to use this License for your documentsTo use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
If you have no Invariant Sections, write "with no Invariant Sections" instead of saying which ones are invariant. If you have no Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being LIST"; likewise for Back-Cover Texts. If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.








