Mastering Error Handling in Shell Scripting

In the world of DevOps, shell scripting plays an essential role in automating repetitive tasks. However, to ensure reliability and smooth execution, it's crucial to handle errors effectively within scripts. In this blog post, I’ll walk through some fundamental error handling techniques in Bash scripting, which will help you create more resilient automation scripts.

Why is Error Handling Important?

In a real-world environment, scripts often interact with files, directories, and external systems. If an unexpected error occurs (e.g., a missing file or a permission problem), Bash by default simply moves on to the next command, so later steps may run against missing or incomplete data. By anticipating potential failures and handling them gracefully, we can stop scripts at the right point, produce better logs, and make troubleshooting easier.

1. Understanding Exit Status

Every command in Linux returns an exit status:

0 means success.
Non-zero means failure. The value ranges from 1 to 255, and many commands use different values for different kinds of errors.
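A quick way to see these values is to run a few commands and print `$?` right after each one. This is a minimal sketch; `no_such_command_xyz` is a made-up name chosen only because it should not exist on your system.

```bash
#!/bin/bash

true
echo "true  -> $?"      # prints: true  -> 0

false
echo "false -> $?"      # prints: false -> 1

# Bash reports status 127 when a command cannot be found.
# 'no_such_command_xyz' is a hypothetical name used for illustration.
no_such_command_xyz 2>/dev/null
not_found_status=$?
echo "missing command -> $not_found_status"   # prints: missing command -> 127
```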

In this section, we’ll look at how to check the exit status after a command to determine if it executed successfully.

Example: Checking Exit Status

#!/bin/bash

: <<'Info'
Author      : Amitabh Soni
Date        : 6/12/24
Description : This script attempts to create a directory and checks if the command was successful. If not, it prints an error message.
Info

# Creating directory
mkdir -p task1

# Checking whether the previous command succeeded
if [[ $? -eq 0 ]]; then
    echo "Directory 'task1' successfully created."
else
    echo "ERROR: Failed to create directory 'task1'. Please check your permissions or try again."
fi

What’s happening here?

The mkdir command tries to create a directory.
$? holds the exit status of the last command.
If the exit status is 0, the directory was created successfully; otherwise, an error occurred.
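The same check can also be written by testing the command directly, since `if` branches on the exit status of whatever command follows it. The sketch below assumes a hypothetical directory name, `task1_demo`, and shows how to save `$?` when you still need the numeric value.

```bash
#!/bin/bash

# Hypothetical directory name, used only for illustration.
dir="task1_demo"

# 'if' runs mkdir and branches on its exit status; no separate $? check needed.
if mkdir -p "$dir"; then
    echo "Directory '$dir' is ready."
else
    # Save the status immediately, before another command overwrites $?.
    status=$?
    echo "ERROR: mkdir failed with exit status $status." >&2
fi

# '||' runs the right-hand side only when the left-hand command fails.
mkdir -p "$dir" || echo "ERROR: could not create '$dir'." >&2
```

Testing the command directly avoids a common mistake: putting another command (even an `echo`) between the command and the `$?` check, which silently replaces the status you meant to test.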


2. Using `if` Statements for Error Checking

It's common to execute multiple commands in a script, and it's crucial to check for errors at each step. Using if statements ensures that each command is checked for success before proceeding to the next.

Example: File Creation with Error Checking

#!/bin/bash

: <<'Info'
Author      : Amitabh Soni
Date        : 6/12/24
Description : This script creates a directory, then creates a file inside it if the file does not already exist. Each step is checked, and an error message is printed if it fails.
Info

# Task 1
# Creating directory
mkdir -p task1

# Checking whether the previous command succeeded
if [[ $? -eq 0 ]]; then
    echo "Directory 'task1' successfully created."
else
    echo "ERROR: Failed to create directory 'task1'. Please check your permissions or try again."
fi

# Task 2
# Checking if directory 'task1' exists
if [[ -d task1 ]]; then

    # Checking if file already exists
    if [[ -f task1/task1.txt ]]; then
        echo "File 'task1.txt' already exists in 'task1'."
    else

        # Create the file if it does not exist
        touch task1/task1.txt

        if [[ $? -eq 0 ]]; then
            echo "File 'task1.txt' successfully created in 'task1'."
        else
            echo "ERROR: Failed to create file 'task1.txt' in 'task1'."
        fi
    fi
else
    echo "ERROR: Directory 'task1' does not exist. Cannot create file."
fi

This script first creates a directory, then checks for the existence of a file inside it. If the file doesn’t exist, it creates the file, handling errors at every stage.
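When a script has many steps, the repeated `if [[ $? -eq 0 ]]` blocks can be folded into a small helper. The following is a sketch of one way to do that, not the only one; `step_failed`, `create_task_file`, and the `task1_demo` path are hypothetical names introduced for illustration.

```bash
#!/bin/bash

# Hypothetical helper: print an error message on stderr and return non-zero.
step_failed() {
    echo "ERROR: $1" >&2
    return 1
}

# Hypothetical paths, for illustration only.
dir="task1_demo"
file="$dir/task1.txt"

create_task_file() {
    # Stop at the first step that fails.
    mkdir -p "$dir" || { step_failed "Could not create directory '$dir'."; return 1; }

    if [[ -f "$file" ]]; then
        echo "File '$file' already exists."
        return 0
    fi

    touch "$file" || { step_failed "Could not create file '$file'."; return 1; }
    echo "File '$file' successfully created."
}

create_task_file
```

Because the function returns as soon as a step fails, the later steps never run against a directory that was not created.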

3. Using `trap` for Cleanup

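When scripts encounter errors, you might want to perform some cleanup (e.g., deleting temporary files or undoing changes). The trap command lets you register commands that Bash runs when a given signal or event occurs. With the ERR event used below, the handler runs whenever a command returns a non-zero exit status; the script itself keeps running afterwards unless you exit explicitly.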

Example: Using trap for Cleanup

#!/bin/bash

: <<'Info'
Author      : Amitabh Soni
Date        : 6/12/24
Description : This script creates a temporary file and sets a trap to delete the file if a command in the script fails.
Info

# Create a temporary directory
mkdir -p tmp

# Create a temporary file inside the directory
touch tmp/temp1.txt
echo "This is a temporary file." > tmp/temp1.txt

# Set an ERR trap: Bash runs it whenever a command returns a non-zero exit status
trap "rm -f tmp/temp1.txt; echo 'Temporary file deleted due to error.'" ERR

# Simulate script work here
echo "Script is running. The temporary file has been created."

# Simulate an error to test the trap (comment out the next line to see the success path)
invalid_command  # This fails with "command not found" and activates the trap

# Check the exit status of the previous command to decide what happens next
if [[ $? -ne 0 ]]; then
    # Error occurred, file has been deleted by the trap
    echo "An error occurred, script exiting."
    exit 1
else
    # No error, continue execution
    echo "Script completed successfully. Temporary file will not be deleted."
fi

Here, the trap command ensures that the temporary file is deleted when an error occurs in the script, preventing unnecessary clutter on the filesystem.
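If the cleanup should run no matter how the script ends, a trap on EXIT is a common alternative to ERR. The sketch below is an illustration under a few assumptions: it uses `mktemp -d` for the temporary directory, and it wraps the work in a subshell only so the demo can continue and inspect the result afterwards. In a real script the trap would normally be set once near the top.

```bash
#!/bin/bash

# Create a uniquely named temporary directory.
work_dir=$(mktemp -d)

(
    # Single quotes delay expansion of $work_dir until the trap actually runs.
    trap 'rm -rf "$work_dir"' EXIT

    echo "This is a temporary file." > "$work_dir/temp1.txt"

    # Simulate a failure partway through the job.
    exit 3
)
job_status=$?

echo "Job finished with status $job_status."
if [[ ! -e "$work_dir" ]]; then
    echo "Temporary directory was removed by the EXIT trap."
fi
```

The EXIT trap fires on a normal finish as well as on an explicit `exit`, so the temporary files are removed on both the success and the failure path.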

4. Redirecting Errors

Sometimes, you may want to redirect error messages to a file for later review instead of displaying them on the terminal. You can achieve this using output redirection.

Example: Redirecting Errors to a Log File

#!/bin/bash

: <<'Info'
Author      : Amitabh Soni
Date        : 6/12/24
Description : This script attempts to read a non-existent file and redirects the error message to a file called error.log.
Info

# Attempting to read a non-existent file
cat non_existent_file.txt 2> error.log

# The redirection always creates error.log, even when no error occurs,
# so check that the file is non-empty (-s) rather than that it exists (-f)
if [[ -s error.log ]]; then
    echo "Error has been logged to error.log"
else
    echo "No error occurred."
fi

In this script, the error message generated by attempting to read a non-existent file is redirected to error.log.
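A few variations on this redirection are worth knowing. The sketch below, which uses hypothetical file names (`non_existent_demo_file.txt` and `error_demo.log`), appends to the log instead of overwriting it, decides success from the exit status, and shows how to send normal output and errors to the same file.

```bash
#!/bin/bash

# Hypothetical file names, used only for this sketch.
missing_file="non_existent_demo_file.txt"
log_file="error_demo.log"

: > "$log_file"    # start the demo with an empty log

# 2>> appends stderr to the log instead of overwriting it as 2> would.
cat "$missing_file" 2>> "$log_file"
cat_status=$?      # save the status before another command replaces it

if [[ $cat_status -ne 0 ]]; then
    echo "cat failed with status $cat_status; details are in $log_file."
fi

# -s is true only if the file exists and is not empty.
if [[ -s "$log_file" ]]; then
    echo "The log contains at least one error message."
fi

# To capture normal output and errors together, redirect stdout to the
# file first and then point stderr at the same place with 2>&1.
ls . "$missing_file" >> "$log_file" 2>&1
```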

5. Customizing Error Messages

By adding custom error messages, we can make it easier for users or developers to understand the nature of a problem. This is particularly useful when debugging or reporting issues.

Example: Custom Error Messages

#!/bin/bash

: <<'Info'
Author      : Amitabh Soni
Date        : 6/12/24
Description : This script attempts to read a non-existent file and redirects the error message to a file called error.log with custom error messages.
Info

# Attempting to read a non-existent file
echo "Attempting to read a non-existent file: non_existent_file.txt"
cat non_existent_file.txt 2> error.log

# Custom error message for failure
# error.log is always created by the redirection, so test for non-empty (-s)
if [[ -s error.log ]]; then
    echo "Custom Error Message: The file 'non_existent_file.txt' does not exist or cannot be read." >> error.log
    echo "The error has been logged to error.log with more context."
else
    echo "No error occurred."
fi
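Custom messages are easier to keep consistent when they go through one small function. Here is a minimal sketch of that idea; `log_error`, `error_demo.log`, and `non_existent_demo_file.txt` are hypothetical names, and the timestamp format is just one reasonable choice.

```bash
#!/bin/bash

# Hypothetical log path and target file, for illustration only.
log_file="error_demo.log"
target="non_existent_demo_file.txt"

# Hypothetical helper: prefix the message with a timestamp and the script
# name, print it on stderr, and append it to the log file.
log_error() {
    local message="$1"
    local line
    line="$(date '+%Y-%m-%d %H:%M:%S') [${0##*/}] ERROR: $message"
    echo "$line" >&2
    echo "$line" >> "$log_file"
}

# Keep the original error text from cat in the log, then add our own context.
if ! cat "$target" 2>> "$log_file"; then
    log_error "Could not read '$target'. Check that the file exists and is readable."
fi
```

Writing the message to stderr as well as the log keeps it visible on the terminal while still leaving a record for later review.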

Conclusion

Handling errors gracefully is a critical skill for any DevOps practitioner. By using the right tools like exit status checks, trap for cleanup, error redirection, and custom messages, we can ensure that our shell scripts are robust and dependable. These techniques will help you write better automation scripts that handle unexpected failures effectively.

Happy Scripting!


Call to Action:

Let me know if you have any questions about error handling in shell scripting.
Follow my #90DaysOfDevOps challenge on Hashnode to learn more DevOps tips!

Day-11 of #90DaysOfDevops : Error Handling in Shell Scripting: A Key Aspect of Robust DevOps Automation
