Shell Script Hack #12: Array Mastery – Managing Lists and Collections

Managing lists of data in shell scripts often means fragile string splitting, messy loops, and temporary files. Bash arrays offer a cleaner way to store and manipulate collections of data. Let me show you how to unlock array superpowers in your scripts.

The Problem

You need to store a list of servers and perform operations on each one. The old-school approach:

# Ugly string splitting
servers="web1 web2 web3 db1 db2"
for server in $servers; do
    ssh "$server" uptime
done

# What if server names have spaces? Breaks!

The Hack: Bash Arrays

Arrays store each item as a separate element, so values that contain spaces or special characters stay intact:

# Declare array
servers=(web1 web2 web3 db1 db2)

# Iterate safely
for server in "${servers[@]}"; do
    ssh "$server" uptime
done

Clean, safe, and powerful!
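
To see the difference concretely, here is a minimal sketch that counts loop iterations for both approaches. The host labels are made up for illustration, and one of them deliberately contains a space.

```bash
#!/usr/bin/env bash

# Hypothetical host labels; "backup server" contains a space.
hosts_string="web1 backup server db1"
hosts_array=("web1" "backup server" "db1")

# String approach: the unquoted expansion is split on whitespace
string_count=0
for host in $hosts_string; do
    string_count=$((string_count + 1))
done

# Array approach: each element stays intact
array_count=0
for host in "${hosts_array[@]}"; do
    array_count=$((array_count + 1))
done

echo "String loop saw $string_count items"   # 4
echo "Array loop saw $array_count items"     # 3
```

The string loop sees "backup" and "server" as two separate hosts, while the array loop sees the three names that were intended.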

Array Basics

Creating Arrays

# Method 1: Direct assignment
fruits=(apple banana cherry)

# Method 2: Individual elements
colors[0]="red"
colors[1]="green"
colors[2]="blue"

# Method 3: From command output (one element per line)
# (for filenames alone, a plain glob is simpler: files=(*.txt))
mapfile -t files < <(ls *.txt)

# Method 4: Empty array
empty_array=()

# Method 5: Declare explicitly
declare -a numbers

Accessing Elements

names=(Alice Bob Charlie)

# Single element
echo "${names[0]}"      # Alice
echo "${names[1]}"      # Bob

# Last element (negative subscripts need Bash 4.2 or later)
echo "${names[-1]}"     # Charlie

# All elements
echo "${names[@]}"      # Alice Bob Charlie
echo "${names[*]}"      # Alice Bob Charlie

# All elements (properly quoted)
for name in "${names[@]}"; do
    echo "$name"
done
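
The two "all elements" forms only look identical when unquoted. Inside double quotes they behave differently, which the following sketch tries to show. The `count_args` helper is a hypothetical function added just for the demonstration.

```bash
#!/usr/bin/env bash

names=("Alice Smith" "Bob" "Charlie")

# Hypothetical helper: prints how many arguments it received
count_args() { echo "$#"; }

at_count=$(count_args "${names[@]}")     # one argument per element
star_count=$(count_args "${names[*]}")   # all elements joined into one argument

echo "With [@]: $at_count arguments"     # 3
echo "With [*]: $star_count argument"    # 1

# "${names[*]}" joins elements with the first character of IFS,
# which makes it handy for building delimited strings.
joined=$(IFS=,; echo "${names[*]}")
echo "$joined"                           # Alice Smith,Bob,Charlie
```

As a rule of thumb, reach for "${names[@]}" when passing or looping over elements, and "${names[*]}" when you want a single joined string.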

Array Properties

fruits=(apple banana cherry date)

# Length
echo "${#fruits[@]}"    # 4

# Indices
echo "${!fruits[@]}"    # 0 1 2 3

# Check if element exists ([[ ]] keeps fruits[2] from being treated as a glob; Bash 4.3+)
if [[ -v fruits[2] ]]; then
    echo "Index 2 exists"
fi
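
The index list is useful when a loop needs both the position and the value. A small sketch, with the `numbered` array introduced only for this example:

```bash
#!/usr/bin/env bash

fruits=(apple banana cherry)
numbered=()

# "${!fruits[@]}" expands to the indices, so index and value are both available
for i in "${!fruits[@]}"; do
    numbered+=("$((i + 1)). ${fruits[i]}")
done

printf '%s\n' "${numbered[@]}"
# 1. apple
# 2. banana
# 3. cherry
```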

Array Manipulation

Adding Elements

numbers=(1 2 3)

# Append to end
numbers+=(4)
numbers+=(5 6)

# Add to beginning (create new array)
numbers=(0 "${numbers[@]}")

echo "${numbers[@]}"    # 0 1 2 3 4 5 6

# Insert at position
position=2
value=99
numbers=("${numbers[@]:0:$position}" "$value" "${numbers[@]:$position}")

Removing Elements

colors=(red green blue yellow)

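# Remove by index (quote the subscript so the shell cannot treat it as a glob)
unset 'colors[1]'
echo "${colors[@]}"     # red blue yellow

# Remove last element (unset with a negative subscript needs Bash 4.3+)
unset 'colors[-1]'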

# Remove all
unset colors

# Remove by value (create new array)
colors=(red green blue green yellow)
new_colors=()
for color in "${colors[@]}"; do
    [[ $color != "green" ]] && new_colors+=("$color")
done
echo "${new_colors[@]}" # red blue yellow
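
One detail worth knowing about removal by index: unset does not renumber the remaining elements, so the array ends up with a gap. The sketch below shows the gap and one way to re-pack the array. The variable names holding the results are only there for illustration.

```bash
#!/usr/bin/env bash

colors=(red green blue yellow)
unset 'colors[1]'

# The remaining elements keep their original indices
indices_after_unset="${!colors[*]}"
count_after_unset=${#colors[@]}
echo "Indices: $indices_after_unset"    # 0 2 3
echo "Count:   $count_after_unset"      # 3

# Re-pack into consecutive indices by rebuilding the array
colors=("${colors[@]}")
indices_after_repack="${!colors[*]}"
echo "Indices: $indices_after_repack"   # 0 1 2
```

This matters for code that loops with a counter from 0 to the array length, since it would hit the empty slot and miss the last element.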

Slicing Arrays

numbers=(0 1 2 3 4 5 6 7 8 9)

# ${array[@]:start:length}

# From index 3, length 4
echo "${numbers[@]:3:4}"    # 3 4 5 6

# From index 5 to end
echo "${numbers[@]:5}"      # 5 6 7 8 9

# Last 3 elements
echo "${numbers[@]: -3}"    # 7 8 9

# All except last 2
echo "${numbers[@]:0:${#numbers[@]}-2}"  # 0 1 2 3 4 5 6 7
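
Slicing also makes it easy to work through an array in fixed-size batches. This is a minimal sketch; the `items` data, the `batch_size` of 3, and the `batches` collector are assumptions made for the example.

```bash
#!/usr/bin/env bash

items=(a b c d e f g)
batch_size=3
batches=()

# Step through the array batch_size elements at a time
for ((i = 0; i < ${#items[@]}; i += batch_size)); do
    batch=("${items[@]:i:batch_size}")
    batches+=("${batch[*]}")
    echo "Batch: ${batch[*]}"
done
# Batch: a b c
# Batch: d e f
# Batch: g
```

The last slice simply comes back shorter when fewer than batch_size elements remain.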

Practical Examples

Server Management

#!/bin/bash

# Define servers
web_servers=(web1.example.com web2.example.com web3.example.com)
db_servers=(db1.example.com db2.example.com)
cache_servers=(cache1.example.com)

# Combine all servers
all_servers=("${web_servers[@]}" "${db_servers[@]}" "${cache_servers[@]}")

# Run command on all servers
for server in "${all_servers[@]}"; do
    echo "Checking $server..."
    ssh "$server" "uptime; df -h /"
done

# Parallel execution
for server in "${web_servers[@]}"; do
    (ssh "$server" "systemctl restart nginx") &
done
wait

echo "All servers restarted"

File Processing

#!/bin/bash

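# Collect files (NUL-delimited so names with spaces survive; mapfile -d needs Bash 4.4+)
mapfile -d '' -t image_files < <(find . \( -name "*.jpg" -o -name "*.png" \) -print0)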

echo "Found ${#image_files[@]} images"

# Process each file
for file in "${image_files[@]}"; do
    basename="${file##*/}"
    dirname="${file%/*}"
    
    echo "Processing: $basename"
    convert "$file" -resize 800x600 "${dirname}/thumb_${basename}"
done

# Create archive
tar czf images_backup.tar.gz "${image_files[@]}"

Configuration Management

#!/bin/bash

# Define required packages
required_packages=(
    nginx
    postgresql
    redis-server
    nodejs
    npm
)

# Check which are installed
installed=()
missing=()

for pkg in "${required_packages[@]}"; do
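    # Query the exact package name; grepping "dpkg -l" output also matches substrings
    if dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then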
        installed+=("$pkg")
    else
        missing+=("$pkg")
    fi
done

echo "Installed: ${installed[@]}"
echo "Missing: ${missing[@]}"

# Install missing packages
if [ ${#missing[@]} -gt 0 ]; then
    echo "Installing missing packages..."
    apt-get install -y "${missing[@]}"
fi

Log Analysis

#!/bin/bash

# Read log entries into array
mapfile -t log_lines < access.log

# Count by status code
declare -A status_counts

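for line in "${log_lines[@]}"; do
    # Extract status code (assuming it's field 9) without spawning awk per line
    read -r -a fields <<< "$line"
    status=${fields[8]}
    # Skip lines with no ninth field; an empty key is not a valid subscript
    [[ -n $status ]] || continue
    ((status_counts[$status]++))
done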

# Display results
for status in "${!status_counts[@]}"; do
    echo "Status $status: ${status_counts[$status]} times"
done

Associative Arrays (Dictionaries)

Basic Usage

# Declare associative array
declare -A colors

# Assign values
colors[apple]="red"
colors[banana]="yellow"
colors[grape]="purple"

# Access values
echo "${colors[apple]}"     # red

# All keys (order is not guaranteed)
echo "${!colors[@]}"        # apple banana grape, in some order

# All values (same caveat)
echo "${colors[@]}"         # red yellow purple, in some order

# Iterate
for fruit in "${!colors[@]}"; do
    echo "$fruit is ${colors[$fruit]}"
done
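
Two things come up quickly with associative arrays: checking whether a key exists, and getting keys in a predictable order. The following is a sketch of one way to do both. The `has_apple`, `has_kiwi`, and `sorted_keys` names are introduced only for this example.

```bash
#!/usr/bin/env bash

declare -A colors=([apple]=red [banana]=yellow [grape]=purple)

# ${colors[key]+set} expands to "set" only if the key exists,
# even when its value is an empty string
has_apple=no
has_kiwi=no
[[ -n ${colors[apple]+set} ]] && has_apple=yes
[[ -n ${colors[kiwi]+set} ]] && has_kiwi=yes
echo "apple: $has_apple, kiwi: $has_kiwi"   # apple: yes, kiwi: no

# Sort the keys first when the output order matters
mapfile -t sorted_keys < <(printf '%s\n' "${!colors[@]}" | sort)
for fruit in "${sorted_keys[@]}"; do
    echo "$fruit is ${colors[$fruit]}"
done
# apple is red
# banana is yellow
# grape is purple
```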

Real-World Example: Environment Config

#!/bin/bash

# Configuration for different environments
declare -A dev_config
dev_config[db_host]="localhost"
dev_config[db_port]="5432"
dev_config[db_name]="myapp_dev"
dev_config[debug]="true"

declare -A prod_config
prod_config[db_host]="prod-db.example.com"
prod_config[db_port]="5432"
prod_config[db_name]="myapp_prod"
prod_config[debug]="false"

# Select environment
ENV=${1:-dev}

if [ "$ENV" = "prod" ]; then
    declare -n config=prod_config
else
    declare -n config=dev_config
fi

# Use configuration
echo "Connecting to ${config[db_host]}:${config[db_port]}"
echo "Database: ${config[db_name]}"
echo "Debug mode: ${config[debug]}"

Advanced Techniques

Array of Arrays (Workaround)

#!/bin/bash

# Bash doesn't support nested arrays directly
# Workaround: use name references

declare -a server1=(web1 8080 nginx)
declare -a server2=(web2 8081 apache)
declare -a server3=(db1 5432 postgresql)

servers=(server1 server2 server3)

for server_ref in "${servers[@]}"; do
    declare -n server=$server_ref
    echo "Server: ${server[0]}, Port: ${server[1]}, Service: ${server[2]}"
done
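
If name references are not available (they need Bash 4.3 or later), another common workaround is to pack each record into one delimited string and split it on use. This sketch assumes the fields never contain the ":" delimiter; the `summary` array is only there to collect the results.

```bash
#!/usr/bin/env bash

# One record per element, fields separated by ":"
servers=(
    "web1:8080:nginx"
    "web2:8081:apache"
    "db1:5432:postgresql"
)

summary=()
for record in "${servers[@]}"; do
    # IFS=: applies only to this read, so the global IFS is untouched
    IFS=: read -r name port service <<< "$record"
    summary+=("$name runs $service on port $port")
done

printf '%s\n' "${summary[@]}"
# web1 runs nginx on port 8080
# web2 runs apache on port 8081
# db1 runs postgresql on port 5432
```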

Reading Files into Arrays

# Method 1: mapfile (readarray)
mapfile -t lines < file.txt

# Method 2: Read loop (start from an empty array, since += appends)
lines=()
while IFS= read -r line; do
    lines+=("$line")
done < file.txt

# Method 3: From command
IFS=$'\n' read -r -d '' -a lines < <(cat file.txt && printf '\0')

# Skip empty lines
mapfile -t lines < <(grep -v '^$' file.txt)
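
Line-based reading works for text files, but filenames can legally contain newlines. When the input is a list of paths, a NUL-delimited read is the safer choice. The sketch below uses a temporary directory and made-up filenames purely for demonstration. On Bash 4.4 or later, mapfile -d '' can replace the loop.

```bash
#!/usr/bin/env bash

# Throwaway directory with made-up filenames, one containing a space
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt"

# find -print0 ends each path with a NUL byte; read -d '' splits on it
found=()
while IFS= read -r -d '' path; do
    found+=("$path")
done < <(find "$workdir" -name '*.txt' -print0)

echo "Found ${#found[@]} files"   # Found 2 files

rm -rf "$workdir"
```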

Sorting Arrays

#!/bin/bash

numbers=(5 2 8 1 9 3)

# Sort (create new array)
IFS=$'\n' sorted=($(sort -n <<< "${numbers[*]}"))
unset IFS

echo "Original: ${numbers[@]}"
echo "Sorted: ${sorted[@]}"

# Sort strings
names=(Charlie Alice Bob)
IFS=$'\n' sorted_names=($(sort <<< "${names[*]}"))
unset IFS

echo "Sorted names: ${sorted_names[@]}"
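
The IFS-based idiom above still expands its result unquoted, so an element such as * would be replaced by matching filenames. An alternative sketch that avoids both the IFS juggling and the glob expansion pipes the elements through sort into mapfile. The sample names are made up, and LC_ALL=C is set here only to make the ordering byte-wise and predictable.

```bash
#!/usr/bin/env bash

names=("Charlie Brown" "alice" "*" "Bob")

# One element per line into sort, one line per element back out
mapfile -t sorted_names < <(printf '%s\n' "${names[@]}" | LC_ALL=C sort)

printf '%s\n' "${sorted_names[@]}"
# *
# Bob
# Charlie Brown
# alice
```

This still assumes no element contains a newline, since sort works line by line.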

Unique Elements

#!/bin/bash

# Remove duplicates
items=(apple banana apple cherry banana date)

# Using associative array
declare -A seen
unique=()

for item in "${items[@]}"; do
    if [ -z "${seen[$item]}" ]; then
        seen[$item]=1
        unique+=("$item")
    fi
done

echo "Unique: ${unique[@]}"  # apple banana cherry date
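
A closely related need is checking whether a value is already in an indexed array. Bash has no built-in for that, so a small helper is the usual answer. The `contains` function below is a hypothetical helper written for this example, not something Bash provides.

```bash
#!/usr/bin/env bash

# contains NEEDLE ELEMENT...  returns 0 if NEEDLE equals one of the elements
contains() {
    local needle=$1
    shift
    local item
    for item in "$@"; do
        [[ $item == "$needle" ]] && return 0
    done
    return 1
}

items=(apple banana cherry)

found_banana=no
found_kiwi=no
contains banana "${items[@]}" && found_banana=yes
contains kiwi "${items[@]}" && found_kiwi=yes

echo "banana: $found_banana, kiwi: $found_kiwi"   # banana: yes, kiwi: no
```

For large lists or many lookups, the associative-array approach shown above is faster because it avoids scanning the whole array each time.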

Array Patterns

Command Line Arguments

#!/bin/bash

# Store all arguments
args=("$@")

echo "Number of arguments: ${#args[@]}"
echo "First argument: ${args[0]}"
echo "Last argument: ${args[-1]}"

# Process arguments
for arg in "${args[@]}"; do
    case "$arg" in
        --verbose|-v)
            VERBOSE=true
            ;;
        --output=*)
            OUTPUT="${arg#*=}"
            ;;
        *)
            FILES+=("$arg")
            ;;
    esac
done

echo "Files to process: ${FILES[@]}"
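
The for loop above cannot handle an option whose value arrives as a separate argument, such as --output result.txt. A while loop with shift covers that case. This is a sketch under assumptions: the option names mirror the ones above, and set -- is used only to simulate command-line arguments with made-up values.

```bash
#!/usr/bin/env bash

# Simulate command-line arguments for the demonstration
set -- --verbose --output result.txt input1.txt "input 2.txt"

verbose=false
output=""
files=()

while (( $# > 0 )); do
    case "$1" in
        -v|--verbose)
            verbose=true
            shift
            ;;
        --output)
            # The value is the next argument; stop if it is missing
            if (( $# < 2 )); then
                echo "--output needs a value" >&2
                break
            fi
            output=$2
            shift 2
            ;;
        --output=*)
            output=${1#*=}
            shift
            ;;
        *)
            files+=("$1")
            shift
            ;;
    esac
done

echo "verbose=$verbose output=$output files=${#files[@]}"
# verbose=true output=result.txt files=2
```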

Queue Implementation

#!/bin/bash

# Simple queue
queue=()

# Enqueue (add to end)
enqueue() {
    queue+=("$1")
}

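# Dequeue (remove from start)
# The item is stored in the global variable "dequeued" rather than printed:
# calling dequeue inside $(...) would run it in a subshell, and the
# queue in the calling shell would never shrink.
dequeue() {
    if [ ${#queue[@]} -eq 0 ]; then
        echo "Queue empty" >&2
        return 1
    fi
    dequeued="${queue[0]}"
    queue=("${queue[@]:1}")
}

# Usage
enqueue "task1"
enqueue "task2"
enqueue "task3"

while [ ${#queue[@]} -gt 0 ]; do
    dequeue
    echo "Processing: $dequeued"
done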

Stack Implementation

#!/bin/bash

# Simple stack
stack=()

# Push
push() {
    stack+=("$1")
}

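# Pop
# The item is stored in the global variable "popped" rather than printed:
# calling pop inside $(...) would run it in a subshell, so the stack in
# the calling shell would keep its last element.
pop() {
    if [ ${#stack[@]} -eq 0 ]; then
        echo "Stack empty" >&2
        return 1
    fi
    popped="${stack[-1]}"
    unset 'stack[-1]'
}

# Usage
push "item1"
push "item2"
push "item3"

pop && echo "Popped: $popped"  # item3
pop && echo "Popped: $popped"  # item2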
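
One thing to watch with helper functions like these: command substitution, $(...), runs the function in a subshell, so any change it makes to an array is lost in the calling shell. A minimal sketch, using a hypothetical `add_item` function and `counter` array:

```bash
#!/usr/bin/env bash

counter=()

# Hypothetical helper that modifies a global array and prints a message
add_item() {
    counter+=("x")
    echo "added"
}

# Called through $(...): the append happens in a subshell and is discarded
message=$(add_item)
after_subshell=${#counter[@]}
echo "After \$(add_item): $after_subshell elements"   # 0

# Called directly: the append happens in the current shell
add_item > /dev/null
after_direct=${#counter[@]}
echo "After direct call: $after_direct elements"      # 1
```

So a function that both mutates an array and returns a value should hand the value back through a variable instead of through its output.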

Complete Example: Deployment Script

#!/bin/bash

# Define deployment stages
declare -a stages=(
    "validate"
    "test"
    "build"
    "backup"
    "deploy"
    "verify"
)

# Define servers by role
declare -a web_servers=(web1 web2 web3)
declare -a api_servers=(api1 api2)
declare -a worker_servers=(worker1)

# Track results
declare -A results

# Execute stage
execute_stage() {
    local stage=$1
    echo "Executing stage: $stage"
    
    case "$stage" in
        validate)
            npm run lint && npm run type-check
            ;;
        test)
            npm test
            ;;
        build)
            npm run build
            ;;
        backup)
            for server in "${web_servers[@]}"; do
                ssh "$server" "tar czf /backup/app-$(date +%Y%m%d).tar.gz /var/www/app"
            done
            ;;
        deploy)
            local all_servers=("${web_servers[@]}" "${api_servers[@]}" "${worker_servers[@]}")
            for server in "${all_servers[@]}"; do
                rsync -avz ./dist/ "$server:/var/www/app/"
                ssh "$server" "systemctl restart app"
            done
            ;;
        verify)
            for server in "${web_servers[@]}"; do
                curl -f "http://$server/health" || return 1
            done
            ;;
    esac
}

# Run deployment
for stage in "${stages[@]}"; do
    if execute_stage "$stage"; then
        results[$stage]="SUCCESS"
        echo "✓ $stage completed"
    else
        results[$stage]="FAILED"
        echo "✗ $stage failed"
        exit 1
    fi
done

# Report
echo ""
echo "Deployment Report:"
for stage in "${stages[@]}"; do
    echo "  $stage: ${results[$stage]}"
done

Pro Tips

  • Always quote: Use "${array[@]}", not ${array[@]}
  • Check length: Test ${#array[@]} before accessing
  • Use mapfile: Faster than loop for reading files
  • Associative arrays: Perfect for configs and lookups
  • Index -1: Easy access to last element
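
The length check and the -1 index go together: asking for the last element of an empty array is an error, so it pays to test the length first. A small sketch, where `describe_last` is a hypothetical helper written for this example:

```bash
#!/usr/bin/env bash

# Prints the last of its arguments, or "(empty)" when there are none
describe_last() {
    local items=("$@")
    if (( ${#items[@]} == 0 )); then
        echo "(empty)"
    else
        echo "${items[-1]}"
    fi
}

empty=()
tasks=(build test deploy)

last_of_empty=$(describe_last "${empty[@]}")
last_of_tasks=$(describe_last "${tasks[@]}")

echo "$last_of_empty"   # (empty)
echo "$last_of_tasks"   # deploy
```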

Common Mistakes

# Wrong - word splitting
for item in ${array[@]}; do  # Don't do this

# Correct
for item in "${array[@]}"; do

# Wrong - treating as string
echo $array  # Only prints first element

# Correct
echo "${array[@]}"  # Prints all elements

# Wrong - unquoted expansion
files=($(ls))  # Breaks with spaces

# Better
mapfile -t files < <(ls)
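
For plain file listings, it is usually possible to skip ls altogether and let the shell expand a glob straight into the array. The sketch below uses a temporary directory with made-up filenames. The nullglob option is turned on so that a pattern with no matches yields an empty array instead of the literal pattern.

```bash
#!/usr/bin/env bash

# Throwaway directory with made-up filenames, one containing a space
workdir=$(mktemp -d)
touch "$workdir/a.log" "$workdir/b c.log"

shopt -s nullglob
logs=("$workdir"/*.log)    # one element per matching file
texts=("$workdir"/*.txt)   # no matches, so the array is empty
shopt -u nullglob

echo "Logs:  ${#logs[@]}"    # 2
echo "Texts: ${#texts[@]}"   # 0

rm -rf "$workdir"
```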

Conclusion

Bash arrays are a powerful feature that transforms how you handle collections of data in shell scripts. From simple lists to complex associative arrays, they enable clean, efficient data management without external tools. Master arrays and your scripts will be more robust and maintainable.

Stop treating everything as strings. Use arrays and unlock the full power of Bash!
