Introduction to wget
wget (from "World Wide Web" and "get") is a powerful command-line utility for downloading files from the internet. It supports HTTP, HTTPS, and FTP, and can handle recursive downloads, resume interrupted transfers, and much more. Understanding wget is essential for automated downloads, web scraping, and system administration.
Basic Syntax
wget [options] [URL]
Key Features
- Non-interactive: Works in background, perfect for scripts
- Resume capability: Can resume interrupted downloads
- Recursive downloads: Download entire websites
- Bandwidth control: Limit download speed
- Authentication: Support for HTTP/FTP authentication
- Proxy support: Works with HTTP proxies
1. Basic Usage
Simple File Download
# Download a single file
wget https://example.com/file.zip

# Download with a different filename
wget -O newname.zip https://example.com/file.zip

# Download to a specific directory
wget -P /path/to/directory/ https://example.com/file.zip

# Download multiple files
wget https://example.com/file1.zip https://example.com/file2.zip
Examples
$ wget https://releases.ubuntu.com/22.04/ubuntu-22.04.3-desktop-amd64.iso
--2024-03-11 10:30:15--  https://releases.ubuntu.com/22.04/ubuntu-22.04.3-desktop-amd64.iso
Resolving releases.ubuntu.com... 91.189.91.38
Connecting to releases.ubuntu.com|91.189.91.38|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4823359488 (4.5G) [application/x-iso9660-image]
Saving to: ‘ubuntu-22.04.3-desktop-amd64.iso’

ubuntu-22.04.3-desktop-amd64.iso   0%[          ]  10.23M  1.2MB/s
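After a download like the one above completes, it is good practice to verify the file's integrity. A minimal sketch, assuming the site publishes a .sha256 file next to the download; the filenames are illustrative, and the "downloaded" file is created locally here so the example is self-contained:

```shell
# Stand-in for a real download and its published checksum file
# (in practice both would come from the remote server)
printf 'hello wget\n' > file.zip
sha256sum file.zip > file.zip.sha256

# After "wget https://example.com/file.zip", check integrity:
if sha256sum -c file.zip.sha256; then
    echo "checksum OK"
else
    echo "checksum MISMATCH" >&2
fi
```

The same pattern works with md5sum or sha512sum, depending on what the site publishes.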
2. Common Options
Output Control
# -O: Save with a different filename
wget -O custom_name.html https://example.com

# -P: Save to a directory
wget -P /tmp/downloads/ https://example.com/file.pdf

# -nd: No directory structure
wget -nd -P /downloads/ https://example.com/images/photo.jpg

# -x: Force directory structure
wget -x https://example.com/images/photo.jpg
# Creates: example.com/images/photo.jpg
Quiet and Verbose Modes
# -q: Quiet mode (no output)
wget -q https://example.com/file.zip

# -v: Verbose mode (default)
wget -v https://example.com/file.zip

# -nv / --no-verbose: Less verbose
wget --no-verbose https://example.com/file.zip

# Progress indicators
wget --progress=bar https://example.com/largefile.zip
wget --progress=dot https://example.com/largefile.zip
Download Resume
# -c: Continue/resume a partial download
wget -c https://example.com/largefile.zip

# --start-pos: Start from a specific byte offset
wget --start-pos=1048576 https://example.com/file.zip

# --timeout: Set the network timeout
wget --timeout=10 https://example.com/file.zip
3. Advanced Download Options
Bandwidth Limiting
# --limit-rate: Limit download speed
wget --limit-rate=200k https://example.com/largefile.zip

# Limits in different units
wget --limit-rate=1m https://example.com/largefile.zip    # 1 MB/s
wget --limit-rate=500k https://example.com/largefile.zip  # 500 KB/s
Retry Options
# -t: Number of retries
wget -t 5 https://example.com/file.zip

# --retry-connrefused: Also retry when the connection is refused
wget --retry-connrefused -t 10 https://example.com/file.zip

# --wait: Wait between retrievals
wget --wait=5 -t 3 https://example.com/file.zip

# --waitretry: Wait up to this many seconds between retries
wget --waitretry=10 -t 5 https://example.com/file.zip

# --random-wait: Randomize the wait time
wget --random-wait --wait=5 https://example.com/
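wget's own -t and --waitretry options cover most retry needs; a shell wrapper is only useful when a script also wants custom logging or backoff between attempts. A sketch under that assumption; the flaky stub stands in for a wget call so the example runs offline:

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, up to MAX attempts,
# sleeping DELAY seconds between attempts.
retry() {
    max="$1"; delay="$2"; shift 2
    attempt=1
    while true; do
        "$@" && return 0
        [ "$attempt" -ge "$max" ] && return 1
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Demonstration with a stub that fails twice, then succeeds.
# In real use: retry 5 10 wget -c https://example.com/file.zip
flaky() {
    n=$(cat counter 2>/dev/null || echo 0)
    n=$((n + 1)); echo "$n" > counter
    [ "$n" -ge 3 ]
}
echo 0 > counter
retry 5 0 flaky && echo "succeeded after $(cat counter) attempts"
```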
Timeout Control
# --dns-timeout: DNS lookup timeout
wget --dns-timeout=10 https://example.com

# --connect-timeout: Connection timeout
wget --connect-timeout=15 https://example.com

# --read-timeout: Read timeout
wget --read-timeout=20 https://example.com

# --timeout: Set all three timeouts at once
wget --timeout=30 https://example.com
4. Authentication and Headers
HTTP Authentication
# --user and --password: HTTP authentication
wget --user=username --password=password https://example.com/private/file.zip

# Ask for the password interactively (more secure)
wget --user=username --ask-password https://example.com/private/file.zip

# Using a .netrc file
echo "machine example.com login username password secret" > ~/.netrc
chmod 600 ~/.netrc
wget https://example.com/private/file.zip
Custom Headers
# --header: Add custom HTTP headers
wget --header="User-Agent: Mozilla/5.0" https://example.com
wget --header="Accept: application/json" https://api.example.com/data
wget --header="Referer: https://google.com" https://example.com

# Multiple headers
wget --header="User-Agent: Mozilla/5.0" \
     --header="Accept-Language: en-US,en;q=0.9" \
     --header="Cookie: session=12345" \
     https://example.com

# --referer: Set the referer
wget --referer="https://google.com" https://example.com

# --user-agent: Set the user agent
wget --user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36" \
     https://example.com
Cookies
# --save-cookies: Save cookies to a file
wget --save-cookies cookies.txt --keep-session-cookies \
     https://example.com/login

# --load-cookies: Load cookies from a file
wget --load-cookies cookies.txt https://example.com/protected/page

# --no-cookies: Disable cookies
wget --no-cookies https://example.com
5. FTP Downloads
FTP Operations
# Download an FTP file
wget ftp://username:[email protected]/file.zip

# Anonymous FTP
wget ftp://ftp.example.com/pub/file.zip

# FTP directory listing
wget ftp://ftp.example.com/pub/

# Recursive FTP download
wget -r ftp://ftp.example.com/pub/

# FTP with a specific port
wget ftp://example.com:2121/file.zip
FTP Options
# --ftp-user and --ftp-password
wget --ftp-user=username --ftp-password=password ftp://example.com/file.zip

# --no-passive-ftp: Disable passive mode
wget --no-passive-ftp ftp://example.com/file.zip

# --retr-symlinks: Retrieve symlinked files
wget --retr-symlinks -r ftp://example.com/pub/
6. Recursive Downloads
Basic Recursion
# -r: Recursive download
wget -r https://example.com/docs/

# -l: Recursion depth
wget -r -l 2 https://example.com/docs/  # 2 levels deep

# --no-parent: Don't ascend to the parent directory
wget -r --no-parent https://example.com/docs/

# --accept: Accept specific file types
wget -r --accept pdf,doc,txt https://example.com/docs/
Advanced Recursion
# -np: No parent (don't ascend to the parent directory)
wget -r -np https://example.com/docs/

# --reject: Reject specific file types
wget -r --reject jpg,gif,mp4 https://example.com/gallery/

# --accept-regex: Accept only URLs matching a regex
wget -r --accept-regex='.*\.pdf$' https://example.com/

# --exclude-directories: Skip directories
wget -r --exclude-directories=/images,/tmp https://example.com/

# --include-directories: Only these directories
wget -r --include-directories=/docs,/downloads https://example.com/
Mirror a Website
# -m: Mirror a website (equivalent to -r -N -l inf --no-remove-listing)
wget -m https://example.com/

# Mirror with conversion for offline viewing
wget -m -k -K -E https://example.com/
# -k: Convert links for local viewing
# -K: Keep the original file as .orig
# -E: Adjust extensions

# Mirror with timestamping
wget -m -N https://example.com/
# -N: Only download newer files

# Complete mirror command
wget --mirror --convert-links --adjust-extension \
     --page-requisites --no-parent https://example.com/
7. Page Requisites and Conversion
Download Page Requisites
# -p: Download all page requisites (images, CSS, JS)
wget -p https://example.com/page.html

# --page-requisites: Same as -p
wget --page-requisites https://example.com/page.html

# Download a single page with all assets
wget -p -k https://example.com/page.html
# -k: Convert links for local viewing
Link Conversion
# -k: Convert links for local viewing
wget -k https://example.com/page.html

# -K: Keep original files with a .orig extension
wget -K https://example.com/page.html

# --convert-links: Convert links after download
wget --convert-links https://example.com/page.html

# --adjust-extension: Add appropriate extensions
wget --adjust-extension https://example.com/page.php
8. Timestamping and File Management
Timestamping
# -N: Only download newer files (timestamping)
wget -N https://example.com/file.zip

# --timestamping: Same as -N
wget --timestamping https://example.com/file.zip

# --no-use-server-timestamps: Don't set local timestamps from the server
wget --no-use-server-timestamps https://example.com/file.zip
File Versioning
# --backups: Number of backups to keep
wget --backups=5 -N https://example.com/file.zip

# --backup-converted: Back up converted files
wget -k --backup-converted https://example.com/page.html

# --keep-session-cookies: Keep session cookies
wget --keep-session-cookies --save-cookies cookies.txt \
     https://example.com/login
9. Input from Files
Download from File List
# -i: Read URLs from a file
wget -i urls.txt

# Download from a file with options
wget -i urls.txt -P downloads/

# URLs file format (one per line)
echo "https://example.com/file1.zip" > urls.txt
echo "https://example.com/file2.zip" >> urls.txt
echo "https://example.com/file3.zip" >> urls.txt
Advanced Input Handling
# Download from stdin
cat urls.txt | wget -i -
# Combine with other commands
find . -name "*.url" -exec cat {} \; | wget -i -
# Process URLs from command output
curl -s https://api.example.com/files | jq -r '.[].url' | wget -i -
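Before feeding a hand-maintained list to wget -i, it can help to strip comments, blank lines, and duplicates first. A sketch; the urls.txt contents and URLs are illustrative:

```shell
# Build a sample URL list with comments, a blank line, and a duplicate
cat > urls.txt <<'EOF'
# mirrors
https://example.com/file1.zip
https://example.com/file2.zip
https://example.com/file1.zip

https://example.com/file3.zip
EOF

# Drop comments and blank lines, then de-duplicate while keeping order
grep -v '^[[:space:]]*#' urls.txt \
    | grep -v '^[[:space:]]*$' \
    | awk '!seen[$0]++' > urls.clean.txt
cat urls.clean.txt

# Then: wget -i urls.clean.txt -P downloads/
```

The awk '!seen[$0]++' idiom keeps only the first occurrence of each line, unlike sort -u, which would reorder the list.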
10. Spider and Testing
Web Spider Mode
# --spider: Check URLs without downloading
wget --spider https://example.com

# Check multiple URLs
wget --spider -i urls.txt

# Check for broken links
wget --spider --force-html -r -l1 https://example.com/ 2>&1 | \
    grep -B2 '404'

# Verbose spider output
wget --spider -v https://example.com
Testing and Debugging
# --debug: Debug output
wget --debug https://example.com

# --server-response: Print the server response
wget --server-response https://example.com

# --save-headers: Save headers with the downloaded file
wget --save-headers https://example.com

# -S: Show the server response (short form)
wget -S https://example.com
11. Proxy Configuration
HTTP Proxy
# Use an HTTP proxy (via wgetrc-style settings)
wget -e use_proxy=yes -e http_proxy=proxy.example.com:8080 \
     --proxy-user=user --proxy-password=pass \
     https://example.com

# Environment variables
export http_proxy=http://proxy.example.com:8080
export https_proxy=http://proxy.example.com:8080
wget https://example.com

# --no-proxy: Bypass the proxy
wget --no-proxy https://example.com
FTP Proxy
# FTP proxy
export ftp_proxy=ftp://proxy.example.com:2121
wget ftp://ftp.example.com/file.zip
12. Script Examples
Download Manager Script
#!/bin/bash
# download_manager.sh - Advanced download manager with wget
DOWNLOAD_DIR="$HOME/downloads"
LOG_FILE="$DOWNLOAD_DIR/download.log"
URLS_FILE="$DOWNLOAD_DIR/urls.txt"
MAX_CONCURRENT=3
mkdir -p "$DOWNLOAD_DIR"
download_file() {
local url="$1"
local filename=$(basename "$url")
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Starting download: $filename" >> "$LOG_FILE"
wget -c \
--timeout=30 \
--tries=5 \
--limit-rate=500k \
--user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36" \
-P "$DOWNLOAD_DIR" \
"$url" 2>&1 | while read -r line; do
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $line" >> "$LOG_FILE"
done
# $? would test the while loop, not wget; PIPESTATUS[0] holds wget's status
if [ "${PIPESTATUS[0]}" -eq 0 ]; then
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Completed: $filename" >> "$LOG_FILE"
else
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Failed: $filename" >> "$LOG_FILE"
echo "$url" >> "$DOWNLOAD_DIR/failed.txt"
fi
}
# Download multiple files with concurrency control
download_concurrent() {
local count=0
while read -r url; do
if [[ -n "$url" && ! "$url" =~ ^# ]]; then
download_file "$url" &
((count++))
if ((count >= MAX_CONCURRENT)); then
wait
count=0
fi
fi
done < "$URLS_FILE"
wait
}
# Main execution
case "$1" in
single)
download_file "$2"
;;
batch)
download_concurrent
;;
retry)
if [[ -f "$DOWNLOAD_DIR/failed.txt" ]]; then
mv "$DOWNLOAD_DIR/failed.txt" "$URLS_FILE"
download_concurrent
fi
;;
*)
echo "Usage: $0 {single <url>|batch|retry}"
exit 1
;;
esac
Website Backup Script
#!/bin/bash
# backup_website.sh - Backup entire website with wget
BACKUP_DIR="$HOME/website_backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
SITE_URL="$1"
if [[ -z "$SITE_URL" ]]; then
echo "Usage: $0 <website_url>"
exit 1
fi
DOMAIN=$(echo "$SITE_URL" | awk -F/ '{print $3}')
BACKUP_PATH="$BACKUP_DIR/${DOMAIN}_$TIMESTAMP"
mkdir -p "$BACKUP_PATH"
echo "Starting backup of $SITE_URL to $BACKUP_PATH"
wget \
--mirror \
--convert-links \
--adjust-extension \
--page-requisites \
--no-parent \
--wait=2 \
--limit-rate=500k \
--random-wait \
--user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36" \
--directory-prefix="$BACKUP_PATH" \
--output-file="$BACKUP_PATH/backup.log" \
"$SITE_URL"
if [ $? -eq 0 ]; then
echo "Backup completed successfully"
# Capture a log summary before the working directory is removed
LOG_SUMMARY=$(grep "saved\|removed" "$BACKUP_PATH/backup.log" | tail -20)
# Create archive
tar -czf "$BACKUP_PATH.tar.gz" -C "$BACKUP_DIR" "${DOMAIN}_$TIMESTAMP"
rm -rf "$BACKUP_PATH"
echo "Archive created: $BACKUP_PATH.tar.gz"
# Generate report
cat > "$BACKUP_PATH.report" << EOF
Website Backup Report
=====================
URL: $SITE_URL
Date: $(date)
Backup File: $BACKUP_PATH.tar.gz
Size: $(du -h "$BACKUP_PATH.tar.gz" | cut -f1)

Download Log Summary:
$LOG_SUMMARY
EOF
echo "Report saved: $BACKUP_PATH.report"
else
echo "Backup failed"
exit 1
fi
Batch Download Script
#!/bin/bash
# batch_download.sh - Batch download with pattern matching
PATTERN="$1"
OUTPUT_DIR="${2:-./downloads}"
if [[ -z "$PATTERN" ]]; then
echo "Usage: $0 <url_pattern> [output_dir]"
echo "Example: $0 'https://example.com/images/img{001..100}.jpg'"
exit 1
fi
mkdir -p "$OUTPUT_DIR"
# Expand the pattern using brace expansion
# (eval runs the pattern through the shell, so use trusted input only)
URLS=$(eval echo "$PATTERN")
download_with_progress() {
local total=$(echo "$URLS" | wc -w)
local current=0
for url in $URLS; do
((current++))
filename=$(basename "$url")
echo "[$current/$total] Downloading: $filename"
wget -c \
--quiet \
--show-progress \
--timeout=30 \
--tries=3 \
-P "$OUTPUT_DIR" \
"$url"
if [ $? -eq 0 ]; then
echo " ✓ Completed"
else
echo " ✗ Failed"
echo "$url" >> "$OUTPUT_DIR/failed.txt"
fi
done
}
# Start download
echo "Starting batch download of $(echo "$URLS" | wc -w) files"
download_with_progress
# Summary
echo
echo "Download Summary:"
echo " Location: $OUTPUT_DIR"
ls -lh "$OUTPUT_DIR" | tail -n +2
if [[ -f "$OUTPUT_DIR/failed.txt" ]]; then
echo " Failed downloads: $(wc -l < "$OUTPUT_DIR/failed.txt")"
fi
13. Rate Limiting and Politeness
Rate Control
# --wait: Wait between downloads
wget --wait=5 -r https://example.com/

# --random-wait: Randomize the wait time
wget --random-wait --wait=5 -r https://example.com/

# --limit-rate: Limit bandwidth
wget --limit-rate=200k -r https://example.com/

# --quota: Limit the total download size
wget --quota=100m -r https://example.com/
Robots.txt Handling
# Respect robots.txt (the default; no option needed)
wget -r https://example.com/

# Ignore robots.txt
wget -e robots=off -r https://example.com/

# Identify your crawler with a custom user agent
wget --user-agent="MyBot/1.0" -r https://example.com/
14. SSL/TLS Options
Certificate Handling
# --no-check-certificate: Skip certificate validation
wget --no-check-certificate https://example.com

# --certificate: Client certificate
wget --certificate=client.crt --private-key=client.key https://example.com

# --ca-certificate: CA certificate
wget --ca-certificate=ca.crt https://example.com

# --secure-protocol: Specify the SSL/TLS protocol
wget --secure-protocol=TLSv1_2 https://example.com
15. Output Formatting
Custom Log Format
# --output-file: Log to a file
wget --output-file=download.log https://example.com/file.zip

# --append-output: Append to a log file
wget --append-output=download.log https://example.com/file2.zip

# --output-document: Write the download to a file
wget --output-document=output.html https://example.com

# --progress: Progress indicator format
wget --progress=bar:force https://example.com/largefile.zip
wget --progress=dot:giga https://example.com/largefile.zip
16. Integration with Other Commands
Piping and Redirection
# Pipe output to other commands
wget -qO- https://example.com | grep "title"

# Download and process
wget -qO- https://example.com/data.json | jq '.users[].name'

# Download and extract
wget -qO- https://example.com/archive.tar.gz | tar xz

# Download and checksum
wget -qO- https://example.com/file.zip | sha256sum
With curl Comparison
# wget is better for recursive downloads
wget -r https://example.com/
# curl is better for API interactions
curl -X POST -d '{"key":"value"}' https://api.example.com
# Both can download files
wget https://example.com/file.zip
curl -O https://example.com/file.zip
17. Error Handling
Exit Codes
# wget exit codes:
# 0: Success
# 1: Generic error
# 2: Parse error
# 3: File I/O error
# 4: Network failure
# 5: SSL verification failure
# 6: Authentication failure
# 7: Protocol errors
# 8: Server issued an error response

wget https://example.com/file.zip
status=$?
case $status in
0) echo "Download successful" ;;
4) echo "Network error" ;;
6) echo "Authentication failed" ;;
8) echo "Server error" ;;
*) echo "Other error: $status" ;;
esac
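The exit codes above can be wrapped in a small helper so scripts log failures uniformly. A sketch; the mapping follows the code list above, and the commented-out wget call is illustrative:

```shell
# Map a wget exit code to a human-readable message
wget_status_msg() {
    case "$1" in
        0) echo "success" ;;
        1) echo "generic error" ;;
        2) echo "parse error" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server issued an error response" ;;
        *) echo "unknown exit code: $1" ;;
    esac
}

# Usage (URL illustrative):
#   wget -q https://example.com/file.zip
#   echo "wget: $(wget_status_msg $?)"
echo "wget: $(wget_status_msg 4)"
```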
Error Recovery
# Retry on error
while ! wget -c https://example.com/file.zip; do
echo "Download failed, retrying in 10 seconds..."
sleep 10
done

# Continue from where it left off
wget -c https://example.com/largefile.zip

# Log errors
wget https://example.com/file.zip 2>> error.log
18. Quick Reference Card
Most Common Options
| Option | Description |
|---|---|
| -O file | Save as different filename |
| -P dir | Save to directory |
| -c | Continue partial download |
| -r | Recursive download |
| -l depth | Recursion depth |
| -np | No parent directories |
| -nd | No directory structure |
| -x | Force directory structure |
| -N | Timestamping |
| -m | Mirror website |
| -p | Download page requisites |
| -k | Convert links |
| -E | Adjust extensions |
| -t num | Number of retries |
| --limit-rate | Limit download speed |
| --wait | Wait between downloads |
| --random-wait | Randomize wait time |
| --user | HTTP username |
| --password | HTTP password |
| --header | Custom HTTP header |
| --referer | Set referer |
| --user-agent | Set user agent |
| --no-check-certificate | Skip SSL validation |
| -q | Quiet mode |
| -v | Verbose mode |
| --spider | Check URLs only |
| -i file | Read URLs from file |
Common Combinations
| Command | Purpose |
|---|---|
| wget -c URL | Resume download |
| wget -r -np URL | Download directory recursively |
| wget -m URL | Mirror website |
| wget -p -k URL | Download page with assets |
| wget -i urls.txt | Download multiple URLs |
| wget --limit-rate=200k URL | Limit speed |
| wget -t 5 --wait=10 URL | Retry with wait |
| wget --user=user --ask-password URL | Authenticated download |
| wget -qO- URL \| command | Pipe output |
| wget --spider URL | Check if URL exists |
Conclusion
wget is an indispensable tool for automated downloading and web content retrieval:
Key Points Summary
- Non-interactive: Perfect for scripts and automation
- Resume capability: Continue interrupted downloads with -c
- Recursive downloads: Mirror entire websites with -r and -m
- Bandwidth control: Limit speed with --limit-rate
- Authentication: Support for HTTP/FTP auth
- Proxy support: Works with HTTP proxies
- Robust error handling: Retry mechanisms and timeouts
Best Practices
- Use -c for large downloads - Resume if interrupted
- Implement rate limiting - Be respectful to servers
- Use --wait and --random-wait - Avoid overwhelming servers
- Set appropriate timeouts - Prevent hanging downloads
- Log downloads - Track what was downloaded
- Verify SSL certificates - Ensure secure downloads
- Use -i for batch downloads - Maintain URL lists
- Test with --spider first - Verify URLs before downloading
Quick Reference
| Want to… | Command |
|---|---|
| Download a file | wget URL |
| Resume download | wget -c URL |
| Limit speed | wget --limit-rate=200k URL |
| Download recursively | wget -r URL |
| Mirror website | wget -m URL |
| Download with authentication | wget --user=name --ask-password URL |
| Download multiple | wget -i urls.txt |
| Check URL | wget --spider URL |
| Pipe to command | wget -qO- URL \| command |
| Ignore SSL | wget --no-check-certificate URL |
wget's versatility and robustness make it the go-to tool for automated downloads, website mirroring, and content retrieval in scripts and command-line operations.