Docker Prune: Automating Cleanup Across Multiple Container Hosts
If you run many Docker hosts, or several hosts in a Swarm cluster, and you are regularly updating and respinning your containers (and you should be), unused data will accumulate on those hosts. Old container images, container overlay storage, and similar leftovers can add up to a significant amount of storage. There is a helpful built-in command called docker prune that helps automate the process of cleaning up your Docker host environments. Better still, we can put this command in a script or pipeline and schedule it to run on all the hosts in the environment.
Docker prune brief overview
I have covered this in a previous blog post on how to get rid of Docker overlay2 volumes that are not being used and taking up disk space. You can read that post here: Docker Overlay2 Cleanup: 5 Ways to Reclaim Disk Space.
However, the docker prune command is a built-in and, importantly, safe way to delete unneeded storage in your Docker environment for housekeeping purposes. Although I am introducing the command as docker prune, if you execute that command by itself, Docker won’t recognize it, because prune is not a top-level Docker command.
Instead, prune is available as a subcommand of other Docker components like network, image, or system, for example docker image prune.
You can be specific about what you want to prune, whether that is only network components, only images, and so on. The docker system prune command, however, is more global. As its confirmation prompt states, when you enter the command, it will remove the following:
- all stopped containers
- all networks not used by at least one container
- all dangling images
- unused build cache
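Each item in the list above also has its own targeted prune command, which is handy when you only want to clean up one type of resource. Here is a minimal sketch that runs them one at a time, roughly mirroring what docker system prune does by default. The DRY_RUN variable and the run and prune_targeted helpers are names I made up for this example, and the script defaults to only printing the commands so you can see what it would do first.

```shell
#!/bin/bash
# Sketch: one targeted prune per resource type, roughly mirroring what
# "docker system prune" removes by default.
# DRY_RUN, run, and prune_targeted are hypothetical names used only here.
DRY_RUN="${DRY_RUN:-1}"   # default to printing commands instead of running them

run() {
  if [[ "$DRY_RUN" == "1" ]]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

prune_targeted() {
  run docker container prune -f   # all stopped containers
  run docker network prune -f     # networks not used by at least one container
  run docker image prune -f       # dangling images only
  run docker builder prune -f     # unused build cache
}

prune_targeted
```

Setting DRY_RUN=0 in the environment makes the script actually execute the commands instead of printing them.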
Running the docker system prune command on a Docker host that had many dangling images and overlay storage shows how significant the reclaimed space can be.
Extra parameters you can use with docker prune
There are a couple of extra parameters you can use with the docker prune command. Note the following:
-a (All):
- This flag tells Docker to remove all unused images, not just dangling ones
- Without -a, docker system prune removes only dangling images (untagged images that no tagged image references). With -a, it also removes any image not used by at least one container, even if it is still tagged.
-f (Force):
- This flag makes the prune command run without prompting you to confirm
- By default, Docker prompts you with "Are you sure you want to continue? [y/N]" before proceeding. Adding -f skips this confirmation.
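To make the effect of each flag concrete, here is a small sketch that assembles the command from simple on/off switches. The build_prune_cmd function and its arguments are made up for illustration. The --filter "until=..." option is not covered in this post, but it is a documented option of docker system prune that limits pruning to items created before the given age, so treat that part as an optional extra.

```shell
#!/bin/bash
# Sketch: assemble a "docker system prune" command line from simple switches.
# build_prune_cmd and its three arguments are hypothetical, for illustration only.
#   $1 = "yes" to add -a (remove all unused images, not just dangling ones)
#   $2 = "yes" to add -f (skip the confirmation prompt)
#   $3 = optional maximum age such as 24h (adds --filter until=<age>)
build_prune_cmd() {
  local all="$1" force="$2" max_age="$3"
  local cmd=(docker system prune)
  [[ "$all" == "yes" ]] && cmd+=(-a)
  [[ "$force" == "yes" ]] && cmd+=(-f)
  [[ -n "$max_age" ]] && cmd+=(--filter "until=$max_age")
  echo "${cmd[*]}"
}

# Print the command rather than running it, so it can be reviewed first
build_prune_cmd yes yes 24h
```

Leaving the third argument empty omits the filter, and passing "no" for the first two gives you the plain interactive command.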
Bash script to loop through a list of VMs and run docker prune
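You can create a bash script that loops through a list of VMs, connects to each one over SSH, and runs the docker system prune command. The script below reads the host names from vm_names.txt, one per line, and writes the space reclaimed on each host to a report file.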
#!/bin/bash
# Path to the file containing VM names (one per line)
VM_LIST="vm_names.txt"
# Initialize a report file
REPORT_FILE="reclaimed_space_report.txt"
: > "$REPORT_FILE" # Clear any previous content
# Loop through each VM and perform docker prune
while IFS= read -r vm_name || [[ -n $vm_name ]]; do
  [[ -z $vm_name ]] && continue # Skip blank lines
  echo "Processing $vm_name..."
  # ssh -n stops ssh from reading stdin, which would otherwise swallow the rest of the VM list
  # Note: --volumes also deletes unused volumes; drop it if they may hold data you still need
  reclaimed=$(ssh -n -o StrictHostKeyChecking=no root@"$vm_name" 'docker system prune -a -f --volumes' | grep 'Total reclaimed space' || echo "Failed to connect to $vm_name")
  # Check if space was reclaimed and append result to the report
  if [[ $reclaimed == *"Total reclaimed space"* ]]; then
    echo "$vm_name - $reclaimed" >> "$REPORT_FILE"
  else
    echo "$vm_name - Connection failed or no space reclaimed" >> "$REPORT_FILE"
  fi
done < "$VM_LIST"
# Output the report to the console
echo "Reclaimed Space Report:"
cat "$REPORT_FILE"
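Since the goal is to schedule this cleanup, here is a minimal sketch of a cron entry for the script above. The script path, log path, and weekly schedule are all assumptions for illustration, so adjust them to wherever you saved the script and how often you want it to run.

```shell
#!/bin/bash
# Sketch: print a crontab entry that runs the prune script once a week.
# SCRIPT_PATH, LOG_PATH, and the schedule are assumptions for illustration.
SCRIPT_PATH="/opt/scripts/prune_docker_hosts.sh"
LOG_PATH="/var/log/docker_prune.log"

cron_entry() {
  # Every Sunday at 03:00; append stdout and stderr to the log file
  echo "0 3 * * 0 $SCRIPT_PATH >> $LOG_PATH 2>&1"
}

cron_entry

# To install the entry for the current user while keeping existing entries:
#   ( crontab -l 2>/dev/null; cron_entry ) | crontab -
```

Keep in mind that cron starts jobs from the user's home directory, so the relative vm_names.txt path in the script would need to become an absolute path (or the script would need to cd to the right folder first).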
Use a pipeline to run and loop through your VMs and run docker prune
In the home lab, one of the cool projects I have spun up recently uses a pipeline in GitLab to loop through my Docker hosts and run the docker system prune -a -f command. It is a two-stage pipeline. The first stage dynamically pulls a list of virtual machines from my VMware vSphere environment that have “docker” or “swarm” in the name. Once the list is created, the second stage loops through it, running the prune command on each host.
First, let me show you how I get the names of the VMs. I do this with the following PowerCLI script. The variables are pipeline variables for the project stored in GitLab. The file is named fetch_vm_names.ps1:
# Ignore invalid certificates and opt out of CEIP
Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -ParticipateInCeip $false -Confirm:$false
# Logging into vCenter
Connect-VIServer -Server $env:VSPHERE_SERVER -User $env:VSPHERE_USER -Password $env:VSPHERE_PASSWORD
# Retrieve powered-on VMs with "docker" or "swarm" in the name
$targetVMs = Get-VM | Where-Object { $_.PowerState -eq "PoweredOn" -and $_.Name -match "docker|swarm" } | Sort-Object -Property Name
# Preparing the server name list for output
$vmNames = @()
foreach ($vm in $targetVMs) {
    $vmNames += $vm.Name
}
# Outputting the server names to a text file for pipeline artifact
$vmNames | Out-File -FilePath vm_names.txt -Encoding UTF8
The vm_names.txt file is stored as an artifact when the pipeline runs.
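Depending on the PowerShell version and platform, Out-File -Encoding UTF8 can write a byte-order mark and Windows line endings, and both will break host names when a shell loop reads the file. As a defensive sketch, the helper below normalizes the list before it is used. The clean_vm_list name is hypothetical and not part of my pipeline.

```shell
#!/bin/bash
# Sketch: normalize a host list before looping over it.
# clean_vm_list is a hypothetical helper name. It strips a leading UTF-8
# byte-order mark, trailing carriage returns, and blank lines.
clean_vm_list() {
  local line bom=$'\357\273\277'
  while IFS= read -r line || [[ -n $line ]]; do
    line=${line#"$bom"}     # drop a byte-order mark if present
    line=${line%$'\r'}      # drop a trailing carriage return
    if [[ -n $line ]]; then
      printf '%s\n' "$line"
    fi
  done < "$1"
}

# Example: print the cleaned list if the artifact exists in this directory
if [[ -f vm_names.txt ]]; then
  clean_vm_list vm_names.txt
fi
```

The cleaned output can be fed straight into a loop, for example with done < <(clean_vm_list vm_names.txt) in place of reading the file directly.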
I created a super simple shell script that contains the following in a file called prune_docker.sh:
#!/bin/bash
# Run Docker system prune to clean up unused images, containers, and networks
docker system prune -a -f
Then, the GitLab pipeline file contains the following. Here is an overview of what it does:
- It has 2 stages
- The first stage calls the fetch_vm_names.ps1 script
- The resulting vm_names.txt file is saved as an artifact
- In the run_script stage, it loops through the Docker hosts listed in vm_names.txt and connects to each one using an SSH private key stored as a GitLab pipeline variable called SSH_PRIVATE_KEY
- It greps for the line “Total reclaimed space” in each host’s output and stores the result in a file called reclaimed_space.txt
- Finally, it emails this report to me when the pipeline runs, so the body of the email shows how much space was reclaimed on each Docker host
# GitLab CI/CD pipeline configuration
stages:
  - fetch_vm_names
  - run_script

fetch_vm_names:
  stage: fetch_vm_names
  image: vmware/powerclicore
  script:
    - pwsh -File fetch_vm_names.ps1
  artifacts:
    paths:
      - vm_names.txt

run_script:
  stage: run_script
  image: ubuntu
  dependencies:
    - fetch_vm_names
  before_script:
    # The stock ubuntu image ships without an SSH client, so install one if needed
    - 'command -v ssh-agent >/dev/null || ( apt-get update -y && apt-get install -y openssh-client )'
    - eval "$(ssh-agent -s)"
    - echo "${SSH_PRIVATE_KEY}" | tr -d '\r' > /tmp/ssh_key
    - chmod 600 /tmp/ssh_key
    - ssh-add /tmp/ssh_key
    - rm /tmp/ssh_key
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - echo "StrictHostKeyChecking no" > ~/.ssh/config
    - chmod 644 ~/.ssh/config
  script:
    - echo "" > reclaimed_space.txt # Initialize the report file
    - |
      # Loop through each server and collect reclaimed space information
      for vm_name in $(cat vm_names.txt); do
        reclaimed=$(ssh root@"$vm_name" 'docker system prune -a -f --volumes' | grep 'Total reclaimed space' || echo "Failed to connect to $vm_name")
        # Append results to the report file
        if [[ $reclaimed == *"Total reclaimed space"* ]]; then
          echo "$vm_name - $reclaimed" >> reclaimed_space.txt
        else
          echo "$vm_name - Connection failed or no space reclaimed" >> reclaimed_space.txt
        fi
      done
    # Send the consolidated report via email after processing all servers
    # (assumes pwsh is installed in the job image; the stock ubuntu image does not include it)
    - pwsh -Command "Send-MailMessage -From '[email protected]' -To 'Pushover <[email protected]>' -Subject 'Space Reclaimed' -Body (Get-Content -Raw -Path reclaimed_space.txt) -SmtpServer '10.1.149.19' -Port 8025"
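One thing to note about the pipeline above is that the job still succeeds when a host cannot be reached, since the failure only shows up as a line in the report. If you would rather have the job go red in that case, a check like the sketch below could run as the last script step. The count_failed_hosts and check_report helpers are hypothetical names, and the sketch assumes the report format written by the loop above.

```shell
#!/bin/bash
# Sketch: flag hosts whose report line has no reclaimed-space total.
# count_failed_hosts and check_report are hypothetical helper names; the
# report format is the "<host> - ..." lines written by the pipeline loop.
count_failed_hosts() {
  # Count non-blank lines that do not contain "Total reclaimed space"
  awk 'NF && !/Total reclaimed space/ { n++ } END { print n + 0 }' "$1"
}

check_report() {
  local failed
  failed="$(count_failed_hosts "$1")"
  echo "Hosts without a reclaimed-space total: $failed"
  # Return non-zero when at least one host failed, which fails a CI job
  [[ "$failed" -eq 0 ]]
}

# Example: check the report if it exists in the current directory
if [[ -f reclaimed_space.txt ]]; then
  check_report reclaimed_space.txt
fi
```

Running the check after the email step means you still get the report in your inbox even when the job is marked as failed.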
Wrapping up
Hopefully this post will help anyone looking to free up space on Docker container hosts that run multiple containers and have lingering images or stale containers using up storage. The docker system prune command is a powerful built-in command you need to know. It also works well with automation, letting you clean up your hosts using either a script or a pipeline.