Implementing FIFO (First In First Out) To Maintain Disk Space - Bytesized Hosting Wiki

DISCLAIMER: USE AT YOUR OWN RISK!

[Update: 21.09.16 To preserve media a little longer, I've incorporated a script to remove all Deluge torrents over 31 days before it starts deleting my media files. Scroll to the bottom to see this script]

I have some very media-hungry family members constantly adding stuff to my media folder. Instead of constantly deleting old stuff by hand, I wanted a script that checks whether disk space is running low and, if so, deletes the oldest files added to the folder until usage has dropped back to a specified percentage.

I found a nice script that does just that, and with a few tweaks by yours truly and the infamous Animazing (thanks!) it's working really well!

Instructions: Copy and paste the below code into a new file called something like deleteOldFiles.sh. Then run it with the following syntax:

./deleteOldFiles.sh [PATH] [limit in percentage]

For example if you want it to run on your media folder which is located at ~/media and you want it to delete files until your diskspace is no more than 90% full, run the following:

./deleteOldFiles.sh ~/media 90

The script checks whether your disk usage is above the threshold you specified and, if so, recursively finds the oldest file in the directory and deletes it. It repeats this process until usage is within the threshold.
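As a rough illustration of the threshold check described above, here is a minimal sketch that reads the used percentage from `df`. The function names `usage_percent` and `over_limit` are made up for this example and are not part of the script, which reads the `quota` command instead. On a shared box `df` may well report the whole disk rather than your own allowance, so treat this as a sketch only.

```shell
# Hypothetical helper (not part of the original script): print the used
# percentage of the filesystem holding a path, as an integer.
# Assumes POSIX `df -P`, whose fifth column is the capacity, e.g. "42%".
usage_percent () {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds (exit status 0) when usage is above the given limit.
# Both function names are illustrative assumptions.
over_limit () {
    local path="$1" limit="$2"
    [ "$(usage_percent "$path")" -gt "$limit" ]
}
```

Something like `over_limit ~/media 90 && echo "time to clean up"` would then mirror the check the script runs before each deletion.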

WARNING: When implementing this, I strongly advise setting the 'MAX_CYCLES' variable to a small number. It is a built-in failsafe that stops a runaway script from deleting all your files!

WARNING2: I would advise test running this on a directory that you don't really care about - for example the Sabnzbd incomplete folder (if your queue is empty of course), before you implement it to your media folder.

If all works fine, then you can put this into a CRON job and have it run every day for example.
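For the cron step, a daily entry could look roughly like the line held in the variable below. The script location (`$HOME/scripts/deleteOldFiles.sh`), the 04:00 run time and the log file name are all assumptions for illustration, so adjust them to your own box.

```shell
# Hypothetical crontab entry: run daily at 04:00 and append output to a log.
# The five leading fields are minute, hour, day of month, month, day of week.
# Single quotes keep $HOME literal here; cron's shell expands it at run time.
CRON_LINE='0 4 * * * $HOME/scripts/deleteOldFiles.sh $HOME/media 90 >> $HOME/deleteOldFiles.log 2>&1'

# Print the entry so it can be pasted into the editor opened by `crontab -e`.
printf '%s\n' "$CRON_LINE"
```

Note that the script's is_interactive output and Pushbullet summary will still fire from cron, so the log file is mostly useful for checking what was deleted and when.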

deleteOldFiles.sh:

#!/bin/bash
#
############################################################################### 
# Author            :  Louwrentius (and hacked to death by theCheek!)
# Contact           : [email protected]
# Initial release   : August 2011
# Licence           : Simplified BSD License
############################################################################### 

VERSION=1.01

#
# Mounted volume to be monitored.
#
MOUNT="$1"
#
# Maximum threshold of volume used as an integer that represents a percentage:
# 95 = 95%.
#
MAX_USAGE="$2"
#
# Failsafe mechanism. Delete a maximum of MAX_CYCLES files, raise an error after
# that. Prevents possible runaway script. Disable by choosing a high value.
#
MAX_CYCLES=50
CYCLES=0
deletedCounter=1
oldTorrentsRemoved=0
dt=$(date '+%d/%m/%Y %H:%M:%S');
deletedFiles=""
show_header () {

    echo "$dt"
    echo DELETE OLD FILES $VERSION
    echo

}

show_header

reset () {
    OLDEST_FILE=""
    OLDEST_DATE=0
    ARCH=`uname`
}

reset

if [ -z "$MOUNT" ] || [ ! -e "$MOUNT" ] || [ ! -d "$MOUNT" ] || [ -z "$MAX_USAGE" ]
then
    echo "Usage: $0 <mountpoint> <threshold>"
    echo "Where threshold is a percentage."
    echo
    echo "Example: $0 /storage 90"
    echo "If disk usage of /storage exceeds 90% the oldest"
    echo "file(s) will be deleted until usage is below 90%."
    echo 
    echo "Wrong command line arguments or another error:"
    echo 
    echo "- Directory not provided as argument or"
    echo "- Directory does not exist or"
    echo "- Argument is not a directory or"
    echo "- no/wrong percentage supplied as argument."
    echo
    exit 1
fi

check_capacity () {
    # Usage is read from the account quota, not from $MOUNT itself.
    # The last line of the quota output is expected to start with
    # "<used>G <total>G"; the G suffix is stripped to leave integers.
    TOTAL=`quota | tail -n1 | awk ' { print $2 }' | sed  -e 's/G//'`
    CURRENT=`quota | tail -n1 | awk ' { print $1 }' | sed  -e 's/G//'`

    if [ -z "$TOTAL" ] || [ -z "$CURRENT" ]
    then
        echo "Error: didn't get usage information from the quota output."
        exit 1
    fi

    # Integer percentage of the quota currently in use.
    USAGE=$(( (100 * CURRENT) / TOTAL ))
    echo "Ani says usage is $USAGE"

    if [ "$USAGE" -gt "$MAX_USAGE" ]
    then
        echo "Usage of $USAGE% exceeded limit of $MAX_USAGE percent."
        return 0
    else
        echo "Usage of $USAGE% is within limit of $MAX_USAGE percent."
        if [ "$deletedCounter" -gt 1 ]
        then
            sendSummary
        fi 
        return 1
    fi
}

check_age () {

    FILE="$1"
    if [ "$ARCH" == "Linux" ]
    then
        FILE_DATE=`stat -c %Z "$FILE"`
    elif [ "$ARCH" == "Darwin" ]
    then
        FILE_DATE=`stat -f %Sm -t %s "$FILE"`
    else
        echo "Error: unsupported architecture."
        echo "Send a patch for the correct stat arguments for your architecture."
        # Without a file date the age comparison is meaningless, so stop here.
        exit 1
    fi

    NOW=`date +%s`
    AGE=$((NOW-FILE_DATE))
    if [ "$AGE" -gt "$OLDEST_DATE" ]
    then
        export OLDEST_DATE="$AGE"
        export OLDEST_FILE="$FILE"
    fi
}
sendSummary () {
    # echo -e turns the literal "\n" separators stored in $deletedFiles into line breaks.
    echo -e "Storage is full, so following a First in First Out policy, the following oldest added files have been deleted to make space:\n $deletedFiles ."
    # Replace the bracketed placeholder with your Pushbullet channel, device, or all.
    ~/scripts/pushbullet-bash/pushbullet push [**REPLACEchannelOrDeviceorAll**REPLACE] note "Storage is full, so following a First in First Out policy, the following oldest added files have been deleted to make space:\n $deletedFiles .\n "
}

process_file () {

    FILE="$1"

    #
    # Replace the following commands with whatever you want to do with
    # this file. You can delete files but also move files or do something else.
    #
    if [ "$oldTorrentsRemoved" -eq 1 ] # If Torrents over 31 days have been removed, start deleting files
    then
        echo "Deleting oldest file $FILE"
        # ${FILE:25} drops the first 25 characters of the path from the summary;
        # adjust the offset to the length of your own path prefix.
        deletedFiles+="$deletedCounter) ${FILE:25}\n"

        rm -f "$FILE"
        deletedCounter=$((deletedCounter + 1))
    else
        echo "Removing torrents older than 31 days" 
        python ~/scripts/removeOldTorrents.py # To ensure maximum storage efficiency, this python script will remove all torrents from Deluge if they are over 31 days old
        oldTorrentsRemoved=1
    fi


}

while check_capacity
do
    echo $CYCLES
    if [ "$CYCLES" -gt "$MAX_CYCLES" ]
    then
        echo "Error: after $MAX_CYCLES deleted files still not enough free space."
        sendSummary
        exit 1
    fi

    reset

    FILES=`find "$MOUNT" -type f -not -path "*sync*"` #finds all files as long as they are not part of the BTSync's working folders

    IFS=$'\n'
    for x in $FILES
    do
        check_age "$x"
    done

    if [ -e "$OLDEST_FILE" ]
    then
        #
        # Do something with file.
        #
        if [[ $OLDEST_FILE != *"sync"* ]]
        then
            process_file "$OLDEST_FILE"
        else
            echo "Sync Folder Detected in $OLDEST_FILE , ignoring"
            CYCLES=$((CYCLES - 1))
        fi
    else
        echo "Error: somehow, item $OLDEST_FILE disappeared."
    fi
    echo "Increasing Counter"
    CYCLES=$((CYCLES + 1))
done
echo

echo
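
The loop above re-runs find and calls stat on every file for each deletion, which can get slow on a large media folder. As a possible alternative, here is a minimal sketch that picks the oldest file in a single pass. The function name `oldest_file` is made up for this example, it assumes GNU find (for -printf), and it sorts on modification time (%T@) purely for illustration, whereas the script above uses the change time on Linux (swap in %C@ to match). File names containing newlines are not handled.

```shell
# Hypothetical helper (not part of the original script): print the path of
# the oldest regular file under a directory, skipping BTSync working folders.
# Assumes GNU find; %T@ is the modification time in seconds since the epoch.
oldest_file () {
    local dir="$1"
    find "$dir" -type f -not -path "*sync*" -printf '%T@ %p\n' \
        | sort -n \
        | head -n 1 \
        | cut -d' ' -f2-
}
```

Called as `oldest_file ~/media`, it prints one path that could be handed to process_file in place of the check_age loop.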

And here is the latest update: removeOldTorrents.py, which is called from the above script. Remember to modify the script with your own box's details. You can also adjust how old torrents must be before they are removed, or have it ignore a specific folder that is your seed folder and should never be deleted.


#!/usr/bin/python

from deluge.log import LOG as log
from deluge.ui.client import client
import deluge.component as component
from twisted.internet import reactor, defer
import time

############
# Change the following
cliconnect = client.connect(host='127.0.0.1',port=2779,username="yourdelugeDaemonuser",password="yourdelugeDaemonPassword")
seeddir = "xxxx" # Directory to ignore for torrents to remain seeding
timedifference = 31 # Remove torrents older than this time (in days)
is_interactive = True # Set this to True to allow direct output or set to False for cron
do_remove_data = True # Set to True to delete torrent data as well, false to leave it

###############
# Do not edit below this line!

oldcount = 0
skipcount = 0
seedcount = 0
errorcount = 0
torrent_ids = []

def printSuccess(dresult, is_success, smsg):
    # dresult is the Deferred result when used as a callback; it is not printed.
    global is_interactive
    if is_interactive:
        if is_success:
            print("[+] %s" % smsg)
        else:
            print("[i] %s" % smsg)

def printError(emsg):
    global is_interactive
    if is_interactive:
        print("[e] %s" % emsg)

def endSession(esresult):
    # A non-empty esresult is treated as an error message to print before stopping.
    if esresult:
        print(esresult)
        reactor.stop()
    else:
        client.disconnect()
        printSuccess(None, False, "Client disconnected.")
        reactor.stop()

def printReport(rresult):
    if errorcount > 0:
        printError("Failed! Number of errors: %i" % (errorcount))
    else:
        if oldcount > 0:
            printSuccess(None, True, "Removed %i torrents -- Skipped %i torrents -- Seeding %i torrents" % (oldcount, skipcount, seedcount))
        else:
            printSuccess(None, True, "No old torrents! -- Skipped %i torrents -- Seeding %i torrents" % (skipcount, seedcount))
    endSession(None)

def on_torrents_status(torrents):
    global filtertime
    tlist=[]
    for torrent_id, status in torrents.items():
        if status["save_path"] == seeddir:
            global seedcount
            seedcount += 1
        else:
            # time_added is a unix timestamp; truncate it to whole seconds.
            numunixtime = int(status["time_added"])
            humantime = time.ctime(numunixtime)
            if numunixtime < filtertime:
                global do_remove_data
                global oldcount
                oldcount += 1
                successmsg = " Removed %s:  %s from %s" % (humantime, status["name"], status["save_path"])
                errormsg = "Error removing %s" % (status["name"])
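                # The errback receives a Failure as its first argument; bind the
                # message as a default so each torrent reports its own name.
                removal = client.core.remove_torrent(torrent_id, do_remove_data)
                removal.addCallbacks(printSuccess, lambda failure, msg=errormsg: printError(msg), callbackArgs=(True, successmsg))
                tlist.append(removal)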
            else:
                global skipcount
                skipcount += 1
                printSuccess(None, False, " Skipping %s: %s from %s" % (humantime, status["name"], status["save_path"]))
    defer.DeferredList(tlist).addCallback(printReport)

def on_session_state(result):
    client.core.get_torrents_status({"id": result}, ["name","time_added","save_path",]).addCallback(on_torrents_status)

def on_connect_success(result):
    printSuccess(None, True, "Connection was successful!")
    global timedifference
    global filtertime
    curtime = time.time()
    filtertime = curtime - (timedifference * 24 * 60 * 60)
    printSuccess(None, False, "Current unix time is %i" % (curtime))
    printSuccess(None, False, "Filtering torrents older than %s" % (time.ctime(int(filtertime))))
    client.core.get_session_state().addCallback(on_session_state)

# The errback receives a Failure; pass endSession only the message to print.
cliconnect.addCallbacks(on_connect_success, lambda failure: endSession("Connection failed: check settings and try again."))

reactor.run()

Enjoy :)


Last author: theCheek. Contributors: none. Versions: 5. Last update: Tue, 09 Mar 2021 22:58:11 +0100