Setting Up and Maintaining zoo.ovh: Automated Archival of Lo Zoo di 105
Since 2015, I have been running and maintaining zoo.ovh, an automated archive of all episodes of Lo Zoo di 105, a popular Italian radio show. The system downloads new episodes daily, converts them to a lower bitrate to save storage space, and uploads them to rotating MEGA accounts. Each episode is automatically published on the zoo.ovh website.
This post covers the setup and scripts that have been working for the past 8 years with minimal maintenance.
Architecture Overview
The system is designed to:
- Download the daily episode from the radio’s website at 19:00 every day.
- Convert the audio using lame to reduce the bitrate to 128kbps.
- Upload the episode to a rotating list of MEGA accounts to ensure sufficient storage.
- Generate and publish a blog post on zoo.ovh with the download link.
Daily Download Script
This script runs via cron every day at 19:00.
Cron setup:
```
0 19 * * * /path/to/download_new_episode.sh
```
Script: download_new_episode.sh
```bash
#!/bin/bash
PUNTATA=`date +"%d%m%Y"`
DATA=`date +"%a"`

echo "Starting download..."
wget -O /root/Zoo/${DATA}_${PUNTATA}.mp3 http://www.105.net/upload/uploadedContent/repliche/zoo/${DATA}_${PUNTATA}_zoo.mp3
wgetreturn=$?

if [[ $wgetreturn -ne 0 ]]; then
    TODAY=$(date +'%Y/%-m/%-d')
    wget -O /root/Zoo/${DATA}_${PUNTATA}.mp3 http://podcast.mediaset.net/repliche/${TODAY}/${DATA}_${PUNTATA}_zoo.mp3
fi

wgetreturn=$?
if [[ $wgetreturn -ne 0 ]]; then
    cd /var/www/
    ./avvisamdrzn.sh
    echo "no" > statodownload
else
    echo "Download completed!"
    echo "ok" > statodownload

    echo "Converting to 128kbps"
    cd /root/Zoo
    lame -b 128 *

    echo "Lame executed. Renaming mp3"
    rename -f 's/.mp3//' *.mp3.mp3

    echo "Login..."
    mega-login $MEGAACCOUNT

    echo "Starting upload.."
    mega-cd Zoo
    mega-put . /root/Zoo/*

    echo "Upload completed.."
    rm -rf /root/Zoo/*
    echo "Files deleted.."

    cd /var/www/
    ./lista.sh ${DATA}_${PUNTATA}
fi
```
Explanation:
- wget retries: The script first tries to download the episode from the primary URL. If that fails, it attempts an alternative URL. If both fail, it sends a notification on Telegram.
- Bitrate conversion: Converts the downloaded episode to 128kbps using lame.
- Uploading to MEGA: The script logs into a specific MEGA account and uploads the converted file.
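To make the URL scheme concrete, here is how the two date components combine into the local filename. The date is pinned and the locale forced so the result is reproducible (the real script uses the current day; `date -d` is GNU date):

```shell
# Reproduces the filename construction from download_new_episode.sh
# for a fixed example date instead of "today".
export LC_ALL=C                             # force English day abbreviations
PUNTATA=$(date -d "2015-06-01" +"%d%m%Y")   # day-month-year, no separators
DATA=$(date -d "2015-06-01" +"%a")          # three-letter day of week
echo "${DATA}_${PUNTATA}.mp3"               # -> Mon_01062015.mp3
```

The same `${DATA}_${PUNTATA}` stem is what later gets passed to lista.sh.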
MEGA’s free accounts have limited storage capacity (15-20GB each), which necessitates using multiple accounts and rotating between them. While I could use a single paid account with 1TB of storage, this would create a single point of failure - one DMCA takedown request could disable the entire archive. Although I have informal permission from the radio show to maintain this archive, spreading the content across multiple accounts provides better resilience against potential issues.
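The rotation itself is simple in spirit: use the current account until it approaches its quota, then move to the next one. A hypothetical helper sketching that idea is below; in my actual setup the switch is manual, and all account names and usage figures here are made up (real numbers would come from `mega-df` after logging in):

```shell
# Hypothetical account-selection sketch; values are illustrative only.
declare -A USED_GB=( [Zoo18]=20 [Zoo19]=15 )
LIMIT_GB=20

pick_account() {
    local acc
    for acc in Zoo18 Zoo19; do
        # pick the first account still below its storage limit
        if [ "${USED_GB[$acc]}" -lt "$LIMIT_GB" ]; then
            echo "$acc"
            return 0
        fi
    done
    return 1   # every account is full: time to create a new one
}

MEGAACCOUNT=$(pick_account)
echo "$MEGAACCOUNT"   # -> Zoo19 (Zoo18 has hit its 20GB limit)
```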
Blog Post Creation Script
The following script generates and publishes a post on zoo.ovh with the download link.
Script: lista.sh $episode_name
```bash
#!/bin/bash
PUNTATA=${1}

echo "Exporting..."
mega-export -a ${PUNTATA}.mp3
mega-export -f ${PUNTATA}.mp3 > megalista.txt
awk '{print $10}' megalista.txt > linkmega.txt
awk '{print $1}' megalista.txt > nomepuntata.txt
peso=`mega-ls -lh | grep ${PUNTATA} | awk '{print $3}'`
dim=`mega-ls -lh | grep ${PUNTATA} | awk '{print $4}'`
value=`cat linkmega.txt | head -n1`
nomepuntata=`cat nomepuntata.txt | head -n1`
nomeok=${nomepuntata::-4}

# Extract date components
data=${nomeok:4}
giorno="${data:0:2}"
mese="${data:2:2}"
anno="${data:4:4}"
```
This script extracts metadata (date, title, day of week) from the episode filename using basic string manipulation in bash. While not the most elegant solution, it’s effective for files following the naming pattern “ddd_ddmmyyyy.mp3” (e.g. “Mon_01062015.mp3”).
I wrote this back in 2015-2016 when I was learning bash scripting. Even though tools like Python with regex or modern LLMs would handle this more cleanly, I still appreciate bash for quick text processing tasks like this. The script has proven reliable over the years despite (or perhaps because of) its simplicity.
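As a concrete check, here is the same substring slicing applied to a sample filename (the episode name is illustrative):

```shell
# Worked example of the bash substring extraction used in lista.sh.
nomepuntata="Mon_01062015.mp3"
nomeok=${nomepuntata::-4}   # drop trailing ".mp3"  -> Mon_01062015
data=${nomeok:4}            # skip "Mon_" prefix    -> 01062015
giorno="${data:0:2}"        # day                   -> 01
mese="${data:2:2}"          # month                 -> 06
anno="${data:4:4}"          # year                  -> 2015
echo "${anno}-${mese}-${giorno}"   # -> 2015-06-01
```

That `yyyy-mm-dd` string is exactly what `wp post create --post_date=` expects later in the script.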
```bash
# Create the blog post
cat << EOF > post.txt
Link al download:<br><a href=${value}><img src='http://zoo.ovh/wp-content/uploads/2019/01/download.png' alt='Clicca per scaricare la puntata' title='Replica di ${data}'></img></a>
EOF

/usr/local/bin/wp post create ./post.txt \
    --post_title="Puntata ${nomeok}" \
    --post_date=${anno}-${mese}-${giorno} \
    --post_status=publish --allow-root

# Send notification
LINK=$(/usr/local/bin/wp post list --allow-root --field=url | grep ${PUNTATA})
./inviachannel.sh "Ecco la replica della puntata: ${LINK}, peso puntata: ${peso} ${dim}"
```
Explanation:
- MEGA export: Retrieves a public download link for the episode.
- Post content generation: Uses the MEGA link to create a WordPress post.
- Notification: Sends a message to a channel with the episode details.
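For reference, rendering the heredoc with placeholder values shows the HTML that ends up as the post body (the link and date below are fake; the real `${value}` is a mega.nz export link):

```shell
# Renders the post body template with placeholder values.
value="https://mega.nz/file/EXAMPLE"
data="01062015"
out=$(cat << EOF
Link al download:<br><a href=${value}><img src='http://zoo.ovh/wp-content/uploads/2019/01/download.png' alt='Clicca per scaricare la puntata' title='Replica di ${data}'></img></a>
EOF
)
echo "$out"
```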
Storage Management
The storage infrastructure evolved organically over time to accommodate the growing archive.
Initially, I created a single MEGA account which provided 50GB of free storage space during MEGA’s early days. Once that account reached its capacity at around 45GB, I created a second account to continue storing new episodes.
This pattern of creating new accounts as needed continued, leading to the current setup of 19 different MEGA accounts. Here’s a snapshot of storage usage across some of the accounts:
- Zoo01 Used: 45.12 GB (back when Mega.co.nz gave 50GB accounts for free)
- Zoo02 Used: 14.59 GB (About 6 years ago Mega.co.nz decreased the free storage to 15GB)
- Zoo16 Used: 19.91 GB (About 3 years ago Mega.nz increased the free storage to 20GB)
- Zoo19 Used: 15.50 GB (The one currently used)
Maintenance
Over the years, maintenance has been minimal, typically involving:
- Updating MEGA-CLI when necessary.
- Managing WordPress updates.
- Occasional adjustments to the download URLs if the radio station changes its hosting setup.
Despite these minor tasks, the system has been incredibly reliable.
Final Thoughts
The automated archival system for Lo Zoo di 105 demonstrates the power of combining simple shell scripts, reliable tools like wget and lame, and automation via cron jobs. With minimal intervention, this setup has preserved years of radio content efficiently.
If you’re interested in implementing a similar system or have questions about the setup, feel free to reach out!