Setting Up and Maintaining zoo.ovh: Automated Archival of Lo Zoo di 105
Since 2015, I have been running and maintaining zoo.ovh, an automated archive of all episodes of Lo Zoo di 105, a popular Italian radio show. The system downloads new episodes daily, converts them to a lower bitrate to save storage space, and uploads them to rotating MEGA accounts. Each episode is automatically published on the zoo.ovh website.
This post covers the setup and scripts that have been working for the past 8 years with minimal maintenance.
Architecture Overview
The system is designed to:
- Download the daily episode from the radio’s website at 19:00 every day.
- Convert the audio using lame to reduce the bitrate to 128kbps.
- Upload the episode to a rotating list of MEGA accounts to ensure sufficient storage.
- Generate and publish a blog post on zoo.ovh with the download link.
Daily Download Script
This script runs via cron every day at 19:00.
Cron setup:
```
0 19 * * * /path/to/download_new_episode.sh
```
Script: download_new_episode.sh
```bash
#!/bin/bash
PUNTATA=`date +"%d%m%Y"`
DATA=`date +"%a"`

echo "Starting download..."
wget -O /root/Zoo/${DATA}_${PUNTATA}.mp3 http://www.105.net/upload/uploadedContent/repliche/zoo/${DATA}_${PUNTATA}_zoo.mp3
wgetreturn=$?

if [[ $wgetreturn -ne 0 ]]; then
    TODAY=$(date +'%Y/%-m/%-d')
    wget -O /root/Zoo/${DATA}_${PUNTATA}.mp3 http://podcast.mediaset.net/repliche/${TODAY}/${DATA}_${PUNTATA}_zoo.mp3
fi

wgetreturn=$?
if [[ $wgetreturn -ne 0 ]]; then
    cd /var/www/
    ./avvisamdrzn.sh
    echo "no" > statodownload
else
    echo "Download completed!"
    echo "ok" > statodownload

    echo "Converting to 128kbps"
    cd /root/Zoo
    lame -b 128 *

    echo "Lame executed. Renaming mp3"
    rename -f 's/.mp3//' *.mp3.mp3

    echo "Login..."
    mega-login $MEGAACCOUNT

    echo "Starting upload.."
    mega-cd Zoo
    mega-put . /root/Zoo/*

    echo "Upload completed.."
    rm -rf /root/Zoo/*
    echo "Files deleted.."

    cd /var/www/
    ./lista.sh ${DATA}_${PUNTATA}
fi
```
Explanation:
- wget retries: The script first tries to download the episode from the primary URL. If that fails, it attempts an alternative URL. If both fail, it sends a notification on Telegram.
- Bitrate conversion: Converts the downloaded episode to 128kbps using lame.
- Uploading to MEGA: The script logs into a specific MEGA account and uploads the converted file.
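To make the URL scheme concrete, here is how the two date components combine into the local filename. The date is pinned and the locale forced so the result is reproducible (the real script uses the current day; `date -d` is GNU date):

```shell
# Reproduces the filename construction from download_new_episode.sh
# for a fixed example date instead of "today".
export LC_ALL=C                             # force English day abbreviations
PUNTATA=$(date -d "2015-06-01" +"%d%m%Y")   # day-month-year, no separators
DATA=$(date -d "2015-06-01" +"%a")          # three-letter day of week
echo "${DATA}_${PUNTATA}.mp3"               # -> Mon_01062015.mp3
```

The same `${DATA}_${PUNTATA}` stem is what later gets passed to lista.sh.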
MEGA’s free accounts have limited storage capacity (15-20GB each), which necessitates using multiple accounts and rotating between them. While I could use a single paid account with 1TB of storage, this would create a single point of failure - one DMCA takedown request could disable the entire archive. Although I have informal permission from the radio show to maintain this archive, spreading the content across multiple accounts provides better resilience against potential issues.
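The rotation itself is simple in spirit: use the current account until it approaches its quota, then move to the next one. A hypothetical helper sketching that idea is below; in my actual setup the switch is manual, and all account names and usage figures here are made up (real numbers would come from `mega-df` after logging in):

```shell
# Hypothetical account-selection sketch; values are illustrative only.
declare -A USED_GB=( [Zoo18]=20 [Zoo19]=15 )
LIMIT_GB=20

pick_account() {
    local acc
    for acc in Zoo18 Zoo19; do
        # pick the first account still below its storage limit
        if [ "${USED_GB[$acc]}" -lt "$LIMIT_GB" ]; then
            echo "$acc"
            return 0
        fi
    done
    return 1   # every account is full: time to create a new one
}

MEGAACCOUNT=$(pick_account)
echo "$MEGAACCOUNT"   # -> Zoo19 (Zoo18 has hit its 20GB limit)
```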
Blog Post Creation Script
The following script generates and publishes a post on zoo.ovh with the download link.
Script: lista.sh $episode_name
```bash
#!/bin/bash
PUNTATA=${1}

echo "Exporting..."
mega-export -a ${PUNTATA}.mp3
mega-export -f ${PUNTATA}.mp3 > megalista.txt
awk '{print $10}' megalista.txt > linkmega.txt
awk '{print $1}' megalista.txt > nomepuntata.txt
peso=`mega-ls -lh | grep ${PUNTATA} | awk '{print $3}'`
dim=`mega-ls -lh | grep ${PUNTATA} | awk '{print $4}'`
value=`cat linkmega.txt | head -n1`
nomepuntata=`cat nomepuntata.txt | head -n1`
nomeok=${nomepuntata::-4}

# Extract date components
data=${nomeok:4}
giorno="${data:0:2}"
mese="${data:2:2}"
anno="${data:4:4}"
```
This script extracts metadata (date, title, day of week) from the episode filename using basic string manipulation in bash. While not the most elegant solution, it’s effective for files following the naming pattern “ddd_ddmmyyyy.mp3” (e.g. “Mon_01062015.mp3”).
I wrote this back in 2015-2016 when I was learning bash scripting. Even though tools like Python with regex or modern LLMs would handle this more cleanly, I still appreciate bash for quick text processing tasks like this. The script has proven reliable over the years despite (or perhaps because of) its simplicity.
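As a concrete check, here is the same substring slicing applied to a sample filename (the episode name is illustrative):

```shell
# Worked example of the bash substring extraction used in lista.sh.
nomepuntata="Mon_01062015.mp3"
nomeok=${nomepuntata::-4}   # drop trailing ".mp3"  -> Mon_01062015
data=${nomeok:4}            # skip "Mon_" prefix    -> 01062015
giorno="${data:0:2}"        # day                   -> 01
mese="${data:2:2}"          # month                 -> 06
anno="${data:4:4}"          # year                  -> 2015
echo "${anno}-${mese}-${giorno}"   # -> 2015-06-01
```

That `yyyy-mm-dd` string is exactly what `wp post create --post_date=` expects later in the script.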
```bash
# Create the blog post
cat << EOF > post.txt
Link al download:<br><a href=${value}><img src='http://zoo.ovh/wp-content/uploads/2019/01/download.png' alt='Clicca per scaricare la puntata' title='Replica di ${data}'></img></a>
EOF

/usr/local/bin/wp post create ./post.txt \
    --post_title="Puntata ${nomeok}" \
    --post_date=${anno}-${mese}-${giorno} \
    --post_status=publish --allow-root

# Send notification
LINK=$(/usr/local/bin/wp post list --allow-root --field=url | grep ${PUNTATA})
./inviachannel.sh "Ecco la replica della puntata: ${LINK}, peso puntata: ${peso} ${dim}"
```
Explanation:
- MEGA export: Retrieves a public download link for the episode.
- Post content generation: Uses the MEGA link to create a WordPress post.
- Notification: Sends a message to a channel with the episode details.
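For reference, rendering the heredoc with placeholder values shows the HTML that ends up as the post body (the link and date below are fake; the real `${value}` is a mega.nz export link):

```shell
# Renders the post body template with placeholder values.
value="https://mega.nz/file/EXAMPLE"
data="01062015"
out=$(cat << EOF
Link al download:<br><a href=${value}><img src='http://zoo.ovh/wp-content/uploads/2019/01/download.png' alt='Clicca per scaricare la puntata' title='Replica di ${data}'></img></a>
EOF
)
echo "$out"
```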
Storage Management
The storage infrastructure evolved organically over time to accommodate the growing archive.
Initially, I created a single MEGA account which provided 50GB of free storage space during MEGA’s early days. Once that account reached its capacity at around 45GB, I created a second account to continue storing new episodes.
This pattern of creating new accounts as needed continued, leading to the current setup of 19 different MEGA accounts. Here’s a snapshot of storage usage across some of the accounts:
- Zoo01 Used: 45.12 GB (back when Mega.co.nz gave 50GB accounts for free)
- Zoo02 Used: 14.59 GB (About 6 years ago Mega.co.nz decreased the free storage to 15GB)
- Zoo16 Used: 19.91 GB (About 3 years ago Mega.nz increased the free storage to 20GB)
- Zoo19 Used: 15.50 GB (The one currently used)
Maintenance
Over the years, maintenance has been minimal, typically involving:
- Updating MEGA-CLI when necessary.
- Managing WordPress updates.
- Occasional adjustments to the download URLs if the radio station changes its hosting setup.
Despite these minor tasks, the system has been incredibly reliable.
Final Thoughts
The automated archival system for Lo Zoo di 105 demonstrates the power of combining simple shell scripts, reliable tools like wget and lame, and automation via cron jobs. With minimal intervention, this setup has preserved years of radio content efficiently.
If you’re interested in implementing a similar system or have questions about the setup, feel free to reach out!