How I Used LLMs to Build a Webpage for My Zoo di 105 Archive

Managing a large archive of podcast episodes can be challenging, especially when you want users to easily search, navigate, and access transcriptions in real time. For my Zoo di 105 archive, I built zoo.mdrzn.it, a page that dynamically lists all available .mp3 files with synchronized .vtt transcriptions. In this post, I explain how I used large language models (LLMs) to automate key tasks and streamline development.

Overview of the Page

At zoo.mdrzn.it, users can:

  1. View a list of available episodes: All .mp3 files stored on my server are displayed in a structured list.
  2. Access real-time transcriptions: Each episode has an accompanying .vtt file, created using Whisper, that displays the current transcript line as the track plays.
  3. Search across all transcripts: A search feature allows users to find keywords and phrases across the entire archive of episodes.

Below is a screenshot of the interface:

Interface Screenshot

Step 1: Preparing the Audio Files and Transcriptions

Before building the webpage, I needed high-quality transcriptions for each episode. Here’s how I automated the transcription process:

Using Whisper to Generate .vtt Files

On my desktop PC, I used OpenAI Whisper to transcribe the audio. The process was fairly straightforward:

whisper path/to/audio/*.mp3 --output_dir path/to/vtt_files/ --output_format vtt

Whisper automatically generated .vtt files, which are time-stamped captions suitable for displaying the transcriptions while the audio plays.
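Each cue in a .vtt file carries a start and end timestamp in HH:MM:SS.mmm form. A small helper like the following (an illustrative sketch, not part of Whisper itself) converts such a timestamp into seconds, which is the kind of conversion the front-end needs later to sync the transcript with playback:

```javascript
// Hypothetical helper: convert a WebVTT timestamp ("HH:MM:SS.mmm")
// into a number of seconds. Assumes the hours field is present,
// which is how Whisper writes its cues.
function vttTimeToSeconds(timestamp) {
  const [h, m, s] = timestamp.split(':');
  return Number(h) * 3600 + Number(m) * 60 + parseFloat(s);
}
```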

Step 2: Building the Webpage

The webpage needed to dynamically list all episodes and provide a smooth user experience for playback and real-time transcription. Here’s how I approached it:

Server-Side Setup

Since I host this on a Hetzner server running Debian 12 with Virtualmin, I created a simple Node.js API endpoint with Express that scans the directory containing the audio files and .vtt transcriptions:

const fs = require('fs');
const express = require('express');

const app = express();
const port = 3000;

// List every .mp3 in the audio directory and pair it with its .vtt transcript
app.get('/api/episodes', (req, res) => {
  const files = fs.readdirSync('/path/to/audio');
  const episodes = files
    .filter(file => file.endsWith('.mp3'))
    .map(file => ({
      name: file.replace('.mp3', ''),
      audioPath: `/audio/${file}`,
      vttPath: `/vtt/${file.replace('.mp3', '.vtt')}`
    }));
  res.json(episodes);
});

app.listen(port, () => console.log(`API running on port ${port}`));

This endpoint returns the list of episodes and their associated file paths, which the front-end can consume.
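The filename-to-episode mapping inside that route can also be pulled out into a pure function (a hypothetical refactor, shown here because it makes the logic easy to unit-test without spinning up the server):

```javascript
// Hypothetical refactor: the pure mapping performed by /api/episodes,
// extracted from the route handler so it can be tested in isolation.
function toEpisodes(files) {
  return files
    .filter(file => file.endsWith('.mp3'))
    .map(file => ({
      name: file.replace(/\.mp3$/, ''),
      audioPath: `/audio/${file}`,
      vttPath: `/vtt/${file.replace(/\.mp3$/, '.vtt')}`
    }));
}
```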

Front-End Design

I used React with Tailwind CSS to create a clean interface. The main page fetches the list of episodes and displays them in a grid:

import React, { useState, useEffect } from 'react';
import TranscriptViewer from './TranscriptViewer';

function App() {
  const [episodes, setEpisodes] = useState([]);
  const [currentEpisode, setCurrentEpisode] = useState(null);

  useEffect(() => {
    fetch('/api/episodes')
      .then(response => response.json())
      .then(data => setEpisodes(data));
  }, []);

  return (
    <div className="container mx-auto p-4">
      <h1 className="text-2xl font-bold">Zoo di 105 Archive</h1>
      <ul className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4 mt-4">
        {episodes.map(episode => (
          <li key={episode.name} onClick={() => setCurrentEpisode(episode)}>
            <div className="p-4 border rounded-lg cursor-pointer hover:bg-gray-100">
              {episode.name}
            </div>
          </li>
        ))}
      </ul>
      {currentEpisode && (
        <TranscriptViewer episode={currentEpisode} />
      )}
    </div>
  );
}

export default App;

Transcript Synchronization

For the real-time transcript display, I wrote a simple component that fetches the .vtt file, parses it, and highlights the current line based on the audio’s progress:

import React, { useEffect, useState } from 'react';

function TranscriptViewer({ episode }) {
  const [transcript, setTranscript] = useState([]);
  const [currentLine, setCurrentLine] = useState(null);

  useEffect(() => {
    fetch(episode.vttPath)
      .then(response => response.text())
      .then(vtt => parseVTT(vtt));
  }, [episode]);

  // Convert a WebVTT timestamp ("HH:MM:SS.mmm") to seconds
  const parseTime = (timestamp) => {
    const [h, m, s] = timestamp.split(':');
    return Number(h) * 3600 + Number(m) * 60 + parseFloat(s);
  };

  // Split the file into cue blocks, skipping the "WEBVTT" header,
  // and keep each cue's start time plus its text
  const parseVTT = (vtt) => {
    const cues = vtt
      .split('\n\n')
      .filter(block => block.includes('-->'))
      .map(block => {
        const [time, ...text] = block.split('\n');
        return { start: time.split(' --> ')[0], text: text.join(' ') };
      });
    setTranscript(cues);
  };

  // Highlight the last cue whose start time has already passed
  const handleAudioTimeUpdate = (e) => {
    const currentTime = e.target.currentTime;
    const line = transcript.filter(t => parseTime(t.start) <= currentTime).pop();
    setCurrentLine(line);
  };

  return (
    <div className="mt-6">
      <h2 className="text-xl">Playing: {episode.name}</h2>
      <audio src={episode.audioPath} controls onTimeUpdate={handleAudioTimeUpdate}></audio>
      <div className="mt-4">
        {transcript.map((line, index) => (
          <p key={index} className={line === currentLine ? 'text-blue-500' : ''}>
            {line.text}
          </p>
        ))}
      </div>
    </div>
  );
}

export default TranscriptViewer;

The Search Feature

To enable search across transcripts, the server builds an in-memory index of all transcript segments at startup. When a search is performed, it returns the matching segments along with their time codes, so each result can link straight to the relevant moment in an episode.
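A minimal sketch of that server-side matching, assuming the .vtt files have already been parsed into an array of `{ episode, time, text }` cue objects (a simple linear scan rather than a full-text engine; names here are illustrative):

```javascript
// Hypothetical search over pre-parsed transcript cues.
// `cues` is assumed to be built once at startup from all .vtt files.
function searchTranscripts(cues, query) {
  const needle = query.toLowerCase();
  return cues.filter(cue => cue.text.toLowerCase().includes(needle));
}
```

For a larger archive, the linear scan could be swapped for a proper inverted index or a library such as a full-text search engine, but the returned shape (segment text plus time code) would stay the same.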

Step 3: Deploying the Application

Deployment was simple thanks to my existing setup with Virtualmin on the Hetzner server. Here’s a summary of what I did:

  1. Built the React app:

     npm run build
  2. Copied the build output to the server’s web directory:

     scp -r build/* user@myserver:/var/www/zoo.mdrzn.it
  3. Configured Nginx to serve the static files and proxy API requests to the Node process.
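The Nginx side of step 3 looks roughly like this (a sketch with assumed paths and port, not my exact config):

```nginx
server {
    server_name zoo.mdrzn.it;
    root /var/www/zoo.mdrzn.it;

    # Serve the static React build, falling back to index.html for client routing
    location / {
        try_files $uri /index.html;
    }

    # Proxy API calls to the Node process listening on port 3000
    location /api/ {
        proxy_pass http://127.0.0.1:3000;
    }
}
```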

Conclusion

Using LLMs and AI tools like Whisper saved me countless hours of manual transcription work, and building the page using React provided a smooth, dynamic user experience. If you’re managing a large audio archive, consider automating transcription and search features—you’ll be amazed at how much easier it makes the entire process.

Let me know in the comments if you’d like me to expand on any specific section, such as the Whisper setup, React component design, or search implementation!
