YouTube Transcript Search

Full-stack application with browser extensions for capturing and searching YouTube video transcripts across channels.

2025
Ongoing
Production, Open Source
YouTube Transcript Search interface

Main application interface showing channel management and search capabilities

Frontend landing page showing channel search

Channel search and management interface with subscription status and video counts

Extension popup showing search results

Extension popup demonstrating Mode 1 search with API-powered results and click-to-seek navigation

In-page search UI injected into YouTube

Mode 2 in-page search UI seamlessly integrated into YouTube's transcript panel, matching native theme

Search results showing matches across videos

Cross-channel transcript search results with timestamp navigation and context preview

System architecture diagram

Multi-component architecture showing extension, API, database, and WebSub integration

A full-stack system that makes YouTube content searchable by transcript. The project consists of three integrated components: a FastAPI backend with PostgreSQL full-text search, a SvelteKit web interface for channel management and cross-video search, and browser extensions for Chrome and Firefox that automatically capture transcripts.

The extension build uses a modular manifest patching system that produces separate development and production builds for each browser. It handles the main Manifest V3 difference between the two: Chrome runs background code as a service worker (background.service_worker), while Firefox requires background scripts (background.scripts). Users can search transcripts in two ways: through an in-page UI integrated into YouTube's existing transcript panel, which matches the native theme in both light and dark modes, or through the extension popup, which provides API-powered search with click-to-seek navigation. The service worker caches transcripts with FIFO eviction (50-video limit) to reduce repeat API calls while keeping cached data reasonably fresh.

WebSub (PubSubHubbub) integration delivers real-time notifications when subscribed channels publish new videos, which triggers automatic transcript capture. The backend uses PostgreSQL's tsvector type for fast full-text search across potentially thousands of video transcripts, while the frontend provides an interface for managing channel subscriptions and exploring search results across entire channel catalogs.
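The FIFO eviction policy described above can be sketched as follows. This is a minimal illustration written in Python for brevity; the extension's service worker itself runs in the browser, and the class name, method names, and the choice to store transcripts as plain strings are all assumptions rather than the project's actual code. Only the 50-video limit comes from the description.

```python
from collections import OrderedDict
from typing import Optional


class FifoTranscriptCache:
    """Hypothetical FIFO cache: when full, evict the entry inserted first."""

    def __init__(self, max_videos: int = 50) -> None:
        if max_videos < 1:
            raise ValueError("max_videos must be at least 1")
        self._max_videos = max_videos
        # OrderedDict remembers insertion order, which is the eviction order.
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, video_id: str) -> Optional[str]:
        # Reads never reorder entries; that is what separates FIFO from LRU.
        return self._entries.get(video_id)

    def put(self, video_id: str, transcript: str) -> None:
        if video_id in self._entries:
            # Refresh the stored data but keep the original queue position.
            self._entries[video_id] = transcript
            return
        if len(self._entries) >= self._max_videos:
            # Drop the oldest insertion to make room.
            self._entries.popitem(last=False)
        self._entries[video_id] = transcript

    def __contains__(self, video_id: object) -> bool:
        return video_id in self._entries

    def __len__(self) -> int:
        return len(self._entries)
```

FIFO is simpler than LRU because lookups do not need to update any bookkeeping, at the cost of sometimes evicting a transcript that is still being read frequently.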

Technologies Used

framework

FastAPI, SvelteKit, SQLAlchemy

language

TypeScript, Python

database

PostgreSQL

platform

Browser Extension, Chrome, Firefox

tool

Webpack, Alembic

protocol

WebSub

Challenges

Modular manifest patching for multi-browser, multi-environment builds. Chrome's background.service_worker vs Firefox's background.scripts in Manifest V3. Theme-aware UI integration matching YouTube's native interface. Service worker caching with FIFO eviction under memory constraints. Detecting YouTube SPA navigation without leaking memory. PostgreSQL full-text search optimization across large datasets. WebSub protocol implementation for real-time notifications. CORS configuration for chrome-extension:// and moz-extension:// origins.
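The manifest patching idea from the first two challenges can be illustrated with a small sketch: one shared base manifest, plus a per-browser patch and a per-environment patch merged on top. The project does this inside a Webpack transform; the Python below only mirrors the merge logic, and every file name, URL, permission, and function name in it is a hypothetical placeholder.

```python
import copy
from typing import Any, Dict

Manifest = Dict[str, Any]

# Shared fields (illustrative values, not the project's real manifest).
BASE_MANIFEST: Manifest = {
    "manifest_version": 3,
    "name": "YouTube Transcript Search",
    "version": "0.0.0",
    "permissions": ["storage"],
}

# Chrome declares a service worker; Firefox declares background scripts.
BROWSER_PATCHES: Dict[str, Manifest] = {
    "chrome": {"background": {"service_worker": "background.js"}},
    "firefox": {"background": {"scripts": ["background.js"]}},
}

# Assumed per-environment API hosts.
ENV_PATCHES: Dict[str, Manifest] = {
    "development": {"host_permissions": ["http://localhost:8000/*"]},
    "production": {"host_permissions": ["https://api.example.com/*"]},
}


def deep_merge(base: Manifest, patch: Manifest) -> Manifest:
    """Return a new manifest with patch applied; nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def build_manifest(browser: str, env: str) -> Manifest:
    """Compose base + browser patch + environment patch, in that order."""
    manifest = deep_merge(BASE_MANIFEST, BROWSER_PATCHES[browser])
    return deep_merge(manifest, ENV_PATCHES[env])
```

Calling build_manifest("firefox", "development") yields a manifest whose background key holds a scripts list, while build_manifest("chrome", "production") yields one with a service_worker entry; the base manifest is never mutated.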

Key Learnings

Extension Manifest V3 architecture across browsers. Advanced Webpack multi-target builds with transform-based patching. Service worker lifecycle and memory optimization. PostgreSQL tsvector and GIN indexes for full-text search. FastAPI dependency injection and SQLAlchemy relationships. Alembic migrations across environments. WebSub/PubSubHubbub protocol and signature verification. Chrome vs Firefox extension API differences. CSS custom properties for theme-aware integration. Separation of concerns in multi-component architecture.
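As an illustration of the WebSub signature verification mentioned above: when a subscriber registers a secret, the hub signs each notification body with HMAC and sends the result in an X-Hub-Signature header of the form method=hexdigest. The sketch below shows one way to check that header using only the Python standard library. The function name and the way the secret and body reach it are assumptions, not the project's actual handler.

```python
import hmac
from typing import Optional

# Hash methods the WebSub specification allows in X-Hub-Signature.
SUPPORTED_METHODS = {"sha1", "sha256", "sha384", "sha512"}


def verify_websub_signature(
    secret: bytes, body: bytes, header_value: Optional[str]
) -> bool:
    """Return True only if header_value is a valid HMAC of body under secret."""
    if not header_value or "=" not in header_value:
        return False
    method, _, received_hex = header_value.partition("=")
    method = method.strip().lower()
    if method not in SUPPORTED_METHODS:
        return False
    # Sign the raw request bytes, not a re-serialized copy of the payload.
    expected_hex = hmac.new(secret, body, digestmod=method).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(
        expected_hex.encode("ascii"),
        received_hex.strip().lower().encode("utf-8"),
    )
```

In a web handler, the body argument would be the unparsed request body and header_value the raw X-Hub-Signature header; notifications that fail the check should be ignored rather than processed.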

Project Details

Difficulty
advanced
Duration
Ongoing
Role
Full-Stack Developer & System Architect
