This repository hosts a set of scripts that automate downloading a YouTube video, extracting its audio, transcribing the audio, and summarizing the transcribed text. It is useful for quickly generating summaries of long video content. This is basically a fork of http://github.com/actuallyrizzn/process. I'm simplifying the original process repo so that it can be more widely used, and keeping this one as the video- and PDF-specific variant.
Repo URL: http://github.com/actuallyrizzn/process
The main workflow is controlled by a bash script, process.sh, which in turn calls several Python scripts and a bash script to complete specific tasks:
1. process: Usage: ./process
2. ytget.py: This Python script downloads a YouTube video. It uses the pytube library to do this.
3. v2a.sh: This bash script uses ffmpeg to convert the downloaded video file to an audio file. If the audio file is larger than 25MB, it is split into smaller chunks using ffmpeg.
4. transcribe.py: This Python script transcribes the audio using openai.api.
5. chnk.py: This Python script splits the transcribed text into chunks.
6. summarize.py: This Python script summarizes the transcribed text using openai.api.
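The 25MB splitting step described for v2a.sh can be sketched in Python as follows. This is a minimal sketch, not the repo's actual script: the function names, the output file pattern, and the choice of ffmpeg's segment muxer with equal-length segments are all assumptions made for illustration. It picks a segment length from the file size and duration (assuming a roughly constant bitrate) and builds the ffmpeg command without running it.

```python
import math

# 25MB limit mentioned above (assumed here to mean 25 * 1024 * 1024 bytes)
MAX_BYTES = 25 * 1024 * 1024


def segment_seconds(size_bytes, duration_s, max_bytes=MAX_BYTES):
    """Return a segment length in seconds so each chunk stays under max_bytes.

    Assumes a roughly constant bitrate, so file size scales with duration.
    """
    if size_bytes <= max_bytes:
        # No split needed: a single segment covers the whole file.
        return math.ceil(duration_s)
    chunks = math.ceil(size_bytes / max_bytes)
    # Round down so each segment is no longer than duration / chunks.
    return max(1, math.floor(duration_s / chunks))


def build_split_command(src, seconds, pattern="chunk_%03d.mp3"):
    """Build (but do not run) an ffmpeg command that uses the segment muxer.

    The output pattern is a hypothetical name chosen for this sketch.
    """
    return [
        "ffmpeg", "-i", src,
        "-f", "segment",
        "-segment_time", str(seconds),
        "-c", "copy",  # copy the audio stream instead of re-encoding
        pattern,
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) once ffmpeg is installed. Building the command separately from running it makes the size logic easy to check on its own.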
1. Clone the repository:
git clone http://github.com/actuallyrizzn/process.git
2. Navigate to the cloned directory:
cd process
3. Make sure you have the necessary dependencies installed. You can install the Python dependencies using pip:
pip install pytube openai
4. Install ffmpeg on your system. The installation process varies depending on your operating system.

Run the process.sh script with a YouTube video URL:
./process.sh -u YOUTUBE_URL
You can also specify a destination directory for the downloaded video:
./process.sh -u YOUTUBE_URL -d DIRECTORY
---
Please replace YOUTUBE_URL and DIRECTORY with the actual URL and directory when you use the script.
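For illustration, the -u and -d options could be parsed in Python as shown below. process.sh itself is a bash script, so this is only a sketch of the same interface: the long option names, the default destination, and the function name are assumptions, not something taken from the repo.

```python
import argparse


def parse_args(argv=None):
    """Parse the same -u / -d options that process.sh accepts."""
    parser = argparse.ArgumentParser(
        description="Download, transcribe, and summarize a YouTube video."
    )
    # -u is treated as mandatory here, since the workflow starts from a URL.
    parser.add_argument("-u", "--url", required=True, help="YouTube video URL")
    # Assumed default: the current directory when -d is not given.
    parser.add_argument("-d", "--dest", default=".", help="destination directory")
    return parser.parse_args(argv)
```

Calling parse_args(["-u", "https://example.com/watch?v=abc", "-d", "out"]) returns a namespace with url and dest attributes that the rest of a pipeline could consume.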