Panopto Caption

tl;dr How to use:

Install Tampermonkey (or another userscript manager) on your browser:
Install the userscript:
1. Click here.
2. You should be prompted with a Tampermonkey screen asking to install Panopto Caption.
3. Click Install.
Download subtitles!
1. Navigate to whatever Panopto video you want to download captions from.
2. Within moments, the Panopto Caption userscript will automatically download a subtitle file for that Panopto video.

Motive

Panopto is a video hosting platform made for businesses and universities.

Because of the nature of the videos hosted on Panopto (e.g. university lectures), being able to download them can be convenient. There are already methods (such as HLS downloader browser extensions) for downloading videos when the viewer interface offers no download option. However, these usually download only the video and not the captions that accompany it in Panopto.

For this reason, I created Panopto Caption, a userscript (a custom file that can execute JavaScript on the client side of a website) that automatically generates a subtitle file from a Panopto video. Here is how Panopto Caption works:

How it works

Subtitle files (ending with ‘.srt’) are formatted as follows:

1
00:00:00,000 --> 00:00:03,111
First caption

2
00:00:03,111 --> 00:00:05,123
Second caption


3
00:00:05,123 --> 00:00:10,321
Third caption

Each caption in the subtitle file is accompanied by two things:

Sequence number
Time span

The sequence number indicates the caption's position in the sequence of captions (e.g. the caption with sequence number 3 is the third caption). The time span gives the caption's start and end times (e.g. the second caption above appears 3.111 seconds into the video and disappears at 5.123 seconds).

So for the first 3.111 seconds, the subtitle is “First caption.” From 3.111 to 5.123 seconds, the subtitle is “Second caption.” From 5.123 to 10.321 seconds, the subtitle is “Third caption.”
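To make the format concrete, here is a minimal sketch that renders a single cue from a sequence number, a start and end time in seconds, and the caption text. The names formatTimestamp and formatCue are hypothetical helpers for illustration only; they are not part of Panopto Caption.

```javascript
// Hypothetical helpers for illustration; not part of the userscript itself.

// Convert a time in seconds (may be fractional) to the SRT "HH:MM:SS,mmm" form.
function formatTimestamp(totalSeconds) {
    const totalMs = Math.round(totalSeconds * 1000);
    const ms = totalMs % 1000;
    const s = Math.floor(totalMs / 1000) % 60;
    const m = Math.floor(totalMs / 60000) % 60;
    const h = Math.floor(totalMs / 3600000);
    const pad = (n, width) => String(n).padStart(width, '0');
    return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Render one cue: sequence number, time span, caption text, then a blank line.
function formatCue(sequenceNumber, startSeconds, endSeconds, text) {
    return `${sequenceNumber}\n` +
        `${formatTimestamp(startSeconds)} --> ${formatTimestamp(endSeconds)}\n` +
        `${text}\n\n`;
}

// Prints the second cue from the example above.
console.log(formatCue(2, 3.111, 5.123, 'Second caption'));
```

Concatenating the output of formatCue for every caption, in order, yields the contents of a complete subtitle file.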

Panopto Caption generates and downloads the subtitle file for the Panopto video that you want to watch offline. This is done in four steps.

1. Waiting for captions to be accessible

function waitForCaptions(querySelector, callback) {
    const observer = new MutationObserver((mutationList, obs) => {
        if (document.querySelector(querySelector)) {
            obs.disconnect();
            setTimeout(callback, 3000);
            return;
        } else {
            console.log('PCE: captions not found yet...');
        }
    });
    observer.observe(document, {
        attributes: true,
        childList: true,
        subtree: true
    });
}

As with most web scraping projects, waiting until elements are present and interactable is crucial. For this I use MutationObserver: whenever the DOM tree changes, it checks whether the caption element (the one matching querySelector) is available yet. Once it is, the observer disconnects and the callback runs after a three-second delay.

The main part of Panopto Caption is passed in as the callback function to waitForCaptions so that everything (subtitle extraction, generation, and download) is executed once the caption element is available in the DOM tree.
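One edge case is worth keeping in mind: a MutationObserver callback only runs when the DOM changes, so if the element is already present when the observer is attached and nothing changes afterwards, the callback would never fire. The sketch below is a hypothetical variant that checks once up front before observing. The name waitForElement and its exact behavior are assumptions for illustration, not the script's actual code.

```javascript
// Hypothetical variant of waitForCaptions, for illustration only.
// Intended for a browser page, where `document` and `MutationObserver` exist.
function waitForElement(selector, callback) {
    // If the element is already in the DOM, call back right away.
    const existing = document.querySelector(selector);
    if (existing) {
        callback(existing);
        return null;
    }

    // Otherwise re-check on every DOM change until the element appears.
    const observer = new MutationObserver((mutationList, obs) => {
        const found = document.querySelector(selector);
        if (found) {
            obs.disconnect(); // stop observing once the element exists
            callback(found);
        }
    });
    observer.observe(document, { childList: true, subtree: true });
    return observer;
}
```

In practice a video page tends to mutate frequently while loading, so the original approach may rarely hit this case; the up-front check simply removes the dependency on a later mutation.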

2. Extract and generate subtitles text

let captionString = '';
let captionHour = 0;
let captionMinute = 0;
let captionSeconds = 0;
const captionHTMLElements = document.querySelector("#transcriptTabPane > div.event-tab-scroll-pane > ul").children;


// Generate full caption string
for (let i = 0; i < captionHTMLElements.length; i++) {

    // Add caption index
    captionString += i + 1 + '\n';

    // Extract caption start time
    const captionTime = captionHTMLElements[i].children[1].children[2].innerText;
    [captionHour, captionMinute, captionSeconds] = extractTime(captionTime, captionHour, captionMinute, captionSeconds);

    // Format + add start time
    let formattedTime = formatTime(captionHour, captionMinute, captionSeconds);
    captionString += formattedTime + ',000 --> ';

    if (i === captionHTMLElements.length - 1) {
        // Get end of video time for last caption
        let videoEndHour = 0;
        let videoEndMinute = 0;
        let videoEndSeconds = 0;

        const timeElapsed = document.getElementById('timeElapsed').innerText;
        [videoEndHour, videoEndMinute, videoEndSeconds] = extractTime(timeElapsed, videoEndHour, videoEndMinute, videoEndSeconds);

        const timeRemaining = document.getElementById('timeRemaining').innerText.slice(1);
        const timeRemainingComponents = timeRemaining.split(':');
        if (timeRemainingComponents.length === 2) {
            videoEndMinute += parseInt(timeRemainingComponents[0]);
            videoEndSeconds += parseInt(timeRemainingComponents[1]);
        } else if (timeRemainingComponents.length === 3) {
            videoEndHour += parseInt(timeRemainingComponents[0]);
            videoEndMinute += parseInt(timeRemainingComponents[1]);
            videoEndSeconds += parseInt(timeRemainingComponents[2]);
        }

        // Correct format to standard time units
        const adjustedSeconds = videoEndSeconds % 60;
        const adjustedMinute = (videoEndMinute + Math.floor(videoEndSeconds / 60)) % 60;
        const adjustedHour = videoEndHour + Math.floor((videoEndMinute + Math.floor(videoEndSeconds / 60)) / 60);

        // Format + add video end time
        formattedTime = formatTime(adjustedHour, adjustedMinute, adjustedSeconds);
        captionString += formattedTime + ',000\n';

    } else {
        // Extract caption end time for non-last captions
        const captionTime = captionHTMLElements[i+1].children[1].children[2].innerText;
        [captionHour, captionMinute, captionSeconds] = extractTime(captionTime, captionHour, captionMinute, captionSeconds);

        // Format + add end time
        formattedTime = formatTime(captionHour, captionMinute, captionSeconds);
        captionString += formattedTime + ',000\n';
    }

    // Extract + add caption text
    const captionText = captionHTMLElements[i].children[1].children[1].innerText;
    captionString += captionText.trim() + '\n\n';
}

Throughout, I use two helper functions: extractTime and formatTime.

function extractTime(timeString, hour, minute, seconds) {
    const timeComponents = timeString.split(':');
    if (timeComponents.length === 2) {
        minute = parseInt(timeComponents[0]);
        seconds = parseInt(timeComponents[1]);
    } else if (timeComponents.length === 3) {
        hour = parseInt(timeComponents[0]);
        minute = parseInt(timeComponents[1]);
        seconds = parseInt(timeComponents[2]);
    }
    return [hour, minute, seconds];
}

extractTime takes in timeString and extracts and returns the hour, minute, and seconds. If timeComponents.length === 2, then the given timeString only has numbers for minute and seconds. If timeComponents.length === 3, then the given timeString has numbers for hour, minute, and seconds.
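As a point of comparison, the same parsing can be sketched by reducing the timestamp to a single number of seconds, which avoids passing hour, minute, and seconds around as separate values. The function toSeconds below is a hypothetical alternative for illustration, not something the userscript defines.

```javascript
// Hypothetical alternative to extractTime, shown for illustration only.
// Parses "SS", "MM:SS", or "HH:MM:SS" into a total number of seconds.
function toSeconds(timeString) {
    return timeString
        .trim()
        .split(':')
        // Each component to the left is worth 60 times the one to its right.
        .reduce((total, part) => total * 60 + parseInt(part, 10), 0);
}

console.log(toSeconds('3:07'));    // 187
console.log(toSeconds('1:02:03')); // 3723
```

Working in total seconds also makes arithmetic such as adding the elapsed and remaining times a plain addition, with the conversion back to hours, minutes, and seconds done once at formatting time.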

function formatTime(hour, minute, seconds) {
    return [
        hour.toString().padStart(2, '0'),
        minute.toString().padStart(2, '0'),
        seconds.toString().padStart(2, '0')
    ].join(':');
}

formatTime takes in hour, minute, and seconds and returns a zero-padded string in the format “HH:MM:SS” (for example, “00:03:07”).

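All of this builds captionString, which contains the fully formatted contents of the subtitle file. Because the transcript provides only a start time for each caption, each caption's end time is taken from the start time of the next caption, and the last caption ends at the end of the video (time elapsed plus time remaining). The timestamps have one-second granularity, so the milliseconds part is always written as “,000”.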
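The way the loop derives end times can be sketched in isolation: each caption ends where the next one starts, and the last caption ends with the video. The function toTimeSpans and its parameters are hypothetical names for illustration, and times are plain seconds here rather than the hour/minute/seconds values used in the script.

```javascript
// Hypothetical sketch of the pairing logic; names are illustrative only.
// startTimes: caption start times in seconds, in display order.
// videoEnd: total length of the video in seconds.
function toTimeSpans(startTimes, videoEnd) {
    return startTimes.map((start, i) => {
        // A caption ends where the next one starts; the last ends with the video.
        const end = i + 1 < startTimes.length ? startTimes[i + 1] : videoEnd;
        return [start, end];
    });
}

console.log(toTimeSpans([0, 3, 5], 10)); // [ [ 0, 3 ], [ 3, 5 ], [ 5, 10 ] ]
```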

3. Create subtitles file

let textFile = null;
const data = new Blob([captionString], {type: 'text/plain'});
textFile = URL.createObjectURL(data);

To create the actual subtitle file, I build a Blob object from the accumulated captionString. Then I use URL.createObjectURL() to generate a URL that refers to the Blob. This URL serves as the download link for the subtitle file.
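As a small, self-contained illustration of this step, the sketch below wraps the Blob creation in a hypothetical helper, makeSubtitleBlob, and labels the content as UTF-8. It assumes an environment that provides the Blob and URL globals (modern browsers and recent Node.js versions); the helper name and the charset label are my additions, not part of the original script.

```javascript
// Hypothetical helper for illustration; assumes the Blob and URL globals exist.
function makeSubtitleBlob(captionString) {
    // Strings placed in a Blob are encoded as UTF-8; the type only labels that.
    return new Blob([captionString], { type: 'text/plain;charset=utf-8' });
}

const blob = makeSubtitleBlob('1\n00:00:00,000 --> 00:00:03,000\nFirst caption\n\n');
const objectUrl = URL.createObjectURL(blob);
console.log(objectUrl.startsWith('blob:')); // true
URL.revokeObjectURL(objectUrl); // release the URL once it is no longer needed
```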

4. Download subtitle file

const downloadCaption = document.createElement('a');
downloadCaption.href = textFile;
downloadCaption.download = document.getElementsByTagName('title')[0].innerText + '.srt';
downloadCaption.click();
URL.revokeObjectURL(textFile);

Here we create an anchor element to hold the object URL as the download link. Subtitle files have the ‘.srt’ extension, so we append it to the video's title, which is read from the page's title element. We call click() on the anchor element to simulate a mouse click on it. Finally, we revoke the object URL to prevent memory leaks.
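Page titles can contain characters that are awkward or disallowed in file names, such as ':' or '/'. The sketch below shows one possible way to clean the title before using it as the download name. The helper toSrtFilename and the fallback name 'captions' are assumptions for illustration; the original script uses the title as-is.

```javascript
// Hypothetical helper for illustration; not part of the userscript.
// Replaces characters that are not allowed in file names on common platforms.
function toSrtFilename(title) {
    const cleaned = title
        .replace(/[\\\/:*?"<>|]/g, '_') // characters reserved in Windows file names
        .replace(/\s+/g, ' ')           // collapse runs of whitespace
        .trim();
    // Fall back to a generic name if nothing usable is left.
    return (cleaned || 'captions') + '.srt';
}

console.log(toSrtFilename('CS 101: Lecture 3/4')); // "CS 101_ Lecture 3_4.srt"
```

With such a helper, the download name would be set with something like downloadCaption.download = toSrtFilename(document.title).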

Repo

The code for Panopto Caption is hosted on GitHub.

Issues

Please report any issues to the repository’s issue section.