I built a thing for my family this Christmas and I wanted to post about it briefly.
If you're one of the few people dedicated enough to follow this blog, you'll know that my grandfather died last year, and that he was sort of the family videographer. What you likely don't know, however, is that on my trip home this year I acquired his entire collection of DVDs, which he'd been accumulating over the years.
This is some really old stuff:
- Around the Christmas tree when I was 3 or 4 years old
- My dog learning tricks for the first time
- My parents' wedding
- My graduation
- My mother as a child in Romania
- My grandparents, so much younger, with friends in Romania
- My niece, Violet
It was an amazing collection spanning 4 generations over 39 DVDs, and I spent a few days on that trip home ripping every last one of the disks onto a portable hard drive so I could take the raw data home for a special project.
Well that project is now finished, so for those of you who don't care about the technical aspects, here's the link. I shared the URL with my family by email on Christmas day since I was on the other side of the world for the holiday festivities this year, but all in all, it seems to have gone over well.
My father has suggested that I expand on the collection with my own videos in the future -- I may just do that, though I'm more of a still photos guy. We'll see.
This whole thing was a HUGE pain in the ass, so I want to document the process, if only for future web surfers looking to do something similar.
The videos were in DVD format. Thankfully, it was digital, but it's certainly not web-friendly. The video data needed to be ripped from the disks and compressed into a web-friendly format that was high-quality enough to preserve the video, but in a file small enough to stream to Canada-quality internet connections.
Also, the DVDs were terribly organised and not indexed in any way. The disks often had multiple title tracks, sometimes duplicate tracks, and there were tracks that just contained garbage data.
Oh, and there was a time constraint. I only had the disks for a few days when I was in Canada. I wasn't going to take them back to the UK with me.
It was basically done in three stages:
Raw DVD > .iso file > .webm file
The .iso file step was just a clean & easy way to back up all of the DVDs without having to worry about accidentally missing something while I was hurriedly trying to get through them all in Canada. By turning 39 DVDs into 39 files on a USB drive, I could be sure that I wouldn't accidentally lose data during the ripping process.
As it turns out, this was a good plan, since it took a few weeks of tinkering with this project before I realised that some disks had multiple titles on them.
The creation of the .iso files was easy. I just put the disk in the USB DVD drive I brought with me and typed this:
$ dd if=/dev/dvd of=/path/to/usb/hard-drive/disk-00.iso
Waited about 20min, then took the disk out, and repeated this... 39 times.
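If you'd rather not retype that 39 times, the numbering can be scripted. This is only a rough Python sketch: the function names are made up for illustration, and the `disk-NN.iso` naming simply follows the command above. It builds the `dd` command for a given disc and works out which number comes next, so restarting after a break never overwrites an earlier rip.

```python
from pathlib import Path


def dd_command(disc_number: int, dest_dir: str, device: str = "/dev/dvd") -> list[str]:
    """Build the dd invocation for one disc: disk-00.iso, disk-01.iso, ..."""
    image = Path(dest_dir) / f"disk-{disc_number:02d}.iso"
    return ["dd", f"if={device}", f"of={image}"]


def next_disc_number(dest_dir: str) -> int:
    """Return the first number that has no image yet in dest_dir."""
    n = 0
    while (Path(dest_dir) / f"disk-{n:02d}.iso").exists():
        n += 1
    return n
```

Passing the resulting list to `subprocess.run(..., check=True)` after each disc swap would do the actual copy.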
The creation of the actual video file on the other hand was the big problem. There are lots of sites out there that claim to tell you how to do this, and very few of them have anything helpful. I think that this is because the end goal is rarely understood up front. Sometimes people are trying to encode DVDs into a high quality file for local playback, and the settings for that are rather different from what someone would want to do to encode for a web-friendly format.
There's also a wide variety of tools out there, most of which are buggy, unsupported, don't have a port for Gentoo, or just plain suck. The most common recommendation I found was for Handbrake, which is an impressive GUI for ripping videos but for me:
- It didn't encode files that were high enough quality given the file size
- It didn't make web-friendly formats. Even when you tick the box to make it web-friendly, the output file doesn't stream in Firefox. I didn't test other browsers.
- It was terribly slow to find all the tracks, apply the settings I wanted and then wait to see if things panned out. If there was a command-line interface to make things easier, I didn't find it.
All of this led to a lot of frustration and weeks of tinkering, finally leading me to a site that gave me the magic ffmpeg incantation to generate a .webm file:
$ ffmpeg \
    -i /path/to/input.mp4 \
    -vpre libvpx-720p \
    -pass 1 -passlogfile ffmpeg-18 -an -f webm \
    -y /path/to/output.webm && \
  ffmpeg \
    -i /path/to/input.mp4 \
    -vpre libvpx-720p \
    -pass 2 -passlogfile ffmpeg-18 -acodec libvorbis -ab 100k -f webm \
    -y /path/to/output.webm
Of course this assumed a .mp4 input file, and I wanted to rip straight from the .iso, so after much digging, I discovered that ffmpeg has a means of concatenating (chaining) video inputs and that it can read straight from a .VOB file. With this nugget of knowledge, all I had to do was mount the .iso locally and compile a list of files conforming to the DVD naming convention (VTS_01_1.VOB, VTS_01_2.VOB, and so on, inside VIDEO_TS).
With that information, I wrote a quick shell script that ended up generating a great big queue file of commands that look a lot like this:
ffmpeg -i \
    'concat:/mnt/grandpa/18/VIDEO_TS/VTS_01_1.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_2.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_3.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_4.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_5.VOB' \
    -vpre libvpx-720p -pass 1 -passlogfile ffmpeg-18 -an -f webm \
    -y /home/daniel/Projects/Grandpa/htdocs/vid/18.webm && \
ffmpeg -i \
    'concat:/mnt/grandpa/18/VIDEO_TS/VTS_01_1.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_2.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_3.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_4.VOB|/mnt/grandpa/18/VIDEO_TS/VTS_01_5.VOB' \
    -vpre libvpx-720p \
    -pass 2 -passlogfile ffmpeg-18 -acodec libvorbis -ab 100k -f webm \
    -y /home/daniel/Projects/Grandpa/htdocs/vid/18.webm
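The generator itself was a shell script, but the idea fits in a few lines of Python too. Treat the following as a sketch of that idea rather than the script I used: the function names are invented for illustration, and the ffmpeg flags are copied verbatim from the command above. It groups the .VOB files by title number (handy for disks with more than one title), skips part 0 of each title set since that holds the menus rather than the video, and builds the two-pass command line.

```python
import re
import shlex
from collections import defaultdict
from pathlib import Path

# Title sets are split into VTS_<title>_<part>.VOB; parts 1-9 carry the video.
VOB_NAME = re.compile(r"^VTS_(\d\d)_([1-9])\.VOB$")


def vobs_by_title(video_ts: Path) -> dict[str, list[Path]]:
    """Map each title number ("01", "02", ...) to its .VOB parts, in order."""
    titles = defaultdict(list)
    for path in sorted(video_ts.iterdir()):
        match = VOB_NAME.match(path.name)
        if match:
            titles[match.group(1)].append(path)
    return dict(titles)


def concat_input(vobs: list[Path]) -> str:
    """Join the parts into ffmpeg's concat: input syntax."""
    return "concat:" + "|".join(str(v) for v in vobs)


def two_pass_command(vobs: list[Path], output: str, log_prefix: str) -> str:
    """Build the 'pass 1 && pass 2' shell line for one title."""
    source = shlex.quote(concat_input(vobs))
    log = shlex.quote(log_prefix)
    out = shlex.quote(output)
    common = f"ffmpeg -i {source} -vpre libvpx-720p"
    pass1 = f"{common} -pass 1 -passlogfile {log} -an -f webm -y {out}"
    pass2 = (
        f"{common} -pass 2 -passlogfile {log} "
        f"-acodec libvorbis -ab 100k -f webm -y {out}"
    )
    return f"{pass1} && {pass2}"
```

Writing one such line per title to a text file gives the same sort of queue file described above.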
Unfortunately, ffmpeg doesn't really do threading very well, and the prevailing advice out there appears to be that you should just thread the process yourself rather than ask ffmpeg to try to use all your CPUs itself. For this bit, I wrote a very simple paralleliser in Python and magically, all of the cores on my super machine could crunch Grandpa's videos, 16 at a time.
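A paralleliser like that doesn't need to be fancy. Here's a minimal sketch of one way to do it, not the original code: the function names and the default of 16 workers are illustrative. Threads are enough here because each one just sits waiting on a child process while ffmpeg does the real work.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(command: str) -> tuple[str, int]:
    """Run one shell command line and return it with its exit status."""
    result = subprocess.run(command, shell=True)
    return command, result.returncode


def run_queue(commands: list[str], workers: int = 16) -> list[tuple[str, int]]:
    """Run the commands with at most `workers` of them going at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input commands.
        return list(pool.map(run_command, commands))


def failed(results: list[tuple[str, int]]) -> list[str]:
    """Return the commands that exited with a non-zero status."""
    return [command for command, status in results if status != 0]
```

Because each command goes through the shell, a queue line of the form `ffmpeg ... && ffmpeg ...` runs both passes in order inside a single worker. Feeding it a queue file is then just a matter of passing the file's lines to `run_queue` and printing whatever `failed` returns.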
Finally, I wrapped the whole thing in a simple script that mounted all of the .isos simultaneously and then ran the paralleliser, and ran that in a tmux session so I could get on a plane and fly to Greece while my computer did its thing for two days.
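For the mounting step, something along these lines would work. It's a sketch under assumptions: the images are named `disk-NN.iso` as in the `dd` command earlier, each one gets mounted read-only at `<mount root>/NN`, and the function name is made up. It only builds the commands; actually running them needs root.

```python
from pathlib import Path


def mount_commands(iso_dir: str, mount_root: str) -> list[list[str]]:
    """Build mkdir and loop-mount commands for every disk-NN.iso in iso_dir."""
    commands = []
    for iso in sorted(Path(iso_dir).glob("disk-*.iso")):
        number = iso.stem[len("disk-"):]  # "disk-18" -> "18"
        target = Path(mount_root) / number
        commands.append(["mkdir", "-p", str(target)])
        commands.append(["mount", "-o", "loop,ro", str(iso), str(target)])
    return commands
```

Each inner list can be handed to `subprocess.run(..., check=True)` before the encoding queue starts.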
While I was in Athens, I spent a day or two fiddling with the site itself, getting video.js to work the way I wanted it to and playing with Select2 to try and get an interface that the non-technical people in my family could follow. I wish I had better skills in this area 'cause frankly, the site is kinda ugly, but at least it's functional now.
So that's it. I hope that one day, someone will find this stuff useful. The ffmpeg incantations were especially difficult to find and assemble, so I figure that'll help someone eventually.