Downloading 78s from the Internet Archive
I’m leaving this as a note for myself, so that I can refer to it in the future, but it might help other people too. I’m downloading a large quantity of rips of 78 RPM records from the Internet Archive’s Great 78 Project (for a project that I’ll discuss later), and I want to ensure that I only get the “preferred versions suggested by an audio engineer at George Blood, L.P. [which] have been copied to have […] more friendly filenames.”
I’m using the Internet Archive’s Python CLI. It doesn’t have an obvious method for doing this. It does, however, support filtering the files to download with a glob. This is how I’ve downloaded things like this in the past, but I’ve been doing too much bash and not enough Python recently, so I kept screwing up the syntax.
The syntax to download friendly-filename MP3s from the George Blood LP collection at the Internet Archive using the ia Python tool is:
./ia download --search="collection:georgeblood" --glob="[!_]*.mp3"
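The glob is easy to misread, so here is a minimal sketch of what it matches. It assumes the ia tool's glob behaves like Python's standard fnmatch module (I haven't confirmed that here), and the filenames are made up for illustration, not taken from a real item. The `[!_]` part means "any single character except an underscore", so the pattern only rejects names that begin with an underscore.

```python
from fnmatch import fnmatch

# The same pattern passed to --glob above.
PATTERN = "[!_]*.mp3"

# Hypothetical filenames, invented for this example.
files = [
    "_78_some-song_some-artist_gbia0000000a_01_3.5_ET_EQ.mp3",  # leading underscore
    "Some Song - Some Artist.mp3",                              # friendly filename
    "Some Song - Some Artist.flac",                             # wrong extension
    "some_song.mp3",                                            # underscore, but not first
]


def keep(names, pattern=PATTERN):
    """Return only the names that match the glob pattern."""
    return [name for name in names if fnmatch(name, pattern)]


kept = keep(files)
for name in kept:
    print(name)
```

Note that a name with an underscore in the middle still matches. Only the first character is checked, which is enough here as long as the non-preferred files are the ones whose names start with an underscore.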
So there we go, a one-liner for the Internet Archive ia Python CLI glob to ignore filenames that start with an underscore.
(And that should be enough keywords that I actually find this in four years when I need to remember it again. :-) )