When adding to my collection of music, I find that it is necessary to listen to the whole lot in order to weed out duplicates. The scenario goes something like this: I am handed a CD full of ripped stuff with the comment that there might be something in there that I like. Sure, there is. Our tastes are similar. But that CD is about 12 hours long, and that is how long it will take me to listen to all of it.
So it is copied to somewhere on the hard disk, in quarantine, so to speak, until I get to listen to it and add it to my collection. What I need is some proggy which will listen to the whole lot, sort of, and notify me of duplicate tunes.
A simple file compare does not do the job. Two rips of the same audio track rarely match bit for bit. When two different people rip the same track using different software and hardware, even the playing time and file size will differ. Then there are the differences in bit rate, VBR, etc. to be taken into account.
The program would take the first few seconds of each song, decode them to raw samples and then do a shift/multiply/add operation on each pair (i.e., correlation). This would have to be repeated for a few different time offsets, to allow for tracks clipped differently at the start.
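The shift/multiply/add step could be sketched roughly as follows. This is only an illustration, not a finished proggy: it assumes both songs have already been decoded to lists of raw samples at the same sample rate, and the names `normalized_correlation`, `best_match` and `max_shift` are made up for the example. The scores are normalized so that an identical pair comes out close to 1.0.

```python
import math

def normalized_correlation(a, b):
    """Multiply/add two sample lists (mean removed), scaled to the range -1..1."""
    n = min(len(a), len(b))          # compare only the overlapping part
    if n == 0:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0  # silent or constant input scores 0

def best_match(a, b, max_shift):
    """Try offsets from -max_shift to +max_shift samples; return (score, shift).

    A positive shift means b starts `shift` samples later into the tune than a.
    """
    best = (-1.0, 0)
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            score = normalized_correlation(a[shift:], b)
        else:
            score = normalized_correlation(a, b[-shift:])
        if score > best[0]:
            best = (score, shift)
    return best
```

Trying every offset one sample at a time is slow for real audio; in practice the samples would probably be downsampled first, or the correlation done with an FFT, but that is beyond this sketch.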
The results would be presented as rankings: A table showing the highest few scores for each. A new song will have low scores. A duplicate will have a high score against ONE tune in my collection. If a song has high scores against many tracks there is something wrong.
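A rough sketch of how such a ranking might be built, assuming the pairwise scores are already available as a dictionary. The function name `rank_matches`, the verdict labels and the 0.8 threshold are all invented for illustration; a real threshold would have to be found by experiment.

```python
def rank_matches(scores, top_n=3, threshold=0.8):
    """Rank each new song against the collection.

    scores: {new_song: {collection_song: score}}
    Returns {new_song: (verdict, [(collection_song, score), ...])}.
    """
    report = {}
    for song, against in scores.items():
        # Highest scores first
        ranked = sorted(against.items(), key=lambda kv: kv[1], reverse=True)
        high = [name for name, s in ranked if s >= threshold]
        if not high:
            verdict = "new"          # low scores against everything
        elif len(high) == 1:
            verdict = "duplicate"    # high score against ONE tune
        else:
            verdict = "suspect"      # high against many: something is wrong
        report[song] = (verdict, ranked[:top_n])
    return report
```

Only the top few scores per song are kept, so the table stays readable even against a large collection.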
There must be something like this out there. If not, it is heigh, ho, with a C and a song, off to code a new prog.
-- neelandan, Jul 02 2003
I've seen programs that do just this for images, so it should be quite feasible for music as well. Of course it's O(n^2 l) to check an entire collection for duplicates...
-- cp, Jul 02 2003
MoodLogic can identify songs based on the audio (spectral) content. It even corrects ID3 tags for you. The database is still fairly small though, I think. (It recognizes popular songs off an album, but not every track.) http://www.moodlogic.com/
-- jamieacura, Jul 02 2003
Welllllllll, there's quite a few different recordings of John Coltrane's "Naima" - 40, in fact. A few of those are on compilation sets, but the vast majority are separate and distinct performances of the same tune. "Giant Steps" appears on 23 albums; this time a smaller percentage are *different* performances. Both songs also appear on a few albums twice, though performed a bit differently - the ol' *alternate takes* trick. That's just 2 songs by 1 jazz legend - 'Trane.
-- thumbwax, Jul 02 2003
How about something that checks a song by bassline and tempo? For most music the bass and drums are distinct and easy to recognize by amplitude peaks. These would stay more or less constant even in different live versions of the song.
-- feedmewithyourkids, Jul 03 2003
Yesss... I like this idea. Perhaps an app that could compare MP3s bit for bit. It flags duplicates based on a percentage of how closely the files' bits match, like if 80 or 90% of the bits match up. Hmmm....
-- frakamazog, Jul 04 2003
This sort of already exists.
-- kurtynlsn, Feb 21 2004
There are already programs that can search directories on your computer for duplicate files in general. One of those would probably help considerably, as long as most of your music is in the same format.
-- Psudomorph, May 03 2007