This should be easy. I have an .mp3 ripped from a two-disc set I bought in high school, but each disc was ripped as a single track: one file for the first disc, one for the second. The discs have long since been destroyed, or lost by my parents/siblings after I moved out. So I’m cleaning it up, and learning a few things in the process.
Normally I would do this:
ffmpeg -ss [start seconds] -i input.mp3 -t [length in seconds] -c copy -metadata artist="[artist name]" -metadata title="[song title]" -metadata track="[track number]/[total tracks]" -id3v2_version 3 -write_id3v1 1 output.mp3
Grab the track info. This doesn’t show what’s on disc 1 and disc 2, so we’ll have to tally up the time and see where the end of disc 1 falls. Put it into your favorite text editor and find/replace until you have a list of lists. Disc 1 is 1:08:46 which is 4,126 seconds.
trackInfo = [["Banyan Tree","Feel The Sun Rise","2","09"],["Andy Duguid Featuring Leah","Wasted","4","18"],["King Unique","Yohkoh (King Unique Original Mix)","4","24"],["Motorcitysoul","Space Katzle (Jerome Sydenham Remix)","4","55"],["Three Drives","Feel The Rhythm (Ton TB Dub Mix)","5","53"],["Rachael Starr","To Forever (Moonbeam Remix)","5","08"],["Jerry Ropero Featuring Cozi","The Storm (Inpetto Remix)","5","36"],["Kamui ","Get Lifted","5","21"],["Cary Brothers","Ride (Tiësto Remix)","5","06"],["Airbase Featuring Floria Ambra","Denial","6","03"],["Dokmai","Reason To Believe","6","03"],["Cressida","6AM (Kyau & Albert Remix)","5","05"],["Allure Featuring Christian Burns","Power Of You","6","39"],["Clouded Leopard","Hua-Hin","2","01"],["Steve Forte Rio Featuring Jes (12)","Blossom (Lounge Mix)","2","00"],["Zoo Brazil","Crossroads","4","47"],["Beltek","Kenta","5","59"],["Sied van Riel","Rush","5","00"],["Tiësto*","Driving To Heaven (Mat Zo Remix)","5","49"],["Carl B.*","Just A Thought","6","05"],["Kimito Lopez","Melkweg","6","09"],["JPL","Whenever I May Find Her (Joni Remix)","5","34"],["Estiva vs. Marninx","Casa Grande","5","12"],["Existone","Wounded Soul","6","31"],["André Visior & Kay Stone","Something For Your Mind (Guiseppe Ottaviani Remix)","6","01"],["Hensha","The Curtain","6","57"],["DJ Eremit","Tanz Der Seele (YOMC Remix)","3","49"],["Manilla Rising","Beyond The Stars","1","31"]]
runningTotalTime = []
running = 0
for track in trackInfo:
    minutes = int(track[2])
    seconds = int(track[3])
    total = (60 * minutes) + seconds
    running += total
    runningTotalTime.append([total, running])
for length in runningTotalTime:
    print(length)
Yields:
[129, 129]
[258, 387]
[264, 651]
[295, 946]
[353, 1299]
[308, 1607]
[336, 1943]
[321, 2264]
[306, 2570]
[363, 2933]
[363, 3296]
[305, 3601]
[399, 4000]
[121, 4121]
[120, 4241]
[287, 4528]
[359, 4887]
[300, 5187]
[349, 5536]
[365, 5901]
[369, 6270]
[334, 6604]
[312, 6916]
[391, 7307]
[361, 7668]
[417, 8085]
[229, 8314]
[91, 8405]
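Finding the disc boundary by eye works, but it can also be checked in code: convert the disc 1 length to seconds, then pick the track whose running total lands closest. This is only a sketch. `hms_to_seconds` and `closest_track_count` are names made up for illustration, and the list holds just the first 15 running totals from the output above.

```python
def hms_to_seconds(stamp):
    """Convert 'h:mm:ss' or 'mm:ss' to whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def closest_track_count(running_totals, target_seconds):
    """Return how many tracks give the running total closest to target_seconds."""
    best_index = min(
        range(len(running_totals)),
        key=lambda i: abs(running_totals[i] - target_seconds),
    )
    return best_index + 1


# Running totals copied from the output above (first 15 tracks only).
running = [129, 387, 651, 946, 1299, 1607, 1943, 2264, 2570,
           2933, 3296, 3601, 4000, 4121, 4241]

disc1_seconds = hms_to_seconds("1:08:46")
disc1_track_count = closest_track_count(running, disc1_seconds)
print(disc1_seconds, disc1_track_count)
```

The closest total (4121) is a few seconds off from the disc length (4126), so a nearest match is safer here than an exact comparison.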
That’s 14 tracks on disc 1. Let’s break it up into two lists. Then we’ll take what we did and define it as a function we can run on each individual disc as part of our main function. Also add a function to construct the commands and feed them into the terminal.
For this I’m using subprocess. You can use either subprocess.run or subprocess.Popen. As I understand it, run waits for each command to finish before returning, so the tracks are processed one at a time. It hands back the exit code, but it won’t raise an error on a failure unless you pass check=True.
The other option is Popen, which starts a process for each command and returns almost immediately, so the script looks like it’s complete right away. Then your CPU will be pegged with individual ffmpeg processes for a while until they start completing.
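The difference between the two is easier to see side by side. The sketch below uses a tiny Python child process as a stand-in for ffmpeg (that substitution is my assumption, so it runs on a machine without ffmpeg installed); the `subprocess` calls themselves are the standard library ones.

```python
import subprocess
import sys

# Stand-in for an ffmpeg call: a short-lived child process that exits cleanly.
command = [sys.executable, "-c", "print('done')"]

# subprocess.run blocks until the child exits and returns a CompletedProcess.
# check=True makes it raise CalledProcessError on a non-zero exit code.
result = subprocess.run(command, check=True, capture_output=True, text=True)
run_exit_code = result.returncode

# subprocess.Popen returns immediately, so these three children run side by side.
children = [subprocess.Popen(command, stdout=subprocess.DEVNULL) for _ in range(3)]

# Nothing is guaranteed to be finished until each child has been waited on.
popen_exit_codes = [child.wait() for child in children]

print(run_exit_code, popen_exit_codes)
```

With Popen, keeping the process objects around and calling `wait()` on each is what tells you the output files are actually complete.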
For very large files, I would probably opt for the latter option. But I would probably write each file as .mp3.part or some other intermediate name first, then follow the command with && mv [filename].mp3.part [filename].mp3 or something similar (from Python, that means renaming once the process exits, since && is shell syntax). When I tried it, the files all looked done in the file explorer right away, and I could have run into read/write errors trying to open or modify a song in VLC that was still being written by ffmpeg in the background.
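Here is a rough sketch of that write-then-rename idea done from Python instead of with `&& mv`. `run_then_rename` and `fake_encoder` are hypothetical helpers, and the "encoder" is a stand-in child process rather than ffmpeg so the example runs anywhere.

```python
import os
import subprocess
import sys
import tempfile


def run_then_rename(build_command, final_path):
    """Write to final_path + '.part', then rename once the command succeeds.

    build_command is a hypothetical callback: it takes the temporary path
    and returns the argument list to run.
    """
    part_path = final_path + ".part"
    # check=True raises on failure, so a broken file is never renamed.
    subprocess.run(build_command(part_path), check=True)
    # os.replace overwrites final_path if it already exists.
    os.replace(part_path, final_path)
    return final_path


def fake_encoder(path):
    """Stand-in for ffmpeg: a child process that writes a few bytes to path."""
    script = "import sys; open(sys.argv[1], 'wb').write(b'audio')"
    return [sys.executable, "-c", script, path]


workdir = tempfile.mkdtemp()
finished = run_then_rename(fake_encoder, os.path.join(workdir, "01-Track.mp3"))
print(os.path.exists(finished), os.path.exists(finished + ".part"))
```

With real ffmpeg you would most likely also need to pass `-f mp3`, since it normally picks the output format from the file extension and `.part` tells it nothing. The same rename could follow a `Popen.wait()` call if the processes run in parallel.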
#variable declaration
...
import subprocess

#split the album into songs, with seconds
def timeSplitter(inputList):
    runningTotal = []
    runningTime = 0
    for track in inputList:
        minutes = int(track[2])
        seconds = int(track[3])
        total = (60 * minutes) + seconds
        runningTime += total
        runningTotal.append([total, runningTime])
    return runningTotal

disc1Times = timeSplitter(disc1Tracks)
disc2Times = timeSplitter(disc2Tracks)
#print(disc1Times)
#print(disc2Times)

#feed me a music file; track info (artist, song, mm, ss); formatted song lengths in seconds; disc #, if multiple
def commandFfmpeg(albumMp3, trackInfo, discTimes, disc = None):
    i = 0
    totalTracks = len(trackInfo)
    while i < totalTracks:
        #get ready to prepend output filename, but only if non-null
        #(empty string rather than None, so the concatenation below can't raise a TypeError)
        if disc is None:
            discString = ''
        else:
            discString = str(disc) + '.'
        #start = running total minus this track's own length
        startSeconds = str(discTimes[i][1] - discTimes[i][0])
        #renamed from "input" so the built-in input() isn't shadowed
        inputFile = albumMp3
        length = str(discTimes[i][0])
        #no inner quotes: there is no shell here to strip them, so they would end up in the tags
        artist = 'artist=' + trackInfo[i][0]
        title = 'title=' + trackInfo[i][1]
        album = 'album=In Search of Sunrise 7 - Asia'
        trackNumber = 'track=' + str(i + 1)
        #use .rjust to add leading zero as needed to string
        #.replace() used to remove spaces from filename
        output = './' + discString + str(i + 1).rjust(2, '0') + '-' + trackInfo[i][1].replace(' ', '') + '.mp3'
        print(str(i))
        subprocess.run(["ffmpeg", "-ss", startSeconds, "-i", inputFile, "-t", length, "-metadata", artist, "-metadata", title, "-metadata", album, "-metadata", trackNumber, output])
        i += 1

commandFfmpeg("./1-isos7.mp3", disc1Tracks, disc1Times, 1)
commandFfmpeg("./2-isos7.mp3", disc2Tracks, disc2Times, 2)
Final thoughts:
I still need to work on reading data from files, rather than pasting multi-line variable declarations directly into the script. I’ll be digging into the csv library shortly (along with pandas) to toy with some graphing.
This is still a bit messy. I really should have just called the timeSplitter function as part of commandFfmpeg. That would have saved me from declaring individual variables outside of the function, and from putting additional parameters into the commandFfmpeg function call, since it already has everything it needs without that.
This is really only useful for splitting a single file. If you want to just edit metadata, I recommend using Picard. It will automatically look up albums and take care of literally everything else for you if you want, or you can modify individual metadata on individual files as needed before saving.
Either way, for someone who’s less than a month into this, I’m pleasantly surprised with my ability to correctly identify places where I need to type cast BEFORE I get errors from the interpreter. And I’m still pleasantly surprised with the time that’s saved when writing code compared to C++.
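On the cleanup point above (calling timeSplitter from inside commandFfmpeg), a possible shape is sketched below. It is not what I actually ran: the snake_case names and the `album` parameter are invented for illustration, only two tracks are included as sample data, and the function just builds the argument lists instead of executing them.

```python
def time_splitter(tracks):
    """Return [length, running_total] in seconds for each track."""
    running_total = []
    running_time = 0
    for track in tracks:
        total = 60 * int(track[2]) + int(track[3])
        running_time += total
        running_total.append([total, running_time])
    return running_total


def build_commands(album_mp3, tracks, album, disc=None):
    """Build one ffmpeg argument list per track; nothing is executed here."""
    disc_prefix = "" if disc is None else f"{disc}."
    commands = []
    # The times are computed here, so the caller no longer passes them in.
    for number, (track, (length, end)) in enumerate(
        zip(tracks, time_splitter(tracks)), start=1
    ):
        artist, title = track[0], track[1]
        output = f"./{disc_prefix}{number:02d}-{title.replace(' ', '')}.mp3"
        commands.append([
            "ffmpeg", "-ss", str(end - length), "-i", album_mp3,
            "-t", str(length),
            "-metadata", f"artist={artist}",
            "-metadata", f"title={title}",
            "-metadata", f"album={album}",
            "-metadata", f"track={number}",
            output,
        ])
    return commands


# Two tracks from disc 1 as sample data.
sample = [["Banyan Tree", "Feel The Sun Rise", "2", "09"],
          ["Andy Duguid Featuring Leah", "Wasted", "4", "18"]]
commands = build_commands("./1-isos7.mp3", sample, "In Search of Sunrise 7 - Asia", 1)
for command in commands:
    print(command)
```

Each list could then be handed to `subprocess.run`. Keeping the building and the running separate also makes it easy to print the commands and check them before anything touches the audio.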