I have a list of songs gathered from this subreddit, presented in the following format:
[
"Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)",
"Julia Jacklin - Pressure to Party [Rock] (2019)",
"The Homeless Gospel Choir - I'm Going Home [Folk-Punk] (2019) cover of Pat the Bunny | A Fistful of Vinyl",
"Lea Salonga and Simon Bowman - The last night of the world [musical] (1990)",
"$uicideboy$ - Death",
"SNFU -- Joni Mitchell Tapes [Punk/Alternative] (1993)",
"Blab - afdosafhsd (2000)",
"Something strange and badly formatted without any artist [Classical]",
"シロとクロ「ミッドナイトにグッドナイト」(Goodnight to Midnight - Shirotokuro) - (Official Music Video) [Indie/Alternative]",
"Victor Love - Irrationality (feat. Spiritual Front) [Industrial Rock/Cyberpunk]"
...
]
I'm attempting to extract the title and artist information from each entry using regex, but I'm encountering difficulties.
My initial approach of splitting by "-" only provides the artist, which is cumbersome to deal with.
I also tried using regex, but I can't seem to get it right. Here's what I had for the artist:
/(?<= -{1,2} )[\S ]*(?= \[|\( )/i
and for the title:
/[\S ]*(?= -{1,2} )/i
Each entry consists of a song title, optionally preceded by the artist and a separator of one or two dashes. Genres may be enclosed in square brackets and release dates in parentheses. While I don't expect perfect results, I'd prefer the artist to be undefined when parsing fails on unusual formats.
For example, this is the output I'm hoping for:
[
{ title: "MYTCH", artist: "Lophelia" },
{ title: "Pressure to Party", artist: "Julia Jacklin" },
{ title: "I'm Going Home", artist: "The Homeless Gospel Choir" },
{ title: "The last night of the world", artist: "Lea Salonga and Simon Bowman" },
{ title: "Death", artist: "$uicideboy$" },
{ title: "Joni Mitchell Tapes", artist: "SNFU" },
{ title: "afdosafhsd", artist: "Blab" },
{ title: "Something strange and badly formatted without any artist" },
{ title: "Goodnight to Midnight", artist: "Shirotokuro" }, // AI might be needed for this
{ title: "Irrationality", artist: "Victor Love" }
]
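To make the rules above concrete, here is a minimal sketch of one possible approach, not a definitive solution. Instead of one regex with lookarounds, it first strips the bracketed and parenthesised parts and then splits on the first dash separator. The function name parseSong is hypothetical. The handling of " | " and "cover of" is an assumption based only on the sample entries, and the sketch assumes titles never contain meaningful parentheses.

```javascript
// Hypothetical helper; the name and the cleanup rules are illustrative assumptions.
function parseSong(entry) {
  // Assumption: anything after " | " (e.g. a channel name) is not part of the song.
  let text = entry.split(" | ")[0];

  // Remove [genre] tags and (parenthesised) notes such as years or "feat." credits.
  text = text.replace(/\[[^\]]*\]/g, " ").replace(/\([^)]*\)/g, " ");

  // Assumption: a trailing "cover of ..." remark is not part of the title.
  text = text.replace(/\bcover of\b.*$/i, " ");

  // Collapse the whitespace left behind by the removals.
  text = text.replace(/\s+/g, " ").trim();

  // Artist and title are separated by " - " or " -- ".
  // The lazy first group makes the split happen at the first separator.
  const match = text.match(/^(.+?) -{1,2} (.+)$/);
  if (!match) {
    // No separator found: leave the artist undefined.
    return { title: text };
  }
  return { title: match[2].trim(), artist: match[1].trim() };
}

const songs = [
  "Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)",
  "$uicideboy$ - Death",
  "Something strange and badly formatted without any artist [Classical]",
];
console.log(songs.map(parseSong));
```

Because the genre tags are removed before splitting, hyphens inside them (such as "Prog-Rock") cannot be mistaken for the separator. The Japanese entry is not handled by this sketch: its artist and title sit inside the parentheses that get stripped, so it would need special treatment.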