I agree with Lloyd and shlind; there is a deep need to be able to find, fine-tune and create styles that catch the characteristics of the song one has in one’s head.

Shlind talks about the need for another “dimension” that “catches the feel/groove of the song”. This is certainly one way to think about the solution. But I too have no idea what that other dimension could be. Does anyone?

This also has been discussed here.
Real-time Style Creation

As an engineer I have to look at this from an “engineering perspective” and decompose the problem into two parts, a Part A and a Part B.

Part A
As I see it, Part A is to acquire a deep and working understanding of what music is. For example, what is it that makes any artist sound like themselves?
Is it the drums? No. There are lots of artists that use the same patterns.
Guitar? No, for the same reason.
Synths? No.
Bass? No.
Percussion? No.
Vocals? No.
Effects? No.
Unique Skill? No.
For sure, all of these are important, but more is needed to truly replicate any given artist’s song.

The detailed “recipe” that the artist in question uses appears to be a fundamental question that must be answered at a deep and complete level for any algorithm or AI to successfully replicate the song. We use terms like “feel” and “groove” but what do they really mean? These terms need to be defined sonically and perhaps mathematically so that the programmers can program.
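To make "defined mathematically" a little more concrete, here is a minimal sketch of two timing measurements that are often associated with "feel": how far each note lands from a rigid grid, and how uneven pairs of eighth notes are (swing). This is only my illustration, not a definition of groove. The function names and the idea of feeding in a plain list of note onset times in seconds are assumptions made for the example.

```python
from statistics import mean


def microtiming_offsets(onsets, tempo_bpm, subdivisions=4):
    """Signed distance (seconds) of each onset from its nearest grid slot.

    Assumes the grid starts at time 0 and that `subdivisions` slots fit
    in one beat. Positive values are late, negative values are early.
    """
    step = 60.0 / tempo_bpm / subdivisions  # length of one grid slot
    return [t - round(t / step) * step for t in onsets]


def swing_ratio(onsets):
    """Average long/short ratio of consecutive eighth-note pairs.

    Assumes `onsets` alternates on-beat, off-beat, on-beat, ...
    A ratio of 1.0 is straight; about 2.0 is a triplet swing.
    """
    if len(onsets) < 3:
        raise ValueError("need at least three onsets")
    ratios = []
    for i in range(0, len(onsets) - 2, 2):
        first = onsets[i + 1] - onsets[i]       # on-beat to off-beat
        second = onsets[i + 2] - onsets[i + 1]  # off-beat to next beat
        ratios.append(first / second)
    return mean(ratios)


# Hypothetical hi-hat onsets at 120 bpm (one beat = 0.5 s)
straight = [0.0, 0.25, 0.5, 0.75, 1.0]
swung = [0.0, 1 / 3, 0.5, 0.5 + 1 / 3, 1.0]
print(swing_ratio(straight), swing_ratio(swung))
```

Numbers like these obviously capture only a sliver of the recipe, but they show the kind of thing a programmer could actually work with.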

The good news is that skillful, discerning musicians instinctively know the recipe even if they can’t articulate it without their instruments. Case in point is the Foxes and Fossils cover of Sail On, Sailor. Not only were they able to replicate the original to a very high level (not to be confused with duplicate), but I happen to like their rendition better than the Beach Boys original! Another example of this for me is Killing Me Softly by Roberta Flack. Most likely everyone has their own examples.

A world-class chef uses the same elements as us lesser cooks (meat, vegetables, grains, water, spices, temperature, time, etc.); it’s the recipe (or process) that makes the difference. Given that the human tongue can detect only 5 basic tastes (sweet, salty, sour, bitter and umami), how can it be that in the course of just 1 week we can detect dozens of different flavors and over a lifetime, hundreds if not thousands? The answer is a recipe of combinations of tastes. So too with music.

More good news is that much of the heavy lifting has already been done in this arena and there are vast resources available. One resource is librosa, a Python library for audio and music analysis. This is not at all to say that this is an easy problem to solve. Quite the contrary, but some team will eventually solve it.
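As a rough idea of what librosa gives you out of the box, the sketch below pulls an estimated tempo, beat times and note onset times from an audio file. It assumes librosa and NumPy are installed; the file name is hypothetical, and the helper names are mine, not part of librosa. The tempo and onsets are estimates and can be off, especially on dense mixes.

```python
def inter_onset_intervals(onset_times):
    """Gaps (seconds) between consecutive onsets; a raw view of rhythm."""
    return [b - a for a, b in zip(onset_times, onset_times[1:])]


def describe_recording(path):
    """Estimate tempo, beat times and onset times for one audio file.

    librosa and numpy are third-party packages, imported here so the
    helper above still works without them.
    """
    import librosa
    import numpy as np

    y, sr = librosa.load(path, sr=None)  # keep the file's sample rate
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    beat_times = librosa.frames_to_time(beat_frames, sr=sr)
    onset_times = librosa.onset.onset_detect(y=y, sr=sr, units="time")
    onsets = onset_times.tolist()
    return {
        # tempo may come back as a scalar or a 1-element array
        "tempo_bpm": float(np.atleast_1d(tempo)[0]),
        "beat_times": beat_times.tolist(),
        "onset_times": onsets,
        "inter_onset_intervals": inter_onset_intervals(onsets),
    }


# Usage, assuming a file called my_song.wav exists (hypothetical):
# features = describe_recording("my_song.wav")
# print(features["tempo_bpm"])
```

This is nowhere near a "recipe" yet, just the raw ingredients that something smarter would have to combine.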

Part B
Part B of the solution (imho) is to take the progress and knowledge from Part A and exploit the wonderful RTs and RDs to create a general solution using software and perhaps machine-learning tools. It seems to me that the HIL (Human-in-the-Loop) is the weak point. If humans are painstakingly categorizing popular songs into style buckets, it’s likely this is not sustainable because:
1. There are far too many songs to categorize; over 100,000 songs are uploaded per day worldwide.
2. New genres will continue to be created by the movers and shakers in the music industry; genres that could not have been recorded when the RTs and RDs were recorded because they didn’t exist at that time.
3. Users may want to replicate unpopular songs that were never categorized in the first place.

For grins, I asked my AI assistant the following question:
Given a static and limited set of drum loops, guitar recordings, keyboard riffs, and other recordings, is it known by musicologists exactly how to accurately replicate a given song using these recordings?

In addition to describing the complex task of considering the arrangement, structure, timing, tempo, harmony, melody, instrumentation, timbre, mixing and production, it gave this summary answer:
While musicologists and producers can get very close to replicating a song using these techniques, achieving an exact match involves a deep understanding of the original recording’s nuances and the creative decisions made during its production.

This answer tells me that software is the only way to produce a general and complete solution.


https://soundcloud.com/user-646279677
BiaB 2025 Windows
For me there’s no better place in the band than to have one leg in the harmony world and the other in the percussive. Thank you Paul Tutmarc and Leo Fender.