FWIW, I think that in the near future AI will be separating stems not only by frequency but also by tone. That way it could separate an acoustic guitar from an electric guitar, a bass guitar from a tuba, etc.
I think the present software already does that to a very large extent, though perhaps not in quite those terms. It would be impossible to achieve the separation they're managing today without a good recognition of which bits of the total sound belong to which instrument. The envelopes will tell a lot, but certainly not everything.
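To make the envelope point concrete, here's a rough Python sketch (not from any actual separation tool; the function name, frame size, and synthetic tones are all just illustrative) of how the attack/decay shape of a note is one cue for telling instruments apart even when they share pitch:

```python
# Minimal sketch: two tones at the same pitch (220 Hz) with different
# amplitude envelopes. A plucked string decays fast; a bowed/sustained
# tone ramps up and holds. Names and parameters are illustrative only.
import numpy as np

def rms_envelope(signal, sr, frame_ms=20):
    """Amplitude envelope as frame-by-frame RMS energy."""
    frame = int(sr * frame_ms / 1000)
    n = len(signal) // frame
    return np.sqrt(np.mean(signal[:n * frame].reshape(n, frame) ** 2, axis=1))

sr = 44100
t = np.linspace(0, 1, sr, endpoint=False)
plucked = np.exp(-5 * t) * np.sin(2 * np.pi * 220 * t)          # sharp attack, fast decay
bowed = np.minimum(t / 0.2, 1.0) * np.sin(2 * np.pi * 220 * t)  # slow attack, steady level

print(rms_envelope(plucked, sr)[:5])  # falls quickly
print(rms_envelope(bowed, sr)[:5])    # rises slowly
```

Same fundamental, same harmonics, but the envelopes alone already distinguish them, which is presumably part of what the current systems pick up on.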
If all instruments produced just harmonics of the fundamental, I suspect isolation would be relatively(!) easy, but since most instruments, including the voice, produce inharmonic overtones and sympathetic resonances, it must become very difficult to identify which sounds belong to which instrument. We've probably all heard times when a bass note has caused a snare or cymbal to sound, and similarly a piano with the dampers off.
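For a sense of why inharmonicity matters, here's a small Python sketch using the standard stiff-string model, where partial n sits near n * f0 * sqrt(1 + B * n^2) rather than exactly on n * f0 (the B value below is made up for illustration, not measured from any real instrument):

```python
# For an ideal string, partial n lands exactly at n * f0, so a harmonic
# "comb" mask can claim those frequency bins for one instrument. Stiff
# strings (e.g. piano) drift progressively sharp of the comb, so the
# partials no longer line up where a simple harmonic model expects them.
f0 = 110.0  # A2
B = 0.0004  # hypothetical inharmonicity coefficient

for n in range(1, 11):
    ideal = n * f0
    stiff = n * f0 * (1 + B * n * n) ** 0.5
    print(f"partial {n:2d}: ideal {ideal:7.1f} Hz, stiff {stiff:7.1f} Hz, "
          f"drift {stiff - ideal:+6.1f} Hz")
```

By the tenth partial the drift is already around 20 Hz in this toy example, and sympathetic resonances add energy at frequencies that don't belong to the exciting instrument at all, which is exactly why untangling the mix is so hard.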