Earlier this month, a company called HYBE spent US$32 million to acquire another South Korean company called Supertone. Its main asset is some software that it claims can create “a hyper-realistic and expressive voice that [is not] distinguishable from real humans.”
And it works pretty well. In January 2021, Supertone revealed its Singing Voice Synthesis technology.
The big party trick was to present Kim Kwang-seok, a Korean folk superstar who sold millions of records at home, singing a brand-new song. Pretty cool, considering that he died in 1996.
Using artificial intelligence, Supertone’s SVS tech “learned” about 100 songs by 20 different singers in order to develop a style. Then it learned 10 songs by Kim. Putting everything together, the AI was able to create something that was more than a reasonable facsimile.
Why would HYBE be interested in such technology? Because it’s the company behind some major K-pop acts, including BTS. This past year, the boy band shocked their global fanbase by announcing that they were going to take a break from music. Fair enough, given the insane ride they’ve been on for the last several years.
This, however, created some serious problems. First, under South Korean law, every member of BTS is now required to complete compulsory military service (they had been exempt under exceptions given to artists and athletes).
This will take BTS out of the spotlight for at least 18 months. And since the seven members are of different ages, the times when they are supposed to start their time in the army will be staggered. BTS could be MIA for years. Not good for an enterprise that has revenues of US$3.6 billion a year.
Might HYBE use Supertone to create new BTS material while the boys are in the army? It appears possible.
From a sheer capitalistic point of view, this seems brilliant. No more temperamental musicians who get drunk, high, and end up doing stupid #MeToo things with fans. New music can be summoned on cue so no more waiting for inspiration to strike. Talk about music that’s cheap to produce, too. No salaries, royalties, per diems, or any of those other expenses eaten up by real human beings.
This raises the question: If the fake is indistinguishable from the real, will fans fall for it? Maybe.
Music synthesis by machine has been a dream of scientists for decades. In 1961, Max Mathews, an engineer at Bell Labs, and his colleagues got an IBM 7094 mainframe to sing “Daisy Bell.” No computer had ever done this before.
What was once cutting-edge research and science fiction is now very, very real. And while machines aren’t totally autonomous composers and performers yet, we’re headed in that direction. Right now, though, the focus is on AI-powered music creation software as a tool. Call it software-assisted composing.
In 2020, Grimes (the Canadian singer who is Elon Musk’s ex) worked with a startup called Endel to create a new piece of music she called an “AI lullaby.” She created “stems” (short distinct clips) of both music and vocals and then let the software do the rest. Endel has also been used to create music that helps people sleep and driving music to keep Mercedes-Benz drivers focused on the road.
Google is working on a system called AudioLM which can both create natural-sounding speech/singing and create music. All it needs is a few seconds of original audio and it’ll take it from there. Its piano pieces are smooth, fluid, and reasonably nuanced. No piano is necessary, either.
Harmonai is a project of a company called Stability AI, which describes itself as “a community-driven organization releasing open-source generative audio tools to make music production more accessible and fun for everyone.” It also has a tool called Dance Diffusion (currently in beta) which can generate new original short clips of music based on its knowledge of a catalogue of music. Some artists are using the software to kick-start new compositions.
Then there’s Amazon, which is worki