Science is everywhere around us. It’s a fact. Even the most ordinary objects we handle on a regular basis are steeped in it. Sometimes, someone just needs to explain where to look and how it works. Enter The World According to Jeff Goldblum. On the show, the actor plucks everyday topics, so familiar that we barely even notice them, and shows the magic behind them. In his latest episode, Goldblum tackles the way puzzles permeate our everyday lives and how we solve dozens of them every day. One particularly interesting segment features Shimon, an artificial intelligence-wielding robot that can create and improvise music on the fly.
SCINQ caught up with Dr. Gil Weinberg from the Georgia Tech Center for Music Technology, the researcher responsible for bringing Shimon to life.
Can you describe how Shimon works in layman’s terms?
Shimon can compose, improvise, sing and rap. For composing and improvising, we train Shimon on datasets of transcribed music, from Mozart through Miles Davis to Lady Gaga and many more. Shimon uses Artificial Intelligence, in particular Machine Learning techniques, to learn these different styles and generate his own music based on what he learned. When Shimon composes, he can morph between the different styles he learned to create novel hybrid compositions that could push the envelope on musical expression and genre. When he improvises, Shimon can listen and respond to what he hears in real time and come up with relevant output that could be surprising and inspiring for his human collaborators. For singing and rapping, we train Shimon on datasets of lyrics. Here too, we are using different styles, from prog rock to jazz. Shimon can then generate new lyrics that he can sing with the band, or generate the lyrics in real time by listening to and understanding what a rapper says and responding accordingly.
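To make the idea of learning styles and morphing between them concrete, here is a minimal sketch. It is not Shimon's actual software, which the interview does not describe in detail. It swaps in a deliberately simple technique, a first-order Markov chain over note names, and the two tiny "style" datasets, the function names (`train`, `morph`, `generate`) and the `weight` parameter are all assumptions made up for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy "datasets": one short melody per style (invented for this sketch).
STYLE_A = [["C", "D", "E", "C", "D", "E", "G"]]
STYLE_B = [["C", "Eb", "G", "Bb", "C", "Eb", "F"]]

def train(melodies):
    """Count note-to-note transitions and normalise them into probabilities."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            counts[current][following] += 1
    return {
        note: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for note, c in counts.items()
    }

def morph(model_a, model_b, weight):
    """Blend two models: weight=0.0 gives model_a, weight=1.0 gives model_b."""
    blended = {}
    for note in set(model_a) | set(model_b):
        row_a, row_b = model_a.get(note, {}), model_b.get(note, {})
        # If only one style has seen this note, use that style's row unchanged.
        if not row_a or not row_b:
            blended[note] = dict(row_a or row_b)
            continue
        blended[note] = {
            nxt: (1 - weight) * row_a.get(nxt, 0.0) + weight * row_b.get(nxt, 0.0)
            for nxt in set(row_a) | set(row_b)
        }
    return blended

def generate(model, start, length, seed=None):
    """Sample a melody; stops early if it reaches a note with no known successor."""
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length and melody[-1] in model:
        row = model[melody[-1]]
        notes = sorted(row)
        melody.append(rng.choices(notes, weights=[row[n] for n in notes])[0])
    return melody

hybrid = morph(train(STYLE_A), train(STYLE_B), weight=0.5)
print(generate(hybrid, start="C", length=8, seed=1))
```

Sliding `weight` from 0.0 to 1.0 moves the generated melodies from one toy style toward the other. A real system would need far richer models and data than this.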
Playing musical improv with a band does not seem like it has anything to do with solving puzzles. Can you describe how the two are similar?
One way to look at musical improvisation as a puzzle is to think about what you play as a piece of a puzzle that has to fit with what the rest of the group are playing. Jazz improvisation in particular requires constantly listening to your peers and coming up with just the right response (“piece?”) that would be relevant yet interesting and unique. It’s like constantly looking for the right piece of the puzzle, although the puzzle is constantly shifting. Another difference is that with improvisation, there is not only one right piece that is the only solution for the puzzle. The goal of the improviser is to come up with the piece that would fit best for that moment, as judged by the collaborators and the audiences.
When creating and playing music for humans, what are the advantages of having a physical robot present to perform with and for humans?
One of our main guidelines for Shimon is that he should “Listen like a human but play like a machine.” For Shimon to “listen like a human,” we teach him elements of music that humans respond to and appreciate, such as beat, tension and release, dissonance and consonance, etc. This helps Shimon create a connection between what we as humans like and appreciate and how he understands music. But when Shimon improvises, he uses computational processes and mechanical abilities that humans do not possess, i.e., he “plays like a machine.” On the software side, he can use processes such as statistical analysis, genetic algorithms and fractals to create unique responses that you could not expect from humans, yet that will still be relevant and aesthetically pleasing, since they are based on Shimon’s “listening like a human” module. On the mechanical / physical side, Shimon has many more arms than humans (he has 8 strikers) and each arm can play much faster than any human (20 hits per second). This can lead to unique musical outcomes, novel timbres and inhuman sonorities that would inspire humans to collaborate with him.
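As an illustration of one of the processes named above, here is a minimal genetic-algorithm sketch. It is an assumption-laden toy, not Shimon's implementation: the `fitness` function is a made-up stand-in for a "listening like a human" score (it simply rewards notes in the C major scale and small melodic steps), and the MIDI pitch range, population size, mutation rate and function names are all hypothetical.

```python
import random

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # pitch classes of the C major scale

def fitness(phrase):
    """Toy stand-in for a 'listening' score: in-scale notes plus small steps."""
    in_scale = sum(1 for p in phrase if p % 12 in C_MAJOR)
    smooth = sum(1 for a, b in zip(phrase, phrase[1:]) if abs(a - b) <= 4)
    return in_scale + smooth

def evolve(length=8, population_size=30, generations=40, seed=0):
    """Evolve a phrase of MIDI pitches (60-72) toward a higher fitness score."""
    rng = random.Random(seed)
    population = [
        [rng.randint(60, 72) for _ in range(length)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]  # keep the fitter half
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, length - 1)
            child = mother[:cut] + father[cut:]  # one-point crossover
            if rng.random() < 0.3:  # occasional random mutation of one note
                child[rng.randrange(length)] = rng.randint(60, 72)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the fitter half of each generation is carried over unchanged, the best score never gets worse from one generation to the next. The interesting design question, and the hard part in practice, is what goes into the fitness function.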
Does Shimon improvise something new every time he performs? Do his algorithms give him a certain style?
Yes and yes. By morphing and combining different human-inspired styles, Shimon can create his own unique style. Moreover, his idiosyncratic mechanical abilities (8 strikers that can each play at up to 20 Hz) lead to a unique musical outcome that only he possesses.
Shimon plays jazz, a form that can be structurally complex, and pulls it off to a degree. But is he capable of taking something like rock music (often compared to playing “Twinkle, Twinkle, Little Star”) and giving it the intangible magic that humans do?
Yes, he can, and he has actually done so in the past. It all depends on what we train him on. If we train him on rock songs, he can play in that style. Also, just for fun, we pre-programmed Shimon to play one of the few rock songs that use marimba. Check it out here: https://open.spotify.com/track/7LKxS0TrqN4gjdu9Q87F7t?si=d9a802250f464081
Shimon can create and play music. The logical next step is to create good music consistently. How can you code for good music or a hit song? (This may be the mother of all puzzles.)
Yes, I agree. This puzzle has not been solved yet, although many have tried. It may be that we need more data to train the AI, or new machine learning models that use this data better. The one who cracks this puzzle is going to be a rich person :).
IMAGE CREDIT: National Geographic.