Lead singer Sam Harris recognizes that we all have parts of ourselves we keep bottled up, and in the song, he explores the often euphoric (and occasionally messy) release of giving some light to our inner demons.
Throughout the album, Sam’s lyrics and storytelling interludes give voice to those dark thoughts inside his head—which got him thinking: what if he could actually put that voice inside the heads of his fans? That thought brought X Ambassadors to Microsoft, and it’s the basis of their latest tech collaboration.
Working closely with Microsoft engineers, Sam created a Custom Neural Voice, a highly accurate AI-powered voice model that’s a synthetic recreation of his own voice. Sam assumed the role of his voice-inside-your-head character, “The Shadow,” and recorded hundreds of lines of dialogue in his home studio. Those recordings were then uploaded to Microsoft Azure Speech Studio to train the AI, crafting a voice that could read back any text as if it were the voice inside Sam’s head.
I like the idea of being able to use something like AI because it’s still a bit of uncharted territory.
Listen to Sam Harris recording his voice to train the Custom Neural Voice.
I read hundreds of lines of text in the tone of voice of “The Shadow” for the AI to map to my voice. I was so surprised at how accurately it captured my cadence.
This is the Custom Neural Voice, trained on Sam Harris’ recordings, reading back the same lines.
Now, “The Shadow” just needed something to say. Since the voice grew out of the album’s storytelling, the band thought having it talk to fans about The Beautiful Liar’s backstory would be a great place to start. But what would make the experience truly unique was a voice that didn’t just talk, but actually interacted with fans.
The result is a new web-based interactive experience for “Adrenaline.” When you first arrive and put on headphones, it feels like watching a run-of-the-mill lyric video. But when it’s suddenly interrupted by a playfully mysterious voice that seems to be coming from inside your head, you realize this is something entirely different. The voice says it has something to tell you about the song and starts asking you questions. You respond naturally, and the voice banters back, an interaction powered by Microsoft Azure’s Speech-to-Text service. This plays out at key moments scattered across the song, where the voice cuts in to share behind-the-scenes anecdotes and commentary on the song’s themes and references, with the precise path through the conversation determined by fans’ responses. There’s no single path through “Adrenaline,” and the voice-inside-your-head is full of surprises.
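For the technically curious, here is one way a response-driven conversation like this could be modeled. It is only a minimal sketch, not the band's or Microsoft's actual implementation: it assumes a speech-to-text service has already turned the fan's reply into a plain string, and the node names, keywords, and voice lines are all hypothetical placeholders.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One moment where the voice cuts in (all content here is hypothetical)."""
    line: str                                               # what the voice says
    branches: Dict[str, str] = field(default_factory=dict)  # keyword -> next node id
    fallback: Optional[str] = None                          # next node if no keyword matches


# Hypothetical script: a tiny tree of voice lines keyed by node id.
SCRIPT: Dict[str, Node] = {
    "intro": Node(
        "Want to hear where this song came from?",
        branches={"yes": "story", "yeah": "story", "no": "tease"},
        fallback="tease",
    ),
    "story": Node("(placeholder) A behind-the-scenes anecdote plays here."),
    "tease": Node("(placeholder) The voice carries on regardless."),
}


def heard_words(transcript: str) -> set:
    """Lowercase the transcript and split it into a set of words."""
    return set(re.findall(r"[a-z']+", transcript.lower()))


def next_node(current: str, transcript: str) -> Optional[str]:
    """Pick the next node id from the fan's transcribed reply."""
    node = SCRIPT[current]
    heard = heard_words(transcript)
    for keyword, target in node.branches.items():
        if keyword in heard:
            return target
    return node.fallback


def play(start: str, transcripts: List[str]) -> List[str]:
    """Walk the script, one transcribed reply per step, and return the path taken."""
    path = [start]
    current = start
    for transcript in transcripts:
        following = next_node(current, transcript)
        if following is None:  # reached a node with nowhere left to go
            break
        path.append(following)
        current = following
    return path
```

Calling `play("intro", ["Yeah, okay!"])` walks from the opening question to the anecdote, while an unrecognized reply falls through to the fallback branch, which is one simple way to make sure the voice always has something to say.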