A data scientist cloned his best friends’ group chat using AI

by Janice Allen

As data scientist Izzy Miller puts it, the group chat is “a sacred thing” in today’s society. Whether on iMessage, WhatsApp, or Discord, it’s the place where you and your best friends hang out, shoot the shit, and share updates about life, both trivial and memorable. In a world where we are increasingly bowling alone, can we at least complain to the group chat about how much bowling sucks these days?

“My group chat is a lifeline and a comfort and a point of connection,” Miller tells The Verge. “And I just thought it would be hilarious and a little sinister to replace it.”

Using the same technology that powers chatbots like Microsoft’s Bing and OpenAI’s ChatGPT, Miller created a clone of his best friends’ group chat — a conversation that has unfolded every day for the past seven years, since he and five friends first got together in college. It was surprisingly easy to do, he says: a project that took a few weekends of work and a hundred dollars to complete. But the end results are eerie.

“I was really surprised by how much the model inherently learned things about who we were, not just the way we talk,” says Miller. “It knows things about who we date, where we went to school, the name of the house we lived in, and so on.”

And in a world where chatbots are becoming increasingly ubiquitous and increasingly persuasive, the AI group chat experience might be one we’ll all be sharing soon.

The robo boys fight over who drank whose beer. No conclusions were drawn.
Image: Izzy Miller

A group chat built using a leaked AI powerhouse

The project was made possible by recent advances in AI, but it’s still not something just anyone could achieve. Miller is a data scientist who’s been toying with this kind of technology for a while — “I’ve got some head on my shoulders,” he says — and currently works at a startup called Hex, which happens to provide tooling suited to exactly this type of project. Miller described all the technical steps required to replicate the work in a blog post, where he introduced the AI group chat and dubbed it the ‘robo boys’.

Still, the creation of the robo boys follows a familiar path. It starts with a large language model, or LLM – a system trained on massive amounts of text scraped from the internet and other sources, which gives it broad but raw language skills. The model is then “fine-tuned,” meaning it’s fed a more focused dataset to replicate a specific task, such as answering medical questions or writing short stories in a specific author’s voice.
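The article doesn’t detail Miller’s training stack, but the general shape of such a fine-tuning run looks something like the sketch below, which uses the open-source Hugging Face transformers library with a small stand-in model (the model choice, file name, and hyperparameters are illustrative assumptions, not Miller’s settings):

```python
# A generic causal-LM fine-tuning sketch, not Miller's actual code.
# GPT-2 stands in for the LLaMA weights; "chat.txt" is assumed to be
# the group chat flattened into "Author: message" lines.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

dataset = load_dataset("text", data_files={"train": "chat.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="robo-boys", num_train_epochs=3),
    train_dataset=tokenized["train"],
    # mlm=False gives a next-token (causal) objective, with labels copied
    # from the inputs, which is what chat-style generation needs
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```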

Miller used 500,000 messages from his group chat to train a leaked AI model

In this case, Miller fine-tuned the AI system on 500,000 messages downloaded from his group’s iMessage thread. Sorting the messages by author, he prompted the model to mimic each member’s personality: Harvey, Henry, Wyatt, Kiebs, Luke, and Miller himself.
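As a rough illustration of that sorting-and-labeling step, here is a minimal sketch of how a chat export might be turned into speaker-labeled training examples (the data shape, function, and sample messages are hypothetical, not taken from Miller’s pipeline):

```python
# A minimal data-prep sketch, not Miller's actual pipeline. Assumes the
# messages have already been exported from the iMessage database into
# chronological (author, text) pairs.

def format_training_examples(messages, context_len=10):
    """Turn a chronological message list into prompt/completion pairs.

    Each example shows the model a window of recent chat, labeled by
    speaker, and asks it to produce the next message in that speaker's voice.
    """
    examples = []
    for i in range(context_len, len(messages)):
        context = "\n".join(
            f"{author}: {text}" for author, text in messages[i - context_len:i]
        )
        author, text = messages[i]
        examples.append({
            "prompt": context + f"\n{author}:",  # cue the model with the speaker's name
            "completion": " " + text,
        })
    return examples

chat = [
    ("Henry", "who drank my beer"),
    ("Wyatt", "wasn't me"),
    ("Izzy", "lol it was definitely Luke"),
    # ... ~500,000 more messages in the real dataset
]
print(format_training_examples(chat, context_len=2)[0])
```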

Interestingly, the language model Miller used to create the fake chat was created by Facebook owner Meta. This system, LLaMA, is about as powerful as OpenAI’s GPT-3 model and was the subject of controversy this year when it was leaked online a week after its announcement. Some experts warned that the leak would allow malicious actors to misuse the software for spam and other purposes, but no one suspected it would be put to this particular use.

Miller says he’s sure Meta would have given him access to LLaMA had he requested it through official channels, but using the leak was easier. “I saw [a script to download LLaMA] and thought, ‘You know, I think this is going to be removed from GitHub,’ so I just copied and pasted it and saved it to a text file on my desktop,” he says. “And lo and behold, five days later when I thought, ‘Wow, I have this great idea,’ the model had been DMCA’d off GitHub – but I still kept it.”

The project shows how easy it has become to build these kinds of AI systems, he says. “The tools to do this kind of thing are in such a different place than they were two, three years ago.”

In the past, creating a convincing clone of a group chat with six distinct personalities would have taken a whole research team months to complete. Now an individual with a little expertise and a small budget can build one for fun.

Miller was able to sort his training data by author and prompt the system to reproduce six different (more or less) personalities.
Image: Izzy Miller

Say hello to the robo boys

After training the model on the group chat’s messages, Miller connected it to a clone of Apple’s iMessage user interface and allowed his friends access. The six men and their AI clones were then able to chat with each other, with the AIs identified by the lack of a last name.

Miller was impressed with the system’s ability to copy his and his friends’ mannerisms. He says some of the conversations felt so real — like an argument over who drank Henry’s beer — that he had to search the group chat history to make sure the model wasn’t just reproducing text from its training data. (This is known as “overfitting” in the AI world and is the mechanism that can cause chatbots to plagiarize their source material.)
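A crude version of that check can even be automated. The sketch below is an illustration rather than Miller’s method (the function name and similarity threshold are assumptions): it fuzzy-matches a generated message against the real chat history to flag near-verbatim memorization:

```python
import difflib

# A rough overfitting spot-check, not Miller's actual method: compare a
# generated message against the real chat history to see whether the
# model is inventing text or just parroting its training data.

def closest_real_message(generated, history, threshold=0.9):
    """Return the most similar real message if it exceeds the threshold."""
    best = difflib.get_close_matches(generated, history, n=1, cutoff=threshold)
    return best[0] if best else None

history = ["who drank my beer", "i swear it was full an hour ago"]
match = closest_real_message("who drank my beer??", history)
if match:
    print(f"Possible memorization, closest real message: {match!r}")
else:
    print("No near-duplicate found in the chat history.")
```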

“There’s something wonderful about capturing your friends’ voices perfectly,” Miller wrote in his blog post. “It’s not really nostalgia since the conversations never happened, but it’s a similar sense of glee… This has really delivered more hours of deep fun for me and my friends than I could have imagined.”

“It’s not really nostalgia since the conversations never happened, but it’s a similar sense of glee.”

However, the system still has problems. Miller notes that the distinction between the six different personalities in the group chat can become blurred, and that a major limitation is that the AI model lacks a sense of chronology — it cannot reliably distinguish between past and present events (a problem that affects all chatbots to some degree). For example, past girlfriends can be mentioned as if they were current partners; ditto former jobs and homes.

Miller says the system’s sense of what is factual isn’t based on a holistic understanding of the chat but on the volume of messages. In other words, the more something is talked about, the more likely it is to be referenced by the bots. An unexpected outcome of this is that the AI clones behave as if they were still in college, because the group chat was most active at that time.

“The model thinks it’s 2017, and when I ask how old we are, it says we’re 21 and 22,” says Miller. “It’s going to go on tangents and say, ‘Where are you?’, ‘Oh, I’m in the cafeteria, come over.’ That doesn’t mean it doesn’t know who I’m currently dating or where I live, but if left to its own devices, it thinks we’re our college-age selves.” He pauses and laughs, “Which really adds to the humor of it all. It’s a window into the past.”

A chatbot in every app

The project illustrates the increasing power of AI chatbots and in particular their ability to reproduce the mannerisms and knowledge of specific individuals.

Although this technology is still in its infancy, we are already seeing the power these systems can exert. When Microsoft’s Bing chatbot launched in February, it delighted and shocked users with its “unhinged” personality. Seasoned journalists recorded conversations with the bot as if they had made first contact. That same month, users of the chatbot app Replika expressed outrage after the app’s creators removed its ability to engage in erotic role-play. Moderators of a user forum for the app posted links to suicide helplines to console distressed users.

It is clear that AI chatbots have the power to influence us as real people can, and they will play an increasingly prominent role in our lives, be it for entertainment, education, or something else entirely.

The bots try out a roast.
Image: Izzy Miller

When Miller’s project was shared on Hacker News, commenters on the site speculated on how such systems could be used for more ominous purposes. One suggested that tech giants holding vast amounts of personal data, such as Google, could use it to create digital copies of users. These could then be questioned in their place, perhaps by potential employers or even the police. Others suggested that the proliferation of AI bots could exacerbate social isolation: offering more reliable and less challenging forms of companionship in a world where friendships often happen online anyway.

Miller says this speculation is certainly interesting, but his experience with the group chat was more hopeful. As he explains it, the project only worked because it was an imitation of the real thing. It was the original group chat that made it all fun.

“What I noticed when we were playing around with the AI bots was that whenever something really funny happened, we would take a screenshot of it and send that to the real group chat,” he says. “Even though the funniest moments were the most realistic, there was this feeling of ‘oh my god, this is so funny I can’t wait to share it with real people.’ Much of the joy came from having the fake conversation with the bot, then grounding that into reality.”

In other words, the AI clones can replicate real humans, he says, but not replace them.

He adds that he and his friends – Harvey, Henry, Wyatt, Kiebs, and Luke – are planning to meet in Arizona next month. The friends now live across the US, and it will be the first time they’ve reunited in a long while. The plan, he says, is to put the mock group chat on a big screen so the friends can watch their AI replicas tease and harass each other while they do the exact same thing.

“I can’t wait to all sit around and have some beers and play with this together.”
