Imagine someone were to suddenly allocate you 100 new team members. They’re all bright and enthusiastic, and have good working knowledge of lots of topics. But they can also lack common sense, and often require very specific instructions, even for seemingly straightforward tasks.
On the plus side, they’re so quick to complete tasks that time isn’t really a limiting factor. On the downside, they sometimes make basic mistakes that go unnoticed or go in frustrating circles in their enthusiasm to finish each task they’re given. They’ve also memorised vast amounts of information from books and newspapers, but can be forgetful in the short term, and their attention may wander if given long or complex tasks.
What could you achieve with such a team?
This kind of situation captures the challenge of working out how to get the most out of language models like GPT-4. I’ve recently come to realise that lots of people I know haven’t used these tools much yet – if at all. Outside of more technical circles, I think a lot of the world still sees ChatGPT as something of a trivia bot: they might get it to write limericks, or generate fictional imagery, or recite random facts (which may or may not be true). Or, if they’ve used its ‘co-pilot’ form, people see GPT as a glorified autocomplete.
But I think a more useful way to think about emerging AI is to frame it as a teamwork problem. If you had 100 new team members with the above characteristics, it could be a massive boost to your work if you managed them well – or a total disaster if you didn’t.
Faced with this situation, there are a few approaches you could take:
1. Focus on simple, predictable tasks with relatively clear rules. Ask the AI team members to proofread sections of reports for obvious errors. Get them to summarise emerging news articles. Delegate the extraction and formatting of data in clunky documents. This minimises the chance that they’ll go too far off track, but it also limits the ambition of what you can do. So instead, you might try another approach…
2. Give them additional domain-specific information and context. Rather than just give them a task, you could provide some detailed additional background or previous examples to help them understand the nuances of the tasks at hand, and what success looks like. This could improve their ability to make appropriate decisions and reduce errors. But, of course, doing this for every single task and team member puts a lot of burden on you. It would be better if they could be successful at similar tasks in future…
3. Train them in specific roles. As in any team, diversification of skills and roles can lead to more efficient task distribution. Rather than giving AI team members (commonly known as ‘agents’) new instructions and tasks each time, you could instead teach each one to be strong in certain areas, then give them specialized tasks based on their tailored knowledge and skills, from coding to communication. But that still means you have to assign tasks, then check the results. Perhaps there’s a better way…
4. Get them to work together. Rather than having each AI agent report back individually, you could create a sequence where one AI’s output is the input for another. If you’re working with data, one AI could work on formatting, then pass to another for checking, then another for analysis, then another for summarising, then another for final review. Designed well, a team of agents could review and improve each other’s work, or even manage each other (there’s a rough sketch of this kind of hand-off below).
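To make the third and fourth approaches a little more concrete, here is a minimal sketch of that kind of hand-off using the OpenAI Python client. The model name, role prompts, sample data and the run_agent helper are all illustrative assumptions rather than a recommended setup; the point is simply that each agent is defined by a narrow role, and one agent’s output becomes the next agent’s input.

```python
# Minimal sketch of two role-specialised "agents" passing work between them.
# Assumes the openai Python package (v1+) and an OPENAI_API_KEY in the
# environment; model name, prompts and data are illustrative only.
from openai import OpenAI

client = OpenAI()

def run_agent(role_prompt: str, task: str) -> str:
    """Send one task to one 'agent', defined entirely by its role prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model; use whichever you have access to
        messages=[
            {"role": "system", "content": role_prompt},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

# Agent 1: cleans and formats raw notes into structured data.
formatter_prompt = "You turn messy field notes into clean CSV. Output only the CSV."
# Agent 2: reviews the first agent's output before it reaches a human.
reviewer_prompt = "You check CSV data for missing values or inconsistencies and report any problems."

raw_notes = "site A - 14 samples, ph 6.8 ; site B 9 samples ph7.1"

formatted = run_agent(formatter_prompt, raw_notes)          # first agent's output...
review = run_agent(reviewer_prompt, f"Review this CSV:\n{formatted}")  # ...becomes the second's input

print(formatted)
print(review)
```

Each stage could also be given extra background or worked examples (the second approach above), and the same pattern extends to longer chains, or to agents that review and correct one another’s work.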
So why is it so common for people to get stuck on the first or second approach above when exploring how to use AI?
In part, the problem comes down to an information bottleneck. There are certain tasks we know take lots of time and effort, and could be made more efficient and effective. But how do we give the right information to AI agents to perform the task properly, then make sure they are giving the correct information to each other – and giving accurate information back to us? In the past year, I’ve tried dozens of AI tools that promise to make common tasks easier (many of them are basically ChatGPT plus some additional instructions). But I generally find these plug-and-play tools struggle to do exactly what I want. Somehow, information and context are still getting lost along the way. Often, it feels easier and faster to just build my own version of what I need using general-purpose language models.
But it’s not just about information. There is also an imagination bottleneck when it comes to AI. There are some tasks we don’t yet realise could be made easier, or perhaps tasks we haven’t even thought to attempt yet. Often, what we’re good at – and why we’re good at it – comes down to domain-specific knowledge and how we can apply this knowledge creatively. So addressing this bottleneck will require reflection: what makes a task difficult? What makes a human good (or bad) at it?
If we can solve these challenges in science and research, it could have huge benefits. There are many, many scientific projects that I’ve never completed – or attempted – because it’s simply not feasible to do the tasks required with the person-time available. But with vast numbers of AI agents potentially available to support projects, time is no longer necessarily the limitation. Instead, the more immediate barrier is a collaborative one: it’s about information and imagination. If we want to make better use of AI, we therefore need to think more about teamwork.
I agree with this article, having struggled to use ChatGPT (and other LLM tools) effectively in my own work. I would be really interested in seeing some examples of when you've found them particularly useful, and workflows/prompts/etc that have been fruitful.