Cartoon: Testing times
“Testing times” by Iantoons
This new cartoon illustrates that, despite their intellectual sophistication, AI models still make mistakes just as humans do; they simply make different kinds of ‘dumb’ mistakes.
The IQ test was invented in 1905 to assess a range of cognitive skills, including reasoning, problem-solving, and understanding complex ideas. With the rollout of AI models, researchers are trying to work out where this new technology sits on the scale of human IQ.
One such study, by Maxim Lott, found that language models are now scoring over 100 IQ points, which is considered average human intellectual ability.
He found that some models performed better than others. For example, Anthropic’s Claude 3 did better than most other models, and Lott predicts that in another four to ten years (with the launch of Claude 6), the model should get all the IQ questions right and “be smarter than just about everyone”.
The team at OpenAI has started to think about all of this and recently announced a series of levels intended to help translate this intelligence into real-world applications. Level 1, where we are today with ChatGPT 4, is the world of chatbots: AI with conversational intelligence.
Moving up the levels, we go to models that are ‘Reasoners’, then ‘Agents’, then ‘Innovators’. Finally, Level 5 is bots self-managing as standalone ‘Organisations’. Some companies are already working on Level 3+ models that can perform multi-step tasks on behalf of humans, such as booking an entire vacation or working through a complicated coding problem on their own.
However, as Andrew Ng, prominent AI researcher and co-founder of Google Brain, states: “AI is incredibly powerful, but it still lacks a deep understanding of context that humans possess naturally.”
You can find more Iantoons cartoons here.