If you can read a traditional analog clock, then congratulations: you're smarter than artificial intelligence. AI is proving to be good at a lot of things, but reading an old-fashioned clock is not one of them. A team of researchers recently put multimodal large language models (MLLMs), AI models that can analyze different forms of media, to the test to see what it is about analog clocks that stumps our future technological overlords.
The team first trained four MLLMs on a dataset of analog clock images and asked the models to tell the time in a subset of them. Initially, all four failed to report the time accurately. Performance improved with additional training, and held up even when some previously unseen images were mixed in, but it dropped again when the models were tested on a completely new set of images. The team then presented the models with altered clock images: faces that had been distorted, or hands whose appearance had been changed, say by adding arrows at the ends. The models found these clocks even harder to read.
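To make the setup concrete, here is a minimal sketch of the kind of test stimulus involved: rendering a clock face for a given time, with an optional arrowhead-style alteration like the one described above. This is an illustration only, not the researchers' actual pipeline; the function and its parameters are hypothetical, and it assumes the Pillow imaging library.

```python
# Hypothetical sketch: render a simple analog clock for a given time,
# optionally marking the hand tips (a stand-in for the arrow-tipped
# hands used in the altered test images). Requires Pillow.
import math
from PIL import Image, ImageDraw

def draw_clock(hour, minute, size=256, arrow_tips=False):
    img = Image.new("RGB", (size, size), "white")
    d = ImageDraw.Draw(img)
    cx = cy = size // 2
    r = size * 0.45
    d.ellipse([cx - r, cy - r, cx + r, cy + r], outline="black", width=3)

    # Hand angles, measured clockwise from 12 o'clock.
    minute_angle = math.radians(minute * 6)                 # 360 deg / 60 min
    hour_angle = math.radians((hour % 12) * 30 + minute * 0.5)

    for angle, length, width in [(hour_angle, r * 0.5, 5),
                                 (minute_angle, r * 0.8, 3)]:
        tip = (cx + length * math.sin(angle), cy - length * math.cos(angle))
        d.line([(cx, cy), tip], fill="black", width=width)
        if arrow_tips:
            # Crude tip marker standing in for the arrowheads
            # described in the study's altered images.
            d.ellipse([tip[0] - 6, tip[1] - 6, tip[0] + 6, tip[1] + 6],
                      fill="black")
    return img

draw_clock(5, 50, arrow_tips=True).save("clock_0550_arrows.png")
```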
The models seriously struggled to do what humans find easy: “identify the clock hands, determine their orientations, and combine these observations to infer the correct time,” per the researchers. It seems AI has a hard time dialing in the spatial orientation of clock hands, especially when the hands have unusual features the models aren't used to. And the team found that the harder a model struggles to identify the hands, the more spatial errors it makes when trying to determine their positions.
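For a sense of the arithmetic the models fail at, here is that final “combine these observations” step written out: converting two detected hand orientations into a time reading. The function is hypothetical and assumes hand detection has already succeeded, which is exactly the part the study found the models get wrong.

```python
# Hypothetical sketch of the last inference step: given the two hand
# orientations (clockwise degrees from 12 o'clock), recover the time.
def angles_to_time(hour_angle_deg, minute_angle_deg):
    minute = round(minute_angle_deg / 6) % 60   # minute hand: 6 deg per minute
    hour = int(hour_angle_deg // 30) % 12       # hour hand: 30 deg per hour
    return (hour or 12, minute)

# At 5:50 the hour hand sits at 175 deg (5 * 30 + 50 * 0.5)
# and the minute hand at 300 deg (50 * 6).
print(angles_to_time(175.0, 300.0))  # -> (5, 50), i.e. 5:50
```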