Elon Musk’s xAI has introduced Grok 1.5 Vision, an enhanced version of its Grok 1.5 model with added computer vision capabilities, which allow the model to process images and answer questions about them. xAI made the announcement via its official account, sharing benchmark scores and details about the new model. Grok 1.5 Vision was tested on several benchmarks: it outperformed OpenAI’s GPT-4 with Vision on RealWorldQA but scored lower on MMMU and ChartQA. Computer vision enables AI models to identify and understand real-world objects in images and videos, similar to human visual processing. The technology has wide-ranging applications, from calorie tracking and nutrition feedback to potential use in disease diagnosis and self-driving cars. The rise of multimodal AI has led various firms to focus more on vision-capable models, such as Google’s Gemini 1.5 Pro and OpenAI’s GPT-4 with Vision.
