Microsoft has unveiled two new additions to its Phi-4 family of small language models: Phi-4-multimodal, which integrates speech, vision, and text, and Phi-4-mini. In December 2024, Microsoft ...
Microsoft Corp. today expanded its Phi line of open-source language models with two new algorithms optimized for multimodal processing and hardware efficiency. The first addition is the text-only ...
AnyGPT is a new multimodal large language model (LLM) that can be trained stably without changing the architecture or training paradigm of existing LLMs. AnyGPT relies solely on data-level ...
Microsoft has introduced a new AI model that, it says, can process speech, vision, and text locally on-device using less compute capacity than previous models. Innovation in generative artificial ...
A little more than two months ago, OpenAI released GPT-4o, its newest and most powerful AI model and the first that the company trained natively to handle multimodal inputs and outputs (text, ...