Mistral AI unveiled Voxtral, its first open-source speech understanding model, on Tuesday. Voxtral takes audio as input: it can transcribe speech to text and also understand spoken content natively, for example to answer questions about a recording. It is available in 24-billion and 3-billion parameter sizes. Mistral emphasizes the model's accessibility, offering it as a free download and through a low-cost application programming interface (API), positioning Voxtral as a readily available, cost-effective option for speech-related applications.








