Back in May 2023, Meta released their Massively Multilingual Speech (MMS) model [github]. The model is significant because it supports 1,107 languages, including a vast number of low-resource languages, and Meta reports lower transcription error rates than OpenAI’s Whisper. For text-to-speech, MMS builds on VITS, an earlier model in Meta’s fairseq toolkit released in June 2021. VITS shipped with some impressive TTS demos, so I wanted to see how well they worked in my own hands and compare the new MMS models against the older VITS ones....
LangChain demo: Chat with Tesla Earnings Deck
Since LangChain seems like a fairly powerful way to recursively call OpenAI LLMs, I wanted to understand how this dark magic worked. I came across this gist by @virattt, in which he builds a simple chatbot for querying a Facebook earnings PDF. That seemed like a good place to start, so I created my own adaptation that reproduces the simple chatbot, this time chatting with the Tesla Q1 2023 earnings deck....