Back in May 2023, Meta released their Massively Multilingual Speech (MMS) model [github]. The model is significant because it supports 1,107 languages, including a vast number of low-resource languages, and Meta reports lower transcription error rates than OpenAI’s Whisper. For text-to-speech, MMS builds on VITS, an earlier model in Meta’s fairseq toolkit released in June 2021. VITS shipped with some impressive TTS demos, so I wanted to see how well they worked in my own hands and compare the new MMS models against the older VITS ones....
LangChain demo: Chat with Tesla Earnings Deck
Since LangChain seems like a fairly powerful way to recursively call OpenAI LLMs, I wanted to understand how this dark magic worked. I came across this gist by @virattt, in which he builds a simple chatbot for querying a Facebook earnings PDF. That seemed like a good place to start, so I created my own adaptation that reproduces the simple chatbot, this time chatting with the Tesla Q1 2023 earnings deck....