One improvement I haven’t seen mentioned much is how ChatGPT’s spoken language keeps getting better across a wider and wider array of languages. Just a month ago, I tested it on Thai and Cantonese, and its spoken output was terrible. But now it sounds pretty good. It’s really amazing how much progress has been made in such a short time.
It sounds like they were able to do this training using thousands of hours of audio clips from a wide range of sources. They don’t get into the details about what they used, but if it is indeed only thousands of hours, that means it will be possible to train ChatGPT on more diverse languages.
For example, there are thousands of clips available online in other Chinese dialects like Sichuanese and Shanghainese, or perhaps in the Isaan language of northeastern Thailand, so it may be just a matter of time before the number of languages and dialects available for ChatGPT expands, and that will be awesome!