I am trying to use mBART for multilingual translation (about 30 languages). My current pipeline uses langid to identify the language of each input, then loads mBART and translates the text based on the language code that was identified. The problem is that mBART uses an odd format for its language codes, for example:
en_XX -> English
hi_IN -> Hindi
ro_RO -> Romanian
Whereas langid outputs them in this format:
af, am, an, ar, as, az, be, bg, bn, br
I cannot seem to find any documentation on how to interpret the mBART language codes, and even the research paper does not explain the format.
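
To make the question concrete, here is a minimal sketch of the lookup I am after. It assumes that the part of an mBART code before the underscore is the same two-letter code that langid emits, which I have not been able to confirm from any documentation. The code list is also an assumption (the 25 codes I believe ship with the original mBART checkpoint), and `to_mbart_code` is a hypothetical helper name, not part of either library.

```python
from typing import Optional

# Assumption: these are the language codes of the original 25-language
# mBART checkpoint. Check them against the tokenizer you actually load.
MBART_CODES = [
    "ar_AR", "cs_CZ", "de_DE", "en_XX", "es_XX", "et_EE", "fi_FI",
    "fr_XX", "gu_IN", "hi_IN", "it_IT", "ja_XX", "kk_KZ", "ko_KR",
    "lt_LT", "lv_LV", "my_MM", "ne_NP", "nl_XX", "ro_RO", "ru_RU",
    "si_LK", "tr_TR", "vi_VN", "zh_CN",
]

# Assumption: the prefix before "_" matches langid's two-letter output,
# so "en_XX" is indexed under "en", "hi_IN" under "hi", and so on.
LANGID_TO_MBART = {code.split("_")[0]: code for code in MBART_CODES}


def to_mbart_code(langid_code: str) -> Optional[str]:
    """Hypothetical helper: return the mBART code for a langid code,
    or None when the language is not in the list above."""
    return LANGID_TO_MBART.get(langid_code.lower())


if __name__ == "__main__":
    for lang in ["en", "hi", "ro", "af"]:
        print(lang, "->", to_mbart_code(lang))
```

With something like this, the first element of the tuple returned by `langid.classify(text)` could be passed straight to `to_mbart_code`, and a `None` result would flag a language that the loaded mBART model does not cover.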