Supported query languages (116)
These are the languages that multilingual-e5-small (and its base model, microsoft/Multilingual-MiniLM-L12-H384) was trained on. Source here. The figure in parentheses is the amount of training data for that language; the more training data there is, the better the language should work.
af Afrikaans (305M)
am Amharic (133M)
ar Arabic (5.4G)
as Assamese (7.6M)
az Azerbaijani (1.3G)
be Belarusian (692M)
bg Bulgarian (9.3G)
bn Bengali (860M)
bn_rom Bengali Romanized (164M)
br Breton (21M)
bs Bosnian (18M)
ca Catalan (2.4G)
cs Czech (4.4G)
cy Welsh (179M)
da Danish (12G)
de German (18G)
el Greek (7.4G)
en English (82G)
eo Esperanto (250M)
es Spanish (14G)
et Estonian (1.7G)
eu Basque (488M)
fa Persian (20G)
ff Fulah (3.1M)
fi Finnish (15G)
fr French (14G)
fy Frisian (38M)
ga Irish (108M)
gd Scottish Gaelic (22M)
gl Galician (708M)
gn Guarani (1.5M)
gu Gujarati (242M)
ha Hausa (61M)
he Hebrew (6.1G)
hi Hindi (2.5G)
hi_rom Hindi Romanized (129M)
hr Croatian (5.7G)
ht Haitian (9.1M)
hu Hungarian (15G)
hy Armenian (776M)
id Indonesian (36G)
ig Igbo (6.6M)
is Icelandic (779M)
it Italian (7.8G)
ja Japanese (15G)
jv Javanese (37M)
ka Georgian (1.1G)
kk Kazakh (889M)
km Khmer (153M)
kn Kannada (360M)
ko Korean (14G)
ku Kurdish (90M)
ky Kyrgyz (173M)
la Latin (609M)
lg Ganda (7.3M)
li Limburgish (2.2M)
ln Lingala (2.3M)
lo Lao (63M)
lt Lithuanian (3.4G)
lv Latvian (2.1G)
mg Malagasy (29M)
mk Macedonian (706M)
ml Malayalam (831M)
mn Mongolian (397M)
mr Marathi (334M)
ms Malay (2.1G)
my Burmese (46M)
my_zaw Burmese (Zawgyi) (178M)
ne Nepali (393M)
nl Dutch (7.9G)
no Norwegian (13G)
ns Northern Sotho (1.8M)
om Oromo (11M)
or Oriya (56M)
pa Punjabi (90M)
pl Polish (12G)
ps Pashto (107M)
pt Portuguese (13G)
qu Quechua (1.5M)
rm Romansh (4.8M)
ro Romanian (16G)
ru Russian (46G)
sa Sanskrit (44M)
si Sinhala (452M)
sc Sardinian (143K)
sd Sindhi (67M)
sk Slovak (6.1G)
sl Slovenian (2.8G)
so Somali (78M)
sq Albanian (1.3G)
sr Serbian (1.5G)
ss Swati (86K)
su Sundanese (15M)
sv Swedish (21G)
sw Swahili (332M)
ta Tamil (1.3G)
ta_rom Tamil Romanized (68M)
te Telugu (536M)
te_rom Telugu Romanized (79M)
th Thai (8.7G)
tl Tagalog (701M)
tn Tswana (8.0M)
tr Turkish (5.4G)
ug Uyghur (46M)
uk Ukrainian (14G)
ur Urdu (884M)
ur_rom Urdu Romanized (141M)
uz Uzbek (155M)
vi Vietnamese (28G)
wo Wolof (3.6M)
xh Xhosa (25M)
yi Yiddish (51M)
yo Yoruba (1.1M)
zh-Hans Chinese (Simplified) (14G)
zh-Hant Chinese (Traditional) (5.3G)
zu Zulu (4.3M)
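
To compare languages by their amount of training data, entries of the form "code Name (size)" can be parsed and ranked. The following is a minimal Python sketch, not part of multilingual-e5-small or any library. The names `parse_entry` and `rank_by_training_data` are hypothetical, and the sketch assumes that K, M, and G are decimal multipliers (the list does not say whether the sizes are decimal or binary).

```python
import re

# Matches one entry such as "af Afrikaans (305M)" or
# "my_zaw Burmese (Zawgyi) (178M)". Hypothetical pattern for illustration.
ENTRY = re.compile(
    r"(?P<code>[A-Za-z_-]+) (?P<name>.+?) \((?P<size>[\d.]+)(?P<unit>[KMG])\)$"
)

# Assumption: decimal multipliers; the source does not specify this.
UNIT = {"K": 1e3, "M": 1e6, "G": 1e9}


def parse_entry(entry):
    """Return (code, name, size) for a single list entry."""
    match = ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    size = float(match["size"]) * UNIT[match["unit"]]
    return match["code"], match["name"], size


def rank_by_training_data(entries):
    """Sort parsed entries from most to least training data."""
    return sorted((parse_entry(e) for e in entries), key=lambda t: t[2], reverse=True)


sample = [
    "sc Sardinian (143K)",
    "en English (82G)",
    "my_zaw Burmese (Zawgyi) (178M)",
    "zh-Hans Chinese (Simplified) (14G)",
]
ranked = rank_by_training_data(sample)
print([code for code, _, _ in ranked])
# prints ['en', 'zh-Hans', 'my_zaw', 'sc']
```

A ranking like this only reflects the amount of training data. It is a rough indicator of expected quality, not a measured one.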