Why does current artificial intelligence research train each ability separately? For example, visual ability, speech recognition, and natural language understanding are each trained on their own. Why not train these abilities together, at the same time? Does anyone think that would lead to better AI?
Viewed 1,154 times
5
-
2 The short answer is that it isn't. – Andy Feb 24 '23 at 14:13
-
1 As the current answer mentions, Multimodal Learning seems to fit this, but maybe Ensemble Learning could fit too, depending on how it's set up. – Nordine Lotfi Feb 24 '23 at 14:46
-
1 @NordineLotfi Ensemble learning is where (to first approximation) you train multiple systems on the same data and they "vote" on the output. Multimodal learning is the correct term for what huang is asking about. – Ray Feb 24 '23 at 21:20
-
I agree, which is why I said "maybe" since I wasn't sure if there was a better term than this one :) @Ray – Nordine Lotfi Feb 24 '23 at 21:39
1 Answer
14
There is a large field of AI that does exactly this. It is called multimodal learning, and it has been a very active research area, especially in the last few years.
For more information see: https://en.m.wikipedia.org/wiki/Multimodal_learning
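To make the idea concrete, here is a minimal sketch of one common multimodal setup: each modality gets its own encoder, the features are concatenated, and a single classifier is trained on the joint vector. Everything in it is an assumption for illustration only: the toy encoders `encode_image` and `encode_text`, the four made-up samples, and the hyperparameters. Real systems use learned neural encoders and far more data.

```python
import math

def encode_image(pixels):
    """Toy image encoder (hypothetical): mean brightness and brightness range."""
    mean = sum(pixels) / len(pixels)
    return [mean, max(pixels) - min(pixels)]

def encode_text(text):
    """Toy text encoder (hypothetical): token count and mean token length."""
    tokens = text.split()
    return [float(len(tokens)), sum(len(t) for t in tokens) / len(tokens)]

def fuse(image_features, text_features):
    """Fusion by concatenation: one joint vector covering both modalities."""
    return image_features + text_features

def predict(weights, bias, features):
    """Logistic-regression head on top of the fused vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, steps=200, lr=0.01):
    """Full-batch gradient descent on log loss, over both modalities at once."""
    weights, bias = [0.0] * 4, 0.0
    n = len(samples)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * 4, 0.0
        for pixels, text, label in samples:
            x = fuse(encode_image(pixels), encode_text(text))
            err = predict(weights, bias, x) - label
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Made-up (pixels, caption, label) triples; label 1 means "daytime scene".
samples = [
    ([0.9, 0.8, 0.95], "a bright sunny beach", 1),
    ([0.1, 0.2, 0.15], "a dark night street", 0),
    ([0.8, 0.9, 0.7], "sunny field at noon", 1),
    ([0.2, 0.1, 0.05], "night sky with stars", 0),
]
weights, bias = train(samples)
```

The point of the sketch is only the shape of the pipeline: the classifier sees image and text features together, so it can in principle learn interactions between them, which separately trained single-modality models cannot.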

chessprogrammer
-
4 Text-to-image generators are one good recent example of multimodal ML.
It may also be worth explaining why this is trickier to train: it is usually limited by the availability of training data correctly set up for multimodal learning, whilst auto-regressive and semi-supervised ML on a single input modality can often construct training data from many publicly available sources.
– Neil Slater Feb 24 '23 at 16:10 -
Thanks for your answer, the "multimodal learning" you mentioned is quite close to my question (subjectively). But I think the question is pretty open, so I'm not going to accept this answer immediately; I'd like to see whether other perspectives come up. – huang Feb 24 '23 at 22:49