Reading Stephen Wolfram's explanation of ChatGPT, it sounds as if you first train a very powerful "autocomplete" function that doesn't know anything specifically about chatbots, and then you train it further on a corpus of chatbot dialogs to show it how to be a chatbot specifically. I'm curious if anyone can explain in a bit more detail how this process works technically, i.e., taking an already-trained model and "specializing" it with a second training corpus. How are the two neural networks related to one another?
- It's just more training. There's no "end of training process" that stops you from doing any more training after that; you can just keep doing even more training. There's probably some kind of secret sauce in ChatGPT, but this isn't it. – user253751 May 09 '23 at 12:56
- ChatGPT is a decoder-only transformer model. For a full (hopefully not too technical) explanation, please read this extensive post. It also goes into how the transformer is trained (in two stages). – Robin van Hoorn May 09 '23 at 20:09
1 Answer
ChatGPT also derives from InstructGPT, which undergoes reinforcement learning from human feedback (RLHF) in order to enable instruction following.
Basically, to get ChatGPT you take a GPT model, pre-train it to predict the next token in a sequence (so that it learns the structure of the language), and then apply RLHF so that it follows the instructions specified in a prompt. Otherwise it would just give you random (but plausible) completions of sentences.
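To answer the "how are the two networks related" part of the question directly: there is only one network. Fine-tuning simply resumes gradient descent on the same weights with a different corpus. Here is a minimal toy sketch of that idea (a softmax-regression "language model" with a handful of made-up dimensions, not a real transformer) where the fine-tuning stage reuses the pre-training loop unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a language model: one weight matrix mapping a context
# vector to next-token logits. (Hypothetical sizes, for illustration only.)
n_vocab, n_ctx = 8, 4
W = rng.normal(scale=0.1, size=(n_ctx, n_vocab))

def train(W, X, y, steps=200, lr=0.5):
    """Cross-entropy training loop; fine-tuning calls this same function."""
    for _ in range(steps):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        grad = p.copy()
        grad[np.arange(len(y)), y] -= 1.0      # d(cross-entropy)/d(logits)
        W -= lr * (X.T @ grad) / len(y)
    return W

# Stage 1: "pre-training" on a generic corpus (random toy data here).
X_pre = rng.normal(size=(64, n_ctx))
y_pre = rng.integers(0, n_vocab, size=64)
W = train(W, X_pre, y_pre)

# Stage 2: "fine-tuning" is literally more training on the same weights,
# just with a second, task-specific corpus (e.g. chatbot dialogs).
X_chat = rng.normal(size=(16, n_ctx))
y_chat = rng.integers(0, n_vocab, size=16)
W_before = W.copy()
W = train(W, X_chat, y_chat, steps=100)
```

So "the two neural networks" are the same object at two points in time: pre-training sets the initial values of the weights, and specialization nudges those same weights further.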
The power of ChatGPT is the combination of a very large model (billions of parameters), a huge dataset, and a large collection of human-annotated prompts and responses for the instruction-following part.
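To give a flavor of how those human annotations are used: in the RLHF pipeline, labelers rank pairs of model outputs, and a reward model is trained so that the preferred ("chosen") response scores higher than the "rejected" one via a pairwise loss, -log σ(r_chosen - r_rejected). Here is a hedged toy sketch with a linear reward model over made-up response features (synthetic data, not the actual OpenAI setup):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy reward model: a linear score over response features.
n_feat = 6
w = np.zeros(n_feat)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic preference pairs: one chosen and one rejected response per prompt.
chosen = rng.normal(loc=0.5, size=(32, n_feat))
rejected = rng.normal(loc=-0.5, size=(32, n_feat))

for _ in range(300):
    margin = (chosen - rejected) @ w          # r(chosen) - r(rejected)
    p = sigmoid(margin)                       # model's P(chosen preferred)
    # Gradient of the pairwise loss -log sigmoid(margin) w.r.t. w
    grad = -((1.0 - p)[:, None] * (chosen - rejected)).mean(axis=0)
    w -= 0.5 * grad
```

The trained reward model then scores the policy's outputs during the reinforcement-learning stage, standing in for a human rater.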

– Luca Anzalone