Questions tagged [language-model]

Language models are used extensively in Natural Language Processing (NLP); a language model is a probability distribution over a sequence of words or terms. Commonly, a language model is constructed to estimate the probability of a word given the $n-1$ preceding words. A popular choice is the n-gram model; its two simplest variants are the unigram and bigram models.

The unigram model (Bag of Words, n=1):

$P_{unigram}(w_1,w_2,w_3,w_4) = P(w_1)P(w_2)P(w_3)P(w_4)$

The bigram model (n=2):

$P_{bigram}(w_1,w_2,w_3,w_4) = P(w_1)P(w_2|w_1)P(w_3|w_2)P(w_4|w_3)$
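
Both models are commonly estimated by maximum likelihood from corpus counts: $P(w) = \frac{count(w)}{N}$ and $P(w_2|w_1) = \frac{count(w_1,w_2)}{count(w_1)}$. A minimal sketch in Python, using a hypothetical toy corpus (real models are estimated from large corpora and use smoothing for unseen n-grams):

```python
from collections import Counter

# Toy corpus (hypothetical); real language models are trained on large text collections.
corpus = "the cat sat on the mat the cat ate".split()

# Unigram model: P(w) estimated as count(w) / total number of tokens.
unigram_counts = Counter(corpus)
total = len(corpus)

def p_unigram(w):
    return unigram_counts[w] / total

# Bigram model: P(w2 | w1) estimated as count(w1, w2) / count(w1).
bigram_counts = Counter(zip(corpus, corpus[1:]))

def p_bigram(w2, w1):
    return bigram_counts[(w1, w2)] / unigram_counts[w1]

# Probability of the sequence "the cat sat" under each model, as in the formulas above.
seq = ["the", "cat", "sat"]

p_uni = 1.0
for w in seq:
    p_uni *= p_unigram(w)            # P(w1) P(w2) P(w3)

p_bi = p_unigram(seq[0])
for w1, w2 in zip(seq, seq[1:]):
    p_bi *= p_bigram(w2, w1)         # P(w1) P(w2|w1) P(w3|w2)

print(p_uni, p_bi)
```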

More sophisticated methods for constructing language models also exist, such as exponential (maximum-entropy) models and neural network models.

170 questions
1
vote
1 answer

How can models like MosaicML's MPT-7B or Bloomberg's BloombergGPT take in so many tokens?

I've read the ALiBi paper, and I understand that these models add a bias to the query/key dot products. But from my understanding, when I build the actual model I give it N input nodes. When I train a model I give it vectors of…
Travasaurus
  • 113
  • 4
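
For context on the question above, a minimal sketch (assuming PyTorch; the function name and shapes here are illustrative, not taken from any of the cited models) of the ALiBi bias: a head-specific linear penalty on the attention logits that depends only on query/key distance rather than on a fixed set of learned position embeddings, which is why such models can accept sequences longer than those seen in training:

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """ALiBi: per-head linear biases added to the attention logits.

    bias[h, i, j] = -slope[h] * (i - j) for key positions j <= i;
    future positions (j > i) get zero here and are handled by the causal mask.
    """
    # Head-specific slopes form a geometric sequence, as in the ALiBi paper.
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)   # (seq_len, seq_len)
    return -slopes[:, None, None] * distance                # (n_heads, seq_len, seq_len)

# Usage: add the bias to the raw query/key scores before the softmax, e.g.
# scores = q @ k.transpose(-2, -1) / d_head ** 0.5 + alibi_bias(n_heads, seq_len)
```
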
0
votes
1 answer

What languages does llama2 support?

Which languages does llama2 support? I looked at the docs and Hugging Face but couldn't find a list; they only say that usage in languages other than English is out of scope.
heyula
  • 37
  • 3
0
votes
0 answers

What is the best LLM that can be used on a single GPU?

I am interested in the best/state-of-the-art Large Language Model that can run on a single GPU. I read that Falcon 7B is state-of-the-art. Is there anything better? Any data showing the pros and cons of different LLMs would be helpful. Thanks
Dion
  • 101