Questions tagged [natural-language-processing]

For questions related to natural language processing (NLP), which is concerned with the interactions between computers and human (or natural) languages, in particular how to create programs that process and analyze large amounts of natural language data.

See: Natural language processing (NLP) at Wikipedia.

760 questions
10
votes
1 answer

Has anyone attempted to train an AI to learn all languages?

It seems that most projects attempt to teach the AI to learn individual, specific languages. It occurs to me that there are relations in written and spoken words and phrases across languages - most of us have a much easier time learning more…
mindplay.dk
  • 209
  • 1
  • 4
7
votes
1 answer

Is Sanskrit still relevant for NLP/AI?

I came across a news article from 2018 where the president of India was saying that Sanskrit is the best language for ML/AI. I have no idea what qualifications he has in either AI or Sanskrit to say this, but this idea has been floated earlier in…
Borun Chowdhury
  • 190
  • 1
  • 6
5
votes
1 answer

What algorithms does Stack Overflow use for classifying duplicate questions?

Can I get details about the algorithms used for classifying questions on Stack Overflow ("Questions that may already have your answer")? Most of the suggestions I get are not at all related to the question I intended to ask.
5
votes
3 answers

How can I determine if an input sentence is consistent with a certain subject?

How can I determine if an input sentence is consistent with a certain subject? For example, suppose I am given the following dataset. | Subject | User input | Output | |---------------|----------------------|--------| | Dog ownership…
bleand
  • 161
  • 2
4
votes
1 answer

How could you generate sentences from lists of facts?

Let's pretend we had a list of facts (similar to prolog tuples) that define some knowledge about some entities. e.g. doing(clean, data) done(collect, data) todo(train, model) todo(write, paper) What methods could I use to generate sentences…
4
votes
1 answer

Can I categorise the user input which I get as free text?

I am working on a project, wherein I take input from the user as free text and try to relate the text to what the user might mean. I have tried Stanford NLP which tokenizes the text into tokens, but I am not able to categorize the input. For…
Karan Khanna
  • 141
  • 4
4
votes
2 answers

Can a sentence have different parse trees?

I just read about the concept of a parse tree. In my understanding, a valid parse tree of a sentence needs to be validated by a linguistic expert. So, I concluded, a sentence only has one parse tree. But, is that correct? Is it possible a sentence…
malioboro
  • 2,819
  • 3
  • 21
  • 47
3
votes
1 answer

Are there strictly deterministic LLMs?

LLMs are understood to generate non-deterministic outputs. My question is whether there are LLMs out there that are capable of producing deterministic outputs for any given input given fixed parameters (e.g. temperature). I heard that llama.cpp -…
user599464
  • 131
  • 1
3
votes
2 answers

How can I generate natural language sentences given logical structures that contain the subject, verb and target?

I have a group of structures in a program that are very specific in their meaning, e.g. this is a piece of code randomItem = objects.concept.random("buyable") idea.example(objects.concept.random("family", "friend")).does({ action: "go", …
Onza
  • 139
  • 2
2
votes
1 answer

How can I build an AI with NLP that reads and understands documents?

I have to read a lot of papers, and I thought that I could use an AI to read and summarize them. Finding one that can understand what the papers are talking about seems like a lot to ask. I think I can use natural language processing. Is it…
VansFannel
  • 493
  • 2
  • 15
2
votes
1 answer

Phonetic similarity metric for NLP (English)

I am looking for similarity metrics of phonemes (expressed in IPA) in English. In other words, given two phonemes A and B (both written), I want to know how similar they are based on some metric, M. For example, M(ɒ, oʊ) would yield a higher score…
2
votes
1 answer

Tagging parts of speech when proper noun is a composite

By means of parts of speech tagging, words of a given sentence can be assumed to be noun/verb etc, but if the sentence is for instance: "My favourite book is harry potter and the prizoner of azkaban" note that the inputs I receive would be from a…
2
votes
1 answer

How can I identify bigrams and trigrams that represent concepts?

I have many text documents and I want to identify concepts in these documents in an unsupervised manner. One of my problems is that the concepts can be bigrams, trigrams, or even longer. So, for example, out of all the bigrams, how can I identify…
Haffi112
  • 121
  • 2
2
votes
0 answers

Is NLP likely to be sufficiently solved in the next few years?

The reason I am asking this question is that I am about to start a PhD in NLP. So I am wondering whether there will be as many job opportunities in research in industry as in academia in the future (~ 5 to 10 years), or would it be mostly a…
Ash
  • 21
  • 2
2
votes
1 answer

Why do you need to retrain GPT-2?

I'm following this tutorial, and I wonder why there is a training step. Why is it necessary? I thought the whole idea of GPT-2 is that you do not need to train it on a specific text domain, as it's already pre-trained on a large amount of data.
Maverick Meerkat
  • 412
  • 3
  • 11