Questions tagged [unsupervised-learning]

Finding hidden (statistical) structure in unlabelled data, including clustering and feature extraction for dimensionality reduction.

Because the items are unlabelled, there's nothing that points toward the "correct" labels, as there is with supervised learning. Unsupervised learning uses methods like clustering and principal components analysis to discover structure.

Reference:
Wikipedia - Unsupervised learning

455 questions
4
votes
1 answer

Semi-supervised learning: only classify points with confidence above a threshold

I currently have a dataset with approximately 5% labelled points and 95% unlabelled. I would like to label some of the unlabelled points only if I am very confident, and leave the rest as NaN. Personally, I would like to use a random forest, but I am not…
Tank
  • 287
  • 1
  • 2
  • 9
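
To illustrate the thresholding idea in the excerpt above, here is a minimal sketch in plain Python. It assumes per-class probabilities are already available for each unlabelled point (for example, from the `predict_proba` method of a scikit-learn random forest trained on the labelled 5%). The function name `label_if_confident`, the class names, and the probability values are hypothetical and chosen only for illustration.

```python
import math

def label_if_confident(probabilities, classes, threshold=0.9):
    """Return one label per row, or NaN when the top probability is below threshold."""
    labels = []
    for row in probabilities:
        # Index of the most probable class for this point.
        best = max(range(len(row)), key=lambda k: row[k])
        labels.append(classes[best] if row[best] >= threshold else math.nan)
    return labels

# Hypothetical class probabilities for four unlabelled points.
probs = [
    [0.97, 0.03],
    [0.55, 0.45],
    [0.08, 0.92],
    [0.40, 0.60],
]
pseudo_labels = label_if_confident(probs, classes=["a", "b"], threshold=0.9)
print(pseudo_labels)  # ['a', nan, 'b', nan]
```

The threshold is a design choice: a higher value labels fewer points but with fewer mistakes, and the probabilities of a random forest may need calibration before they can be read as confidence.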
1
vote
0 answers

Benchmarking unsupervised learning by second stage classifiers?

Suppose we have a dataset $\{x^{(i)}, y^{(i)}\}_{i = 1}^N$ where $x^{(i)} \in \mathbb{R}^n$ and $y^{(i)} \in \{0, 1\}$ for simplicity. Our main goal is to apply some unsupervised learning algorithm on $x^{(i)}$ and interpret the results, which we…
RJTK
  • 111
  • 2
1
vote
2 answers

Unsupervised Clustering for n-length word arrays

I have a series of arrays: [Apple,Banana,Cherry,Date], [Apple,Fig,Grape], [Banana,Cherry,Date,Elderberry], [Fig,Grape], and I would like to build some clusters that associate the arrays into groups based on overlap. Group1: Array1 and Array3 as they have…
Jamie Dixon
  • 135
  • 4
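
One possible way to group such arrays by overlap, sketched here under assumptions not stated in the question: overlap is measured with Jaccard similarity (shared items divided by total distinct items), and two arrays are linked whenever their similarity reaches a threshold of 0.5. Linked arrays are then merged into groups. The function names and the threshold are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_by_overlap(arrays, threshold=0.5):
    """Group array indices whose pairwise Jaccard similarity links them together."""
    parent = list(range(len(arrays)))

    def find(i):
        # Follow parent pointers to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(arrays)), 2):
        if jaccard(arrays[i], arrays[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(arrays)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

arrays = [
    ["Apple", "Banana", "Cherry", "Date"],
    ["Apple", "Fig", "Grape"],
    ["Banana", "Cherry", "Date", "Elderberry"],
    ["Fig", "Grape"],
]
groups = group_by_overlap(arrays, threshold=0.5)
print(groups)  # [[0, 2], [1, 3]] (zero-based: Array1 with Array3, Array2 with Array4)
```

Because groups are formed by chaining links, two arrays can end up together without overlapping directly. A hierarchical clustering on the Jaccard distance matrix is an alternative if that behaviour is unwanted.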
1
vote
1 answer

Use 1 or 2 norm for Voronoi vector quantization?

I have a script from a lecture. Basically it says that based on the Voronoi partitioning we identify the corresponding (nearest) class $w_k$ to a vector $x$ where $\left| {{w_k} - x} \right| = \mathop {\min }\limits_i \left( {\left| {{w_i} - x}…
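
As a small illustration of why the choice of norm matters here, the sketch below assigns a vector $x$ to its nearest prototype $w_k$ under both the 1-norm and the 2-norm. The prototype values and function names are made up for the example; they are not taken from the lecture script.

```python
import math

def l1(u, v):
    """1-norm (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(u, v))

def l2(u, v):
    """2-norm (Euclidean) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_prototype(x, prototypes, dist):
    """Index k of the prototype w_k that minimises dist(w_k, x)."""
    return min(range(len(prototypes)), key=lambda k: dist(prototypes[k], x))

# Hypothetical prototypes w_0, w_1 and a query vector x.
prototypes = [(3.0, 3.0), (5.0, 0.0)]
x = (0.0, 0.0)

k_l1 = nearest_prototype(x, prototypes, l1)
k_l2 = nearest_prototype(x, prototypes, l2)
print(k_l1)  # 1, since the L1 distances are 6 and 5
print(k_l2)  # 0, since the L2 distances are about 4.24 and 5
```

The same point falls into different Voronoi cells depending on the norm, so the two choices define different partitions in general.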
1
vote
0 answers

Unsupervised latent truth discovery on text data

I have text information from several different sources. I need to identify the most reliable source in an unsupervised manner (no true/false labels or ground truth are available to train on). I think this field of work is sometimes referred to as…
Peter
  • 7,446
  • 5
  • 19
  • 49
0
votes
1 answer

Right way to find the Lorentzian distance between 2 points

Following this paper and this paper, I'm trying to implement the formula for the Lorentzian distance between 2 points (aka the distance between 2 points in Lorentzian space). I'll use this as the distance metric for a KNN classifier. According to the…
0
votes
1 answer

Is EDA mandatory for unsupervised problems?

I have seen many problem-solving videos in which the presenters do not perform Exploratory Data Analysis (EDA) for unsupervised problems. Is this correct? I feel EDA is necessary, as we are dealing with unsupervised problems. Please help me in understanding…
0
votes
1 answer

Restricted Boltzmann Machine (RBM) implementation in TensorFlow (TF) 2.x

I'm looking for a Python implementation of a Restricted Boltzmann Machine (RBM), e.g. applied to MNIST data as mentioned in "Elements of Statistical Learning" Ch. 17, in TensorFlow 2.x. I'm aware of code as linked here. However, the model(s) are…
Peter
  • 7,446
  • 5
  • 19
  • 49