Most Popular

1500 questions
8
votes
2 answers

Image clustering by similarity measurement (CW-SSIM)

I'm trying to use scikit-learn and pyssim to cluster a set of images - fewer than 100. The end goal is to place the images into several buckets (clusters) according to the calculated similarity measure - CW-SSIM. The task seems to be trivial,…
Oleg Puzanov
  • 111
  • 1
  • 4
8
votes
4 answers

How to give names to topics created using LDA?

I have categorized 800,000 documents into 500 categories using Mahout topic modelling. Instead of representing each topic by its top 5/10 words, I want to infer a generic name for the group using any existing algorithm. For the…
adihere
  • 81
  • 1
  • 1
  • 2
8
votes
2 answers

How to teach neural network a policy for a board game using reinforcement learning?

I need to use reinforcement learning to teach a neural net a policy for a board game. I chose Q-learning as the specific algorithm. I'd like a neural net to have the following structure: layer - rows * cols + 1 neurons - input - values of…
Luke
  • 189
  • 1
  • 11
8
votes
1 answer

Why does a restricted Boltzmann machine (RBM) tend to learn very similar weights?

These are 4 different weight matrices that I got after training a restricted Boltzmann machine (RBM) with ~4k visible units and only 96 hidden units/weight vectors. As you can see, weights are extremely similar - even black pixels on the face are…
ffriend
  • 2,801
  • 16
  • 18
8
votes
4 answers

How to select a particular column in Spark (PySpark)?

testPassengerId = test.select('PassengerId').map(lambda x: x.PassengerId)
I want to select the PassengerId column and make an RDD of it, but .select is not working. It says: 'RDD' object has no attribute 'select'
dsl1990
  • 181
  • 1
  • 1
  • 2
8
votes
1 answer

Coreference Resolution for German Texts

Does anyone know a library for performing coreference resolution on German texts? As far as I know, OpenNLP and Stanford NLP are not able to perform coreference resolution for German texts. The only tool that I know is CorZu which is a python…
Pasmod Turing
  • 463
  • 2
  • 6
8
votes
1 answer

Where exactly does $\geq 1$ come from in SVMs optimization problem constraint?

I've understood that SVMs are binary, linear classifiers (without the kernel trick). They have training data $(x_i, y_i)$ where $x_i$ is a vector and $y_i \in \{-1, 1\}$ is the class. As they are binary, linear classifiers the task is to find a…
Martin Thoma
  • 18,880
  • 35
  • 95
  • 169
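
For the question above on the $\geq 1$ constraint, a short sketch of the usual rescaling argument may help. The symbols $w$, $b$ (hyperplane parameters) and $\gamma$ (smallest functional margin) are assumptions introduced here for illustration; they do not appear in the excerpt. If a hyperplane separates the data, the smallest value of $y_i(w^\top x_i + b)$ over the training set is some $\gamma > 0$. Scaling $w$ and $b$ by a positive constant does not move the hyperplane, so dividing both by $\gamma$ turns the bound into $1$:

$$
\gamma = \min_i \, y_i\,(w^\top x_i + b) > 0,
\qquad
\tilde{w} = \frac{w}{\gamma}, \quad \tilde{b} = \frac{b}{\gamma}
\quad\Longrightarrow\quad
y_i\,(\tilde{w}^\top x_i + \tilde{b}) \geq 1 \ \text{ for all } i .
$$

Under this normalization the closest points satisfy the constraint with equality and lie at distance $1 / \lVert \tilde{w} \rVert$ from the hyperplane, which is why maximizing the margin is commonly written as minimizing $\tfrac{1}{2}\lVert w \rVert^2$ subject to $y_i\,(w^\top x_i + b) \geq 1$. The value $1$ is a convention; any positive constant would give the same separating hyperplane.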
8
votes
2 answers

Machine Learning: Single input to variable number of outputs

Is there a machine learning algorithm that maps a single input to an output list of variable length? If so, are there any implementations of the algorithm for public use? If not, what do you recommend as a workaround? In my case, the input is a…
ricksmt
  • 183
  • 1
  • 5
8
votes
1 answer

Human recognition in images using a HOG descriptor and SVM classifier performs poorly

I'm using a HOG descriptor, coupled with an SVM classifier, to recognise humans in pictures. I'm using the Python wrappers for OpenCV. I've used the excellent tutorial at pyimagesearch, which explains what the algorithm does and furnishes hints on how…
martina.physics
  • 255
  • 2
  • 8
8
votes
1 answer

Dimensions of Transformer - d_model and depth

I'm trying to understand the dimensions of the Multihead Attention component in the Transformer, referring to the following tutorial: https://www.tensorflow.org/tutorials/text/transformer#setup There are 2 unknown dimensions - depth and d_model which I don't…
data_person
  • 255
  • 3
  • 11
8
votes
2 answers

Pylearn2 vs TensorFlow

I am about to dive into a long NN research project and wanted a push in the direction of Pylearn2 or TensorFlow. As of Dec 2015, has the community started to lean one direction or another? This link has given me concern about getting tied to…
user3155053
  • 183
  • 3
8
votes
1 answer

When do I have to use aucPR instead of auROC? (and vice versa)

I'm wondering whether, to validate a model, it is sometimes better to use aucPR instead of aucROC. Do these cases only depend on the "domain & business understanding"? Especially, I'm thinking about the "unbalanced class problem" where, it seems…
jmvllt
  • 619
  • 1
  • 8
  • 15
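
As a small illustration of the aucPR versus auROC question above, the sketch below computes both on a deliberately unbalanced toy set. Everything in it is an assumption made for illustration: the labels and scores are invented, `roc_auc` uses the pairwise-ranking definition, and `average_precision` is used as one common summary of the area under the precision-recall curve.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 2 positives among 100 examples.
labels = [1, 1] + [0] * 98
scores = [0.9, 0.6] + [0.8] * 4 + [0.1] * 94

auc_roc = roc_auc(labels, scores)
auc_pr = average_precision(labels, scores)
print(f"ROC AUC: {auc_roc:.3f}, average precision: {auc_pr:.3f}")
```

On this toy set only four negatives outrank a positive, so the ROC score stays close to 1, while the precision-based score drops noticeably because those four false positives are large relative to the two true positives.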
8
votes
5 answers

Best way to search for a similar document given the ngram

I have a database of about 200 documents whose ngrams I have extracted. I want to find the document in my database that is most similar to a query document. In other words, I want to find the document in the database that shares the most number of…
okebz
  • 113
  • 4
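
One simple way to read the n-gram question above is to rank documents by how many n-grams they share with the query. The sketch below assumes word bigrams, whitespace tokenization, and a tiny made-up corpus; the names `ngrams`, `most_similar`, and `docs` are hypothetical, and the asker's actual extraction method is not known.

```python
def ngrams(text, n=2):
    """Return the set of word n-grams in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def most_similar(query, documents, n=2):
    """Return the name of the document sharing the most n-grams with the query."""
    query_grams = ngrams(query, n)
    return max(
        documents,
        key=lambda name: len(query_grams & ngrams(documents[name], n)),
    )


# Hypothetical two-document database.
docs = {
    "doc_a": "the quick brown fox jumps over the lazy dog",
    "doc_b": "a slow green turtle walks under the bridge",
}

best = most_similar("quick brown fox sleeps", docs)
print(best)
```

Raw overlap counts favour long documents; dividing by the size of the union (Jaccard similarity) is a common alternative when document lengths differ a lot.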
8
votes
1 answer

What is the difference between Trax and TensorFlow?

What is the main difference between Trax and TensorFlow? Both are deep learning libraries developed by Google. https://github.com/google/trax https://github.com/tensorflow/tensorflow
Bala venkatesh
  • 391
  • 1
  • 3
  • 10
8
votes
3 answers

What is the use of user data collection besides serving ads?

Well, this looks like the most suitable place for this question. Every website collects user data, some just for usability and personalization, but the majority, like social networks, track every move on the web, some free apps on your phone…