Questions tagged [similarity]

288 questions
39
votes
4 answers

Applications and differences for Jaccard similarity and Cosine Similarity

Jaccard similarity and cosine similarity are two very common measures for comparing item similarities. However, I am not very clear about the situations in which one should be preferred over the other. Can somebody help clarify the differences of…
shihpeng
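
As a rough illustration of the distinction this question asks about, here is a minimal sketch, assuming items are represented either as sets (for Jaccard) or as count vectors (for cosine). The function names and the sample data are made up for illustration and do not come from the question or its answers.

```python
import math

def jaccard(a, b):
    """Size of the intersection over size of the union; 1.0 if both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)

def cosine(u, v):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical data: word counts over the vocabulary [apple, banana, cherry]
doc1 = [3, 1, 0]
doc2 = [1, 1, 0]
# The same two documents seen as sets of distinct words
set1 = {"apple", "banana"}
set2 = {"apple", "banana"}

print(jaccard(set1, set2))  # the sets are identical, so counts play no role
print(cosine(doc1, doc2))   # the counts differ, so this is below 1
```

In this sketch Jaccard only sees presence or absence, while cosine also reacts to how often each word occurs, which is one way to think about when each might fit.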
9
votes
1 answer

Similarity measure based on multiple classes from a hierarchical taxonomy?

Could anyone recommend a good similarity measure for objects which have multiple classes, where each class is part of a hierarchy? For example, let's say the classes look like: 1 Produce 1.1 Eggs 1.1.1 Duck eggs 1.1.2 Chicken eggs 1.2…
Dave Challis
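
One possible direction, sketched under assumptions: treat each dotted code (such as 1.1.1) as a path from the root, score two classes with a Wu-Palmer-style ratio, and compare multi-class objects by averaging each class's best match. The function names and the averaging rule are illustrative choices, not something the question specifies.

```python
def ancestors(code):
    """'1.1.2' -> ['1', '1.1', '1.1.2'] (path from the top-level class down)."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

def class_similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(common ancestor) / (depth(a) + depth(b))."""
    path_a, path_b = ancestors(a), ancestors(b)
    shared = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        shared += 1
    return 2 * shared / (len(path_a) + len(path_b))

def object_similarity(classes_a, classes_b):
    """Average, over both directions, of each class's best match in the other object."""
    if not classes_a or not classes_b:
        return 0.0
    best_a = [max(class_similarity(a, b) for b in classes_b) for a in classes_a]
    best_b = [max(class_similarity(b, a) for a in classes_a) for b in classes_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))

# Codes taken from the question's example: 1.1.1 Duck eggs, 1.1.2 Chicken eggs
print(class_similarity("1.1.1", "1.1.2"))
print(object_similarity(["1.1.1"], ["1.1.2", "1.1"]))
```

Classes that share a deeper ancestor score higher here, and classes under different top-level categories score 0.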
6
votes
2 answers

Calculating similarity where order matters

How can I calculate a similarity (coefficient) where the order of the items matters and something like the Jaccard index would not be useful? Specifically, I'm interested in comparing ingredients. Take a simplified apple pie ingredient list, for…
James S
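
To make the contrast concrete, here is a small sketch comparing a set-based Jaccard index with an order-sensitive score from Python's standard `difflib.SequenceMatcher`. Using `SequenceMatcher` is only one possible choice, and the ingredient lists below are hypothetical, not the ones from the question.

```python
from difflib import SequenceMatcher

def jaccard(a, b):
    """Set-based overlap: ignores the order of the items entirely."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def ordered_similarity(a, b):
    """difflib ratio: 2 * matched items / (len(a) + len(b)), with matches taken in order."""
    return SequenceMatcher(None, a, b).ratio()

# Hypothetical ingredient lists: same items, reversed order
pie_a = ["apples", "flour", "butter", "sugar"]
pie_b = ["sugar", "butter", "flour", "apples"]

print(jaccard(pie_a, pie_b))             # identical sets
print(ordered_similarity(pie_a, pie_b))  # penalised for the different order
```

The Jaccard index cannot tell the two lists apart, while the sequence-based ratio drops when the order changes.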
6
votes
3 answers

Similarity measure for ordered binary vectors

I would like to ask your opinion on how to choose a similarity measure. I have a set of vectors of length N, each element of which can contain either 0 or 1. The vectors are actually ordered sequences, so the position of each element is important.…
GdA
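
For position-aligned 0/1 vectors, two common starting points are the simple matching coefficient (which counts agreeing 0s as well as agreeing 1s) and a positional Jaccard score (which ignores positions where both are 0). The sketch below assumes equal-length vectors; the sample vectors are invented for illustration.

```python
def simple_matching(u, v):
    """Fraction of positions where the two equal-length 0/1 vectors agree."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if not u:
        return 1.0
    return sum(1 for x, y in zip(u, v) if x == y) / len(u)

def positional_jaccard(u, v):
    """Positions where both are 1, divided by positions where at least one is 1."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(u, v) if x == 1 or y == 1)
    return both / either if either else 1.0

# Hypothetical vectors of length N = 6
u = [1, 0, 1, 1, 0, 0]
v = [1, 1, 1, 0, 0, 0]

print(simple_matching(u, v))
print(positional_jaccard(u, v))
```

Which of the two fits better depends on whether a shared 0 should count as evidence of similarity in the data at hand.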
6
votes
1 answer

How to compute the Jaccard Similarity in this example? (Jaccard vs. Cosine)

I am trying to understand the difference between Jaccard and Cosine. However, there seems to be a disagreement in the answers provided in Applications and differences for Jaccard similarity and Cosine Similarity. I am seeking if anyone could step me…
jkyh
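
For reference, when both items are binary (sets A and B, or 0/1 vectors), the two measures can be written side by side. This is the standard textbook form, not the specific example from the question, which is truncated in this listing.

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}
        = \frac{|A \cap B|}{|A| + |B| - |A \cap B|},
\qquad
\cos(A, B) = \frac{|A \cap B|}{\sqrt{|A|\,|B|}}
```

As a small worked case with made-up sets, A = {1, 2, 3} and B = {2, 3, 4} give J = 2/4 = 0.5 and cos = 2/3. For binary data the cosine value is never smaller than the Jaccard value, since |A ∪ B| is at least as large as the square root of |A||B|.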
4
votes
1 answer

What is the difference between Latent and Explicit Semantic Analysis

I'm not quite sure what "latent" refers to in this context. In Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis they say "Our semantic analysis is explicit in the sense that we manipulate manifest concepts…
1
vote
2 answers

similarity measure with two features

I have a question concerning similarity measures. Suppose that we have a matrix M where M(i,j) is the similarity measure between user i and user j. Each user is characterised by: id-user | country | id-artist | id-track. For this I choose to use…
user17241
1
vote
2 answers

users' percentile similarity measure

Having n vectors of percentile ranks for a list of common users between group #1 and groups #2:n, e.g. vec1 = {0.25, 0.1, 0.8, 0.75, 0.5, 0.6}, vec2 = {0.35, 0.2, 0.6, 0.45, 0.2, 0.9}. The percentile ranks represent activity frequency within the…
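
One simple option, offered only as a sketch: because percentile ranks lie in [0, 1], one minus the mean absolute difference gives a score in [0, 1]. The function name and the choice of measure are assumptions; the two vectors are the ones quoted in the question.

```python
def mean_abs_similarity(u, v):
    """1 minus the mean absolute difference; 1.0 means identical percentile ranks."""
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of equal length")
    return 1.0 - sum(abs(x - y) for x, y in zip(u, v)) / len(u)

# Vectors quoted in the question
vec1 = [0.25, 0.1, 0.8, 0.75, 0.5, 0.6]
vec2 = [0.35, 0.2, 0.6, 0.45, 0.2, 0.9]

score = mean_abs_similarity(vec1, vec2)
print(round(score, 4))
```

A rank correlation would be another reasonable candidate if only the relative ordering of users matters.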
1
vote
0 answers

Looking for an algorithm that computes similarity between a phrase and possible combinations of tokens

I want to find the similarity between a phrase and the possible combinations of tokens that may form the phrase. For example, phrase = 'sea surface water', possible tokens = ['sea','surface land', 'surface water']. My approach: First, I generate combinations, c…
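
A brute-force sketch of the idea, assuming a word-level Jaccard score is an acceptable measure (the question's own scoring step is cut off in this listing, so this is not the asker's method). The function names are hypothetical; the phrase and tokens are the ones from the question.

```python
from itertools import combinations

def word_jaccard(a, b):
    """Jaccard index over the sets of whitespace-separated words."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

def best_combination(phrase, tokens):
    """Score every non-empty combination of tokens against the phrase; return the best."""
    best_score, best_combo = -1.0, ()
    for r in range(1, len(tokens) + 1):
        for combo in combinations(tokens, r):
            score = word_jaccard(phrase, " ".join(combo))
            if score > best_score:
                best_score, best_combo = score, combo
    return best_combo, best_score

phrase = "sea surface water"
tokens = ["sea", "surface land", "surface water"]

combo, score = best_combination(phrase, tokens)
print(combo, score)
```

The number of combinations grows exponentially with the number of tokens, so this only suits short token lists.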
1
vote
2 answers

Group similarity by high dimension vector comparison

I have a dataset of 256 rows with 61 columns/variables. Each row should be considered a vector of dimension 61. If I randomly split it, by rows, in 2 groups, how could I prove that the 2 groups are similar? The origin of the data is biomedical and…
RgrNormand
0
votes
1 answer

I'm building a movie recommendation system based on genre, where the user will enter his choice and will receive recommendations based on his choice:

Column name: "Genres" (this will contain the genres of movies). User genre: "Action Adventure". I want to perform cosine similarity on this data, compare the user genre with the genres of different movies, sort and display the names of movies having similar…
Pratik
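
A minimal sketch of the approach described, assuming each genre string is split on whitespace into a bag of words. The movie titles, the catalogue contents and the function names are all hypothetical; only the "Action Adventure" user input comes from the question.

```python
import math
from collections import Counter

def genre_cosine(a, b):
    """Cosine similarity between bag-of-words count vectors of two genre strings."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)  # Counter returns 0 for missing words
    norm_a = math.sqrt(sum(c * c for c in ca.values()))
    norm_b = math.sqrt(sum(c * c for c in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user_genres, catalogue):
    """Titles with a non-zero score, sorted from most to least similar."""
    scored = [(genre_cosine(user_genres, g), title) for title, g in catalogue.items()]
    return [title for s, title in sorted(scored, reverse=True) if s > 0]

# Hypothetical catalogue: titles and genre strings are made up for illustration
movies = {
    "Movie A": "Action Adventure",
    "Movie B": "Action Comedy",
    "Movie C": "Romance Drama",
}

print(recommend("Action Adventure", movies))
```

Multi-word genres such as "Science Fiction" would be split into separate words under this assumption, so a real dataset may need a different tokenisation.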