1

I have millions of rows of properly formatted data without labels, and I have to find the anomalies in it.

I have heard of Isolation Forest and Mahalanobis distance for identifying anomalies in unsupervised learning. Are these OK to try?

Are there any other techniques we can try?

Thanks

GSKR

1 Answer

1

You can try these techniques and many more: All anomaly detection techniques

As discussed in the article, these are outlier detection techniques. Are you looking for outliers? It is better to get some known abnormalities and build a classifier.

If a supervised approach is not possible, try one of these approaches:

ABOD for identifying abnormalities in high dimensional data

Should clustering be based on distance or density to find outliers (abnormalities)

Connectivity based outlier detection technique

There are other techniques such as PCA-based, regression-based, autoencoder, kNN, weighted kNN, and even self-organizing map (SOM) approaches. Let me know if you need more information.
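As a rough illustration of the kNN idea, here is a minimal sketch that scores each point by the distance to its k-th nearest neighbour, so isolated points get high scores. The function name `knn_distance_scores`, the value `k=5`, and the synthetic data are assumptions made for the example. It builds the full pairwise distance matrix, so it only suits a sample, not millions of rows.

```python
import numpy as np

def knn_distance_scores(X, k=5):
    """Score each row by its distance to its k-th nearest neighbour."""
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distances: O(n^2) memory, so use a sample only.
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbour.
    return np.sort(dists, axis=1)[:, k]

# Synthetic data (assumed): 200 ordinary points plus one far-away point.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 3))
odd = np.array([[8.0, 8.0, 8.0]])
X = np.vstack([normal, odd])

scores = knn_distance_scores(X, k=5)
most_anomalous = int(np.argmax(scores))  # index of the highest-scoring row
```

For data in the millions, a tree-based or approximate nearest-neighbour index would probably be needed instead of the dense matrix used here.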

Important: Know your abnormalities better before jumping to machine learning. In my experience, even a Q-Q plot or simply flagging data points more than 3 standard deviations from the mean can give better anomaly detection.
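The 3 standard deviation check is simple enough to try first. A minimal sketch for a single numeric variable follows; the function name `three_sigma_flags`, the column name `amounts`, and the synthetic values are assumptions for illustration.

```python
import numpy as np

def three_sigma_flags(values, threshold=3.0):
    """Return a boolean mask of values more than `threshold` SDs from the mean."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.abs(z) > threshold

# Synthetic data (assumed): 1000 typical values plus one extreme value at the end.
rng = np.random.default_rng(1)
amounts = np.append(rng.normal(100.0, 5.0, size=1000), 250.0)

flags = three_sigma_flags(amounts)
flagged_values = amounts[flags]
```

Note that extreme values inflate the mean and standard deviation they are measured against, so a median-based variant may be more robust on heavily contaminated data.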

Arpit Sisodia
  • 1
    Thanks for your response.

    I will check out the links.

    I am not looking for outliers. These data points relate to the behaviour of a certain loan process. There are almost 70 variables.

    How can I use an autoencoder for unsupervised anomaly detection? I came across it when dealing with dimensionality reduction.

    I will try to understand the abnormalities much better. Thanks for this suggestion.

    – GSKR Oct 27 '18 at 09:07
  • 1
    I would like to add one more piece of information: this problem is almost like credit card fraud detection. There are no outliers as such, only abnormal patterns of transaction. – GSKR Oct 27 '18 at 09:23
  • Autoencoders and PCA are used to transform data into a different number of dimensions (say, from m to n). One then reconstructs the original values (from n back to m) from the transformed values. The higher the reconstruction error, the more abnormal the point. – Arpit Sisodia Oct 28 '18 at 06:09
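The reconstruction-error idea from the last comment can be sketched with plain PCA, which is the linear counterpart of an autoencoder. This is only an illustration: the function name `pca_reconstruction_error`, the choice of 2 components, and the synthetic 6-variable data are assumptions, and an actual autoencoder would replace the two matrix products with a trained encoder and decoder.

```python
import numpy as np

def pca_reconstruction_error(X, n_components):
    """Project rows from m to n dimensions, map back, and return squared error per row."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]      # shape (n, m)
    Z = Xc @ W.T               # transform: m -> n dimensions
    X_hat = Z @ W + mean       # reconstruct: n -> m dimensions
    return ((X - X_hat) ** 2).sum(axis=1)

# Synthetic data (assumed): 6 observed variables driven by 2 hidden factors.
rng = np.random.default_rng(2)
latent = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ mixing + 0.05 * rng.normal(size=(500, 6))

# Row 0 breaks the usual relationship between variables without any
# single variable being extreme on its own.
X[0] += np.array([3.0, -3.0, 0.0, 0.0, 3.0, -3.0])

errors = pca_reconstruction_error(X, n_components=2)
suspect = int(np.argmax(errors))  # row with the highest reconstruction error
```

This matches the fraud-like case described in the comments: the suspect row is not an outlier in any one variable, but it does not fit the structure the other rows share, so it reconstructs poorly.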