How to make a decision based on user reports

Question

I have user reports about an accident. I want to know how to make sure that the number of reports is high enough to treat the accident as real rather than spam.

My idea is to require a minimum number of reports within a specific time interval. For example, 4 reports in 20 minutes would be enough to believe that the accident happened.

My question is: how can I choose the minimum number of reports and the time interval? Is there another way to make this decision? I would appreciate your answers.

Do you have data with true accident/spam labels? You would need some basis to decide how good a specific threshold is. Do you have other data points relating to the accident - descriptions, etc.? — raghu, Apr 30 '18 at 17:51

score 1 · Answer 1 · answered Oct 31 '17 at 13:24

You don't need a prediction model for this. A model might make sense if you had additional data about the users, but without anything else you just need labeled data: historical reports for which you know whether the accident was real or not.

Once you have labeled data, you can follow a process like the one below, although the details depend heavily on the kind of data you have.

Iterate over your labeled dataset and calculate how accurately each combination of time interval (5, 10, 15, 20, 25, 30, ... minutes) and minimum number of reporting users (1, 2, 3, 4, 5, 6, 7, ...) identifies real accidents.

This gives you a 2D matrix of accuracies. I assume acting fast on an accident is important in your case, so set an acceptable accuracy and choose the combination with the smallest interval that is above that accuracy.
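A minimal sketch of that process in Python, assuming each historical event is stored as a list of report times (minutes after the first report) plus a real/spam label. The `events` data, the field names, and the helper functions `predict`, `accuracy`, and `choose` are all made up for illustration, and plain accuracy could be swapped for precision or a cost-weighted score if false alarms and missed accidents are not equally bad.

```python
from itertools import product

# Hypothetical labeled history: report times in minutes after the first
# report, and whether the event turned out to be a real accident.
events = [
    {"report_minutes": [0, 3, 7, 12], "real": True},
    {"report_minutes": [0, 2, 4, 5, 9], "real": True},
    {"report_minutes": [0, 18], "real": False},
    {"report_minutes": [0], "real": False},
    {"report_minutes": [0, 6, 25], "real": True},
    {"report_minutes": [0, 1], "real": False},
]


def predict(report_minutes, window, min_reports):
    """Flag an event as real if at least `min_reports` reports arrive
    within `window` minutes of the first report."""
    in_window = sum(1 for t in report_minutes if t <= window)
    return in_window >= min_reports


def accuracy(events, window, min_reports):
    """Fraction of labeled events that the rule classifies correctly."""
    correct = sum(
        predict(e["report_minutes"], window, min_reports) == e["real"]
        for e in events
    )
    return correct / len(events)


windows = [5, 10, 15, 20, 25, 30]            # candidate intervals (minutes)
min_reports_options = [1, 2, 3, 4, 5, 6, 7]  # candidate report counts

# The "2D matrix": accuracy for every (window, min_reports) combination.
grid = {
    (w, n): accuracy(events, w, n)
    for w, n in product(windows, min_reports_options)
}


def choose(grid, target):
    """Return the (window, min_reports) pair with the smallest window whose
    accuracy is at least `target`, or None if no combination qualifies.
    Ties are broken by higher accuracy, then by fewer required reports."""
    candidates = [key for key, acc in grid.items() if acc >= target]
    if not candidates:
        return None
    return min(candidates, key=lambda key: (key[0], -grid[key], key[1]))


best = choose(grid, target=0.8)
print(best, grid[best] if best else None)
```

With real data you would compute the grid on one part of the history and check the chosen combination on a held-out part, since a threshold tuned on six toy events says nothing about how it generalizes.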

How is that not a predictive model? You're constructing a classifier subject to two features and picking a decision threshold relative to a cost function (risk aversion). — David Marx, Mar 30 '18 at 18:14
