I've got survey data that resembles:
|            | Q1a | Q1b | Q1c | Q2a | Q2b | Q2c | Classification |
| Respondent | 1   | 0   | 0   | 1   | 0   | 0   | Red            |
| Respondent | 0   | 0   | 1   | 1   | 0   | 0   | Green          |
| Respondent | 0   | 1   | 0   | 0   | 0   | 1   | Yellow         |
I am trying to predict the classification for new respondents. Currently I'm using a Naive Bayes classifier and getting pretty bad accuracy (~20%). I don't have much training data, and what I have is hand-scraped from non-standard sources (internal company procedures are a mess here).
I'm looking for other ways to predict the classification.
I'm thinking about assigning a weight to each question and somehow predicting the result from those weights, although I don't really know where to start learning how to do that, or whether it's appropriate for this data. I have very little background in this :(
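For what it's worth, the "weight per question" idea corresponds to a linear classifier such as multinomial (softmax) logistic regression, which learns one weight per question per class. Below is a minimal sketch of that technique in plain Python, not a tested solution for this data. It uses only the three example rows above as toy data; the function names (`softmax`, `train`, `predict`), the hyperparameters (`epochs`, `lr`, `l2`), and the new respondent at the end are all hypothetical choices made for illustration.

```python
import math

# Toy data copied from the three example rows above (real data would have more rows).
QUESTIONS = ["Q1a", "Q1b", "Q1c", "Q2a", "Q2b", "Q2c"]
X = [
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 1],
]
y = ["Red", "Green", "Yellow"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(classes, weights, bias, row):
    """Weighted sum of the answers, computed separately for each class."""
    return [bias[c] + sum(w * v for w, v in zip(weights[c], row)) for c in classes]


def train(X, y, epochs=500, lr=0.5, l2=0.01):
    """Fit one weight per (class, question) and one bias per class by gradient descent."""
    classes = sorted(set(y))
    weights = {c: [0.0] * len(X[0]) for c in classes}
    bias = {c: 0.0 for c in classes}
    for _ in range(epochs):
        for row, label in zip(X, y):
            probs = softmax(class_scores(classes, weights, bias, row))
            for c, p in zip(classes, probs):
                # Gradient of the cross-entropy loss with respect to this class's score.
                error = p - (1.0 if c == label else 0.0)
                for j, v in enumerate(row):
                    # The small L2 penalty keeps the weights from growing without bound.
                    weights[c][j] -= lr * (error * v + l2 * weights[c][j])
                bias[c] -= lr * error
    return classes, weights, bias


def predict(model, row):
    """Return the most probable class and the full probability table."""
    classes, weights, bias = model
    probs = softmax(class_scores(classes, weights, bias, row))
    best = max(range(len(classes)), key=lambda i: probs[i])
    return classes[best], dict(zip(classes, probs))


model = train(X, y)
# Hypothetical new respondent who only ticked Q1a.
label, probs = predict(model, [1, 0, 0, 0, 0, 0])
print(label, probs)
# Inspect the learned weight of each question for each class.
for c in model[0]:
    print(c, dict(zip(QUESTIONS, (round(w, 2) for w in model[1][c]))))
```

With very few rows per class a model like this can easily memorize the training set, so accuracy measured on held-out rows (or by cross-validation) would be a fairer comparison against the Naive Bayes result than accuracy on the training rows.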
Any ideas or tips on predicting the classification column with so little training data?