I've read http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf and https://medium.com/@gabrieltseng/interpreting-complex-models-with-shap-values-1c187db6ec83, which is essentially a summary of the first link.
From the first paper, I didn't really understand how SHAP values work or how they help us determine the importance of features. In the second article, the author takes a very simple decision tree and calculates the Shapley value of one feature for a specific training example. But it never says which value determines that feature's overall importance in the end (the mean of its Shapley values over all training examples? I don't know), or why this works at all.
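For reference, this is the classic Shapley value formula both sources build on, where $F$ is the set of all features and $f_S$ is the model evaluated with only the features in the subset $S$ present:

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}\left[f_{S \cup \{i\}}\big(x_{S \cup \{i\}}\big) - f_S(x_S)\right]$$

I can follow the arithmetic of this formula for a single example; it's the step from here to a global feature importance that I can't find spelled out.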
There is also a confusing difference between the two sources: the paper defines SHAP values as "Shapley values of a conditional expectation function of the original model", while the Medium article just uses plain Shapley values.
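If I read the paper correctly, that definition means the Shapley formula above is applied with the value function taken to be a conditional expectation of the model,

$$f_x(S) = E\big[f(x) \mid x_S\big],$$

rather than by actually retraining or re-evaluating the model on every feature subset. The Medium article never mentions this conditional expectation, which is part of what confuses me.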
I have read several academic papers and website articles, but none of them answered my question; most websites only deal with applying the shap framework anyway. If you can explain this or point me to a useful resource, I would appreciate it.
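To make my confusion concrete, here is my attempt at coding the raw Shapley formula in Python (just a toy sketch; `f` is a placeholder for however the prediction on a feature subset is supposed to be evaluated, which is exactly the part I don't understand):

```python
import itertools
import math

def shapley_value(f, x, i, n_features):
    """Brute-force Shapley value of feature i for input x.

    f(x, S) must return the model's prediction when only the features
    in subset S are "known" -- the paper says this should be the
    conditional expectation E[f(x) | x_S], but I don't see how the
    Medium article's decision-tree walkthrough computes that.
    """
    others = [j for j in range(n_features) if j != i]
    phi = 0.0
    # Sum the weighted marginal contribution of feature i over
    # every subset S of the remaining features.
    for size in range(len(others) + 1):
        for subset in itertools.combinations(others, size):
            S = set(subset)
            weight = (math.factorial(len(S))
                      * math.factorial(n_features - len(S) - 1)
                      / math.factorial(n_features))
            phi += weight * (f(x, S | {i}) - f(x, S))
    return phi
```

Even if this is right for one training example, I still don't see which aggregate of these $\phi_i$ values is supposed to be "the" importance of a feature.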