I'm working with tree-based classifiers in scikit-learn (Decision Tree and Random Forest) for a classification task where the feature set is a mix of categorical (the majority) and numerical features. Since scikit-learn's tree models only accept numerical input, I used both LabelEncoder and OneHotEncoder from the framework to transform the categorical features into numerical ones. Comparing the performance metrics for the two, the results were similar, with the LabelEncoded data doing slightly better in processing time, resource consumption, and final accuracy.
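To show concretely what I did, here's a minimal sketch of the comparison (the data here is made-up toy data, and the column names are hypothetical, but the encoding steps mirror my pipeline):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Toy data: two categorical columns plus one numerical column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=200),
    "shape": rng.choice(["circle", "square"], size=200),
    "size": rng.normal(size=200),
})
y = rng.integers(0, 2, size=200)

# Variant 1: label-encode each categorical column in place
# (each category becomes an integer, which implies an order).
X_label = df.copy()
for col in ["color", "shape"]:
    X_label[col] = LabelEncoder().fit_transform(X_label[col])

# Variant 2: one-hot encode the categorical columns
# (each category becomes its own 0/1 indicator column).
ohe = OneHotEncoder()
X_onehot = np.hstack([
    ohe.fit_transform(df[["color", "shape"]]).toarray(),
    df[["size"]].to_numpy(),
])

clf = RandomForestClassifier(random_state=0)
print("label encoded :", cross_val_score(clf, X_label, y, cv=5).mean())
print("one-hot       :", cross_val_score(clf, X_onehot, y, cv=5).mean())
```

Note the difference in dimensionality: label encoding keeps 3 feature columns, while one-hot expands to 6 (3 colors + 2 shapes + 1 numeric), which is presumably why the label-encoded runs were faster for my mostly-categorical data.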
So my question is: is there anything fundamentally wrong with using LabelEncoder here? A number of posts online recommend against transforming categorical features with LabelEncoder, since it imposes an order on non-ordinal features. Even if it does, does that actually affect tree-based models in any way?
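To make the concern concrete, here's a small illustration of the implicit ordering: LabelEncoder assigns integers by sorted (alphabetical) order of the category labels, so a tree threshold split on the encoded column partitions categories by alphabet rather than by anything meaningful:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
codes = le.fit_transform(["red", "green", "blue", "green"])

# Classes are sorted alphabetically: blue -> 0, green -> 1, red -> 2
print(dict(zip(le.classes_, range(len(le.classes_)))))
print(codes)

# A tree split such as "color_encoded <= 1.5" would then group
# {blue, green} against {red}, a partition driven purely by the
# alphabetical integer assignment.
```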