Suppose we have a dataset $\{x^{(i)}, y^{(i)}\}_{i = 1}^N$ where $x^{(i)} \in \mathbb{R}^n$ and $y^{(i)} \in \{0, 1\}$ for simplicity.
Our main goal is to apply some unsupervised learning algorithm to each $x^{(i)}$ and interpret the results, which we can call $u^{(i)}$. I have in mind applications where the unsupervised algorithm is designed to infer something intangible and unobservable that nonetheless has a meaningful physical interpretation, for example ICA as a means of recovering independent sources.
Benchmarking the unsupervised algorithm in a meaningful way is difficult since we have no ground truth to compare $u^{(i)}$ against. My idea for adding another perspective to this problem is to train a supervised classifier on the derived dataset $\{u^{(i)}, y^{(i)}\}_{i = 1}^N$, i.e. to use the results of the unsupervised algorithm as features. If a classifier trained on this derived dataset performs well, that would provide some evidence that the results $u^{(i)}$ actually capture meaningful structure in each $x^{(i)}$. If $u^{(i)}$ were essentially spurious, it shouldn't be possible to train this second classifier to do better than chance.
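Concretely, I imagine something like the following minimal sketch, where FastICA, logistic regression, and the synthetic data are just placeholders for whichever unsupervised algorithm, classifier, and dataset are actually used; the permuted-label run is one possible null comparison:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import FastICA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the dataset {x^(i), y^(i)}.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Unsupervised step: u^(i) = ICA components of x^(i), fit without using y.
U = FastICA(n_components=5, random_state=0).fit_transform(X)

# Supervised step: try to predict y from the unsupervised representation u^(i).
clf = LogisticRegression(max_iter=1000)
score_u = cross_val_score(clf, U, y, cv=5).mean()

# Null comparison: same classifier on permuted labels, which should sit near
# chance if the accuracy on U reflects real structure rather than an artifact
# of the evaluation itself.
rng = np.random.default_rng(0)
score_null = cross_val_score(clf, U, rng.permutation(y), cv=5).mean()

print(f"accuracy on u: {score_u:.3f}, accuracy with permuted labels: {score_null:.3f}")
```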
Does this sound like a reasonable means of comparison? Is anyone aware of existing work that benchmarks unsupervised algorithms in this way?
Putting the question more generally: by what means can we evaluate whether $u^{(i)}$ provides useful or interpretable information, as opposed to just being some artificially constructed statistic or something spurious?