Normally, nearest neighbours (or $k$-nearest neighbours) is, as you note, a supervised learning algorithm (i.e. for regression/classification), not a clustering (unsupervised) algorithm.
That said, there is an obvious way to "cluster" (loosely speaking) via nearest neighbours, sometimes called unsupervised nearest neighbours.
This "clustering" simply refers to getting the nearest neighbours of a given point $p$ (not necessarily in the data set), either by taking all the neighbours in some ball around $p$ with cutoff radius $r$ or by taking the $k$ nearest neighbours and returning them as the cluster.
In such cases, a fast spatial partitioning data structure is the key: simply use either a $k$d-tree or a ball tree (metric tree). Either structure supports fast nearest neighbour search (building the tree costs $O(n\log n)$, and a single query then takes roughly $O(\log n)$ on average, under some reasonable distribution assumptions, I believe). See here for a comparison.
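For concreteness, here is a minimal sketch using scikit-learn's `NearestNeighbors`, which wraps both tree types; the data, radius, and $k$ values are placeholders:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))   # toy data set

# Build a kd-tree index (use algorithm="ball_tree" for a ball tree instead)
nn = NearestNeighbors(algorithm="kd_tree").fit(X)

p = np.zeros((1, 3))                 # query point; need not be in X

# "Cluster" 1: the k nearest neighbours of p
dist_k, idx_k = nn.kneighbors(p, n_neighbors=10)

# "Cluster" 2: all points within a ball of radius r around p
dist_r, idx_r = nn.radius_neighbors(p, radius=1.5)

print(idx_k[0])      # indices of the 10 nearest points
print(idx_r[0])      # indices of all points within radius 1.5
```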
Since you mention big data though, I would say that both of those standard tree structures will fail miserably once you have millions of points in hundreds of dimensions (or even fewer). In such cases, you will have to use approximate nearest neighbour algorithms (e.g. locality-sensitive hashing). A good read for this might be *An Investigation of Practical Approximate Nearest Neighbor Algorithms* by Liu et al.
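To illustrate the core idea behind LSH, here is a toy single-table random-hyperplane hash in plain NumPy. This is only a sketch: real libraries (e.g. FAISS, Annoy) use many hash tables or trees plus careful tuning, and the sizes and bit count below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_bits = 100_000, 64, 8        # illustrative sizes; more bits -> smaller buckets
X = rng.standard_normal((n, d)).astype(np.float32)

# Random-hyperplane LSH: each bit records which side of a random
# hyperplane a point falls on; nearby points tend to share signatures
planes = rng.standard_normal((d, n_bits)).astype(np.float32)

def hash_points(points):
    bits = ((points @ planes) > 0).astype(np.int64)
    return bits @ (1 << np.arange(n_bits))   # pack bits into an integer key

# Group point indices by hash bucket
buckets = {}
for i, key in enumerate(hash_points(X)):
    buckets.setdefault(int(key), []).append(i)

def approx_neighbours(p, k=10):
    # Search only the query's bucket, then rank those candidates exactly
    cand = np.array(buckets.get(int(hash_points(p[None])[0]), []))
    if len(cand) == 0:
        return cand                          # hash miss: a real LSH uses multiple tables
    dists = np.linalg.norm(X[cand] - p, axis=1)
    return cand[np.argsort(dists)[:k]]

print(approx_neighbours(X[0]))
```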
Separately, a different approach that you may be thinking of is the nearest-neighbor chain algorithm, which is a form of hierarchical clustering. In this case, one is interested in the relationships between clusters, as well as the clustering itself.
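If you just want to run hierarchical clustering, SciPy's `linkage` covers this (I believe its Ward linkage uses a nearest-neighbor chain internally). A quick sketch on made-up data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated blobs as toy data
X = np.vstack([rng.standard_normal((50, 2)),
               rng.standard_normal((50, 2)) + 5])

# Agglomerative clustering with Ward linkage; Z encodes the full dendrogram
Z = linkage(X, method="ward")

# Cut the dendrogram into 2 flat clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(np.bincount(labels))   # sizes of the two recovered clusters
```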
Lastly, you might look into clustering methods that build on nearest neighbours (i.e. extend them), e.g.: