Let's say we have following SQL table structure
- Entities (5-15k)
- Keywords (15-20k)
- EntityKeywords
- ExcludedKeywords (keywords which should be excluded from common matching)
We need to find related entities, which are entities which have most common keywords ordered descending.
Now obviously, querying this on each load would be too slow because each query requires ordering by count. One of ideas is to aggregate keywords to a single column for each entity, and use full text search over it. Don't know is this a good approach?
Is this too much for SQL server and does it require another technological stack, or there are better ways to deal with this problem?