In my experience, these types of issues typically boil down to some type of resource exhaustion.
It's easy to speculate to the n-th degree about what it "could" be, but without data, these remain speculations.
Counters
To gather data that can solve the puzzle on Windows, you need to collect perfmon (Performance Monitor) data. Some counters you should grab for all processes (if applicable) are:
**Processor**/All Counters/All instances
**LogicalDisk**/All Counters/All instances
**Memory**/All Counters/All instances
**Network Interface**/All Counters/All instances
**Paging File**/All Counters/All instances
**Process**/All Counters/All instances
**Server**/All Counters/All instances
**Server Work Queues**/All Counters/All instances
**System**/All Counters/All instances
In my opinion, this list covers every counter where you might find relevant data. There is a penalty for capturing all of it: it is a lot of data to log, so you may want to start with a subset of the counters you feel are most relevant to your situation.
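As a rough illustration, the list above can be written as counter paths of the form `\Object(instance)\Counter`, which is the notation command-line tools such as `typeperf` and `logman` accept. This is only a sketch: the helper name `counter_path` is made up, and it assumes that Memory, Server, and System are single-instance objects (so they take no `(*)`), and that the disk object is spelled `LogicalDisk` in counter paths. Check the names in your own perfmon before relying on them.

```python
# Performance objects from the list above, paired with whether the object
# is assumed to have multiple instances (an assumption, verify in perfmon).
COUNTER_OBJECTS = [
    ("Processor", True),
    ("LogicalDisk", True),
    ("Memory", False),
    ("Network Interface", True),
    ("Paging File", True),
    ("Process", True),
    ("Server", False),
    ("Server Work Queues", True),
    ("System", False),
]


def counter_path(obj, has_instances):
    """Build an 'all counters / all instances' wildcard path for one object."""
    if has_instances:
        return f"\\{obj}(*)\\*"
    return f"\\{obj}\\*"


ALL_PATHS = [counter_path(obj, inst) for obj, inst in COUNTER_OBJECTS]

if __name__ == "__main__":
    for path in ALL_PATHS:
        print(path)
```

Trimming `COUNTER_OBJECTS` is one way to express the "subset of counters" idea: fewer objects means less data per sample.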
Logging
When you run perfmon, create a new manually defined Data Collector Set for Performance Counters. One of the screens asks for the sample interval. Make sure the interval is small enough to capture the problem but not so small that the logging itself overwhelms the system.
I would recommend setting the capture to start and stop manually, so that you can start the capture, reproduce the problem, then stop and analyze the logs.
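If you prefer the command line to the perfmon wizard, the same manual start/stop workflow can be driven with `logman`. The sketch below only builds the command lines and prints them; it does not run anything. The collector name `repro_capture`, the output path, the 5-second interval, and the two counter paths are placeholders chosen for illustration, not recommendations.

```python
def format_interval(seconds):
    """Format a sample interval as hh:mm:ss for logman's -si option."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def logman_commands(name, counter_paths, interval_seconds, output_path):
    """Return (create, start, stop) argument lists for a counter collector."""
    create = [
        "logman", "create", "counter", name,
        "-si", format_interval(interval_seconds),  # sample interval
        "-o", output_path,                         # log file location
        "-c", *counter_paths,                      # counters to collect
    ]
    start = ["logman", "start", name]
    stop = ["logman", "stop", name]
    return create, start, stop


if __name__ == "__main__":
    # Placeholder values, adjust for your own system.
    create, start, stop = logman_commands(
        name="repro_capture",
        counter_paths=["\\Memory\\*", "\\Paging File(*)\\*"],
        interval_seconds=5,
        output_path="C:\\PerfLogs\\repro_capture",
    )
    for cmd in (create, start, stop):
        print(" ".join(f'"{a}"' if " " in a else a for a in cmd))
```

Run the `create` command once, then `start` just before reproducing the problem and `stop` right after it.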
Analyzing Data
The perfmon utility lets you look at every counter individually. If you know what you're looking for, this works. If you're not familiar with the process or with which counters to look at, you might benefit from an automated analysis tool such as PAL (Performance Analysis of Logs). PAL is free and awesome. Essentially, it has a set of thresholds defined for each counter; it parses your log collection and produces an HTML report that shows you:
Warnings - Any counter that is close to a threshold
Critical - Any counter that has exceeded a threshold
This can be a simple way to start your analysis and narrow in on any items marked Critical.
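The Warning/Critical idea is easy to picture in code. The following is a toy sketch, not PAL's implementation: the function names, the sample values, and the threshold numbers are all invented for illustration, and it assumes counters where a higher value is worse (which is not true of every counter, for example available memory).

```python
def classify(value, warning, critical):
    """Label one value against a warning and a critical threshold."""
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return "OK"


def worst_status(samples, thresholds):
    """Classify each counter by its peak sample.

    samples:    {counter_path: [values...]}
    thresholds: {counter_path: (warning, critical)}
    """
    report = {}
    for counter, values in samples.items():
        warning, critical = thresholds[counter]
        report[counter] = classify(max(values), warning, critical)
    return report


if __name__ == "__main__":
    # Made-up data and made-up thresholds.
    samples = {
        "\\Processor(_Total)\\% Processor Time": [35.0, 62.0, 97.5],
        "\\Paging File(_Total)\\% Usage": [10.0, 12.0, 75.0],
    }
    thresholds = {
        "\\Processor(_Total)\\% Processor Time": (80.0, 90.0),
        "\\Paging File(_Total)\\% Usage": (70.0, 90.0),
    }
    for counter, status in worst_status(samples, thresholds).items():
        print(f"{status:8} {counter}")
```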
Best Guess/Speculation on Problem Statement
To add to the speculation about what it might be: it sounds like you may be under memory pressure. This means that free physical memory has been exhausted and the OS needs to write memory contents out to disk and read them back in.
The perfmon data that would validate this scenario would show a steady rise in memory utilization followed by a sharp fall. At the same time, there would be a sharp rise in pagefile usage as well as local disk I/O. Again, this is just speculation without any hard data (which you need).
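To make that signature concrete, here is a small sketch that scans two series sampled at the same times and flags the points where memory utilization drops sharply while pagefile usage jumps. The function name, the `drop` and `rise` thresholds, and the sample data are all hypothetical; real logs are noisier and would need tuning.

```python
def find_pressure_events(mem_used_pct, pagefile_pct, drop=10.0, rise=5.0):
    """Return sample indices where memory use falls by at least `drop`
    points while pagefile use rises by at least `rise` points."""
    events = []
    for i in range(1, min(len(mem_used_pct), len(pagefile_pct))):
        mem_delta = mem_used_pct[i] - mem_used_pct[i - 1]
        page_delta = pagefile_pct[i] - pagefile_pct[i - 1]
        if mem_delta <= -drop and page_delta >= rise:
            events.append(i)
    return events


if __name__ == "__main__":
    # Synthetic data: a steady climb in memory use, then a sharp fall
    # that coincides with a jump in pagefile usage.
    mem = [60.0, 70.0, 80.0, 90.0, 95.0, 70.0, 72.0]
    page = [5.0, 5.0, 6.0, 6.0, 7.0, 30.0, 31.0]
    print(find_pressure_events(mem, page))  # prints [5]
```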