As an academic exercise, I have to write a parallel algorithm that, given a sorted array of $n$ integers, efficiently computes the mode (i.e. the item with the highest frequency) using $p$ processors, where $p \le n$ is a constant.
The model we use to describe these algorithms allows lock/unlock primitives to synchronize concurrent access to shared variables.
We cannot use hash tables. Could anyone share any hint on the optimal algorithm to solve this problem?
Edit: rephrased question according to comments.
Edit 2: adding my solution for feedback. While trying to solve the exercise, I thought of an algorithm along the lines of
1. Declare two global variables, mode and modeFrequency, and initialize them appropriately;
2. For each $i$ with $1 \le i \le p$, invoke a concurrent process on the $i$-th portion of the array;
3. In each concurrent process:
   - 3.1. Find the local mode of the partition;
   - 3.2. Store the local mode and its frequency in two local variables;
   - 3.3. Compare the local mode with the global one:
     - 3.3.1. if the mode is the same, add the local frequency to the global one;
     - 3.3.2. if the local mode is different from the global one and the local frequency is higher than the global one, overwrite the global mode/frequency with the local ones;
4. Return the global mode.
but I am not convinced of the correctness of the algorithm. Please note that I omitted locks/unlocks for brevity. Also, can the way I partition the array make any difference to the correctness of the algorithm?
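For concreteness, here is a minimal Python sketch of the steps above, written only to make the idea testable. Several things in it are assumptions on my part rather than part of the exercise: Python threads and `threading.Lock` stand in for the processors and the lock/unlock primitives, the array is split into $p$ contiguous chunks of nearly equal size, step 3.3.2 is read as "overwrite the global mode/frequency with the local ones", and the names `local_mode` and `parallel_mode` are made up for illustration.

```python
import threading


def local_mode(chunk):
    """Return (value, frequency) of the longest run in a sorted, non-empty chunk."""
    best_value, best_count = chunk[0], 0
    run_value, run_count = chunk[0], 0
    for x in chunk:
        if x == run_value:
            run_count += 1
        else:
            run_value, run_count = x, 1
        if run_count > best_count:
            best_value, best_count = run_value, run_count
    return best_value, best_count


def parallel_mode(sorted_array, p):
    """Literal sketch of the steps as described; returns (mode, modeFrequency)."""
    n = len(sorted_array)
    # Step 1: the two shared "global" variables.
    state = {"mode": None, "modeFrequency": 0}
    lock = threading.Lock()

    def worker(i):
        # Assumed partitioning: the i-th of p contiguous, near-equal chunks.
        chunk = sorted_array[i * n // p:(i + 1) * n // p]
        if not chunk:
            return
        value, count = local_mode(chunk)  # steps 3.1 and 3.2
        with lock:  # step 3.3, guarded by the lock omitted in the description
            if value == state["mode"]:
                state["modeFrequency"] += count  # step 3.3.1
            elif count > state["modeFrequency"]:
                # step 3.3.2, read as "global := local"
                state["mode"], state["modeFrequency"] = value, count

    # Step 2: one concurrent worker per chunk.
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["mode"], state["modeFrequency"]  # step 4
```

As a quick check, `parallel_mode([1, 1, 1, 2, 2, 2, 2, 3, 3, 3], 2)` splits the run of 2s across the two chunks, so with this chunking the sketch reports 1 or 3 (depending on which thread merges first) rather than 2. That is the kind of case that makes me doubt the algorithm as stated.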