Firstly, your question seems confused about the relationship between a GIL and co-operative multithreading. Co-operative multithreading is when the current thread continues to execute until it explicitly gives up execution. The greenlet library for Python is based on this model. It simplifies coding in many cases because you don't have to worry about context switches except at specific points.
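To make the co-operative model concrete, here is a minimal sketch using plain Python generators rather than greenlet itself (greenlet's API is different, but the control flow — run uninterrupted until an explicit switch point — is the same; the task names and scheduler are made up for illustration):

```python
def task(name, steps, log):
    """A co-operative task: runs until it explicitly yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")  # work that cannot be interrupted...
        yield                      # ...until this explicit switch point

def run_round_robin(tasks):
    """A tiny scheduler: resume each task in turn until all finish."""
    tasks = list(tasks)
    while tasks:
        for t in tasks[:]:
            try:
                next(t)
            except StopIteration:
                tasks.remove(t)

log = []
run_round_robin([task("a", 2, log), task("b", 2, log)])
print(log)  # tasks interleave only at the yield points
```

Between two `yield`s, a task can mutate shared state freely without worrying that another task will see it half-done — that is the simplification the model buys you.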
A global interpreter lock is a lock which prevents more than one thread from executing code inside the virtual machine at once. Execution will still jump from thread to thread without the thread requesting it. The points at which switches can happen are limited (exactly how depends on the language implementation), but you don't really get the simplification that co-operative multitasking gives you.
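By contrast, a quick sketch of pre-emptive behaviour under CPython's GIL: neither thread ever yields explicitly, yet both run to completion, because the interpreter switches between them on its own (the switch interval and loop counts here are arbitrary):

```python
import sys
import threading

sys.setswitchinterval(0.0001)  # ask CPython to switch threads more often

events = []

def worker(name):
    # No explicit yield anywhere -- the interpreter pre-empts us.
    for _ in range(1000):
        events.append(name)  # list.append is atomic under the GIL

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(events))  # 2000: both threads made progress without yielding
```

Note you cannot predict where in `events` the two names interleave — and that uncertainty at every bytecode boundary is exactly what the co-operative model spares you.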
Your question actually appears to be asking whether you can get away with co-operative multitasking combined with multi-process options. Strictly speaking you can of course do whatever you want without pre-emptive multithreading; the question is whether it'll be easier or more efficient with it.
I use co-operative multithreading and process parallelization whenever I can. Most of the time I find that works beautifully and is a simpler approach than would be required if I were to try to use threads. But I think there are some cases where this falls down.
Let's consider some examples:
1) Worker Thread and UI Thread
It's not uncommon to have a long-running task executing in an application while a progress bar is displayed. To keep things responsive we need to process UI events while the task continues to run. Normally, we'd have the task executing in a separate thread. But if the threads are co-operative, this won't work, because the task won't normally have any reason to yield.
So what can we do?
- In some cases, there will be natural pause points in the long-running task: file I/O, database calls, socket reads. All of these naturally block, and if your language automatically switches threads at these points, many long-running tasks may yield naturally.
- The task could be moved into another process. But for some tasks this will take a lot of effort: I may have to ship a lot of data to the subprocess, let it process, and then ship the data back.
- You could introduce explicit thread yield calls. The disadvantage here is that you are doing something manually that other languages do automatically.
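The third option — explicit yields — can be sketched with a generator standing in for the worker thread, and a hypothetical `pump_ui_events` placeholder for the UI work (both names are made up for illustration):

```python
ui_events = []

def pump_ui_events():
    # Hypothetical stand-in for redrawing a progress bar, handling clicks, etc.
    ui_events.append("redraw")

def long_task(n, out):
    """Long-running work with yield points inserted by hand."""
    total = 0
    for i in range(n):
        total += i   # one slice of the real work
        yield        # manual yield -- the cost of the co-operative model
    out.append(total)

result = []
for _ in long_task(5, result):
    pump_ui_events()   # the UI gets a turn after every slice

print(result[0], len(ui_events))  # 10 5
```

The disadvantage is visible in the code: every long loop in the task has to be found and instrumented by hand, which a pre-emptive language does for you automatically.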
2) Serving many requests
A system such as a database or a web server may need to serve requests coming from many different external systems. In doing so it will have to navigate in-memory data structures and multiple requests may require the same data structures. Typically, we might implement this using multiple threads and using locks on the data structure to make sure nobody changes it while it is being read.
As long as we only have one core operating, co-operative multithreading actually works great. But when you've got multiple cores, you can't take advantage of them that way. You could introduce multiple processes, but the trouble there is that you can't really share in-memory data structures all that well. I'm sure you can work around it with IPC techniques, but I think it'll always be awkward compared to the locks model.
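For comparison, here is the locks model sketched with a shared dict and a `threading.Lock` (the request handler and counts are made up; a real server would do more than bump a counter):

```python
import threading

store = {"hits": 0}          # shared in-memory data structure
lock = threading.Lock()

def handle_request(n):
    """Each request reads and updates the shared structure under the lock."""
    for _ in range(n):
        with lock:                 # nobody else mutates while we work
            store["hits"] += 1     # read-modify-write made safe by the lock

threads = [threading.Thread(target=handle_request, args=(1000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store["hits"])  # 4000 -- no updates lost
```

With processes you'd have to replace that direct dict access with pipes, shared-memory segments, or a manager process — workable, but awkward next to a one-line `with lock:`.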