In the past, I've worked in a variety of environments: desktop apps, games, embedded systems, web services, command-line jobs, web sites, database reporting, and so on. All of them shared one trait: no matter their complexity or size, I could always run a subset or slice of the application on my machine or in a dev environment for testing.
Today I do not. Today I find myself in an environment whose primary focus is scalability. Reproducing the environment is prohibitively costly. Taking a slice of the environment, while plausible (some of the pieces would need to be simulated, or run in a single-instance mode they weren't designed for), largely defeats the purpose, since it obscures the concurrency and load that the real system encounters. Even a small "test" system has its flaws: things behave differently when you've got 2 nodes than when you have 64.
My usual approach to optimization (measure, try something, verify correctness, measure the difference, repeat; sketched below) doesn't really work here, since I can't effectively do steps 2 and 3 for the parts of the problem that matter: robustness under concurrency and performance under load. This scenario doesn't seem unique, though. What is the common approach to this sort of task in this sort of environment?
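For concreteness, here is a minimal sketch of that loop as it works in my usual environments; `reference` and `candidate` are hypothetical stand-ins for whatever code is actually being tuned:

```python
import timeit

def reference(data):
    """Known-good implementation: the correctness oracle."""
    return sorted(data)

def candidate(data):
    """Hypothetical optimized version under test."""
    return sorted(data)  # placeholder for the tuned code

def measure(fn, data, runs=5):
    """Median wall-clock time of one call to fn(data)."""
    times = timeit.repeat(lambda: fn(data), number=1, repeat=runs)
    return sorted(times)[len(times) // 2]

data = list(range(100_000, 0, -1))

baseline = measure(reference, data)        # step 1: measure
assert candidate(data) == reference(data)  # step 3: verify correctness
tuned = measure(candidate, data)           # step 4: measure the difference
print(f"baseline {baseline:.4f}s, candidate {tuned:.4f}s, "
      f"speedup {baseline / tuned:.2f}x")
```

Nothing in that loop captures concurrent behavior or production-scale load, which is exactly the part I can't reproduce.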
There are some related questions:
- How much does a show-stopping production bug cost? What about 2 bugs? At unpredictable times (most likely when the majority of your users are putting load on the system at the same time). Weigh that against the cost of setting up a minimal reproduction environment; you might find it's not that prohibitively expensive after all. – Jess Telford Jul 10 '14 at 20:17
- The cost can be prohibitively high, though. In that case, I would ask for help from senior members. Cheers. – InformedA Jul 15 '14 at 11:46