Your dilemma hints at a larger problem that's well-known to the graphics programming community, commonly referred to as "combinatorial shader explosion." As the name implies, it's usually considered in the context of very large numbers of shader permutations, but the basic principle is the same. Solutions geared toward solving the overarching problem are industrial-strength, and as far as I'm aware, can be categorized into two main approaches:
- So-called "Ubershaders"
- Code generators
Ubershaders
The term ubershader refers to a shader which contains conditional code to handle an extremely large number of permutations with regard to feature support. Likely candidates include variations on light counts/types, animation (number of bones? Dual quaternion skinning?), vertex formats, shadow map sampling, and more. Because dynamic branching in shaders both pushes up against shader instruction limits and is not particularly performant, these variations are usually controlled by preprocessor directives instead, varying the values of preprocessor definitions (e.g. #define USE_PCF_SHADOW_FILTERING 1, #define NUM_PCF_SAMPLES 16, etc.).
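To make that concrete, here is a minimal sketch (in Python, with an invented shader snippet and helper name) of how an engine might select an ubershader permutation by prepending one #define per feature flag to the shared source before handing it to the shader compiler:

```python
# Hypothetical sketch: the ubershader body is shared; each permutation is
# produced by prepending a different set of preprocessor definitions.

UBERSHADER_SOURCE = """
#if USE_PCF_SHADOW_FILTERING
    // PCF shadow sampling loop, NUM_PCF_SAMPLES taps
#else
    // single hard shadow-map tap
#endif
"""

def build_permutation_source(defines: dict, body: str = UBERSHADER_SOURCE) -> str:
    """Prepend one #define per feature flag, then the shared ubershader body."""
    preamble = "\n".join(f"#define {name} {value}" for name, value in defines.items())
    return preamble + "\n" + body

# One concrete permutation: PCF filtering with 16 taps.
source = build_permutation_source({
    "USE_PCF_SHADOW_FILTERING": 1,
    "NUM_PCF_SAMPLES": 16,
})
```

The resulting string is what you would actually pass to your compiler of choice; real APIs (e.g. D3DCompile's macro list) also accept the defines directly.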
Code generators
Code generators, on the other hand, actually assemble the body of the shader from the ground up. An example of such an approach would be Shawn Hargreaves' shader fragment system.
As Shawn mentions in his article:
> If you have five, ten, or even fifty shaders, this system is probably not for you. If you have thousands, however, automation is your friend.
I would say this advice applies to ubershaders as well.
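A toy sketch of the fragment idea, in Python (the fragment names and the HLSL-ish snippets inside them are invented for illustration, in the spirit of Shawn's system rather than a reproduction of it):

```python
# Hypothetical sketch of fragment-based shader generation: each feature
# contributes a small fragment, and the generator concatenates the selected
# fragments into one shader body.

FRAGMENTS = {
    "skinning_none":  "float4 pos = input.position;",
    "skinning_bones": "float4 pos = SkinVertex(input);",
    "light_dir":      "color += ComputeDirectionalLight(normal);",
    "light_point":    "color += ComputePointLight(pos, normal);",
}

def generate_shader(fragment_names: list) -> str:
    """Assemble a shader body from the named fragments, in order."""
    lines = ["void main() {"]
    lines += ["    " + FRAGMENTS[name] for name in fragment_names]
    lines.append("}")
    return "\n".join(lines)

shader = generate_shader(["skinning_bones", "light_dir", "light_point"])
```

A real system also has to generate matching inputs/outputs and track interpolator usage between fragments, which is where most of the complexity lives.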
As I mentioned above, these solutions are industrial-strength. Both have their pros and cons, and both are very complex to manage. Importantly, you need to know ahead of time which shader combinations your engine is going to potentially require, or you'll end up compiling them on the fly; this is obviously undesirable, because if you don't have the compiled shader handy when it's needed, compilation takes a considerable amount of time.
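Enumerating the required combinations ahead of time can be sketched as a Cartesian product over your feature axes (axis names and the placeholder "compile" step here are invented):

```python
from itertools import product

# Hypothetical sketch: enumerate every permutation up front so nothing has
# to be compiled on the fly at runtime.

FEATURE_AXES = {
    "NUM_LIGHTS": [1, 2, 4],
    "USE_PCF_SHADOW_FILTERING": [0, 1],
    "USE_SKINNING": [0, 1],
}

def enumerate_permutations(axes: dict):
    """Yield one {name: value} dict per combination of feature values."""
    names = list(axes)
    for values in product(*(axes[n] for n in names)):
        yield dict(zip(names, values))

# A real engine would compile each permutation offline and cache the binary,
# keyed by its define set; a string stands in for the compiled blob here.
shader_cache = {tuple(sorted(p.items())): f"compiled({p})"
                for p in enumerate_permutations(FEATURE_AXES)}
```

Even this tiny example yields 3 × 2 × 2 = 12 permutations, which is why real feature sets explode so quickly.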
Bonus: Deferred rendering
You should also be aware that, in the case of light and shadow complexity specifically, deferred rendering approaches also help cut down on shader permutations.
Imagine the following simple approach (expressed in pseudo-code):
for ( each mesh )
{
    draw mesh with ambient lighting only;
    for ( each light )
    {
        additively blend single light's diffuse + specular contributions
        for this mesh into frame buffer;
    }
}
Now your shader only needs to handle one light, and you invoke it as many times as necessary. Unfortunately, this naive multipass approach does not scale very well in terms of performance, because every mesh is redrawn once per light. With deferred shading, by contrast, you draw each mesh only once, and then draw each light only once.
An in-depth explanation of deferred shading is beyond the scope of this answer, but you can start looking here for more info.
It's similarly possible to calculate shadows in a "deferred" manner, which is much simpler, and plays nicely with traditional ("forward") rendering — you don't need deferred shading to implement deferred shadows.
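A minimal sketch of that idea (again Python as pseudocode; all names here are invented): shadowing is resolved into a full-screen buffer first, and the forward shading pass just reads one visibility factor per pixel, so it needs no shadow-sampling permutations of its own.

```python
# Hypothetical sketch of deferred ("screen-space") shadows.

SHADOWED = {"px_1"}  # stand-in for pixels that fail the shadow-map depth test

def shadow_test(pixel):
    """Placeholder for a shadow-map depth comparison: 0 = shadowed, 1 = lit."""
    return 0.0 if pixel in SHADOWED else 1.0

def shade(pixel):
    """Placeholder for the forward lighting result, before shadowing."""
    return 1.0

def render_frame(pixels):
    # Pass 1: resolve shadowing into a full-screen buffer, one factor per pixel.
    shadow_buffer = {p: shadow_test(p) for p in pixels}
    # Pass 2: forward shading simply multiplies by the stored factor.
    return {p: shade(p) * shadow_buffer[p] for p in pixels}

frame = render_frame(["px_0", "px_1"])
```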