We use forward rendering, so our main shader is pretty big (342 instructions). Recently I tried replacing every call to a custom function in the shader with the body of that function, and the number of generated instructions dropped to 311.

Am I doing something wrong with my custom functions? I tried passing variables into them with various parameter qualifiers (in/inout/out/[nothing]), but it did not change the number of instructions, even though Microsoft claims every function call in HLSL is inlined.

UPDATE: Example code:

void DirectionalLight(half4 Normal,
                half3 DirectionToCamera,
                half4 Combined,
                inout half2x3 light)
{
    half3 halfVector = normalize(DirectionToCamera + SpecLightDirection);
    half3 lightParams = saturate(half3(dot(Normal.rgb, DirectLight1.Direction), //NDL
        dot(Normal.rgb, SpecLightDirection),                                    //NDL_Spec
        dot(Normal.rgb, halfVector)));                                          //NDH

    half4 litV = lit(lightParams.g, lightParams.b, Combined.b); 
    light[0] += DirectLight1.Color*lightParams.r;
    light[1] += DirectLight1.Color*litV.b;
}


float4 MainPS(VSOutput input) : COLOR0
{
    ...         
    DirectionalLight(Normal, input.DirectionToCamera, Combined, DirLightAccum);
    ...
}

Or this code:

half2x3 DirectionalLight(half4 Normal,
                half3 DirectionToCamera,
                half4 Combined)
{
    half2x3 light = (half2x3)0;
    half3 halfVector = normalize(DirectionToCamera + SpecLightDirection);
    half3 lightParams = saturate(half3(dot(Normal.rgb, DirectLight1.Direction), //NDL
        dot(Normal.rgb, SpecLightDirection),                                    //NDL_Spec
        dot(Normal.rgb, halfVector)));                                          //NDH

    half4 litV = lit(lightParams.g, lightParams.b, Combined.b); 
    light[0] += DirectLight1.Color*lightParams.r;  
    light[1] += DirectLight1.Color*litV.b;
    return light;
}


float4 MainPS(VSOutput input) : COLOR0
{
    ...         
    DirLightAccum += DirectionalLight(Normal, input.DirectionToCamera, Combined);
    ...
}

Both versions generate 342 instructions. But the following code:

float4 MainPS(VSOutput input) : COLOR0
{
    ...         
    half3 halfVector = normalize(input.DirectionToCamera + SpecLightDirection);
    half3 lightParams = saturate(half3(dot(Normal.rgb, DirectLight1.Direction), //NDL
        dot(Normal.rgb, SpecLightDirection),                                    //NDL_Spec
        dot(Normal.rgb, halfVector)));                                          //NDH

    half4 litV = lit(lightParams.g, lightParams.b, Combined.b); 
    DirLightAccum [0] += DirectLight1.Color*lightParams.r;
    DirLightAccum [1] += DirectLight1.Color*litV.b;
    ...
}

Only generates 339 instructions. Heavier functions (with more variables passed) generate more instructions. As I mentioned, when substituting ALL the functions with their actual bodies, I was able to reduce the instruction count by roughly 10% (342 to 311).

I tried making CastShadow a function that returns a float so that I could get rid of the "inout half shadow" argument, but that did not help.

I am just trying to figure out whether this is normal or whether I am doing something wrong.
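The snippets above omit their surrounding declarations, so they do not compile on their own. For anyone who wants to reproduce the comparison, a self-contained sketch of the first variant could look roughly like this. The struct `DirectionalLightData`, the layout of `VSOutput`, and the way `Normal`, `Combined` and the final color are computed are all assumptions filled in for illustration; they are not taken from the real shader.

```hlsl
// ASSUMPTION: the real declarations are not shown in the question.
// DirectionalLightData is a hypothetical name for the light constants.
struct DirectionalLightData
{
    half3 Direction;
    half3 Color;
};

DirectionalLightData DirectLight1;   // set by the application
half3 SpecLightDirection;            // set by the application

// ASSUMPTION: hypothetical interpolator layout.
struct VSOutput
{
    half3 Normal            : TEXCOORD0;
    half3 DirectionToCamera : TEXCOORD1;
};

// Function variant from the question, unchanged.
void DirectionalLight(half4 Normal,
                half3 DirectionToCamera,
                half4 Combined,
                inout half2x3 light)
{
    half3 halfVector = normalize(DirectionToCamera + SpecLightDirection);
    half3 lightParams = saturate(half3(dot(Normal.rgb, DirectLight1.Direction), //NDL
        dot(Normal.rgb, SpecLightDirection),                                    //NDL_Spec
        dot(Normal.rgb, halfVector)));                                          //NDH

    half4 litV = lit(lightParams.g, lightParams.b, Combined.b);
    light[0] += DirectLight1.Color*lightParams.r;
    light[1] += DirectLight1.Color*litV.b;
}

float4 MainPS(VSOutput input) : COLOR0
{
    // ASSUMPTION: placeholder inputs; the real shader derives these elsewhere.
    half4 Normal = half4(normalize(input.Normal), 0);
    half4 Combined = half4(0, 0, 32, 0);   // .b is used as the specular exponent
    half2x3 DirLightAccum = (half2x3)0;

    DirectionalLight(Normal, input.DirectionToCamera, Combined, DirLightAccum);

    // ASSUMPTION: sum diffuse and specular rows so nothing is optimized away.
    return float4(DirLightAccum[0] + DirLightAccum[1], 1);
}
```

Compiling this with something like `fxc /T ps_3_0 /E MainPS /Fc out.asm shader.fx` (the file names are hypothetical) writes an assembly listing, which should make it possible to compare the function-call variant against the manually inlined one instruction by instruction. The counts for this reduced sketch will of course differ from the 342 and 339 reported for the full shader.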

cubrman
  • It may be helpful here to give a SSCCE. – kevintodisco May 19 '14 at 20:13
  • Every "instruction" is inlined in HLSL, if for no other reason than there exists no call stack on GPUs. You do not call and return from routines using a stack. That said, variables with different scope visibility (e.g. in vs. inout) may require extra register allocation, etc. which could increase the number of instructions. – Andon M. Coleman May 19 '14 at 20:14
  • @AndonM.Coleman I know about all functions being inlined, but I told you that no matter what scope visibility I specify, calling my custom functions generates extra instructions. It ain't "may" for me, it is "certain". – cubrman May 20 '14 at 06:47
  • @ktodisco I added SSCCE. – cubrman May 20 '14 at 11:35

0 Answers