
Problem

I'm currently working on multi-UV support for glTF models. After getting a first version up and running (verified against glTF's multi-uv-test), I checked whether the other models still render correctly.

I noticed some very visible artifacts (left image) for the Sponza model which has only one set of texture coordinates.

[Image: Sponza Bad] [Image: Sponza Good]

[Image: Sponza Bad 2] [Image: Sponza Good 2]

I could at least find the code that seems to cause this issue in a fragment shader (the condition should always be true for Sponza):

vec2 getBaseColorTexCoords() {
    return u_material.baseColorTexCoordSet == 0 ? o_texCoords0 : o_texCoords1;
//    return o_texCoords0;
}

When I return the second line instead, I get the desired output (right image).

Some additional stuff:

  • If I change the condition to true it works fine. I guess the branching is just optimized away in that case
  • If I change the condition to false I get a black screen (texCoords1 attribute disabled for Sponza)
  • If I pass o_texCoords0 on both sides of the colon, I still get these artifacts. They do not appear if I just return o_texCoords0 directly, and the analogous functions for other texture maps (e.g. normal or metallic-roughness) are not affected

The only place where I use this function:

vec4 getBaseColor() {
    vec4 baseColor = u_material.baseColor;
    if(u_hasBaseColorTexture) {
        vec2 texCoords = getBaseColorTexCoords();
        baseColor *= srgbToLinear4fv(texture(u_baseColorTexture, texCoords));
    }
    if(u_hasVertexColors) {
        baseColor *= o_color;
    }
    return baseColor;
}

Analogous functions:

vec2 getMetallicRoughnessTexCoords() {
    return u_material.metallicRoughnessTexCoordSet == 0 ? o_texCoords0 : o_texCoords1;
//    return o_texCoords0;
}

vec2 getNormalTexCoords() {
    return u_material.normalTexCoordSet == 0 ? o_texCoords0 : o_texCoords1;
//    return o_texCoords0;
}

Additional Code Snippets

First I suspected that perhaps u_material.baseColorTexCoordSet somehow somewhere gets assigned a value that is not equal to 0 on the CPU side. But that is not the case and to make doubly sure I'm passing in the constant 0:

val baseColorTexCoordSetLoc = glGetUniformLocation(shaderProgram, "u_material.baseColorTexCoordSet")
// glUniform1i(baseColorTexCoordSetLoc, material.baseColorTexCoordSet)
glUniform1i(baseColorTexCoordSetLoc, 0)

I don't think it is the cause, but here are some code snippets related to o_texCoords1, which gets its value from a disabled vertex attribute:

GPU:

// Vertex shader
layout (location = 1) in vec2 texCoords0;
layout (location = 2) in vec2 texCoords1;

out vec2 o_texCoords0;
out vec2 o_texCoords1;

// Inside main()...
o_texCoords0 = texCoords0;
o_texCoords1 = texCoords1;

// Fragment shader
in vec2 o_texCoords0;
in vec2 o_texCoords1;

CPU:

if(primitives.texCoords0.isNotEmpty()) {
    glEnableVertexAttribArray(1)
    glVertexAttribPointer(1, 2, GL_FLOAT, false, 0, offset)
    offset += primitives.texCoords0.size * Float.SIZE_BYTES
}

// This is empty for the Sponza model
if(primitives.texCoords1.isNotEmpty()) {
    glEnableVertexAttribArray(2)
    glVertexAttribPointer(2, 2, GL_FLOAT, false, 0, offset)
    offset += primitives.texCoords1.size * Float.SIZE_BYTES
}

"Question"

Currently I'm suspecting that this is some GPU branching issue.

I guess this might be difficult to answer, but any clues or hints on how to best tackle this problem are welcomed.

Edit

Changing the precision to highp float helps resolve the issue (not sure why), though that's not desirable on mobile.
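One possible explanation, in line with the precision hints in the comments: if mediump is implemented as a 16-bit half float (the spec allows this but does not require it, so that is an assumption about the device), the spacing between representable texture coordinates can approach or exceed one texel on a 1024x1024 texture. The sketch below only does that arithmetic on the CPU (JVM). The function names are hypothetical and it does not measure what the GPU actually does.

```kotlin
import kotlin.math.pow

// Spacing between adjacent representable values near `value`, ASSUMING an
// IEEE 754 half float: 10 explicit mantissa bits, normal exponents -14..15.
// Values below the smallest normal number all share the subnormal spacing.
fun halfFloatStep(value: Float): Float {
    val exponent = Math.getExponent(value).coerceIn(-14, 15)
    return 2.0f.pow(exponent - 10)
}

// The same spacing expressed in texels for a square texture of `textureSize`.
fun texelStep(texCoord: Float, textureSize: Int): Float =
    halfFloatStep(texCoord) * textureSize
```

Under these assumptions, texelStep(0.75f, 1024) is 0.5, so a coordinate near the far edge of a 1024 texture can only be placed in half-texel steps. A tiled coordinate such as 3.7 (a hypothetical value, not taken from Sponza) gives 2.0, i.e. steps of two texels. With a 32-bit highp float the step is several thousand times smaller, which would fit the observation that highp hides the artifacts.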

Edit 2

I've decided to replace my uniform branches with preprocessor directives:

vec2 getBaseColorTexCoords() {
    vec2 texCoords;

#ifdef BASE_COLOR_TEX_COORD_SET_0
    texCoords = o_texCoords0;
#else
    texCoords = o_texCoords1;
#endif

    return texCoords;
}
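For this to work, the define has to be part of the shader source before it is compiled. A minimal sketch of how that could look on the CPU side, assuming the source is available as a string. injectDefines and texCoordDefines are hypothetical helpers, not part of any library. GLSL requires #version to come before any other directive, so the defines are inserted right after that line.

```kotlin
// Hypothetical helper: inserts one "#define NAME" line per entry directly
// after the #version line, or at the very top if the source has none.
fun injectDefines(source: String, defines: List<String>): String {
    if (defines.isEmpty()) return source
    val block = defines.joinToString(separator = "\n") { "#define $it" }
    val lines = source.lines()
    val versionIndex = lines.indexOfFirst { it.trimStart().startsWith("#version") }
    return if (versionIndex < 0) {
        block + "\n" + source
    } else {
        (lines.subList(0, versionIndex + 1) + block +
            lines.subList(versionIndex + 1, lines.size)).joinToString("\n")
    }
}

// Hypothetical mapping from a material's texCoord set index to the define
// used in the shader above; any index other than 0 takes the #else branch.
fun texCoordDefines(baseColorTexCoordSet: Int): List<String> =
    if (baseColorTexCoordSet == 0) listOf("BASE_COLOR_TEX_COORD_SET_0") else emptyList()
```

The result of injectDefines(fragmentSource, texCoordDefines(material.baseColorTexCoordSet)) would then be passed to glShaderSource instead of the raw source.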

Cons: more cumbersome to maintain (perhaps also program switches, but I haven't benchmarked yet and it seems to run just fine)

Pros: no artifacts, don't have to use highp precision, recommended by Mali:

Use #defines at compile time in OpenGL ES, and specialization constants in Vulkan for all control flow. Doing so allows the compilation to completely remove unused code blocks and statically unroll loops.

So I'll roll with this for now.
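To keep the cons in check, one option is to compile one program per distinct set of defines rather than one per mesh, and to look programs up in a cache. This is only a sketch: ProgramCache is a hypothetical class, and the compile callback stands in for whatever code builds and links the program and returns its handle.

```kotlin
// Hypothetical cache: meshes whose materials need the same set of defines
// share one program handle. Sets compare by content, so the order in which
// the defines were collected does not matter.
class ProgramCache(private val compile: (Set<String>) -> Int) {
    private val programs = HashMap<Set<String>, Int>()

    // Returns the cached handle, compiling only on the first request.
    fun get(defines: Set<String>): Int =
        programs.getOrPut(defines) { compile(defines) }

    // Number of distinct programs compiled so far.
    val size: Int
        get() = programs.size
}
```

Sorting draw calls by the returned handle would additionally reduce the number of program switches per frame, though I haven't benchmarked whether that matters here.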

Beko
  • It looks like this has to do with texture LOD calculation (which works by comparing texture coordinates in neighbouring fragments) being botched by this branching for some reason, though this shouldn't happen if u_material is a uniform. Did you test this on some other device? – lisyarus May 24 '22 at 22:17
  • Hey, yes I've tested it on another device and it exhibits a similar behavior (I'd say artifacts look even worse). I'll add screenshots in a bit. u_material is a uniform. And thanks for the pointer, I'll see if I can read up something. – Beko May 24 '22 at 22:46
  • Do you use mipmaps? What happens if you disable them, e.g. use just GL_NEAREST filtering? – lisyarus May 25 '22 at 06:56
  • @lisyarus Yes, it uses mipmaps with GL_LINEAR_MIPMAP_LINEAR as the min filter. Changing this and disabling mipmapping has no effect on the issue, however. – Beko May 25 '22 at 15:52
  • Some mobile GPUs have optimised 32-bit paths for texture coordinates (a special case of texture2D with a varying variable as the coordinate) even if the rest of the GPU's ALU is less than 32-bit. By performing calculation or logic on the UV coordinate varying, the shader has to use the lower-precision ALU instead of this higher-precision path. Do you get better precision if you just directly pass one varying into texture2D? – PaulHK May 26 '22 at 09:30
  • @PaulHK I'm not sure if I understood your suggestion, but I think I'm doing that already (?), except that I use two sets of coordinates. But they're not involved in any calculations. – Beko May 26 '22 at 10:14
  • The ternary operator a?b:c won't work, you need to use the varying coordinate directly – PaulHK May 26 '22 at 14:55
  • @PaulHK So that's why return t0; worked but return cond ? t0 : t0 didn't, because in the latter case the varying is involved in some control-flow logic and thereby some precision is lost. I guess in the highp case there was enough precision even after some loss? Anyway, considering that it also works with preprocessor directives I'd say that I indeed get better precision when using them directly. Is there an actual way to measure/debug this? – Beko May 26 '22 at 19:42
  • It's a strange optimisation in the hardware; the preprocessor doesn't count as ALU. Anything that doesn't directly move the coordinate into texture2D prevents this from working. I don't know how to debug this, but a difference in texcoord precision is usually visible, as in your example screenshots. – PaulHK May 27 '22 at 03:48
  • I'd also recommend 2 separate shaders, games for example will use thousands. Doing stuff like if(someUniformVariable) in a shader is wasting gpu time as the if statement is always true. Some gpu drivers can optimise this case by recompiling the shader based on a uniform value to short circuit/optimise the if statement out, usually pc drivers do that. Judging by your comment about t0?t0:t0 not working I don't think your gpu driver is great at optimising (that statement should have been reduced to just loading t0) – PaulHK May 27 '22 at 03:53
  • I misread, cond? t0:t0 should be short circuited to just loading t0. Old Mali gpu drivers aren't too smart unfortunately. I had this same problem myself recently – PaulHK May 27 '22 at 04:01
  • @PaulHK But ultimately preprocessor directives (PPDs) allow me to get rid of branches before the compiler does its work, so I can directly use the texture coordinates, no? The output is looking good, too. Also, with the current PPD approach I end up compiling separate shaders for each mesh. E.g. the Sponza scene has about 100 meshes, each with its own program. – Beko May 27 '22 at 18:16
  • @PaulHK Anyway, I believe that you were right about this issue being related to the precision of texture coordinates, so if you want you can provide an answer. Mali also suggests using highp for textures larger than 512x512 (Sponza uses 1024x1024 textures). Maybe two more questions before I let go: 1. Do you have a link where I could read about that hardware optimization for varying tex coords? 2. Do method calls also influence this? So is it OK to access them via getTexCoords() if the method just does a return texCoords? Thanks a lot! – Beko May 27 '22 at 18:23
