
I am trying to get this shader to run on a really old iDevice, as well as eventually Androids.

Even when I reduce the code down to 2 sine functions per fragment the shader runs at about 20fps.

I have considered taking a leaf out of the book of old shading techniques and creating an array that holds a bunch of predefined trig values and somehow using those to approximate the shader.

In the shader I linked above I am already simulating that by rounding the values sent to the trig function: the further left the mouse goes (while held down), the lower the quality of the shader. It's actually pretty cool, because really close to the left side it looks like a completely different shader.

Anyway I have two dilemmas:

  1. I don't know the most efficient way to hold an array of around 360 values in a GLSL shader: constant or uniform?
  2. I can't figure out how to wrap a number into a range. Usually, if I wanted an angle between 0 and 360 (yes, I know GPUs use radians), I would do it like so:

    float range(float angle)
    {
        float temp = angle;
        // Step the angle back into the 0..360 range one full turn at a time.
        while (temp > 360.0) { temp -= 360.0; }
        while (temp < 0.0)   { temp += 360.0; }
        return temp;
    }
    

    However GLSL does not allow while loops or recursive functions.

J.Doe
  • I don't know if it would be practical to implement this, but would it help to have the precomputed sine values spaced unevenly, with more closely clustered values where the slope of the sine curve is steepest, and fewer values where it levels out and doesn't change as much? Would this allow higher accuracy where it is needed, without needing to store a large number of values? – trichoplax is on Codidact now Jul 02 '16 at 19:26
  • Can you just render to a texture that is 1/2 or 1/4 the size of the final target? –  Jul 02 '16 at 21:59
  • Regarding #2, the built-in mod function is what you want. You would write mod(angle, 360.0). – Nathan Reed Jul 03 '16 at 01:10
  • @trichoplax brilliant idea, but I don't know how you would look up values in the table then. Let's say we put them in an array with some of them more concentrated. How could we find the right index? – J.Doe Jul 03 '16 at 01:35
  • @racarate actually will probably do that. Feel free to post that as the answer. – J.Doe Jul 03 '16 at 01:35
  • How about putting your values into a 3-channel 1D texture? That way you can get sin, cos and tan out for the price of a single texture lookup. If you map 0 - 2pi angle to 0 - 1 UV and use repeat texture mode you don't even need the mod call, it will 'wrap' automatically, and you can also get linear-filtered approximations in between your stored values rather than snapping to the nearest one. – russ Jul 04 '16 at 05:31
  • A lot of the time you can eliminate trig functions when used for geometry by not using the angle but start and end with the sin/cos pair and use trig identities for half angles and such. – ratchet freak Jul 04 '16 at 08:27
  • Russ has, I think, the best solution here. If you only need sine, then a single-channel half-precision texture will work best, as it'll be important for performance that the 1D texture fits in cache (less precision than half float will likely not reproduce the sine function very well). – GroverManheim Jul 06 '16 at 19:15
  • That is brilliant! Though filling up a texture with values would be a new technique for me. I can write to a texture, but I can't ensure that I'm writing exact pixels rather than, say, .5 pixels. – J.Doe Jul 06 '16 at 20:11
  • And I don't quite understand how I would look up values on the texture for a given radian value. – J.Doe Jul 06 '16 at 20:11
  • Convert radians to revolutions by multiplying by 1/(2pi); have your texture represent one complete revolution of sin/cos/whatever and lookup with GL_REPEAT. Please post what kind of speedups you get - different HW has very different performance characteristics - AMD has generally supported single cycle sin/cos instructions through HW lookup tables and texture lookup tables were, at least at one time, a slowdown. Keeping the table small so it lives in cache as GroverManheim suggested, is important. –  Jan 03 '17 at 01:04
  • When you say "really old iDevice" how old do you mean? For example, page 8 of the series 6 gives the "fred" instruction which is meant for trig range reduction (admittedly for Radians), which can then be followed by fsinc. It's been a while since I worked on these (i.e. developing the HW algorithms) but can't imagine there's a much faster way of calculating sine/cosine. – Simon F Jan 03 '17 at 11:28
  • If sin instruction and texture lookup are expensive, you might also try approximating sin with a polynomial – JarkkoL Jan 03 '17 at 22:22

1 Answer


It has been suggested repeatedly in the comments, but no one felt the need to give a proper answer, so for the sake of completeness: a straightforward and common solution to this problem is to use a texture as a lookup table, specifically a 1D texture that contains the values of your function over the possible input range (i.e. $[0,360)$ / $[0,2\pi)$). This has various advantages:

  • It uses normalized coordinates, i.e. you access the texture by mapping your angles from $[0,360]$ to $[0,1]$. This means your shader doesn't have to care about the number of values actually stored. You can adjust the texture size according to whatever memory/speed vs. quality tradeoff you want (and especially on older/embedded hardware you might want a power of two as texture size anyway).
  • You get the additional benefit of not having to do your loop-like interval adjustment (although, you wouldn't need loops anyway and could just use a modulus operation). Just use GL_REPEAT as wrapping mode for the texture and it will automatically start at the beginning again when accessing with arguments > 1 (and similarly for negative arguments).
  • And you also get the benefit of linearly interpolating between two values in the array basically for free (or let's say almost free) by using GL_LINEAR as texture filter, this way getting values you didn't even store. Of course linear interpolation isn't 100% accurate for trigonometric functions, but it's certainly better than no interpolation.
  • You can store more than one value in the texture by using an RGBA texture (or however many components you need). This way you can get e.g. sin and cos with a single texture lookup.
  • For sin and cos you only need to store values in $[-1,1]$ anyway, which you can naturally rescale from the normalized $[0,1]$ range of a common 8-bit fixed-point format. However, that might not be enough precision for your needs. Some people suggested using 16-bit floating-point values, as they're more precise than the usual 8-bit normalized fixed-point values but less memory-intensive than real 32-bit floats. But then again, I don't know if your implementation supports floating-point textures to begin with. If not, then maybe you can use two 8-bit fixed-point components and combine them into a single value with something like float sinValue = 2.0 * (texValue.r + texValue.g / 256.0) - 1.0; (or even more components for finer granularity). This lets you profit from multi-component textures yet again.
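
To make the two-component packing from the last bullet a bit more concrete, here is a minimal CPU-side sketch in C. The helper names `encode_unorm2x8` and `decode_unorm2x8` are made up for illustration, and the sketch assumes an 8-bit normalized texture format in which a stored byte b reads back in the shader as b / 255. The decode function mirrors the shader expression above; the encode function is just one plausible way to produce matching bytes, not the only one.

```c
#include <stdint.h>

/* Hypothetical helper: split v in [-1, 1] into a coarse and a fine byte,
   to be stored in the R and G channels of an 8-bit normalized texture. */
void encode_unorm2x8(float v, uint8_t *hi, uint8_t *lo)
{
    if (v < -1.0f) v = -1.0f;
    if (v >  1.0f) v =  1.0f;

    float t = (v * 0.5f + 0.5f) * 255.0f;   /* rescale [-1, 1] to [0, 255] */
    int whole = (int)t;                      /* coarse part (t >= 0, so this is floor) */
    int fine  = (int)((t - (float)whole) * 256.0f + 0.5f);  /* remainder in 1/256 steps */
    if (fine > 255) fine = 255;              /* the remainder can round up to 256 */

    *hi = (uint8_t)whole;
    *lo = (uint8_t)fine;
}

/* Hypothetical helper: what the shader expression computes, with r and g
   being the normalized channel values the sampler would return. */
float decode_unorm2x8(uint8_t hi, uint8_t lo)
{
    float r = (float)hi / 255.0f;
    float g = (float)lo / 255.0f;
    return 2.0f * (r + g / 256.0f) - 1.0f;
}
```

With this scheme the round trip should stay within roughly 3e-5 of the original value, compared to about 4e-3 for a single 8-bit channel. Note that linear filtering interpolates each channel separately, which works out here because the decode is linear in both channels, but it is worth verifying on the target hardware.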

Of course it still has to be evaluated if this is a better solution, since texture access isn't entirely free either, as well as what the best combination of texture size and format would be.

As for filling the texture with data, and addressing one of your comments: texture filtering returns the exact stored value only at the texel center, i.e. at a texture coordinate offset by half a texel from the texel's edge. So yes, you should generate values at .5 texels, i.e. something like this in application code:

float texels[256];
// Sample one full period of the function at the texel centers (i + 0.5).
for(unsigned int i = 0; i < 256; ++i)
    texels[i] = sin((i + .5f) / 256.f * TWO_PI);
glTexImage1D(GL_TEXTURE_1D, 0, ..., 256, 0, GL_RED, GL_FLOAT, texels);

You might, however, still want to compare this approach's performance against one using a small uniform array (i.e. uniform float sinTable[361], or maybe fewer entries in practice; keep an eye on your implementation's limit on uniform array size), which you just load with the respective values using glUniform1fv and access by adjusting your angle to $[0,360)$ using the mod function and rounding it to the nearest value:

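// GLSL mod(x, y) is x - y * floor(x / y), so the result is already non-negative.
angle = mod(angle, 360.0);
// Adding 0.5 rounds to the nearest entry; index 360 is why the table has 361 values.
float value = sinTable[int(angle + 0.5)];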
Christian Rau
  • Here's an interesting extension to storing look up tables in textures. It (ab)uses N-linear texture interpolation to get higher order interpolation (aka better than linear) of data points, surfaces, volumes, and hypervolumes. https://blog.demofox.org/2016/02/22/gpu-texture-sampler-bezier-curve-evaluation/ – Alan Wolfe Oct 03 '17 at 17:14
  • Would a read only float buffer be faster than a texture? – wduk Jun 30 '20 at 02:59