What is the difference between a cutout shader and using clip?
I'm trying to use the clip() shader function so that specific pixels of a mesh are not rendered. Those pixels are determined by an alpha texture. Originally I was using a cutout shader, which visually achieved a similar effect, except the pixels were only made transparent/cut out, as opposed to being discarded the way the clip() function does. These are the two lines of the shader that achieve each effect, respectively:
o.Alpha = c.a * tex2D(_MainTex, IN.uv_SkinTex).a;
//clip(c.a * tex2D(_MainTex, IN.uv_SkinTex).a < 0.5f ? -1 : 1);
I would like to know what exactly the difference is between using a cutout shader and using clip() to not render parts of a mesh. For example, is there a general performance difference or an intended use for each? Ultimately I would like to decide whether to use one, the other, or both.
Unity's cutout shaders use clip(). To see this, download Unity's built-in shader source and copy AlphaTest-Diffuse.shader into your project. In the Inspector, click "Show generated code". In the generated pixel shader source you will see lines like this: clip(o.Alpha - _Cutoff);
It's the "alphatest:_Cutoff" directive in the #pragma line that generates that clip() call after your surface shader code runs (I assume you are using surface shaders).
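For reference, a minimal surface shader using that pragma might look like the sketch below (the shader name and property defaults are placeholders, not from the question's project):

```hlsl
Shader "Custom/CutoutExample" {
    Properties {
        _MainTex ("Albedo (RGB) Alpha (A)", 2D) = "white" {}
        _Cutoff ("Alpha Cutoff", Range(0,1)) = 0.5
    }
    SubShader {
        Tags { "Queue"="AlphaTest" "RenderType"="TransparentCutout" }
        CGPROGRAM
        // alphatest:_Cutoff makes the generated fragment shader call
        // clip(o.Alpha - _Cutoff) after surf() has run
        #pragma surface surf Lambert alphatest:_Cutoff
        sampler2D _MainTex;
        struct Input { float2 uv_MainTex; };
        void surf (Input IN, inout SurfaceOutput o) {
            fixed4 c = tex2D(_MainTex, IN.uv_MainTex);
            o.Albedo = c.rgb;
            o.Alpha = c.a;  // compared against _Cutoff by the generated code
        }
        ENDCG
    }
}
```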
Now, clip() should be somewhat faster than a blending operation that doesn't change the color (i.e. a totally transparent pixel), especially if there are blocks of transparent pixels (GPUs shade pixels in blocks, and a block isn't retired until every pixel in it is either written or clipped), and especially if the clip() is done before lots of other instructions so that it skips a large amount of code.
That's some great info. So you are saying that it is better for performance to do the clip() earlier (e.g. before the diffuse, normal maps, etc.) since there will then be fewer pixels to shade. Have I got the right idea?
AFAIK blending comes for free (there's a separate fixed-function unit for it), so transparency itself is not expensive. The expense comes from the fact that transparent objects (usually) don't write to depth and have to be distance-sorted.
Well, like Tanoshimi wrote, there is no one best solution for all cases, so you should try different options.
The idea behind an early clip() is that it stops the shader, so if it is placed before other instructions it skips them as well as the later pipeline stages. The optimization comes from avoiding your shader instructions, the z-write, and the read/write to the frame buffer for blending (as Pangamini said, the blending computations themselves are free, but you still save the bandwidth of reading and writing the framebuffer, though that is usually very well cached on modern desktop GPUs anyway).
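As a sketch of that idea in a hand-written fragment shader (in a surface shader Unity decides the clip placement for you; the texture names here are hypothetical), the clip can be issued right after the alpha mask is sampled, before the more expensive work:

```hlsl
sampler2D _MainTex;    // alpha/cutout mask (hypothetical names)
sampler2D _AlbedoTex;
sampler2D _BumpMap;
float _Cutoff;

fixed4 frag (v2f i) : SV_Target {
    // Do only the work needed for the visibility test first...
    fixed alpha = tex2D(_MainTex, i.uv).a;
    // ...then discard as early as possible: for clipped fragments,
    // everything below is skipped (no extra fetches, no lighting).
    clip(alpha - _Cutoff);

    fixed4 albedo = tex2D(_AlbedoTex, i.uv);
    fixed3 normal = UnpackNormal(tex2D(_BumpMap, i.uv));
    // ... expensive lighting math here ...
    return fixed4(albedo.rgb, alpha);
}
```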
The other complexity is that shaders work in groups: the GPU shades a small grid of pixels at the same time and has to wait for all of them to finish before it can move on to another grid. So if even one pixel in that grid is slower to compute (e.g. not transparent), it slows the rendering of the whole grid (the same kind of consequence you get with some "if/else" instructions). But always take this with a grain of salt, as it changes between GPU generations and architectures, so again, do your own research and tests on the hardware you target.
To add another option: maybe a mesh with lots of "holes" in it would be faster in your case. The mesh would have more vertices and triangles, but it might be faster to draw, just as it's sometimes faster to render a sprite bounded by many triangles with few fully transparent pixels than a simple quad-bounded sprite.
Answer by tanoshimi · Dec 17, 2016 at 05:31 PM
The clip() function discards a pixel if any component of its argument is less than 0. It is logically identical to:
void clip(float4 x) {
    if (any(x < 0))
        discard;
}
Setting the alpha of a pixel to 0 doesn't discard it at all: the fragment still continues down the pipeline. Depending on the blend mode and blend factors, fragments can still affect the rendered image even when their alpha is 0.
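As a concrete illustration (a sketch, not the shader from the question): with additive blending the blend factors ignore alpha entirely, so even an alpha = 0 fragment changes the image:

```hlsl
// ShaderLab blend state: final = src.rgb * 1 + dst.rgb * 1
Blend One One

// A fragment returning fixed4(0.2, 0.2, 0.2, 0) still brightens the
// framebuffer by 0.2 per channel despite its alpha being 0; only
// clip()/discard actually removes the fragment from the pipeline.
```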
If you want to compare the performance of discarding pixels with clip() against setting them fully transparent (e.g. using Blend SrcAlpha DstAlpha), it depends on the platform:
- Generally, clip() gives you a small advantage on most platforms when using it to remove totally transparent pixels.
- However, on the PowerVR GPUs found in iOS and some Android devices, alpha testing is more resource-intensive than overdraw, and you'd get better performance by simply drawing the fully transparent pixels.
As always, the only true answer is to profile on your target device.