Is it possible to profile compute shaders?
As the title says, I'm wondering if it is possible to profile compute shaders. I want to know how long each compute kernel took within one frame. The Unity Profiler does not show this, as it only tracks CPU calls; compute kernels run asynchronously on the GPU, so the Dispatch calls themselves take 0 ms. Any ideas?
Thanks, but that only applies to VR SDKs as far as I understand. I ended up disabling VSync so that it would just run as fast as possible. However, this is not very accurate when it runs very fast, and the frame delta time is not very helpful when you want the time of individual compute shader dispatches. When I needed accurate values (for a scientific thesis) I dispatched a kernel and copied a tiny buffer back from GPU memory to main memory immediately after the dispatch call. This forces the GPU to finish the kernel before the copy, so I could measure the elapsed time on the CPU; just keep in mind that the measurement also includes the cost of copying the data back. A rough sketch of that trick is shown below.
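A minimal sketch of this forced-readback timing approach, assuming a kernel named CSMain and a buffer named _Result (both placeholders for whatever your shader actually uses):

```csharp
using System.Diagnostics;
using UnityEngine;

public class KernelTimer : MonoBehaviour
{
    public ComputeShader shader;   // assign your compute shader in the Inspector
    public int threadGroups = 64;  // placeholder thread group count

    void Start()
    {
        int kernel = shader.FindKernel("CSMain");             // assumed kernel name
        var syncBuffer = new ComputeBuffer(1, sizeof(float));  // tiny buffer used only to force a sync
        shader.SetBuffer(kernel, "_Result", syncBuffer);        // assumed buffer name in the shader

        var sw = Stopwatch.StartNew();
        shader.Dispatch(kernel, threadGroups, 1, 1);

        var tmp = new float[1];
        syncBuffer.GetData(tmp); // blocks the CPU until the GPU has finished the dispatch
        sw.Stop();

        // Note: the measured time includes the (small) readback cost mentioned above.
        UnityEngine.Debug.Log($"Kernel took ~{sw.Elapsed.TotalMilliseconds:F3} ms");
        syncBuffer.Release();
    }
}
```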
Ah, that's really cool! Yeah, I don't think that function exists outside VR SDKs, and it also includes the GPU time spent on the VR functions, rendering, and everything else. Yours is a great way to profile a compute shader though.
Answer by Arycama · Jun 16, 2019 at 07:57 AM
Are you talking about profiling in Editor or on device?
If GPU profiling works in Editor on your machine, you can wrap the Dispatch calls in Profiler.BeginSample("Some meaningful string here") and Profiler.EndSample().
This will show the GPU time under the GPU profiler for each dispatch (see the sketch below). I'd assume the same works for builds too, if the device supports GPU profiling.
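A minimal sketch of wrapping a dispatch in profiler samples, again assuming a kernel named CSMain and a shader reference assigned in the Inspector (both placeholders):

```csharp
using UnityEngine;
using UnityEngine.Profiling;

public class ProfiledDispatch : MonoBehaviour
{
    public ComputeShader shader;  // assign in the Inspector
    int kernel;

    void Start()
    {
        kernel = shader.FindKernel("CSMain"); // assumed kernel name
    }

    void Update()
    {
        // The sample name shows up in the profiler; with GPU profiling enabled,
        // the corresponding GPU time appears under the same label.
        Profiler.BeginSample("CSMain Dispatch");
        shader.Dispatch(kernel, 64, 1, 1);    // placeholder thread group counts
        Profiler.EndSample();
    }
}
```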
Keep in mind, if you're reading data back from the GPU, this won't necessarily help, as calling ComputeBuffer.GetData will stall the CPU until the GPU is finished. To avoid the stall you'd have to yield return new WaitForSeconds() with an approximate guess of how long the compute shader will take, which will vary across devices.
AsyncGPUReadbacks, introduced in 2019, help with this; however, the GPU will still stutter if you're running large compute shader workloads at once, as they still interrupt the regular rendering draw calls. Async Compute helps with this, but it requires using command buffers and is apparently only supported on PS4 and Xbox One. I haven't tried it out in the editor yet, however. A rough sketch of the async readback API is shown below.
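A minimal sketch of an async readback, assuming a kernel named CSMain writing to a buffer called _Result (both placeholders):

```csharp
using Unity.Collections;
using UnityEngine;
using UnityEngine.Rendering;

public class AsyncReadbackExample : MonoBehaviour
{
    public ComputeShader shader;  // assign in the Inspector
    ComputeBuffer buffer;

    void Start()
    {
        int kernel = shader.FindKernel("CSMain");        // assumed kernel name
        buffer = new ComputeBuffer(1024, sizeof(float));
        shader.SetBuffer(kernel, "_Result", buffer);     // assumed buffer name

        shader.Dispatch(kernel, 1024 / 64, 1, 1);
        // Request the data without stalling; the callback fires some frames later.
        AsyncGPUReadback.Request(buffer, OnReadback);
    }

    void OnReadback(AsyncGPUReadbackRequest request)
    {
        if (request.hasError) { Debug.LogError("GPU readback failed"); return; }
        NativeArray<float> data = request.GetData<float>();
        Debug.Log($"First value: {data[0]}");
    }

    void OnDestroy()
    {
        buffer?.Release();
    }
}
```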
Anyway, just some general ComputeShader optimisation tips from me. There's also a lot of good info online about keeping your data structures aligned to 4 bytes, etc. (one interpretation of that advice is sketched below).
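As a hedged illustration of that alignment advice: keep each field 4-byte sized and pad the C# struct so its stride matches the HLSL layout. The Particle struct and field names here are made up for the example:

```csharp
using System.Runtime.InteropServices;
using UnityEngine;

// Illustrative struct: 12 bytes of data padded to a 16-byte stride so it maps
// cleanly onto a matching HLSL struct in a StructuredBuffer.
[StructLayout(LayoutKind.Sequential)]
public struct Particle
{
    public Vector3 position; // 12 bytes (3 x 4-byte floats)
    public float padding;    // pads the stride to 16 bytes
}

public static class ParticleBufferFactory
{
    public static ComputeBuffer Create(int count)
    {
        // The stride passed to the ComputeBuffer must match the struct size.
        return new ComputeBuffer(count, Marshal.SizeOf(typeof(Particle)));
    }
}
```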
Good luck, compute shaders are very fun.
Answer by Waffle1434 · Nov 24, 2019 at 10:16 PM
I would suggest compiling a build and profiling it with a graphics debugger like RenderDoc, Nsight, or the latest PIX, which should collect statistics on compute kernels in a frame. I believe Nsight and PIX show GPU occupancy.