- Home /
Compute shaders strange error
I have a compute shader and the C# script which goes with it used to modify an array of vertices on the y axis simple enough to be clear.
But despite the fact that it runs fine the shader seems to forget the first vertex of my shape (except when that shape is a closed volume?)
Here is the C# class :
Mesh m;
//public bool stopProcess = false; //Useless in this version of exemple
MeshCollider coll;
public ComputeShader csFile; //the compute shader file added the Unity way
Vector3[] arrayToProcess; //An array of vectors i'll use to store data
ComputeBuffer cbf; //the buffer CPU->GPU (An early version with exactly
//the same result had only this one)
ComputeBuffer cbfOut; //the Buffer GPU->CPU
int vertexLength;
void Awake() { //Assigning my stuff
coll = gameObject.GetComponent<MeshCollider>();
m = GetComponent<MeshFilter>().sharedMesh;
vertexLength = m.vertices.Length;
arrayToProcess = m.vertices; //setting the first version of the vertex array (copy of mesh)
}
void Start () {
cbf = new ComputeBuffer(vertexLength,32); //Buffer in
cbfOut = new ComputeBuffer(vertexLength,32); //Buffer out
csFile.SetBuffer(0,"Board",cbf);
csFile.SetBuffer(0,"BoardOut",cbfOut);
}
void Update () {
csFile.SetFloat("time",Time.time);
cbf.SetData(m.vertices);
csFile.Dispatch(0,vertexLength,vertexLength,1); //Dispatching (i think there is my mistake)
cbfOut.GetData(arrayToProcess); //getting back my processed vertices
m.vertices = arrayToProcess; //assigning them to the mesh
//coll.sharedMesh = m; //collider stuff useless in this demo
}
And my compute shader script :
#pragma kernel CSMain
RWStructuredBuffer<float3> Board : register(s[0]);
RWStructuredBuffer<float3> BoardOut : register(s[1]);
float time;
[numthreads(1,1,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
float valx = (sin((time*4)+Board[id.x].x));
float valz = (cos((time*2)+Board[id.x].z));
Board[id.x].y = (valx + valz)/5;
BoardOut[id.x] = Board[id.x];
}
At the beginning I was reading and writing from the same buffer, but as I had my issue I tried having separate buffers, but with no success. I still have the same problem.
Maybe I misunderstood the way compute shaders are supposed to be used (and I know I could use a vertex shader but I just want to try compute shaders for further improvements.)
To complete what I said, I suppose it is related with the way vertices are indexed in the Mesh.vertices Array.
I tried a LOT of different Blocks/Threads configuration but nothing seems to solve the issue combinations tried :
Block Thread
60,60,1 1,1,1
1,1,1 60,60,3
10,10,3 3,1,1
and some others I do not remember. I think the best configuration should be something with a good balance like :
Block : VertexCount,1,1 Thread : 3,1,1
About the closed volume: I'm not sure about that because with a Cube {8 Vertices} everything seems to move accordingly, but with a shape with an odd number of vertices, the first (or last did not checked that yet) seems to not be processed
I tried it with many different shapes but subdivided planes are the most obvious, one corner is always not moving.
EDIT :
After further study i found out that it is simply the compute shader which does not compute the last (not the first i checked) vertices of the mesh, it seems related to the buffer type, i still dont get why RWStructuredBuffer should be an issue or how badly i use it, is it reserved to streams? i cant understand the MSDN doc on this one.
This is a duplicate of a question i asked on SO : http://stackoverflow.com/q/20630796/1906111
Answer by Gogeric · Dec 20, 2013 at 02:22 PM
Question answered properly here : StackOverflow
I'm familiar with Compute Shaders but have never touched Unity, but having looked over the documentation for Compute Shaders in Unity a couple of things stand out.
The cbf and cbfOut ComputeBuffers are created with a stride of 32 (bytes?). Both your StructuredBuffers contain float3s which have a stride of 12 bytes, not 32. Where has 32 come from?
When you dispatch your compute shader you're requesting a two-dimensional dispatch (vertexLength,vertexLength, 1) but you're operating on a 1D array of float3s. You will end up with a race condition where many different threads think they're responsible for updating each element of the array. Although awful for performance, if you want a thread group size of [numthreads(1,1,1)] then you should dispatch (vertexLength, 1, 1) numbers of waves/wavefronts when calling Dispatch (ie, Dispatch (60,1,1) with numThreads(1,1,1)).
For best/better performance the number of threads in your thread group / wave should at least be a multiple of 64 for best efficiency on AMD hardware. You then need only dispatch ceil(numVertices/64) wavefronts and then simply insert some logic into the shader to ensure id.x is not out of bounds for any given thread.
EDIT:
The documentation for the ComputeBuffer constructor is here: Unity ComputeBuffer Documentation While it doesn't explicitly say "stride" is in bytes, it's the only reasonable assumption.
Original answer from Adam Miles on StackOverflow
I can't believe that I missed the stride error :|
I have seen the 32 but didn't think about it could be worth :)
Since you accepted your answer it would be great if you could add the actual answer and not just link to an external site. External links tend to break / become invalid. Also for others it a pain to follow link chains without any information on the way.
Answer by Bunny83 · Dec 19, 2013 at 01:45 PM
I don't have a pro version here and i don't even have a DX11 card :D. So i can't say much about the compute shaders, however you might try a simple shader just to prove it processes each element.
For example passing in an array with all elements (0,0,0) and let the shader just do
BoardOut[id.x].x = 1.0
I did that actually at the beginning and the result was the same, for any array (except a cube...) the last element is not processed
But that makes no sense. Have you tested different array length? $$anonymous$$aybe it has to be even / odd / multiple of (4 / 6 / 8)? A Vector3-array is a Vector3-array, no matter where it come from or what it's used for. It seems you interpret your results in a strange way ;) If you have a sample array that works and one that doesn't you should work out the real difference.
Debugging is about analysing test data which includes the data itself (your arrays) and your code. Sorry but as i said i can't run any tests on this issue.
If you have some time, i suggest you watch this lecture of the $$anonymous$$IT about debugging ;)
Your answer
Follow this Question
Related Questions
Texture2D to Texture3D 2 Answers
GLSL shader - array of uniform const 0 Answers
Stencil buffer appears to be displaced when running on android 0 Answers
How to pass an array to a shader? 2 Answers
Access Unity Projector Buffer 0 Answers