Help with Optimizing Voxel Code?
Hi guys, I've been working on optimizing some code for generating a voxel chunk (cubed, so x cubes by x cubes wide, each calculated, etc.) and have no idea how to optimize it further. I know where the lag is (posted below) but can't figure out how to optimize it, or at the very least hide how slow it is :p Here's the relevant code:
public static byte calculateByte(Vector3 pos, Vector3 offset1, Vector3 offset2, Vector3 offset3)
{
    Profiler.BeginSample("Gene");
    float clusterValue = calculateNoise(pos, offset2, 0.0001f);
    float biomeFloat = clusterValue * World.cWorld.biomes.Length;
    int biomeIndex = Mathf.Abs(Mathf.RoundToInt(biomeFloat));
    if (biomeIndex > 1)
    {
        biomeIndex = 1;
    }
    bf = biomeFloat.ToString();
    Biome biome = World.cWorld.biomes[biomeIndex];
    if (biomeDetailsSet == false)
    {
        //Debug.Log(biomeFloat);
        BiomeName = biome.name;
        grassEnabled = biome.grassEnabled;
        biomeDetailsSet = true;
    }
    float heightBase = biome.minHeight;
    float maxHeight = biome.maxHeight;
    float heightSwing = maxHeight - heightBase;
    float blobValue = calculateNoise(pos, offset2, 0.01f);
    float mountainValue = calculateNoise(pos, offset1, 0.01f);
    mountainValue += biome.mountainPowerBonus;
    mountainValue = Mathf.Pow(mountainValue, biome.mountainPower);
    //if (mountainValue < 0) mountainValue = 0;
    mountainValue = Mathf.Sqrt(mountainValue);
    byte block = biome.getBlock(Mathf.FloorToInt(pos.y), mountainValue, blobValue);
    mountainValue *= heightSwing;
    mountainValue += heightBase;
    mountainValue += (blobValue * 10) - 5f;
    if (mountainValue >= pos.y)
        return block;
    Profiler.EndSample();
    return 0;
}
public virtual IEnumerator calculateWorldMap()
{
    map = new byte[(byte)chunkWidth, (byte)chunkHeight, (byte)chunkWidth];
    Random.seed = World.cWorld.seed;
    Vector3 grain0Offset = new Vector3(Random.value * 10000, Random.value * 10000, Random.value * 10000);
    Vector3 grain1Offset = new Vector3(Random.value * 10000, Random.value * 10000, Random.value * 10000);
    Vector3 grain2Offset = new Vector3(Random.value * 10000, Random.value * 10000, Random.value * 10000);
    if (loaded == false)
    {
        for (int x = 0; x < chunkWidth; x++)
        {
            for (int y = 0; y < chunkHeight; y++)
            {
                for (int z = 0; z < chunkWidth; z++)
                {
                    if (y == 0)
                    {
                        map[x, y, z] = 6;
                    }
                    else
                    {
                        map[x, y, z] = calculateByte(new Vector3(x, y, z) + transform.position, grain0Offset, grain1Offset, grain2Offset);
                        if (grassEnabled)
                        {
                            if (map[x, y, z] == 0 && map[x, y - 1, z] != 1 && map[x, y - 1, z] != 0)
                            {
                                map[x, y, z] = 1;
                            }
                        }
                    }
                }
            }
            yield return null;
        }
        yield return null;
    }
    f = bf;
    StartCoroutine(createVisualMesh());
    initialized = true;
    yield return null;
    chunksWaiting.Remove(this);
    if (chunksWaiting.Count > 0)
    {
        StartCoroutine(chunksWaiting[0].calculateWorldMap());
    }
    yield return 0;
}
Any ideas? This has been bottlenecking me for quite a while, as it is the main thing slowing me down. Many thanks :)
Edit: As requested I've now added the noise function and the noise generator. Noise Generator:
// SimplexNoise for C#
// Author: Heikki Törmälä
//This is free and unencumbered software released into the public domain.
//Anyone is free to copy, modify, publish, use, compile, sell, or
//distribute this software, either in source code form or as a compiled
//binary, for any purpose, commercial or non-commercial, and by any
//means.
//In jurisdictions that recognize copyright laws, the author or authors
//of this software dedicate any and all copyright interest in the
//software to the public domain. We make this dedication for the benefit
//of the public at large and to the detriment of our heirs and
//successors. We intend this dedication to be an overt act of
//relinquishment in perpetuity of all present and future rights to this
//software under copyright law.
//THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
//EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
//MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
//IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
//OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
//ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
//OTHER DEALINGS IN THE SOFTWARE.
//For more information, please refer to <http://unlicense.org/>
namespace SimplexNoise
{
/// <summary>
/// Implementation of the Perlin simplex noise, an improved Perlin noise algorithm.
/// Based loosely on SimplexNoise1234 by Stefan Gustavson <http://staffwww.itn.liu.se/~stegu/aqsis/aqsis-newnoise/>
///
/// </summary>
public class Noise
{
/// <summary>
/// 1D simplex noise
/// </summary>
/// <param name="x"></param>
/// <returns></returns>
public static float Generate(float x)
{
int i0 = FastFloor(x);
int i1 = i0 + 1;
float x0 = x - i0;
float x1 = x0 - 1.0f;
float n0, n1;
float t0 = 1.0f - x0*x0;
t0 *= t0;
n0 = t0 * t0 * grad(perm[i0 & 0xff], x0);
float t1 = 1.0f - x1*x1;
t1 *= t1;
n1 = t1 * t1 * grad(perm[i1 & 0xff], x1);
// The maximum value of this noise is 8*(3/4)^4 = 2.53125
// A factor of 0.395 scales to fit exactly within [-1,1]
return 0.395f * (n0 + n1);
}
/// <summary>
/// 2D simplex noise
/// </summary>
/// <param name="x"></param>
/// <param name="y"></param>
/// <returns></returns>
public static float Generate(float x, float y)
{
const float F2 = 0.366025403f; // F2 = 0.5*(sqrt(3.0)-1.0)
const float G2 = 0.211324865f; // G2 = (3.0-Math.sqrt(3.0))/6.0
float n0, n1, n2; // Noise contributions from the three corners
// Skew the input space to determine which simplex cell we're in
float s = (x+y)*F2; // Hairy factor for 2D
float xs = x + s;
float ys = y + s;
int i = FastFloor(xs);
int j = FastFloor(ys);
float t = (float)(i+j)*G2;
float X0 = i-t; // Unskew the cell origin back to (x,y) space
float Y0 = j-t;
float x0 = x-X0; // The x,y distances from the cell origin
float y0 = y-Y0;
// For the 2D case, the simplex shape is an equilateral triangle.
// Determine which simplex we are in.
int i1, j1; // Offsets for second (middle) corner of simplex in (i,j) coords
if(x0>y0) {i1=1; j1=0;} // lower triangle, XY order: (0,0)->(1,0)->(1,1)
else {i1=0; j1=1;} // upper triangle, YX order: (0,0)->(0,1)->(1,1)
// A step of (1,0) in (i,j) means a step of (1-c,-c) in (x,y), and
// a step of (0,1) in (i,j) means a step of (-c,1-c) in (x,y), where
// c = (3-sqrt(3))/6
float x1 = x0 - i1 + G2; // Offsets for middle corner in (x,y) unskewed coords
float y1 = y0 - j1 + G2;
float x2 = x0 - 1.0f + 2.0f * G2; // Offsets for last corner in (x,y) unskewed coords
float y2 = y0 - 1.0f + 2.0f * G2;
// Wrap the integer indices at 256, to avoid indexing perm[] out of bounds
int ii = i % 256;
int jj = j % 256;
// Calculate the contribution from the three corners
float t0 = 0.5f - x0*x0-y0*y0;
if(t0 < 0.0f) n0 = 0.0f;
else {
t0 *= t0;
n0 = t0 * t0 * grad(perm[ii+perm[jj]], x0, y0);
}
float t1 = 0.5f - x1*x1-y1*y1;
if(t1 < 0.0f) n1 = 0.0f;
else {
t1 *= t1;
n1 = t1 * t1 * grad(perm[ii+i1+perm[jj+j1]], x1, y1);
}
float t2 = 0.5f - x2*x2-y2*y2;
if(t2 < 0.0f) n2 = 0.0f;
else {
t2 *= t2;
n2 = t2 * t2 * grad(perm[ii+1+perm[jj+1]], x2, y2);
}
// Add contributions from each corner to get the final noise value.
// The result is scaled to return values in the interval [-1,1].
return 40.0f * (n0 + n1 + n2); // TODO: The scale factor is preliminary!
}
public static float Generate(float x, float y, float z)
{
// Simple skewing factors for the 3D case
const float F3 = 0.333333333f;
const float G3 = 0.166666667f;
float n0, n1, n2, n3; // Noise contributions from the four corners
// Skew the input space to determine which simplex cell we're in
float s = (x+y+z)*F3; // Very nice and simple skew factor for 3D
float xs = x+s;
float ys = y+s;
float zs = z+s;
int i = FastFloor(xs);
int j = FastFloor(ys);
int k = FastFloor(zs);
float t = (float)(i+j+k)*G3;
float X0 = i-t; // Unskew the cell origin back to (x,y,z) space
float Y0 = j-t;
float Z0 = k-t;
float x0 = x-X0; // The x,y,z distances from the cell origin
float y0 = y-Y0;
float z0 = z-Z0;
// For the 3D case, the simplex shape is a slightly irregular tetrahedron.
// Determine which simplex we are in.
int i1, j1, k1; // Offsets for second corner of simplex in (i,j,k) coords
int i2, j2, k2; // Offsets for third corner of simplex in (i,j,k) coords
/* This code would benefit from a backport from the GLSL version! */
if(x0>=y0) {
if(y0>=z0)
{ i1=1; j1=0; k1=0; i2=1; j2=1; k2=0; } // X Y Z order
else if(x0>=z0) { i1=1; j1=0; k1=0; i2=1; j2=0; k2=1; } // X Z Y order
else { i1=0; j1=0; k1=1; i2=1; j2=0; k2=1; } // Z X Y order
}
else { // x0<y0
if(y0<z0) { i1=0; j1=0; k1=1; i2=0; j2=1; k2=1; } // Z Y X order
else if(x0<z0) { i1=0; j1=1; k1=0; i2=0; j2=1; k2=1; } // Y Z X order
else { i1=0; j1=1; k1=0; i2=1; j2=1; k2=0; } // Y X Z order
}
// A step of (1,0,0) in (i,j,k) means a step of (1-c,-c,-c) in (x,y,z),
// a step of (0,1,0) in (i,j,k) means a step of (-c,1-c,-c) in (x,y,z), and
// a step of (0,0,1) in (i,j,k) means a step of (-c,-c,1-c) in (x,y,z), where
// c = 1/6.
float x1 = x0 - i1 + G3; // Offsets for second corner in (x,y,z) coords
float y1 = y0 - j1 + G3;
float z1 = z0 - k1 + G3;
float x2 = x0 - i2 + 2.0f*G3; // Offsets for third corner in (x,y,z) coords
float y2 = y0 - j2 + 2.0f*G3;
float z2 = z0 - k2 + 2.0f*G3;
float x3 = x0 - 1.0f + 3.0f*G3; // Offsets for last corner in (x,y,z) coords
float y3 = y0 - 1.0f + 3.0f*G3;
float z3 = z0 - 1.0f + 3.0f*G3;
// Wrap the integer indices at 256, to avoid indexing perm[] out of bounds
int ii = Mod(i, 256);
int jj = Mod(j, 256);
int kk = Mod(k, 256);
// Calculate the contribution from the four corners
float t0 = 0.6f - x0*x0 - y0*y0 - z0*z0;
if(t0 < 0.0f) n0 = 0.0f;
else {
t0 *= t0;
n0 = t0 * t0 * grad(perm[ii+perm[jj+perm[kk]]], x0, y0, z0);
}
float t1 = 0.6f - x1*x1 - y1*y1 - z1*z1;
if(t1 < 0.0f) n1 = 0.0f;
else {
t1 *= t1;
n1 = t1 * t1 * grad(perm[ii+i1+perm[jj+j1+perm[kk+k1]]], x1, y1, z1);
}
float t2 = 0.6f - x2*x2 - y2*y2 - z2*z2;
if(t2 < 0.0f) n2 = 0.0f;
else {
t2 *= t2;
n2 = t2 * t2 * grad(perm[ii+i2+perm[jj+j2+perm[kk+k2]]], x2, y2, z2);
}
float t3 = 0.6f - x3*x3 - y3*y3 - z3*z3;
if(t3<0.0f) n3 = 0.0f;
else {
t3 *= t3;
n3 = t3 * t3 * grad(perm[ii+1+perm[jj+1+perm[kk+1]]], x3, y3, z3);
}
// Add contributions from each corner to get the final noise value.
// The result is scaled to stay just inside [-1,1]
return 32.0f * (n0 + n1 + n2 + n3); // TODO: The scale factor is preliminary!
}
public static byte[] perm = new byte[512] { 151,160,137,91,90,15,
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180,
151,160,137,91,90,15,
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180
};
private static int FastFloor(float x)
{
return (x > 0) ? ((int)x) : (((int)x) - 1);
}
private static int Mod(int x, int m)
{
int a = x % m;
return a < 0 ? a + m : a;
}
private static float grad( int hash, float x )
{
int h = hash & 15;
float grad = 1.0f + (h & 7); // Gradient value 1.0, 2.0, ..., 8.0
if ((h & 8) != 0) grad = -grad; // Set a random sign for the gradient
return ( grad * x ); // Multiply the gradient with the distance
}
private static float grad( int hash, float x, float y )
{
int h = hash & 7; // Convert low 3 bits of hash code
float u = h<4 ? x : y; // into 8 simple gradient directions,
float v = h<4 ? y : x; // and compute the dot product with (x,y).
return ((h&1) != 0 ? -u : u) + ((h&2) != 0 ? -2.0f*v : 2.0f*v);
}
private static float grad( int hash, float x, float y , float z ) {
int h = hash & 15; // Convert low 4 bits of hash code into 12 simple
float u = h<8 ? x : y; // gradient directions, and compute dot product.
float v = h<4 ? y : h==12||h==14 ? x : z; // Fix repeats at h = 12 to 15
return ((h&1) != 0 ? -u : u) + ((h&2) != 0 ? -v : v);
}
private static float grad( int hash, float x, float y, float z, float t ) {
int h = hash & 31; // Convert low 5 bits of hash code into 32 simple
float u = h<24 ? x : y; // gradient directions, and compute dot product.
float v = h<16 ? y : z;
float w = h<8 ? z : t;
return ((h&1) != 0 ? -u : u) + ((h&2) != 0 ? -v : v) + ((h&4) != 0 ? -w : w);
}
}
}
Noise Function :
public static float calculateNoise(Vector3 pos, Vector3 offset, float scale)
{
    float noiseX = Mathf.Abs((pos.x + offset.x) * scale);
    float noiseY = Mathf.Abs((pos.y + offset.y) * scale);
    float noiseZ = Mathf.Abs((pos.z + offset.z) * scale);
    return Noise.Generate(noiseX, noiseY, noiseZ);
}
Edit 2: I've now added a thread pool, which works to some degree. However, I'm still getting about half the frame rate I should (about 15fps when generating chunks). I'm completely new to multi-threading in Unity, so ideas on what could be causing the lag between threads would help a lot :)
I see these functions are being called every time in the calculateByte code:
mountainValue = Mathf.Pow(mountainValue, biome.mountainPower);
mountainValue = Mathf.Sqrt(mountainValue);
Both of these are fairly expensive operations; not sure if you have any other options though.
Also, not sure where this ever gets used:
bf = biomeFloat.ToString();
other than
f=bf
Do you need to do this every time?
Not sure if this would actually help....
I see you are using Random.value a lot... which is potentially expensive (I've not tested it). You could create an array of random numbers during load-up and store it, then use that list of numbers in your enumerator (memory lookup is fast) rather than calling a pseudorandom number generator.
No definition given for: calculateNoise
Oh yeah:
mountainValue = Mathf.Pow(mountainValue, 0.5f*biome.mountainPower);
should be equivalent to (as long as mountainValue is not negative), and faster than:
mountainValue = Mathf.Pow(mountainValue, biome.mountainPower);
mountainValue = Mathf.Sqrt(mountainValue);
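To make the claim above easy to check, here is a minimal sketch in plain C#. It uses System.Math rather than Mathf so it runs outside Unity, and the class and method names are made up for illustration. The two forms agree for non-negative inputs only; with a negative base and an even power they differ.

```csharp
using System;

public static class PowMergeSketch
{
    // Original form: raise to the power, then take the square root.
    public static float TwoStep(float value, float power)
    {
        return (float)Math.Sqrt(Math.Pow(value, power));
    }

    // Merged form: sqrt(x^p) == x^(p/2) when x >= 0, so a single Pow call is enough.
    public static float Merged(float value, float power)
    {
        return (float)Math.Pow(value, 0.5 * power);
    }
}
```

Since the noise function can return negative values, it may be worth confirming that mountainValue plus mountainPowerBonus never goes below zero before swapping one form for the other.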
Where is your performance issue most profound? Could you post a profiler log?
First things first, the Mathf calls are not as quick as you might think when doing these kinds of repetitive tasks. Your worst case scenario is to have a series of 255x255x255 (16,581,375) innermost calls, each of which does:
- 3x calculateNoise
- 5x Mathf calls
- a profiler sample begin / end (that isn't always guaranteed to end by your setup, see "return block")
Moreover, this coroutine begins one just like it until it presumably runs out of blocks to calculate. Coroutines don't typically have a ton of overhead, but I don't think you need to do this little handoff at all.
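One possible reading of the "handoff" remark, sketched in plain C#: instead of each chunk's coroutine starting the next one, a single driver drains the waiting queue. Everything here (ChunkQueueSketch, Generate, ProcessAll, the int chunk ids) is a hypothetical stand-in, not the original code. In Unity the driver would be started once with StartCoroutine.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class ChunkQueueSketch
{
    // Hypothetical stand-in for one chunk's generation routine.
    public static IEnumerator Generate(int chunkId, List<int> finished)
    {
        yield return null;       // pretend the work is spread over a frame
        finished.Add(chunkId);   // mark the chunk as done
    }

    // One driver walks the whole queue, so no chunk has to start its successor.
    public static IEnumerator ProcessAll(Queue<int> waiting, List<int> finished)
    {
        while (waiting.Count > 0)
        {
            IEnumerator work = Generate(waiting.Dequeue(), finished);
            while (work.MoveNext())
                yield return work.Current; // pass each yield through to the caller
        }
    }
}
```

This keeps the ordering logic in one place, which may make it easier to pause generation or reprioritise chunks later.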
Thanks for the help, Glurth and TreyH. The calculateNoise function is optimized as far as it can go, as far as I know. The bf = biomeFloat.ToString(); doesn't need to be called every time, so thanks for the spot; however, f = bf should be, since it means I can clearly see which biome each chunk is. The area with the most lag is calculating each block. TreyH, what do you mean about the handoff? Also, what would you suggest instead of Mathf?
If we see the calculateNoise function, that might help. Also, the concept of "chunking" your map might help, so having certain smaller segments that are loaded at runtime in a radius around your character. This might also help optimisation with fewer draw calls, vertices etc. This guy does an interesting voxel-based world tutorial I played around with about a year ago - http://studentgamedev.blogspot.co.uk/2013/11/unity-voxel-tutorial-part-7-loading.html Maybe you could adapt some of his chunk loading to your case?
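As a rough illustration of loading chunks in a radius around the character, here is a small sketch in plain C#. The names (ChunkRadiusSketch, WorldToChunk, ChunksAround) and the choice of a circular radius measured in whole chunks are assumptions for the example, not taken from the linked tutorial.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkRadiusSketch
{
    // Converts a world coordinate to a chunk index; Floor keeps negative coordinates correct.
    public static int WorldToChunk(float worldCoord, int chunkWidth)
    {
        return (int)Math.Floor(worldCoord / chunkWidth);
    }

    // Lists the (x, z) chunk coordinates within `radius` chunks of the player's chunk.
    public static List<int[]> ChunksAround(int playerChunkX, int playerChunkZ, int radius)
    {
        var result = new List<int[]>();
        for (int dx = -radius; dx <= radius; dx++)
        {
            for (int dz = -radius; dz <= radius; dz++)
            {
                // Keep only offsets inside the circle, not the whole square.
                if (dx * dx + dz * dz <= radius * radius)
                    result.Add(new[] { playerChunkX + dx, playerChunkZ + dz });
            }
        }
        return result;
    }
}
```

Chunks in this list that are not yet generated would be queued, and chunks outside it could be unloaded.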
Also, maybe using multithreading (System.Threading)
As others have said, run it through the profiler, making sure to enable Deep Profile. Sort by time descending. What is taking the most time? Noise functions are often expensive, especially if you're adding noise octaves together. We don't know what your noise function looks like though.
But yeah, start with profiling and go from there.
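One thing that may be worth fixing before trusting the profiler numbers: in the posted calculateByte, the early "return block" skips Profiler.EndSample, so samples can be left open. A try/finally is one way to close the sample on every path. The sketch below is plain C# with a counter standing in for the profiler (in Unity the two helpers would be Profiler.BeginSample and Profiler.EndSample); the simplified CalculateByte signature is hypothetical.

```csharp
public static class SampleScopeSketch
{
    // Stand-in for the profiler's stack of open samples.
    public static int OpenSamples;

    private static void BeginSample() { OpenSamples++; }
    private static void EndSample() { OpenSamples--; }

    // Stripped-down version of the block decision: both return paths close the sample.
    public static byte CalculateByte(float surfaceHeight, float y, byte block)
    {
        BeginSample();
        try
        {
            if (surfaceHeight >= y)
                return block;
            return 0;
        }
        finally
        {
            EndSample(); // runs on every exit path, including the early return
        }
    }
}
```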
Answer by EuanHollidge · Nov 02, 2016 at 03:51 PM
I figured it out with the help of the comments. The key was to thread the generation of the world, and add a couple more yield return null statements to the mesh generation. Simple as pie :)
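The actual threaded code was never posted, so the following is only a sketch of the general idea in plain C#: fill the block array on a thread-pool thread and hand it back to the main thread through a locked queue. All names (ThreadedChunkSketch, Request, TryTake, blockAt) are hypothetical. The chunk origin is passed in as plain floats because Unity objects such as Transform should only be touched from the main thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class ThreadedChunkSketch
{
    private readonly Queue<byte[,,]> finished = new Queue<byte[,,]>();
    private readonly object gate = new object();

    // Call from the main thread; the heavy loop runs on a pool thread.
    public void Request(int size, float originX, float originY, float originZ,
                        Func<float, float, float, byte> blockAt)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            var map = new byte[size, size, size];
            for (int x = 0; x < size; x++)
                for (int y = 0; y < size; y++)
                    for (int z = 0; z < size; z++)
                        map[x, y, z] = blockAt(originX + x, originY + y, originZ + z);

            lock (gate) { finished.Enqueue(map); } // publish the result safely
        });
    }

    // Poll from the main thread (for example once per frame) and build the mesh there.
    public bool TryTake(out byte[,,] map)
    {
        lock (gate)
        {
            if (finished.Count > 0)
            {
                map = finished.Dequeue();
                return true;
            }
        }
        map = null;
        return false;
    }
}
```

For this to be safe, blockAt must not write shared state. The posted calculateByte sets static fields such as bf and grassEnabled, which would probably need to become per-chunk values first.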
Would you like me to convert any of those comments into an answer you can select? I believe Stormy102 suggested multithreading...
@Glurth I will paste my code; I'm just away from my PC at the moment, but I should be back in an hour or two :) Thanks for your help btw, it got me into the correct shape of mind to tackle the issue :)