- Home /
Much slower while using DOD for A* algorithm than OOP
Hi Guys,
I am trying to write the A* code using DOD, follwoing this tutorial https://www.youtube.com/watch?v=1bO1FdEThnU&ab_channel=CodeMonkey
But i find out that this is about 7 times slower than OOP when the input data is big , the speed depending on how big is the data, it is faster than if data is small , no like the youtube video.
This is my first time trying to write code using DOD, can anyone help me take a look at my script and tell me what did i do wrong, or did i misunderstand something ?
Please Help!
The attachment is my scripts, which contain DOD method (GetDODPath) and OOP method (GetPathlink text).
Answer by andrew-lukasik · Jan 03 at 11:49 AM
This is opposite of how these types are meant to be used. NativeArray
and NativeList
are fast in a Burst-compiled jobs only and slow everywhere else.
They are accessible from everywhere only to:
Allow allocation/deallocation calls
Fill them with relevant data
Schedule computations on them with job structs
and nothing else.
If one is using them as a replacement to List
/Array
and expect great performance - those hopes are misplaced but also correctable. Solution for you here: refactor your GetDODPath
into a GetDODPathJob
struct and call Schedule().Complete()
on that.
[Unity.Burst.BurstCompile]
public struct GetDODPathJob : IJob
{
public NativeArray<DODPathNode> AllDODNode;
public float2 BeginPos, EndPos, BottomLeftPos;
public float MapMaxNodeLength;
public NativeList<float2> Result;// allocated before job is scheduled
void IJob.Execute ()
{
int startIndex = GetDODIndex( BeginPos.x, BeginPos.y );
DODPathNode beginNode = AllDODNode[startIndex];
beginNode.gCost = 0;
beginNode.hCost = CalculateDODDistanceCost(BeginPos, EndPos);
beginNode.CalculateFCost();
AllDODNode[startIndex] = beginNode;
// ... //
if( endNode.previousIndex>=0 )
{
Result.Add( endNode.position );
DODPathNode currentNode = endNode;
while (currentNode.previousIndex >= 0)
{
currentNode = allDODNode[currentNode.previousIndex];
Result.Add(currentNode.position);
}
}
else
{
Result.Clear();
}
}
int GetDODIndex ( float x , float y ) => (int)((x - BottomLeftPos.x) * MapMaxNodeLength + (y - BottomLeftPos.y));
float CalculateDODDistanceCost(float2 beginPos, float2 endPos)
{
float xDistance = math.abs(beginPos.x - endPos.x);
float yDistance = math.abs(beginPos.y - endPos.y);
float remaining = math.abs(xDistance - yDistance);
return DIAGONAL_COST * math.min(xDistance, yDistance) + remaining * STRAIGHT_COST;
}
}
EDIT: How to write a decent pathfinding code for Burst & Unity.Collections
Diagonal path for grid 128/128 solved in about 2.5 ms ( i3-4170
cpu) when HMultiplier
is 1.0
Diagonal path for grid 128/128 solved in about 0.5 ms (
i3-4170
cpu) when HMultiplier
is 1.1
As you can see there from the profiler stats, this
HMultiplier
is a useful trick/heuristic to reduce number of search steps at zero additional cpu cost.
LetsTestIt.cs
using UnityEngine;
using Unity.Jobs;
using Unity.Collections;
using Unity.Mathematics;
using Unity.Profiling;
using Random = Unity.Mathematics.Random;
using BurstCompile = Unity.Burst.BurstCompileAttribute;
using Stopwatch = System.Diagnostics.Stopwatch;
namespace Area51
{
public class LetsTestIt : MonoBehaviour
{
[SerializeField][Range(3,128)] int _numColumns = 10;
[SerializeField][Range(3,128)] int _numRows = 10;
[SerializeField] float2 _gridWorldCellSize = new float2{ x=3 , y=3 };
[SerializeField] Cell[] _canWalk;
[SerializeField][Range(1,2)] float _HMultiplier = 1.1f;
[SerializeField][Min(0)] int _searchStepBudget = 0;
[SerializeField][Min(1)] uint _seed = 1;
Random _random = new Random{ state=uint.MaxValue };
public int2 _searchStart => int2.zero;
public int2 _searchDestination => new int2{ x=_numColumns-1 , y=_numRows-1 };
void OnDrawGizmos ()
{
// run pathfinding job:
int numIndices = _numColumns * _numRows;
if( _canWalk==null || _canWalk.Length!=numIndices || _random.state!=_seed ) GenerateRandomGridFromSeed();
var watch = Stopwatch.StartNew();
var job = new PathfindingJob{
Cells = new NativeArray<Cell>( _canWalk , Allocator.TempJob ) ,
StartIndex2D = _searchStart ,
EndIndex2D = _searchDestination ,
Path = new NativeList<int>( Allocator.TempJob ) ,
GridArrayNumColumns = _numColumns ,
GridArrayNumRows = _numRows ,
GridWorldCellSize = _gridWorldCellSize ,
GridWorldOrigin = new float2{ x=transform.position.x , y=transform.position.y } ,
SearchStepBudget = _searchStepBudget ,
HMultiplier = _HMultiplier ,
FG = new NativeArray<Fg>( numIndices , Allocator.TempJob ) ,
Stats = new NativeArray<int>( 1 , Allocator.TempJob ) ,
Frontier = new NativeList<int>( numIndices , Allocator.TempJob ) ,
Visited = new NativeHashSet<int>( numIndices , Allocator.TempJob ) ,
Profile_Initiliazation = new ProfilerMarker("initialization") ,
Profile_Search = new ProfilerMarker("search") ,
Profiler_Trace = new ProfilerMarker("trace path") ,
Profiler_FindLowestF = new ProfilerMarker("find lowest f") ,
};
job.Run();
Debug.Log($"job completed in {((double)watch.ElapsedTicks/(double)Stopwatch.Frequency*1000.0):0.00} [ms], {job.Stats[0]} search steps");
// display job results;
float2 origin = new float2{ x=transform.position.x , y=transform.position.y };
for( int i=0 ; i<numIndices ; i++ )
{
float3 pos = job.ToWorldPosition3D( i:i , numColumns:_numColumns , origin:origin , cellSize:_gridWorldCellSize );
Vector3 size = new Vector3{ x=_gridWorldCellSize.x , y=_gridWorldCellSize.y };
if( _canWalk[i]==Cell.Walkable ) Gizmos.color = job.Visited.Contains(i) ? Color.cyan : Color.white;
else Gizmos.color = Color.black;
Gizmos.DrawCube( pos , size );
}
for( int i=0 ; i<job.Path.Length-1 ; i++ )
{
int i0 = job.Path[i];
int i1 = job.Path[i+1];
Gizmos.color = Color.Lerp( Color.red , Color.blue , (float)i/(float)job.Path.Length );
float3 p0 = job.ToWorldPosition3D( i:i0 , numColumns:_numColumns , origin:origin , cellSize:_gridWorldCellSize );
float3 p1 = job.ToWorldPosition3D( i:i1 , numColumns:_numColumns , origin:origin , cellSize:_gridWorldCellSize );
Gizmos.DrawLine( p0 , p1 );
}
Gizmos.color = Color.blue;
Gizmos.DrawSphere( job.ToWorldPosition3D( i2:_searchStart , origin:origin , cellSize:_gridWorldCellSize ) , math.min(_gridWorldCellSize.x,_gridWorldCellSize.y)*0.5f );
Gizmos.color = Color.red;
Gizmos.DrawSphere( job.ToWorldPosition3D( i2:_searchDestination , origin:origin , cellSize:_gridWorldCellSize ) , math.min(_gridWorldCellSize.x,_gridWorldCellSize.y)*0.5f );
// clean up:
job.Cells.Dispose();
job.Path.Dispose();
job.Dispose();
}
[ContextMenu("Generate random grid from seed")]
void GenerateRandomGridFromSeed ()
{
int numIndices = _numColumns * _numRows;
_canWalk = new Cell[ numIndices ];
if( _random.state!=_seed ) _random = new Random{ state=_seed };
for( int i=0 ; i<numIndices ; i++ )
_canWalk[i] = _random.NextFloat()>0.7f ? Cell.Obstacle : Cell.Walkable;
_canWalk[0] = Cell.Walkable;
_canWalk[numIndices-1] = Cell.Walkable;
}
[BurstCompile]
public struct PathfindingJob : IJob, System.IDisposable
{
const float k_cost_straight = 10;
const float k_cost_diagonal = 14;// k_cost_straight * math.SQRT2;
[ReadOnly] public NativeArray<Cell> Cells;
public int2 StartIndex2D, EndIndex2D;
[WriteOnly] public NativeList<int> Path;
public int GridArrayNumColumns, GridArrayNumRows, SearchStepBudget;
public float2 GridWorldCellSize;
public float2 GridWorldOrigin;
public float HMultiplier;
public NativeArray<int> Stats;
public NativeArray<Fg> FG;
public NativeList<int> Frontier;
public NativeHashSet<int> Visited;
public ProfilerMarker Profile_Initiliazation, Profile_Search, Profiler_Trace, Profiler_FindLowestF;
void IJob.Execute ()
{
// Debug.Log($"{nameof(PathfindingJob)} job execution started");
Profile_Initiliazation.Begin();
int StartIndex = ToIndex1D( StartIndex2D );
int EndIndex = ToIndex1D( EndIndex2D );
if( SearchStepBudget==0 ) SearchStepBudget = int.MaxValue;
Path.Clear();// allocation can be reused, make sure it's cleared
int numIndices = this.Cells.Length;
for( int i=0 ; i<numIndices ; i++ ) FG[i] = new Fg{ g=half.MaxValueAsHalf };
var previousIndex = new NativeArray<int>( numIndices , Allocator.Temp , NativeArrayOptions.UninitializedMemory );
for( int i=0 ; i<numIndices ; i++ ) previousIndex[i] = -1;
var nightborOffsetList = new NativeArray<int2>( 8 , Allocator.Temp );
{
nightborOffsetList[0] = new int2(-1,-1);
nightborOffsetList[1] = new int2(0,-1);
nightborOffsetList[2] = new int2(1,-1);
nightborOffsetList[3] = new int2(-1,0);
nightborOffsetList[4] = new int2(1,0);
nightborOffsetList[5] = new int2(-1,1);
nightborOffsetList[6] = new int2(0,1);
nightborOffsetList[7] = new int2(1,1);
}
FG[StartIndex] = new Fg{ f=(half)H(StartIndex2D,EndIndex2D) , g=(half)0 };
Frontier.Add( StartIndex );
Profile_Initiliazation.End();
Profile_Search.Begin();
int numSearchSteps = 0;
while(
!Frontier.IsEmpty
&& PopLowestFCostIndex(Frontier,FG) is int currentIndex && currentIndex!=EndIndex
&& numSearchSteps<SearchStepBudget
)
{
numSearchSteps++;
Visited.Add( currentIndex );
half currentG = FG[currentIndex].g;
int2 currentIndex2D = ToIndex2D(currentIndex);
// Debug.Log($"\t<b>currentIndex: {currentIndex}</b>,\t F: {(float)FG[currentIndex].f},\t G: {(float)FG[currentIndex].g},\t H: {H(currentIndex2D,EndIndex2D)}\t\t frontier.Length:{Frontier.Length} {FrontierToString()}");
if( numSearchSteps>numIndices )
{
Debug.LogError($"EMERGENCY EXIT #1 ( numSearchSteps:{numSearchSteps}, frontier.Length:{Frontier.Length}), likely infinite loop prevented. This means a bug in pathfinding code, heuristic calc method or invalid input data.");
break;
}
foreach( int2 offset in nightborOffsetList )
{
int2 neighborIndex2D = ToIndex2D(currentIndex) + offset;
int neighborIndex = ToIndex1D(neighborIndex2D);
// Debug.Log($"\t\toffset:({offset.x},{offset.y}), neighborIndex:{neighborIndex}, neighborIndex2D:({neighborIndex2D.x},{neighborIndex2D.y})");
if( neighborIndex2D.x<0 || neighborIndex2D.y<0 || neighborIndex2D.x>(GridArrayNumColumns-1) || neighborIndex2D.y>(GridArrayNumRows-1) )
{
// Debug.Log($"\t\t\tskipping (index [{neighborIndex2D.x},{neighborIndex2D.y}] is out of bounds)");
continue;
}
if( Visited.Contains(neighborIndex) )
{
// Debug.Log($"\t\t\tskipping (visited already)");
continue;
}
if( Cells[neighborIndex]==Cell.Obstacle )
{
// Debug.Log($"\t\t\tskipping (obstacle)");
continue;
}
half g = (half)( currentG + H(currentIndex2D,neighborIndex2D) );
if( g<FG[neighborIndex].g )
{
float h = H(neighborIndex2D,EndIndex2D) * HMultiplier;
FG[neighborIndex] = new Fg{
f = (half)( g + h ) ,
g = g
};
previousIndex[neighborIndex] = currentIndex;
// Debug.Log($"\t\t\ti:{neighborIndex},\t new F: {(float)FG[neighborIndex].f},\t new G: {(float)FG[neighborIndex].g},\t H: {h}\t\t Frontier index: {Frontier.Length}");
}
if( !Visited.Contains(neighborIndex) && !Frontier.Contains(neighborIndex) )
{
Frontier.Add(neighborIndex);
}
}
// Debug.Log($"\t\t frontier on step end: {FrontierToString()}");
}
Stats[0] = numSearchSteps;
Profile_Search.End();
Profiler_Trace.Begin();
if( previousIndex[EndIndex]>=0 )
{
Path.Add( EndIndex );
int currentIndex = EndIndex;
int numPathTraceSteps = 0;
while( currentIndex!=StartIndex && currentIndex>=0 )
{
currentIndex = previousIndex[currentIndex];
Path.Add( currentIndex );
if( numPathTraceSteps++>numIndices )
{
Debug.LogError($"EMERGENCY EXIT #2 (numPathTraceSteps:{numPathTraceSteps}), likely infinite loop prevented (this means a bug in pathfinding code or invalid input data)");
Path.Clear();
return;
}
}
}
else Debug.Log("no solution found");
Profiler_Trace.End();
}
public void Dispose ()
{
if( FG.IsCreated ) FG.Dispose();
if( Stats.IsCreated ) Stats.Dispose();
if( Frontier.IsCreated ) Frontier.Dispose();
if( Visited.IsCreated ) Visited.Dispose();
}
public float2 ToWorldPosition2D ( int i ) => ToWorldPosition2D( i2:ToIndex2D(i) );
public float2 ToWorldPosition2D ( int i , int numColumns , float2 origin , float2 cellSize ) => ToWorldPosition2D( i2:ToIndex2D(i:i,numColumns:numColumns) , origin:origin , cellSize:cellSize );
public float2 ToWorldPosition2D ( int2 i2 ) => ToWorldPosition2D( i2:i2 , origin:GridWorldOrigin , cellSize:GridWorldCellSize );
public float2 ToWorldPosition2D ( int2 i2 , float2 origin , float2 cellSize ) => origin + i2*cellSize + cellSize*0.5f;
public float3 ToWorldPosition3D ( int i , int numColumns , float2 origin , float2 cellSize )
{
float2 pos2d = ToWorldPosition2D( i2:ToIndex2D(i:i,numColumns:numColumns) , origin:origin , cellSize:cellSize );
return new float3{ x=pos2d.x , y=pos2d.y , z=0 };
}
public float3 ToWorldPosition3D ( int2 i2 , float2 origin , float2 cellSize )
{
float2 pos2d = ToWorldPosition2D( i2:i2 , origin:origin , cellSize:cellSize );
return new float3{ x=pos2d.x , y=pos2d.y , z=0 };
}
public int2 ToIndex2D ( int i ) => ToIndex2D( i:i , numColumns:GridArrayNumColumns );
public int2 ToIndex2D ( int i , int numColumns )
{
int col = i % numColumns;
int row = i / numColumns;
return new int2{ x=col , y=row };
}
public int ToIndex1D ( int2 i2 ) => ToIndex1D( i2:i2 , numColumns:GridArrayNumColumns );
public int ToIndex1D ( int2 i2 , int numColumns ) => (int)( i2.y*numColumns + i2.x );
public float H ( int2 a , int2 b )
{
float distX = math.abs( a.x - b.x );
float distY = math.abs( a.y - b.y );
float remaining = math.abs( distX - distY );
return k_cost_diagonal * math.min(distX,distY) + remaining * k_cost_straight;
}
// TODO: replace with Min Heap
int PopLowestFCostIndex ( NativeList<int> indices , in NativeArray<Fg> fg )
{
Profiler_FindLowestF.Begin();
half lowestF = half.MaxValueAsHalf;
int lowestFIndex = -1;
int lowestFIndexLocation = -1;
for( int i=0 ; i<indices.Length ; i++ )
{
int nextFIndex = indices[i];
half nextF = fg[nextFIndex].f;
if( nextF<lowestF )
{
lowestF = nextF;
lowestFIndex = nextFIndex;
lowestFIndexLocation = i;
}
}
// Debug.Log($"\tlowestF: {(float)lowestF},\t lowestFIndex: {lowestFIndex},\t lowestFIndexLocation: {lowestFIndexLocation}");
indices.RemoveAtSwapBack( lowestFIndexLocation );
Profiler_FindLowestF.End();
return lowestFIndex;
}
// string FrontierToString ()
// {
// var sb = new System.Text.StringBuilder("{ ");
// if( Frontier.Length!=0 )
// {
// sb.Append(Frontier[0]);
// for( int i=1 ; i<Frontier.Length ; i++ ) { sb.Append(" , "); sb.Append(Frontier[i]); }
// sb.Append(" }");
// }
// else sb.Append(" }");
// return sb.ToString();
// }
}
public enum Cell : byte
{
Walkable = 0 ,
Obstacle = 1 ,
}
public struct Fg { public half f, g; }
}
}
Hi, Mr. Andrew-Lukasik, thank you for explanation,
I will refactor the code using job.
But i have another question, why it is working inside the youtube video ? the speed did become faster after "Code Monkey" finish the code without using the job and burst system. (at time 19:48) , or is it because the input data is extremely small inside the video ?
Again , thank you for the answer and patience, I will refactor it as soon as possible.
By Howard
At 19:35 he says that OOP code solves 20/20 grid in about 700ms, which is a catastrophically long for sucha a small grid. At around 20:00 he claims victory by enabling "DOD" and cutting that time to 1-2ms.
In my guess-stimation his "OOP" method probably contains a bug that makes it very slow and his "DOD" refactor is free of that bug - thus producing such a sharp difference in execution time (even with those higher mem. access times).
So, as far as I can tell: It's a Code Monkey's mistake, preserved in a video format that will continue to confuse newcomers for years to come.
Hi, thank you for the explanation.
After refactor the code and using burst, the speed did become faster.
It will become so much faster than OOP if the path is short , and about 34% faster in my use case ( about 176 x 176 ) , it make me think another problem, will the DOD eventually slower than OOP if input data is extremely big ?
Hi, long time long see, Mr. Andrew-Lukasik.
First of all, thank you for the example code, I learned lots of thing inside the code.
I was working on other issue and now come back to work on path finding.
The example code is about 10 times faster than $$anonymous$$e.
But I have some questions about the code. And want to ask you again.
So here are my questions, sorry if the questions are literally basic .
(1) What is the use case of the "NativeArrayOptions.UninitializedMemory" in line 125, is it used ever since the empty array is create and we want to fill up the element to array immediately, but why not used in line 127? (like Unity Manual says) . If so , should I use it ever since I new an empty array or empty list
which will be filled after the new ?
(2) Why add an "in" inside the function "PopLowestFCostIndex" 's second parameter, why not both ? Is it because NativeArray is value type and it will copy all the array and NativeList is actually reference type?
(3) "HMultiplier" can save lots of performance, but i can not understand the reason why it works, Will it cost wrong result sometimes ? Or where can i understand it ?
Last, thank you again ,I appreciate it so much.
When memory is allocated from the system memory pool, it may contain random data. In normal C# the C# type system ensures that the memory of an array is filled with "0". So any references or values would get their default values and no bad things can happen. An uninitialized native memory block is just the allocated memory which is not "cleaned" before its use. When you're going to fill the whole array anyways, there's no reason to first fill it with 0. So using an uninitialized array is faster, but has the potential for catastrophic errors if not handled with care. If you leave certain parts of the array uninitialized and read / use those values you get random data which could lead to unpredictable results. So when you can guarantee that you're filling the whole array, using an uninitialized array is faster.
The in keyword is essentially a contraint and hint for the compiler and an expression of intent so the compiler can create optimised code. "in" parameters are only "passed in" as reference. They can not be modifed inside this method, only read. Unlike the "indices" array which is written two from inside the method. See the in modifier. Knowing that an array can not be modified by a method makes it easier to reason about potential concurrent access issued. Multiple threads can safely read the same memory as long as they don't modify it.
The heuristics multiplier is just a way to artificially overestimate the heuristics (guess). As you probably know (since you implement A*, so you should know that the heuristics function should never overestimate the real cost) the heuristic should always either match the actual code or can be lower. If the heuristic is too high the resulting path may not be the shortest path. Since the heuristic is just an approximation, it generally depends on the underlying graph what heuristic function makes sense. Most heuristics on a grid in fact do underestimate the real cost. In this case the used H function splits the distance to the target into the lateral and diagonal components. Depending on the allowed movements on the grid this would usually be the best case scenario (no obsticals). When you artificially increase the heuristic, it may expand a different route first. This could help to get around obsticals quicker, though as I mentioned, the result may not be the optimal path. See the gif on the right in the example on wikipedia and here's how it would look with an overestimated H.
I have another question, what is the "Stats" array used for ?
Well for statistics purposes. Just search for "Stats" in the code. You will see that the native array only has a single element which is used to pass "out" the number if search steps:
Stats[0] = numSearchSteps;
in order to display them in the debug log statement:
Debug.Log($"job completed in {((double)watch.ElapsedTicks/(double)Stopwatch.Frequency*1000.0):0.00} [ms], {job.Stats[0]} search steps");
There's no other purpose.
Answer by Han-Wen · Feb 08 at 09:19 AM
Hi, thank you for teaching.
After I refactor my code to the code similar to yours , the performance become better about more than 10 times, which make me feel happy but also confused, can you tell me the reason why the performance boost so much, or where can I learn more detail about the DOD.
Other , although the performance boost, it is still need about 8 ms with H multiplier = 1 and 2.5 ms with H multiplier = 1.1 for 128/128 in my use case. Can you help me take a look my code and tell me where did i do wrong ?
The first rule of engineering here is to measure. If you don't measure - you cannot see what is wrong. You can see these ProfilerMarker
s carefully placed in my code - I strongly suggest doing the same, because this will allow you to rely on yourself in identifying the choke points (code segments that take the most time % overall).
(valid) Measurement always trump intuitions.
Your answer
![](https://koobas.hobune.stream/wayback/20220613051223im_/https://answers.unity.com/themes/thub/images/avi.jpg)
Follow this Question
Related Questions
How do i get the IsPathPossible() function to ignore some nodes using Astar Pathfinding Project 0 Answers
Aron Ganberg Astar - add nodes at runtime with pointgraph 0 Answers
Pathfinding unity 3 Answers
How to add multiple values to single variable(or something else)? 0 Answers
How to rotate object to face toward the path using Astar pathfinding? 0 Answers