Best Practice for Maze Runner Collectible Scenario - mlagents
I would like to understand the best way to reward my agent when it collects a health item in the maze, and to give it the maximum reward once it has collected all of them (none left).
At the moment I have the following code. The maze contains around 40 hazards (dangerous obstacles), 6 walls, and 20 collectibles. Walls give a negative reward to discourage the agent from sticking to them.
    private void OnCollisionEnter(Collision collision)
    {
        ContactPoint[] contactPoints = collision.contacts;
        for (int contactIndex = 0; contactIndex < contactPoints.Length; contactIndex++)
        {
            ContactPoint cp = contactPoints[contactIndex];
            Collider otherObjectCollider = cp.otherCollider;
            //Debug.Log(collision.gameObject.name);
            if (otherObjectCollider.gameObject.CompareTag("Hazard"))
            {
                AddReward(-0.2f);
                //AddReward(-0.025f); // divide 1/noOfSpikes
                Done();
                GameController.Instance.ReceiveDamage(gameObject, healthOnHazard);
                return; // stop handling collisions
            }
            if (otherObjectCollider.gameObject.CompareTag("Collectible"))
            {
                Debug.LogError("Consumed Health!");
                GameController.Instance.ReceiveHealth(gameObject, healthOnPickup);
                //areaOfDetection.radius -= healthOnPickup; disabled for training purposes
                if (collectibleList.Contains(otherObjectCollider.gameObject))
                {
                    //Debug.Log("I have eaten " + collision.gameObject.name);
                    collectibleList.Remove(otherObjectCollider.gameObject);
                    //Debug.Log("And there are now " + collectibleList.Count + " in list.");
                    otherObjectCollider.gameObject.GetComponent<BoxCollider>().enabled = false;
                    otherObjectCollider.gameObject.SetActive(false);
                    //Destroy(collision.gameObject);
                }
                // remove from list then destroy, otherwise null pointer exception
                //SetReward(1f);
                //AddReward(0.045f);
                AddReward(0.6f);
                Done();
                return;
            }
            if (otherObjectCollider.gameObject.CompareTag("Wall"))
            {
                AddReward(-0.4f);
                // Nothing was here
                Done();
                return;
            }
        }
    }
I have also noticed that adding the maximum reward in AgentAction had a huge impact. Should all rewards be given in this overridden method?
    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Actions, size = 2
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        rBody.AddForce(controlSignal * speed);
        if (!(GameObject.FindGameObjectWithTag("Collectible")))
        {
            AddReward(1f); // was set
            Done();
        }
    }
Currently the agent does improve, but because its movement is erratic it collects more negative than positive rewards.
Should I divide the collision rewards by the number of objectives, and call Done() from AgentAction only once there are no game objects of that type left?
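To make that last question concrete, here is a minimal sketch of the bookkeeping I have in mind. It is untested in training. CollectibleRewardTracker and its members are hypothetical names of my own (plain C#, not part of ML-Agents), and the total reward budget of 1 is an assumption.

```csharp
using System;

// Hypothetical helper, not part of ML-Agents: splits a total reward budget
// evenly across all collectibles and reports when none are left.
public sealed class CollectibleRewardTracker
{
    private readonly int total;         // number of collectibles per episode
    private readonly float totalReward; // assumed budget, e.g. 1f
    private int remaining;

    public CollectibleRewardTracker(int totalCollectibles, float totalReward = 1f)
    {
        if (totalCollectibles <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalCollectibles));
        total = totalCollectibles;
        remaining = totalCollectibles;
        this.totalReward = totalReward;
    }

    public int Remaining => remaining;

    // True once every collectible has been picked up.
    public bool AllCollected => remaining == 0;

    // Registers one pickup and returns its share of the budget
    // (0 if nothing is left to collect).
    public float Collect()
    {
        if (remaining == 0) return 0f;
        remaining--;
        return totalReward / total;
    }

    // Call at the start of each episode.
    public void Reset() => remaining = total;
}
```

With this, the "Collectible" branch of OnCollisionEnter would call AddReward(tracker.Collect()) and call Done() only when tracker.AllCollected is true, instead of ending the episode on every pickup. With 20 collectibles each pickup would be worth 0.05.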