Question by firepro20 · Oct 30, 2019 at 06:19 AM · collision detection experience collision.contacts

Best Practice for Maze Runner Collectible Scenario - mlagents

I would like to understand the best approach to rewarding my agent when it collects a health item in the maze, and to giving it the maximum reward once it has collected all of them (none left).

At the moment I have the following code. The maze contains around 40 dangerous obstacles (hazards), 6 walls, and 20 collectibles. Walls give a negative reward so that the agent does not stick to them.

    private void OnCollisionEnter(Collision collision)
    {
        ContactPoint[] contactPoints = collision.contacts;
        for (int contactIndex = 0; contactIndex < contactPoints.Length; contactIndex++)
        {
            ContactPoint cp = contactPoints[contactIndex];
            Collider otherObjectCollider = cp.otherCollider;
            //Debug.Log(collision.gameObject.name);
            if (otherObjectCollider.gameObject.CompareTag("Hazard"))
            {
                AddReward(-0.2f);
                //AddReward(-0.025f); // divide 1/noOfSpikes
                Done();
                GameController.Instance.ReceiveDamage(gameObject, healthOnHazard);
                return; // stop handling collisions
            }

            if (otherObjectCollider.gameObject.CompareTag("Collectible"))
            {
                Debug.LogError("Consumed Health!");
                GameController.Instance.ReceiveHealth(gameObject, healthOnPickup);
                //areaOfDetection.radius -= healthOnPickup; disabled for training purposes

                if (collectibleList.Contains(otherObjectCollider.gameObject))
                {
                    //Debug.Log("I have eaten " + collision.gameObject.name);
                    collectibleList.Remove(otherObjectCollider.gameObject);
                    //Debug.Log("And there are now " + collectibleList.Count + " in list.");
                    otherObjectCollider.gameObject.GetComponent<BoxCollider>().enabled = false;
                    otherObjectCollider.gameObject.SetActive(false);
                    //Destroy(collision.gameObject);
                }
                // remove from list then destroy, otherwise null pointer exception
                //SetReward(1f);
                //AddReward(0.045f);
                AddReward(0.6f);
                Done();
                return;
            }

            if (otherObjectCollider.gameObject.CompareTag("Wall"))
            {
                AddReward(-0.4f);
                // Nothing was here
                Done();
                return;
            }
        }
    }

I have also noticed that giving the maximum reward in AgentAction made a huge impact. Should all rewards be given in this overridden method?

    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // Actions, size = 2
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        rBody.AddForce(controlSignal * speed);

        if (!(GameObject.FindGameObjectWithTag("Collectible")))
        {
            AddReward(1f); // was set
            Done();
        }
    }
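The completion check above searches the scene by tag on every action step. One possible alternative is to track the remaining collectibles in a set and test whether it is empty. The following is only a minimal sketch in plain C#: `CollectibleTracker`, `TryCollect`, and `AllCollected` are hypothetical names for illustration, not part of Unity or ML-Agents, and collectibles are identified by string names here purely as an assumption.

```csharp
using System.Collections.Generic;

// Hypothetical helper (not a Unity or ML-Agents type): remembers which
// collectibles are still in the maze so no scene search is needed.
public class CollectibleTracker
{
    private readonly HashSet<string> remaining;

    public CollectibleTracker(IEnumerable<string> collectibleNames)
    {
        remaining = new HashSet<string>(collectibleNames);
    }

    // Returns true only the first time a given collectible is picked up,
    // so a repeated contact with the same item is not rewarded twice.
    public bool TryCollect(string name)
    {
        return remaining.Remove(name);
    }

    public int RemainingCount => remaining.Count;

    // True once every collectible has been picked up.
    public bool AllCollected => remaining.Count == 0;
}
```

In the agent, the collision handler would call `TryCollect` and the final reward would be given when `AllCollected` becomes true, in place of the `FindGameObjectWithTag` check.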

Currently the agent does improve, but since its movement around the maze is erratic, it receives more negative than positive rewards.

Should I divide the rewards given on collision by the number of objectives, and call Done() in AgentAction only once there are no collectible GameObjects left?
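To make the proposed scheme concrete: with 20 collectibles each pickup would be worth 1/20 = 0.05, so the rewards sum to 1 when none are left, and the episode would end only at that point. The following is a minimal sketch in plain C# under those assumptions. `EpisodeRewards`, `PerEvent`, and `OnPickup` are hypothetical names, not ML-Agents API; in the agent, `Total` would correspond to the sum of the `AddReward` calls and `Done` to the moment `Done()` is called.

```csharp
using System;

// Hypothetical sketch of the "divide by number of objectives" idea.
public class EpisodeRewards
{
    private readonly float pickupReward;
    private int remaining;

    // Sum of all pickup rewards given so far in this episode.
    public float Total { get; private set; }

    // True only when every collectible has been picked up.
    public bool Done => remaining == 0;

    public EpisodeRewards(int collectibleCount)
    {
        remaining = collectibleCount;
        pickupReward = PerEvent(1f, collectibleCount);
    }

    // Reward for one event when `total` such events should sum to `maxMagnitude`.
    public static float PerEvent(float maxMagnitude, int total)
    {
        if (total <= 0) throw new ArgumentOutOfRangeException(nameof(total));
        return maxMagnitude / total;
    }

    // Call once per collected item; ignored after the episode is complete.
    public void OnPickup()
    {
        if (remaining == 0) return;
        remaining--;
        Total += pickupReward;
    }
}
```

The same division gives the value in the commented-out hazard line: `EpisodeRewards.PerEvent(1f, 40)` is 0.025 for 40 hazards.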
