Question by CerushDope · Jan 26, 2018 at 03:49 PM · ai · artificial-intelligence

Unity ML-Agents doesn't learn well on default examples

I am trying to understand the Unity Machine Learning Agents beta.

I'm trying to train agents on all of the provided example scenes. 3DBall worked fine, but the Area and even the Basic examples are not learning properly. The problem might be in the hyperparameters, but I'm not sure. My hyperparameters are:

 ### General parameters
 max_steps = 5e5 # Set maximum number of steps to run environment.
 run_path = "ppo" # The sub-directory name for model and summary statistics
 load_model = False # Whether to load a saved model.
 train_model = True # Whether to train the model.
 summary_freq = 10000 # Frequency at which to save training statistics.
 save_freq = 50000 # Frequency at which to save model.
 env_name = "basic" # Name of the training environment file.
 curriculum_file = None # Path to a curriculum file (None = no curriculum learning).
 
 ### Algorithm-specific parameters for tuning
 gamma = 0.99 # Reward discount rate.
 lambd = 0.95 # Lambda parameter for GAE.
 time_horizon = 2048 # How many steps to collect per agent before adding to buffer.
 beta = 1e-3 # Strength of entropy regularization
 num_epoch = 5 # Number of gradient descent steps per batch of experiences.
 num_layers = 2 # Number of hidden layers between state/observation encoding and value/policy layers.
 epsilon = 0.2 # Acceptable threshold around ratio of old and new policy probabilities.
 buffer_size = 2048 # How large the experience buffer should be before gradient descent.
 learning_rate = 3e-4 # Model learning rate.
 hidden_units = 64 # Number of units in hidden layer.
 batch_size = 64 # How many experiences per gradient descent update step.
 normalize = False # Whether to normalize the state inputs.
 
 ### Logging dictionary for hyperparameters
 hyperparameter_dict = {'max_steps':max_steps, 'run_path':run_path, 'env_name':env_name,
     'curriculum_file':curriculum_file, 'gamma':gamma, 'lambd':lambd, 'time_horizon':time_horizon,
     'beta':beta, 'num_epoch':num_epoch, 'epsilon':epsilon, 'buffer_size':buffer_size,
     'learning_rate':learning_rate, 'hidden_units':hidden_units, 'batch_size':batch_size}

Its mean reward doesn't increase beyond 0.2 (even though the scene can reach 0.9+). If anybody has trained on these examples, can you please tell me what hyperparameters you used?
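
For what it's worth, since the Basic scene has very short episodes, I was wondering whether scaling the horizon and buffer down like this might help. These are untested guesses on my part, not documented defaults:

 ### Untested guesses for the much smaller Basic scene
 time_horizon = 64 # Shorter rollouts; Basic episodes end quickly.
 buffer_size = 512 # Smaller buffer so gradient updates happen more often.
 batch_size = 32 # Smaller batches to match the reduced buffer.
 beta = 1e-2 # Stronger entropy bonus to keep the policy exploring.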

Thanks in advance

1 Reply

Answer by ArgMagnus · Feb 15, 2018 at 11:21 AM

You can try changing the input state type to continuous in the Brain. It is not really a good solution, but it gave me better results for the Basic example. I first tried one-hot encoding and a number of hyperparameter changes, but switching to continuous, without changing anything else in the example code, got it up to ~0.85 mean reward after 8 million steps. It would still be interesting to hear how someone trained this example more efficiently while keeping the discrete state type, though.
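
To clarify the distinction: with a discrete state the Basic agent reports a single cell index, which gets expanded into a one-hot vector, whereas a continuous state passes the position through as a raw float. A rough Python illustration of the two encodings (this is just a sketch of the idea, not the actual ML-Agents code):

 import numpy as np

 def one_hot(state_index, state_size):
     # Discrete state: a single integer index expanded into a one-hot vector.
     vec = np.zeros(state_size, dtype=np.float32)
     vec[state_index] = 1.0
     return vec

 # Discrete encoding: agent at cell 7 of a 20-cell line.
 print(one_hot(7, 20))

 # Continuous alternative: pass the (normalized) position directly as a float.
 print(np.array([7 / 20.0], dtype=np.float32))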
