Best way to store large amounts of data for random generation
I'm creating methods to randomly generate a lot of data in my game. For instance, my "Person" generator takes a large amount of data and randomizes things such as name, address, phone number, social security number, place of work (company name), etc.
My current solution is loading each type of data from a separate .txt file in the resources folder into lists at start, then generating a random index from those lists:
using System.Collections.Generic;
using System.Linq;
using UnityEngine;

public class Person
{
    public string firstName;
}

public class GenericPerson
{
    public List<string> firstNames;
}

// Class instance containing data collections
public GenericPerson genericPerson;

void PopulateFirstNames()
{
    if (!PlayerController.loadedGame)
    {
        // Load first names from text file
        TextAsset names = Resources.Load<TextAsset>("Names/FirstNames");
        // Split text file into rows and populate list
        genericPerson.firstNames = names.text.Split('\n').ToList();
        // Strip the trailing carriage return from each entry
        for (int i = 0; i < genericPerson.firstNames.Count; i++)
        {
            genericPerson.firstNames[i] = genericPerson.firstNames[i].Replace("\r", string.Empty);
        }
    }
}

string GenerateFirstName()
{
    // Random.Range with ints excludes the upper bound, so the index is always valid
    int index = Random.Range(0, genericPerson.firstNames.Count);
    return genericPerson.firstNames[index];
}

public Person GeneratePerson()
{
    Person person = new Person();
    person.firstName = GenerateFirstName();
    return person;
}
My concern is performance - I want to load as much as possible at the start of a new game to be able to simply get a set of values when generating a new person, but might end up storing tens of thousands of strings in 10-15 different lists.
Should I be using list collections for this? I'm not very well acquainted with dictionaries or hashtables, but how much better are they at handling large amounts of data?
My friend has just been making a procedurally generated game in Unity, and he uses a combination of lists of individual values (e.g. ints) and lists of class instances, which obviously hold more data. He can run a 16x16 grid generation in 0.1s with no noticeable performance hit. He has enemies, walls, objects, etc. that all get generated and spawned afterwards.
I would suggest running the generation at the start and seeing how many combinations you can generate before you notice a performance hit. I'd assume you could get quite a lot generated, though, if you're smart with your code :)
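One way to follow the suggestion above is to time the generation directly. The sketch below is a rough, engine-independent measurement using System.Diagnostics.Stopwatch. The class name GenerationBenchmark, the placeholder name data, and the counts are assumptions for illustration and do not come from the original post. Inside Unity you would likely swap System.Random for UnityEngine.Random and print the result with Debug.Log.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class GenerationBenchmark
{
    // Builds a placeholder list standing in for the contents of a names file.
    public static List<string> BuildPlaceholderNames(int count)
    {
        var names = new List<string>(count);
        for (int i = 0; i < count; i++)
        {
            names.Add("Name" + i);
        }
        return names;
    }

    // Picks `picks` random entries by index and returns the elapsed milliseconds.
    public static double TimeRandomPicks(List<string> names, int picks, Random rng)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < picks; i++)
        {
            // Next(max) returns a value in [0, max), so the index is always valid.
            string picked = names[rng.Next(names.Count)];
        }
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

Calling GenerationBenchmark.TimeRandomPicks(GenerationBenchmark.BuildPlaceholderNames(20000), 100000, new Random()) gives a ballpark figure for 100,000 picks from a 20,000-entry list. The absolute number depends on the machine and runtime, so treat it as a comparison tool rather than a benchmark result.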
Answer by Free_Radical · Apr 20, 2016 at 05:10 PM
It depends on what you're doing. If all you're doing is generating a random value and using that to access an element in your list, that's about as efficient as you're going to get outside of a plain array. If, however, you ever need to modify these values or find a specific entry (for example, looking a person up by name), a dictionary would be far more efficient, since a lookup by key does not have to scan the whole collection.
May I suggest some light reading:
https://msdn.microsoft.com/en-us/library/6tc79sx1(v=vs.110).aspx
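To make the distinction concrete, here is a small sketch, not part of the original answer, that contrasts the access patterns. The class LookupComparison, its method names, and the idea of mapping a company name to an address are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LookupComparison
{
    // Random pick: only a valid index is needed, so a List (or array) is enough.
    public static string PickRandom(List<string> items, Random rng)
    {
        return items[rng.Next(items.Count)];
    }

    // Lookup by value in a List: scans entries one by one until a match is found.
    public static int FindIndexByScan(List<string> items, string wanted)
    {
        return items.IndexOf(wanted); // -1 when the value is not present
    }

    // Lookup by key in a Dictionary: hashes the key instead of scanning.
    public static bool TryGetAddress(
        Dictionary<string, string> addressByCompany, string company, out string address)
    {
        return addressByCompany.TryGetValue(company, out address);
    }
}
```

PickRandom costs the same regardless of list size. FindIndexByScan gets slower as the list grows, which is the case where a dictionary pays off.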
In the case of this Person class example, all I need is a whole lot of different combinations of personal information to create a read-only rap sheet for a random person. The only reason I want to generate it randomly is so that I can create, say, a thousand people in one playthrough who are somewhat unique and not repeated in another playthrough.
So in that case, would you say the example I posted is a viable solution?
Thanks for the links, I'll dig into those.
Absolutely. If you don't plan on ordering the list, searching for specific names, or anything along those lines, you really can't beat direct access by index.
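As a rough sketch of that approach, the engine-independent example below fills read-only arrays once and then builds people purely by random index. The PersonGenerator class, the lastName field, and the optional seed parameter are assumptions added for illustration. In Unity the arrays would be filled from the TextAsset files, and UnityEngine.Random could replace System.Random.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public string firstName;
    public string lastName; // hypothetical second field, not in the original post
}

public class PersonGenerator
{
    private readonly string[] firstNames;
    private readonly string[] lastNames;
    private readonly Random rng;

    // Passing a seed makes a run repeatable; omitting it lets the runtime choose the seed.
    public PersonGenerator(string[] firstNames, string[] lastNames, int? seed = null)
    {
        this.firstNames = firstNames;
        this.lastNames = lastNames;
        rng = seed.HasValue ? new Random(seed.Value) : new Random();
    }

    // Each field is a single array access at a random, always-valid index.
    public Person GeneratePerson()
    {
        return new Person
        {
            firstName = firstNames[rng.Next(firstNames.Length)],
            lastName = lastNames[rng.Next(lastNames.Length)]
        };
    }

    // Generates `count` people up front, e.g. at the start of a new game.
    public List<Person> GeneratePeople(int count)
    {
        var people = new List<Person>(count);
        for (int i = 0; i < count; i++)
        {
            people.Add(GeneratePerson());
        }
        return people;
    }
}
```

For example, new PersonGenerator(firstNames, lastNames).GeneratePeople(1000) produces a thousand people in one call. Storing the seed in a save file would be one way to reproduce the same set when a saved game is loaded.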