- Home /
Fastest way to store huge amounts of data?
Hello
I'm working on a script, that needs to manage a huge amount of data, with a lot of datashifting.
So I need to know, what kind of storage is most effective in this case.
As far as I know, the List is not bad, but not as fast as built in arrays. But because I need to add and remove data, I would need to use the Array.Resize method. But as far as I read in this thread, that doesn't work on ios... Also, the speed of this resize method can be slower...
So when I need to manage multiple tentousands of objects, would the List be the best choise?
Greetings
Chillersanim
List are not slower than array since they are themselves array. So just use a list. The only difference is that they take more space than regular array so that they can add and remove without constantly resizing.
Refering to this answer, arrays are much faster than lists. But that is not the question, I would rather want to know, if the resize method from unity undoes the speed advantage from arrays or not.
But if you didn't refer to the .Net List, could you please tell me which list meant?
Greetings
Chillersanim
I did refer to .NET list so Unity must have specific implementation on the Built-in array (which may be the reason for the built-in na$$anonymous$$g) http://stackoverflow.com/questions/454916/performance-of-arrays-vs-lists
But did you read through the comments as well? It is said that the resizing of the array is way too slow to be considered against list. One other mentions that the performance difference is depending on the type of data, it is saidthat using GameObject will not affect the result much while int gets array to be faster.
You could also develop your own array system which could be a linked list. You would only gain on the resizing though as getting info would become slower.
I would suggest you to do your own benchmark and pick the most appropriate. Now if you are to add and remove, do not even bother, use List.
Ok, I will do some benchmark, althought the access speed for the lists I use gets slower the more items I have.
For more informations, read the comment I've written to $$anonymous$$TheDev.
Greetings
Chillersanim
Answer by whydoidoit · Feb 21, 2014 at 08:26 AM
You could use this version of List which gives you direct access to the underlying array (with .array) when you are not inserting or removing elements. Then it is as fast as an array when accessing the values:
namespace System.Collections.Generic {
using System;
using System.Diagnostics;
using System.Collections.ObjectModel;
using System.Security.Permissions;
// Implements a variable-size List that uses an array of objects to store the
// elements. A List has a capacity, which is the allocated length
// of the internal array. As elements are added to a List, the capacity
// of the List is automatically increased as required by reallocating the
// internal array.
//
[Serializable()]
public class MyList<T> : IList<T>, System.Collections.IList
{
private const int _defaultCapacity = 4;
public T[] _items;
public int _size;
public int _version;
[NonSerialized]
private Object _syncRoot;
static T[] _emptyArray = new T[0];
// Constructs a List. The list is initially empty and has a capacity
// of zero. Upon adding the first element to the list the capacity is
// increased to 16, and then increased in multiples of two as required.
public MyList() {
_items = _emptyArray;
}
// Constructs a List with a given initial capacity. The list is
// initially empty, but will have room for the given number of elements
// before any reallocations are required.
//
public MyList(int capacity) {
_items = new T[capacity];
}
// Constructs a List, copying the contents of the given collection. The
// size and capacity of the new list will both be equal to the size of the
// given collection.
//
public MyList(IEnumerable<T> collection) {
ICollection<T> c = collection as ICollection<T>;
if( c != null) {
int count = c.Count;
_items = new T[count];
c.CopyTo(_items, 0);
_size = count;
}
else {
_size = 0;
_items = new T[_defaultCapacity];
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
Add(en.Current);
}
}
}
}
// Gets and sets the capacity of this list. The capacity is the size of
// the internal array used to hold items. When set, the internal
// array of the list is reallocated to the given capacity.
//
public int Capacity {
get { return _items.Length; }
set {
if (value != _items.Length) {
if (value > 0) {
T[] newItems = new T[value];
if (_size > 0) {
Array.Copy(_items, 0, newItems, 0, _size);
}
_items = newItems;
}
else {
_items = _emptyArray;
}
}
}
}
// Read-only property describing how many elements are in the List.
public int Count {
get { return _size; }
}
bool System.Collections.IList.IsFixedSize {
get { return false; }
}
// Is this List read-only?
bool ICollection<T>.IsReadOnly {
get { return false; }
}
bool System.Collections.IList.IsReadOnly {
get { return false; }
}
// Is this List synchronized (thread-safe)?
bool System.Collections.ICollection.IsSynchronized {
get { return false; }
}
// Synchronization root for this object.
Object System.Collections.ICollection.SyncRoot {
get {
if( _syncRoot == null) {
System.Threading.Interlocked.CompareExchange(ref _syncRoot, new Object(), null);
}
return _syncRoot;
}
}
// Sets or Gets the element at the given index.
//
public T this[int index] {
get {
return _items[index];
}
set {
_items[index] = value;
}
}
public T[] array {
get {
return _items;
}
}
private static bool IsCompatibleObject(object value) {
if( (value is T) || ( value == null && !typeof(T).IsValueType) ) {
return true;
}
return false;
}
private static void VerifyValueType(object value) {
}
Object System.Collections.IList.this[int index] {
get {
return this[index];
}
set {
this[index] = (T) value;
}
}
// Adds the given object to the end of this list. The size of the list is
// increased by one. If required, the capacity of the list is doubled
// before adding the new element.
//
public void Add(T item) {
if (_size == _items.Length) EnsureCapacity(_size + 1);
_items[_size++] = item;
_version++;
}
int System.Collections.IList.Add(Object item)
{
VerifyValueType(item);
Add((T) item);
return Count - 1;
}
// Adds the elements of the given collection to the end of this list. If
// required, the capacity of the list is increased to twice the previous
// capacity or the new size, whichever is larger.
//
public void AddRange(IEnumerable<T> collection) {
InsertRange(_size, collection);
}
public ReadOnlyCollection<T> AsReadOnly() {
return new ReadOnlyCollection<T>(this);
}
// Searches a section of the list for a given element using a binary search
// algorithm. Elements of the list are compared to the search value using
// the given IComparer interface. If comparer is null, elements of
// the list are compared to the search value using the IComparable
// interface, which in that case must be implemented by all elements of the
// list and the given search value. This method assumes that the given
// section of the list is already sorted; if this is not the case, the
// result will be incorrect.
//
// The method returns the index of the given value in the list. If the
// list does not contain the given value, the method returns a negative
// integer. The bitwise complement operator (~) can be applied to a
// negative result to produce the index of the first element (if any) that
// is larger than the given search value. This is also the index at which
// the search value should be inserted into the list in order for the list
// to remain sorted.
//
// The method uses the Array.BinarySearch method to perform the
// search.
//
public int BinarySearch(int index, int count, T item, IComparer<T> comparer) {
return Array.BinarySearch<T>(_items, index, count, item, comparer);
}
public int BinarySearch(T item)
{
return BinarySearch(0,Count,item, null);
}
public int BinarySearch(T item, IComparer<T> comparer)
{
return BinarySearch(0,Count,item,comparer);
}
// Clears the contents of List.
public void Clear() {
if (_size > 0)
{
Array.Clear(_items, 0, _size); // Don't need to doc this but we clear the elements so that the gc can reclaim the references.
_size = 0;
}
_version++;
}
// Contains returns true if the specified element is in the List.
// It does a linear, O(n) search. Equality is determined by calling
// item.Equals().
//
public bool Contains(T item) {
if ((Object) item == null) {
for(int i=0; i<_size; i++)
if ((Object) _items[i] == null)
return true;
return false;
}
else {
EqualityComparer<T> c = EqualityComparer<T>.Default;
for(int i=0; i<_size; i++) {
if (c.Equals(_items[i], item)) return true;
}
return false;
}
}
bool System.Collections.IList.Contains(Object item)
{
if(IsCompatibleObject(item)) {
return Contains((T) item);
}
return false;
}
public MyList<TOutput> ConvertAll<TOutput>(Converter<T,TOutput> converter) {
MyList<TOutput> list = new MyList<TOutput>(_size);
for( int i = 0; i< _size; i++) {
list._items[i] = converter(_items[i]);
}
list._size = _size;
return list;
}
// Copies this List into array, which must be of a
// compatible array type.
//
public void CopyTo(T[] array) {
CopyTo(array, 0);
}
// Copies this List into array, which must be of a
// compatible array type.
//
void System.Collections.ICollection.CopyTo(Array array, int arrayIndex) {
try {
// Array.Copy will check for NULL.
Array.Copy(_items, 0, array, arrayIndex, _size);
}
catch(ArrayTypeMismatchException e){
throw e;
}
}
// Copies a section of this list to the given array at the given index.
//
// The method uses the Array.Copy method to copy the elements.
//
public void CopyTo(int index, T[] array, int arrayIndex, int count) {
// Delegate rest of error checking to Array.Copy.
Array.Copy(_items, index, array, arrayIndex, count);
}
public void CopyTo(T[] array, int arrayIndex) {
// Delegate rest of error checking to Array.Copy.
Array.Copy(_items, 0, array, arrayIndex, _size);
}
// Ensures that the capacity of this list is at least the given minimum
// value. If the currect capacity of the list is less than min, the
// capacity is increased to twice the current capacity or to min,
// whichever is larger.
private void EnsureCapacity(int min) {
if (_items.Length < min) {
int newCapacity = _items.Length == 0? _defaultCapacity : _items.Length * 2;
if (newCapacity < min) newCapacity = min;
Capacity = newCapacity;
}
}
public bool Exists(Predicate<T> match) {
return FindIndex(match) != -1;
}
public T Find(Predicate<T> match) {
for(int i = 0 ; i < _size; i++) {
if(match(_items[i])) {
return _items[i];
}
}
return default(T);
}
public List<T> FindAll(Predicate<T> match) {
List<T> list = new List<T>();
for(int i = 0 ; i < _size; i++) {
if(match(_items[i])) {
list.Add(_items[i]);
}
}
return list;
}
public int FindIndex(Predicate<T> match) {
return FindIndex(0, _size, match);
}
public int FindIndex(int startIndex, Predicate<T> match) {
return FindIndex( startIndex, _size - startIndex, match);
}
public int FindIndex(int startIndex, int count, Predicate<T> match) {
int endIndex = startIndex + count;
for( int i = startIndex; i < endIndex; i++) {
if( match(_items[i])) return i;
}
return -1;
}
public T FindLast(Predicate<T> match) {
for(int i = _size - 1 ; i >= 0; i--) {
if(match(_items[i])) {
return _items[i];
}
}
return default(T);
}
public int FindLastIndex(Predicate<T> match) {
return FindLastIndex( _size - 1, _size, match);
}
public int FindLastIndex(int startIndex, Predicate<T> match) {
return FindLastIndex( startIndex, startIndex + 1, match);
}
public int FindLastIndex(int startIndex, int count, Predicate<T> match) {
int endIndex = startIndex - count;
for( int i = startIndex; i > endIndex; i--) {
if( match(_items[i])) {
return i;
}
}
return -1;
}
public void ForEach(Action<T> action) {
for(int i = 0 ; i < _size; i++) {
action(_items[i]);
}
}
// Returns an enumerator for this list with the given
// permission for removal of elements. If modifications made to the list
// while an enumeration is in progress, the MoveNext and
// GetObject methods of the enumerator will throw an exception.
//
public Enumerator GetEnumerator() {
return new Enumerator(this);
}
/// <internalonly/>
IEnumerator<T> IEnumerable<T>.GetEnumerator() {
return new Enumerator(this);
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
return new Enumerator(this);
}
public MyList<T> GetRange(int index, int count) {
MyList<T> list = new MyList<T>(count);
Array.Copy(_items, index, list._items, 0, count);
list._size = count;
return list;
}
// Returns the index of the first occurrence of a given value in a range of
// this list. The list is searched forwards from beginning to end.
// The elements of the list are compared to the given value using the
// Object.Equals method.
//
// This method uses the Array.IndexOf method to perform the
// search.
//
public int IndexOf(T item) {
return Array.IndexOf(_items, item, 0, _size);
}
int System.Collections.IList.IndexOf(Object item)
{
if(IsCompatibleObject(item)) {
return IndexOf((T)item);
}
return -1;
}
// Returns the index of the first occurrence of a given value in a range of
// this list. The list is searched forwards, starting at index
// index and ending at count number of elements. The
// elements of the list are compared to the given value using the
// Object.Equals method.
//
// This method uses the Array.IndexOf method to perform the
// search.
//
public int IndexOf(T item, int index) {
return Array.IndexOf(_items, item, index, _size - index);
}
// Returns the index of the first occurrence of a given value in a range of
// this list. The list is searched forwards, starting at index
// index and upto count number of elements. The
// elements of the list are compared to the given value using the
// Object.Equals method.
//
// This method uses the Array.IndexOf method to perform the
// search.
//
public int IndexOf(T item, int index, int count) {
return Array.IndexOf(_items, item, index, count);
}
// Inserts an element into this list at a given index. The size of the list
// is increased by one. If required, the capacity of the list is doubled
// before inserting the new element.
//
public void Insert(int index, T item) {
// Note that insertions at the end are legal.
if (_size == _items.Length) EnsureCapacity(_size + 1);
if (index < _size) {
Array.Copy(_items, index, _items, index + 1, _size - index);
}
_items[index] = item;
_size++;
_version++;
}
void System.Collections.IList.Insert(int index, Object item)
{
VerifyValueType(item);
Insert(index, (T) item);
}
// Inserts the elements of the given collection at a given index. If
// required, the capacity of the list is increased to twice the previous
// capacity or the new size, whichever is larger. Ranges may be added
// to the end of the list by setting index to the List's size.
//
public void InsertRange(int index, IEnumerable<T> collection) {
ICollection<T> c = collection as ICollection<T>;
if( c != null ) { // if collection is ICollection<T>
int count = c.Count;
if (count > 0) {
EnsureCapacity(_size + count);
if (index < _size) {
Array.Copy(_items, index, _items, index + count, _size - index);
}
// If we're inserting a List into itself, we want to be able to deal with that.
if (this == c) {
// Copy first part of _items to insert location
Array.Copy(_items, 0, _items, index, index);
// Copy last part of _items back to inserted location
Array.Copy(_items, index+count, _items, index*2, _size-index);
}
else {
T[] itemsToInsert = new T[count];
c.CopyTo(itemsToInsert, 0);
itemsToInsert.CopyTo(_items, index);
}
_size += count;
}
}
else {
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
Insert(index++, en.Current);
}
}
}
_version++;
}
// Returns the index of the last occurrence of a given value in a range of
// this list. The list is searched backwards, starting at the end
// and ending at the first element in the list. The elements of the list
// are compared to the given value using the Object.Equals method.
//
// This method uses the Array.LastIndexOf method to perform the
// search.
//
public int LastIndexOf(T item)
{
return LastIndexOf(item, _size - 1, _size);
}
// Returns the index of the last occurrence of a given value in a range of
// this list. The list is searched backwards, starting at index
// index and ending at the first element in the list. The
// elements of the list are compared to the given value using the
// Object.Equals method.
//
// This method uses the Array.LastIndexOf method to perform the
// search.
//
public int LastIndexOf(T item, int index)
{
return LastIndexOf(item, index, index + 1);
}
// Returns the index of the last occurrence of a given value in a range of
// this list. The list is searched backwards, starting at index
// index and upto count elements. The elements of
// the list are compared to the given value using the Object.Equals
// method.
//
// This method uses the Array.LastIndexOf method to perform the
// search.
//
public int LastIndexOf(T item, int index, int count) {
if (_size == 0) {
return -1;
}
return Array.LastIndexOf(_items, item, index, count);
}
// Removes the element at the given index. The size of the list is
// decreased by one.
//
public bool Remove(T item) {
int index = IndexOf(item);
if (index >= 0) {
RemoveAt(index);
return true;
}
return false;
}
void System.Collections.IList.Remove(Object item)
{
if(IsCompatibleObject(item)) {
Remove((T) item);
}
}
// This method removes all items which matches the predicate.
// The complexity is O(n).
public int RemoveAll(Predicate<T> match) {
int freeIndex = 0; // the first free slot in items array
// Find the first item which needs to be removed.
while( freeIndex < _size && !match(_items[freeIndex])) freeIndex++;
if( freeIndex >= _size) return 0;
int current = freeIndex + 1;
while( current < _size) {
// Find the first item which needs to be kept.
while( current < _size && match(_items[current])) current++;
if( current < _size) {
// copy item to the free slot.
_items[freeIndex++] = _items[current++];
}
}
Array.Clear(_items, freeIndex, _size - freeIndex);
int result = _size - freeIndex;
_size = freeIndex;
_version++;
return result;
}
// Removes the element at the given index. The size of the list is
// decreased by one.
//
public void RemoveAt(int index) {
_size--;
if (index < _size) {
Array.Copy(_items, index + 1, _items, index, _size - index);
}
_items[_size] = default(T);
_version++;
}
// Removes a range of elements from this list.
//
public void RemoveRange(int index, int count) {
if (count > 0) {
int i = _size;
_size -= count;
if (index < _size) {
Array.Copy(_items, index + count, _items, index, _size - index);
}
Array.Clear(_items, _size, count);
_version++;
}
}
// Reverses the elements in this list.
public void Reverse() {
Reverse(0, Count);
}
// Reverses the elements in a range of this list. Following a call to this
// method, an element in the range given by index and count
// which was previously located at index i will now be located at
// index index + (index + count - i - 1).
//
// This method uses the Array.Reverse method to reverse the
// elements.
//
public void Reverse(int index, int count) {
Array.Reverse(_items, index, count);
_version++;
}
// Sorts the elements in this list. Uses the default comparer and
// Array.Sort.
public void Sort()
{
Sort(0, Count, null);
}
// Sorts the elements in this list. Uses Array.Sort with the
// provided comparer.
public void Sort(IComparer<T> comparer)
{
Sort(0, Count, comparer);
}
// Sorts the elements in a section of this list. The sort compares the
// elements to each other using the given IComparer interface. If
// comparer is null, the elements are compared to each other using
// the IComparable interface, which in that case must be implemented by all
// elements of the list.
//
// This method uses the Array.Sort method to sort the elements.
//
public void Sort(int index, int count, IComparer<T> comparer) {
Array.Sort<T>(_items, index, count, comparer);
_version++;
}
public class ComparisonComparer<T> : IComparer<T>
{
private readonly Comparison<T> _comparison;
public ComparisonComparer(Comparison<T> comparison)
{
_comparison = comparison;
}
public int Compare(T x, T y)
{
return _comparison(x, y);
}
}
public void Sort(Comparison<T> comparison) {
if( _size > 0) {
IComparer<T> comparer = new ComparisonComparer<T>(comparison);
Array.Sort(_items, 0, _size, comparer);
}
}
// ToArray returns a new Object array containing the contents of the List.
// This requires copying the List, which is an O(n) operation.
public T[] ToArray() {
T[] array = new T[_size];
Array.Copy(_items, 0, array, 0, _size);
return array;
}
// Sets the capacity of this list to the size of the list. This method can
// be used to minimize a list's memory overhead once it is known that no
// new elements will be added to the list. To completely clear a list and
// release all memory referenced by the list, execute the following
// statements:
//
// list.Clear();
// list.TrimExcess();
//
public void TrimExcess() {
int threshold = (int)(((double)_items.Length) * 0.9);
if( _size < threshold ) {
Capacity = _size;
}
}
public bool TrueForAll(Predicate<T> match) {
for(int i = 0 ; i < _size; i++) {
if( !match(_items[i])) {
return false;
}
}
return true;
}
[Serializable()]
public struct Enumerator : IEnumerator<T>, System.Collections.IEnumerator
{
private MyList<T> list;
private int index;
private int version;
private T current;
internal Enumerator(MyList<T> list) {
this.list = list;
index = 0;
version = list._version;
current = default(T);
}
public void Dispose() {
}
public bool MoveNext() {
MyList<T> localList = list;
if (version == localList._version && ((uint)index < (uint)localList._size))
{
current = localList._items[index];
index++;
return true;
}
return MoveNextRare();
}
private bool MoveNextRare()
{
index = list._size + 1;
current = default(T);
return false;
}
public T Current {
get {
return current;
}
}
Object System.Collections.IEnumerator.Current {
get {
return Current;
}
}
void System.Collections.IEnumerator.Reset() {
index = 0;
current = default(T);
}
}
}
}
Hmmm, that looks indeed good. I will surely give it a try.
I made some testing, in the editor, the script slows down, when there is a loot of datashifting. I already cache the list length so that it is only asked from the lists once in each frame.
In the method Remove(), do you actualy reduce the size if possible? I've only seen that you double the size if necessary, but I've seen nothing that reduces the size. Also the comment says, that the size is reduced by one, but if you doubles the size in the Add() method, shouldn't it halve the size (when possible) in the Remove() method?
When I want to loop through an amount of objects, should I use the built in ForEach method? Or is a simple for(...) method even fast?
Thanks and greetings Chillersanim
Well I commented above that perhaps a Dictionary/Hashset combination would be faster. I posted this answer as a way of getting good access performance, but after reading your comment to the other answer I then felt that it is probably the addition and removal which is hurting you.
This code above works for me, but i only add and remove things sporadically - so it's access time that matters to me.
Well, it's booth. I have a certain amount of objects that I update each frame, so this amount is the maximum of adds and removes I'm doing. But when there are more objects, the access time is also lower. In the update() method, I only access objects by index, so the solution from whydoidoit is probably the best, because I have direct access to the array.
Greetings
Chillersanim
Answer by RudyTheDev · Feb 20, 2014 at 01:42 PM
If you are doing a lot of inserts/removals, something like LinkedList
is a good option (a bit slower than List
and random access is bad, but doesn't have to copy up to an entire list every time you insert/remove).
There are marginal benefits to using arrays versus lists (which are just arrays with a fancy wrappers as fafase says) and most of the time you will end up reinventing the wheel. Arrays are slightly faster under certain conditions, but it's unlikely you are at a point where this matters yet (read: micro-optimization). 10k objects isn't that many and there are usually better ways to optimize.
This depends on what the "data" is, but--after basic considerations--optimization is almost always about algorithms first. For example (if your data is not some directly real-time related thing), don't iterate all 10k objects every frame. Do only 1/10th of them per frame or have the rest of 9/10th objects do a lite version of their function. Suddenly you are doing things up to 10x faster. And it is unlikely the user could ever notice something updates only 6 times versus 60 times per second (especially, when you have 10k of those things).
If you really need to squeeze performance, then we'd need to know what kind of data you are storing and how you are using and accessing it. For "data shifting" something that links elements together is best, in this case probably a LinkedList
.
Actually, I'm using the lists for a lod system, that should support up to one million game objects with all their lod levels. I already have a optimized algorythm that only updates a certain amount of objects, but I noticed, that the access speed for the lists decrases they more items are stored in them.
Because I use multiple lists, for different distances (near objects get updated more often than far objects), I also have to move this objects regularly.
At least, the idea of arrays with resize() is now down, I will try if the LinkedList have a benefit thought.
Greetings
Chillersanim
Give a try to Sort for sorting list. Since you only deal with the closest ones. Now if it sorts the whole list constantly, it won't help but you can design a sort method that only deals with some of the objects.
@chillersanim: I'm not sure I would do LOD with multiple lists that way. I would keep all of them in the same list, but store a value of how detailed they are. When the list is iterated, deal with the objects the way their detail level requires. And once in a while re-check if their detail level has/needs to be changed.
I've two levels, one is the lod level, this one is stored in the object and has nothing to do with the lists.
The other one is the distance level, this is used to prefer near objects, so that they get updated more often than far objects. This level is stored in the lists. I need this lists, if I wouldn't use them, I had to iterate through the whole list to check which object is prefered and which not.
Also I don't sort, the only thing I do is, when an object changes the distance level, I move it to another list.
Greetings Chillersanim
So I think your problem is the addition and removal of items, which is very slow for a large list as the data needs to be copied around. So in fact lots of lists would be more useful than a two as they would take less time to add and remove items.
It sounds like you should use a dictionary of dictionaries perhaps as these have the fastest performance for "contains" style lookup.
Dictionary<int, HashSet<object>> lookup;
Perhaps you would use the int outer dictionary to specify ranges etc.
Your answer
Follow this Question
Related Questions
Distribute terrain in zones 3 Answers
The name 'Joystick' does not denote a valid type ('not found') 2 Answers
Multiple Cars not working 1 Answer
Waiting twice inside coroutine (C#) 2 Answers