Optimize the reading of a text file?
I've followed a lot of suggestions for optimizing graphics performance, and my project currently sails along quickly and lightly in terms of rendering.
I do, however, have some overhead on loading that I'd like to minimize if possible. It comes down to reading a text file of objects and coordinates (this approach is questionable, but let's stick with it as a given).
I'm reading in a file and only a subset of its lines interest me (those whose initial characters are "MARK"). Of those, I only want to store coordinate information for certain line types - I check for the line type by looking up a short identifier from the line string in a hash I populate at the start of the code.
I'm using builtin arrays for everything below.
I have the following:
if (textAsset == null) return;

reader = new StringReader(textAsset.text);
line = reader.ReadLine();

while (line != null) {
    if (line.Substring(0, 4) == "MARK")
    {
        var identifier = Regex.Replace(line.Substring(13, 4), "( )+", "");
        if (lineType[identifier])
        {
            i++;
            totalCounts++;
            vec[i] = Vector3(float.Parse(line.Substring(30, 8)), float.Parse(line.Substring(38, 8)), float.Parse(line.Substring(46, 8)));
            vecSum += Vector3(float.Parse(line.Substring(30, 8)), float.Parse(line.Substring(38, 8)), float.Parse(line.Substring(46, 8)));
            print(totalCounts);
            print(DateTime.Now);
        }
    }
}
I'm clocking about 200 lines read per second. I'm wondering if anyone sees anything in the above code that could be streamlined for a time gain?
Cheers, and thanks for reading this far!
Answer by Mike 3 · Jul 05, 2010 at 04:16 PM
First thing I see - you're parsing the vectors twice each. Store the intermediate vector once, then use it both for vec[i] and for adding to vecSum.
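A minimal sketch of that fix, using System.Numerics.Vector3 as a stand-in for Unity's Vector3 and made-up coordinate fields of the kind Substring() would return:

```csharp
using System;
using System.Globalization;
using System.Numerics;

class ParseOnceSketch
{
    static void Main()
    {
        // Hypothetical coordinate fields, as Substring(30,8) etc. might yield.
        string xs = "   1.000", ys = "   2.000", zs = "   3.000";

        Vector3[] vec = new Vector3[1];
        Vector3 vecSum = Vector3.Zero;

        // Parse the three floats once, store the result, then reuse it
        // for both the array slot and the running sum.
        Vector3 v = new Vector3(
            float.Parse(xs, CultureInfo.InvariantCulture),
            float.Parse(ys, CultureInfo.InvariantCulture),
            float.Parse(zs, CultureInfo.InvariantCulture));
        vec[0] = v;
        vecSum += v;

        Console.WriteLine(vecSum.X); // prints 1
    }
}
```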
Second, you really should be using the overload of float.Parse that takes a CultureInfo object - otherwise on european computers, it'll have problems reading US/UK floats (and vice versa)
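That overload looks like this (a plain .NET sketch, runnable outside Unity):

```csharp
using System;
using System.Globalization;

class InvariantParseSketch
{
    static void Main()
    {
        // Under the invariant culture, '.' is always the decimal separator.
        // With the one-argument float.Parse, a culture such as de-DE treats
        // '.' as a group separator, so "30.500" would come out as 30500.
        float x = float.Parse("  30.500", CultureInfo.InvariantCulture);
        Console.WriteLine(x.ToString(CultureInfo.InvariantCulture)); // prints 30.5
    }
}
```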
Third, you're printing two items each time - that'll slow you down ridiculously (it has to print the current stacktrace to a file, as well as displaying it in the console), so remove those if you're testing performance
Thanks for the pointer on #2. #3 is just temporary while I figure this out.
Hmmm... fixed #1 and deleted #3. Neither change reduces the time beyond the run-to-run variance in total time.
I think paring the input text file down will get me to where I need to go. I was hoping the read-in would be sufficiently fast that I could ignore large chunks of non-interesting data in my input files. Thanks for looking, Mike.
I'm assuming that you've skipped the bit of code which stops the above being an infinite loop, though - line never gets modified, so it'll just hang. Also, I'm not sure that reading line by line will be faster than reading the whole thing once and then splitting, if you don't need to break out early.
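The read-once-then-split idea could be sketched like this (plain C# outside Unity; the sample text is made up):

```csharp
using System;

class SplitSketch
{
    static void Main()
    {
        // Stand-in for textAsset.text.
        string text = "MARK  A 1.0\nREM   comment\nMARK  B 2.0";

        // Split the whole asset in one pass, then iterate a plain array;
        // no StringReader and no per-line ReadLine() call.
        string[] lines = text.Split('\n');
        int marks = 0;
        foreach (string line in lines)
        {
            // StartsWith also avoids allocating a new string the way
            // line.Substring(0, 4) == "MARK" does on every line.
            if (line.StartsWith("MARK", StringComparison.Ordinal))
                marks++;
        }
        Console.WriteLine(marks); // prints 2
    }
}
```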