Parse String and Serialise
Hello,
I have a body of text that comes into my game, and I need to parse and serialise certain sections of it because they contain fields and values for a calculator. The text comes from an external source that I have no control over. Let me show you the text body first:
---------------
Hello, this is some text.
Below you'll see a calculator
[calculator
field1="value1" field2="value2" field3="value3" field4="value4"
/]
Above is a calculator.
---------------
So firstly, I need to remove the [calculator ... /] section (which is a static tag) from the text. I then need to take that removed section and parse it. The format is always:
field="value"
With only a space or line break separating each entry.
I've been using string.Split, but it is very messy, and I'm pretty sure there are easier methods than this?
Thanks.
If the format is known and consistent then consider regular expressions to parse the data.
Answer by sarahnorthway · Jul 11, 2019 at 06:41 PM
Sounds like a job for the mighty Regular Expression! Regex patterns can get gnarly but regex101.com is a fantastic resource to help build and debug them.
I would pull out the field1="value1" field2="value2" string in a first step, then pull the pairs from that.
Like...
using System.Text.RegularExpressions;
using UnityEngine;
[ExecuteInEditMode]
public class test : MonoBehaviour {
    private static string text = " ---------------\r\n Hello, this is some text.\r\n Below you'll see a calculator\r\n \r\n [calculator\r\n field1=\"value1\" field2=\"value2\" field3=\"value3\" field4=\"value4\"\r\n /]\r\n \r\n Above is a calculator.\r\n ---------------";
    void OnEnable() {
        // look for single match of whatever's between [calculator and /]
        string pattern = @"\[calculator([^\]]*)\/\]";
        Match match = Regex.Match(text, pattern);
        if (!match.Success) {
            Debug.LogWarning("Invalid calculator string.");
            return;
        }
        // field1="value1" field2="value2" field3="value3" field4="value4"
        // Groups[0] is the entire match, Groups[1] is the 1st capturing group denoted by ()
        string pairs = match.Groups[1].Value.Trim();
        Debug.Log("Pairs: " + pairs);
        // look for multiple matches of XXXX="YYYY"
        // to escape " in an @ string replace with ""
        string pairsPattern = @"([^=]*)=""([^""]*)""";
        MatchCollection matches = Regex.Matches(pairs, pairsPattern);
        foreach (Match pairsMatch in matches) {
            string field = pairsMatch.Groups[1].Value.Trim();
            string value = pairsMatch.Groups[2].Value.Trim();
            Debug.Log(field + "=" + value);
        }
    }
}
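The question also asks for the [calculator ... /] section to be removed from the body text, which the snippet above does not do. A minimal sketch of that step, assuming the same tag pattern and using Regex.Replace, might look like the following. The class and method names (CalculatorStripper, StripCalculators) are made up for illustration, and the code uses plain .NET so it does not depend on Unity.

```csharp
using System.Text.RegularExpressions;

public static class CalculatorStripper
{
    // Same tag pattern as in the answer; no capture group is needed for removal.
    // [^\]]* also matches line breaks, so multi-line tags are covered.
    private const string TagPattern = @"\[calculator[^\]]*/\]";

    // Returns the body text with every [calculator ... /] block removed.
    // Whitespace around the removed block is left untouched.
    public static string StripCalculators(string body)
    {
        return Regex.Replace(body, TagPattern, string.Empty);
    }
}
```

Calling CalculatorStripper.StripCalculators(text) would return the surrounding prose only. Like the original pattern, this assumes no value inside the tag contains a ] character.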
Hello @sarahnorthway , thank you very much for this. I have heard of Regex, and it does seem very useful indeed. Thank you for the Regex tester, I'm having a play now. Could you comment/explain a bit further? On line 19, why are you using Trim? Also I inserted the field=value into the Regex tester using your expression, and I get 'Your regular expression does not match the subject string.' -- Update -- I replaced your expression with this;
([^=]*)="([^""]*)
And it works on the tester
Yeah using "" to escape " always throws me. In the sim it should be ([^=]*)="([^"]*)"
which means (anything NOT an equals sign)="(anything NOT a double quote)"
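To make the quoting difference concrete, here is a small sketch showing that the C# verbatim literal and an ordinary escaped literal both produce the pattern text that goes into the tester. The class and member names (PatternForms, Verbatim, Escaped) are hypothetical, chosen only for this example.

```csharp
using System.Text.RegularExpressions;

public static class PatternForms
{
    // Verbatim string: each "" stands for one literal " character.
    public const string Verbatim = @"([^=]*)=""([^""]*)""";

    // Regular string: each \" stands for one literal " character.
    public const string Escaped = "([^=]*)=\"([^\"]*)\"";

    // True when both literals yield the same pattern text,
    // i.e. the form you would paste into a regex tester.
    public static bool SamePattern()
    {
        return Verbatim == Escaped;
    }

    // Returns the value part (capture group 2) of a single field="value" pair.
    public static string ValueOf(string pair)
    {
        return Regex.Match(pair, Verbatim).Groups[2].Value;
    }
}
```

When copying a pattern from C# source into a tester, the doubled quotes (or backslash escapes) need to be collapsed to single quotes first.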
I always Trim to remove whitespace on unknown values just to be safe... but in this case the 2nd+ field name will end up with a space at the start if you don't. To account for it you could do string pairsPattern = @"\s*([^=]*)=""([^""]*)""";
which matches but ignores 0 or more whitespace characters (\s) from the start of each pair.
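As a sketch of how the pairs could be collected without any Trim calls, the version below skips leading whitespace with \s* (the whitespace class) and stores each pair in a dictionary. It is a variation rather than the exact pattern from the thread: the field name is tightened to [^=\s]+ so it can never contain whitespace. The names PairParser and ParsePairs are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PairParser
{
    // \s*        skips spaces and line breaks before a pair (not captured)
    // ([^=\s]+)  field name: one or more characters that are neither = nor whitespace
    // ([^""]*)   value: anything up to the closing double quote
    private const string PairPattern = @"\s*([^=\s]+)=""([^""]*)""";

    // Parses text such as: field1="value1" field2="value2"
    public static Dictionary<string, string> ParsePairs(string pairs)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(pairs, PairPattern))
        {
            // Assumption: if a field name repeats, the later value wins.
            result[m.Groups[1].Value] = m.Groups[2].Value;
        }
        return result;
    }
}
```

A dictionary is only one possible container; a list of pairs would preserve duplicates and order if that matters for the calculator.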
Thanks for this, one thing I forgot to mention is that there could be multiple calculators in the same text, loads in fact. Can the Match take this into account?
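One possible way to handle several calculators, offered as a sketch rather than a confirmed answer from the thread: Regex.Match returns only the first match, whereas Regex.Matches returns all of them, so the outer step can loop over every [calculator ... /] block and parse each one separately. The class and method names (MultiCalculatorParser, ParseAll) are made up for illustration, and the pair pattern is the tightened \s* variant rather than the original.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MultiCalculatorParser
{
    // Group 1 captures everything between [calculator and /].
    private const string TagPattern = @"\[calculator([^\]]*)/\]";

    // Leading whitespace is skipped; group 1 is the field, group 2 the value.
    private const string PairPattern = @"\s*([^=\s]+)=""([^""]*)""";

    // Returns one dictionary of field/value pairs per calculator block,
    // in the order the blocks appear in the body text.
    public static List<Dictionary<string, string>> ParseAll(string body)
    {
        var calculators = new List<Dictionary<string, string>>();
        foreach (Match tag in Regex.Matches(body, TagPattern))
        {
            var fields = new Dictionary<string, string>();
            foreach (Match pair in Regex.Matches(tag.Groups[1].Value, PairPattern))
            {
                fields[pair.Groups[1].Value] = pair.Groups[2].Value;
            }
            calculators.Add(fields);
        }
        return calculators;
    }
}
```

Each block is matched independently because [^\]]* cannot run past the first ], so two calculators in the same text never merge into one match.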