Dictionary only matching last word in list
Hi,
I'm trying to write a simple dictionary program that checks whether a word entered by the user is present. I've created a GUI button that does the checking and outputs the result.
The program was working fine (matching ALL the words entered that were in the dictionary/text file) and then suddenly couldn't match any of the words apart from the last one in the list.
I can't for the life of me see what the problem is as all my debugging confirms that the words are being parsed from the dictionary.txt file into the dictionary correctly.
I'm using wordArray = MytextAsset.text.Split('\n'); // Have also tried ("\n"[0])
to separate the words in the text file and myDictionary = wordArray.ToDictionary(s => s, s => true);
to convert the array into a Dictionary. The text file looks like this;
one
two
three
So here, only 'three' can be matched/found in the dictionary; 'one' and 'two' can't be.
My check for the word is as follows:
return myDictionary.ContainsKey(word);
and only returns true for 'three' - the last word in the list. I have even set the word
string explicitly directly before the check, as follows; word = "one";
to eliminate any errors in passing it to the variable.
I thought that it could be a problem with the newline character in the text file for the first 2 words. However, I've tried creating the file several times, with the same result. The only difference I can see is that the word 'three' wouldn't have a newline character next to it (as I don't have to hit Enter to type another word on the next line), but 'one' and 'two' would.
Any help would be great.
Complete code below:
using UnityEngine;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class WordListDictionary : MonoBehaviour
{
    private string wordToCheck = " "; // Word to check against words in dictionary
    private Dictionary<string, bool> myDictionary = new Dictionary<string, bool>(); // Dictionary used to store words
    private string[] wordArray = new string[4]; // Array used to temporarily store the words in the text file
    public GameObject[] CharBoxArray = new GameObject[3]; // Array holding the boxes with child with TextMesh
    public GameObject[] SlotsArray = new GameObject[3]; // Array holding the slots with child with TextMesh
    private TextAsset MytextAsset; // Used to store the data in the dictionary.txt file
    private string resultString = " "; // Used to display string showing whether made word is found in dictionary or not

    void Awake()
    {
        // Read in the dictionary.txt file which is in the Resources folder in the project
        // and store it in a TextAsset variable/object
        MytextAsset = Resources.Load("dictionary", typeof(TextAsset)) as TextAsset;

        // Add the words to a string array and use the \n (newline) character to separate the words
        // So when a newline is reached, it means it's a new word
        wordArray = MytextAsset.text.Split('\n'); // Have also tried ("\n"[0])

        // For debugging
        print("wordArray[0] = " + wordArray[0]);
        print("wordArray[1] = " + wordArray[1]);
        print("wordArray[2] = " + wordArray[2]);

        // Convert the array containing the words to a Dictionary, passing in the correct KeyValuePair types (string, bool)
        myDictionary = wordArray.ToDictionary(s => s, s => true);

        // Simultaneously convert the Dictionary to a List and then run a ForEach loop to print out the values of the Keys
        // Note this is not the usual foreach loop which uses an enumerator. ForEach is part of the List<T> class and uses a for-loop
        myDictionary.ToList().ForEach(x => Debug.Log(x.Key + ", " + x.Value));
    }

    bool checkWord(string word)
    {
        print("myDictionary.ContainsKey(word) = " + myDictionary.ContainsKey(word));
        return myDictionary.ContainsKey(word);
    }

    void Update()
    {
        wordToCheck = "one";
        checkWord(wordToCheck);
        print("wordToCheck = " + wordToCheck);
    }
}
Have you double-checked that wordArray has the right content?
Yes, placing;
print("wordArray[0] = " + wordArray[0]);
print("wordArray[1] = " + wordArray[1]);
print("wordArray[2] = " + wordArray[2]);
beneath the Split('\n') line outputs the following:
wordArray[0] = one
wordArray[1] = two
wordArray[2] = three
and also to check the dictionary conversion, placing the line:
myDictionary.ToList().ForEach(x => Debug.Log(x.Key + ", " + x.Value));
beneath the line;
`myDictionary = wordArray.ToDictionary(s => s, s => true);`
outputs
one, True
two, True
three, True
Answer by KellyThomas · Feb 02, 2014 at 03:09 PM
I've not used ToDictionary() personally, but it looks like your code:
myDictionary = wordArray.ToDictionary(s => s, s => true);
might be resolving to this overload:
ToDictionary<TSource, TKey>(Func<TSource, TKey>, IEqualityComparer<TKey>)
If this is occurring, then passing true as the second parameter would mean that all values are treated as equal. In that case each new string may be overwriting the previous one, possibly explaining the behaviour you observe.
Are you planning on using the values stored in the Dictionary?
If you are only interested in the absence/presence of a key you might find success with this overload:
Dictionary<string, string> myDictionary = wordArray.ToDictionary<string, string>(s => s);
Or better still you could use a `HashSet`:
HashSet<string> dictHash = new HashSet<string>(wordArray);
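To illustrate the `HashSet` suggestion, here is a minimal sketch, assuming the word list contains no stray characters. The `words` array is a hypothetical stand-in for the file contents, and the snippet is written as a plain C# console program with top-level statements (C# 9 or later) rather than as a Unity script.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical word list standing in for the parsed contents of dictionary.txt
string[] words = { "one", "two", "three" };

// A HashSet stores only the keys, so no placeholder bool value is needed
var wordSet = new HashSet<string>(words);

// Contains plays the role that ContainsKey plays on a Dictionary
bool hasOne = wordSet.Contains("one");
bool hasFour = wordSet.Contains("four");

Console.WriteLine(hasOne);   // True
Console.WriteLine(hasFour);  // False
```

Inside the Unity script, the same `Contains` call could replace `ContainsKey` in `checkWord`.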
Thanks for your suggestions, Kelly - they are much appreciated.
Currently I only need to know if the word is present in the dictionary. I tried replacing the line with yours:
`Dictionary<string, string> myDictionary = wordArray.ToDictionary<string, string>(s => s);`
but it unfortunately gives the same result of 'false' for the first 2 words and 'true' for the 3rd. I used the same method to test for a word:
myDictionary.ContainsKey(word);
I've heard a lot about HashSets and their speed, so I will look into them if I can't get around this.
I've included the complete script at the bottom of my answer if it will help figure things out.
Thanks
OK, I ran your code and think I have identified the problem.
You are splitting on "`\n`", the line feed character, but it is likely that the line endings in your text file are CR+LF, or "`\r\n`".
This would result in all but the last word having the (invisible) "`\r`" character included at the end of its string.
Try using this code to split:
using System;
//...
wordArray = MytextAsset.text.Split(new char[]{'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);
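The effect of the hidden "`\r`" can be reproduced outside Unity. The following is a small sketch, assuming the file uses CR+LF line endings with no newline after the last word; the `text` string is a hypothetical stand-in for `MytextAsset.text`, and the snippet uses top-level statements (C# 9 or later).

```csharp
using System;

// Assumed file contents: CR+LF line endings, no newline after the last word
string text = "one\r\ntwo\r\nthree";

// Splitting on '\n' alone leaves the '\r' attached to every word but the last
string[] byLf = text.Split('\n');

// Splitting on both characters and dropping empty entries yields clean words
string[] byBoth = text.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(byLf[0].Length);      // 4, because the string is "one\r"
Console.WriteLine(byLf[0] == "one");    // False
Console.WriteLine(byLf[2] == "three");  // True
Console.WriteLine(byBoth[0] == "one");  // True
```

Because "`\r`" is invisible when printed, debug output of `byLf[0]` still looks like "one", which is why the array contents appeared correct.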
You nailed it - that was the problem!
Working great now - many thanks :)
While we're on the subject, I noticed while writing the dictionary.txt file in Notepad++ (with 'End of line' set to visible) that on some occasions a CRLF is added and at other times just an LF. Would you know why this might be the case?
Thanks in advance
Good stuff I'm glad it's working.
I don't know why Notepad++ would be creating mixed line endings, but you can convert / configure them with Edit - EOL Conversion and select from Windows / Mac / Unix.
The split with RemoveEmptyEntries above was intended only to demonstrate the problem literally. I think this would be a better approach to use (it also cleans blank lines and trailing spaces):
MytextAsset.text.Split(new char[] { '\r', '\n' })
.Select(s => s.Trim())
.Where(s => s.Length > 0)
.ToArray();
//.ToDictionary(s => s, s => true);
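As a rough check of the trimming pipeline, the sketch below runs it over a hypothetical string with mixed line endings, a blank line and a trailing space. Feeding the result into a `HashSet` is one possible choice here, not something the thread prescribes; the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical file contents: CRLF and LF mixed, one blank line, a space after "two"
string text = "one\r\ntwo \n\nthree\r\n";

// Split on either character, trim each piece, and discard the empty pieces
var wordSet = new HashSet<string>(
    text.Split(new char[] { '\r', '\n' })
        .Select(s => s.Trim())
        .Where(s => s.Length > 0));

Console.WriteLine(wordSet.Count);            // 3
Console.WriteLine(wordSet.Contains("two"));  // True
```

One difference worth knowing: a `HashSet` silently ignores a word that appears twice in the file, whereas `ToDictionary` throws an `ArgumentException` on a duplicate key.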
That's great - I'll have a fiddle with Notepad++ and see what can be done there.
I also implemented this using a HashSet. However (and correct me if I'm wrong), most sources say they're only really useful if their advanced functionality is needed (Overlaps, SymmetricExceptWith, etc.). In most cases, Dictionaries are supposed to perform better/faster.
Thanks for the thorough answer.