PDA

View Full Version : Dictionary List and Data Structures



theRemix
July 10th, 2009, 06:13 PM
In your experience or opinion, what is the fastest way to store a list of every word in the dictionary, and to check if a word exists in it?

specifically, i'm using haXe
which means i have Object, Hash, IntHash, Array, Typed Array, List

and am making a text twist type game where user has a limited amount of letters to use to create a word that matches one in the dictionary

the possibility to pre-fetch words based on the available letters is there. letters are random, quantity will be 5 - 20

bluemagica
July 10th, 2009, 06:59 PM
I would suggest using a Array, as flash is best optimised for it, also it will be best since for a word game you will be dealing with thousands of words!

First store the words in a array based on the length...so if player is working with 5 letters, get all 5 letter words! and then sort the list based on the word's beginning character! For this, its better to use merge sort!

theRemix
July 10th, 2009, 07:10 PM
thanks for your input, although i don't agree a single dimensional Array is the best tool in this case.

storing the words by length is a good idea, since they have up to 10 letters for example, i could search based on the current number of letters typed in, so "STICK" is typed, but not entered, i'd already know to search within the words_array[3]

assuming i words_array[0] contains 2 letter words (yes i'll allow "to, me, at" this time)
words_array[1] contains 3 letter words
words_array[2] contains 4 letter words
words_array[3] contains 5 letter words
...

then just iterate over an Array or linked List in words_array[3] to find "STICK" ?

theRemix
July 10th, 2009, 07:24 PM
i'm looking for a way of storing, then searching the words faster than using a Trie for example.

http://en.wikipedia.org/wiki/Trie

bluemagica's idea is worth an attempt to compare against

any other ideas?

TOdorus
July 10th, 2009, 08:03 PM
Ah so it's called a trie. I was thinking about suggesting something like that

I don't you can find anything faster. Especially if you're using a large list of words. It may be slow in the beginning, but it can eliminate 25 branches per iteration. When the list of words is small, there probably won't be that much nodes with 26 branches, so another way can be faster. I think the letters per word idea gets too slow pretty quickly compared to a trie.

I would try to make it easy on myself if I were you and code a algorithm that can make the trie for you. You just feed it a array of words and it creates the data structure. This rules out a lot of mistakes and some nasty repetitive work. Exactly what computers are here for.

flyingmonkey456
July 10th, 2009, 08:44 PM
there's probably some kind of online database of every word somewhere, you could probably use that if you can find one.

bluemagica
July 10th, 2009, 09:44 PM
sadly there isn't such a thing, but there are many "worlists" available for download with thousands of words in them!

The reason I suggested the array is because, its built right into flash, and a 1d array is very fast to itterate through! As for your 2d array plan, it sounds ok, the main reason I overlooked it cause, I normally make my word games using mysql/php, where I use php to fetch me a custom wordlist as per the level requirements!

theRemix
July 10th, 2009, 10:53 PM
sadly there isn't such a thing, but there are many "worlists" available for download with thousands of words in them!

The reason I suggested the array is because, its built right into flash, and a 1d array is very fast to itterate through! As for your 2d array plan, it sounds ok, the main reason I overlooked it cause, I normally make my word games using mysql/php, where I use php to fetch me a custom wordlist as per the level requirements!

actually, i mentioned that i use haXe because it also provides a datatype that you can iterate through that is faster than an array.
Linked List
http://haxe.org/api/haxe/fastlist

which compiles to flash nicely