My natural language processor, WidapMind, is finally approaching the point where it will be able to parse sentences of arbitrary complexity.
My trick is to build a data structure out of the sentence, made of Nodes and Ideas. An Idea has a String with one or more words in it and may contain a Thing (the class I store all permanent data in). A Node is what holds the Ideas together. The structure is similar to a linked list, except that, as you can see, it can branch and recombine.
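To make the shape of that structure concrete, here is a minimal sketch in Python. It is not WidapMind's actual code: it assumes that Nodes sit at the boundaries between words and that each Idea runs from one Node to another, and the names `link`, `paths`, `outgoing`, `start` and `end` are made up for illustration. The `thing` field is just a placeholder for the Thing class.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(eq=False)
class Node:
    """A junction that Ideas start from (assumed: one Node per word boundary)."""
    outgoing: List["Idea"] = field(default_factory=list)


@dataclass(eq=False)
class Idea:
    """One or more words, running from one Node to another."""
    text: str
    start: Node
    end: Node
    thing: Optional[Any] = None  # placeholder for the Thing class


def link(text: str, start: Node, end: Node, thing: Any = None) -> Idea:
    """Create an Idea and register it on the Node it starts from."""
    idea = Idea(text, start, end, thing)
    start.outgoing.append(idea)
    return idea


def paths(node: Node, goal: Node) -> List[List[str]]:
    """List every sequence of Idea texts that leads from node to goal."""
    if node is goal:
        return [[]]
    return [[idea.text] + rest
            for idea in node.outgoing
            for rest in paths(idea.end, goal)]


# "the dog": two one-word Ideas, plus a two-word Idea that branches off
# at n0 and recombines at n2.
n0, n1, n2 = Node(), Node(), Node()
link("the", n0, n1)
link("dog", n1, n2)
link("the dog", n0, n2)

print(paths(n0, n2))  # [['the', 'dog'], ['the dog']]
```

With only one Idea per Node this would be an ordinary linked list; letting a Node hold several outgoing Ideas is what allows the branching and recombining.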
The parsing process has three phases:
- Make a data structure with no branches, where each word in the input sentence goes into its own Idea
- Split each Idea into as many duplicates as necessary so that every possible meaning of the word is covered (this is based on all the parts of speech that show up when I look the word up in my dictionary)
- Merge adjacent Ideas to create more complex ones, hopefully ending up with a single Idea that spans all the way from start to end
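The three phases can be sketched roughly as follows. This is a toy illustration under heavy assumptions, not the real implementation: Nodes are reduced to integer positions between words, and the `DICTIONARY` and `RULES` tables, the part-of-speech labels, and all function names are hypothetical stand-ins for the actual dictionary lookup and merge functions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Idea:
    text: str
    pos: str    # part-of-speech label, "?" until the dictionary lookup
    start: int  # position of the Node where this Idea begins
    end: int    # position of the Node where this Idea ends


# Hypothetical toy dictionary: word -> every part of speech it can have.
DICTIONARY = {"the": ["DET"], "dogs": ["NOUN"], "bark": ["NOUN", "VERB"]}

# Hypothetical merge rules: (left label, right label) -> merged label.
RULES = {("DET", "NOUN"): "NP", ("NP", "VERB"): "S"}


def phase1(sentence: str) -> List[Idea]:
    """Phase 1: an unbranched chain with one Idea per word."""
    return [Idea(word, "?", i, i + 1)
            for i, word in enumerate(sentence.lower().split())]


def phase2(ideas: List[Idea]) -> List[Idea]:
    """Phase 2: duplicate each Idea once per possible part of speech."""
    return [Idea(idea.text, pos, idea.start, idea.end)
            for idea in ideas
            for pos in DICTIONARY.get(idea.text, ["?"])]


def phase3(ideas: List[Idea]) -> List[Idea]:
    """Phase 3: keep merging adjacent Ideas until no rule adds anything new."""
    ideas = list(ideas)
    seen = set(ideas)
    changed = True
    while changed:
        changed = False
        for left in list(ideas):
            for right in list(ideas):
                if left.end != right.start:
                    continue  # the two Ideas do not share a Node
                pos = RULES.get((left.pos, right.pos))
                if pos is None:
                    continue  # no rule combines these two labels
                merged = Idea(left.text + " " + right.text, pos,
                              left.start, right.end)
                if merged not in seen:
                    seen.add(merged)
                    ideas.append(merged)
                    changed = True
    return ideas


def parse(sentence: str) -> List[Idea]:
    """Run all three phases and return the Ideas that span the whole sentence."""
    words = phase1(sentence)
    lattice = phase3(phase2(words))
    return [idea for idea in lattice
            if idea.start == 0 and idea.end == len(words)]


print(parse("The dogs bark"))
# [Idea(text='the dogs bark', pos='S', start=0, end=3)]
```

Note that the noun reading of "bark" stays in the structure but never merges into anything, so only the verb reading ends up in the Idea that covers the full sentence.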
Writing enough functions to completely process most sentences will be a big job, and I'm sure there will be a few hiccups and some restructuring along the way. There is no single good way to process English, but with the framework I've written, I think I can cover most sentence structures with a feasible amount of code.