Fractal Patterns, L-Systems and Semantics
"Certainly the rules which govern fractal patterns, L-Systems and syntax and those which govern semantics and relevance patterns are related."
Dr. E. Garcia
Mi Islita.com
Last Update: 06/13/05
Article 6 of the series The Fractal Nature of Semantics
Topics
Distributions and Patterns
Lindenmayer Systems
Fractal Patterns and L-Systems
Sentence Production Rules
Conceptual Semantics Theory
Local Context Analysis (LCA)
References
Distributions and Patterns
In The Keyword Density of Non-Sense I explained why users and search engines may not agree on the way they see, read or interpret documents (1). Essentially,
- users assess document relevancy by visually interpreting the information displayed, not necessarily in a linear fashion.
- search engines assess document relevancy by interpreting the information coded, tag by tag and in a linear fashion.
This leads to a perception mismatch that makes more than one metric irrelevant. One such metric is the keyword density (KD) concept used by some search engine marketers and optimizers (SEMs/SEOs). The gist of that article was intentionally given between the lines, as follows:
- Documents are probability distributions over topics.
- Topics are probability distributions over words.
- Words are probability distributions over characters.
- KD is disconnected from the nature of these distributions.
- Stay away from marketers that promote keyword density tools.
These tools fall short because the distributions of words, topics and documents are not necessarily linear, static or governed by chance. These distributions can be nonlinear, dynamic and conforming to self-similar fractal patterns (2).
However,
- where do these patterns come from?
- which word groups convey more precise information?
- how could these word groups and patterns affect relevance scores and retrieval?
- why are we inclined toward certain types of sentence constructions and not others?
- why do some word combinations sound, feel and flow more natural, flexible and appealing than others?
- what is the connection between fractals, semantics, L-Systems and IR?
I will try to address these and similar questions in this article. The material is organized as follows. First, an introduction to L-Systems is presented. This is followed by a discussion of sentence production rules and fractals. Next I discuss one of the most fascinating topics that may shape the next generation of smart search engines: Conceptual Semantics. That section and the final one discuss recent advances in the area of fractals applied to IR and introduce the reader to Local Context Analysis (LCA).
Lindenmayer Systems
The Hungarian professor Aristid Lindenmayer (University of Utrecht; 1925-1989) developed an algorithmic treatment for describing the beauty of plants (3). Known as Lindenmayer Systems or L-Systems, this treatment is familiar to psycho-linguists and to those conducting research in mathematical semantics and cognition. Before proceeding any further, a discussion of basic L-Systems nomenclature is in order.
In L-Systems Theory (4), a grammar is a set G = {V, S, w, P} where
- V is a set (alphabet) of symbols containing elements (variables) that can be replaced
- S is a set of symbols containing fixed elements (constants)
- w is a string (start, axiom or initiator) of symbols from V defining the initial state of the system
- P is a set of re-write production rules defining the way variables are replaced with combinations of S and V
A context-free grammar is an L-System in which the left hand side of each production rule is formed by a single non-terminal symbol, so a symbol is rewritten regardless of its neighbours. The following are context-free grammar rules:
- (A → B); which means replace A with B
- (A → B); (B → AB); which means replace every A with B and every B with AB, both rules being applied in parallel at each iteration
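The two rules above can be sketched in a few lines of Python. This is only an illustration: it assumes the usual L-System convention that all symbols are rewritten in parallel at each iteration, and the function name `rewrite` and the choice of A as the axiom are mine, not part of the nomenclature.

```python
def rewrite(axiom: str, rules: dict[str, str], iterations: int) -> list[str]:
    """Return the axiom followed by the string obtained after each iteration."""
    history = [axiom]
    current = axiom
    for _ in range(iterations):
        # All symbols are replaced in the same pass; a symbol without a
        # rule is treated as a constant and copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
        history.append(current)
    return history


rules = {"A": "B", "B": "AB"}  # (A → B); (B → AB)
for step, string in enumerate(rewrite("A", rules, 5)):
    print(step, string, len(string))
```

With the axiom A, the string lengths grow as 1, 1, 2, 3, 5, 8, a pattern that comes up again later in this article.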
A context-sensitive grammar is an L-System in which the production rules depend on the neighbours of the symbol being replaced. The following are context-sensitive rules:
- (A <B> A → A); which means that if B is surrounded by two A's replace B with A
- (A <B> B → B); which means that if B is preceded by A and followed by B leave B alone
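As a rough sketch, the two context-sensitive rules can be encoded as a lookup keyed by (left neighbour, symbol, right neighbour). The function name `rewrite_in_context`, the test string and the decision to leave unmatched symbols unchanged are assumptions made for illustration.

```python
def rewrite_in_context(string: str, rules: dict[tuple[str, str, str], str]) -> str:
    """Apply (left, symbol, right) -> replacement rules to every position."""
    result = []
    for i, symbol in enumerate(string):
        left = string[i - 1] if i > 0 else ""
        right = string[i + 1] if i + 1 < len(string) else ""
        # Neighbours are read from the original string, so every
        # replacement in this pass sees the same context.
        result.append(rules.get((left, symbol, right), symbol))
    return "".join(result)


rules = {
    ("A", "B", "A"): "A",  # (A <B> A → A)
    ("A", "B", "B"): "B",  # (A <B> B → B)
}
print(rewrite_in_context("ABABB", rules))
```

In the string ABABB, only the first B sits between two A's, so it is the only symbol that changes.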
Deterministic L-Systems are systems that exhibit exactly one production rule for each symbol. If the system is deterministic and context-free, it is called a D0L-system.
Stochastic L-Systems are systems that exhibit several production rules for a symbol, and each is selected based on probability values during each iteration. For instance, the rules
- (A → B) with p = 0.6 or (A → AB) with p = 0.4
mean that each time A is obtained during the iterative process, a random choice must be made as to whether to replace A with B or with AB.
Those are pretty much the basics. At this point you may ask: "Fine, but what is the connection between L-Systems and fractals?"
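A stochastic rule set like the one above might be sketched as follows. The function name `stochastic_rewrite`, the `seed` parameter and the axiom AAAA are illustrative assumptions; the weighted choice relies on the standard library call `random.Random.choices`.

```python
import random


def stochastic_rewrite(axiom, rules, iterations, seed=None):
    """Rewrite each symbol with a successor drawn according to its probabilities."""
    rng = random.Random(seed)  # a seed makes a run reproducible
    current = axiom
    for _ in range(iterations):
        parts = []
        for symbol in current:
            if symbol in rules:
                successors, weights = zip(*rules[symbol])
                parts.append(rng.choices(successors, weights=weights, k=1)[0])
            else:
                parts.append(symbol)  # no rule: copy the symbol unchanged
        current = "".join(parts)
    return current


# (A → B) with p = 0.6 or (A → AB) with p = 0.4
rules = {"A": [("B", 0.6), ("AB", 0.4)]}
print(stochastic_rewrite("AAAA", rules, 1, seed=0))
```

Each A yields exactly one B whichever rule is chosen, so after one iteration the output always contains four B's; only the number of surviving A's varies from run to run.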
Fractal Patterns and L-Systems
It turns out that associating a meaning to each production rule often leads to the generation of fractal patterns.
For instance, if we start with the F axiom then after two iterations (i = 2) the (F → F + F - - F + F) rule gives
F
F + F - - F + F
F + F - - F + F + F + F - - F + F - - F + F - - F + F + F + F - - F + F
However, if F means "Draw a line forward", + means "Make a 60 degree left turn", and - means "Make a 60 degree right turn", then the famous von Koch Curve is generated by recursively applying the (F → F + F - - F + F) re-write rule to a segment (F) each time F is found. This is illustrated in Figure 1.

Figure 1. Koch Curve L-System.
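The interpretation just described can be sketched with a minimal "turtle" that turns the rewritten string into coordinates. The function names, the unit step length and the omission of the spaces between symbols are my own choices; only the rule and the meaning of F, + and - come from the text.

```python
import math


def expand(axiom, rule, iterations):
    """Replace every F with the rule body, once per iteration."""
    current = axiom
    for _ in range(iterations):
        current = "".join(rule if symbol == "F" else symbol for symbol in current)
    return current


def turtle_points(commands, step=1.0, angle_degrees=60.0):
    """Interpret F as a forward move, + as a left turn and - as a right turn."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for command in commands:
        if command == "F":
            x += step * math.cos(math.radians(heading))
            y += step * math.sin(math.radians(heading))
            points.append((x, y))
        elif command == "+":
            heading += angle_degrees
        elif command == "-":
            heading -= angle_degrees
    return points


curve = expand("F", "F+F--F+F", 2)
points = turtle_points(curve)
print(curve)
print(len(points) - 1, "segments")
```

After two iterations the string contains 16 F symbols, so the turtle draws 16 segments; feeding `points` to any plotting tool should reproduce the shape in Figure 1.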
Note that during each iteration an F segment is replaced by a reduced copy of the whole thing. This is the signature of Mandelbrot's Fractal Geometry (2). Figure 2 lists more complex re-write rules. These rules are known to generate branching patterns that mimic the morphology of plants and living organisms. Two examples are provided.

Figure 2. L-System Production Rules.
As shown in the figure, bracket operators are used to scale back and repeat locally the re-write rules. "[" is a "push" command that stores the current angle and position on a stack. "]" is a "pop" command that returns to the location of the last push. The end result of this contextual recursion is a self-similar branching structure: a ramified fractal pattern. Computer programs that explain the design of L-Systems have been published elsewhere (5 - 8).
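The push and pop commands can be sketched with an ordinary list used as a stack. The rule F → F[+F]F[-F]F and the 25 degree angle below are textbook-style values assumed for illustration; they are not necessarily the rules listed in Figure 2, and the function names are mine.

```python
import math


def expand(axiom, rules, iterations):
    """Apply the re-write rules to every symbol, once per iteration."""
    current = axiom
    for _ in range(iterations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def branch_segments(commands, step=1.0, angle_degrees=25.0):
    """Return the line segments drawn by a turtle that supports [ and ]."""
    x, y, heading = 0.0, 0.0, 90.0  # start at the origin, pointing up
    stack = []
    segments = []
    for command in commands:
        if command == "F":
            new_x = x + step * math.cos(math.radians(heading))
            new_y = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (new_x, new_y)))
            x, y = new_x, new_y
        elif command == "+":
            heading += angle_degrees
        elif command == "-":
            heading -= angle_degrees
        elif command == "[":
            stack.append((x, y, heading))  # push: remember position and angle
        elif command == "]":
            x, y, heading = stack.pop()    # pop: return to the last push
    return segments


plant = expand("F", {"F": "F[+F]F[-F]F"}, 2)
segments = branch_segments(plant)
print(plant)
print(len(segments), "segments")
```

Because every "]" restores the state saved by its matching "[", each bracketed group draws a side branch and then resumes the trunk where it left off.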
But how do L-Systems relate to word patterns and sentence production?
Sentence Production Rules
According to Ben Goertzel (see Fractals and Sentence Production), the beginning of early childhood grammar is the two-word sentence in which the iterative pattern involving nouns (N) and verbs (V) is driven by the (V → V N) re-write rule (9, 10). Figure 3 shows the tree-like patterns obtained when this production rule is applied to the NV and VN axioms. I must point out that here we are not associating a meaning to each re-write rule. Still, the overall process itself resembles a decision tree.

Figure 3. Tree-like sentence production patterns for the NV and VN axioms.
Several similarities and differences are observed after four iterations (i = 4):
- Both trees
  - are related by an element of self-similarity.
  - have identical statistics at each iteration level.
- The VN axiom produces
  - noun groups but not the NVN or NV combinations.
  - a tree-like pattern that already exists as a branch of the NV tree.
- The NV axiom produces three distinct combinations:
  - noun groups; i.e., NN, NNN, NNNN
  - the NVN combination
  - the NV combination
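These observations can be checked with a short sketch that applies (V → V N) to both axioms. It treats each iteration level as a single string, which is a simplifying assumption on my part (Figure 3 draws the same process as a tree); the function name `produce` is also illustrative.

```python
def produce(axiom, iterations, rules=None):
    """Return the string at every iteration level, starting with the axiom."""
    rules = rules or {"V": "VN"}  # the (V → V N) re-write rule
    levels = [axiom]
    for _ in range(iterations):
        levels.append("".join(rules.get(s, s) for s in levels[-1]))
    return levels


for axiom in ("NV", "VN"):
    for level, sentence in enumerate(produce(axiom, 4)):
        print(axiom, level, sentence,
              "N =", sentence.count("N"), "V =", sentence.count("V"))
```

Under this reading both axioms give the same noun and verb counts at every level, the VN strings never contain NV or NVN, and each VN string reappears as the tail of the NV string one level further down.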
This makes NV a more flexible and natural combination for constructing sentences. Not surprisingly, the NV sequence is present in many basic constructions -i.e., pet commands and common expressions- as in
- NV (pet commands): Mugsy, sit. Mugsy, give. Mugsy, go-nay-nay.
- NVN (common expressions): Mommy loves baby. God bless America. Daddy is home!
Goertzel is completely right when asserting that NV is "a more natural combination" for sentence construction (1, 9). The role that (V → V N) and similar rules play during the beginning of early childhood grammar should not be underestimated. Indeed, these production rules might lead to a better understanding of how the human mind processes word meanings and how machines should score semantics. As Professor Steven Pinker (Harvard University) states in Lexical and Conceptual Semantics (11),
"How are words represented in the mind and woven into sentences? How do children learn how to use words? Currently there is a tremendous resurgence of interest in lexical semantics."
"In computational linguistics, new techniques are being applied to analyze words in texts, and machine-readable dictionaries are being used to build lexicons for natural language systems. These technologies provide large amounts of data and powerful data-analysis techniques to theoretical linguists, who can repay the favor to computer science by describing how one efficient lexical system, the human mind, represents word meanings...."
"Infants are not born knowing a language, but they do have some understanding of the conceptual world that their parents describe in their speech. Since concepts are intimately tied to word meanings, knowledge of semantics might help children break into the rest of the language system."
Pinker's perception is shared by others. For instance, Professors Dalit Levy and Tami Lapidot (Israel Institute of Technology) have studied extensively fractals and recursion in connection with shared terminology, private syntax and learning activities of 11th grade students. In Shared Terminology, Private Syntax: The case of recursive descriptions (12) they found that fractal recursion plays an important role during the learning and processing of concepts.
So, yes, there is a connection between children's learning activities, fractals and sentence construction. But what about syntax and semantics? Is there any connection between the two? Conceptual Semantics Theory answers this question.
Conceptual Semantics Theory
Conceptual semantics is a theory developed by Professor Ray Jackendoff of Brandeis University (13). Jackendoff is a world-renowned linguist and the recipient of the 2003 Jean-Nicod Prize (14). According to An analysis of Jackendoff's Lexical Conceptual Semantics (13), syntax can be mapped onto semantics and vice versa. If the structures which govern sentences and those which govern the construction of concepts are related, then the patterns which govern syntax and semantics must be related. Considering that
- semantics deals with sentence meaning based on lexical constituents and principles of logical inference
- pragmatics accounts for the context of the sentences and background information accumulated
it is then clear why syntax and semantics are not mutually exclusive. As Roberto de Almeida suggests in his lecture notes on Verb Conceptual Representation (15): sentences must be viewed as contextual events that can be analyzed into refined conceptual constituents and within other sentences. To illustrate, consider the following figure:

Figure 4. Sentence-Event Representation.
As an event, this sentence consists of a thing ("Bruce") and a path. The path consists of an action ("walked"), redundancy ("toward" and "the") and a thing ("school").
Note the natural use of NV and N. A search system driven only by regexp matching should count all terms. However, a semantic-based search engine filtering out redundancy and stopwords should be able to recognize not only NV and N, but a hidden or latent sequence: NVN. The picture gets more complex if sentences and terms are misplaced "all over the place" in a Web document (by means of sloppy design or by means of overlooking document linearization). This gives rise to the phenomenon I have described as "burning the trees" (1).
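The latent NVN sequence mentioned above can be illustrated with a toy sketch. The sentence wording ("Bruce walked toward the school"), the hand-assigned tags and the label R for redundancy are all assumptions made for illustration; a real system would need a part-of-speech tagger and a stopword list.

```python
# Hypothetical hand-tagged tokens: N = thing, V = action, R = redundancy.
SENTENCE = [
    ("Bruce", "N"),
    ("walked", "V"),
    ("toward", "R"),
    ("the", "R"),
    ("school", "N"),
]


def surface_pattern(tagged):
    """Pattern seen when every term is counted."""
    return "".join(tag for _, tag in tagged)


def latent_pattern(tagged, redundant="R"):
    """Pattern left after redundancy and stopwords are filtered out."""
    return "".join(tag for _, tag in tagged if tag != redundant)


print(surface_pattern(SENTENCE))  # NVRRN
print(latent_pattern(SENTENCE))   # NVN
```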
How does contextual semantics relate to fractals or to the process of retrieval? In my view, relevance scores assigned to documents (as distributions over topics) or to topics (as distributions over words) must account for units of meaning or utterance meanings.
A sentence or group of sentences can be viewed as units or segments expressing specific thoughts and events. These specific thoughts and events, when co-occurring with similar thoughts or associated with other segments, give rise to branching paths which allow words to evolve into topics and topics to evolve into documents. The connection is evident. It is a reminder of something found in Lindenmayer Theory: the recursive phenomena that at the local level give rise to branching structures (i.e., the "push" and "pop" commands).
The brilliant work of Simon Levy (Washington & Lee University) puts things in the right perspective. In Neuro-Fractal Composition of Meaning: Toward a Collage Theorem for Language (16), Levy discusses the well known Collage Theorem and Iterated Function Systems. He explains the connection between fractals, language, semantics and term vector theory as follows:
"Self-similarity in language appears in the guise of stories within stories, or sentences within sentences ("I know what I know"), and has been represented in the form of recursive grammar rules by Chomsky and his followers. Having observed this common property of language and images, we present a formal mathematical model for putting together words and phrases, based on the iterated function system (IFS) method used in fractal image compression. Building (literally) on vector-space representations of word meaning from contemporary cognitive science research, we show how the meaning of phrases and sentences can likewise be represented as points in a vector space of arbitrary dimension."
Not only that: in Dynamical Parsing to Fractal Representations (17), Levy presents a connectionist parsing model that uses re-write rules and is able to parse text as well as arithmetic expressions. Levy's research gives us a glimpse of things to come. It seems to me that linguistics, semantics, fractals and information retrieval are coming together to establish the basis for the next generation of semantic search engines.
If this is not enough to convince you that fractals are relevant to the nature of semantics, sentence production rules and relevance scores, then it may be time to "pay a visit" to Local Context Analysis.
Local Context Analysis (LCA)
Local Context Analysis (LCA) (18) was developed by Jinxi Xu and Professor Bruce Croft (University of Massachusetts). The researchers found that LCA is superior to both Local Feedback (a local technique) and PhraseFinder (a global technique). I discuss these techniques in the On-Topic Analysis paper (19).
In LCA, expansion concepts are selected from the top retrieved documents for a query. Baeza-Yates and Ribeiro-Neto call these document concepts (20). Concepts are ranked by their co-occurrence with the query terms in the top ranked documents/passages. Research suggests that these are more informative than other types of data structures and are more flexible for expanding queries and for improving term discovery and retrieval (21).
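The idea of ranking concepts by co-occurrence can be sketched as follows. This is a deliberately simplified stand-in, not the scoring function published by Xu and Croft, which uses a more elaborate weighting; the toy passages, the function name `rank_concepts` and the plain counting score are all assumptions, and each passage is assumed to be already reduced to candidate concepts.

```python
from collections import Counter


def rank_concepts(passages, query_terms, top_k=3):
    """Rank candidate concepts by how often they co-occur with query terms."""
    query = set(query_terms)
    scores = Counter()
    for passage in passages:
        words = set(passage)
        matched = len(words & query)  # query terms present in this passage
        if matched == 0:
            continue
        for concept in words - query:
            scores[concept] += matched
    # Sort by descending score, then alphabetically so ties are stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Hypothetical top-ranked passages for the query below.
passages = [
    ["fractal", "geometry", "coastline", "dimension"],
    ["fractal", "dimension", "measurement"],
    ["semantics", "fractal", "language", "dimension"],
    ["cooking", "recipe", "pasta"],
]
query = ["fractal", "semantics"]
print(rank_concepts(passages, query))
```

In this toy data "dimension" appears alongside the query terms in three passages and therefore ranks first, while the cooking passage contributes nothing because it shares no term with the query.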
The top ranked documents are used because these tend to form clusters about a given topic. This is not that far from the cluster hypothesis described in van Rijsbergen's Information Retrieval (22). Thus, the assertion that documents are probability distributions over topics and that topics are probability distributions over words makes sense.
What exactly are document concepts? Now here is the interesting part: document concepts are single nouns and noun groups; i.e., pairs (NN), triplets (NNN), tuples (NNNN...N), etc. Aren't these the very same combinations shown in Figure 3 and obtained when the (V → V N) production rule was applied to the NV and VN axioms? If we visualize query expansion as a sentence production process then everything starts to make sense.
So, on one side we have an L-System that gives rise to natural combinations associated with the beginning of early childhood grammar, and on the other the very same combinations are found to improve relevance scores and retrieval. If you think that this is a mere coincidence, think twice. Not all re-write rules produce noun groups, as can be seen from Figure 5.

Figure 5. Tree-like sentence production patterns for the NV axiom.
Note that after four iterations (i = 4), two distinct tree-like production patterns emerge when the (V → VN), (N → V) and (N → NV), (V → N) re-write rules are applied to the NV axiom. When comparing both production patterns, several similarities and differences can be observed:
1. Similarities
- At a given iteration level both production patterns have the same length.
- At a given iteration level both production patterns show the same two counts for nouns and verbs.
- The length, noun and verb counts follow a famous sequence, the Fibonacci series: 1, 1, 2, 3, 5, 8, 13, 21, 34 ... where each successive number is the sum of the two preceding numbers.
2. Differences
- The noun and verb counts are inverted between the two trees.
- The (V → VN), (N → V) production rule does not produce noun groups, but the (N → NV), (V → N) does.
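The similarities and differences listed above can be verified with a short sketch that applies both rule sets to the NV axiom. The names `grow`, `RULES_A` and `RULES_B` are illustrative, and each iteration level is again treated as a single string rather than as the tree drawn in Figure 5.

```python
def grow(axiom, rules, iterations):
    """Return the string at every iteration level, starting with the axiom."""
    levels = [axiom]
    for _ in range(iterations):
        levels.append("".join(rules[symbol] for symbol in levels[-1]))
    return levels


RULES_A = {"V": "VN", "N": "V"}  # (V → VN), (N → V)
RULES_B = {"N": "NV", "V": "N"}  # (N → NV), (V → N)

for a, b in zip(grow("NV", RULES_A, 4), grow("NV", RULES_B, 4)):
    print(f"{a:<14} len={len(a):<2} N={a.count('N')} V={a.count('V')} | "
          f"{b:<14} len={len(b):<2} N={b.count('N')} V={b.count('V')}")
```

The lengths come out as 2, 3, 5, 8, 13, the noun and verb counts are swapped between the two columns, and the substring NN shows up only in the strings generated by the second rule set.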
Certainly the rules which govern fractal patterns, L-Systems and syntax and those which govern semantics and relevance patterns are related. As Dr Kevin Jones (Kingston University) put it in the winning essay of The THES/OUP Science Writing Prize for 1999 (Self-similar syncopations: Fibonacci, L-systems, limericks and ragtime) (23):
"The fact that such a remarkable pattern underpins the structure of the limerick suggests that this helps to account for its "naturally" appealing qualities - it is almost as if the brain is hard-wired to match or harmonise with these patterns in nature, and that the mind's aesthetic responses will naturally be pulled towards these attractive patterns."
"Perceiving the underlying metre of the limerick is not just a simple linear experience. The generational grammar and symmetrical relationships which inhabit the structure effectively map out a sort of hierarchy of simultaneity. As the first few words are revealed - by listening to the poem being read out loud, or else by "virtually" hearing the patterns when silently reading the text - the self-similar nature of the pattern provides the mind with enough clues to intuitively predict and anticipate what is coming, and feel a surge of satisfaction as the structure inevitably unfolds - reinforced by the rhyming patterns at the end of the lines."
References
1. The Keyword Density of Non-Sense; E. Garcia.
2. The Fractal Geometry of Nature; Benoit B. Mandelbrot, Chapter 38, W. H. Freeman (1983).
3. The Algorithmic Beauty of Plants; P. Prusinkiewicz and A. Lindenmayer, Springer-Verlag, New York (1990).
4. L-System: Information from Answers.com.
5. A Garden of Fractals; Henry F. Smith in Fractals in the Fundamental and Applied Sciences; H.-O. Peitgen, J. M. Henriques, L. F. Penedo, Editors; North-Holland (1991).
6. Dynamical Systems and Fractals; Karl-Heinz Becker and Michael Dorfler, Chapter 8, Cambridge (1988).
7. A Unified Approach to Fractal Curves and Plants; Dietmar Saupe in The Science of Fractal Images; Heinz-Otto Peitgen and Dietmar Saupe, Editors, Appendix C, Springer-Verlag (1988).
8. Fractals for the Classroom; Heinz-Otto Peitgen, Hartmut Jurgens and Dietmar Saupe, Editors, Vol 2, Chapter 1, Springer-Verlag (1992).
9. From Complexity to Creativity: Computational Models of Evolutionary, Autopoietic and Cognitive Dynamics; Ben Goertzel, Plenum Press (1997).
10. Fractals and Sentence Production; Ben Goertzel, Ref 9, Chapter 9, Plenum Press (1997).
11. Lexical and Conceptual Semantics; Steven Pinker.
12. Shared Terminology, Private Syntax: The case of recursive descriptions; Dalit Levy and Tami Lapidot.
13. An analysis of Jackendoff's Lexical Conceptual Semantics; Insecurities.org.
14. Institut Nicod; Jean-Nicod Prize & Lectures 2003.
15. Verb Conceptual Representation; Roberto de Almeida, PSYC353 Lecture Notes.
16. Neuro-Fractal Composition of Meaning: Toward a Collage Theorem for Language; Simon Levy.
17. Dynamical Parsing to Fractal Representations; Simon D. Levy.
18. Improving the Effectiveness of Information Retrieval with Local Context Analysis; Jinxi Xu and W. Bruce Croft.
19. On-Topic Analysis: Online Discovery of On-Topic Terms; E. Garcia.
20. Modern Information Retrieval; Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Chapter 5, ACM, Addison Wesley (1999).
21. An Association Thesaurus for Information Retrieval; Y. Jing and W. Bruce Croft, Proceedings of RIAO 94, pages 140-160 (1994).
22. Information Retrieval; C. J. van Rijsbergen, Butterworths, 2nd Edition (1979).
23. Self-similar syncopations: Fibonacci, L-systems, limericks and ragtime; Kevin Jones.

