
Mi Islita

Grammar, Semantics, Knowledge and Fractals

"Like semantics, indeed not to be distinguished from semantics, grammatical organization as a fractal phenomenon is a shifting yet stable structuring device that generates word constructions but is also the essence of the meaning. That the neurological arrangement of the brain is also fractal in nature is no coincidence. In this regard it is impossible to separate the dancer from the dance: to see grammar as an abstraction removed from meaning." - Charles Henry, Rice University.

Dr. E. Garcia
Mi Islita.com
Last Update: 05/01/05

Article 4 of the series The Fractal Nature of Semantics

Topics

Human Writing, Zipf's Law, Relevancy and Fractals

Grammar, Language and Fractals

Text Summarization, Retrieval and Fractals

Grammar, Semantics and Fractals

Fractals and Knowledge Management Technology

References

Human Writing, Zipf's Law, Relevancy and Fractals

In 1994, Armin Bunde and Shlomo Havlin wrote in Fractals in Science (1) about the relationship between human writing and fractals.

"Long-range correlations have been found recently in human writings. A novel, a piece of music or a computer program can be regarded as a one-dimensional string of symbols. These strings can be mapped to a one-dimensional random walk model similar to the DNA walk..."

"An interesting fractal feature of languages was found in 1949 by Zipf. He observed that the frequency of words as a function of the word order decays as a power-law (with power close to -1) for more than four orders of magnitude. A theory for this empirical finding, based on assumptions of coding words in the brain, was given by Mandelbrot."
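To make Zipf's observation concrete, here is a minimal Python sketch. It is not taken from Bunde and Havlin or from Zipf; it simply counts word frequencies, ranks them, and estimates the power-law exponent with a least-squares fit of log(frequency) against log(rank). The function names `rank_frequencies` and `zipf_exponent` and the synthetic data are assumptions made for illustration, and a meaningful test on real text needs a long document.

```python
import math
import re
from collections import Counter


def rank_frequencies(text):
    """Return word counts sorted from most to least frequent."""
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_exponent(frequencies):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope close to -1 is what Zipf's law predicts.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two ranked frequencies")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic check: frequencies built to follow f(r) = C / r exactly.
ideal = [1000.0 / rank for rank in range(1, 201)]
print(round(zipf_exponent(ideal), 3))
```

To try it on a real text, pass the output of `rank_frequencies(text)` to `zipf_exponent`. Short texts give noisy slopes, so the result should be read as a rough indication only.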

So, Mandelbrot had already hinted at a connection between grammar, language and semantics. That same year, 1994, two important research works appeared: one a doctoral thesis, the other a paper published in the Journal of the American Society for Information Science (JASIS). Let us revisit these.

Grammar, Language and Fractals

In March of 1994, Henning Fernau from Universität Karlsruhe (TH), Germany, wrote a brilliant paper for his doctoral work: Valuations Of Languages, With Applications To Fractal Geometry. Fernau proposes a mathematical model based on Michael Barnsley's Iterated Function Systems (IFS) (2 - 4). He writes,

"This paper shows that valuations are useful not only within the theory of codes, but also when dealing with ambiguity, especially in contextfree grammars, or for defining outer measures on the space of ω-words which are of some importance to the theory of fractals. These connections yield new formulae to determine the Hausdorff dimension of fractal sets (especially in Euclidean spaces) defined via formal languages."
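Fernau's formulae are beyond the scope of a short example, but the kind of quantity involved can be illustrated with a standard textbook result that is not specific to his paper. For an IFS made of similarity maps with contraction ratios r_1, ..., r_n that satisfies the open set condition, the Hausdorff dimension equals the similarity dimension d, the unique solution of r_1^d + ... + r_n^d = 1. The sketch below solves that equation by bisection; the function name `similarity_dimension` is an assumption made for illustration.

```python
import math


def similarity_dimension(ratios, tol=1e-12):
    """Solve sum(r ** d for r in ratios) == 1 for d by bisection.

    Each ratio must lie strictly between 0 and 1. The left-hand side
    decreases as d grows, so the root is unique.
    """
    if len(ratios) < 2 or any(not 0.0 < r < 1.0 for r in ratios):
        raise ValueError("need at least two ratios, each in (0, 1)")

    def excess(d):
        return sum(r ** d for r in ratios) - 1.0

    low, high = 0.0, 1.0
    while excess(high) > 0.0:      # widen the bracket until the sum drops below 1
        high *= 2.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if excess(mid) > 0.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


cantor = similarity_dimension([1 / 3, 1 / 3])        # middle-thirds Cantor set
sierpinski = similarity_dimension([0.5, 0.5, 0.5])   # Sierpinski triangle
print(cantor, math.log(2) / math.log(3))
print(sierpinski, math.log(3) / math.log(2))
```

Each printed pair compares the numerical root with the known closed form, log 2 / log 3 for the Cantor set and log 3 / log 2 for the Sierpinski triangle.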

Two months later, in the May 1994 issue of JASIS, Jim Ottaviani from The University of Michigan published what is considered by many a theoretical bombshell: The Fractal Nature of Relevance: a Hypothesis (5). Unfortunately, not many IR folks paid the necessary attention to his work. In the paper, Ottaviani proposes a new model for relevancy, based on fractal geometry. According to a CiteSeer abstract,

"Ottaviani writes that the hierarchical clustering methods (Chapter IV) often employed in IR fail to capture important relevance relationships among documents, but that the evolution of users interests during iterative searches requires a fractal model of relevance and cluster shapes."

A few months ago, I contacted Ottaviani to request a copy of his original work. He mentioned that copies are available only through university library requests.

Text Summarization, Retrieval and Fractals

Fernau and Ottaviani underscore the obvious: most tf*IDF text summarization models (Term Vector Theory) used by search engines and IR systems fail to reproduce how human abstractors actually work, which is the result of the synergy between semantics and the way humans read, write, think, speak and search.
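For readers unfamiliar with tf*IDF weighting, a minimal sketch follows. It uses one common variant, the raw term count multiplied by log(N / df); IR systems use many other variants, and the function name `tf_idf` and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (count of term in document) * log(N / df), where N is the
    number of documents and df is how many documents contain the term.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "fractal geometry of language",
    "fractal model of relevance",
    "grammar of language",
]
weights = tf_idf(docs)
# "of" occurs in every document, so its weight is log(3/3) = 0.
# "geometry" occurs in only one document, so it gets the top weight there.
print(weights[0])
```

Note that these weights depend only on term counts, not on where a term sits within the structure of a document.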

Almost ten years later, in July 2003 to be exact, Christopher Yang and Fu Lee Wang (Chinese University of Hong Kong) pointed this out in the paper Fractal Summarization: Summarization Based on Fractal Theory (6). I have opened a thread [Fractal Summarization Algorithm (7)] at the Forums.SearchEngineWatch.com site to discuss this research work. In this paper, the authors introduce a text summarization model based on the hierarchical structure of documents. Information is extracted from both document structures and on-page factors. The end result is a reduced copy of the original document. The authors found that fractal summarization outperforms traditional summarization algorithms found in the IR literature.

These studies on fractal relevancy and summarization tell us the obvious: humans tend to reduce information into topics using top-to-bottom selection rules until enough information is collected. In the process, documents are viewed as fractal clusters consisting of sections, subsections, paragraphs, sentences and terms. Information and meanings become more specific and relevant as the abstractor moves from top levels to lower levels. The meaning of words and phrases is slaved to the length scales of the passages. In the absence of a better characterization, I call this process fractal contextualization.
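The top-to-bottom selection described above can be sketched as a recursive allocation of a sentence quota over a document tree. This is a simplified illustration, not Yang and Wang's published algorithm: the `Node` class, the `weight` scores and the proportional-share rule are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A section of a document: it either has children or holds sentences."""
    title: str
    weight: float = 1.0                      # hypothetical importance score
    sentences: list = field(default_factory=list)
    children: list = field(default_factory=list)


def summarize(node, quota):
    """Pick up to `quota` sentences, splitting the quota among children
    in proportion to their weights and recursing until leaves are reached."""
    if quota <= 0:
        return []
    if not node.children:
        return node.sentences[:quota]        # leaf: take the leading sentences
    total = sum(child.weight for child in node.children)
    if total <= 0:
        return []
    # Visit heavier children first so rounding favors important sections.
    order = sorted(range(len(node.children)),
                   key=lambda i: node.children[i].weight, reverse=True)
    parts = {}
    remaining = quota
    for i in order:
        child = node.children[i]
        share = min(remaining, round(quota * child.weight / total))
        parts[i] = summarize(child, share)
        remaining -= share
    # Reassemble the picked sentences in document order.
    return [s for i in sorted(parts) for s in parts[i]]


doc = Node("article", children=[
    Node("intro", weight=1.0, sentences=["I1.", "I2."]),
    Node("method", weight=3.0, sentences=["M1.", "M2.", "M3.", "M4."]),
    Node("notes", weight=1.0, sentences=["N1."]),
])
print(summarize(doc, 5))
```

With a quota of five sentences, the heavier "method" section receives three and the other two sections one each, so the summary becomes more specific as the quota moves down the tree. A real system would derive the weights from the text rather than set them by hand.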

Grammar, Semantics and Fractals

In Universal Grammar (8), Professor Charles Henry (Rice University, Houston) proposes that a universal grammar is not to be found by the traditional method of studying words, abstracting categories of words, and overlaying a structure based on those categories (6). This work can be found in Communication and Cognition - Artificial Intelligence, Vol. 12, Nos. 1-2, pp. 45-61, a special issue on Self-Reference in Biological and Cognitive Systems edited by Luis Rocha (9).

Henry contends that one can do better by adopting a technique that emphasizes language as a biological byproduct of the brain, one where the organizing principles of language are viewed as sharing biological rules of constraint and are subject to evolutionary principles. The following quote from his article, found under Fractal Grammar and Symbols, is enlightening:

"Research on the geometry of language is highly suggestive in this regard (Van Fraasen, 1980; Van Fraasen and Hooker, 1976). If indeed the brain has learned to utilize itself, and this is what we pass along generation to generation, and why knowledge cannot be inherited. Our children inherit only the structure and with it the potentialities of re-structure."

"A phenomenon that occurs in nature and can be modelled mathematically that might shed enormous light on the function of language and, specifically, the organizational property of grammar is fractal in nature. If understood as a fractal, grammar would more easily be explained as an ever changing morphological set of basic instructions. Like semantics, indeed not to be distinguished from semantics, grammatical organization as a fractal phenomenon is a shifting yet stable structuring device that generates word constructions but is also the essence of the meaning. That the neurological arrangement of the brain is also fractal in nature is no coincidence. In this regard it is impossible to separate the dancer from the dance: to see grammar as an abstraction removed from meaning."

"Language activates these associations as well as produces them. In one regard the brain, through language, is constantly in search of new meaning. Meaning is thus highly determined by initial conditions (another feature of biological phenomena). This raises the issue of symbols and symbolism: what is the symbolic nature of language; does language, as a system of symbols, manipulate symbols and create new ones? Earlier descriptions of language would have words as symbolic representations of objects external to the body, or symbolic abstractions of qualities such as faith and honor that can become evident through behavior."

[Let me pause briefly to mention that Self-Reference in Biological and Cognitive Systems is edited by Luis Rocha from Los Alamos National Lab and Indiana University. I met Rocha via email back in 2003, when I was struggling in the Caribbean with the organization of search engine conferences and the write-up of proposals. He graciously accepted a keynote speaker invitation, but due to our local government bureaucracy all plans were canceled. I should point out that a lot of research on semantics, queries and fractals is taking place at Los Alamos, Indiana University and university centers across the nation. Well, end of the pause.]

Fractals and Knowledge Management Technology

That fractals are not irrelevant to semantics is evident not only from research conducted by university and government scientists but also from the work of private research firms. In Taking QuickPlace to the Next Level of Collaborative Knowledge Sharing (10), Bian McKay, Executive Vice President and Chief Scientist of CIRI Lab Inc., put it this way:

"Unlike first generation Knowledge Management technology, second generation doesn't use complex mathematics to calculate patterns from information because it needs to be dynamically adaptive with real-time responses. 2G-KM uses adaptive processing methods like those found in genetic algorithms with a focus on pattern-based Darwinian survival techniques in high-dimensional fractal semantic space to recognize emerging patterns of significance, facilitating early reactions to opportunities and threats to improve business health. Through a high-dimensional spatial continuum index, it turns the digital computing metaphor into an analog spatial semantic pattern metaphor, delivering a host of powerful automatic assimilation derivatives. These are based on concepts like semantic proximity, semantic convergence, recursive containment, and containment propagations that emulate machine learning, facilitated by abductive, inductive, and deductive inferencing with causal reasoning to aid the user in knowledge assimilation through concept usage cause-effect-impact analysis."

"Rather than indexing bulk content (i.e., all the words in documents) as does fulltext indexing, 2G-KM indexes multi-dimensional hyperspatial surfaces of emerging semantic patterns as determined through the usage of common vocabulary, tracing concept boundaries in hypercube fractal fashion. This facilitates the implementation of a high-dimensional, scalable, active-reflexive, associative semantic environment that can deliver on the need for real-time adaptive knowledge management services to responsively drive collaborative knowledge sharing environments."

"Such an index can adaptively map onto and extend the solution provided by first generation Knowledge Management techniques such as fulltext indexing, neural network, Baysean-belief Trees, Taxonomies and Classifications to bring them to the next level of productivity enhancement for corporations."

If by now you are not convinced that fractals are relevant to the nature of semantics, please keep reading. Like a painter refining his masterpiece, McKay beautifully explains:

"In this model, documents contain fractal patterns as signatures that emulate semantic hypercubes, traced by their common vocabulary, dynamically joining documents in collections to reflect Knowledge ViewPoints called kThreads. KThreads provide a similar interpretative value-added to dynamically adaptive knowledge management modeling, as do Eigen Vectors used in physical Engineering models. In 2G-KM adaptive models, kThreads intersect to emulate inferencing patterns, inferencing patterns intersect to emulate meta-inferencing patterns, and so forth into higher-orders of automated intelligence, introspecting fractal semantic patterns that trace through document collections as the knowledge concept use and reusage vehicles of business and governance."

"The effect is that, by fractally distilling or factoring fractal semantic patterns as common vocabulary surfaces out of unstructured documents, much like first order predicate calculus, you get highly operative semantic fractal patterns as first order Knowledge Operators that can be reused to generate higher order semantic assemblies as component-based concepts and next-level semantic operators based on common context."

"By automatically deriving fractal semantic patterns as simplicity from the apparent complexity of unstructured documents, you get the ability to "mix and match" combinations of knowledge concepts contained in and across documents, as they apply to corporate problems and corporate opportunities. This results in the ability to explore and analyze the use of corporate knowledge across corporations to business problems."

Fractals are not irrelevant to the nature of semantics, contrary to what some agents of misinformation have claimed (11). I advise members of both the IR and SEO communities to start learning about fractal semantic analysis and its relationship with search engine behaviors. Once we know how to extract patterns from documents and collections, we can grasp their hidden semantics.


References
  1. Bunde, A., Havlin, S., Fractals in Science Chapter 3, page 80; Springer-Verlag, Berlin (1994).
  2. Valuations Of Languages, With Applications To Fractal Geometry. Henning Fernau (1994)
  3. Valuations Of Languages, With Applications To Fractal Geometry. A version with Ludwig Staiger (1994).
  4. Fractal Motifs and Iterated Function Systems (IFS); E. Garcia
  5. The Fractal Nature of Relevance: a Hypothesis. Ottaviani, J.S., Journal of the American Society for Information Science 45, 4 (1994), 263-272.
  6. Fractal Summarization: Summarization Based on Fractal Theory. Christopher Yang and Fu Lee Wang, Chinese University of Hong Kong (2003).
  7. Fractal Summarization Algorithm; E. Garcia
  8. Universal Grammar. Henry, Charles, In Communication and Cognition - Artificial Intelligence, Vol. 12, Nos. 1-2, pp. 45-61.
  9. Self-Reference in Biological and Cognitive Systems. Edited by Luis Rocha.
  10. Taking QuickPlace to the Next Level of Collaborative Knowledge Sharing. Bian McKay, Executive Vice President and Chief Scientist of CIRI Lab Inc.
  11. Fractals are not Irrelevant to The Nature of Semantics; E. Garcia

Copyright © 2006 Mi Islita.com