Questo e' un estratto di
credits: Parsing Timeline, di jeffreykegler Version 3.1 Revision 2, 23 October 2019
wp/ACE Automatic_Computing_Engine of Alan Turing
In India, Pannini creates an exact and complete description of the Sanskrit language, including pronunciation. Sanskrit could be recreated using nothing but Pannini's grammar. Pannini's grammar is probably the first formal system of any kind, predating Euclid. In the future, nothing like it will exist for any other natural language of significant size or corpus. By 2018, Pannini will be the object of serious study. But in the 1940's and 1950's Pannini will be almost unknown in the West. This means that Pannini will have no direct effect on the other events in this timeline.
wp/Pāṇini Panini Pannini
In her translator's note on an article on Babbage's computer, Ada Lovelace becomes the first person to clearly see that programming a computer is a distinct discipline from building the computer itself.
Ada is also the first to think of software in linguistic terms -- she originates the idea that, in writing software, we communicate with the computer, and that the means that we use to do that is a "language".
Andrey Markov introduces his chains -- a set of states with transitions between them.
One offshoot of Markov's work will be what will come to be known as regular expressions.
Markov uses his chains, not for parsing, but to deal with a problem in probability -- does the law of large numbers require that events be independent? Indirectly, Markov is addressing the question of the existence of free will.
In 1913, Markov revisits his chains, applying them to the sequence of vowels and consonants in Pushkin's Eugene Onegin. Again, Markov's interest is not in parsing. Nonetheless, this is an application to language of what later will be regarded as a parsing technique, and apparently for the first time in the West.
Leonard Bloomfield, as part of his effort to create a linguistics that would be taken seriously as a science, publishes his "Postulates". Known as structural linguistics, Bloomfield's approach will dominate American lingustics for over two decades.
the totality of utterances that can be made in a speech community
Note: no reference to semantic, the usual view: utterances of a language "mean" something.
This omission is not accidental. Bloomfield excludes meaning from his definition of language because he wants linguistics to be taken seriously as science. Behaviorist thought is very influential at this time and behavorists believe that, while human behaviors can be observed and verified and therefore made the subject of science, mental states cannot be verified. Claiming to know what someone means by a word is claiming to read his mind. And "mind-reading" is not science.
Emil Post defines and studies a formal rewriting system using productions. With this, the process of rediscovering Pannini in the West begins.
In the form of arithmetic expressions,
operator expressions are the target of
the first efforts at automatic parsing.
We will not see this issue go away. Very informally, we can say that
The archetypal examples of operator expressions are arithmetic expressions. >>>
Rutishauser's language is structured line-by-line, as will be all languages
until LISP. No language until ALGOL will be truly block-structured.
The line-by-line languages are parsed using string manipulations. A parsing
theory is not helpful for describing these ad hoc string manipulations, so they
don't give rise to one.
The only logic in these early compilers that really deserves to be called a parsing method is that which tackles arithmetic expressions.
At IBM, a team under John Backus begins working on the language which will be called FORTRAN.
June 1954, ACM committee including John Backus and Grace Hopper
(ACM Association for Computing Machinery
www.acm.org)
The "pseudo-code" of the Committee is what in 2018 will be called a "programming langauge".
The definition goes on to detail the various tasks a "compiler" might perform. These include what in 2018 will be called "linking" and "loading", so that the ACM committee's redefinition of compiler can be seen as a broadening of Hopper's use of the term. It is not clear if the ACM committee's new definition captures a current usage of the term "compiler", or if it is a prescriptive redefinition, intended to change usage.
As the computer field develops, the textbook definition will expand beyond the ACM Committee's implementation-focused one, so that a
And linkers and loaders will often not be consider a part of the compiler in the more proper sense. But in its essentials, the most common usage of the term "compiler" in 2018 will be as it is defined by the ACM committee.
3 Models for the Description of Language. IRE Transactions on Information Theory. September, 1956 - Noam Chomsky (chomsky).
Chomsky publishes his "Three models" paper, which is usually considered the foundation of Western formal language theory. Chomsky demolishes the idea that natural language grammar can be modeled using only Markov chains. Instead, the paper advocates a natural language approach that uses 3 layers:
In his "Three models" paper, Chomsky says that
By a language then, we shall mean a set (finite or infinite) of sentences, each of finite length, all constructed from a finite alphabet of symbols.
This is exactly Bloomfield's definition, restated using set theory.
Nonetheless signs of departure from the behaviorist orthodoxy are apparent in "Three Models" -- Chomsky is quite willing to talk about what sentences mean, when it serves his purposes. For a utterance with multiple meanings, Chomsky's new model produces multiple syntactic derivations. Each of these syntactic derivations "looks" like the natural representation of one of the meanings. Chomsky points out that the insight into semantics that his new model provides is a very desirable property to have.
Chomsky is a turning point, so much so that he establishes or settles the
meaning of many of the terms we are using. A
In contrast to a parser, a recognizer is a something or someone that takes a string and answers "yes" or "no"
Note that, if we intend to do semantics, a recognizer alone is not sufficient.
In the strict sense, a recognizer cannot drive a compiler -- a compiler needs a parser.
But recognizers can be far easier to write, and are often hacked up and pressed into service as parsers. For example, if your semantics is simple and non-recursive, you might be able to drive your semantics phase with the output of a regex engine, using captures.
Noam Chomsky publishes Syntactic Structures, one of the most
important books of all time. The orthodoxy in 1957 is structural linguistics
which argues, with Sherlock Holmes, that it is a capital mistake to theorize
in advance of the facts
. Structuralists start with the utterances in a
language, and build upward.
But Chomsky claims that without a theory there are no facts: there is only noise. The Chomskyan approach is to start with a grammar, and use the corpus of the language to check its accuracy. Chomsky's approach will soon come to dominate linguistics.
Syntactic Structures, London: Mouton, 1957. Limited preview available in English
Oettinger's paper is full of evidence that stacks are still very new.
stacks called "pushdown stores".
DPDA Deterministic PushDown Sutomata, will come to be called Oettinger's mathematical model of stacks.
DPDA's will become the subject of a substantial literature.
LR(k) grammar "corresponds with the intuitive notion of
Knuth 1965 uses the Bloomfieldian definition of language, following the rest of the parsing literature at this time. In other words, Knuth defines a language a "set of strings", without regard to their meaning. The Bloomfieldian definition, while severely restrictive, has yet to cause any recognizable harm.
eterministic stack-based parsing is "exactly" LR-parsing.
It is also the case that non-deterministic stack-based parsing is "exactly" context-free parsing. This symmetry is very elegant, and suggests that the theoreticians have uncovered the basic laws behind parsing.
While nobody had been more authoritative about the limits of extensions as a definition of language than Chomsky (he had literally written the book), Chomsky himself had used extensions to produce some of his most impressive results.
When Knuth discovered the LR grammars, he announced them to the world with a full-blown mathematical description. The top-down grammars, which arose historically, have lacked such a description. In 1968, Lewis and Stearns fill that gap by defining the LL(k) grammars.
When LL is added to the vocabulary of parsing, the meaning of LR shifts slightly.
in response, the meaning of LR shifts to become
LL(k) means "scan from the left, using left reductions with k character of lookahead".
LL(1) is important.
I am not one of those who believes that mathematics takes place in vacuum -- it follows trends and even what are more accurately called fashions.
...
This post is about the influence of minimalism on computer languages and parsing theory. Larry Wall is perhaps the only language design who has managed to escape the influence of minimalism and that is the reason Perl is so different from other languages.
...
If you require evidence that parsing theory can be as much about trends, ideology and fashion as it is about mathematics and techology, here it is. The dominant parsing algorithm for production compilers today is hand-written recursive descent, exactly the one that was displaced in 1975, when the technology was almost unimaginably less powerful.
credits: jeffreykegler.github.io/blog Posts about Marpa: an annotated list
RHS Right Hand Side
LHS Left Hand Side
There is no "data" without theory, just
white noise. Every "fact" gathered relies on many prejudgements about what is
relevant and what is not. And you certainly cannot characterize anything as
"impossible", unless you have, in advance, a theory about what is possible.
http://jeffreykegler.github.io/Ocean-of-Awareness-blog/ Sun, 31 Mar
2019
Not Supported in HTML5.
The <tt> tag was used in HTML 4 to define teletype text.
What to Use Instead?
Consider the <kbd> element (for keyboard input), the <var> element (for variables), the <code> element (for computer code), the <samp> element (for computer output), or use CSS instead.
Ho preferito mettere qui i 2 articoletti
1956: The Chomsky hierarchy. L'articolo fondante i linguaggi formali
1957: Chomsky publishes Syntactic Structures
invece che dove vengono richiamati (in Linguaggi formali) poiche' l'articolo mi e' piaciuto molto, e mi ha spinto a mantenerne l'integrita'.