Chomsky's hierarchy is a classification of formal grammars into four types based on their generative power (the complexity of the languages they can generate). These grammar types are structured from least powerful to most powerful. The hierarchy was introduced by Noam Chomsky in 1956 and has become a central concept in formal language theory, computational theory, and linguistics.
The four levels in Chomsky’s hierarchy are:
Each type of grammar in the hierarchy can generate a specific class of languages, and each level includes the grammars of the levels below it (i.e., regular grammars are a subset of context-free grammars, context-free grammars are a subset of context-sensitive grammars, and so on). The hierarchy helps in understanding how the complexity of languages relates to the computational power required to process them.
Definition: Regular grammars are the simplest class of grammars. They can generate regular languages, which are the languages that can be recognized by a finite automaton.
Production Rules: The production rules in a regular grammar are of the form:
Where:
Characteristics:
Examples: Languages like:
Definition: Context-free grammars (CFGs) generate context-free languages (CFLs), which can be recognized by pushdown automata (PDA), machines that have a stack for memory.
Production Rules: The production rules in a context-free grammar are of the form:
A key restriction here is that the left-hand side of every production consists of a single non-terminal symbol.
Characteristics:
Examples:
Definition: Context-sensitive grammars generate context-sensitive languages, which are more powerful than context-free languages. They can be recognized by linear bounded automata (LBA), a Turing machine with limited space.
Production Rules: The production rules in a context-sensitive grammar are of the form:
The left-hand side of the rule may involve more than one non-terminal symbol, and the key feature is that no production rule can reduce the size of the string, meaning the length of the right-hand side must be at least as long as the left-hand side.
Characteristics:
Examples:
Definition: Unrestricted grammars are the most general class of grammars. They can generate recursively enumerable languages, which are the class of languages that can be recognized by a Turing machine.
Production Rules: The production rules in an unrestricted grammar can be of the form:
Characteristics:
Examples:
| Grammar Type | Language Class | Recognition Machine | Example Language |
|---|---|---|---|
| Type 3: Regular | Regular Languages | Finite Automaton (FA) | |
| Type 2: Context-Free | Context-Free Languages | Pushdown Automaton (PDA) | Arithmetic expressions |
| Type 1: Context-Sensitive | Context-Sensitive Languages | Linear Bounded Automaton (LBA) | |
| Type 0: Unrestricted | Recursively Enumerable | Turing Machine (TM) | Universal computing problems |
Chomsky's hierarchy provides a clear structure for understanding the complexity of formal languages and their recognition by computational models. From regular languages (which can be recognized by finite automata) to recursively enumerable languages (which can be recognized by Turing machines), the hierarchy helps define the computational limits of different classes of languages. Each level in the hierarchy offers more expressive power but also increases the complexity of language recognition, with Type 0 grammars being the most general and powerful.
Open this section to load past papers