ScholarQuill logoScholarQuillUniversity Notes
  • Notes
  • Past Papers
  • Blogs
  • Todo
Login
ScholarQuill logoScholarQuillUniversity Notes
Login
NotesPast PapersBlogsTodo
More
SubjectsDiscussionCGPA CalculatorGPA CalculatorStudent PortalCourse Outline
About
About usPrivacy PolicyReportContact
Notes
Past Papers
Blogs
Todo
Analytics
    Current Subject
    🧩
    Theory of Automata
    COMP3148
    Progress0 / 25 topics
    Topics
    1. Finite State Models2. Language definitions preliminaries3. Regular expressions/Regular languages4. Finite automata (FAs)5. Transition graphs (TGs)6. NFAs7. Kleene's theorem8. Transducers (automata with output)9. Pumping lemma and non-regular language10. Grammars and PDA11. CFGs12. Derivations, derivation trees and ambiguity13. Simplifying CFLs14. Normal form grammars and parsing15. Decidability16. Context sensitive languages17. Grammars and linear bounded automata (LBA)18. Chomsky's hierarchy of grammars19. Turing Machines Theory20. Turing machines21. Post machine22. Variations on TM23. TM encoding24. Universal Turing Machine25. Defining Computers by TMs
    COMP3148›Chomsky's hierarchy of grammars
    Theory of AutomataTopic 18 of 25

    Chomsky's hierarchy of grammars

    7 minread
    1,265words
    Intermediatelevel

    Chomsky's Hierarchy of Grammars

    Chomsky's hierarchy is a classification of formal grammars into four types based on their generative power (the complexity of the languages they can generate). These grammar types are structured from least powerful to most powerful. The hierarchy was introduced by Noam Chomsky in 1956 and has become a central concept in formal language theory, computational theory, and linguistics.

    The four levels in Chomsky’s hierarchy are:

    1. Type 3: Regular Grammars
    2. Type 2: Context-Free Grammars (CFGs)
    3. Type 1: Context-Sensitive Grammars (CSGs)
    4. Type 0: Unrestricted Grammars

    Each type of grammar in the hierarchy can generate a specific class of languages, and each level includes the grammars of the levels below it (i.e., regular grammars are a subset of context-free grammars, context-free grammars are a subset of context-sensitive grammars, and so on). The hierarchy helps in understanding how the complexity of languages relates to the computational power required to process them.

    Type 3: Regular Grammars

    • Definition: Regular grammars are the simplest class of grammars. They can generate regular languages, which are the languages that can be recognized by a finite automaton.

    • Production Rules: The production rules in a regular grammar are of the form:

      • A→aBA \to aBA→aB
      • A→aA \to aA→a
      • A→ϵA \to \epsilonA→ϵ

      Where:

      • AAA and BBB are non-terminal symbols,
      • aaa is a terminal symbol,
      • ϵ\epsilonϵ represents the empty string.
    • Characteristics:

      • Regular grammars can be represented by finite automata (deterministic or non-deterministic).
      • Regular languages can be described using regular expressions.
      • Regular languages are closed under operations such as union, intersection, and Kleene star.
    • Examples: Languages like:

      • L={an∣n≥1}L = \{ a^n \mid n \geq 1 \}L={an∣n≥1}, which is the set of all strings of "a's".
      • L={anbn∣n≥0}L = \{ a^n b^n \mid n \geq 0 \}L={anbn∣n≥0}, which is the set of strings consisting of some number of "a's" followed by an equal number of "b's".

    Type 2: Context-Free Grammars (CFGs)

    • Definition: Context-free grammars (CFGs) generate context-free languages (CFLs), which can be recognized by pushdown automata (PDA), machines that have a stack for memory.

    • Production Rules: The production rules in a context-free grammar are of the form:

      • A→αA \to \alphaA→α, where AAA is a non-terminal symbol and α\alphaα is a string of terminals and non-terminals.

      A key restriction here is that the left-hand side of every production consists of a single non-terminal symbol.

    • Characteristics:

      • Context-free languages can be parsed efficiently using recursive descent parsers and LL or LR parsers.
      • Context-free languages are closed under operations such as union, concatenation, and Kleene star.
      • CFGs can describe many programming languages and mathematical expressions.
    • Examples:

      • L={anbn∣n≥0}L = \{ a^n b^n \mid n \geq 0 \}L={anbn∣n≥0}, which is the set of strings consisting of an equal number of "a's" followed by "b's".
      • Arithmetic expressions, which can be described using context-free grammars.

    Type 1: Context-Sensitive Grammars (CSGs)

    • Definition: Context-sensitive grammars generate context-sensitive languages, which are more powerful than context-free languages. They can be recognized by linear bounded automata (LBA), a Turing machine with limited space.

    • Production Rules: The production rules in a context-sensitive grammar are of the form:

      • αAβ→αγβ\alpha A \beta \to \alpha \gamma \betaαAβ→αγβ, where α\alphaα, β\betaβ, and γ\gammaγ are strings of terminals and non-terminals, and the length of γ\gammaγ is greater than or equal to the length of AAA.

      The left-hand side of the rule may involve more than one non-terminal symbol, and the key feature is that no production rule can reduce the size of the string, meaning the length of the right-hand side must be at least as long as the left-hand side.

    • Characteristics:

      • Context-sensitive languages are decidable, and the problem of membership in a context-sensitive language is solvable by a linear bounded automaton.
      • Context-sensitive grammars can describe more complex syntactic structures that cannot be handled by context-free grammars.
      • Context-sensitive languages are closed under intersection, but not under union.
    • Examples:

      • L={anbncn∣n≥1}L = \{ a^n b^n c^n \mid n \geq 1 \}L={anbncn∣n≥1}, which is the set of strings where the number of "a's", "b's", and "c's" are all equal.
      • This class includes languages where the context in which a symbol appears affects how it is rewritten.

    Type 0: Unrestricted Grammars

    • Definition: Unrestricted grammars are the most general class of grammars. They can generate recursively enumerable languages, which are the class of languages that can be recognized by a Turing machine.

    • Production Rules: The production rules in an unrestricted grammar can be of the form:

      • α→β\alpha \to \betaα→β, where α\alphaα and β\betaβ are strings of non-terminals and terminals, and there are no restrictions on the form of α\alphaα and β\betaβ. The left-hand side can be any string of symbols, and the right-hand side can be any string as long as it’s not empty.
    • Characteristics:

      • Unrestricted grammars are the most powerful class in the Chomsky hierarchy and can generate any language that a Turing machine can recognize.
      • They are not necessarily decidable, as some languages generated by unrestricted grammars may lead to undecidable problems, like the halting problem.
      • Unrestricted grammars are closed under intersection and union.
    • Examples:

      • Recursively enumerable languages, which include languages that may not have a finite decision process (i.e., a Turing machine may not halt for all inputs).
      • Any language that a Turing machine can compute can be described by an unrestricted grammar.

    Chomsky Hierarchy Summary

    Grammar Type Language Class Recognition Machine Example Language
    Type 3: Regular Regular Languages Finite Automaton (FA) L={an∣n≥1}L = \{ a^n \mid n \geq 1 \}L={an∣n≥1}
    Type 2: Context-Free Context-Free Languages Pushdown Automaton (PDA) Arithmetic expressions
    Type 1: Context-Sensitive Context-Sensitive Languages Linear Bounded Automaton (LBA) L={anbncn∣n≥1}L = \{ a^n b^n c^n \mid n \geq 1 \}L={anbncn∣n≥1}
    Type 0: Unrestricted Recursively Enumerable Turing Machine (TM) Universal computing problems

    Conclusion

    Chomsky's hierarchy provides a clear structure for understanding the complexity of formal languages and their recognition by computational models. From regular languages (which can be recognized by finite automata) to recursively enumerable languages (which can be recognized by Turing machines), the hierarchy helps define the computational limits of different classes of languages. Each level in the hierarchy offers more expressive power but also increases the complexity of language recognition, with Type 0 grammars being the most general and powerful.

    Previous topic 17
    Grammars and linear bounded automata (LBA)
    Next topic 19
    Turing Machines Theory

    Past Papers

    Open this section to load past papers

    Click on Show Past Papers to see past papers.
    On This Page
      Reading Stats
      Est. reading time7 min
      Word count1,265
      Code examples0
      DifficultyIntermediate