A token is the smallest meaningful unit of a program that the compiler understands.
Example:
int x = 10;
👉 Tokens:
int | x | = | 10 | ;
| Term | Meaning |
|---|---|
| Token | Category (e.g., identifier, keyword) |
| Lexeme | Actual text (e.g., x, 10) |
| Pattern | Rule describing the form a token's lexemes can take (e.g., digit+) |
👉 Token = <Token Name, Attribute Value>
Example:
(id, pointer to symbol table)
(num, 10)
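The pair form above can be sketched in Python as follows. This is only an illustration: the `Token` type, the helper names `make_id_token` and `make_num_token`, and the use of a plain list as the symbol table are assumptions made for this sketch, not part of any particular compiler.

```python
from typing import NamedTuple

class Token(NamedTuple):
    name: str       # token name, e.g. "id" or "num"
    attribute: int  # symbol-table index for "id", numeric value for "num"

def make_id_token(lexeme, symbol_table):
    """Return (id, index), where index locates the lexeme in the symbol table."""
    if lexeme not in symbol_table:
        symbol_table.append(lexeme)  # first occurrence: create the entry
    return Token("id", symbol_table.index(lexeme))

def make_num_token(lexeme):
    """Return (num, value), where value is the integer the lexeme denotes."""
    return Token("num", int(lexeme))

symbol_table = []  # hypothetical symbol table: a list of identifier lexemes
print(make_id_token("x", symbol_table))  # Token(name='id', attribute=0)
print(make_num_token("10"))              # Token(name='num', attribute=10)
```

Here the list index stands in for the "pointer to symbol table"; a real compiler would typically store more per entry (type, scope, and so on).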
Specification of tokens means defining tokens using rules or patterns so that the lexical analyzer can identify them.
Tokens are specified using regular expressions.
👉 A regular expression is a pattern used to describe a set of strings.
👉 Rule for identifiers:
letter (letter | digit)*
👉 Examples:
x, count, total1
👉 Rule for numbers:
digit+
👉 Examples:
10, 25, 100
👉 Keywords (predefined words):
int, if, while, return
👉 Operators:
+, -, *, /, =
👉 Delimiters:
;, (, ), {, }
| Token Type | Regular Expression | Example |
|---|---|---|
| Identifier | letter (letter \| digit)* | var1 |
| Number | digit+ | 123 |
| Operator | + \| - \| * \| / \| = | + |
| Whitespace | space/tab/newline | n/a |
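As a rough sketch, the specification table can be written down with Python's standard `re` module. The dictionary name `TOKEN_SPEC` and the helper `matching_types` are hypothetical, and `letter` is assumed here to mean an ASCII letter and `digit` an ASCII digit.

```python
import re

# Each token type is specified by one regular expression (assumed ASCII-only).
TOKEN_SPEC = {
    "identifier": r"[A-Za-z][A-Za-z0-9]*",  # letter (letter | digit)*
    "number": r"[0-9]+",                    # digit+
    "operator": r"[+\-*/=]",                # + | - | * | / | =
    "whitespace": r"[ \t\n]+",              # space/tab/newline
}

def matching_types(lexeme):
    """Return the token types whose pattern matches the whole lexeme."""
    return [name for name, pattern in TOKEN_SPEC.items()
            if re.fullmatch(pattern, lexeme)]

print(matching_types("var1"))  # ['identifier']
print(matching_types("123"))   # ['number']
print(matching_types("+"))     # ['operator']
```

`re.fullmatch` is used so that a pattern only counts when it describes the entire lexeme, which mirrors the idea that a pattern defines a set of strings.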
Recognition of tokens is the process of identifying tokens in the input string using the specified patterns.
The recognition flow is:
Input String → Pattern Matching → Token Identified
Input:
sum = a + 5;
Recognition:
sum → identifier
= → operator
a → identifier
+ → operator
5 → number
; → delimiter
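The recognition shown above can be reproduced with a small scanner. This is a minimal sketch, not a production lexer: the names `TOKEN_PATTERNS`, `MASTER`, `KEYWORDS`, and `tokenize` are assumptions for illustration, and the patterns assume ASCII letters and digits. It combines all patterns into one regular expression with named groups and reads the group name to decide the token type.

```python
import re

KEYWORDS = {"int", "if", "while", "return"}  # predefined words

# Order matters only for "mismatch", which must come last as a catch-all.
TOKEN_PATTERNS = [
    ("number", r"[0-9]+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("operator", r"[+\-*/=]"),
    ("delimiter", r"[;(){}]"),
    ("whitespace", r"[ \t\n]+"),
    ("mismatch", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_PATTERNS))

def tokenize(source):
    """Return a list of (lexeme, token type) pairs for the input string."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "whitespace":
            continue  # whitespace separates tokens but produces none
        if kind == "mismatch":
            raise ValueError(f"unexpected character {lexeme!r}")
        if kind == "identifier" and lexeme in KEYWORDS:
            kind = "keyword"  # keywords look like identifiers, so check the table
        tokens.append((lexeme, kind))
    return tokens

for lexeme, kind in tokenize("sum = a + 5;"):
    print(f"{lexeme} → {kind}")
```

Running it on `sum = a + 5;` prints the same six lines as the recognition listing. The keyword check after the identifier match is one common design choice: it keeps the identifier pattern simple and treats keywords as a lookup.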
A finite automaton is a machine used to recognize patterns in input strings.
Start → letter → (letter/digit)* → Accept
(q0) --letter--> (q1)
(q1) --letter/digit--> (q1)
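The two transitions above can be simulated directly. The sketch below assumes q0 is the start state and q1 is the only accepting state, and that any missing transition means rejection; the names `char_class`, `TRANSITIONS`, and `is_identifier` are hypothetical.

```python
def char_class(ch):
    """Map a character to the input class used on the DFA's edges."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Transition table: (state, input class) -> next state
TRANSITIONS = {
    ("q0", "letter"): "q1",
    ("q1", "letter"): "q1",
    ("q1", "digit"): "q1",
}
START = "q0"
ACCEPTING = {"q1"}

def is_identifier(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING

print(is_identifier("total1"))  # True
print(is_identifier("1abc"))    # False
```

An empty string is rejected because the automaton never leaves q0, which matches the rule that an identifier needs at least one letter.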
👉 “Regular Expressions → Finite Automata → Token Recognition”
@1abc
int total = 25;
int → keyword
total → identifier
25 → number
| Lexeme | Token |
|---|---|
| int | keyword |
| total | identifier |
| = | operator |
| 25 | number |
| ; | delimiter |
👉 Frequently asked:
Regular Expression → DFA → Token Recognition
| Aspect | Specification of Tokens | Recognition of Tokens |
|---|---|---|
| Definition | Defining token patterns | Identifying tokens in input |
| Tool Used | Regular Expressions | Finite Automata (DFA/NFA) |
| Purpose | Describe tokens | Detect tokens |
| Stage | Before scanning | During scanning |
| Example | digit+ | Recognizing "123" as number |
| Output | Token rules | Actual tokens |
| Importance | High | Very High |