Module rustc_expand::mbe::macro_parser  
source · Expand description
This is an NFA-based parser, which calls out to the main Rust parser for named non-terminals (which it commits to fully when it hits one in a grammar). There’s a set of current NFA threads and a set of next ones. Instead of NTs, we have a special case for Kleene star. The big-O, in pathological cases, is worse than traditional use of NFA or Earley parsing, but it’s an easier fit for Macro-by-Example-style rules.
(In order to prevent the pathological case, we’d need to lazily construct the resulting
NamedMatches at the very end. It’d be a pain, and require more memory to keep around old
matcher positions, but it would also save overhead)
We don’t say this parser uses the Earley algorithm, because it’s unnecessarily inaccurate. The macro parser restricts itself to the features of finite state automata. Earley parsers can be described as an extension of NFAs with completion rules, prediction rules, and recursion.
Quick intro to how the parser works:
A “matcher position” (a.k.a. “position” or “mp”) is a dot in the middle of a matcher, usually
written as a ·. For example · a $( a )* a b is one, as is a $( · a )* a b.
The parser walks through the input a token at a time, maintaining a list
of threads consistent with the current position in the input string: cur_mps.
As it processes them, it fills up eof_mps with threads that would be valid if
the macro invocation is now over, bb_mps with threads that are waiting on
a Rust non-terminal like $e:expr, and next_mps with threads that are waiting
on a particular token. Most of the logic concerns moving the · through the
repetitions indicated by Kleene stars. The rules for moving the · without
consuming any input are called epsilon transitions. It only advances or calls
out to the real Rust parser when no cur_mps threads remain.
Example:
Start parsing a a a a b against [· a $( a )* a b].
Remaining input: a a a a b
next: [· a $( a )* a b]
- - - Advance over an a. - - -
Remaining input: a a a b
cur: [a · $( a )* a b]
Descend/Skip (first position).
next: [a $( · a )* a b]  [a $( a )* · a b].
- - - Advance over an a. - - -
Remaining input: a a b
cur: [a $( a · )* a b]  [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first position)
next: [a $( a )* · a b]  [a $( · a )* a b]  [a $( a )* a · b]
- - - Advance over an a. - - - (this looks exactly like the last step)
Remaining input: a b
cur: [a $( a · )* a b]  [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first position)
next: [a $( a )* · a b]  [a $( · a )* a b]  [a $( a )* a · b]
- - - Advance over an a. - - - (this looks exactly like the last step)
Remaining input: b
cur: [a $( a · )* a b]  [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first position)
next: [a $( a )* · a b]  [a $( · a )* a b]  [a $( a )* a · b]
- - - Advance over a b. - - -
Remaining input: ''
eof: [a $( a )* a b ·]
Structs§
- A single matcher position, representing the state of matching.
Enums§
- A unit within a matcher that aMatcherPoscan refer to. Similar to (and derived from)mbe::TokenTree, but designed specifically for fast and easy traversal during matching. Notable differences tombe::TokenTree:
- NamedMatchis a pattern-match result for a single metavar. All- MatchedNonterminals in the- NamedMatchhave the same non-terminal type (expr, item, etc).
- Represents the possible results of an attempted parse.
Functions§
- Count how many metavars declarations are inmatcher.
- Performs a token equality check, ignoring syntax context (that is, an unhygienic comparison)
Type Aliases§
- Contains a mapping ofMacroRulesNormalizedIdents toNamedMatches. This represents the mapping of metavars to the token trees they bind to.
- AParseResultwhere theSuccessvariant contains a mapping ofMacroRulesNormalizedIdents toNamedMatches. This represents the mapping of metavars to the token trees they bind to.