[Q&A] Phase 19.1 — SemanticParser: Intent Patterns, Slot Extraction & Edge Cases #473
Unanswered
web3guru888
asked this question in
Q&A
❓ Phase 19.1 — SemanticParser: Intent Patterns, Slot Extraction & Edge Cases
Common questions about the SemanticParser component introduced in Phase 19.1.
Q1: Why regex-based pattern matching instead of ML-based intent classification?
A: The regex-first approach was a deliberate design choice for several reasons:
`SemanticParserProtocol` is backend-agnostic. A future `MLSemanticParser` can implement the same `parse()` interface and be swapped in via `make_semantic_parser(backend="ml")`.
The key insight: regex handles structured command languages well. For free-form natural language, the ML backend path is already designed into the Protocol; it just isn't the default.
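The swap described above can be sketched with `typing.Protocol`. This is only an illustration: the `SemanticFrame` fields, the `RegexSemanticParser` class name, and the body of `make_semantic_parser` are assumptions, and `MLSemanticParser` is a placeholder for a backend that does not exist yet.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class SemanticFrame:
    # Hypothetical minimal frame; the real one likely carries more fields.
    intent: str
    confidence: float
    slots: dict = field(default_factory=dict)


class SemanticParserProtocol(Protocol):
    def parse(self, text: str) -> SemanticFrame: ...


class RegexSemanticParser:
    """Toy regex-style backend (hypothetical name) that knows one command."""

    def parse(self, text: str) -> SemanticFrame:
        if text.strip().lower() == "list agents":
            return SemanticFrame("list_agents", 1.0)
        return SemanticFrame("fallback", 0.0)


class MLSemanticParser:
    """Placeholder for a future ML backend with the same parse() shape."""

    def parse(self, text: str) -> SemanticFrame:
        raise NotImplementedError("ML backend is not implemented in this sketch")


def make_semantic_parser(backend: str = "regex") -> SemanticParserProtocol:
    # Callers depend only on the Protocol, so backends are interchangeable.
    backends = {"regex": RegexSemanticParser, "ml": MLSemanticParser}
    if backend not in backends:
        raise ValueError(f"unknown backend: {backend!r}")
    return backends[backend]()
```

Because the Protocol is structural, neither backend needs to inherit from it; any class with a matching `parse()` satisfies the type.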
Q2: How does confidence scoring work?
A: Confidence is computed as:
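As a rough sketch only, one plausible shape for the score blends match coverage (`matched_chars / total_chars`) with `literal_char_fraction`. The blend weight of 0.2, the metacharacter set, and the lowercase/whitespace normalisation are assumptions made for illustration; the thresholds in `to_level` are the ones listed for `IntentConfidence`.

```python
import re
from enum import Enum


class IntentConfidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Assumed set of regex metacharacters; everything else counts as a literal.
METACHARS = set("^$.*+?()[]{}|\\")
BONUS_WEIGHT = 0.2  # hypothetical weight, not taken from the source


def literal_char_fraction(pattern: str) -> float:
    if not pattern:
        return 0.0
    literals = sum(1 for ch in pattern if ch not in METACHARS)
    return literals / len(pattern)


def score(pattern: str, text: str) -> float:
    # Assumed normalisation: lowercase and collapse runs of whitespace.
    normalised = " ".join(text.lower().split())
    match = re.search(pattern, normalised)
    if match is None or not normalised:
        return 0.0
    matched_chars = match.end() - match.start()
    total_chars = len(normalised)
    coverage = matched_chars / total_chars
    return (1 - BONUS_WEIGHT) * coverage + BONUS_WEIGHT * literal_char_fraction(pattern)


def to_level(value: float) -> IntentConfidence:
    if value >= 0.85:
        return IntentConfidence.HIGH
    if value >= 0.50:
        return IntentConfidence.MEDIUM
    return IntentConfidence.LOW
```

Under these assumptions both `^list agents$` and `^list (.+)$` cover the whole input `list agents`, but the all-literal pattern scores higher because of its larger literal fraction.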
- `matched_chars` = length of the regex match span.
- `total_chars` = length of the normalised input.
- `literal_char_fraction` = count of non-metacharacters in the pattern / total pattern length.
This means a pattern like `^list agents$` (all literals) gets a higher bonus than `^list (.+)$` (contains a wildcard). The bonus rewards specific patterns that matched, even if the match span is the same.
The resulting float maps to `IntentConfidence`:
- `HIGH` (≥ 0.85): near-exact match
- `MEDIUM` (0.50–0.84): partial match, possible clarification needed
- `LOW` (< 0.50): weak match, `fallback_intent` may override
Q3: How are slots extracted and typed?
A: Slots are extracted from named regex groups in the pattern. The group name encodes the slot's semantic type via a suffix convention:
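A minimal sketch of suffix-driven extraction using Python's `re` named groups. The `Slot` dataclass, the `extract_slots` helper, the example pattern, and the use of `date.fromisoformat` for `_date` are hypothetical; only the suffixes, the `SlotType` names, and the `coercion_failed` behaviour come from this answer.

```python
import re
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Any


class SlotType(Enum):
    ENTITY = "entity"
    NUMBER = "number"
    DATE = "date"
    BOOLEAN = "boolean"
    STRING = "string"


@dataclass
class Slot:
    name: str
    value: Any
    type: SlotType
    coercion_failed: bool = False


def _to_number(raw: str):
    try:
        return int(raw)
    except ValueError:
        return float(raw)  # raises ValueError again for non-numeric text


def _to_bool(raw: str) -> bool:
    lowered = raw.strip().lower()
    if lowered in {"yes", "true", "1"}:
        return True
    if lowered in {"no", "false", "0"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Suffix -> (slot type, coercion function). ISO dates are an assumption.
SUFFIXES = {
    "_entity": (SlotType.ENTITY, str),
    "_num": (SlotType.NUMBER, _to_number),
    "_date": (SlotType.DATE, date.fromisoformat),
    "_bool": (SlotType.BOOLEAN, _to_bool),
}


def extract_slots(pattern: str, text: str) -> dict:
    match = re.search(pattern, text)
    if match is None:
        return {}
    slots = {}
    for name, raw in match.groupdict().items():
        if raw is None:  # optional group that did not participate
            continue
        slot_type, coerce = SlotType.STRING, str
        for suffix, (candidate_type, candidate_fn) in SUFFIXES.items():
            if name.endswith(suffix):
                slot_type, coerce = candidate_type, candidate_fn
                break
        try:
            slots[name] = Slot(name, coerce(raw), slot_type)
        except ValueError:
            # Keep the slot, but downgrade it and flag the failure.
            slots[name] = Slot(name, raw, SlotType.STRING, coercion_failed=True)
    return slots
```

With a pattern such as `^scale (?P<service_entity>\S+) to (?P<replicas_num>\S+)$`, the input `scale api to abc` still yields a `replicas_num` slot, typed `STRING` and flagged, so a downstream component can ask for clarification.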
The parser inspects each group name's suffix:
| Suffix | `SlotType` | Coercion |
| --- | --- | --- |
| `_entity` | `ENTITY` | |
| `_num` | `NUMBER` | `int()`/`float()` coercion |
| `_date` | `DATE` | |
| `_bool` | `BOOLEAN` | `yes`/`no`/`true`/`false`/`1`/`0` |
| | `STRING` | |
Coercion happens at extraction time. If coercion fails (e.g., a `_num` group captured "abc"), the slot is still included but typed as `STRING` with a `coercion_failed=True` flag; the `DialogueManager` can then request clarification.
Q4: What happens when multiple patterns match the same input?
A: Patterns are evaluated in priority-sorted order (descending). The sorting key is:
The first pattern whose match meets `confidence_threshold` wins. This means:
- `priority=5` beats any `priority=0` pattern regardless of specificity.
- `(.+)` groups reduce the score.
- … `IntentRegistry.register()`).
Example:
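To make the ordering concrete, here is a sketch with two hypothetical patterns, A and B, invented for illustration rather than taken from the registry. It assumes the sort key is, in order: higher priority, more literal characters, fewer wildcard groups, more typed-suffix groups, then earlier registration. The `IntentPattern` class and the `specificity` and `rank` helpers are likewise assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentPattern:
    intent: str
    regex: str
    priority: int = 0


# Matches simple (non-nested) groups, named or unnamed.
GROUP = re.compile(r"\((?:\?P<(?P<name>\w+)>)?(?P<body>[^()]*)\)")
TYPED_SUFFIXES = ("_entity", "_num", "_date", "_bool")
METACHARS = set("^$.*+?[]{}|\\()")


def specificity(regex: str):
    groups = list(GROUP.finditer(regex))
    wildcards = sum(1 for g in groups if g.group("body") in (".+", ".*"))
    typed = sum(
        1 for g in groups if (g.group("name") or "").endswith(TYPED_SUFFIXES)
    )
    skeleton = GROUP.sub("", regex)  # pattern text outside any group
    literals = sum(1 for ch in skeleton if ch not in METACHARS)
    return literals, wildcards, typed


def rank(patterns):
    """Return patterns in evaluation order (most preferred first)."""

    def key(item):
        index, pattern = item
        literals, wildcards, typed = specificity(pattern.regex)
        return (-pattern.priority, -literals, wildcards, -typed, index)

    return [p for _, p in sorted(enumerate(patterns), key=key)]


# Hypothetical patterns A and B, both able to match "deploy api".
PATTERN_A = IntentPattern("generic_action", r"^(?P<verb>.+) (?P<target>.+)$")
PATTERN_B = IntentPattern("deploy_service", r"^deploy (?P<service_entity>.+)$")

ORDERED = rank([PATTERN_A, PATTERN_B])
```

Under these assumptions `ORDERED` starts with pattern B, and adding a catch-all with `priority=5` would move that catch-all to the front regardless of its specificity.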
Pattern B wins: more literal chars (`deploy`) + fewer wildcards + `_entity` suffix.
Q5: How does `fallback_intent` work?
A: When no registered pattern matches the input, or all matches fall below `confidence_threshold`, the parser emits a `SemanticFrame` with:
The `fallback_intent` name is configurable via `ParserConfig.fallback_intent` (default: `"fallback"`).
Downstream handling:
- `semantic_parser_fallback_total` counter increments → Grafana alert if fallback rate > 20%.
The parser never returns `None`; it always produces a frame, ensuring the pipeline never stalls.
Q6: How does SemanticParser integrate with DialogueManager?
A: The integration follows a clean producer-consumer pattern:
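A minimal sketch of that producer-consumer shape, under stated assumptions: the `SemanticFrame` fields, the toy matching rules inside `parse()`, the `handle()` method, and the reply strings are all hypothetical. What it is meant to show is that the parser holds no state, the manager owns the history, and the frame (including the fallback frame) is the only thing passed between them.

```python
from dataclasses import dataclass, field
from enum import Enum


class IntentConfidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class SemanticFrame:
    # Hypothetical fields; the frame is immutable so nothing is shared mutably.
    intent: str
    confidence: IntentConfidence
    slots: dict = field(default_factory=dict)
    raw_text: str = ""


class SemanticParser:
    """Stateless producer: same input always yields the same frame."""

    def parse(self, text: str) -> SemanticFrame:
        normalised = " ".join(text.lower().split())
        if normalised == "list agents":
            return SemanticFrame("list_agents", IntentConfidence.HIGH, raw_text=text)
        if normalised.startswith("list"):
            return SemanticFrame("list_agents", IntentConfidence.MEDIUM, raw_text=text)
        # Never None: unmatched input becomes a fallback frame.
        return SemanticFrame("fallback", IntentConfidence.LOW, raw_text=text)


class DialogueManager:
    """Consumer: owns conversation state and picks the response strategy."""

    def __init__(self, parser):
        self._parser = parser
        self.history = []  # frames seen so far in this conversation

    def handle(self, text: str) -> str:
        frame = self._parser.parse(text)
        self.history.append(frame)
        if frame.intent == "fallback":
            return "Sorry, I did not understand that."
        if frame.confidence is IntentConfidence.MEDIUM:
            return f"Did you mean '{frame.intent}'?"
        return f"Executing {frame.intent}"
```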
Key design points:
- `SemanticParser` is stateless: it doesn't track dialogue history.
- `DialogueManager` owns the conversation state and decides the response strategy based on confidence level.
- `SemanticFrame` is the only data contract between the two, with no shared mutable state.
- `DialogueManager` may pass context hints to a future context-aware parser via `parse(text, context=...)`.
Q7: How to test with custom patterns?
A: Three approaches, from simplest to most thorough:
1. `register_pattern()` at runtime:
2. `ParserConfig.custom_patterns`:
3. Pytest fixtures for isolation:
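The three approaches can be sketched together with a toy stand-in. Everything here is hypothetical: `ToyParser`, its `custom_patterns` argument (standing in for `ParserConfig.custom_patterns`), the `register_pattern(intent, regex)` signature, and the frame fields. The factory is written as a plain function so the sketch runs without pytest installed; in a test suite it would typically be wrapped in a `@pytest.fixture` so that each test receives a fresh parser.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SemanticFrame:
    intent: str
    slots: dict = field(default_factory=dict)


class ToyParser:
    """Hypothetical stand-in for the real parser, for illustration only."""

    def __init__(self, custom_patterns=None):
        # Approach 2: patterns supplied up front via configuration.
        self._patterns = list(custom_patterns or [])

    def register_pattern(self, intent: str, regex: str) -> None:
        # Approach 1: patterns added at runtime.
        self._patterns.append((intent, regex))

    def parse(self, text: str) -> SemanticFrame:
        for intent, regex in self._patterns:
            match = re.search(regex, text)
            if match:
                return SemanticFrame(intent, match.groupdict())
        return SemanticFrame("fallback")


class NullSemanticParser:
    """Always returns the same frame, whatever the input."""

    def __init__(self, frame=SemanticFrame("noop")):
        self._frame = frame

    def parse(self, text: str) -> SemanticFrame:
        return self._frame


def make_isolated_parser() -> ToyParser:
    # Approach 3: build a fresh parser per test so registrations cannot leak.
    return ToyParser(custom_patterns=[("greet", r"^hello$")])


def test_runtime_registration_does_not_leak():
    first = make_isolated_parser()
    first.register_pattern("restart", r"^restart (?P<service_entity>\S+)$")
    assert first.parse("restart api").intent == "restart"

    second = make_isolated_parser()
    assert second.parse("restart api").intent == "fallback"
    assert second.parse("hello").intent == "greet"
```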
The `NullSemanticParser` always returns a fixed `SemanticFrame`, which is useful for testing downstream components without pattern logic.
Phase 19: Natural-Language Interface — sub-phase tracker