Description
Hi there, I was very pleased to find a solution to a limitation of the parse library: it can only generate (.+?) for capture groups, never (.*?). I feel it's a shame that the libraries could not be merged, but such is open source.
I've studied your docs and comments on the other repo and written out test cases for the behaviour I'm after.
I've only managed to make "optional strings" (nullable strings, Union[str,None]) whereas what I really want is "any width strings" (length 0+, str).
Here's the code I wrote to achieve it:
from parse import with_pattern
from parse_type.cfparse import Parser


def check(parser: Parser, schema: str, expected: list[str], /) -> None:
    """Validate the parsed field values against their expected values."""
    result = parser.parse(schema)
    try:
        assert result is not None, f"Parse failed for {schema!r} ({expected=})"
        values = [result[f] or "" for f in parser.named_fields]
        assert values == expected, f"Parsed {schema!r} as {values} ({expected=})"
    except AssertionError as exc:
        print(f" F {exc}")
    else:
        print(f" P {schema!r} ---> {result}")


@with_pattern(r".+")
def parse_str(text: str) -> str:
    return text


extra_types = {"Stringlike": parse_str}

parser = Parser("-{content:Stringlike?}", extra_types=extra_types)
print(f"EXPR {parser._expression}")
check(parser, "-hello world", ["hello world"])
check(parser, "-", [""])
print()

parser = Parser("-{a:Stringlike?} {b:Stringlike?}", extra_types=extra_types)
print(f"EXPR {parser._expression}")
check(parser, "-A B", ["A", "B"])
check(parser, "-A ", ["A", ""])  # ["A", ""]
check(parser, "- B", ["", "B"])  # ["", "B"]
check(parser, "- ", ["", ""])  # ["", ""]

Which results in
EXPR -(?P<content>(.+)?)
P '-hello world' ---> <Result () {'content': 'hello world'}>
P '-' ---> <Result () {'content': None}>
EXPR -(?P<a>(.+)?) (?P<b>(.+)?)
P '-A B' ---> <Result () {'a': 'A', 'b': 'B'}>
P '-A ' ---> <Result () {'a': 'A', 'b': None}>
P '- B' ---> <Result () {'a': None, 'b': 'B'}>
P '- ' ---> <Result () {'a': None, 'b': None}>
Note that in my code I extract each field value with an or "" fallback, so that I can test against lists of strings that include the empty string rather than None.
What I would really like is to eliminate that fallback: I just want strings.
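As a sanity check that does not depend on parse_type, the expressions from the EXPR lines above can be run through the standard re module. This is only a sketch: it assumes the printed expressions are what the parser actually matches with, and named_groups is a hypothetical helper, not part of either library. With plain re, the named group itself captures an empty string and only the inner optional group is unset, which suggests the None values come from a conversion step rather than from the regex.

```python
import re

# Expressions copied verbatim from the EXPR lines in the output above.
SINGLE = r"-(?P<content>(.+)?)"
DOUBLE = r"-(?P<a>(.+)?) (?P<b>(.+)?)"


def named_groups(pattern: str, text: str) -> dict:
    """Return the named groups of a full match (hypothetical helper)."""
    match = re.fullmatch(pattern, text)
    assert match is not None, f"no match for {text!r}"
    return match.groupdict()


# The outer named group takes part in the match, so it yields "" and not None.
print(named_groups(SINGLE, "-"))            # {'content': ''}
print(named_groups(DOUBLE, "-A "))          # {'a': 'A', 'b': ''}
print(named_groups(DOUBLE, "- "))           # {'a': '', 'b': ''}

# Only the inner, unnamed optional group (group 2) is left unset.
print(re.fullmatch(SINGLE, "-").group(2))   # None
```

If that assumption holds, the place to change is wherever the optional field's text is converted, not the pattern itself.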
I suspect that the place to do so would be to hook into the TypeBuilder but I'm falling very far down the rabbit hole at this point! If you could guide me I would appreciate it greatly :-)