Last week, we just finished our Assignment 2. We were modeling Regular Expressions as trees. It is really funny and quite hard. Every regex can be represented as a tree. Each leaf node contains exactly one of ‘0’, ‘1’, or ‘e’. Each internal node contains exactly one of ‘.’(Dot), ‘|’(Bar), or ‘*’(Star).
Basic idea as follow:
Firstly, we needed to transform ‘0’,’1’,’e’ to a RegexTreeNode. For example, RegexTreeNode("0",[]),RegexTreeNode("1",[]),RegexTreeNode("e", [])
Then,we needed to think about ‘.’(Dot), ‘|’(Bar),‘*’(Star).
If Regular expression is '((1.(0|1)*).0)', we should get node like DotNode(DotNode(RegexTreeNode('1', []), StarNode(BarNode(RegexTreeNode('0', []),
RegexTreeNode('1', [])))), RegexTreeNode('0', []))
Secondly, we needed to matches RegexTreeNodes and strings.
For example, match(RegexTreeNode('1', []), '10'),should return False.
Because in RegexTreeNode only has ‘1’, not ‘0’.
We just did it recursively with some conditions.
Here is our code for regex_match function:
def regex_match(r, s):
'''(RegexTree, str) -> bool
Return True iff r matches s.
'''
# r with length 1
if r.symbol == '0':
return s == '0'
if r.symbol == '1':
return s == '1'
if r.symbol == 'e':
return s == ''
# StarNode
if isinstance(r, StarNode):
if s == '':
return True
for i in range(len(s)):
if s[i] == r.children[0].symbol and s[i] * len(s) == s:
return regex_match(r.children[0], s[i])
return False
# BarNode
if isinstance(r, BarNode):
return any((regex_match(r.children[0], s),
regex_match(r.children[1], s)))
# DotNode
if isinstance(r, DotNode):
if s == '':
return (regex_match(r.children[0], s) and
regex_match(r.children[1], s))
else:
for i in range(len(s)):
s1 = s[:i]
s2 = s[i:]
return (regex_match(r.children[0], s1) and
regex_match(r.children[1], s2))
In the end, good luck for our term test 2!
No comments:
Post a Comment