Whereas Perl-compatible regular expression matchers typically exhibit some variation of leftmost-greedy semantics, those conforming to the POSIX standard are prescribed leftmost-longest semantics. However, the POSIX standard leaves some room for interpretation, and Fowler and Kuklewicz have done experimental work to confirm differences between various POSIX matchers. The Boost library has an interesting take on the POSIX standard: it maximises the leftmost match not with respect to subexpressions of the regular expression pattern but with respect to capturing groups. In our work, we provide the first formalisation of Boost semantics and analyse the complexity of regular expression matching under Boost semantics.
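The difference between leftmost-greedy and leftmost-longest matching can be illustrated with a minimal Python sketch. Python's `re` module is a backtracking matcher with ordered alternation, so it stands in here for the Perl-compatible side. The helper `leftmost_longest` is a hypothetical name introduced only for this illustration: it brute-forces the longest overall match at the leftmost possible start position. It does not model how POSIX or Boost assign submatches to subexpressions or capturing groups, which is the subject of the paper.

```python
import re


def leftmost_longest(pattern, text):
    """Return the longest match starting at the leftmost possible position.

    Hypothetical helper for illustration only: it captures the overall
    leftmost-longest rule, not the POSIX or Boost rules for submatches.
    """
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        # Try the longest candidate substring first, then shorter ones.
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return text[start:end]
    return None


# Leftmost-greedy: the first alternative that succeeds wins.
greedy = re.search(r"a|ab", "ab").group()

# Leftmost-longest: the longest match at the leftmost start wins.
longest = leftmost_longest(r"a|ab", "ab")

print(greedy, longest)
```

For the pattern `a|ab` on the input `ab`, the backtracking matcher stops at `a` because the first alternative succeeds, whereas the leftmost-longest rule selects `ab`.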
Reference:
Berglund M., Bester W., van der Merwe B. (2018) Formalising Boost POSIX Regular Expression Matching. In: Fischer B., Uustalu T. (eds) Theoretical Aspects of Computing – ICTAC 2018. Lecture Notes in Computer Science, vol 11187. Springer, Cham.
Available at: http://hdl.handle.net/10204/10936
Copyright: 2018 Springer, Cham. Due to copyright restrictions, the attached PDF file contains only the abstract of the full-text item. For access to the full text, please consult the publisher's website. The definitive version of the work is published in Theoretical Aspects of Computing – ICTAC 2018.