Python re.sub Multiline on String
Solution 1:
You need to replace re.MULTILINE with re.DOTALL (alias re.S) and move the period outside the character class, because inside a character class the dot matches only a literal . character.
Note that re.MULTILINE only changes the behavior of ^ and $, which then match at the start and end of each line rather than only at the start and end of the whole string. The re.DOTALL flag changes the behavior of . in the pattern, and only outside a character class: the dot then matches a newline as well.
So, the regex you could use for the current example is /\*.*?\*/. It matches a literal /* with /\*, then .*? matches as few characters as possible up to the first */, which is matched with \*/.
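A minimal sketch of the difference between the two flags. The sample string and variable names are made up for illustration; only the re calls and flags come from the standard library.

```python
import re

text = "first line\nsecond line"

# Without flags, ^ matches only at the very start of the string.
default_caret = re.findall(r'^\w+', text)
# re.MULTILINE lets ^ match at the start of every line.
multiline_caret = re.findall(r'^\w+', text, flags=re.MULTILINE)

# Without flags, . stops at a newline.
default_dot = re.findall(r'first.*', text)
# re.DOTALL lets . match the newline as well.
dotall_dot = re.findall(r'first.*', text, flags=re.DOTALL)

# Inside a character class the dot is literal regardless of flags.
class_dot = re.findall(r'[.]', "a.b\nc", flags=re.DOTALL)

print(default_caret)    # ['first']
print(multiline_caret)  # ['first', 'second']
print(default_dot)      # ['first line']
print(dotall_dot)       # ['first line\nsecond line']
print(class_dot)        # ['.']
```

Neither flag affects the other's metacharacters, which is why re.MULTILINE alone cannot make a dot-based pattern span several lines.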
See the code demo:
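import re

# Sample input: a PHP snippet containing a multi-line comment
txt = """
<?php
/* Multi-line
comment */
$var = 1;
"""
# re.S (re.DOTALL) lets . match newlines, so .*? can span the whole comment
new_txt = re.sub(r'/\*.*?\*/', '', txt, flags=re.S)
print("\n=========== TXT ============")
print(txt)
print("\n=========== NEW TXT ============")
print(new_txt)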
However, this is not the most efficient solution: multiline comments are often very long, and the lazy .*? has to test for \*/ after every character it consumes. An unrolling-the-loop technique is usually faster. The regex above can be "unrolled" like this:
/\*[^*]*(?:\*(?!/)[^*]*)*\*/
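A short sketch of the unrolled pattern in use, compared against the lazy one. The sample text and the names UNROLLED, LAZY and sample are illustrative assumptions, not part of the original answer. Note that the unrolled pattern needs no re.DOTALL flag, since the negated class [^*] already matches newlines.

```python
import re

# Unrolled form: /\* opens the comment, [^*]* consumes non-asterisks
# (newlines included), (?:\*(?!/)[^*]*)* consumes any * that is not
# followed by /, and \*/ closes the comment.
UNROLLED = re.compile(r'/\*[^*]*(?:\*(?!/)[^*]*)*\*/')
# Lazy form from the answer above, which does need re.DOTALL.
LAZY = re.compile(r'/\*.*?\*/', flags=re.DOTALL)

# Hypothetical input with asterisks inside the comment body.
sample = "<?php\n/* Multi-line\n * with * stars **/\n$var = 1; /* second */\n"

stripped_unrolled = UNROLLED.sub('', sample)
stripped_lazy = LAZY.sub('', sample)

print(repr(stripped_unrolled))             # '<?php\n\n$var = 1; \n'
print(stripped_unrolled == stripped_lazy)  # True
```

Both patterns remove the same text here; the unrolled one simply consumes runs of non-asterisk characters in a single step instead of advancing one character at a time.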