
How To Combine Multiple Regular Expressions Into One Line?

My script works fine doing this:

images = re.findall('src.\'(\S*?media.tumblr\S*?tumblr_\S*?jpg)', doc)
videos = re.findall('\S*?(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*)', doc)

Solution 1:

As mentioned in the comments, a pipe (|) should do the trick.

The regular expression

(src.\"(\S*?media.tumblr\S*?tumblr_\S*?jpg))|(\S*?(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*))

catches either of the two patterns.
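One thing to keep in mind when using the combined pattern with re.findall: because it contains four capture groups, each match comes back as a 4-tuple, and the groups of the alternative that did not match are empty strings. Below is a minimal sketch of splitting the results back into two lists. The sample `doc` string and its URLs are made up for illustration, not taken from the question.

```python
import re

# Hypothetical sample input; the real `doc` would be the fetched page source.
doc = (
    '<img src="https://66.media.tumblr.com/abc/tumblr_xyz_500.jpg"> '
    '<source src="https://www.tumblr.com/video_file/123/tumblr_AbC123" type="video/mp4">'
)

# The combined pattern from this answer, split over two lines for readability.
combined = re.compile(
    r'(src.\"(\S*?media.tumblr\S*?tumblr_\S*?jpg))'
    r'|(\S*?(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*))'
)

# findall returns one tuple per match: (group 1, group 2, group 3, group 4).
# Group 2 holds the image URL, group 4 the video URL.
matches = combined.findall(doc)
images = [m[1] for m in matches if m[1]]
videos = [m[3] for m in matches if m[3]]

print(images)
print(videos)
```

The document is scanned only once, at the cost of a small filtering step afterwards.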



Solution 2:

If you really want efficient...

For starters, I would cut out the leading \S*? in the second regex. It sits outside the capture group, so it contributes nothing to the result and only creates opportunities for backtracking.

src.\"(\S*?media.tumblr\S*?tumblr_\S*?jpg)|(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*)
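As a quick sanity check that dropping the leading \S*? does not change what is captured, the two video patterns can be compared on a sample input. This is only a sketch: the `doc` string is invented, and it says nothing about speed, only that the results agree here.

```python
import re

# Hypothetical sample input with one image and one video URL.
doc = (
    '<img src="https://66.media.tumblr.com/abc/tumblr_xyz_500.jpg"> '
    '<source src="https://www.tumblr.com/video_file/123/tumblr_AbC123" type="video/mp4">'
)

# Original video pattern: the lazy prefix is matched but thrown away,
# because only the capture group is returned by findall.
with_prefix = re.findall(r'\S*?(http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*)', doc)

# Simplified pattern: no prefix and no group, so findall returns the whole match.
without_prefix = re.findall(r'http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*', doc)

print(with_prefix == without_prefix)
```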

Other ideas

You can get rid of the capture groups by using a small lookbehind in the first alternative. That lets you drop all the parentheses and match exactly what you want. Not faster, but tidier:

(?<=src.\")\S*?media.tumblr\S*?tumblr_\S*?jpg|http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*
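With no capture groups left, re.findall returns a flat list of matched strings rather than tuples. A minimal sketch, again on an invented `doc` string. Note that Python's re module requires lookbehinds to be fixed width, which `src.\"` (five characters) satisfies.

```python
import re

# Hypothetical sample input with one image and one video URL.
doc = (
    '<img src="https://66.media.tumblr.com/abc/tumblr_xyz_500.jpg"> '
    '<source src="https://www.tumblr.com/video_file/123/tumblr_AbC123" type="video/mp4">'
)

# The lookbehind asserts that src.\" precedes the match without consuming it,
# so the match itself starts at the URL.
pattern = re.compile(
    r'(?<=src.\")\S*?media.tumblr\S*?tumblr_\S*?jpg'
    r'|http\S*?video_file\S*?tumblr_[a-zA-Z0-9]*'
)

# No groups in the pattern, so each list element is a whole match.
urls = pattern.findall(doc)
print(urls)
```

The trade-off is that images and videos now arrive in one list, so telling them apart needs a separate check (for example on the `.jpg` suffix).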

Do you intend for the periods after src and media to mean "any character", or to mean "a literal period"? If the latter, escape them: \.
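To illustrate the difference, here is a small sketch using hypothetical names `loose` and `strict`. It also shows why the period after src probably should stay unescaped if the input looks like src="...", since there it is matching the equals sign.

```python
import re

# Unescaped: the period matches any single character (except a newline).
loose = re.compile(r'media.tumblr')
# Escaped: the period matches only a literal ".".
strict = re.compile(r'media\.tumblr')

loose_hits_typo = loose.search('mediaXtumblr') is not None
strict_hits_typo = strict.search('mediaXtumblr') is not None
strict_hits_dot = strict.search('66.media.tumblr.com') is not None

# In src.\" the period stands in for the "=" of src="...",
# so escaping it would stop the pattern from matching that input.
src_any = re.search(r'src.\"', 'src="x') is not None
src_literal = re.search(r'src\.\"', 'src="x') is not None

print(loose_hits_typo, strict_hits_typo, strict_hits_dot, src_any, src_literal)
```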

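You can use the re.IGNORECASE option and shorten the character class from [a-zA-Z0-9] to [a-z0-9]. Keep in mind that the flag applies to the whole pattern, so src, media, tumblr and jpg become case-insensitive as well: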

(?<=src.\")\S*?media.tumblr\S*?tumblr_\S*?jpg|http\S*?video_file\S*?tumblr_[a-z0-9]*
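The flag has to be passed to the call (or to re.compile) for the shortened character class to behave the same as the original. A minimal sketch on an invented `doc` string, comparing the pattern with and without the flag:

```python
import re

# Hypothetical sample input; the video ID mixes upper and lower case.
doc = '<source src="https://www.tumblr.com/video_file/123/tumblr_AbC123">'

pattern = (
    r'(?<=src.\")\S*?media.tumblr\S*?tumblr_\S*?jpg'
    r'|http\S*?video_file\S*?tumblr_[a-z0-9]*'
)

# Without the flag, [a-z0-9]* stops at the first uppercase letter.
case_sensitive = re.findall(pattern, doc)

# With re.IGNORECASE, [a-z0-9] also accepts uppercase letters.
case_insensitive = re.findall(pattern, doc, re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```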
