Regex End Of Line And Specific Characters
Solution 1:
Your problem is threefold:
1) your string contains extra \r (Carriage Return character) before \n (New Line character); this is common in Windows and in network communication protocols; it is probably best to remove any trailing whitespace from your string:
regexString = regexString.rstrip()
2) as mentioned by Wiktor Stribiżew, your regexp is unnecessarily surrounded with / characters - some languages, like Perl, define regexp as a string delimited by / characters, but Python is not one of them;
3) your instruction using re.sub is actually replacing the matching part of regexString with an empty string - I believe this is the exact opposite of what you want (you want to keep the match and remove everything else, right?); that's why fixing the regexp makes things "even worse".
To summarize, I think you should use this instead of your current code:
m = re.match('T12F8B0A22[A-Z0-9]{2}F8', regexString)
if m:  # re.match returns None when the string does not start with the pattern
    regexString = m.group(0)
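Putting the three points together, here is a minimal sketch. The input value `raw` and the helper name `extract_token` are made up for illustration; the original string was not shown in full, so treat this as an assumption about its shape rather than the exact data.

```python
import re

# Hypothetical input: the wanted token followed by a Windows-style line ending.
raw = 'T12F8B0A22A0F8\r\n'

def extract_token(text):
    """Return the leading token, or None if the text does not start with one."""
    cleaned = text.rstrip()  # drops trailing whitespace, including \r and \n
    m = re.match(r'T12F8B0A22[A-Z0-9]{2}F8', cleaned)
    return m.group(0) if m else None

# Keeping the match gives the token itself.
print(repr(extract_token(raw)))

# re.sub with an empty replacement does the opposite: it deletes the token
# and leaves only the line ending behind.
print(repr(re.sub(r'T12F8B0A22[A-Z0-9]{2}F8', '', raw)))
```

Returning None instead of raising lets the caller decide what to do with input that does not match.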
Solution 2:
There are several ways to get rid of the "\r", but first a little analysis of your code:
1. the special character for the end of the string is just '$', not '$\', in Python;
2. re.sub replaces the matched pattern with a string ('' in your case), so it replaces the part you want to keep with an empty string and you are left with only the \\r.
Possible solutions:
- use a simple replace: regexString.replace('\\r', '')
- if you want to stick to regex, the approach is the same:
    pattern = '\\\\r'
    match = re.sub(pattern, '', regexString)
- if you want to access the different groups, use re.search:
    match = re.search('(^T12F8B0A22[A-Z0-9]{2}F8)(.*)',regexString)
    match.group(1) # will give you the T12...
    match.group(2) # gives you the \\r
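One thing worth checking before picking a pattern is whether the string ends with a real carriage return or with the two visible characters backslash and r, because '\\r' only matches the latter. The sketch below uses two made-up sample strings (`with_cr` and `with_literal` are illustrative names, not from the question) to show the difference.

```python
import re

with_cr = 'T12F8B0A22A0F8\r'        # ends with a real carriage return
with_literal = 'T12F8B0A22A0F8\\r'  # ends with a backslash and the letter r

# '\\r' in str.replace matches only the two-character sequence backslash + r,
# so the string with a real carriage return is left unchanged.
print(repr(with_cr.replace('\\r', '')))
print(repr(with_literal.replace('\\r', '')))

# '\r' matches the control character itself.
print(repr(with_cr.replace('\r', '')))

# Regex equivalents: r'\\r' is the same pattern as '\\\\r'.
print(repr(re.sub(r'\\r', '', with_literal)))
print(repr(re.sub(r'\r', '', with_cr)))
```

Printing repr(regexString) on the real data is a quick way to tell which of the two cases applies.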
Solution 3:
Just match what you want to find. A couple of examples:
import re
data = '''lots of
otherT12F8B0A2212F8garbage
T12F8B0A2234F8around
T12F8B0A22ABF8the
stringsT12F8B0A22CDF8
'''
print(re.findall('T12F8B0A22..F8', data))
# ['T12F8B0A2212F8', 'T12F8B0A2234F8', 'T12F8B0A22ABF8', 'T12F8B0A22CDF8']
m = re.search('T12F8B0A22..F8', data)
if m:
    print(m.group(0))
# T12F8B0A2212F8
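Note that '..' accepts any two characters except a newline. If only uppercase letters and digits should be allowed in those two positions, the character class from Solution 1 can be used with findall as well. A small sketch, with a made-up `sample` string chosen to show the difference:

```python
import re

# Hypothetical sample: one valid token and two that only the loose pattern accepts.
sample = 'T12F8B0A22ABF8 T12F8B0A22--F8 T12F8B0A22abF8'

# '..' matches any two characters (other than a newline).
loose = re.findall(r'T12F8B0A22..F8', sample)

# '[A-Z0-9]{2}' restricts the two positions to uppercase letters and digits.
strict = re.findall(r'T12F8B0A22[A-Z0-9]{2}F8', sample)

print(loose)
print(strict)
```

Which of the two is appropriate depends on how much you trust the surrounding data.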