Function To Extract Integer With Regex Returns NoneType
I wrote a function to extract integers from strings. An example of the strings is below; they are a column in my dataframe. The output I got is in square brackets, with a lot of numbers in
Solution 1:
Your function returns None because you forgot the return statement. Every function in Python has a return value, so a missing return statement is like returning None.
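A minimal sketch of the difference, assuming a simple digit pattern. The names `extract_no_return` and `extract_with_return` are hypothetical and are not the asker's code:

```python
import re

def extract_no_return(text):
    # The matches are computed but never returned,
    # so calling this function evaluates to None.
    numbers = re.findall(r"\d+", text)

def extract_with_return(text):
    # Same logic, but the result is handed back to the caller.
    numbers = re.findall(r"\d+", text)
    return [int(n) for n in numbers]

print(extract_no_return("2 items for 30"))    # None
print(extract_with_return("2 items for 30"))  # [2, 30]
```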
Solution 2:
You want to use either str.findall or str.extractall:
In [11]: REGEX = r'(?<!No\s)(?<!new)(?!2016)(\d{2,4})+€?'
In [12]: s = df2017['Items']
In [13]: s.str.findall(REGEX)
Out[13]:
0 [20]
1 [430]
2 [2015, 30]
3 [016, 80, 20, 00]
4 [30, 13]
5 [016, 100]
6 [016, 016, 70]
dtype: object
In [14]: s.str.extractall(REGEX)
Out[14]:
            0
  match
0 0        20
1 0       430
2 0      2015
  1        30
3 0       016
  1        80
  2        20
  3        00
4 0        30
  1        13
5 0       016
  1       100
6 0       016
  1       016
  2        70
Generally extractall is preferred, since it returns a DataFrame with one row per match (indexed by the original row and the match number) rather than a Series of Python lists.
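To illustrate why the one-row-per-match layout is convenient, here is a small sketch. It assumes made-up sample data and a deliberately simple pattern (`pattern` is not the REGEX used in this answer), and it needs pandas installed:

```python
import pandas as pd

# Hypothetical sample data standing in for df2017['Items'].
s = pd.Series([
    "shirt 20€",
    "old bike 430€",
    "no prices here",
    "lamp 30€ and desk 13€",
])

# A simple illustrative pattern with one capture group.
pattern = r"(\d+)"

# findall: one Python list per row, empty when nothing matches.
as_lists = s.str.findall(pattern)

# extractall: one row per match, indexed by (original row, match number).
matches = s.str.extractall(pattern)

# Convert the captured strings to integers and sum them per original row.
# Rows without any match are absent from the result.
totals = matches[0].astype(int).groupby(level=0).sum()
print(totals)
```

With the extractall result, the conversion and the per-row sum are ordinary vectorized pandas operations; with the findall result you would have to loop over each list.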
Solution 3:
If your problem is getting the sum of the integers, then you can simply do:
sum(int(x) for x in ...)
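As a concrete sketch of that expression, assuming the matches come from `re.findall` on a single string (the input text and the simple pattern are made up for illustration):

```python
import re

# Hypothetical input string.
text = "lamp 30€ and desk 13€"

# re.findall returns the matches as strings, so convert each one before summing.
total = sum(int(x) for x in re.findall(r"\d+", text))
print(total)  # 43
```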
However, if your problem is with the regex, then you should consider improving your filter mechanism (what should go in). You may also consider filtering manually (though not ideal) word by word (determining which word is irrelevant).
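One way the word-by-word approach could look is sketched below. The function name `extract_prices` and both filtering rules (skip a number that follows "No", ignore the token "2016") are assumptions chosen to mirror the lookarounds in the regex from Solution 2; adapt them to whatever should actually be excluded:

```python
def extract_prices(text, skip_after=("No",), ignore=("2016",)):
    """Collect integer tokens, skipping those that follow a word in
    skip_after and those listed in ignore (both rules are assumptions)."""
    prices = []
    previous = ""
    for word in text.split():
        # Drop a trailing euro sign so "80€" is treated as "80".
        token = word.rstrip("€")
        if token.isdecimal() and previous not in skip_after and token not in ignore:
            prices.append(int(token))
        previous = word
    return prices

print(extract_prices("Bike from 2016 No 5 sold for 80€"))  # [80]
```

This is more verbose than a regex, but each exclusion rule is explicit and easy to test on its own.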