How To Query A Numerical Column Name In Pandas?
Let's suppose I create a dataframe with columns and query it, i.e.
pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b']).query('a>1')
This will give me
   a  b
1  3  4
2  5  6
But w
Solution 1:
Since query is still under development, one possible workaround is to monkey patch pd.DataFrame with a method that evaluates expressions written in terms of self, i.e.:
def query_cols(self, expr):
    # Expressions that mention self are evaluated as plain Python
    if 'self' in expr:
        return self[eval(expr)]
    else:
        return self.query(expr)

pd.DataFrame.query_cols = query_cols
pd.DataFrame([[1,2],[3,4],[5,6]]).query_cols('self[1] > 3')
   0  1
1  3  4
2  5  6
pd.DataFrame([[1,2],[3,4],[5,6]]).query_cols('self[1] == 4')
   0  1
1  3  4
pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b']).query_cols('a > 3')
   a  b
2  5  6
This is a simple trick and doesn't suit all cases. The answer will be updated when the issue with query is resolved.
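For comparison, the self branch of the patch filters the same rows as ordinary boolean indexing, which needs no eval at all. A minimal sketch, assuming pandas is installed; query_cols is restated here with its indentation restored, and the names df, patched, and plain are only illustrative:

```python
import pandas as pd

def query_cols(self, expr):
    # Expressions that mention `self` are evaluated as plain Python,
    # so integer column labels such as self[1] work.
    if 'self' in expr:
        return self[eval(expr)]
    # Everything else goes through the regular DataFrame.query.
    return self.query(expr)

pd.DataFrame.query_cols = query_cols

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]])

patched = df.query_cols('self[1] > 3')   # goes through eval
plain = df[df[1] > 3]                    # same filter, no eval involved

print(patched.equals(plain))
```

Because eval runs arbitrary Python, the patched method is best kept away from expression strings that come from untrusted input.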
Solution 2:
An option without any monkey patching is to prefix the dataframe's variable name with @, which lets query refer to a Python variable from the surrounding scope:
# If you are fond of one-liners
df = pd.DataFrame([[1,2],[3,4],[5,6]]); df.query('@df[0] > 1')
# Otherwise this is the same as
df = pd.DataFrame([[1,2],[3,4],[5,6]])
df.query('@df[0] > 1') # @df refers to the variable df
Output:
   0  1
1  3  4
2  5  6
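The @ lookup can be checked against two alternatives that avoid it: a boolean mask, and renaming the integer labels to strings before calling query. This is a sketch, assuming pandas is installed; the helper names filter_with_at and filter_with_rename and the col prefix are made up for illustration:

```python
import pandas as pd

def filter_with_at(df):
    # @df is resolved from the local variables of this function.
    return df.query('@df[0] > 1')

def filter_with_rename(df):
    # Give integer labels string names (0 -> "col0") so query can see them.
    named = df.rename(columns=lambda c: f"col{c}")
    return named.query('col0 > 1')

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]])

via_at = filter_with_at(df)
via_mask = df[df[0] > 1]          # plain boolean indexing
via_rename = filter_with_rename(df)

print(via_at.equals(via_mask))
print(via_rename)
```

The rename variant keeps the expression inside query's own syntax, at the cost of changing the column labels in the result.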
References
You can find more ways of dealing with this here.