
How To Query A Numerical Column Name In Pandas?

Let's suppose I create a DataFrame with named columns and query it, i.e.

pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b']).query('a>1')

This will give me

   a  b
1  3  4
2  5  6

But w

Solution 1:

Since query is still under development, one possible solution is to monkey patch pd.DataFrame with a method that evaluates expressions referring to self, i.e.:

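import pandas as pd

def query_cols(self, expr):
    if 'self' in expr:
        # Evaluate the expression as plain Python to get a boolean mask.
        # eval runs arbitrary code, so only pass trusted strings.
        return self[eval(expr)]
    else:
        # Anything else goes through the regular query.
        return self.query(expr)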

pd.DataFrame.query_cols = query_cols

pd.DataFrame([[1,2],[3,4],[5,6]]).query_cols('self[1] > 3')

   0  1
1  3  4
2  5  6

pd.DataFrame([[1,2],[3,4],[5,6]]).query_cols('self[1] == 4')

   0  1
1  3  4

pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b']).query_cols('a > 3')

   a  b
2  5  6

This is a simple trick and doesn't suit every case; the answer will be updated when the issue with query is resolved.
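For reference, the self branch of query_cols boils down to ordinary boolean indexing: the expression is evaluated to a boolean mask, and the mask selects the rows. A minimal sketch of that equivalence without any patching (the names df, mask and result are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]])

# Same row selection as query_cols('self[1] > 3'), written out by hand
mask = df[1] > 3      # boolean Series: False, True, True
result = df[mask]     # keeps the rows whose mask value is True
print(result)
```

This form sidesteps eval entirely, which may be preferable when the expression does not have to be passed around as a string.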

Solution 2:

Solution

An option without any monkey patching is to use @, which lets a query expression refer to a variable from the surrounding scope (here the DataFrame itself), as follows.

# If you are fond of one-liners
df = pd.DataFrame([[1,2],[3,4],[5,6]]); df.query('@df[0] > 1')

# Otherwise this is the same as
df = pd.DataFrame([[1,2],[3,4],[5,6]])
df.query('@df[0] > 1') # @df refers to the variable df

Output:

   0  1
1  3  4
2  5  6
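The same @ mechanism should work with any local variable, not only the DataFrame itself. A small sketch, assuming a hypothetical helper rows_where_col_gt (not part of pandas) that pulls the integer-named column into a local variable and references both it and the threshold with @:

```python
import pandas as pd

def rows_where_col_gt(df, col, threshold):
    """Hypothetical helper: rows of df where column `col` exceeds `threshold`."""
    values = df[col]  # plain indexing works for integer column names
    # @values and @threshold are looked up in this function's local scope
    return df.query('@values > @threshold')

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]])
result = rows_where_col_gt(df, 0, 1)
print(result)
```

Wrapping the lookup in a function keeps the query string fixed, so the column label and threshold can vary without building expression strings by hand.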

References

You can find more ways of dealing with this here.
