pandas.DataFrame.query
- DataFrame.query(expr, **kwargs)
Query the columns of a frame with a boolean expression.
Parameters: expr : string
The query string to evaluate. The result of the evaluation of this expression is first passed to loc and if that fails because of a multidimensional key (e.g., a DataFrame) then the result will be passed to __getitem__().
kwargs : dict
Returns: q : DataFrame or Series
See also
Notes
This method uses the top-level pandas.eval() function (not Python's built-in eval()) to evaluate the passed query.
The query() method uses a slightly modified Python syntax by default. For example, the & and | (bitwise) operators have the precedence of their boolean cousins, and and or. The resulting expressions are still syntactically valid Python, but their semantics differ from plain Python.
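As a minimal sketch of the precedence difference, the example below compares an unparenthesized query string with the equivalent boolean mask written in plain Python. The frame and its values are made up for illustration and are not part of the original documentation.

```python
import pandas as pd

# Hypothetical data, chosen only to illustrate operator precedence.
df = pd.DataFrame({"a": [1, 5, 3, 8], "b": [4, 2, 6, 1]})

# Inside query(), & binds as loosely as `and`, so the comparisons
# do not need to be wrapped in parentheses.
via_query = df.query("a > 2 & b < 5")

# In plain Python, & binds more tightly than > and <, so each
# comparison has to be parenthesized explicitly.
via_mask = df[(df["a"] > 2) & (df["b"] < 5)]

# Both approaches select the same rows.
assert via_query.equals(via_mask)
```

Writing the query with and instead of & ('a > 2 and b < 5') should select the same rows under the default parser.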
You can change the semantics of the expression by passing the keyword argument parser='python'. This enforces the same semantics as evaluation in Python space. Likewise, you can pass engine='python' to evaluate an expression using Python itself as a backend. This is not recommended as it is inefficient compared to using numexpr as the engine.
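The following sketch passes parser='python' and engine='python' together. Because the Python parser keeps ordinary Python precedence, the comparisons are parenthesized explicitly here. The data is hypothetical and the example is meant as an illustration, not as a recommended configuration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [1, 5, 3, 8], "b": [4, 2, 6, 1]})

# With parser='python', & keeps its normal (tight) Python precedence,
# so each comparison is wrapped in parentheses, as in ordinary code.
python_semantics = df.query(
    "(a > 2) & (b < 5)",
    parser="python",   # Python precedence rules
    engine="python",   # evaluate with Python itself instead of numexpr
)

# The same selection written as an ordinary boolean mask.
expected = df[(df["a"] > 2) & (df["b"] < 5)]

assert python_semantics.equals(expected)
```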
The index and columns attributes of the DataFrame instance are placed in the namespace by default, which allows you to treat both the index and the columns of the frame as columns in the frame. The identifier index is used for the frame's index, and you can also use the name of the index to identify it in a query.
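A short sketch of referring to the index inside a query, first through the identifier index and then through the index's own name. The index name "row" and the values are assumptions made for this example.

```python
import pandas as pd

# Hypothetical frame whose index is named "row".
df = pd.DataFrame(
    {"a": [10, 20, 30, 40]},
    index=pd.Index([0, 1, 2, 3], name="row"),
)

# Refer to the index through the generic identifier `index` ...
by_keyword = df.query("index >= 2")

# ... or through the name given to the index.
by_name = df.query("row >= 2")

# Index and column conditions can be mixed in one expression.
combined = df.query("index >= 2 and a < 40")

assert by_keyword.equals(by_name)
```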
For further details and examples see the query documentation in indexing.
Examples
>>> from numpy.random import randn
>>> from pandas import DataFrame
>>> df = DataFrame(randn(10, 2), columns=list('ab'))
>>> df.query('a > b')
>>> df[df.a > df.b]  # same result as the previous expression