October 2017
Intermediate to advanced
532 pages
16h 10m
English
For many operations, pandas offers more than one way to do the same thing. In the preceding recipe, the salary criterion uses two separate boolean expressions. As in SQL, Series have a between method, so the salary criterion can be written equivalently as follows:
>>> criteria_sal = employee.BASE_SALARY.between(80000, 120000)
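As a quick check of that equivalence, the following minimal sketch compares the two forms on a tiny made-up DataFrame. The data is hypothetical and only stands in for the employee dataset, and the two-expression form is assumed to use inclusive bounds (>= and <=), which matches the default behavior of between.

```python
import pandas as pd

# Hypothetical stand-in for the employee dataset; the salaries are made up.
employee = pd.DataFrame({
    "BASE_SALARY": [45000, 80000, 95000, 120000, 150000],
})

# Two separate boolean expressions combined with & (assumed inclusive bounds)
criteria_two = (employee.BASE_SALARY >= 80000) & (employee.BASE_SALARY <= 120000)

# Single call; between includes both endpoints by default
criteria_sal = employee.BASE_SALARY.between(80000, 120000)

# True when both boolean Series match element by element
same = criteria_two.equals(criteria_sal)

# Salaries that satisfy the criterion, endpoints included
selected = employee[criteria_sal].BASE_SALARY.tolist()
```

Because both endpoints are included by default, the rows at exactly 80000 and 120000 are selected along with the values in between.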
Another useful application of isin is to pass it a sequence of values generated automatically by other pandas statements. This avoids manually searching for the exact string names to store in a list. For example, let's exclude the rows from the five most frequently occurring departments:
>>> top_5_depts = employee.DEPARTMENT.value_counts().index[:5]
>>> criteria = ~employee.DEPARTMENT.isin(top_5_depts) ...
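The same pattern can be tried end to end on a small example. This is only a sketch: the department names and counts below are invented for illustration and are not taken from the employee dataset.

```python
import pandas as pd

# Hypothetical department column; names and frequencies are made up.
employee = pd.DataFrame({
    "DEPARTMENT": (
        ["Police"] * 6 + ["Fire"] * 5 + ["Public Works"] * 4
        + ["Health"] * 3 + ["Library"] * 2 + ["Parks", "Legal"]
    )
})

# value_counts sorts by frequency in descending order, so the first
# five index labels are the five most common departments
top_5_depts = employee.DEPARTMENT.value_counts().index[:5]

# ~ inverts the boolean Series, keeping rows NOT in the top five
criteria = ~employee.DEPARTMENT.isin(top_5_depts)

# Departments of the rows that survive the filter
remaining = employee[criteria].DEPARTMENT.tolist()
```

Only the rows from the two least common departments remain, and no department name had to be typed into a list by hand.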