I want to filter the following dataframe so that it returns only the rows that contain both a pie and a non-pie item:
ID | set |
---|---|
1 | apple pie, banana loaf |
2 | banana pie, apple pie |
3 | banana loaf, apple tart |
Thus, the expected output would be:
ID | set |
---|---|
1 | apple pie, banana loaf |
Note that every set in the set column contains exactly two items.
What I have tried so far:
df[(any("pie" in s for s in df['set'])) & (any("pie" not in s for s in df['set']))]
I suspect I am breaking a pandas dataframe filtering convention, but I am not sure which one.
Any help appreciated!
You could use apply on your dataframe:
# Keep rows where exactly one item contains "pie" (assumes each cell is a list of two strings)
df[df['set'].apply(lambda x: sum("pie" in s for s in x) == 1)]
Results:
   ID                       set
0   1  [apple pie, banana loaf]
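As for why the attempt in the question does not work: any(...) iterates over the whole column and collapses to a single True/False, so there is no per-row boolean mask to index the dataframe with. Below is a self-contained sketch of the per-row approach. It assumes each cell in the set column is a list of strings (as the output above suggests), and the helper name has_pie_and_non_pie is made up for illustration.

```python
import pandas as pd

# Sample data from the question; each "set" cell is assumed to be a list of strings.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "set": [
        ["apple pie", "banana loaf"],
        ["banana pie", "apple pie"],
        ["banana loaf", "apple tart"],
    ],
})


def has_pie_and_non_pie(items):
    """Return True if at least one item mentions "pie" and at least one does not."""
    flags = ["pie" in item for item in items]
    return any(flags) and not all(flags)


# apply() calls the helper once per row, giving a boolean Series aligned with df.
mask = df["set"].apply(has_pie_and_non_pie)
result = df[mask]
print(result)
```

Unlike the "exactly one pie" check, this version does not rely on every set having exactly two items. If the cells are comma-separated strings rather than lists, splitting them first, e.g. with df["set"].str.split(", "), should give the list form this sketch expects.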