pandas - apply replace function with condition row-wise

Fabio Lamanna Published at Dev

Starting with this dataframe df:

     0     1     2
02  en    it  None
03  en  None  None
01  nl    en   fil

There are some missing values. I'm trying to apply a replace function row-wise, e.g. in pseudocode:

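def replace(row):
    # Both strings must appear somewhere in the row, in any column
    if 'fil' in row.values and 'nl' in row.values:
        row = row.replace('nl', '')
    return row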

I know that I can do something like:

df.apply(f, axis=1)

with a function f defined like:

def f(x):
    if x[0] == 'nl' and x[2] == 'fil':
        x[0] = ''
    return x

obtaining:

     0     1     2
02  en    it  None
03  en  None  None
01        en   fil

However, I don't know in advance which columns the strings will appear in, so I need to search for them with something like the isin method, but row-wise.

EDIT: every string can appear anywhere throughout the columns.

tmthydvnprt

Boolean Indexing and Text Comparison in Pandas

You could create a boolean index based on string comparisons, like this:

df['0'].str.contains('nl') & df['2'].str.contains('fil')

or, since you updated the question to say that the strings can appear in any column:

df.isin(['fil']).any(axis=1) & df.isin(['nl']).any(axis=1)

Here is the test case:

import pandas as pd
from io import StringIO  # Python 3 replacement for cStringIO

text_file = '''
     0     1     2
02  en    it  None
03  en  None  None
01  nl    en   fil
'''

# Read in tabular data
df = pd.read_table(StringIO(text_file), sep=r'\s+')
print('Original Data:')
print(df)
print()

# Create boolean index based on text comparison
boolIndx = df.isin(['nl']).any(axis=1) & df.isin(['fil']).any(axis=1)
print('Example Boolean index:')
print(boolIndx)
print()

# Replace string based on boolean assignment
df.loc[boolIndx] = df.loc[boolIndx].replace('nl', '')
print('Filtered Data:')
print(df)
print()

Original Data:
    0     1     2
2  en    it  None
3  en  None  None
1  nl    en   fil

Example Boolean index:
2    False
3    False
1     True
dtype: bool

Filtered Data:
    0     1     2
2  en    it  None
3  en  None  None
1        en   fil
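
The same boolean-index idea extends to any number of required strings. The following is a minimal sketch rather than part of the original answer: the helper name replace_where_all_present and its parameters are made up for illustration, the frame is built directly instead of being parsed from text, and the row labelled 04 is a hypothetical extra row added to show that 'nl' is left alone when 'fil' is absent.

```python
import pandas as pd


def replace_where_all_present(df, required, target, replacement=''):
    """Return a copy of df where `target` is replaced by `replacement`,
    but only in rows that contain every string in `required`
    (in any column). Hypothetical helper, for illustration only."""
    # Start with every row selected, then narrow down one string at a time
    mask = pd.Series(True, index=df.index)
    for value in required:
        mask &= df.isin([value]).any(axis=1)

    out = df.copy()
    # Exact-match replacement, restricted to the selected rows
    out.loc[mask] = out.loc[mask].replace(target, replacement)
    return out


df = pd.DataFrame(
    [['en', 'it', None],
     ['en', None, None],
     ['nl', 'en', 'fil'],
     ['nl', 'de', None]],   # hypothetical row: 'nl' without 'fil'
    index=['02', '03', '01', '04'],
    columns=[0, 1, 2],
)

result = replace_where_all_present(df, required=['nl', 'fil'], target='nl')
print(result)
```

Because isin compares whole cell values, this matches 'nl' exactly instead of as a substring, and missing values simply count as non-matches. The helper works on a copy, so the original frame is left unchanged.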
