python - Pandas str.extract: AttributeError: 'str' object has no attribute 'str'

- August 15, 2011

I'm trying to repurpose a function that uses split so that it uses str.extract (regex) instead.

    def bull_lev(x):
        # Second-to-last whitespace-separated token, e.g. "x3" -> "3"
        spl = x.rsplit(None, 2)[-2].strip("xX")
        if spl.isdigit():
            return "+" + spl + "00"
        return "+100"

    def bear_lev(x):
        spl = x.rsplit(None, 2)[-2].strip("xX")
        if spl.isdigit():
            return "-" + spl + "00"
        return "-100"

    df["leverage"] = df["name"].map(lambda x: bull_lev(x)
        if "bull" in x else bear_lev(x) if "bear" in x else "+100")

I am using a pandas DataFrame to handle the data:

    import pandas as pd

    df = pd.DataFrame(["bull axp un x3 von", "bear estox 12x s"], columns=["name"])

Desired output:

    name                    leverage
    "bull axp un x3 von"    "+300"
    "bear estox 12x s"      "-1200"

Faulty regex attempt for the "bull" case:

    def bull_lev(x):
        #spl = x.rsplit(None, 2)[-2].strip("xX")
        spl = x.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE).strip("x")
        if spl.str.isdigit():
            return "+" + spl + "00"
        return "+100"

    df["leverage"] = df["name"].map(lambda x: bull_lev(x)
        if "bull" in x else bear_lev(x) if "bear" in x else "+100")

This produces the error:

    Traceback (most recent call last):
      File "toolkit.py", line 128, in <module>
        df["leverage"] = df["name"].map(lambda x: bull_lev(x)
      File "/python/virtual/py2710/lib/python2.7/site-packages/pandas/core/series.py", line 2016, in map
        mapped = map_f(values, arg)
      File "pandas/src/inference.pyx", line 1061, in pandas.lib.map_infer (pandas/lib.c:58435)
      File "toolkit.py", line 129, in <lambda>
        if "bull" in x else bear_lev(x) if "bear" in x else "+100")
      File "toolkit.py", line 123, in bear_lev
        spl = x.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE).strip("x")
    AttributeError: 'str' object has no attribute 'str'

I assume this is because str.extract captures a list, while split works directly on the string?

You can handle the positive case using the following:

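    In [150]:
    import re
    # expand=False makes extract return a Series (recent pandas versions
    # return a DataFrame by default), so the .str accessor can be chained
    df['fundleverage'] = '+' + df['name'].str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x') + '00'
    df

    Out[150]:
                     name fundleverage
    0  bull axp un x3 von         +300
    1    bull estox x12 s        +1200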

You can use np.where to handle both cases in a one-liner:

    In [151]:
    import numpy as np
    # Condition: the extracted token is numeric once the "x" is stripped
    df['fundleverage'] = np.where(
        df['name'].str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x').str.isdigit(),
        '+' + df['name'].str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x') + '00',
        '+100')
    df

    Out[151]:
                     name fundleverage
    0  bull axp un x3 von         +300
    1    bull estox x12 s        +1200

So the above uses the vectorised str methods strip, extract and isdigit to achieve what you want.
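For reference, the AttributeError in the question arises because Series.map calls the function once per element, so x inside bull_lev is a plain Python str, and the .str accessor exists only on a pandas Series. If a per-element function is still preferred, a minimal sketch using the standard re module could look like the following. The names LEVERAGE_RE and lev, and the sign parameter, are illustrative assumptions and not part of the original code.

```python
import re

import pandas as pd

# Same pattern as above, compiled once; LEVERAGE_RE is a hypothetical name.
LEVERAGE_RE = re.compile(r"(x\d+|\d+x)\s", flags=re.IGNORECASE)


def lev(name, sign):
    """Return e.g. "+300" for a single name string (not a Series)."""
    match = LEVERAGE_RE.search(name)
    if match is None:
        return sign + "100"  # no multiplier found: default leverage
    return sign + match.group(1).strip("xX") + "00"


df = pd.DataFrame(["bull axp un x3 von", "bear estox 12x s"], columns=["name"])

# Inside map, x is a plain str, so ordinary string and regex functions apply.
df["leverage"] = df["name"].map(
    lambda x: lev(x, "+") if "bull" in x
    else lev(x, "-") if "bear" in x
    else "+100"
)
print(df)
```

This keeps the shape of the original map-based code; the vectorised version above does the same work without a Python-level lambda.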

Update

After you changed your requirements (which you should not do, for future reference), you can mask the df for the bull and bear cases:

    In [189]:
    import re
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(["bull axp un x3 von", "bear estox 12x s"], columns=["name"])
    # Names of the rows matching each case
    bull_mask_name = df.loc[df['name'].str.contains('bull', case=False), 'name']
    bear_mask_name = df.loc[df['name'].str.contains('bear', case=False), 'name']
    # expand=False makes extract return a Series so .str can be chained
    df.loc[df['name'].str.contains('bull', case=False), 'fundleverage'] = np.where(
        bull_mask_name.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x').str.isdigit(),
        '+' + bull_mask_name.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x') + '00',
        '+100')
    df.loc[df['name'].str.contains('bear', case=False), 'fundleverage'] = np.where(
        bear_mask_name.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x').str.isdigit(),
        '-' + bear_mask_name.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False).str.strip('x') + '00',
        '-100')
    df

    Out[189]:
                     name fundleverage
    0  bull axp un x3 von         +300
    1    bear estox 12x s        -1200
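The masked version above runs the same extract several times. As a rough sketch, and only under the assumption that the rules are "bull gives +, bear gives -, anything else gives +100", the multiplier can be extracted once and the sign derived separately. The function name leverage_column and the sample rows without a multiplier are made up for illustration.

```python
import re

import numpy as np
import pandas as pd

PATTERN = r"(x\d+|\d+x)\s"  # same pattern as in the answer above


def leverage_column(names):
    """Build leverage strings such as "+300" from a Series of names."""
    # expand=False returns a Series; rows without a match become NaN.
    factor = names.str.extract(PATTERN, flags=re.IGNORECASE, expand=False)
    factor = factor.str.strip("xX")

    is_bull = names.str.contains("bull", case=False)
    is_bear = names.str.contains("bear", case=False)

    # No multiplier found, or neither bull nor bear: fall back to factor 1.
    factor = factor.where(factor.notna() & (is_bull | is_bear), "1")

    # Bull is checked first, mirroring the order of the original lambda.
    sign = pd.Series(np.where(is_bear & ~is_bull, "-", "+"), index=names.index)
    return sign + factor + "00"


df = pd.DataFrame(
    ["bull axp un x3 von", "bear estox 12x s", "bull plain fund", "other x5 fund"],
    columns=["name"],
)
df["fundleverage"] = leverage_column(df["name"])
print(df)
```

Because the pattern only ever captures digits next to an x, the isdigit check is not needed here; a missing match is handled through notna instead.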
