python - Pandas: Conditionally generate descriptions from column content
I am trying to iron out issues with a function that uses pandas and regex via str.extract. For each row in column "name" it should generate a column "description". I am using regex, not split, since the code must be able to manage a variety of formattings. The function must be modified to acknowledge various conditions.
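To illustrate why a regex tends to be more forgiving than split here, a minimal sketch. The second sample string is a hypothetical variation (the "un" token is dropped) invented for illustration, and `expand=False` is passed so that `str.extract` returns a Series:

```python
import re
import pandas as pd

# First string is from the question; the second is a made-up variant
# without the "un" token, so the leverage token shifts position.
names = pd.Series(["long axp un x3 von", "short bidu 5x von"])

# Positional split: assumes the leverage token is always the 4th word
by_position = names.str.split().str[3]

# Regex: finds the leverage token wherever it sits in the string
by_regex = names.str.extract(r"\b(x\d+|\d+x)\b", flags=re.IGNORECASE, expand=False)

print(by_position.tolist())
print(by_regex.tolist())
```

The positional lookup picks up "von" for the shifted string, while the pattern still finds "5x".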
dataframe:
    import pandas as pd
    import re

    df = pd.DataFrame(
        ["long axp un x3 von", "short bidu un 5x von", "short goog von", "long goog von"],
        columns=["name"],
    )
input:
    name
    "long axp un x3 von"
    "short bidu un 5x von"
    "short goog von"
    "long goog von"
current code:
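    description_map = {"axp": "american express", "bidu": "baidu"}
    sign_map = {"long": "", "short": "-"}

    def f(strseries):
        # expand=False makes str.extract return a Series, so .map() works
        # Second whitespace-delimited token is the ticker
        stock = strseries.str.extract(r"\s(\S+)\s", expand=False).map(description_map)
        # Leverage token written as "x3" or "3x"
        leverage = strseries.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False)
        # First token is the direction
        sign = strseries.str.extract(r"(\S+)\s", expand=False).map(sign_map)
        return "tracks " + stock + " " + sign + leverage + " leverage"

    df["description"] = f(df["name"])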
current output:
    name                      description
    "long axp un x3 von"      "tracks american express x3 leverage"
    "short bidu un 5x von"    "tracks baidu -5x leverage"
    "short goog von"          ""
    "long goog von"           ""
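A plausible explanation for the two empty descriptions, assuming the blank cells stand for missing values: `str.extract` yields NaN when the pattern does not match, `map` yields NaN for keys absent from the dictionary, and concatenating a string with NaN gives NaN for the whole row. A small sketch of that behaviour (the variable names are illustrative only):

```python
import re
import pandas as pd

name = pd.Series(["long axp un x3 von", "short goog von"])

# The second row has no leverage token, so the extract is NaN there
leverage = name.str.extract(r"(x\d+|\d+x)\s", flags=re.IGNORECASE, expand=False)

# Concatenation propagates the NaN to the whole result for that row
combined = "tracks " + leverage + " leverage"

# Filling the gap first keeps the rest of the string intact
filled = "tracks " + leverage.fillna("") + " leverage"

print(combined.isna().tolist())
print(filled.isna().tolist())
```

Filling each intermediate Series with an empty string before concatenating is one way to keep partially matched rows.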
desired output:
    name                      description
    "long axp un x3 von"      "tracks american express 3x leverage"
    "short bidu un 5x von"    "tracks baidu inversely -5x leverage"
    "short goog von"          "tracks inversely"
    "long goog von"           "tracks"
implications:
- If sign is "-", how can I make it add direction = "inversely" to the string?
- If no stock is matched in name against the dictionary description_map: set stock = "" and still return the string.
- If no leverage is found in name: ignore the part "with" + sign + leverage + " leverage".
- Split and reorder sign + leverage so that it always displays in the order "-5x", regardless of whether it was input as "short x5".
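The four conditions above could be sketched in a column-wise style as follows. This is only one possible approach under stated assumptions: the function name `describe` and the two-group leverage pattern are illustrative choices, not part of the original code.

```python
import re
import pandas as pd

description_map = {"axp": "american express", "bidu": "baidu"}

def describe(names: pd.Series) -> pd.Series:
    # Ticker is the second token; unknown tickers become ""
    stock = names.str.extract(r"\s(\S+)\s", expand=False).map(description_map).fillna("")
    # Capture only the digits, whichever side the "x" is on
    lev = names.str.extract(r"\b(?:x(\d+)|(\d+)x)\b", flags=re.IGNORECASE)
    digits = lev[0].fillna(lev[1])  # NaN when there is no leverage token
    short = names.str.contains(r"^\s*short\b", case=False)
    sign = short.map({True: "-", False: ""})
    direction = short.map({True: "inversely", False: ""})
    # Always rendered as sign + digits + "x"; empty when no leverage was found
    leverage = (sign + digits + "x leverage").fillna("")
    # Join only the non-empty pieces with single spaces
    text = [
        " ".join(p for p in ("tracks", s, d, l) if p)
        for s, d, l in zip(stock, direction, leverage)
    ]
    return pd.Series(text, index=names.index)

names = pd.Series([
    "long axp un x3 von",
    "short bidu un 5x von",
    "short goog von",
    "long goog von",
])
print(describe(names).tolist())
```

Capturing only the digits means "x5" and "5x" both come out as "5x", which covers the reordering condition without a separate swap step.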
I spent some time writing this function:
    description_map = {"axp": "american express", "bidu": "baidu"}
    sign_map = {"long": "", "short": "-"}
    stock_match = re.compile(r"\s(\S+)\s")
    # Leverage token with the "x" on either side, in either case
    leverage_match = re.compile(r"[0-9]x|x[0-9]", flags=re.IGNORECASE)

    def f(value):
        # Look up the ticker; unknown or missing tickers give ""
        found = stock_match.findall(value)
        stock = description_map.get(found[0], "") if found else ""
        # First leverage token, or "" when there is none
        found = leverage_match.findall(value)
        leverage = found[0] if found else ""
        sign = "-" if "short" in value else ""

        statement = "tracks " + stock
        if stock == "":
            if sign == "-":
                return statement + "inversely"
            return "tracks"
        # Reorder "x3" to "3x"; the guard avoids indexing an empty string
        if leverage and leverage[0].lower() == "x":
            leverage = leverage[1] + "x"
        if leverage != "" and sign == "-":
            statement += " {} {}{} leverage".format("inversely", sign, leverage)
        elif leverage != "" and sign == "":
            statement += " {} leverage".format(leverage)
        elif sign == "-":
            statement += " inversely"
        return statement

    df["description"] = df["name"].map(f)
output:
    In [97]: %paste
    import pandas as pd
    import re
    df = pd.DataFrame(["long axp un x3 von", "short bidu un 5x von", "short goog von", "long goog von"], columns=["name"])
    ## -- End pasted text --

    In [98]: df
    Out[98]:
                       name
    0    long axp un x3 von
    1  short bidu un 5x von
    2        short goog von
    3         long goog von

    In [99]: df["description"] = df["name"].map(lambda x:f(x))

    In [100]: df
    Out[100]:
                       name                          description
    0    long axp un x3 von  tracks american express 3x leverage
    1  short bidu un 5x von  tracks baidu inversely -5x leverage
    2        short goog von                     tracks inversely
    3         long goog von                               tracks