python - Pandas: Conditionally generate descriptions from column content
I am trying to iron out issues with a function that uses pandas regex via str.extract on each row of the column "name" to generate a column "description". I am using regex, not split, since the code must be able to manage a variety of formattings.
The function must be modified to acknowledge various conditions.
DataFrame:

    import pandas as pd
    import re

    df = pd.DataFrame(["long axp un x3 von", "short bidu un 5x von", "short goog von", "long goog von"], columns=["name"])

Input:

    name
    "long axp un x3 von"
    "short bidu un 5x von"
    "short goog von"
    "long goog von"

Current code:

    description_map = {"axp": "american express", "bidu": "baidu"}
    sign_map = {"long": "", "short": "-"}

    def f(strseries):
        # expand=False makes str.extract return a Series, so .map() and string
        # concatenation work (newer pandas versions return a DataFrame by default).
        stock = strseries.str.extract(r"\s(\S+)\s", expand=False).map(description_map)
        leverage = strseries.str.extract(r"(X\d+|\d+X)\s", flags=re.IGNORECASE, expand=False)
        sign = strseries.str.extract(r"(\S+)\s", expand=False).map(sign_map)
        return "tracks " + stock + " " + sign + leverage + " leverage"

    df["description"] = f(df["name"])

Current output:

    name                      description
    "long axp un x3 von"      "tracks american express x3 leverage"
    "short bidu un 5x von"    "tracks baidu -5x leverage"
    "short goog von"          ""
    "long goog von"           ""

Desired output:

    name                      description
    "long axp un x3 von"      "tracks american express 3x leverage"
    "short bidu un 5x von"    "tracks baidu inversely -5x leverage"
    "short goog von"          "tracks inversely"
    "long goog von"           "tracks"

Implications:
- If sign is "-", how can I make it add a direction = "inversely" string?
- If no stock is matched in the name dictionary description_map: set stock = "" and return the string.
- If no leverage is found in name: ignore the part "with" + sign + leverage + " leverage".
- Split and reorder sign + leverage so that it always displays in the order "-5x", regardless of whether the input was "short x5".
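
The reordering requirement in the last point can be sketched with a single pattern that captures the digits on either side of the "x". This is only a minimal sketch, not part of the code above: the names LEVERAGE_RE and normalize_leverage are hypothetical, and it assumes the leverage token is a separate word in the name.

```python
import re

# Hypothetical helper, not from the original post.
# Matches "x3", "3x", "X3" or "3X" as a standalone word.
LEVERAGE_RE = re.compile(r"\b(?:x(\d+)|(\d+)x)\b", re.IGNORECASE)

def normalize_leverage(name, sign=""):
    """Return the leverage as '<sign><digits>x', or '' if none is found."""
    match = LEVERAGE_RE.search(name)
    if match is None:
        return ""
    # Exactly one of the two groups holds the digits.
    digits = match.group(1) or match.group(2)
    return f"{sign}{digits}x"

print(normalize_leverage("long axp un x3 von"))
print(normalize_leverage("short bidu un 5x von", sign="-"))
```

Because the digits are captured separately from the "x", the output order no longer depends on the input order.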
I spent some time writing this function:

    description_map = {"axp": "american express", "bidu": "baidu"}
    sign_map = {"long": "", "short": "-"}

    stock_match = re.compile(r"\s(\S+)\s")
    leverage_match = re.compile("[0-9]x|x[0-9]|X[0-9]|[0-9]X")

    def f(value):
        # Stock description for the second word, or '' if it is not in description_map.
        f1 = lambda x: description_map[stock_match.findall(x)[0]] if stock_match.findall(x)[0] in description_map else ''
        # First leverage token ("x3", "5x", ...), or '' if there is none.
        f2 = lambda x: leverage_match.findall(x)[0] if len(leverage_match.findall(x)) > 0 else ''
        # '-' for short positions, '' otherwise.
        f3 = lambda x: '-' if 'short' in x else ''

        stock = f1(value)
        leverage = f2(value)
        sign = f3(value)
        statement = "tracks " + stock

        # Unknown stock: only report the direction.
        if stock == "":
            if sign == '-':
                return statement + "{}".format('inversely')
            else:
                return "tracks"

        # Reorder "x3" to "3x"; the emptiness check avoids an IndexError
        # when a known stock has no leverage token.
        if leverage != '' and leverage[0].replace('X', 'x') == 'x':
            leverage = leverage[1] + leverage[0].replace('X', 'x')

        if leverage != '' and sign == '-':
            statement += " {} {}{} leverage".format('inversely', sign, leverage)
        elif leverage != '' and sign == '':
            statement += " {} leverage".format(leverage)
        else:
            if sign == '-':
                statement += " {}".format('inversely')
        return statement

    df["description"] = df["name"].map(lambda x: f(x))

Output:

    In [97]: %paste
    import pandas as pd
    import re
    df = pd.DataFrame(["long axp un x3 von", "short bidu un 5x von", "short goog von", "long goog von"], columns=["name"])
    ## -- End pasted text --

    In [98]: df
    Out[98]:
                       name
    0    long axp un x3 von
    1  short bidu un 5x von
    2        short goog von
    3         long goog von

    In [99]: df["description"] = df["name"].map(lambda x: f(x))

    In [100]: df
    Out[100]:
                       name                          description
    0    long axp un x3 von  tracks american express 3x leverage
    1  short bidu un 5x von  tracks baidu inversely -5x leverage
    2        short goog von                     tracks inversely
    3         long goog von                               tracks
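
For comparison, here is a sketch of a variant that keeps the str.extract approach from the question for pulling out the pieces and only assembles the sentence row by row. It is not the function above: describe and _build are hypothetical names, and it assumes the first word is the direction, the second word is the ticker, and (like the function above) that leverage is only reported when the stock is known.

```python
import pandas as pd

description_map = {"axp": "american express", "bidu": "baidu"}

def _build(stock_name, is_short, digits):
    # Hypothetical helper: join only the parts that are present.
    words = ["tracks"]
    known = pd.notna(stock_name)
    if known:
        words.append(stock_name)
    if is_short:
        words.append("inversely")
    # Assumption: leverage is only mentioned for a known stock.
    if known and pd.notna(digits):
        words.append(f"{'-' if is_short else ''}{digits}x leverage")
    return " ".join(words)

def describe(names):
    # First word: direction. Second word: ticker.
    side = names.str.extract(r"^(\S+)", expand=False).str.lower()
    ticker = names.str.extract(r"^\S+\s+(\S+)", expand=False).str.lower()
    # Two capture groups: digits after the "x", or digits before it.
    lev = names.str.extract(r"(?i)\b(?:x(\d+)|(\d+)x)\b")
    digits = lev[0].fillna(lev[1])
    stock = ticker.map(description_map)
    rows = zip(stock, side.eq("short"), digits)
    return pd.Series([_build(*row) for row in rows], index=names.index)

df = pd.DataFrame(
    ["long axp un x3 von", "short bidu un 5x von", "short goog von", "long goog von"],
    columns=["name"],
)
df["description"] = describe(df["name"])
print(df)
```

Collecting the parts in a list and joining them avoids the NaN propagation that blanks out whole rows when the strings are concatenated with "+".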