R data.table (<= 1.9.4) join behaviour
I am using R and data.table and, after some time, I still have an issue with joins. I asked this question, which resulted in a satisfactory explanation, but I still do not get the logic. Let's consider a few examples:
    library("data.table")
    x <- data.table(chiave=c("a", "a", "a", "b", "b"), valore1=1:5)
    y <- data.table(chiave=c("a", "b", "c", "d"), valore2=1:4)
    x
       chiave valore1
    1:      a       1
    2:      a       2
    3:      a       3
    4:      b       4
    5:      b       5
    y
       chiave valore2
    1:      a       1
    2:      b       2
    3:      c       3
    4:      d       4
When I join them I get an error:

    setkey(x, chiave)
    x[y]
    # Error in vecseq(f__, len__, if (allow.cartesian || notjoin) NULL else as.integer(max(nrow(x), :
    #   Join results in 7 rows; more than 5 = max(nrow(x),nrow(i)). Check for duplicate key values
    #   in i, each of which join to the same group in x over and over again. If that's ok, try
    #   including `j` and dropping `by` (by-without-by) so that j runs for each group to avoid the
    #   large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE.
    #   Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and
    #   datatable-help for advice.
So:

    x[y, allow.cartesian=TRUE]
       chiave valore1 valore2
    1:      a       1       1
    2:      a       2       1
    3:      a       3       1
    4:      b       4       2
    5:      b       5       2
    6:      c      NA       3
    7:      d      NA       4
Please note that `x` has duplicate keys and `y` (the `i` argument) doesn't. If I change `y` to:
    y <- data.table(chiave=c("b", "c", "d"), valore2=1:3)
    y
       chiave valore2
    1:      b       1
    2:      c       2
    3:      d       3
the join is done with no error message and no need for `allow.cartesian`, although logically the situation is the same: `x` has multiple keys and `y` doesn't.
    x[y]
       chiave valore1 valore2
    1:      b       4       1
    2:      b       5       1
    3:      c      NA       2
    4:      d      NA       3
On the other hand:
    x <- data.table(chiave=c("a", "a", "a", "a", "a", "a", "b", "b"), valore1=1:8)
    y <- data.table(chiave=c("b", "b", "d"), valore2=1:3)
    x
       chiave valore1
    1:      a       1
    2:      a       2
    3:      a       3
    4:      a       4
    5:      a       5
    6:      a       6
    7:      b       7
    8:      b       8
    y
       chiave valore2
    1:      b       1
    2:      b       2
    3:      d       3
Here I have multiple keys in both `x` and `y`, and the join (and the cartesian product) is done with no error message and no need for `allow.cartesian`:
    setkey(x, chiave)
    x[y]
       chiave valore1 valore2
    1:      b       7       1
    2:      b       8       1
    3:      b       7       2
    4:      b       8       2
    5:      d      NA       3
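The three outcomes are consistent with a pure size check, which is what the error message describes: the join is refused when it would produce more rows than `max(nrow(x), nrow(i))`, regardless of where the duplicates are. A minimal base R sketch of that arithmetic follows. The helpers `join_rows` and `trips_check` are hypothetical names, not part of data.table, and the sketch assumes the 1.9.4 check is exactly the comparison quoted in the error message.

```r
# Hypothetical helper (not part of data.table): number of rows x[i] would
# return with the default nomatch=NA, computed from the key columns alone.
join_rows <- function(x_keys, i_keys) {
  per_row <- vapply(
    i_keys,
    # each row of i yields one row per match in x, or a single NA row if unmatched
    function(k) max(1L, sum(x_keys == k)),
    integer(1)
  )
  sum(per_row)
}

# Assumed form of the check: TRUE when the result exceeds max(nrow(x), nrow(i))
trips_check <- function(x_keys, i_keys) {
  join_rows(x_keys, i_keys) > max(length(x_keys), length(i_keys))
}

ex1 <- trips_check(c("a", "a", "a", "b", "b"), c("a", "b", "c", "d"))  # 7 rows vs 5
ex2 <- trips_check(c("a", "a", "a", "b", "b"), c("b", "c", "d"))       # 4 rows vs 5
ex3 <- trips_check(c(rep("a", 6), "b", "b"), c("b", "b", "d"))         # 5 rows vs 8
```

Under this assumption only the first example trips the check, even though the third is the only one with duplicate keys on both sides.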
From my point of view, I need to be warned if and only if I have multiple keys in both `x` and `y` (not if the resulting table has more rows than `max(nrow(x),nrow(i))`), and only in that case do I see the need for `allow.cartesian` (so not in the first two examples).
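The criterion proposed here can be written down directly. This is only a sketch of the asker's rule in base R, not how data.table implements anything: `has_many_to_many` is a hypothetical helper, and it reads "multiple keys in both" as "some key value is repeated in both tables".

```r
# Hypothetical helper (not part of data.table): TRUE when at least one key
# value occurs more than once in both key vectors.
has_many_to_many <- function(x_keys, i_keys) {
  dup_x <- unique(x_keys[duplicated(x_keys)])  # key values repeated in x
  dup_i <- unique(i_keys[duplicated(i_keys)])  # key values repeated in i
  length(intersect(dup_x, dup_i)) > 0
}

first  <- has_many_to_many(c("a", "a", "a", "b", "b"), c("a", "b", "c", "d"))
second <- has_many_to_many(c("a", "a", "a", "b", "b"), c("b", "c", "d"))
third  <- has_many_to_many(c(rep("a", 6), "b", "b"), c("b", "b", "d"))
```

With this rule only the third example would ask for `allow.cartesian`, which is the behaviour argued for above.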
Just to keep this answered: this behaviour of `allow.cartesian` has been fixed in the current development version, v1.9.5, and will be available on CRAN as v1.9.6. Odd versions are devel, and even ones stable. From NEWS: