Distinct by multiple columns¶
R:
mtcars %>% distinct(mpg, am)
This drops all other columns. Pass .keep_all=T to distinct to keep them.
pandas:
mtcars[~mtcars[['mpg','am']].duplicated()]
plydata:
mtcars >> dp.distinct(['mpg', 'am'], 'first')
Must specify keep
dpylthon:
dp.DplyFrame(mtcars) >> dp.sift(~X[['mpg', 'am']].duplicated())
dfply:
mtcars >> dp.filter_by(~X[['mpg', 'am']].duplicated())
dppd:
dp(mtcars).filter_by(~X[['mpg', 'am']].duplicated()).pd