macpie.group_by_keep_one
- macpie.group_by_keep_one(dset: Dataset, keep: str = 'all', drop_duplicates: bool = False) -> None
Given a Dataset object, group on the Dataset.id2_col_name column and keep only the earliest or latest row in each group, as determined by the date in the Dataset.date_col_name column.
This is the Dataset analog of macpie.pandas.group_by_keep_one().
- Parameters:
- dset: Dataset
- keep: {‘all’, ‘earliest’, ‘latest’}, default ‘all’
Specify which row of each group to keep.
all: keep all rows
earliest: in each group, keep only the earliest (i.e. oldest) row
latest: in each group, keep only the latest (i.e. most recent) row
- drop_duplicates: bool, default: False
If True, then if more than one row is determined to be ‘earliest’ or ‘latest’ in each group, drop all duplicates except the first occurrence. If dset has an id_col_name, then that column will also be used for identifying duplicates.
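The keep and drop_duplicates semantics described above can be illustrated with a minimal sketch. It is not macpie's implementation: it works on plain Python records rather than a Dataset, the helper keep_one_sketch and the column names pidn and dcdate are hypothetical, and it assumes that "duplicates" means rows tied on the same group and date. The id_col_name handling is not modeled. The documented function returns None, whereas this sketch returns a new list so the result is easy to inspect.

```python
from datetime import date


def keep_one_sketch(rows, group_key, date_key, keep="all", drop_duplicates=False):
    """Illustrative only: mimic the documented semantics on a list of dicts."""
    if keep == "all":
        return list(rows)
    if keep not in ("earliest", "latest"):
        raise ValueError("keep must be 'all', 'earliest', or 'latest'")
    pick = min if keep == "earliest" else max

    # Find the target (earliest or latest) date for each group.
    target = {}
    for row in rows:
        group = row[group_key]
        if group in target:
            target[group] = pick(target[group], row[date_key])
        else:
            target[group] = row[date_key]

    # Keep rows whose date matches their group's target date, in original order.
    kept = [row for row in rows if row[date_key] == target[row[group_key]]]

    if drop_duplicates:
        # Assumption: tied rows share the same group and date; keep the first one.
        seen = set()
        deduped = []
        for row in kept:
            if row[group_key] not in seen:
                seen.add(row[group_key])
                deduped.append(row)
        kept = deduped
    return kept


# Hypothetical data: "pidn" plays the role of id2_col_name, "dcdate" of date_col_name.
rows = [
    {"pidn": 1, "dcdate": date(2020, 1, 5), "score": 10},
    {"pidn": 1, "dcdate": date(2021, 3, 1), "score": 12},
    {"pidn": 1, "dcdate": date(2021, 3, 1), "score": 13},
    {"pidn": 2, "dcdate": date(2019, 7, 9), "score": 7},
]

latest = keep_one_sketch(rows, "pidn", "dcdate", keep="latest")
latest_unique = keep_one_sketch(rows, "pidn", "dcdate", keep="latest", drop_duplicates=True)
earliest = keep_one_sketch(rows, "pidn", "dcdate", keep="earliest")
```

With keep='latest' alone, both rows dated 2021-03-01 for group 1 survive because they tie. Adding drop_duplicates=True leaves only the first of them.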