macpie.LavaDataset

class macpie.LavaDataset(*args, **kwargs)

A `Dataset` preconfigured with LAVA defaults. (LAVA is the data management system used at the Memory and Aging Center.) The defaults are:

id_col_name = "InstrID"
date_col_name = "DCDate"
id2_col_name = "PIDN"
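
The defaults listed above can be pictured with a minimal, self-contained sketch. It does not import macpie: `SketchDataset` and `SketchLavaDataset` are hypothetical stand-ins, and the use of `kwargs.setdefault` (which lets a caller override a default) is an assumption about how such a subclass might behave, not a description of macpie's actual implementation.

```python
# Hypothetical stand-ins for illustration only; these are NOT macpie's classes.
class SketchDataset:
    """Minimal stand-in that only records the three key column names."""

    def __init__(self, data=None, id_col_name=None,
                 date_col_name=None, id2_col_name=None):
        self.data = data
        self.id_col_name = id_col_name
        self.date_col_name = date_col_name
        self.id2_col_name = id2_col_name


class SketchLavaDataset(SketchDataset):
    """Fills in the LAVA defaults unless the caller passes its own values."""

    def __init__(self, *args, **kwargs):
        # Default values taken from the list above; setdefault keeps any
        # value the caller supplied explicitly (assumed behavior).
        kwargs.setdefault("id_col_name", "InstrID")
        kwargs.setdefault("date_col_name", "DCDate")
        kwargs.setdefault("id2_col_name", "PIDN")
        super().__init__(*args, **kwargs)


dset = SketchLavaDataset()
custom = SketchLavaDataset(date_col_name="VisitDate")  # hypothetical override

print(dset.id_col_name, dset.date_col_name, dset.id2_col_name)
print(custom.date_col_name)
```

The point of the pattern is that code working with LAVA exports does not need to repeat the three column names on every construction.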

__init__(*args, **kwargs)

Methods

`__init__`(*args, **kwargs)
`abs`()	Return a Series/DataFrame with absolute numeric value of each element.
`add`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator add).
`add_prefix`(prefix)	Prefix labels with string prefix.
`add_suffix`(suffix)	Suffix labels with string suffix.
`add_tag`(tag)	Add a tag to the `Dataset`.
`agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`align`(other[, join, axis, level, copy, ...])	Align two objects on their axes with the specified join method.
`all`([axis, bool_only, skipna, level])	Return whether all elements are True, potentially over an axis.
`any`(*[, axis, bool_only, skipna, level])	Return whether any element is True, potentially over an axis.
`append`(other[, ignore_index, ...])	Append rows of other to the end of caller, returning a new object.
`apply`(func[, axis, raw, result_type, args])	Apply a function along an axis of the DataFrame.
`applymap`(func[, na_action])	Apply a function to a Dataframe elementwise.
`asfreq`(freq[, method, how, normalize, ...])	Convert time series to specified frequency.
`asof`(where[, subset])	Return the last row(s) without any NaNs before where.
`assign`(**kwargs)	Assign new columns to a DataFrame.
`astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`at_time`(time[, asof, axis])	Select values at particular time of day (e.g., 9:30AM).
`backfill`(*[, axis, inplace, limit, downcast])	Synonym for `DataFrame.fillna()` with `method='bfill'`.
`between_time`(start_time, end_time[, ...])	Select values between particular times of the day (e.g., 9:00-9:30 AM).
`bfill`(*[, axis, inplace, limit, downcast])	Synonym for `DataFrame.fillna()` with `method='bfill'`.
`bool`()	Return the bool of a single element Series or DataFrame.
`boxplot`([column, by, ax, fontsize, rot, ...])	Make a box plot from DataFrame columns.
`clear_tags`()	Clear all tags.
`clip`([lower, upper, axis, inplace])	Trim values at input threshold(s).
`combine`(other, func[, fill_value, overwrite])	Perform column-wise combine with another DataFrame.
`combine_first`(other)	Update null elements with value in the same location in other.
`compare`(other[, align_axis, keep_shape, ...])	Compare to another DataFrame and show the differences.
`convert_dtypes`([infer_objects, ...])	Convert columns to best possible dtypes using dtypes supporting `pd.NA`.
`copy`([deep])	Make a copy of this object's indices and data.
`corr`([method, min_periods, numeric_only])	Compute pairwise correlation of columns, excluding NA/null values.
`corrwith`(other[, axis, drop, method, ...])	Compute pairwise correlation.
`count`([axis, level, numeric_only])	Count non-NA cells for each column or row.
`cov`([min_periods, ddof, numeric_only])	Compute pairwise covariance of columns, excluding NA/null values.
`create_id_col`([col_name, start_index])	Create `id_col_name` with sequential numerical index.
`cross_section`(excel_dict)	Return the Dataset defined by `excel_dict` from this Dataset.
`cummax`([axis, skipna])	Return cumulative maximum over a DataFrame or Series axis.
`cummin`([axis, skipna])	Return cumulative minimum over a DataFrame or Series axis.
`cumprod`([axis, skipna])	Return cumulative product over a DataFrame or Series axis.
`cumsum`([axis, skipna])	Return cumulative sum over a DataFrame or Series axis.
`date_proximity`(**kwargs)
`default_display_name_generator`(dset)	Default function to use as `display_name_generator`.
`describe`([percentiles, include, exclude, ...])	Generate descriptive statistics.
`diff`([periods, axis])	First discrete difference of element.
`div`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator truediv).
`divide`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator truediv).
`dot`(other)	Compute the matrix multiplication between the DataFrame and other.
`drop`([labels, axis, index, columns, level, ...])	Drop specified labels from rows or columns.
`drop_duplicates`([subset, keep, inplace, ...])	Return DataFrame with duplicate rows removed.
`drop_sys_cols`([inplace])	Drop all `sys_cols` from `Dataset`.
`droplevel`(level[, axis])	Return Series/DataFrame with requested index / column level(s) removed.
`dropna`(*[, axis, how, thresh, subset, inplace])	Remove missing values.
`duplicated`([subset, keep])	Return boolean Series denoting duplicate rows.
`eq`(other[, axis, level])	Get Equal to of dataframe and other, element-wise (binary operator eq).
`equals`(other)	Test whether two Datasets are equal.
`eval`(expr, *[, inplace])	Evaluate a string describing operations on DataFrame columns.
`ewm`([com, span, halflife, alpha, ...])	Provide exponentially weighted (EW) calculations.
`excel_dict_has_tags`(excel_dict, tags)	Helper function to determine if an `excel_dict` has `tags`.
`expanding`([min_periods, center, axis, method])	Provide expanding window calculations.
`explode`(column[, ignore_index])	Transform each element of a list-like to a row, replicating index values.
`ffill`(*[, axis, inplace, limit, downcast])	Synonym for `DataFrame.fillna()` with `method='ffill'`.
`fillna`([value, method, axis, inplace, ...])	Fill NA/NaN values using the specified method.
`filter`([items, like, regex, axis])	Subset the dataframe rows or columns according to the specified index labels.
`first`(offset)	Select initial periods of time series data based on a date offset.
`first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`floordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator floordiv).
`from_dict`(data[, orient, dtype, columns])	Construct DataFrame from dict of array-like or dicts.
`from_excel_dict`(excel_dict, df)	Construct a Dataset from a dictionary representation.
`from_file`(filepath, **kwargs)	Construct `Dataset` from a file.
`from_records`(data[, index, exclude, ...])	Convert structured or record ndarray to DataFrame.
`ge`(other[, axis, level])	Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).
`get`(key[, default])	Get item from object for given key (ex: DataFrame column).
`group_by_keep_one`(**kwargs)
`groupby`([by, axis, level, as_index, sort, ...])	Group DataFrame using a mapper or by a Series of columns.
`gt`(other[, axis, level])	Get Greater than of dataframe and other, element-wise (binary operator gt).
`has_tag`(tag)	Returns true if `Dataset` contains tag.
`head`([n])	Return the first n rows.
`hist`([column, by, grid, xlabelsize, xrot, ...])	Make a histogram of the DataFrame's columns.
`idxmax`([axis, skipna, numeric_only])	Return index of first occurrence of maximum over requested axis.
`idxmin`([axis, skipna, numeric_only])	Return index of first occurrence of minimum over requested axis.
`infer_objects`()	Attempt to infer better dtypes for object columns.
`info`([verbose, buf, max_cols, memory_usage, ...])	Print a concise summary of a DataFrame.
`insert`(loc, column, value[, allow_duplicates])	Insert column into DataFrame at specified location.
`interpolate`([method, axis, limit, inplace, ...])	Fill NaN values using an interpolation method.
`isetitem`(loc, value)	Set the given value in the column with position 'loc'.
`isin`(values)	Whether each element in the DataFrame is contained in values.
`isna`()	Detect missing values.
`isnull`()	DataFrame.isnull is an alias for DataFrame.isna.
`items`()	Iterate over (column name, Series) pairs.
`iteritems`()	Iterate over (column name, Series) pairs.
`iterrows`()	Iterate over DataFrame rows as (index, Series) pairs.
`itertuples`([index, name])	Iterate over DataFrame rows as namedtuples.
`join`(other[, on, how, lsuffix, rsuffix, ...])	Join columns of another DataFrame.
`keep_cols`(cols[, inplace])	Keep specified columns (thus dropping the rest).
`keep_fields`(selected_fields[, inplace])	Keep specified fields (and drop the rest).
`keys`()	Get the 'info axis' (see Indexing for more).
`kurt`([axis, skipna, level, numeric_only])	Return unbiased kurtosis over requested axis.
`kurtosis`([axis, skipna, level, numeric_only])	Return unbiased kurtosis over requested axis.
`last`(offset)	Select final periods of time series data based on a date offset.
`last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`le`(other[, axis, level])	Get Less than or equal to of dataframe and other, element-wise (binary operator le).
`lookup`(row_labels, col_labels)	Label-based "fancy indexing" function for DataFrame.
`lt`(other[, axis, level])	Get Less than of dataframe and other, element-wise (binary operator lt).
`mad`([axis, skipna, level])	Return the mean absolute deviation of the values over the requested axis.
`mask`(cond[, other, inplace, axis, level, ...])	Replace values where the condition is True.
`max`([axis, skipna, level, numeric_only])	Return the maximum of the values over the requested axis.
`mean`([axis, skipna, level, numeric_only])	Return the mean of the values over the requested axis.
`median`([axis, skipna, level, numeric_only])	Return the median of the values over the requested axis.
`melt`([id_vars, value_vars, var_name, ...])	Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
`memory_usage`([index, deep])	Return the memory usage of each column in bytes.
`merge`(right[, how, on, left_on, right_on, ...])	Merge DataFrame or named Series objects with a database-style join.
`min`([axis, skipna, level, numeric_only])	Return the minimum of the values over the requested axis.
`mod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator mod).
`mode`([axis, numeric_only, dropna])	Get the mode(s) of each element along the selected axis.
`mul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator mul).
`multiply`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator mul).
`ne`(other[, axis, level])	Get Not equal to of dataframe and other, element-wise (binary operator ne).
`nlargest`(n, columns[, keep])	Return the first n rows ordered by columns in descending order.
`notna`()	Detect existing (non-missing) values.
`notnull`()	DataFrame.notnull is an alias for DataFrame.notna.
`nsmallest`(n, columns[, keep])	Return the first n rows ordered by columns in ascending order.
`nunique`([axis, dropna])	Count number of distinct elements in specified axis.
`pad`(*[, axis, inplace, limit, downcast])	Synonym for `DataFrame.fillna()` with `method='ffill'`.
`pct_change`([periods, fill_method, limit, freq])	Percentage change between the current and a prior element.
`pipe`(func, *args, **kwargs)	Apply chainable functions that expect Series or DataFrames.
`pivot`(*[, index, columns, values])	Return reshaped DataFrame organized by given index / column values.
`pivot_table`([values, index, columns, ...])	Create a spreadsheet-style pivot table as a DataFrame.
`pop`(item)	Return item and drop from frame.
`pow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator pow).
`prepend_level`(level[, inplace])	Create a `MultiIndex` by adding level as the first level.
`prod`([axis, skipna, level, numeric_only, ...])	Return the product of the values over the requested axis.
`product`([axis, skipna, level, numeric_only, ...])	Return the product of the values over the requested axis.
`quantile`([q, axis, numeric_only, ...])	Return values at the given quantile over requested axis.
`query`(expr, *[, inplace])	Query the columns of a DataFrame with a boolean expression.
`radd`(other[, axis, level, fill_value])	Get Addition of dataframe and other, element-wise (binary operator radd).
`rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`rdiv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator rtruediv).
`reindex`([labels, index, columns, axis, ...])	Conform Series/DataFrame to new index with optional filling logic.
`reindex_like`(other[, method, copy, limit, ...])	Return an object with matching indices as other object.
`rename`([mapper, index, columns, axis, copy, ...])	Alter axes labels.
`rename_axis`([mapper, inplace])	Set the name of the axis for the index or columns.
`rename_col`(old_col, new_col[, inplace])	Rename `old_col` to `new_col`.
`reorder_levels`(order[, axis])	Rearrange index levels using input order.
`replace`([to_replace, value, inplace, limit, ...])	Replace values given in to_replace with value.
`replace_tag`(old_tag, new_tag)	Replace `old_tag` with `new_tag`.
`resample`(rule[, axis, closed, label, ...])	Resample time-series data.
`reset_index`([level, drop, inplace, ...])	Reset the index, or a level of it.
`rfloordiv`(other[, axis, level, fill_value])	Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).
`rmod`(other[, axis, level, fill_value])	Get Modulo of dataframe and other, element-wise (binary operator rmod).
`rmul`(other[, axis, level, fill_value])	Get Multiplication of dataframe and other, element-wise (binary operator rmul).
`rolling`(window[, min_periods, center, ...])	Provide rolling window calculations.
`round`([decimals])	Round a DataFrame to a variable number of decimal places.
`rpow`(other[, axis, level, fill_value])	Get Exponential power of dataframe and other, element-wise (binary operator rpow).
`rsub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator rsub).
`rtruediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator rtruediv).
`sample`([n, frac, replace, weights, ...])	Return a random sample of items from an axis of object.
`select_dtypes`([include, exclude])	Return a subset of the DataFrame's columns based on the column dtypes.
`sem`([axis, skipna, level, ddof, numeric_only])	Return unbiased standard error of the mean over requested axis.
`set_axis`(labels, *[, axis, inplace, copy])	Assign desired index to given axis.
`set_flags`(*[, copy, allows_duplicate_labels])	Return a new object with updated flags.
`set_index`(keys, *[, drop, append, inplace, ...])	Set the DataFrame index using existing columns.
`shift`([periods, freq, axis, fill_value])	Shift index by desired number of periods with an optional time freq.
`skew`([axis, skipna, level, numeric_only])	Return unbiased skew over requested axis.
`slice_shift`([periods, axis])	Equivalent to shift without copying data.
`sort_by_id2`()	Sort `df` by `id2_col_name`.
`sort_index`(*[, axis, level, ascending, ...])	Sort object by labels (along an axis).
`sort_values`(by, *[, axis, ascending, ...])	Sort by the values along either axis.
`squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`stack`([level, dropna])	Stack the prescribed level(s) from columns to index.
`std`([axis, skipna, level, ddof, numeric_only])	Return sample standard deviation over requested axis.
`sub`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator sub).
`subtract`(other[, axis, level, fill_value])	Get Subtraction of dataframe and other, element-wise (binary operator sub).
`sum`([axis, skipna, level, numeric_only, ...])	Return the sum of the values over the requested axis.
`swapaxes`(axis1, axis2[, copy])	Interchange axes and swap values axes appropriately.
`swaplevel`([i, j, axis])	Swap levels i and j in a `MultiIndex`.
`tail`([n])	Return the last n rows.
`take`(indices[, axis, is_copy])	Return the elements in the given positional indices along an axis.
`to_clipboard`([excel, sep])	Copy object to the system clipboard.
`to_csv`([path_or_buf, sep, na_rep, ...])	Write object to a comma-separated values (csv) file.
`to_dict`([orient, into])	Convert the DataFrame to a dictionary.
`to_excel`(excel_writer[, sheet_name, na_rep, ...])	Write `Dataset` to an Excel sheet.
`to_excel_dict`()	Convert the `Dataset` to a dictionary representation needed for Excel reading/writing.
`to_feather`(path, **kwargs)	Write a DataFrame to the binary Feather format.
`to_gbq`(destination_table[, project_id, ...])	Write a DataFrame to a Google BigQuery table.
`to_hdf`(path_or_buf, key[, mode, complevel, ...])	Write the contained data to an HDF5 file using HDFStore.
`to_html`([buf, columns, col_space, header, ...])	Render a DataFrame as an HTML table.
`to_json`([path_or_buf, orient, date_format, ...])	Convert the object to a JSON string.
`to_latex`([buf, columns, col_space, header, ...])	Render object to a LaTeX tabular, longtable, or nested table.
`to_markdown`([buf, mode, index, storage_options])	Print DataFrame in Markdown-friendly format.
`to_numpy`([dtype, copy, na_value])	Convert the DataFrame to a NumPy array.
`to_orc`([path, engine, index, engine_kwargs])	Write a DataFrame to the ORC format.
`to_parquet`([path, engine, compression, ...])	Write a DataFrame to the binary parquet format.
`to_period`([freq, axis, copy])	Convert DataFrame from DatetimeIndex to PeriodIndex.
`to_pickle`(path[, compression, protocol, ...])	Pickle (serialize) object to file.
`to_records`([index, column_dtypes, index_dtypes])	Convert DataFrame to a NumPy record array.
`to_sql`(name, con[, schema, if_exists, ...])	Write records stored in a DataFrame to a SQL database.
`to_stata`(path, *[, convert_dates, ...])	Export DataFrame object to Stata dta format.
`to_string`([buf, columns, col_space, header, ...])	Render a DataFrame to a console-friendly tabular output.
`to_timestamp`([freq, how, axis, copy])	Cast to DatetimeIndex of timestamps, at beginning of period.
`to_xarray`()	Return an xarray object from the pandas object.
`to_xml`([path_or_buffer, index, root_name, ...])	Render a DataFrame to an XML document.
`transform`(func[, axis])	Call `func` on self producing a DataFrame with the same axis shape as self.
`transpose`(*args[, copy])	Transpose index and columns.
`truediv`(other[, axis, level, fill_value])	Get Floating division of dataframe and other, element-wise (binary operator truediv).
`truncate`([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
`tshift`([periods, freq, axis])	Shift the time index, using the index's frequency if available.
`tz_convert`(tz[, axis, level, copy])	Convert tz-aware axis to target time zone.
`tz_localize`(tz[, axis, level, copy, ...])	Localize tz-naive index of a Series or DataFrame to target time zone.
`unstack`([level, fill_value])	Pivot a level of the (necessarily hierarchical) index labels.
`update`(other[, join, overwrite, ...])	Modify in place using non-NA values from another DataFrame.
`value_counts`([subset, normalize, sort, ...])	Return a Series containing counts of unique rows in the DataFrame.
`var`([axis, skipna, level, ddof, numeric_only])	Return unbiased variance over requested axis.
`where`(cond[, other, inplace, axis, level, ...])	Replace values where the condition is False.
`xs`(key[, axis, level, drop_level])	Return cross-section from the Series/DataFrame.

Attributes

`FIELD_DATE_COL_VALUES_POSSIBLE`	Possible default values for `date_col_name` of Dataset.
`FIELD_DATE_COL_VALUE_DEFAULT`	Default value for `date_col_name` of Dataset.
`FIELD_ID2_COL_VALUES_POSSIBLE`	Possible default values for `id2_col_name` of Dataset.
`FIELD_ID2_COL_VALUE_DEFAULT`	Default value for `id2_col_name` of Dataset.
`FIELD_ID_COL_VALUES_POSSIBLE`	Possible default values for `id_col_name` of Dataset.
`FIELD_ID_COL_VALUE_DEFAULT`	Default value for `id_col_name` of Dataset.
`T`
`all_fields`	Returns list of all fields of this `Dataset`.
`at`	Access a single value for a row/column label pair.
`attrs`	Dictionary of global attributes of this dataset.
`axes`	Return a list representing the axes of the DataFrame.
`col_count`	Number of columns in `Dataset`.
`columns`	The column labels of the DataFrame.
`date_col_errors`	Errors flag to use when parsing `date_col_name`.
`date_col_name`	Column to use as record collection date.
`display_name`	The name for this `Dataset` suitable for display as generated by the `display_name_generator` function.
`display_name_generator`	The function used to generate `display_name`.
`dtypes`	Return the dtypes in the DataFrame.
`empty`	Indicator whether Series/DataFrame is empty.
`excel_sheetname`	Generates a valid Excel sheet name by truncating `display_name` to 30 characters.
`flags`	Get the properties associated with this pandas object.
`history`	History information as generated by the `macpie.util.MethodHistory` decorator.
`iat`	Access a single value for a row/column pair by integer position.
`id2_col_name`	Column to use as secondary record IDs.
`id_col_name`	Column to use as record IDs.
`iloc`	Purely integer-location based indexing for selection by position.
`index`	The index (row labels) of the DataFrame.
`key_cols`	Returns list of non-null key column names of this `Dataset`, defined as `id_col_name`, `date_col_name`, and `id2_col_name`.
`key_fields`	Returns list of all key fields of this `Dataset` (analog of `key_cols`).
`loc`	Access a group of rows and columns by label(s) or a boolean array.
`name`	Name of Dataset.
`ndim`	Return an int representing the number of axes / array dimensions.
`non_key_cols`	Returns list of non-key column names of this `Dataset`, defined as any columns that are not `key_cols` or `sys_cols`.
`non_key_fields`	Returns list of all non-key fields of this `Dataset` (analog of `non_key_cols`).
`row_count`	Number of rows in `Dataset`.
`shape`	Return a tuple representing the dimensionality of the DataFrame.
`size`	Return an int representing the number of elements in this object.
`style`	Returns a Styler object.
`sys_cols`	Returns list of system column names of this `Dataset`, defined as any columns starting with `column.system.prefix` option.
`sys_fields`	Returns list of all system fields of this `Dataset` (analog of `sys_cols`).
`tag_duplicates`	Tag that denotes this `Dataset` has duplicates.
`tags`	Tag(s) of Dataset.
`values`	Return a Numpy representation of the DataFrame.
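
The relationship between `key_cols`, `sys_cols`, and `non_key_cols` described in the table above can be sketched in plain Python. This is an illustration of the stated definitions only, not macpie's code: `partition_columns` is a hypothetical helper, the `"_sys_"` prefix is a placeholder for whatever the `column.system.prefix` option is set to, and the `Score` column is made up.

```python
# Placeholder prefix; the real value comes from the column.system.prefix option.
SYS_PREFIX = "_sys_"


def partition_columns(columns, id_col_name=None, date_col_name=None,
                      id2_col_name=None, sys_prefix=SYS_PREFIX):
    """Split column names into key, system, and non-key groups."""
    # Key columns: the three configured names, skipping any left unset.
    key_cols = [c for c in (id_col_name, date_col_name, id2_col_name)
                if c is not None]
    # System columns: any column whose name starts with the system prefix.
    sys_cols = [c for c in columns if c.startswith(sys_prefix)]
    # Non-key columns: everything that is neither a key nor a system column.
    non_key_cols = [c for c in columns
                    if c not in key_cols and c not in sys_cols]
    return key_cols, sys_cols, non_key_cols


columns = ["InstrID", "DCDate", "PIDN", "Score", "_sys_duplicates"]
key_cols, sys_cols, non_key_cols = partition_columns(
    columns, id_col_name="InstrID", date_col_name="DCDate", id2_col_name="PIDN"
)
print(key_cols)
print(sys_cols)
print(non_key_cols)
```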