noslag

collect_files

Description

Takes a parent variable and optional list of databases to include, checks for their existence, and collects files and directories based on the parent variable.

Usage

collect_files(parent_variable, include_databases = NULL)

Arguments

Argument Description
parent_variable Character. Variable to collect files on.
include_databases Optional list of Character. Database IDs to collect files from.

Return Value

List of tuples. Each tuple contains the parent variable and the database ID for a file found in the specified directories.

Examples

## Example usage:
collect_files("variable_name", c("db1", "db2"))

normalise_units

Description

Takes a DataFrame with reported or reference data, along with dictionaries mapping variable units and flow IDs, and normalizes the units of the variables in the DataFrame based on the provided mappings.

Usage

normalise_units(df, level, var_units, var_flow_ids)

Arguments

Argument Description
df DataFrame. Dataframe to be normalized.
level Character. Specifies whether the data should be normalized on the reported or reference values. Possible values are 'reported' or 'reference'.
var_units List. Dictionary that maps a combination of parent variable and variable to its corresponding unit. The keys in the dictionary are in the format "{parent_variable}
var_flow_ids List. Dictionary that maps a combination of parent variable and variable to a specific flow ID. This flow ID is used for unit conversion in the normalise_units function.

Return Value

DataFrame. Normalized dataframe.

Examples

## Example usage:
normalise_units(df, "reported", var_units, var_flow_ids)

normalise_values

Description

Takes a DataFrame as input, normalizes the 'value' and 'uncertainty' columns by the reference value, and updates the 'reference_value' column accordingly.

Usage

normalise_values(df)

Arguments

Argument Description
df DataFrame. Dataframe to be normalized.

Return Value

DataFrame. Returns a modified DataFrame where the 'value' column has been divided by the 'reference_value' column (or 1.0 if 'reference_value' is null), the 'uncertainty' column has been divided by the 'reference_value' column, and the 'reference_value' column has been replaced with 1.0 if it was not null.

Examples

## Example usage:
normalized_df <- normalise_values(df)
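
The arithmetic described under Return Value can be sketched in base R on a plain data.frame. This is an illustration only, not the package's implementation: the helper name `normalise_values_sketch` and the sample data are hypothetical, and it assumes that a row with a missing reference value keeps a missing reference value and ends up with a missing uncertainty.

```r
# Hypothetical stand-in for the documented behaviour, using a base R data.frame.
normalise_values_sketch <- function(df) {
  ref <- df$reference_value
  # Divide the value by the reference value, or by 1.0 where it is missing.
  divisor <- ifelse(is.na(ref), 1.0, ref)
  df$value <- df$value / divisor
  # Divide the uncertainty by the reference value itself (NA stays NA).
  df$uncertainty <- df$uncertainty / ref
  # Non-missing reference values become 1.0; missing ones stay missing.
  df$reference_value <- ifelse(is.na(ref), NA_real_, 1.0)
  df
}

# Made-up input: the second row has no reference value.
df <- data.frame(
  value = c(10, 6),
  uncertainty = c(2, 3),
  reference_value = c(5, NA)
)
out <- normalise_values_sketch(df)
```

In this sketch the first row is rescaled by its reference value of 5, while the second row keeps its value unchanged because no reference value is available.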

DataSet

Description

This class provides methods to store, normalize, select, and aggregate DataSets.

Examples

### ------------------------------------------------
### Method `DataSet$normalise`
### ------------------------------------------------

## Example usage:
dataset$normalise(override = list("variable1" = "value1"), inplace = FALSE)


### ------------------------------------------------
### Method `DataSet$select`
### ------------------------------------------------

## Example usage:
dataset$select(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, field1 = "value1")


### ------------------------------------------------
### Method `DataSet$aggregate`
### ------------------------------------------------

## Example usage:
dataset$aggregate(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, agg = "field", masks = list(mask1, mask2), masks_database = TRUE)

Methods

Public Methods

Method new()

Create new instance of the DataSet class

Usage

DataSet$new(
  parent_variable,
  include_databases = NULL,
  file_paths = NULL,
  check_inconsistencies = FALSE,
  data = NULL
)

Arguments:

  • parent_variable Character. Variable to collect Data on.
  • include_databases Optional list or tuple of Character. Databases to load from.
  • file_paths Optional list of Character. Paths to load data from.
  • check_inconsistencies Logical, optional. Whether to check for inconsistencies.
  • data Optional DataFrame. Specific data to include in the dataset.

Method normalise()

Normalize data: convert to default reference units, set the reference value to 1.0, and convert to default reported units.

Usage

DataSet$normalise(override = NULL, inplace = FALSE)

Arguments:

  • override Optional list of Character. Dictionary of key-value pairs of variables to override.
  • inplace Logical, optional. Whether to do the normalization in place.

Example:

## Example usage:
dataset$normalise(override = list("variable1" = "value1"), inplace = FALSE)

Returns:

DataFrame. The normalized DataFrame, returned if inplace is FALSE.

Method select()

Select desired data from the dataframe

Usage

DataSet$select(
  override = NULL,
  drop_singular_fields = TRUE,
  extrapolate_period = TRUE,
  ...
)

Arguments:

  • override Optional list of Character. Dictionary of key-value pairs of variables to override.
  • drop_singular_fields Logical, optional. If TRUE, drop custom fields with only one value.
  • extrapolate_period Logical, optional. If TRUE, extrapolate values if no value for this period is given.
  • ... IDs of values to select.

Example:

## Example usage:
dataset$select(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, field1 = "value1")

Returns:

DataFrame. DataFrame with selected values.
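
The effect of drop_singular_fields = TRUE can be illustrated with a small base R sketch. It is not the package's code: the helper `drop_singular_sketch`, the column names, and the idea of passing the custom field names explicitly are all assumptions made for the example.

```r
# Hypothetical helper: drop those custom fields that hold a single distinct value.
drop_singular_sketch <- function(df, fields) {
  is_singular <- vapply(
    fields,
    function(f) length(unique(df[[f]])) == 1,
    logical(1)
  )
  # Keep every column that is not a singular custom field.
  df[, !(names(df) %in% fields[is_singular]), drop = FALSE]
}

# Made-up data: "subtech" has one distinct value, "size" has two.
df <- data.frame(
  subtech = c("x", "x"),
  size = c("s", "l"),
  value = c(1, 2)
)
out <- drop_singular_sketch(df, fields = c("subtech", "size"))
```

Here only `subtech` is removed, since `size` still distinguishes the two rows.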

Method aggregate()

Aggregates data based on specified parameters, applies masks, and cleans up the resulting DataFrame.

Usage

DataSet$aggregate(
  override = NULL,
  drop_singular_fields = TRUE,
  extrapolate_period = TRUE,
  agg = NULL,
  masks = NULL,
  masks_database = TRUE,
  ...
)

Arguments:

  • override Optional list of Character. Dictionary of key-value pairs of variables to override.
  • drop_singular_fields Logical, optional. If TRUE, drop custom fields with only one value.
  • extrapolate_period Logical, optional. If TRUE, extrapolate values if no value for this period is given.
  • agg Optional Character, or list or tuple of Character. Specifies which fields to aggregate over.
  • masks Optional list of Mask. Mask objects to apply to the data during aggregation. Masks can filter or weight the data based on the conditions defined in each Mask object.
  • masks_database Logical, optional. Whether to include masks from databases in the aggregation. If TRUE, masks from databases are applied along with any masks passed as arguments. If FALSE, only the masks passed as arguments are applied.
  • ... Additional field values to select.

Example:

## Example usage:
dataset$aggregate(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, agg = "field", masks = list(mask1, mask2), masks_database = TRUE)

Returns:

DataFrame. A cleaned and aggregated pandas DataFrame. The method aggregates over component fields and case fields, applies weights based on masks, drops rows with NaN weights, aggregates with weights, inserts reference variables, sorts columns and rows, rounds values, and inserts units.
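
The weighting steps named above (drop rows with missing weights, then aggregate with weights) can be sketched in base R. This is a rough illustration under stated assumptions, not the method's implementation: the column names and data are made up, the weights are assumed to have already been derived from masks, and "aggregates with weights" is read here as a weighted sum per period.

```r
# Hypothetical data: one row per component, with a mask-derived weight.
df <- data.frame(
  period = c(2030, 2030, 2030, 2040),
  component = c("a", "b", "c", "a"),
  value = c(10, 20, 30, 40),
  weight = c(1, 0.5, NA, 1)
)

# Drop rows whose weight is missing.
df <- df[!is.na(df$weight), ]

# Assumption: aggregate as a weighted sum over the component field.
df$weighted <- df$value * df$weight
agg <- aggregate(weighted ~ period, data = df, FUN = sum)
names(agg)[names(agg) == "weighted"] <- "value"
```

In this sketch component "c" is excluded because its weight is missing, and the remaining components are summed per period after weighting.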

Method clone()

The objects of this class are cloneable with this method.

Usage

DataSet$clone(deep = FALSE)

Arguments:

  • deep Whether to make a deep clone.