noslag
collect_files
Description
Takes a parent variable and an optional list of databases to include, checks that the databases exist, and collects files and directories based on the parent variable.
Usage
collect_files(parent_variable, include_databases = NULL)
Arguments
Argument | Description |
---|---|
parent_variable | Character. Variable to collect files on. |
include_databases | Optional list of Character. List of database IDs to collect files from. |
Return Value
List of tuples. List of tuples containing the parent variable and the database ID for each file found in the specified directories.
Examples
## Example usage:
collect_files("variable_name", c("db1", "db2"))
normalise_units
Description
Takes a DataFrame with reported or reference data, along with dictionaries mapping variable units and flow IDs, and normalizes the units of the variables in the DataFrame based on the provided mappings.
Usage
normalise_units(df, level, var_units, var_flow_ids)
Arguments
Argument | Description |
---|---|
df | DataFrame. Dataframe to be normalized. |
level | Character. Specifies whether the data should be normalized on the reported or reference values. Possible values are 'reported' or 'reference'. |
var_units | List. Dictionary that maps a combination of parent variable and variable to its corresponding unit. The keys in the dictionary are in the format "{parent_variable} |
var_flow_ids | List. Dictionary that maps a combination of parent variable and variable to a specific flow ID. This flow ID is used for unit conversion in the normalise_units function. |
Return Value
DataFrame. Normalized dataframe.
Examples
## Example usage:
normalise_units(df, "reported", var_units, var_flow_ids)
normalise_values
Description
Takes a DataFrame as input, normalizes the 'value' and 'uncertainty' columns by the reference value, and updates the 'reference_value' column accordingly.
Usage
normalise_values(df)
Arguments
Argument | Description |
---|---|
df | DataFrame. Dataframe to be normalized. |
Return Value
DataFrame. Returns a modified DataFrame where the 'value' column has been divided by the 'reference_value' column (or 1.0 if 'reference_value' is null), the 'uncertainty' column has been divided by the 'reference_value' column, and the 'reference_value' column has been replaced with 1.0 if it was not null.
Examples
## Example usage:
normalized_df <- normalise_values(df)
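The arithmetic described under Return Value can be sketched in base R. This is an illustration only, not the package's implementation: the helper name `sketch_normalise_values` is hypothetical, null reference values are assumed to be represented as `NA`, and the uncertainty is assumed to stay `NA` wherever the reference value is missing.

```r
# Hypothetical stand-in for normalise_values(); not the package's own code.
sketch_normalise_values <- function(df) {
  ref <- df$reference_value
  # Divide the value by the reference value, or by 1.0 where it is missing.
  divisor <- ifelse(is.na(ref), 1.0, ref)
  df$value <- df$value / divisor
  # Assumption: the uncertainty is divided by the raw reference value,
  # so it becomes NA where the reference value is NA.
  df$uncertainty <- df$uncertainty / ref
  # Non-missing reference values become 1.0; missing ones stay missing.
  df$reference_value <- ifelse(is.na(ref), NA_real_, 1.0)
  df
}

example_df <- data.frame(
  value = c(10, 6),
  uncertainty = c(1, 0.5),
  reference_value = c(2, NA)
)
result <- sketch_normalise_values(example_df)
print(result)
```

In the first row the value 10 with reference value 2 becomes 5 with reference value 1.0; in the second row the value is left unchanged because no reference value is given.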
DataSet
Description
This class provides methods to store, normalize, select, and aggregate DataSets.
Examples
### ------------------------------------------------
### Method `DataSet$normalise`
### ------------------------------------------------
## Example usage:
dataset$normalise(override = list("variable1" = "value1"), inplace = FALSE)
### ------------------------------------------------
### Method `DataSet$select`
### ------------------------------------------------
## Example usage:
dataset$select(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, field1 = "value1")
### ------------------------------------------------
### Method `DataSet$aggregate`
### ------------------------------------------------
## Example usage:
dataset$aggregate(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, agg = "field", masks = list(mask1, mask2), masks_database = TRUE)
Methods
Public Methods
Method new()
Create new instance of the DataSet class
Usage
DataSet$new(
parent_variable,
include_databases = NULL,
file_paths = NULL,
check_inconsistencies = FALSE,
data = NULL
)
Arguments:
parent_variable
Character. Variable to collect data on.
include_databases
Optional list of Character | tuple of Character. Databases to load from.
file_paths
Optional list of Character. Paths to load data from.
check_inconsistencies
Logical, optional. Whether to check for inconsistencies.
data
Optional DataFrame. Specific data to include in the dataset.
Method normalise()
Normalize the data to default reference units, a reference value of 1.0, and default reported units.
Usage
DataSet$normalise(override = NULL, inplace = FALSE)
Arguments:
override
Optional list of Character. Dictionary with key-value pairs of variables to override.
inplace
Logical, optional. Whether to do the normalization in place.
Example:
## Example usage:
dataset$normalise(override = list("variable1" = "value1"), inplace = FALSE)
Returns:
DataFrame. If inplace is FALSE, returns the normalized dataframe.
Method select()
Select desired data from the dataframe
Usage
DataSet$select(
override = NULL,
drop_singular_fields = TRUE,
extrapolate_period = TRUE,
...
)
Arguments:
override
Optional list of Character. Dictionary with key-value pairs of variables to override.
drop_singular_fields
Logical, optional. If TRUE, drop custom fields with only one value.
extrapolate_period
Logical, optional. If TRUE, extrapolate values if no value for this period is given.
...
IDs of values to select.
Example:
## Example usage:
dataset$select(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, field1 = "value1")
Returns:
DataFrame. DataFrame with selected values.
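The effect of drop_singular_fields can be illustrated with a small base-R sketch. It is not the method's actual implementation: the helper `drop_singular_columns` and the column names `region` and `size` are hypothetical, and which columns count as custom fields is assumed to be passed in explicitly.

```r
# Hypothetical helper illustrating "drop fields with only one value".
# Not part of the documented API; `fields` names the candidate columns.
drop_singular_columns <- function(df, fields) {
  # Flag every candidate field that holds a single distinct value.
  is_singular <- vapply(
    fields,
    function(f) length(unique(df[[f]])) == 1,
    logical(1)
  )
  # Keep all columns except the singular candidate fields.
  df[, setdiff(names(df), fields[is_singular]), drop = FALSE]
}

example_df <- data.frame(
  region = c("EU", "EU", "EU"),
  size = c("small", "large", "small"),
  value = c(1.0, 2.0, 3.0)
)
selected <- drop_singular_columns(example_df, fields = c("region", "size"))
print(names(selected))
```

Here `region` is dropped because it carries the same value in every row, while `size` is kept because it still distinguishes rows.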
Method aggregate()
Aggregates data based on specified parameters, applies masks, and cleans up the resulting DataFrame.
Usage
DataSet$aggregate(
override = NULL,
drop_singular_fields = TRUE,
extrapolate_period = TRUE,
agg = NULL,
masks = NULL,
masks_database = TRUE,
...
)
Arguments:
override
Optional list of Character. Dictionary with key-value pairs of variables to override.
drop_singular_fields
Logical, optional. If TRUE, drop custom fields with only one value.
extrapolate_period
Logical, optional. If TRUE, extrapolate values if no value for this period is given.
agg
Optional Character | list of Character | tuple of Character. Specifies which fields to aggregate over.
masks
Optional list of Mask. Specifies a list of Mask objects that will be applied to the data during aggregation. These masks can be used to filter or weight the data based on certain conditions defined in the Mask objects.
masks_database
Logical, optional. Determines whether to include masks from databases in the aggregation process. If TRUE, masks from databases will be included along with any masks provided as function arguments. If FALSE, only the masks provided as function arguments will be applied.
...
Additional field values.
Example:
## Example usage:
dataset$aggregate(override = list("variable1" = "value1"), drop_singular_fields = TRUE, extrapolate_period = FALSE, agg = "field", masks = list(mask1, mask2), masks_database = TRUE)
Returns:
DataFrame. The aggregate method returns a pandas DataFrame that has been cleaned up and aggregated based on the specified parameters and input data. The method performs aggregation over component fields and case fields, applies weights based on masks, drops rows with NaN weights, aggregates with weights, inserts reference variables, sorts columns and rows, rounds values, and inserts units before returning the final cleaned and aggregated DataFrame.
Method clone()
The objects of this class are cloneable with this method.
Usage
DataSet$clone(deep = FALSE)
Arguments:
deep
Whether to make a deep clone.