Python

scenario_vetting_criteria

Definitions of scenario validation criteria.

Use the load_criteria function to load definitions from raw definition files.

load_criteria

load_criteria(
    components=None,
    load_all=False,
    csv_engine="pandas",
    criteria_types=None,
    reference_subset=None,
    release=None,
)

Load and return the criteria definitions contained in the package.

Parameters:

Name Type Description Default
components str | list[str] | tuple[str]

A component name, or a list of component names. The return type depends on whether a single string or a list is provided.

None
load_all bool

Instead of naming individual components, set load_all=True to load all components.

False
csv_engine str

The engine used to load CSV files, if any are loaded. Must be one of pandas or python. Defaults to pandas. The output type changes accordingly.

'pandas'
criteria_types str | list[str] | tuple[str]

When loading the components thresholds and metadata, by default all criteria types are loaded. Alternatively, a single string or a list or tuple of strings can be provided as argument criteria_types to load only a subset of criteria of corresponding type(s).

None
reference_subset str | list[str] | tuple[str]

When loading the component reference-data, by default all sources are loaded. Alternatively, a single string or a list or tuple of strings can be provided as argument reference_subset to load only a subset of sources.

None
release str

Define the release of the criteria definition to load. If not provided, the latest release will be used.

None
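
The handling of release, criteria_types, and reference_subset described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the `releases` mapping and its release names are hypothetical, `pick_release` and `as_list` are illustrative helper names, and `"emissions"` is a made-up criteria type. Consistent with the source listing on this page, the default release is the last release name in string sort order, so "latest" here means lexicographically last.

```python
from __future__ import annotations

# Hypothetical release registry: release name -> path of its definition files.
releases = {"2024-01": "definitions/2024-01", "2024-06": "definitions/2024-06"}


def pick_release(release: str | None = None) -> str:
    """Return the path for `release`, defaulting to the last name in sort order."""
    if release is None:
        release = sorted(releases)[-1]
    elif release not in releases:
        raise ValueError(
            f"Release '{release}' not known. Choose from: {', '.join(releases)}"
        )
    return releases[release]


def as_list(value: str | list[str] | tuple[str, ...] | None) -> list[str] | None:
    """Normalise a filter argument: None stays None, a string becomes a list."""
    if value is None:
        return None
    if isinstance(value, str):
        return [value]
    return list(value)


print(pick_release())        # definitions/2024-06
print(as_list("emissions"))  # ['emissions']
print(as_list(("a", "b")))   # ['a', 'b']
```

Because the comparison is on strings, release names that sort consistently (for example zero-padded dates) make the default predictable.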

Returns:

Type Description
DataFrame | dict[str, str] | dict[str, DataFrame | dict[str, str]]

The loaded data. For a single component this is a dataframe or a dictionary. If multiple components are requested, they are returned in a dictionary keyed by component name.
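
How the return shape follows the type of the components argument can be sketched without the package installed. In this illustration `_load_one` is a hypothetical stand-in for the per-component file loader, `load_sketch` is an illustrative name, and `COMPONENTS` lists only the component names mentioned on this page (the real package may define others).

```python
from __future__ import annotations

# Component names mentioned in this reference; the real package may define more.
COMPONENTS = ["thresholds", "metadata", "reference-data"]


def _load_one(component: str) -> dict[str, str]:
    """Hypothetical stand-in for the per-component file loader."""
    return {"name": component}


def load_sketch(
    components: str | list[str] | None = None,
    load_all: bool = False,
):
    """Mimic how the return shape follows the type of `components`."""
    if components is None and not load_all:
        raise ValueError("At least one component must be provided.")
    if components is not None and load_all:
        raise ValueError("Pass either component name(s) or load_all=True.")
    if load_all:
        components = COMPONENTS
    if isinstance(components, str):
        # A single name returns the loaded object itself.
        return _load_one(components)
    # Several names return a dictionary keyed by component name.
    return {name: _load_one(name) for name in components}


single = load_sketch("thresholds")
several = load_sketch(["thresholds", "metadata"])
everything = load_sketch(load_all=True)
print(single)          # {'name': 'thresholds'}
print(list(several))   # ['thresholds', 'metadata']
print(len(everything)) # 3
```

Note that a one-element list still yields a dictionary, so callers that want the bare object should pass a plain string.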

Source code in python/scenario_vetting_criteria/__init__.py
def load_criteria(
    components: str | list[str] | tuple[str] | None = None,
    load_all: bool = False,
    csv_engine: Literal["pandas", "python"] = "pandas",
    criteria_types: str | list[str] | None = None,
    reference_subset: str | list[str] | tuple[str] | None = None,
    release: str | None = None,
):
    """Load and return the criteria definitions contained in the package.

    Parameters
    ----------
    components : str | list[str] | tuple[str], optional
        A component name, or a list of component names. The return type
        depends on whether a single string or a list is provided.
    load_all : bool, optional
        Instead of naming individual components, set `load_all=True` to
        load all components.
    csv_engine : {'pandas', 'python'}, default 'pandas'
        The engine used to load CSV files, if any are loaded. Must be
        one of `pandas` or `python`. The output type changes
        accordingly.
    criteria_types : str | list[str] | tuple[str], optional
        When loading the components `thresholds` and `metadata`, by default
        all criteria types are loaded. Alternatively, a single string or a
        list or tuple of strings can be provided as argument `criteria_types`
        to load only a subset of criteria of corresponding type(s).
    reference_subset : str | list[str] | tuple[str], optional
        When loading the component `reference-data`, by default all sources
        are loaded. Alternatively, a single string or a list or tuple of
        strings can be provided as argument `reference_subset` to load only
        a subset of sources.
    release : str, optional
        Define the release of the criteria definition to load. If not
        provided, the latest release will be used.

    Returns
    -------
    pd.DataFrame | dict[str, str] | dict[str, pd.DataFrame | dict[str, str]]
        The loaded data. For a single component this is a dataframe or a
        dictionary. If multiple components are requested, they are
        returned in a dictionary keyed by component name.

    """
    if components is None and not load_all:
        raise Exception(
            "At least one component must be provided as function argument."
        )
    if components is not None and load_all:
        raise Exception(
            "Component name(s) and `load_all` cannot be provided as arguments "
            "at the same time."
        )
    if load_all:
        components = COMPONENTS
    if release is None:
        release = sorted(list(releases))[-1]
    elif release not in releases:
        raise Exception(
            f"Release '{release}' not known. Choose from: "
            f"{', '.join(releases)}"
        )
    release_path = releases[release]
    if criteria_types is not None:
        if isinstance(criteria_types, str):
            criteria_types = [criteria_types]
        elif isinstance(criteria_types, tuple):
            criteria_types = list(criteria_types)
    if reference_subset is not None:
        if isinstance(reference_subset, str):
            reference_subset = [reference_subset]
        elif isinstance(reference_subset, tuple):
            reference_subset = list(reference_subset)
    if isinstance(components, str):
        return _load_criteria_file(
            component=components,
            csv_engine=csv_engine,
            criteria_types=criteria_types,
            reference_subset=reference_subset,
            release_path=release_path,
        )
    elif (
        # accept tuples as well, as documented for `components`
        isinstance(components, (list, tuple)) and
        all(isinstance(c, str) for c in components)
    ):
        return {
            component: _load_criteria_file(
                component=component,
                csv_engine=csv_engine,
                criteria_types=criteria_types,
                reference_subset=reference_subset,
                release_path=release_path,
            )
            for component in components
        }
    else:
        raise Exception(
            "Argument `components` must be a string or a list/tuple of "
            "strings."
        )

scenario_vetting_criteria.formatting

Format bibliographic information on sources.

format_sources

format_sources(
    bib_data,
    style="alpha",
    target="plaintext",
    exclude_fields=None,
)

Convert sources to specific format.

Takes bibliography data, a citation style, an output format, and (optionally) fields to exclude, and returns the formatted sources. The sources are passed in as bib_data, for example loaded from a 'sources.bib' file.

Parameters:

Name Type Description Default
bib_data BibliographyData

Bibliography data loaded from BibTeX file.

required
style str

Specifies the formatting style for the bibliography entries.

'alpha'
target str

Specifies the format in which the citation should be rendered. It determines how the citation information will be displayed or structured in the final output. This can be 'plaintext' or 'html'.

'plaintext'
exclude_fields Optional[list]

Specifies a list of fields that should be excluded from the final output. These fields will be removed from the entries before formatting and returning the citation data.

None

Returns:

Type Description
dict[str, dict]

A dictionary keyed by entry identifier. Each value is a dictionary with the citation strings, the rendered bibliography entry, and DOI, URL, and PDF information for that entry, formatted according to the specified style and target, with any excluded fields removed.
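
To make the shape of each formatted entry concrete, the sketch below rebuilds the citation-related keys from plain strings, without pybtex. The helper name `build_citation_keys` and the sample author, year, and DOI are made up for illustration; the key names (`cite`, `citep`, `doi`, `url_doi`, `url`), the brace stripping, and the fallback from `url` to the DOI link follow the function's source listing on this page.

```python
from __future__ import annotations

import re


def build_citation_keys(
    last_names: list[str],
    year: str = "n.d.",
    doi: str | None = None,
    url: str | None = None,
) -> dict[str, str | None]:
    """Compose citation strings for one entry (illustrative helper)."""
    # Strip BibTeX protection braces from the first author's last name(s).
    cite_auth = re.sub("[{}]", "", " ".join(last_names))
    url_doi = f"https://doi.org/{doi}" if doi else None
    return {
        "cite_auth": cite_auth,
        "cite_year": year,
        "cite": f"{cite_auth} ({year})",    # textual citation
        "citep": f"({cite_auth}, {year})",  # parenthetical citation
        "doi": doi,
        "url_doi": url_doi,
        "url": url or url_doi,              # explicit URL wins, DOI link otherwise
    }


# Hypothetical entry: author "{IPCC}", year 2022, placeholder DOI.
entry = build_citation_keys(["{IPCC}"], "2022", doi="10.1000/example")
print(entry["cite"])   # IPCC (2022)
print(entry["citep"])  # (IPCC, 2022)
print(entry["url"])    # https://doi.org/10.1000/example
```

An entry with neither a DOI nor a URL ends up with `url` set to `None`, so consumers should check for that before building links.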

Source code in python/scenario_vetting_criteria/formatting.py
def format_sources(
    bib_data: BibliographyData,
    style: str = "alpha",
    target: str = "plaintext",
    exclude_fields: Optional[list] = None,
) -> dict[str, dict[str, Optional[str]]]:
    """Convert sources to specific format.

    Takes bibliography data, a citation style, an output format, and
    (optionally) fields to exclude, and returns the formatted sources.
    The sources are passed in as `bib_data`, for example loaded from a
    'sources.bib' file.

    Parameters
    ----------
    bib_data
        Bibliography data loaded from BibTeX file.
    style
        Specifies the formatting style for the bibliography entries.
    target
        Specifies the format in which the citation should be rendered.
        It determines how the citation information will be displayed or
        structured in the final output. This can be 'plaintext' or 'html'.
    exclude_fields
        Specifies a list of fields that should be excluded from the
        final output. These fields will be removed from the entries
        before formatting and returning the citation data.

    Returns
    -------
    dict[str, dict]
        A dictionary keyed by entry identifier. Each value is a
        dictionary with the citation strings, the rendered bibliography
        entry, and DOI, URL, and PDF information for that entry,
        formatted according to the specified style and target, with any
        excluded fields removed.

    """
    # set exclude_fields to an empty list if provided as None
    exclude_fields = exclude_fields or []

    # load pybtex styles and formats based on arguments
    pyb_style = find_plugin("pybtex.style.formatting", style)()
    pyb_format = find_plugin("pybtex.backends", target)()

    # exclude undesired fields
    if exclude_fields:
        for entry in bib_data.entries.values():
            for ef in exclude_fields:
                if ef in entry.fields.__dict__["_dict"]:
                    del entry.fields.__dict__["_dict"][ef]

    # loop over entries and format accordingly
    ret = {}
    for identifier in bib_data.entries:
        try:
            entry = bib_data.entries[identifier]

            first_author = entry.persons.get("author", [])[0].last_names
            cite_auth = re.sub("[{}]", "", " ".join(first_author))
            cite_year = entry.fields.get("year", "n.d.")

            doi = entry.fields.get("doi", None)
            url = entry.fields.get("url", None)
            pdf = entry.fields.get("pdf", None)
            url_doi = f"https://doi.org/{doi}" if doi else None

            if doi:
                del entry.fields["doi"]
            if url:
                del entry.fields["url"]
            if pdf:
                del entry.fields["pdf"]

            bib = next(pyb_style.format_entries([entry])).text.render(
                pyb_format
            )

            ret[identifier] = {
                "cite_auth": cite_auth,
                "cite_year": cite_year,
                "cite": f"{cite_auth} ({cite_year})",
                "citep": f"({cite_auth}, {cite_year})",
                "bib": bib,
                "doi": doi,
                "url_doi": url_doi,
                "url": url or url_doi,
                "pdf": pdf,
            }
        except Exception as ex:
            # chain the original exception so its traceback is preserved
            raise Exception(
                f"Error occurred while parsing '{identifier}':\n{ex}"
            ) from ex

    return ret

insert_citations

insert_citations(text, citations, link='')

Insert citations into placeholders in a text.

Parameters:

Name Type Description Default
text str

Text that contains citation placeholders of the form {{cite:identifier}} or {{citep:identifier}}.

required
citations dict[str, dict[str, str]]

Formatted citations for each identifier.

required
link str

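Address of the page that lists the sources. If given, each citation is wrapped in a link to that page, with the citation identifier as the anchor. If empty, no links are added.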

''

Returns:

Type Description
str

The updated text, which has the patterns replaced with citations.
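
A short, self-contained illustration of the placeholder substitution follows. The function body mirrors the regular-expression approach in the source listing on this page, renamed `insert_citations_sketch` to make clear it is a copy for demonstration; the sample text, identifiers, and the `sources.html` address are made up.

```python
import re


def insert_citations_sketch(text: str, citations: dict, link: str = "") -> str:
    """Replace {{cite:id}} and {{citep:id}} placeholders with citation strings."""
    return re.sub(
        r"{{(cite|citep):([^}]+)}}",
        lambda m: (
            (f'<a href="{link}#{m.group(2)}">' if link else "")
            # Unknown identifiers or styles fall back to the raw placeholder.
            + citations.get(m.group(2), {}).get(m.group(1), m.group(0))
            + ("</a>" if link else "")
        ),
        text,
    )


# Hypothetical formatted citations, keyed by identifier.
citations = {
    "smith2020": {"cite": "Smith (2020)", "citep": "(Smith, 2020)"},
    "doe2021": {"cite": "Doe (2021)", "citep": "(Doe, 2021)"},
}
text = "See {{cite:smith2020}} and {{citep:doe2021}}; {{cite:unknown}} stays."

plain = insert_citations_sketch(text, citations)
linked = insert_citations_sketch(text, citations, link="sources.html")
print(plain)   # See Smith (2020) and (Doe, 2021); {{cite:unknown}} stays.
print(linked)
```

With a non-empty link, an unknown identifier is still wrapped in an anchor around the untouched placeholder, which can be a convenient way to spot missing sources in rendered output.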

Source code in python/scenario_vetting_criteria/formatting.py
def insert_citations(
    text: str,
    citations: dict[str, dict[str, str]],
    link: str = "",
) -> str:
    """Insert citations into placeholders in a text.

    Parameters
    ----------
    text
        Text that contains replacement patterns for citations.
    citations
        Formatted citations for each identifier.
    link
        Top-level page address for all citations.

    Returns
    -------
    str
        The updated text, which has the patterns replaced with citations.

    """
    return re.sub(
        r"{{(cite|citep):([^}]+)}}",
        lambda m: (
            (f'<a href="{link}#{m.group(2)}">' if link else "")
            + citations.get(m.group(2), {}).get(m.group(1), m.group(0))
            + ("</a>" if link else "")
        ),
        text,
    )