pygcam.diff
Functions for computing differences between CSV files and for generating CSV and XLSX from multiple CSV files.
API
See https://opensource.org/licenses/MIT for license details.
pygcam.diff.computeDifference(df1, df2, resetIndex=True, dropna=True, asPercentChange=False, splitLand=False)
Compute the difference between two DataFrames.
Parameters:
- df1 – a pandas DataFrame instance
- df2 – a pandas DataFrame instance
- resetIndex – (bool) if True (the default), the index in the DataFrame holding the computed difference is reset so that data in non-year columns appear in individual columns. Otherwise, the index in the returned DataFrame is based on all non-year columns.
- dropna – (bool) if True, drop rows with NaN values after computing difference
- asPercentChange – (bool) if True, compute percent change rather than difference.
- splitLand – (bool) whether to split ‘Landleaf’ column (if present) to create two new columns, ‘land_use’ and ‘basin’. Ignored if resetIndex is False.
Returns: a pandas DataFrame with the difference in all the year columns, computed as (df2 - df1) if asPercentChange is False, otherwise as (df2 - df1)/df1.
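As a rough illustration of the arithmetic described above, the following sketch reproduces the core behavior with plain pandas. It is not pygcam's implementation: the helper names `compute_difference_sketch` and `is_year`, the rule that year columns are four-digit column names, and the sample data are all assumptions made for this example, and splitLand handling is omitted.

```python
import pandas as pd


def is_year(col):
    # Assumption for this sketch: year columns have four-digit names such as "2020".
    text = str(col)
    return text.isdigit() and len(text) == 4


def compute_difference_sketch(df1, df2, reset_index=True, dropna=True,
                              as_percent_change=False):
    """Hypothetical stand-in that mimics the documented behavior."""
    year_cols = [c for c in df1.columns if is_year(c)]
    key_cols = [c for c in df1.columns if not is_year(c)]

    # Index both frames on the non-year columns so rows align by label.
    a = df1.set_index(key_cols)[year_cols]
    b = df2.set_index(key_cols)[year_cols]

    diff = b - a
    if as_percent_change:
        diff = diff / a          # (df2 - df1) / df1, as in the Returns note
    if dropna:
        diff = diff.dropna()     # drop rows containing NaN values
    if reset_index:
        diff = diff.reset_index()  # move the non-year columns back out of the index
    return diff


baseline = pd.DataFrame({"region": ["USA", "Brazil"],
                         "2020": [10.0, 4.0], "2025": [12.0, 5.0]})
policy = pd.DataFrame({"region": ["USA", "Brazil"],
                       "2020": [10.0, 4.0], "2025": [15.0, 4.0]})

diff = compute_difference_sketch(baseline, policy)
pct = compute_difference_sketch(baseline, policy, as_percent_change=True)
```

With reset_index left at True, the result keeps `region` as an ordinary column next to the year columns. Passing `reset_index=False` leaves it in the index instead.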
pygcam.diff.diffCsvPathname(query, baseline, policy, diffsDir=None, workingDir='.', createDir=False, asPercentChange=False)
Compute the path to the CSV file containing differences between policy and baseline scenarios for query.
Parameters:
- query – (str) the base file name of the query result
- baseline – (str) the baseline scenario
- policy – (str) the policy scenario
- workingDir – (str) the directory immediately above the baseline and policy sandboxes.
- createDir – (bool) whether to create the diffs directory, if needed.
Returns: (str) the pathname of the CSV file
pygcam.diff.queryCsvPathname(query, scenario, workingDir='.')
Compute the path to the CSV file containing results for the given query and scenario.
Parameters:
- query – (str) the base file name of the query result
- scenario – (str) the scenario name
- workingDir – (str) the directory immediately above the baseline and policy sandboxes.
Returns: (str) the pathname of the CSV file
pygcam.diff.writeDiffsToCSV(outFile, referenceFile, otherFiles, skiprows=1, interpolate=False, years=None, startYear=0, asPercentChange=False, splitLand=False)
Compute the differences between the data in a reference .CSV file and one or more other .CSV files as (other - reference), optionally interpolating annual values between timesteps, storing the results in a single .CSV file. See also writeDiffsToXLSX() and writeDiffsToFile().
Parameters:
- outFile – (str) the name of the .CSV file to create
- referenceFile – (str) the name of a .CSV file containing reference results
- otherFiles – (list of str) the names of other .CSV files for which to compute differences.
- skiprows – (int) should be 1 for GCAM files, to skip header info before column names
- interpolate – (bool) if True, linearly interpolate annual values between timesteps in all data files and compute the differences for all resulting years.
- years – (iterable of 2 values coercible to int) the range of years to include in results.
- startYear – (int) the year at which to begin interpolation, if interpolate is True. Defaults to the first year in years.
- asPercentChange – (bool) if True, compute percent change rather than difference.
Returns: none
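The interpolate option can be pictured with a small standard-library sketch: fill in annual values linearly between timesteps, then subtract the reference from the other series. The function names `interpolate_annual` and `annual_differences`, the dictionary representation, and the way `start_year` is handled (intervals that begin before it are left at their original timesteps) are assumptions for illustration only, not pygcam's code.

```python
def interpolate_annual(series, start_year=0):
    """Return a copy of {year: value} with annual values filled in linearly.

    Hypothetical helper: intervals starting before start_year are not filled.
    """
    years = sorted(series)
    if not years:
        return {}
    annual = {}
    for y0, y1 in zip(years, years[1:]):
        annual[y0] = series[y0]
        if y0 < start_year:
            continue  # keep only the original timestep for this interval
        for year in range(y0 + 1, y1):
            fraction = (year - y0) / (y1 - y0)
            annual[year] = series[y0] + fraction * (series[y1] - series[y0])
    annual[years[-1]] = series[years[-1]]
    return annual


def annual_differences(reference, other, start_year=0):
    """Compute (other - reference) for every year present after interpolation."""
    ref = interpolate_annual(reference, start_year)
    oth = interpolate_annual(other, start_year)
    return {year: oth[year] - ref[year] for year in sorted(ref) if year in oth}


reference = {2020: 10.0, 2025: 20.0}
other = {2020: 10.0, 2025: 30.0}
diffs = annual_differences(reference, other)
```

Here `diffs` has one entry per year from 2020 through 2025, growing linearly from no difference in 2020 to the full timestep difference in 2025.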
pygcam.diff.writeDiffsToFile(outFile, referenceFile, otherFiles, ext='csv', skiprows=1, interpolate=False, years=None, startYear=0, asPercentChange=False, splitLand=False)
Compute the differences between the data in a reference .CSV file and one or more other .CSV files as (other - reference), optionally interpolating annual values between timesteps, storing the results in a single .CSV or .XLSX file. See writeDiffsToCSV() and writeDiffsToXLSX() for more details.
Parameters:
- outFile – (str) the name of the file to create
- referenceFile – (str) the name of a .CSV file containing reference results
- otherFiles – (list of str) the names of other .CSV files for which to compute differences.
- ext – (str) if ‘.csv’, results are written to a single .CSV file, otherwise, they are written to an .XLSX file.
- skiprows – (int) should be 1 for GCAM files, to skip header info before column names
- interpolate – (bool) if True, linearly interpolate annual values between timesteps in all data files and compute the differences for all resulting years.
- years – (iterable of 2 values coercible to int) the range of years to include in results.
- startYear – (int) the year at which to begin interpolation, if interpolate is True. Defaults to the first year in years.
- asPercentChange – (bool) whether to write diffs as percent change from baseline
- splitLand – (bool) whether to split ‘Landleaf’ column (if present) to create two new columns, ‘land_use’ and ‘basin’.
Returns: none
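The years parameter is described as two values coercible to int. One plausible reading, sketched below, is that both values are converted with `int()` and used to select the year columns that fall in that range. The helper name `select_year_columns`, the inclusive treatment of both endpoints, and the sample column names are assumptions for this example; the source does not state whether the range is inclusive.

```python
def select_year_columns(columns, years=None):
    """Return the year-like column names within the (first, last) range.

    Hypothetical helper; endpoints are assumed to be inclusive.
    """
    year_cols = [c for c in columns if str(c).isdigit()]
    if years is None:
        return year_cols  # no range given: keep every year column
    first, last = (int(y) for y in years)
    return [c for c in year_cols if first <= int(c) <= last]


columns = ["region", "sector", "2015", "2020", "2025", "2030", "Units"]

# Mixed types are fine as long as each value can be passed to int().
selected = select_year_columns(columns, years=("2020", 2025.0))
all_years = select_year_columns(columns)
```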
pygcam.diff.writeDiffsToXLSX(outFile, referenceFile, otherFiles, skiprows=1, interpolate=False, years=None, startYear=0, asPercentChange=False, splitLand=False)
Compute the differences between the data in a reference .CSV file and one or more other .CSV files as (other - reference), optionally interpolating annual values between timesteps, storing the results in a single .XLSX file with each difference matrix on a separate worksheet, and with an index worksheet with links to the other worksheets. See also writeDiffsToCSV() and writeDiffsToFile().
Parameters:
- outFile – (str) the name of the .XLSX file to create
- referenceFile – (str) the name of a .CSV file containing reference results
- otherFiles – (list of str) the names of other .CSV files for which to compute differences.
- skiprows – (int) should be 1 for GCAM files, to skip header info before column names
- interpolate – (bool) if True, linearly interpolate annual values between timesteps in all data files and compute the differences for all resulting years.
- years – (iterable of 2 values coercible to int) the range of years to include in results.
- startYear – (int) the year at which to begin interpolation, if interpolate is True. Defaults to the first year in years.
- asPercentChange – (bool) if True, compute percent change rather than difference.
Returns: none