Defining parameters and distributions

The file parameters.xml defines the parameters modified in a Monte Carlo Simulation (MCS) with GCAM.

Note that there is no native concept of a “parameter” in GCAM. The XML files read by GCAM define thousands of numerical values describing emissions, conversion efficiencies, elasticities, share-weights, logit exponents, and more, with values that can differ by region, sector, subsector, year, and AEZ.

In pygcam.mcs, a parameter is defined as a query on an XML file (using the “XPath” query language) that produces a set of numerical values. This is general enough to allow the analyst to be as specific or broad as desired in deciding which values to perturb as a set.

For example, the following XML fragment defines the parameter n2o-emissions as all input-emissions elements below Non-CO2 elements named N2O_AGR (i.e., agricultural N2O emission) and AgProductionTechnology elements, for the year “2005” only. The query is applied to the XML file identified in the configuration file with “name” nonco2_aglu.

<!-- N2O emissions intensity -->
<InputFile name="nonco2_aglu">
  <Parameter name="n2o-emissions">
    <Query>//AgProductionTechnology/period[@year="2005"]/Non-CO2[@name="N2O_AGR"]/input-emissions</Query>
    <Distribution apply="multiply">
      <LogUniform factor="2"/>  <!-- i.e., from half to double -->
    </Distribution>
  </Parameter>
</InputFile>

The sub-command gensim ignores the query, and simply draws values from the designated distributions for all defined parameters, and saves these to a CSV file. Using a CSV file as an intermediate representation allows any plugin (or manual process) to generate the data used to generate the actual input XML files. (The default method is Latin Hypercube Sampling from the indicated distributions, but other sample methods are provided via the gensim’s -m / --method argument.)

XML elements

The elements that comprise the parameters.xml file are described below.

<ParameterList>

The top-most element, <ParameterList>, encloses one or more <InputFile> (or <comment>) elements. The <ParameterList> element accepts no attributes.

<InputFile>

The <InputFile> element describes a file on which XPath queries will be run for each <Parameter>, to produce sets of values that are perturbed for each trial. The <InputFile> element accepts the following attribute:

Attribute Required Default Values
name yes (none) text

The name is must identify an input file in the GCAM configuration XML file, given in a <Value> element within the <ScenarioComponents> element. For example, in this fragment of configuration_ref.xml:

<ScenarioComponents>
   ...
   <Value name="solver">../input/solution/cal_broyden_config.xml</Value>
   ...
</ScenarioComponents>

the input file would be indicated using the name "solver". Note that this requires each file in <ScenarioCompenents> to have a unique name, as is the case starting in GCAM v4.3.

An <InputFile> element must contain at least one <Parameter> element, and can contain any number of <comment> or <WriteFunc> elements.

<Parameter>

The <Parameter> element specifies the XPath query to run and the distribution to apply to the results of the query. It must contain exactly one <Distribution> element, zero or one <Query> elements, and any number of <Correlation> elements.

The <Parameter> element accepts the following attributes:

Attribute Required Default Values
name yes (none) text
mode no shared see below
active no “1” boolean

Accepted values for mode are “shared”, “independent”, and “ind”. The final two values are synonymous. The default, “shared”, indicates that a single random variable (RV) should be created and used for all the elements retrieved by the XPath query. The value “ind” or “independent” indicates that an RV should be created for each value returned. Note that it is common for queries to return hundreds or even tens of thousands of values, so specifying these as independent can create a lot of overhead. In most cases, “shared” is more appropriate.

The active attribute can be set to “0” or “false” to disable a <Parameter>, causing it to be ignored in the MCS.

<WriteFunc>

The <WriteFunc> element defines a Python function to be called before an XML file is written. This provides a hook to make arbitrary modifications to the XML that cannot be handled in a more straighforward manner. The element takes no attributes and must contain a period-delimited value that interpreted to be a sequence of Python package/module names and a final function name.

For example, to call the function my_func in the MyModule module of package MyPkg, you would write:

<WriteFunc>MyPkg.MyModule.my_func</WriteFunc>

<Distribution>

The <Distribution> element defines the shape of the distribution from which values should be randomly drawn. This element accepts an apply attribute that defines how the randomly drawn value will be applied to the values returned by the XPath query.

Attribute Required Default Values
apply no direct see below

The following values are recognized:

  • dir, direct, or replace : the random value replaces the values returned by the XPath query.
  • add : the randomly drawn value is added to the values returned by the XPath query.
  • mult or multiply : the randomly drawn value is multiplied by the values returned by the XPath query.
  • a period-delimited string indicating a package/module and function which is called to generate the result. # TBD: review this.

<Correlation>

This element allows the user to require that values of drawn from the current parameter’s distribution have a given rank correlation (with values in [-1, 1]) with values drawn for one or more other parameters. The rank correlation is produced by drawing all the random values and then reordering them so that the requested rank correlation obtains.

<With>

Names one parameter with which the enclosing parameter is correlated, and provides the correlation level. The “text” value of the <With> element must be a floating point number in the range [-1, 1].

Attribute Required Default Values
name yes (none) text

<Linked>

The <Linked> element allows the vector of random values drawn for one parameter to be shared with another parameter. This can be useful, for example, when you have two in two XML files that are conceptually a single “parameter”. This differs from the <Correlation> element, which ensures only a given level of rank correlation between parameters.

Attribute Required Default Values
parameter yes (none) text

<Constant>

This pseudo-distribution produces the designated value on every draw. It can be used to force a set of XML elements to a given value.

Attribute Required Default Values
value yes (none) float

<Sequence>

This pseudo-distribution produces each value from a discrete set of comma-separated values, in order. If the number of trials exceeds the number of values, the list is recycled as many times as needed. This distribution can be used to generate trials with a series of specific values.

Attribute Required Default Values
values yes (none) text*

The values attribute takes a text argument that must contain comma-delimited integer or floating point values. (Integers are converted to float, however.) Spaces around commas are removed, so they can be added for readability.

<Binary>

Produces a discrete distribution with a 50% chance of returning 0 or 1. This element accepts no attributes.

<Integers>

This produces a discrete distribution with all integer values in the range [min, max] having equal probability of being drawn.

Attribute Required Default Values
min no (none) float
max no (none) float

<Grid>

Produces a distribution of count values equally spaced across the range [min, max] (inclusive).

Attribute Required Default Values
min no (none) float
max no (none) float
count no (none) float

<Uniform>

Produces a uniform distribution of values from a given range. The range can be specified one of three ways:

  1. Explicit minimum and maximum values, e.g.
<Uniform min=0.25 max=0.5>
  1. A symmetrical spread around zero, equivalent to Uniform(-range, +`range`), which is used mainly when adding a random value to the original data:

    <Uniform range=0.25>
    

    which is equivalent to

    <Uniform min=-0.25 max=0.25>
    
  2. A symmetrical range around 1, defined as Uniform(1 - factor, 1 + factor), which is used mainly when multiplying a random value by the original data:

    <Uniform factor=0.25>
    

    which is equivalent to

    <Uniform min=0.75 max=1.25>
    

The valid attributes are any of the following sets:

Attribute Required Default Values
min yes (none) float
max yes (none) float

or

Attribute Required Default Values
factor yes (none) float

or

Attribute Required Default Values
range yes (none) float

<LogUniform>

The <LogUniform> element defines a uniform distribution from 1/factor to factor. For example, the following two distribution specifications are equivalent:

<LogUniform factor=3>

<Uniform min=0.333 max=3>
Attribute Required Default Values
factor yes (none) float

<Triangle>

Defines a triangular distribution. There are three alternatives for declaring the distribution:

  1. Explicit minimum, mode, and maximum values, e.g.
<Triangle min=0.25 mode=0.40 max=0.75>
  1. A symmetrical spread around zero, equivalent to Triangle(-range, 0, +`range`), which is used mainly when adding a random value to the original data:

    <Triangle range=0.25>
    

    which is equivalent to

    <Triangle min=-0.25 mode=0 max=0.25>
    
  2. A symmetrical range around 1, defined as Triangle(1 - factor, 1, 1 + factor), which is used mainly when multiplying a random value by the original data:

    <Triangle factor=0.25>
    

    which is equivalent to

    <Triangle min=0.75 mode=1.0 max=1.25>
    

The valid attributes are any of the following sets:

Attribute Required Default Values
min yes (none) float
max yes (none) float
mode yes (none) float

or

Attribute Required Default Values
range yes (none) float

or

Attribute Required Default Values
factor yes (none) float

<Normal>

Attribute Required Default Values
mean yes (none) float
stdev yes (none) float

<Lognormal>

This element defines a lognormal distribution one of two ways:

  1. By providing the mean and standard deviation of the lognormal distributio, e.g.
<Lognormal mean=0.5 stdev=0.2>
  1. By providing the 2.75% and 97.5% values (bounds of the central 95% of the distribution), e.g.,

    <Lognormal low95=0.1 high95=0.6>
    

The valid attributes are either of the following sets:

Attribute Required Default Values
mean yes (none) float
stdev yes (none) float

or

Attribute Required Default Values
low95 yes (none) float
high95 yes (none) float

<DataFile>

Describes a file containing data to be used in place of a distribution.

<PythonFunc>

The <PythonFunc> element defines a Python function to be called to produce an array of values used in place of a distribution. The element takes no attributes and must contain a period-delimited value that interpreted to be a sequence of Python package/module names and a final function name.

For example, to call the function my_func in the MyModule module of package MyPkg, you would write:

<WriteFunc>MyPkg.MyModule.my_func</WriteFunc>

Example

Following is an example of a parameters.xml file.

<?xml version="1.0" encoding="UTF-8"?>

<ParameterList>
  <!--
  The distribution on fraction of land protected is processed via the
  function 'protectTrial' in paper1/mcs/trialFuncs.py
   -->
  <InputFile name="land2">
    <Parameter name="protected-fraction" active="1">
      <Distribution apply="trialFuncs.protectTrial">
        <Uniform min="0.8" max="1.0"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <!-- Takes values from parameter 'protected-fraction' -->
  <InputFile name="land3">
    <Parameter name="protected-fraction-linked" active="1">
      <Distribution apply="trialFuncs.protectTrial">
          <Linked parameter="protected-fraction"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <!-- N2O emissions intensity -->
  <InputFile name="nonco2_aglu">
    <Parameter name="n2o-emissions">
      <Query>//AgProductionTechnology/period[@year="2005"]/Non-CO2[@name="N2O_AGR"]/input-emissions</Query>
      <Distribution apply="multiply">
        <LogUniform factor="2"/>  <!-- i.e., from half to double -->
      </Distribution>
    </Parameter>
  </InputFile>

  <InputFile name="energy_supply">
    <Parameter name="bd-biomassOil-coef">
      <Query>//global-technology-database/location-info[@sector-name="regional biomassOil"]/technology[@name="OilCrop"]/period[@year>"2010"]/minicam-energy-input/coefficient</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.20"/>
      </Distribution>
    </Parameter>

    <Parameter name="corn-etoh-corn-coef">
      <Query>//global-technology-database/location-info/technology[@name="regional corn for ethanol"]/period[@year>"2010"]/minicam-energy-input/coefficient</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.20"/>
      </Distribution>
    </Parameter>

    <Parameter name="corn-ddgs-output-ratio">
      <Query>//stub-technology[@name="regional corn for ethanol"]/period[@year>"2010"]/fractional-secondary-output[@name="DDGS and feedcakes"]/output-ratio</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.20"/>
      </Distribution>
    </Parameter>

  </InputFile>

  <!-- Logit exponents -->
  <InputFile name="land1">
    <Parameter name="agro-forest-logit-exp">
      <!-- competition between forest-grass-crop and pasture (283 nodes) -->
      <Query>//LandAllocatorRoot/LandNode[starts-with(@name, "AgroForestLandAEZ")]/logit-exponent</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.25"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <InputFile name="land3">
    <Parameter name="forest-logit-exp">
      <!-- Managed vs unmanaged forest (283 nodes) -->
      <Query>//LandNode/LandNode[starts-with(@name,"AgroForest_NonPasture")]/LandNode[starts-with(@name,"AllForestLand")]/logit-exponent</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.25"/>
      </Distribution>
    </Parameter>

    <Parameter name="crop-logit-exp">
      <!-- Crops (283 nodes) -->
      <Query>//LandNode/LandNode[starts-with(@name,"AgroForest_NonPasture")]/LandNode[starts-with(@name,"CropLand")]/logit-exponent</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.25"/>
      </Distribution>
    </Parameter>

    <Parameter name="grass-shrub-logit-exp">
      <!-- Grass vs shrubland (283 nodes; value is 0.05 everywhere) -->
      <Query>//LandNode/LandNode[starts-with(@name,"AgroForest_NonPasture")]/LandNode[starts-with(@name,"GrassShrubLand")]/logit-exponent</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.25"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <InputFile name="land2">
    <Parameter name="pasture-logit-exp">
      <!-- Managed vs unmanaged pasture (283 nodes) -->
      <Query>//LandNode[starts-with(@name,"AgroForestLand")]/LandNode[starts-with(@name,"AllPastureLand")]/logit-exponent</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.25"/>
      </Distribution>
    </Parameter>

    <Parameter name="forest-grass-crop-logit-exp">
      <!-- Forest vs grassland vs cropland (283 nodes) -->
      <Query>//LandNode[starts-with(@name,"AgroForestLand")]/LandNode[starts-with(@name,"AgroForest_NonPasture")]/logit-exponent</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.25"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <!-- Carbon densities -->
  <InputFile name="land3">
    <Parameter name="crop-biomass-c">
      <!-- All crops (3636 nodes) -->
      <Query>//LandNode/LandNode/LandNode[starts-with(@name,"CropLand")]/LandLeaf/land-carbon-densities/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="crop-soil-c">
      <!-- All crops (3636 nodes) -->
      <Query>//LandNode/LandNode/LandNode[starts-with(@name,"CropLand")]/LandLeaf/land-carbon-densities/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>

    <!-- Managed forest (566 nodes; the "*" captures both <land-use-history> and <no-emiss-carbon-calc> elements) -->
    <Parameter name="mgd-forest-biomass-c">
      <Query>//LandNode/LandNode/LandNode[starts-with(@name,"AllForestLand")]/LandLeaf[starts-with(@name,"ForestAEZ")]/*/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="mgd-forest-soil-c">
      <Query>//LandNode/LandNode/LandNode[starts-with(@name,"AllForestLand")]/LandLeaf[starts-with(@name,"ForestAEZ")]/*/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>

    <!-- Unmanaged forest (283 nodes) -->
    <Parameter name="unmgd-forest-biomass-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"UnmanagedForest")]/no-emiss-carbon-calc/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="unmgd-forest-soil-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"UnmanagedForest")]/no-emiss-carbon-calc/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>

    <!-- Other arable land (283 nodes) -->
    <Parameter name="other-arable-biomass-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"OtherArableLand")]/land-carbon-densities/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="other-arable-soil-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"OtherArableLand")]/land-carbon-densities/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>

    <!-- Unmanaged shrubland (283 nodes) -->
    <Parameter name="shrub-biomass-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"Shrubland")]/land-carbon-densities/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="shrub-soil-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"Shrubland")]/land-carbon-densities/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>

    <!-- Unmanaged grassland (283 nodes) -->
    <Parameter name="grass-biomass-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"Grassland")]/land-carbon-densities/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="grass-soil-c">
      <Query>//LandNode/LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"Grassland")]/land-carbon-densities/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <!-- Carbon densities -->
  <InputFile name="land2">
    <!-- Managed pasture (283 nodes) -->
    <Parameter name="mgd-pasture-biomass-c">
      <Query>//LandNode/LandNode/LandLeaf[starts-with(@name,"Pasture")]/land-carbon-densities/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="mgd-pasture-soil-c">
      <Query>//LandNode/LandNode/LandLeaf[starts-with(@name,"Pasture")]/land-carbon-densities/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>

    <!-- Unmanaged pasture (283 nodes) -->
    <Parameter name="unmgd-pasture-biomass-c">
      <Query>//LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"UnmanagedPasture")]/land-carbon-densities/above-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
    <Parameter name="unmgd-pasture-soil-c">
      <Query>//LandNode/LandNode/UnmanagedLandLeaf[starts-with(@name,"UnmanagedPasture")]/land-carbon-densities/below-ground-carbon-density</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <!-- Agricultural productivity (yield) (66,222 nodes) -->
  <InputFile name="ag_prodchange">
    <Parameter name="crop-productivity">
      <Query>//region/AgSupplySector/AgSupplySubsector/AgProductionTechnology/period[@year>"2010"]/agProdChange</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.30"/>
      </Distribution>
    </Parameter>
  </InputFile>

  <!-- Food demand -->
  <InputFile name="demand">
    <!-- Price elasticity of food crop demand (558 nodes; currently zero everywhere, so can't multiply) -->
    <Parameter name="food-crop-price-elast">
      <Query>//energy-final-demand[@name="FoodDemand_Crops"]/price-elasticity[@year>"2010"]</Query>
      <Distribution apply="direct">
        <Uniform min="-0.2" max="0"/>
      </Distribution>
    </Parameter>

    <!-- Price elasticity of meat demand (558 nodes) -->
    <Parameter name="meat-price-elast">
      <Query>//energy-final-demand[@name="FoodDemand_Meat"]/price-elasticity[@year>"2010"]</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.2"/>
      </Distribution>
    </Parameter>

    <!-- Income elasticity of food crop demand (558 nodes) -->
    <Parameter name="food-crop-income-elast">
      <Query>//energy-final-demand[@name="FoodDemand_Crops"]/income-elasticity[@year>"2010"]</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.2"/>
      </Distribution>
    </Parameter>

    <!-- Income elasticity of meat demand (558 nodes) -->
    <Parameter name="meat-income-elast">
      <Query>//energy-final-demand[@name="FoodDemand_Meat"]/income-elasticity[@year>"2010"]</Query>
      <Distribution apply="multiply">
        <Uniform factor="0.4"/>
      </Distribution>
    </Parameter>
  </InputFile>

</ParameterList>