Tutorial 2: Creating an Atmosphere and Working with Physical Units

# Some set-up for this document
using RetrievalToolbox;
const RE = RetrievalToolbox;
using Plots; default(fontfamily="Helvetica", titlefontsize=10, labelfontsize=10)
using Unitful;

1 Introduction

In the previous tutorial (Tutorial 1), we have learned how to instantiate two fundamental objects within RetrievalToolbox, spectroscopy and gases. Those two object types are found towards the bottom of the RetrievalToolbox type hierarchy: an ABSCOSpectroscopy4D object does not require another RetrievalToolbox object type, and a GasAbsorber only requires one spectroscopy type.

Goal of this tutorial

In this tutorial we will focus on constructing an atmosphere object with RetrievalToolbox, which plays an equally important role within retrieval applications. Further, we will learn how we can effectively use physical units.

1.1 The `EarthAtmosphere` Object

RetrievalToolbox features a type that represents a typical atmosphere found here on Earth, of course only containing information that is relevant for typical retrieval applications. Within RetrievalToolbox, we follow the general model of a layered atmosphere in which we assume all relevant quantities to be constant inside that layer. Here we already must make an important observation! The EarthAtmosphere object contains not just one, but two layer systems: one representing the vertical grid which the inversion will see, and a second one on which meteorological quantities are defined. While possibly confusing at first, this set-up allows us to be much more flexible with respect to our choices in setting up the retrieval algorithm.

1.1.1 The Retrieval and Meteorological Grids

The choice of pressure grid in a given retrieval application is based on a number of factors, such as computational performance, type of instrument from whose measurements we are trying to retrieve gases, and so on. More layers usually means slower forward model, as the radiative transfer solver has to do more work. On the other hand, we might find that we need more vertical layers to account for e.g. scatterers that are located in different parts of the atmosphere and a coarse layering structure would not allow us to reasonably model those. We recommend reading the appropriate section in (Rodgers 2000, vol. 2, sec. 10.3.1) to learn more about optimal choices for the retrieval grid.

In general, we want to be highly flexible when making that choice regarding this retrieval grid.

Further, most retrieval applications require meteorological profiles. For retrievals in the short-wave infrared, the most commonly uses ones are profiles of temperature and specific humidity. It is also highly common for these meteorological profiles to be derived from modern forecasting systems or reanalysis products, both tend to have much higher vertical resolution than most retrieval grids. We aim to retain the higher resolution of those MET profiles, and thus RetrievalToolbox has a dedicated space for those profiles. Figure 1 shows an illustration of two possible grids.

Note

The retrieval and meteorological grids are fully independent of each other! There is no general requirement that one must be finer resolved than the other one, and they could be the same!

Figure 1: The retrieval pressure grid (left) compared to the meteorological grid (right).

Note the convention in Figure 1! By this convention, we order the atmosphere with a running index starting at 1 from the top. p_1 thus refer to the top-of-atmosphere (TOA) level in this figure. Also note that we use the same convention when indexing layers rather than levels. The first layer, situated between levels p_1 and p_2 also has index 1.

Warning

Handling level and layer-related quantities can easily cause confusion and requires attention! In general, there is no guaranteed method of learning whether a quantity is defined with respect to layers or levels, other than understanding the nature of that quantity itself. Make sure to consult on-line and in-line documentation of the related functions!

The meteorological grid and its associated profiles (temperature T, specific humidity q, altitude z and local gravity g) are used mostly in the calculation of the optical properties that later enter the radiative transfer calculations and are of utmost importance.

We can illustrate this by a specific example. The quantity of interest there is the optical depth due to some gas absorber, within layer l = 1:

\tau_l = \int_{p_2}^{p_1} \underbrace{\dfrac{1 - q(p)}{g(p) \cdot M_{\mathrm{dry}}}}_{\text{number of dry-air molecules}} \cdot \overbrace{c_\text{gas}(p)}^{\text{concentration of gas}} \cdot \underbrace{k(p)}_{\text{gas cross section}} \; \mathrm{d}p \tag{1}

Without going into the details about how one numerically evaluates this integral, we can identify the various terms in Equation 1 and whether they belong to the retrieval or the meteorological grid.

The gas concentration c_\text{gas}(p) is based off a profile that is likely connected to the retrieval grid. For the sake of this example, let us imagine this particular gas represents a target gas that we want to retrieve from a measurement, and therefore the profile is defined on the retrieval grid. In RetrievalToolbox, gases to be retrieved are defined on pressure levels p_i and values for any arbitrary p are obtained through linear interpolation in (linear) pressure.

The two meteorological pressure-dependent variables q(p) (specific humdity) and g(p) (local gravity) are discretized on their own grid (the right grid in Figure 1). When evaluating q(p) or g(p) for arbitrary p, the respective values are obtained via linear interpolation on their own meteorological grid.

Lastly, the evaluation of the gas cross section k(p) is actually performed in a completely different grid that either of the two shown in Figure 1. As mentioned in another tutorial (Tutorial 1), spectroscopic data is sampled on its own pressure and temperature levels.

Note however the following. When we have to evaluate the integrand in Equation 1 at some arbitrary p, we must also look up k(p). Earlier, we learned that k(p) is really k(p,T,c_\text{H$_2$O}), so we must also know what the temperature and the water vapor mole fraction is for any given pressure in the model atmosphere. The temperature profile is defined on the MET grid, so we can easily infer T \rightarrow T(p). While the water vapor mole fraction c_\text{H$_2$O} is not explicitly stored, we can first infer the specific humidity at any given p as q \rightarrow q(p) and then calculate c_\text{H$_2$O}(q), so that we can finally evaluate the cross section k as k(p, T(p), c_\text{H$_2$O}(q(p))).

1.1.2 Type definition

The definition of the EarthAtmosphere object is shown below, with all type fields having verbose names to make it clear what they represent - especially whether they belong to pressure layers or pressure levels.

EarthAtmosphere

atm_elements::Vector{<:AbstractAtmosphereElement}: Vector of atmosphere elements present in this atmosphere
N_level::Int64: Number of retrieval levels in this atmosphere
N_layer::Int64: Number of retrieval layers in this atmosphere
pressure_levels::Vector{T} where T<:AbstractFloat: Pressure level locations
pressure_layers::Vector{T} where T<:AbstractFloat: Mid-layer pressure locations
pressure_unit::Unitful.Units{U, 𝐌 𝐋⁻¹ 𝐓⁻²} where U: Pressure unit
N_met_level::Int64: Number of meteorological levels in this atmosphere
N_met_layer::Int64: Number of meteorological layers in this atmosphere
met_pressure_levels::Vector{T} where T<:AbstractFloat: Meteorological pressure level locations
met_pressure_layers::Vector{T} where T<:AbstractFloat: Mid-layer pressure locations for meteorology
met_pressure_unit::Unitful.Units{U, 𝐌 𝐋⁻¹ 𝐓⁻²} where U: Pressure units for meteorology
temperature_levels::Vector{T} where T<:AbstractFloat: Temperatures at pressure levels
temperature_layers::Vector{T} where T<:AbstractFloat: Temperatures at mid-layer pressures
temperature_unit::Unitful.Units{U, 𝚯, nothing} where U: Temperature unit
specific_humidity_levels::Vector{T} where T<:AbstractFloat: Specific humidity at pressure levels
specific_humidity_layers::Vector{T} where T<:AbstractFloat: Specific humidity at mid-layer pressures
specific_humidity_unit::Unitful.Units{U, NoDims} where U: Specific humidity unit
altitude_levels::Vector{T} where T<:AbstractFloat: Altitude at pressure levels
altitude_layers::Vector{T} where T<:AbstractFloat: Altitude at mid-layer pressures
altitude_unit::Unitful.Units{U, 𝐋} where U: Altitude units
gravity_levels::Vector{T} where T<:AbstractFloat: Gravity at pressure levels
gravity_layers::Vector{T} where T<:AbstractFloat: Gravity at mid-layer pressures
gravity_unit::Unitful.Units{U, 𝐋 𝐓⁻²} where U: Gravity unit

For the creation of an EarthAtmosphere object, there is one convenice function that produces the object with a specified number of levels for both grids, and fills them with zeros. This is a recurring design in RetrievalToolbox that users should get familiar with. Objects, especially those of considerable size, are allocated (or created) once, and then altered (or mutated) through the course of the retrieval process. Later tutorials will elaborate on the neccessity of this approach along with best practices.

The convenience function used to produce a pre-allocated EarthAtmosphere object is named create_empty_EarthAtmosphere, and can be called in the following way:

atm = RE.create_empty_EarthAtmosphere(
    4, # number of levels for retrieval grid
    6, # number of levels for MET grid
    Float64, # data type for all arrays
    pressure_unit=u"hPa", # pressure unit for retrieval grid
    met_pressure_unit=u"Pa", # pressure unit for MET grid
    temperature_unit=u"K", # unit for temperature profile
    specific_humidity_unit=u"g/kg", # unit for specific humidity profile
    altitude_unit=u"km", # unit for altitude profile
    gravity_unit=u"m/s^2", # unit for local gravity profile
);

The parameters to the function create_empty_EarthAtmosphere are rather self-explanatory: we first pass the number of levels we want for both retrieval and meteorological grid. Then we must also decide on the data type that we want the arrays of the object to be; 64-bit floats are a good choice in almost all cases. The remaining 6 parameters define the units of the quantities that object contains. Two major things are of note here. First, we can see that there are two different units for the two pressure grids! This means that the retrieval grid could be specified in, e.g., pascal, and the meteorological grid could be specified in, e.g., torr. They can be the same of course. Supplying units explicitly is optional, the function has defined default values.

1.2 Ingesting Values with Units

From here on, we can inspect the object fields in the known manner - for example we can look at vector representing the retrieval pressure grid variables and see that they are all zeros.

atm.pressure_levels

4-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0

The EarthAtmosphere type itself is not mutable, which means that we cannot replace the vector atm.pressure_level by our desired pressure level. Attempting an operation such as

atm.pressure_levels = [1., 100., 500., 1000.]

setfield!: immutable struct of type EarthAtmosphere cannot be changed

Stacktrace:
 [1] setproperty!(x::EarthAtmosphere{Float64}, f::Symbol, v::Vector{Float64})
   @ Base ./Base.jl:53
 [2] top-level scope
   @ In[6]:1

will result in the error seen above. However, we can easily change the contents of the vector without having to violate the immutable nature of the EarthAtmosphere-typed object atm.

atm.pressure_levels[:] = [1., 100., 500., 1000.];
atm.pressure_levels

4-element Vector{Float64}:
    1.0
  100.0
  500.0
 1000.0

Recall that [:] is an indexing operation, so rather than trying to assign a new vector to the pressure_levels field of the object atm, we are accessing the contents of the vector that is referenced by atm.pressure_levels.

Tip

Contents of vectors, arrays, lists etc. of objects of mutable types can be altered!

Above we have created the atm object with a specific unit for the retrieval pressure grid, and we can access that unit via

atm.pressure_unit

hPa

The new pressure grid we just created is obviously “reasonable”, so to speak - meaning that values from 1 through 1000 hPa make sense within the context of an atmosphere on Earth. Note, however, that there is nothing that would keep us from assigning non-sensical values.

A very likely error to occur is the mix-up of units. We can demonstrate a possible scenario by pretending we obtain our retrieval grid from some external source, like a configuration script or a file containing our desired grid. Further, we will pretend that those values were initially calculated in units of pascal (Pa) rather than hectopascal (hPa).

# Note: Julia allows "_"-separators in numbers to make them visually more obvious to readers
rgrid = [100., 300., 50_000., 100_000.]
atm.pressure_levels[:] = rgrid

4-element Vector{Float64}:
    100.0
    300.0
  50000.0
 100000.0

Now neither RetrievalToolbox nor Julia have any complaints about this, we have mostly just copied the contents of some vector into the memory space of another vector. Like in all scientific programming, it is the responsibility of the user to ensure that the correct values are fed into objects and functions. The most classic way to do this would be the following: we know we obtain the data in units of pascal, and we know that our object knows its retrieval grid in units of hectopascal, so we make the appropriate conversion before assigment and comment somewhere in our code why we did so:

# Divide by 100 since `rgrid` is in Pa
atm.pressure_levels[:] = rgrid ./ 100

4-element Vector{Float64}:
    1.0
    3.0
  500.0
 1000.0

The above solution is perfectly fine, however we can make a slightly smarter choice. If we know the units of the source data, we can attach those to the data like so:

rgrid = [100., 300., 50_000., 100_000.]u"Pa"

4-element Vector{Quantity{Float64, 𝐌 𝐋⁻¹ 𝐓⁻², Unitful.FreeUnits{(Pa,), 𝐌 𝐋⁻¹ 𝐓⁻², nothing}}}:
    100.0 Pa
    300.0 Pa
  50000.0 Pa
 100000.0 Pa

Now rgrid is not just a Float64 vector, but a vector of a different type, as evidenced by the output in the code cell above. As such, we can no longer simply copy the contents of rgrid into the contents of atm.pressure_levels, since Julia would throw an error due to the two types being incompatible. We must first make an appropriate conversion, and then extract the numerical values that we can then copy to atm.pressure_levels.

atm.pressure_levels[:] = ustrip(rgrid .|> atm.pressure_unit);
atm.pressure_levels

4-element Vector{Float64}:
    1.0
    3.0
  500.0
 1000.0

The code above handles performs two operations. First, the vector rgrid is being converted into a new vector with units of atm.pressure_unit (hectopascal in our case). The |> is an infix operator in Julia which applies some function on the right-hand side to some expression on the left-hand side. Since rgrid is a vector rather than a number, we must pre-fix with a . to call the broadcast operation which in turn will apply the infix operation on all elements of rgrid. This first operation leaves us with a new vector in which the numerical contents have been appropriately converted from the original units (Pa) to the new ones (hPa). Feel free to experiment with different, compatible units:

rgrid .|> u"atm"

4-element Vector{Quantity{Float64, 𝐌 𝐋⁻¹ 𝐓⁻², Unitful.FreeUnits{(atm,), 𝐌 𝐋⁻¹ 𝐓⁻², nothing}}}:
 0.0009869232667160128 atm
 0.0029607698001480384 atm
 0.4934616333580064 atm
 0.9869232667160128 atm

Note that after such a conversion has taken place, the resulting vector is still a vector whose elements are of some Unitful-type which are not compatible with the contents of atm.pressure_levels. The last step takes care of this: the ustrip function (provided by Unitful.jl) returns a view of the vector that allows copying over the numerical contents only.

ustrip(rgrid .|> u"atm")

4-element reinterpret(Float64, ::Vector{Quantity{Float64, 𝐌 𝐋⁻¹ 𝐓⁻², Unitful.FreeUnits{(atm,), 𝐌 𝐋⁻¹ 𝐓⁻², nothing}}}):
 0.0009869232667160128
 0.0029607698001480384
 0.4934616333580064
 0.9869232667160128

Note that the ustrip function, when called on an array (or vector), returns a view onto the underlying array data, whereas calling ustrip on a scalar Unitful-type object returns a new value:

ustrip(5.7u"km")

5.7

The advatage of this approach is clear: our retrieval application now automatically handles the unit conversion. Think of other examples. Meteorological data tends to come in netCDF formats (or similar) in which each variable contains appropriate metadata that describes its units. The lack of hard-coding these units makes the retrieval application more resilient against these conversion errors which could arise when, for example, changing data sources.

1.3 Summary and Take-aways!

This section introduces some very important concepts, hence we want to emphasize the key take-aways.

RetrievalToolbox is designed such that certain objects must be created (allocated) once and then fed with appriate numerical values.
- Due to Julia’s lack of manual memory management (we cannot explicitly free memory), creating large arrays over and over again leads to dramatic performance loss when the garbage collector has to be called repeatedly¹
- In general, this requires that retrieval applications built with RetrievalToolbox must do some up-front work to create necessary objects and then change the values inside those objects, as appropriate.
- Some parametric types offered by RetrievalToolbox are mutable, whereas others are not. There is no good rule as to which type is mutable or immutable, but a good rule of thumb is: if you cannot change a value after creating the object, you probably shouldn’t. Mutability of RetrievalToolbox types is subject to change due to updates. For example, the EarthScene type is mutable, since it contains solar anlges, the scene location and time - all quantities that one might want to change after instantiation (when performing retrievals for many scenes).
Many RetrievalToolbox objects have unit fields that describe the unit of their corresponding fields, and they can be leveraged to produce unit-aware algorithms.
- This feature can be used to dynamically incorporate the units of source data (e.g. meteorological model ouptut) and match it with user-preferred units on the algorithm side.

2 Layers and Levels

In the last section, we filled our atmosphere object with values for the retrieval pressure grid: atm.pressure_levels. Inspecting the type documentation [ADD LINK], we can see that there is a field atm.pressure_layers, which still contains only zeros:

atm.pressure_layers

3-element Vector{Float64}:
 0.0
 0.0
 0.0

Again, there is no function that is called automatically when we assign new values to atm.pressure_levels, so we have to manually calculate and ingest the mid-layer pressure values. One rather explicit way of doing it would be the following:

for i in 1:atm.N_layer
    atm.pressure_layers[i] = 0.5 * (atm.pressure_levels[i] + atm.pressure_levels[i+1])
end
atm.pressure_layers * atm.pressure_unit

3-element Vector{Quantity{Float64, 𝐌 𝐋⁻¹ 𝐓⁻², Unitful.FreeUnits{(hPa,), 𝐌 𝐋⁻¹ 𝐓⁻², nothing}}}:
   2.0 hPa
 251.5 hPa
 750.0 hPa

Since this is a common operation, RetrievalToolbox has a convenice function levels_to_layers which calculates the mid-layer values for a given level-based profile.

RE.levels_to_layers(atm.pressure_levels)

3-element Vector{Float64}:
   2.0
 251.5
 750.0

We can plot the location of level and layer values and visually “check” that they are correct:

# Plot the pressure level/layer values over level numbers
scatter(
    collect(1:atm.N_level),
    atm.pressure_levels,
    markershape=:square, label="Levels",
    size=(400, 300)
)
plot!(
    collect(1:atm.N_layer) .+ 0.5, # set x coordinate of layer center to be in the middle
    atm.pressure_layers,
    seriescolor=:black, seriestype=:stepmid,
    markershape=:circle, label="Layers"
)
title!("Retrieval grid")
xlabel!("Level number")
ylabel!("Pressure [$(atm.pressure_unit)]")

At this point, the atmosphere object atm has reasonable values for retrieval pressure grid, but we must also make sure we add data to represent the meteorological grid. Remember, that we initialized the atmosphere object with two more levels than the retrieval grid:

mgrid = [5., 65., 200., 400., 650., 950.]u"hPa"

atm.met_pressure_levels[:] = ustrip(mgrid .|> atm.pressure_unit);
atm.met_pressure_layers[:] = RE.levels_to_layers(atm.met_pressure_levels);

# Plot the pressure level/layer values over level numbers
scatter(
    collect(1:atm.N_met_level),
    atm.met_pressure_levels,
    markershape=:square, label="Levels",
    size=(400, 300)
)
plot!(
    collect(1:atm.N_met_layer) .+ 0.5,
    atm.met_pressure_layers,
    seriescolor=:black, seriestype=:stepmid,
    markershape=:circle, label="Layers"
)
title!("Meteorological grid")
xlabel!("Level number")
ylabel!("Pressure [$(atm.pressure_unit)]")

Moving on, we should also add some reasonable values to represent both the temperature and the specific humidity profiles. Earlier, we chose the units for specific humidity to be u"kg/kg", which is a special type of unit in Unitful, namely a DimensionlessUnits type. It signifies that the resulting unit is effectively “1”, and objects with dimensionless units can always be cast onto “naked” Julia types.

To demonstrate, let us create two vectors that we want to fill in with some values:

demo_a = zeros(3);
demo_b = zeros(3);

And let us assume we have another two vectors that represent some quantities that have units attached:

uv_a = [1., 2., 3.]u"mbar/m"
uv_b = [4., 5., 6.]u"g/kg"

3-element Vector{Quantity{Float64, NoDims, Unitful.FreeUnits{(g, kg⁻¹), NoDims, nothing}}}:
 4.0 g kg⁻¹
 5.0 g kg⁻¹
 6.0 g kg⁻¹

The first quantity has units of pressure over length and cannot be reduced much; the second quantity, however, is effectively just a factor. If we try to cast the values of uv_a into the contents of demo_a, an error is thrown:

demo_a[:] .= uv_a[:]

DimensionError:  and mbar m⁻¹ are not dimensionally compatible.

Stacktrace:
  [1] #s104#131
    @ ~/.julia/packages/Unitful/sxiHA/src/conversion.jl:7 [inlined]
  [2] var"#s104#131"(::Any, s::Any, t::Any)
    @ Unitful ./none:0
  [3] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:707
  [4] convfact(::Type{Float64}, s::Unitful.FreeUnits{(), NoDims, nothing}, t::Unitful.FreeUnits{(mbar, m⁻¹), 𝐌 𝐋⁻² 𝐓⁻², nothing})
    @ Unitful ~/.julia/packages/Unitful/sxiHA/src/conversion.jl:54
  [5] uconvert(a::Unitful.FreeUnits{(), NoDims, nothing}, x::Quantity{Float64, 𝐌 𝐋⁻² 𝐓⁻², Unitful.FreeUnits{(mbar, m⁻¹), 𝐌 𝐋⁻² 𝐓⁻², nothing}})
    @ Unitful ~/.julia/packages/Unitful/sxiHA/src/conversion.jl:102
  [6] convert(::Type{Float64}, y::Quantity{Float64, 𝐌 𝐋⁻² 𝐓⁻², Unitful.FreeUnits{(mbar, m⁻¹), 𝐌 𝐋⁻² 𝐓⁻², nothing}})
    @ Unitful ~/.julia/packages/Unitful/sxiHA/src/conversion.jl:167
  [7] setindex!
    @ ./array.jl:976 [inlined]
  [8] setindex!
    @ ./subarray.jl:384 [inlined]
  [9] copyto_unaliased!
    @ ./abstractarray.jl:1081 [inlined]
 [10] copyto!(dest::SubArray{Float64, 1, Vector{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}}, true}, src::Vector{Quantity{Float64, 𝐌 𝐋⁻² 𝐓⁻², Unitful.FreeUnits{(mbar, m⁻¹), 𝐌 𝐋⁻² 𝐓⁻², nothing}}})
    @ Base ./abstractarray.jl:1061
 [11] copyto!
    @ ./broadcast.jl:961 [inlined]
 [12] copyto!
    @ ./broadcast.jl:920 [inlined]
 [13] materialize!
    @ ./broadcast.jl:878 [inlined]
 [14] materialize!(dest::SubArray{Float64, 1, Vector{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}}, true}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(identity), Tuple{Vector{Quantity{Float64, 𝐌 𝐋⁻² 𝐓⁻², Unitful.FreeUnits{(mbar, m⁻¹), 𝐌 𝐋⁻² 𝐓⁻², nothing}}}}})
    @ Base.Broadcast ./broadcast.jl:875
 [15] top-level scope
    @ In[24]:1

The important portions of the error log are found at the top, but one needs to look closely. Dimenionless units in Unitful are unfortunately not displayed as 1, but as an empty character . The above error log mentions a DimensionError followed by the statement that and the pressure unit mbar/m are not compatible. This is the behavior we truly want! We use Unitful such that unit conversion errors do not happen in the first place!

Performing this operation on the second set of vectors, however, succeeds as we expect. More importantly, Unitful has correctly used the conversion factor 1000 that stems from kg/g before copying over the values.

demo_b[:] .= uv_b[:];
demo_b

3-element Vector{Float64}:
 0.004
 0.005
 0.006

With this knowledge, we can fill in the specific humidity profile with a unit-carrying vector, and we also make sure to not forget to calculate the mid-layer values.

q = [0.0001, 0.0002, 0.00035, 0.00035, 0.00075, 0.0020]u"kg/kg"
atm.specific_humidity_levels[:] .= q
atm.specific_humidity_layers[:] = RE.levels_to_layers(
    atm.specific_humidity_levels
);

Important

But wait, this is wrong! An important lesson follows!

First, we look at vector q again, which we have defined just above to represent a profile in units of kg/kg. Note that the vector does not carry any units, since Unitful.jl implicitly casts the resulting vector as a “normal” vector

6-element Vector{Float64}:
 0.0001
 0.0002
 0.00035
 0.00035
 0.00075
 0.002

If we perform the assigment in the obvious way, such as

atm.specific_humidity_levels[:] .= q

the field specific_humidity_levels of our atmosphere object is filled with values that represent a quantity in unitless dimensions of kg/kg. The atmosphere object, however, was earlier defined in g/kg, as can be inferred:

atm.specific_humidity_unit

g kg⁻¹

This discrepancy between unit and quantity will result in errors in following computations!

The solution to this is to always perform an explicit unit conversion when setting unit-valued object fields. The correct, explicit assignment is

atm.specific_humidity_levels[:] .= q .|> atm.specific_humidity_unit .|> ustrip
atm.specific_humidity_layers[:] = RE.levels_to_layers(atm.specific_humidity_levels)
# This should now be in g/kg!

5-element Vector{Float64}:
 0.15000000000000002
 0.275
 0.35
 0.55
 1.375

The output in the above section confirms that the correction conversion was performed before the values were copied into atm.specific_humidity_levels.

The last quantity to be covered in this section is the temperature profile. We can use the same recipes that we have learned before to add a temperature profile. Temperature units, however, require a litte more care than other units. A well-written article is found within the Unitful.jl documentation: link. In short, there are ambiguities with respect to mathematical operations that involve temperatures with different units, in particular if one mixes absolute (Kelvin, Rankine) and relative (Celcius, Fahrenheit) scales. To avoid downstream issues, the functions and objects that currently accept temperature units will force users to use absolute temperature units, such as K, mK, Ra, etc.

T = [253., 233., 238., 253., 278., 293.]u"K"
atm.temperature_levels[:] = ustrip(T .|> atm.temperature_unit)
atm.temperature_layers[:] = RE.levels_to_layers(
    atm.temperature_levels
);

Tip

It is good practice to keep all temperature quantities in units of K, mostly to minimize potential (unknown) issues.

2.1 Summary and a Useful Convenience Function

The prior section intends to emphasize the role of units and how they are used within RE. To summarize:

Quantities that have units are stored as “normal” number-types (or number-type arrays). A corresponding field states the units of those.

Using number-type arrays, e.g. of Matrix{Float64}, allows certain mathematical operations to be performed much faster, which is why RetrievalToolbox takes the approach of using a seperate variable that stores unit information. In principle, it is possible to make use of arrays with attached units - but early tests have shown that the current approach is more performant.

At all times, users must be aware of units, which quantities have units attached, and how they enter various RetrievalToolbox objects.

As per the design goals of RetrievalToolbox, there is no over-arching mechanism that controls how quantities flow from data source to be part of RetrievalToolbox-provided objects or functions. Users are free to design that route themselves as they see fit for their application. However, users must always make sure they understand which quantities have units attached to them. Lack of awareness can lead to calcluation errors.

Warning

RetrievalToolbox does not evaluate users’ handling of unit-attached quantities. It is the user’s responsibility to ensure that the RetrievalToolbox object units and the corresponding numerical values match!

While we have used explicit formulations in the section above to learn about how to deal with units, there is a helpful convience function provided by RetrievalToolbox that makes the typing a little more compact.

Instead of writing

atm.specific_humidity_levels[:] .= q .|> atm.specific_humidity_unit .|> ustrip

6-element view(::Vector{Float64}, :) with eltype Float64:
 0.1
 0.2
 0.35
 0.35
 0.75
 2.0

we can use the convenience function ingest! to copy over a numerical value (or the contents of some array) into a RetrievalToolbox object. The unit conversion is handled by the function itself.

RE.ingest!(atm, :specific_humidity_levels, q)

Note that there are two specialized implementations of the function, so bother number-type and array-type quantities can be ingested into RetrievalToolbox objects very efficiently this way. Also, the target must be valid in the sense that the RetrievalToolbox object must possess both a field, say obj.field, and the accompanying obj.field_unit.

Tip

It is a Julia convention, that functions which modify one or more of its arguments, have an exclamation mark at the end. Users can spot immediately that ingest! modifies the atm object!

3 Altitude and Gravity

Looking at Equation 1, we can see that we will likely require the local acceleration due to Earth’s gravity at various points in our atmosphere. The EarthAtmosphere object holds the level-resolved value for g in its own field, named .gravity. Users have the option to use their own method to calculate this profile via their own function, e.g.

atm.gravity_levels[:] .= my_own_gravity_function()

Since this is such a common operation, RetrievalToolbox ships with a function to calculate the location-based local gravity profile which takes into account the latitude and altitude of the observation point at which we want to create a model atmosphere. However, that function requires additional inputs (scene latitude and altitude), hence we will skip that particular functionality for now and use dummy values instead:

atm.gravity_levels[:] .= 9.81; # this will be in m/s^2
atm.altitude_levels[:] .= [45.0, 20.0, 10.0, 7.0, 3.5, 0.6]; # this will be in "km"

Finally, to make things a little more quick, we use a convenience function calculate_layers! that acts on an EarthAtmosphere object and calculates all mid-layer quantities. We could have, equivalently, used RE.levels_to_layers, as was introduced earlier and manually calculate and assign the mid-layer quantities for gravity_layers, altitude_layers, etc.

RE.calculate_layers!(atm)

4 Adding Atmosphere Elements

We have added everything needed for a functioning atmosphere: we defined a pressure level grid for the retrieval (pressure_levels), and then another pressure level grid for our meteorological profiles (met_pressure_levels). We then assigned our temperature and specific humidity profiles (temperature_levels, specific_humidity_levels) as well as altitude and gravity (altitude_levels, gravity_levels). Finally, we calculate the mid-layer values using a helpful convenience function (calculate_layers).

This atmosphere object would work just fine within a retrieval set-up, although this model atmosphere would be somewhat empty. When we initially created the EarthAtmosphere object, with the function create_empty_EarthAtmosphere, we specified number of layers and some units, but we did not add “actors” so to speak.

In RetrievalToolbox, Earth atmosphere objects contain a list of “atmosphere elements”. These atmosphere elements can be thought of as items which characterize the atmosphere in the sense that they modify the radiance when calculated through one of the provided radiative transfer schemes. This is the place where we could insert, for example, a few aerosols or clouds, or gases. We can inspect this list by typing:

atm.atm_elements

AbstractAtmosphereElement[]

We notice that this list is empty, and it is of type AbstractAtmosphereElements. We will revisit type hierarchy at a later stage, but for now we will note that AbstractAtmosphereElements encompasses all of the above mentioned concepts (gases, aerosols and more). Being an empty vector of some type also means that we can only add elements to this vector whose type matches!

For the sake of simplicity, we can add atmosphere element to this atmosphere that requires no additional set-up: Rayleigh scattering. We can instantiate new Rayleigh scattering object by typing RE.RayleighScattering(), which produces a mostly empty object:

ray = RE.RayleighScattering()

RayleighScattering

And we can push this into the vector of atmosphere elements in our atmosphere:

push!(atm.atm_elements, ray)

1-element Vector{AbstractAtmosphereElement}:
 RayleighScattering

Tip

push! is a very general Julia function that adds an element (or multiple elements) to some collection. This works for almost all types of vectors or lists. E.g. x = [1,2,3]; push!(x, 4) would add 4 to the vector x.

Inspecting the type definition of RayleighScattering reveals that it is actually fully empty - there is no field inside this particular type, you can try it yourself by typing ?RE.RayleighScattering in a Julia prompt. So how does RetrievalToolbox calculate the appropriate quantities when the time comes, such as the Rayleigh scattering optical depth, or the Rayleigh scattering phase matrix?

In this case, RetrievalToolbox is simply interrogating the list of atmosphere elements. Functions internal to RetrievalToolbox will see if certain objects or objects of certain types are part of atm.atm_elements, and then act accordingly. For example, the code snippet below is taken straight from the routine which calculates all the optical properties for our retrieval. When performing the calculations, RetrievalToolbox will inquire whether any of the types of the objects in atm.atm_elements belong to the AbstractRayleighScattering class. This indicates to the overall program logic that the user wants Rayleigh scattering to be computed, and thus the appropriate function is called.

if findanytype(atm.atm_elements, AbstractRayleighScattering)
    # Check if we have Rayleigh scattering as an atmospheric element..

    # Calculate OD profiles
    calculate_rayleigh_optical_depth_profiles!(opt, scene)
end

findanytype is a function of RetrievalToolbox that allows to quickly iterate over some vector to see if any of the elements belong to some type, so this first code will evaluate to true (the second element is of String type)

RE.findanytype([1, "test", 2.0, 5.25im], String)

true

whereas this next code bit evaluates to false, since the third element is a Float64, rather than a Float32, and thus there is no 64-bit float type in this vector.

RE.findanytype([1, "test", 2.0, 5.25im], Float32)

false

The idea behind this design pattern is the following: atmospheric elements contain all the necessary information about them, which can be very little. In the case of Rayleigh scattering, we have no need for additional information, as RetrievalToolbox (for now) only knows one particular realization of this phenomenon and it does not need any more parameters. It is enough to know that the Earth atmosphere object contains a RayleighScattering instance, which then signals to relevant functions that certain computations must be performed.

We can also add a gas to our atmosphere, using what we have learned in the last tutorial:

# Load the spectroscopy
absco_o2 = RE.load_ABSCO_spectroscopy(
    joinpath("data", "o2_spectroscopy_demo.h5")
)
# Define the gas object
gas_o2 = RE.GasAbsorber(
    "O2",
    absco_o2,
    [0.2095, 0.2095, 0.2095, 0.2095],
    Unitful.NoUnits
)
# push into our list of atmosphere objects
push!(atm.atm_elements, gas_o2)

2-element Vector{AbstractAtmosphereElement}:
 RayleighScattering
 GasAbsorber: O2

Now our atmosphere contains both Rayleigh scattering and molecular oxygen as an absorber. As we have learned in the last tutorial, populating a list such as atm.atm_elements does not automatically create a new copy of some element. In the example above, only one instance of gas_o2 exists. We can check that again by looking at the element in the list and comparing it with gas_o2 using the === operator:

# `end` refers to the last element in some list
atm.atm_elements[end] === gas_o2

true

Since the above line evaluates to true, we can be sure that both symbols point to the same object in memory! This remains an important aspect of working with RetrievalToolbox! The name gas_o2 could easily be clobbered, for example we could create a new gas_o2 object with some slightly different oxygent volume mixing ratio:

gas_o2 = RE.GasAbsorber(
    "O2",
    absco_o2,
    [0.22, 0.22, 0.22, 0.22],
    Unitful.NoUnits
)

Gas absorber: O2
Associated spectroscopy: data/o2_spectroscopy_demo.h5

And we can see that those two symbols now do not point to the same obect in memory:

# `end` refers to the last element in some list
atm.atm_elements[end] === gas_o2

false

Or we can look at the VMR levels:

println(gas_o2.vmr_levels)
println(atm.atm_elements[end].vmr_levels)

[0.22, 0.22, 0.22, 0.22]
[0.2095, 0.2095, 0.2095, 0.2095]

Again, we repeat the same important lesson as in Tutorial 1: be careful when managing these objects of interest! There are not just downsides such as possible confusion of objects - the fact that these “down-the-line”-objects only reference the original objects can be very effectively used!

First, let us make sure we have our correct gas absorber:

gas_o2 = atm.atm_elements[end]

Gas absorber: O2
Associated spectroscopy: data/o2_spectroscopy_demo.h5

Now think of a situation in which we might want to be able to change the contents of gas_o2, but do not want to re-create all of the other objects that initially depended on it. For example, we could want to study the impact of slightly different oxygen columns to a top-of-atmosphere radiance, as calculated by our model. We could simply cast the relevant functions into a loop such as so:

for o2_vmr in [0.2095, 0.2065, 0.2035]
    gas_o2.vmr_levels[:] .= o2_vmr # Set the VMR for the entire column

    run_forward_model(atm) # this is a placeholder dummy function
end

The above code illustrates how we would update the gas object volume mixing ratio, then call some function to do the radiative transfer. Since gas_o2 is being (implicitly) referenced by atm, the atmosphere object, when interrogated, would show varying oxygen columns at each loop iteration. Of course this all depends on run_forward_model actually re-calculating the optical properties, which we assume would happen. This method would obviously not produce the intended result if the symbol gas_o2 had been clobbered or overwritten by some other object in the meantime!

Move on to the next tutorial! Tutorial 3

References

Rodgers, Clive D. 2000. Inverse Methods for Atmospheric Sounding: Theory and Practice. Vol. 2. World scientific.

Footnotes

See e.g. https://docs.julialang.org/en/v1/manual/performance-tips/#Pre-allocating-outputs ↩︎

1 Introduction

1.1 The EarthAtmosphere Object

1.1.1 The Retrieval and Meteorological Grids

1.1.2 Type definition

1.2 Ingesting Values with Units

1.3 Summary and Take-aways!

2 Layers and Levels

2.1 Summary and a Useful Convenience Function

3 Altitude and Gravity

4 Adding Atmosphere Elements

References

Footnotes

1.1 The `EarthAtmosphere` Object