Tutorial 3: Spectral Windows, Scenes and Optical Properties

# Some set-up for this document
using RetrievalToolbox;
const RE = RetrievalToolbox;
using Plots; default(fontfamily="JuliaMono-Medium", titlefontsize=10, labelfontsize=10)
using Unitful;

1 Introduction

In the previous tutorial (Tutorial 2), we have learned about atmospheres and which types of profiles and other objects they contain. Further, we have learned about the importance of dealing with physical units correctly, such that RetrievalToolbox can correctly do the necessary unit conversions for us.

Goal of this tutorial

In this tutorial we will move to creating an actual retrieval scene, and define the extent of our spectral region of interest. This provides us with everything needed to peform some basic calculations that we then use to produce top-of-atmosphere radiances. We will also learn how RetrievalToolbox internally links objects of interest to establish some of the needed relationships between them.

2 Spectral Windows

2.1 Introduction

During the beginning stages of setting up a trace gas retrieval algorithm, we must choose the so-called spectral windows. Possibly known as “fitting windows” or just “windows”, we mean a contiguous section of the spectral dimension, be it wavelength or wavenumber. RetrievalToolbox supports an arbitrary number of spectral windows that can be used for retrievals, and, apart from memory limitations, they can be of any size - so both micro-windows and larger broad-band windows are supported.

Spectral windows are best defined very early in the retrieval application, since many other RetrievalToolbox objects depend on it. The type structure itself is relatively straightforward: we have to give a name, which is purely for labelling purposes, and then provide the lower and upper limits in either wavelength or wavenumber space. The wavelength or wavenumber limits (denoted here as ww_min and ww_max) represent the “what the user wants” spectral boundaries. The third argument, ww_grid is the underlying high-resolution grid. Users can freely choose this grid, it can have regular or irregular intervals, however it must extend beyond the spectral limits as given by ww_min and ww_max. The fourth argument (ww_unit) now defines the units of the spectral grid, and can be any length-type unit (for wavelengths), or any wavenumber unit (inverse length). Finally, the last argument defines a so-called reference point (ww_reference): this is a quantity used for a number of wavelength- (or wavenumber-) dependent objects. For example, if we want to define a surface reflectance that varies spectrally over the range of the spectral window, we likely need a reference point to anchor some polynomial expression.

Below is the list of type fields for the SpectralWindow type. Note that the basic constructor (the function that is called when you type RE.SpectralWindow(...)) does not require the N_hires field to be supplied, since it is automatically calculated from the supplied ww_grid parameter.

SpectralWindow

window_name::String: Label for this spectral window
ww_min::AbstractFloat: This is the ‘what user wants’ lower limit
ww_max::AbstractFloat: This is the ‘what user wants’ upper limit
ww_grid::Vector{T} where T<:AbstractFloat: Wavelength or wavenumber grid (high-resolution) at the instrument
ww_unit::Union{Unitful.Units{U, 𝐋} where U, Unitful.Units{U, 𝐋⁻¹} where U}: Wavelength or wavenumber unit
ww_reference::AbstractFloat: Reference λ or ν from SV elements which have spectral dependence
N_hires::Int64: Number of high-resolution spectral elements

A type to hold a spectral window

It is easier to understand the workings of the SpectralWindow type by example. First, we simply create a new object by calling the type constructor. The example below creates a window that is 1 nm wide with a rather crude spacing of 0.1 nm for demonstration purposes.

wl_unit = u"nm"

wl_min = 759.95
wl_max = 760.25
wl_buffer = 0.03
wl_spacing= 0.01
wl_grid = collect(wl_min-wl_buffer:wl_spacing:wl_max+wl_buffer+wl_spacing)

swin1 = RE.SpectralWindow(
    "O2A", # Label
    wl_min, # Lower wavelength limit
    wl_max, # Upper wavelength limit
    wl_grid, # Hires-grid
    wl_unit, # Wavelength unit
    (wl_min + wl_max) / 2 # Reference at center
)

Spectral window: O2A
Boundaries: 759.95 to 760.25, in nm.
N = 37

We can peek into the swin1 object as usual, for example we can look at the actual underlying wavelength grid.

show(swin1.ww_grid)

[759.9200000000001, 759.9300000000001, 759.94, 759.95, 759.96, 759.97, 759.98, 759.9900000000001, 760.0000000000001, 760.0100000000001, 760.0200000000001, 760.0300000000001, 760.0400000000001, 760.0500000000001, 760.0600000000001, 760.07, 760.08, 760.09, 760.1, 760.1100000000001, 760.1200000000001, 760.1300000000001, 760.1400000000001, 760.1500000000001, 760.1600000000001, 760.1700000000001, 760.1800000000001, 760.19, 760.2, 760.21, 760.22, 760.23, 760.2400000000001, 760.2500000000001, 760.2600000000001, 760.2700000000001, 760.2800000000001]

The role of the wavelength grid is quite important: all optical property and radiative transfer calculations will be performed on that grid. Overall computing performance tends to scale linearly with the number of points in the spectral grid - so doubling the number of points of the spectral window rougly leads to a doubling of the overall forward model execution time.

2.2 Wavelengths and Wavenumbers

RetrievalToolbox supports both wavelength and wavenumbers as choices for the spectral dimension natively. Additional details are also found in the RetrievalToolbox documentation at [ADD LINK].

Users should be able to access the spectral dimension in an explicit way that reflects their initial choice and that helps other users to be constantly aware of whether this particular bit of code was intended for wavelength or wavenumber space. We can access the wavelength-based grid in swin1 as we have done above (swin1.ww_grid), but it is much cleaner to write .wavelength_grid!

show(swin1.wavelength_grid)

[759.9200000000001, 759.9300000000001, 759.94, 759.95, 759.96, 759.97, 759.98, 759.9900000000001, 760.0000000000001, 760.0100000000001, 760.0200000000001, 760.0300000000001, 760.0400000000001, 760.0500000000001, 760.0600000000001, 760.07, 760.08, 760.09, 760.1, 760.1100000000001, 760.1200000000001, 760.1300000000001, 760.1400000000001, 760.1500000000001, 760.1600000000001, 760.1700000000001, 760.1800000000001, 760.19, 760.2, 760.21, 760.22, 760.23, 760.2400000000001, 760.2500000000001, 760.2600000000001, 760.2700000000001, 760.2800000000001]

show(swin1.λ_grid)

[759.9200000000001, 759.9300000000001, 759.94, 759.95, 759.96, 759.97, 759.98, 759.9900000000001, 760.0000000000001, 760.0100000000001, 760.0200000000001, 760.0300000000001, 760.0400000000001, 760.0500000000001, 760.0600000000001, 760.07, 760.08, 760.09, 760.1, 760.1100000000001, 760.1200000000001, 760.1300000000001, 760.1400000000001, 760.1500000000001, 760.1600000000001, 760.1700000000001, 760.1800000000001, 760.19, 760.2, 760.21, 760.22, 760.23, 760.2400000000001, 760.2500000000001, 760.2600000000001, 760.2700000000001, 760.2800000000001]

With both the unicode lowercase “lambda” (λ_grid) and wavelength_grid we can access the underlying wavelength grid. The same holds for wavenumber-based spectral windows. Let us define a new window:

wn_unit = u"cm^-1"
wn_min = 5000.0
wn_max = 5001.0
wn_buffer = 0.1
wn_spacing= 0.01
wn_grid = collect(wn_min-wn_buffer:wn_spacing:wn_max+wn_buffer)

swin2 = RE.SpectralWindow(
    "just_test", # Label
    wn_min, # Lower wavenumber limit
    wn_max, # Upper wavenumber limit
    wn_grid, # Hires-grid
    wn_unit, # Wavenumber unit
    (wn_min + wn_max) / 2 # Reference at center
)

Spectral window: just_test
Boundaries: 5000.0 to 5001.0, in cm⁻¹.
N = 121

and now access the underlying grid using wavenumber_grid or the lower case greek nu ν_grid:

show(swin2.wavenumber_grid)

[4999.9, 4999.91, 4999.92, 4999.93, 4999.94, 4999.95, 4999.96, 4999.97, 4999.98, 4999.99, 5000.0, 5000.01, 5000.02, 5000.03, 5000.04, 5000.05, 5000.06, 5000.07, 5000.08, 5000.09, 5000.1, 5000.11, 5000.12, 5000.13, 5000.14, 5000.15, 5000.16, 5000.17, 5000.18, 5000.19, 5000.2, 5000.21, 5000.22, 5000.23, 5000.24, 5000.25, 5000.26, 5000.27, 5000.28, 5000.29, 5000.3, 5000.31, 5000.32, 5000.33, 5000.34, 5000.35, 5000.36, 5000.37, 5000.38, 5000.39, 5000.4, 5000.41, 5000.42, 5000.43, 5000.44, 5000.45, 5000.46, 5000.47, 5000.48, 5000.49, 5000.5, 5000.51, 5000.52, 5000.53, 5000.54, 5000.55, 5000.56, 5000.57, 5000.58, 5000.59, 5000.6, 5000.61, 5000.62, 5000.63, 5000.64, 5000.65, 5000.66, 5000.67, 5000.68, 5000.69, 5000.7, 5000.71, 5000.72, 5000.73, 5000.74, 5000.75, 5000.76, 5000.77, 5000.78, 5000.79, 5000.8, 5000.81, 5000.82, 5000.83, 5000.84, 5000.85, 5000.86, 5000.87, 5000.88, 5000.89, 5000.9, 5000.91, 5000.92, 5000.93, 5000.94, 5000.95, 5000.96, 5000.97, 5000.98, 5000.99, 5001.0, 5001.01, 5001.02, 5001.03, 5001.04, 5001.05, 5001.06, 5001.07, 5001.08, 5001.09, 5001.1]

show(swin2.ν_grid)

[4999.9, 4999.91, 4999.92, 4999.93, 4999.94, 4999.95, 4999.96, 4999.97, 4999.98, 4999.99, 5000.0, 5000.01, 5000.02, 5000.03, 5000.04, 5000.05, 5000.06, 5000.07, 5000.08, 5000.09, 5000.1, 5000.11, 5000.12, 5000.13, 5000.14, 5000.15, 5000.16, 5000.17, 5000.18, 5000.19, 5000.2, 5000.21, 5000.22, 5000.23, 5000.24, 5000.25, 5000.26, 5000.27, 5000.28, 5000.29, 5000.3, 5000.31, 5000.32, 5000.33, 5000.34, 5000.35, 5000.36, 5000.37, 5000.38, 5000.39, 5000.4, 5000.41, 5000.42, 5000.43, 5000.44, 5000.45, 5000.46, 5000.47, 5000.48, 5000.49, 5000.5, 5000.51, 5000.52, 5000.53, 5000.54, 5000.55, 5000.56, 5000.57, 5000.58, 5000.59, 5000.6, 5000.61, 5000.62, 5000.63, 5000.64, 5000.65, 5000.66, 5000.67, 5000.68, 5000.69, 5000.7, 5000.71, 5000.72, 5000.73, 5000.74, 5000.75, 5000.76, 5000.77, 5000.78, 5000.79, 5000.8, 5000.81, 5000.82, 5000.83, 5000.84, 5000.85, 5000.86, 5000.87, 5000.88, 5000.89, 5000.9, 5000.91, 5000.92, 5000.93, 5000.94, 5000.95, 5000.96, 5000.97, 5000.98, 5000.99, 5001.0, 5001.01, 5001.02, 5001.03, 5001.04, 5001.05, 5001.06, 5001.07, 5001.08, 5001.09, 5001.1]

Now what would happen if we tried to access a wavelength-based grid using ν or wavenumber?

swin1.wavenumber_grid

Wrong unit! Expected wavelength, but got ν.

Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] getproperty(s::SpectralWindow{Float64}, x::Symbol)
   @ RetrievalToolbox ./none:36
 [3] top-level scope
   @ In[93]:1

We receive an error! RetrievalToolbox clearly is aware of the fact that swin1 was created with a wavelength-compatible unit and throws an error if we attempt to access the underlying grid with a symbol that was meant for wavenumber-compatible objects. This is very deliberate design! Users are encouraged to use wavenumber_grid or wavelength_grid in their codes as to make it clear that it was designed to operate in that particular spectral space. In applications where wavelength and wavenumber are expected to switch often, using ww_grid might be appropriate.

Within RetrievalToolbox, the string ww has a special meaning when it comes to type fields. It signals that users can access any quantity that contains ww fully or partially as a type field, using wavelength (or λ) or wavenumber (or ν) by replacing ww with the appropriate symbol. For example, swin1.ww_min, swin1.wavelength_min and swin1.λ_min all access the same object field.

Tip

Every RetrievalToolbox type field that starts with ww is intended to be used in either wavelength or wavenumber space! Users can access those fields by replacing the ww part of the type field symbol with either wavelength or λ for length-compatible spectral units (such as µm), or with either wavenumber or ν for wavenumber-compatible spectral units (such as cm^-1). For example, the type field .ww_unit can be equivalently accessed as .ν_unit if the object was created with a wavenumber-compatible unit! There are no known significant performance penalties to doing so.

2.3 Illustration of a Spectral Window and its Relationship to Spectral Samples

In Figure 1, we can see the components of a spectral window, and in this case we have chosen wavelength space for the spectral dimension.

The blue line with triangle-shaped markers represents a measurement which is discretized in the spectral dimension into spectral samples. Counting up all triangles in Figure 1, there are 30 spectral samples visible in this plot.

When choosing the limits of the spectral window, we usually do that based on the inclusion or exclusion of features in the absorption spectrum. Here, the choice is (without further meaning, just as an example) that we include an absorption feature at ~760.1 nm as well as the surrounding two “humps”. The vertical, gray lines indicate the location of the window limits, as stored in the .wavelength_min and .wavelength_max fields. This window contains 18 spectral samples of the measurement. The red dots that overlap the x-axis represent an illustration of the high-resolution grid that is stored in .wavelength_grid - these are the spectral points at which the model spectrum will be evaluated. Finally, the gray, dashed line at 760.1 nm represents the reference wavelength .wavelength_ref, which we chose to be at the center of the spectral window. Note that the reference wavelength can be any wavelength and it does not need to be inside the spectral window. Usually, the spectral window center or one of the limits is a convenient choice.

Figure 1: Illustrating the main components of a spectral window, and how they relate to a measurement (blue triangles).

Also note that the high-resolution grid does not need to line up at all with the spectral window limits - it just happens to be due to the way we constructed the window. We see the high-resolution grid extend a few intervals beyond the limits - this is required for a successful retrieval! When the instrument spectral response function is applied later on, the high-resolution grid must contain enough data in the spectral dimension such that a forward computation for every spectral sample (blue triangle) can be calculated. Below is an illustration to better visualize this concept (Figure 2).

Figure 2: This figure illustrates the neccessity of the high-resolution grid (red dots) to extend beyond the lower wavelength limit of the window (gray vertical line). An example spectral sample (blue triangle) takes spectral contributions from outside the window limits due to the finite width of the spectral response function (ISRF, black), so we must have model calculations for those out-of-window wavelengths.

3 Scenes

3.1 First Step Towards Scenes

We now ready to move up yet another level in the RetrievalToolbox hiearchy and create a so-called scene. Conceptually, a scene is meant to describe state of the atmosphere and surface(s) along with a number of additional parameters which puts that state into context with the observing instrument (or the measurement).

The most important component of the scene is the atmosphere, which we have discussed in the previous tutorial (Tutorial 2). It contains the set-up for our vertical retrieval grid, meteorological profiles, gases and other constituents.

Along with the atmosphere, we must also define the surface state(s). At the moment, RetrievalToolbox requires a surface object to be attached to every spectral window that will be considered. We can easily instantiate a new Lambertian surface with the following code:

surf = RE.LambertianPolynomialSurface(swin1, [0.25])

LambertianPolynomialSurface{Float64}(SpectralWindow: O2A, 0, [0.25])

We must assign the surface to a spectral window, which is the first argument. The second argument in this function is the list of polynomial coefficients. For the sake of simplicity we create a spectrally flat surface, and thus only need one polynomial coefficient - the zeroth order value. The LambertianPolynomialSurface surface type might at a later stage trigger the calculation of surface reflecticity for all high-res spectral points for the spectral window, which requires the evaluation of the albedo \rho at all spectral points:

\rho(\lambda) = \sum_{i=0}^{N-1} (\lambda - \lambda_\mathrm{ref})^i

The reference wavelength \lambda_\mathrm{ref} will be taken from the swin1.ww_reference field.

At first glance, it seems like a circular dependency, but the scene object we want to create later on requires a so-called dictionary that maps a spectral window onto a surface:

surf_dict = Dict(swin1 => surf)

Dict{SpectralWindow{Float64}, LambertianPolynomialSurface{Float64}} with 1 entry:
  SpectralWindow: O2A => LambertianPolynomialSurface{Float64}(SpectralWindow: O…

The next short section explains how dictionaries in Julia work, and why they are a useful tool for RE. Even experienced Julia users might want to read through this sub-section.

3.2 Intermission: Dictionaries in Julia

Dictionaries in Julia work exactly like those in e.g. Python: they allow one to store the relationship between key and value pairs, mostly for the purpose of accessing some value with a given key. They can be easily created via typing e.g.

d1 = Dict("key1" => 1, "key2" => 3)

Dict{String, Int64} with 2 entries:
  "key2" => 3
  "key1" => 1

In Julia, dictionaries are objects of types, so in the example above we have created a dictionary that maps strings onto integers. This is quite different when compared to Python. If we attempt to store a new key-value pair in this already existing dictionary, we find that it only works if we provide a string-based key and an integer-based value (or one that can be implicity converted to one)!

d1["key3"] = 8 # This works
d1["key4"] = 1.0 # This works too (conversion float -> integer succeeds)

d1[10] = "key5" # This does not (integer -> string fails, string -> float fails)

MethodError: Cannot `convert` an object of type Int64 to an object of type String
The function `convert` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  convert(::Type{String}, ::Base.JuliaSyntax.Kind)
   @ Base /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-5.0/build/default-macmini-x64-5-0/julialang/julia-master/base/JuliaSyntax/src/kinds.jl:975
  convert(::Type{String}, ::String)
   @ Base essentials.jl:461
  convert(::Type{String}, ::FilePathsBase.AbstractPath)
   @ FilePathsBase ~/.julia/packages/FilePathsBase/NV2We/src/path.jl:120
  ...


Stacktrace:
 [1] setindex!(h::Dict{String, Int64}, v0::String, key0::Int64)
   @ Base ./dict.jl:346
 [2] top-level scope
   @ In[97]:5

If we want to create a dictionary that allows us to use any type for either key or value, we must initially create it as such:

d2 = Dict{Any,Any}()
d2["key1"] = 123
d2[3.2] = "test"
d2[(10, 'a')] = sin

d2

Dict{Any, Any} with 3 entries:
  (10, 'a') => sin
  3.2       => "test"
  "key1"    => 123

The dictionary d2 can be populated with almost anything now - the example above shows a string mapped to an integer ("key1" => 1), a 64-bit float mapped to a string (3.2 => "test), and finally a tuple mapped to a function ((10, 'a') => sin)!

As always, when it comes to codes that are required to perform very efficiently, one would be wise to steer clear of any data type with Any, since the Julia compiler cannot dispatch specific functions for good performance. However, for most applications the performance for Any-type dictionaries is usually good enough and can be a very convenient way of organizing relationships.

One can iterate over dictionaries in the following way:

for (k,v) in d1 # The braces around k,v are required!
    println("Key $(k) and Value $(v)")
end

Key key2 and Value 3
Key key3 and Value 8
Key key1 and Value 1
Key key4 and Value 1

We can already notice something interesting! The order in which the key-value pairs are read out is not the same as the order in which they were added to the dictionary! This is a very important detail about dictionaries in Julia: they are, as of writing of this document, not ordered by insertion and there is no guarantee that the order is preserved between executions of the script. If users need that functionality, there is a Julia package OrderedCollections.jl, which implements an order-preserving dictionary type.

Warning

Dictionaries in Python (3.7 and up) preserve the order of insertion! In Julia, this is not the case, and the iteration order is not guaranteed!

In many scripting applications, we would use dictionaries in a manner like above where we link strings or integers to other, generally simple objects. In Julia, many more types can be used as dictionary keys, not just numbers, strings or tuples. We can even use our own types! We will use a quick example by creating two new custom types, and use variables of those types as keys in dictionaries.

struct SS1
    x
end
struct SS2
    y
end
s1 = SS1([1,2,3])
s2 = SS2([4,5,6])

test_dict = Dict(s1 => "I am s1", s2 => "I am s2")

Dict{Any, String} with 2 entries:
  SS1([1, 2, 3]) => "I am s1"
  SS2([4, 5, 6]) => "I am s2"

These are very simple user types that take one argument each. We can access the contents of the dictionary as follows:

test_dict[s1]

"I am s1"

This feels like expected behavior! We have created a dictionary with some keys, one of those keys is indeed our newly created object s1, so we should be able to access the value "I am s1” inside test_dict simply by writing test_dict[s1]. And we see that it works.

Let us try something very similar: we will create a new object, that is seeminly the same object, and try to access the dictonary in exactly the same way. However, we will see that Julia raises an error!

s3 = SS1([1,2,3])
test_dict[s3]

KeyError: key SS1([1, 2, 3]) not found

Stacktrace:
 [1] getindex(h::Dict{Any, String}, key::SS1)
   @ Base ./dict.jl:477
 [2] top-level scope
   @ In[102]:2

The underlying reason as to why the above code raises an error is very important to understand! Julia dictionaries are, just like in Python, hash tables. We can use Julia’s Base.hash function to calculate the hash value of our variables s1 and s3, as well as another object that we just create:

display(Base.hash(s1))
display(Base.hash(s3))
display(Base.hash(SS1([1,2,3])))

0xf926f2f597a20787

0x81456a8acd84fcd4

0xfa6af128b2916ea6

Even though all three objects have the same contents in the sense that the values of the arrays [1,2,3] are the same, they produce different hash values! This is due to the fact that the creation of the array [1,2,3] happens on stack memory. So whenever we type [1,2,3] a new array is created at some different part of the memory that is available to Julia.

If wanted, we can circumvent this by first creating an array, and afterwards create two objects with that array as the argument:

a = [1,2,3]
s4 = SS1(a)
s5 = SS1(a)

display(hash(s4))
display(hash(s5))

0x28b7944f191a2ce5

0x28b7944f191a2ce5

We see that the hash value for both s4 and s5 is the same, and Julia cannot differentiate those two objects at all. Therefore, we could use either to access a dictionary:

test_dict2 = Dict(s4 => "I am s4")
test_dict2[s5]

"I am s4"

3.3 Moving On To Scenes

Returning back to where we left off before the intermission. There, we created a dictionary that maps our swin1 object of type SpectralWindow onto the Lambertian surface that we also created beforehand.

surf_dict = Dict(swin1 => surf)

Dict{SpectralWindow{Float64}, LambertianPolynomialSurface{Float64}} with 1 entry:
  SpectralWindow: O2A => LambertianPolynomialSurface{Float64}(SpectralWindow: O…

We now understand how surf_dict can be used to access the surf object via swin1. Spectral windows in RetrievalToolbox have arrays in them, so even if we have two separate windows that happen to have the same properties, our swin1 object has a unique hash.

We still need a few objects before we can create our scene. Next is the observer. We have to define what type of observation we want to connect to this scene - this is (for now) mostly for the purposes of radiative transfer computations which produce different outputs, depending on the observer type. If we want to use measurements from a space-based instrument, we can use the SatelliteObserver type:

observer = RE.SatelliteObserver(
    1.0, # Viewing zenith
    110.0, # Viewing azimuth
    zeros(3), # Satellite velocity
    zeros(3) # Satellite position
)

SatelliteObserver{Float64, Float64}(1.0, 110.0, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])

We fill these in with some plausible values. Note that the satellite velocity and satellite position fields are currently unused, so we can safely fill them with zeros. Here we also want to point users the RetrievalToolbox documentation about the problematic ambiguities regarding angular units: [link to RetrievalToolbox doc]. All angular units, such as zenith and azimuth angles, should be entered as degrees without any Unitful units attached to them as to avoid any implicit unwanted unit conversions between degrees and radians.

Warning

Be aware of implicit angular unit conversions when using Unitful degree units!

The next required object is a location, and for this exercise we opt for an EarthLocation type, which describes the position on Earth at which the instrument points. Currently, these Earth locations are considered to be point-like, meaning that they do not have any spatial extent. This notion is consistent with the utilized 1-dimensional radiative transfer, so any location represents some idea of a mean atmospheric and surface state of the area covered by the field-of-view of the ground footprint.

location = RE.EarthLocation(
    16.3713,
    48.2081,
    200.0,
    u"m"
)

EarthLocation
Longitude: 16.3713
Latitude:  48.2081
Altitude:  200.0 m

The last separate object needed is a simple DateTime, which is an object type that is part of the Julia standard library to represent a time. It requires loading in a module first, however:

using Dates
time = DateTime("2025-02-19T13:00:00")

2025-02-19T13:00:00

Again, we picked an arbitrary date. So far, the only function which needs this particular data (EarthScene.time) is a function that can calculate the solar Doppler shift between the sun and the Earth location (if explicitly needed).

We now have all the components needed to create a scene object. We re-use the code we wrote in the previous tutorial to produce an atmosphere object (Tutorial 2). Readers can un-fold the code cell below to look at the details.

Code

atm = RE.create_empty_EarthAtmosphere(
    4, # number of levels for retrieval grid
    6, # number of levels for MET grid
    Float64, # data type for all arrays
    pressure_unit=u"hPa", # pressure unit for retrieval grid
    met_pressure_unit=u"Pa", # pressure unit for MET grid
    temperature_unit=u"K", # unit for temperature profile
    specific_humidity_unit=u"g/kg", # unit for specific humidity profile
    altitude_unit=u"km", # unit for altitude profile
    gravity_unit=u"m/s^2", # unit for local gravity profile
);

RE.ingest!(atm, :pressure_levels, [1., 100., 500., 1000.]u"hPa")
RE.ingest!(atm, :met_pressure_levels, [5., 65., 200., 400., 650., 950.]u"hPa")
RE.ingest!(atm, :specific_humidity_levels, [0.0001, 0.0002, 0.00035, 0.00035, 0.00075, 0.0020]u"kg/kg")
RE.ingest!(atm, :temperature_levels, [253., 233., 238., 253., 278., 293.]u"K")

RE.calculate_layers!(atm)

# Load the spectroscopy
absco_o2 = RE.load_ABSCO_spectroscopy(
    joinpath("data", "o2_spectroscopy_demo.h5")
)

# Define the gas object
gas_o2 = RE.GasAbsorber(
    "O2",
    absco_o2,
    [0.2095, 0.2095, 0.2095, 0.2095],
    Unitful.NoUnits
)

# push into our list of atmosphere objects
push!(atm.atm_elements, gas_o2);

scene = RE.EarthScene(
    atm, # Atmosphere
    surf_dict, # Surface(s)
    observer, # Observer
    location, # Scene location
    12.3, # Solar zenith angle
    56.7, # Solar azimuth angle
    time # Time of measurement
)

EarthScene

Before moving on to use the scene object for computations, let us ponder the relationship and dependencies between the various objects we have created in order to build this. Understanding those connections is helpful since those reflect the needed paths of fundamental calculations that we require for most retrieval applications.

---
config:
    class:
        hideEmptyMembersBox: true
---
classDiagram

    class SpectralWindow["SpectralWindow\n(shortened)"]{
        Number λ_min
        Number λ_max
    }

    class surfaces

    class LambertianPolynomialSurface["LambertianPolynomialSurface\n(shortened)"]

    class ABSCOSpectroscopy["ABSCOSpectroscopy\n(shortened)"]{
        String filename
    }

    class GasAbsorber["GasAbsorber\n(shortened)"]{
        ABSCOSpectroscopy spectroscopy
    }

    class EarthAtmosphere["EarthAtmosphere\n(shortened)"]{
        AtmosphereElements

        retrieval_pressure_grid
        met_pressure_grid
        met_profiles
        met_profiles_units

    }

    class EarthScene{
        Dict[SpectralWindow->AbstractSurfaces] surfaces
        EarthAtmosphere atm
        AbstractObserver observer
        AbstractLocation location
        Number solar_zenith
        Number solar_azimuth
        DateTime time
    }

    GasAbsorber <-- ABSCOSpectroscopy
    EarthAtmosphere <-- GasAbsorber
    EarthScene <-- EarthAtmosphere
    EarthScene <-- surfaces

    SpectralWindow --> surfaces
    LambertianPolynomialSurface --> surfaces

The above schematic shows a stripped-down version of the actual objects and their fields, done mostly for clarity. The EarthScene object is situated at the top right of the diagram, and there are two arrows pointing inward to it. One of the paths show the atmosphere, as defined by the EarthAtmosphere object that contains the various retrieval and meteorological grids and profiles (see Tutorial 2). The atmosphere may have some AtmosphereElements, such as a GasAbsorber which itself needs underlying spectrosopic data. The other arrow shows how surfaces are introduced into the scene. We must first create a surface of type AbstractSurface, in this case we use a LambertianPolynomialSurface - one that has some spectral variability through polynomial coefficients. Then we must have a spectral window to which the surface is attached to. The relationship between surface and spectral window is then determined by the dictionary which maps the spectral window to the surface.

Similarly to what we have done before, the convenience of this hierarchy becomes clear when we want to refer to any of the important quantities. Some function that gets passed a EarthScene object scene, can access all these data through the scene itself. For the solar zenith angle, a simple scene.solar_zenith is enough, the retrieval grid, for example, is accessed by scene.atm.pressure_levels. One can keep chaning the dot-operator to get lower-level data. For example, if one needed to write a small function that lists all the spectroscopy file paths for all gases used in this scene, one can do the following:

function list_all_spec_fnames(scene::RE.EarthScene)

    # Loop through all atmosphere elements in the atmosphere
    for ele in scene.atmosphere.atm_elements
        # Check if this atmosphere element is a GasAbsorber type
        if ele isa RE.GasAbsorber
            # Finally, print out the gas name and the spectroscopy file name
            println("Gas $(ele.gas_name) is read from: $(ele.spectroscopy.file_name)")
        end
    end

end

# Note that `scene` was defined above in the collapsed code cell
list_all_spec_fnames(scene)

Gas O2 is read from: data/o2_spectroscopy_demo.h5

Now the above example highlights the way one can walk down the object hiearchy, starting from the uppermost available object (in this case the scene). Observant readers will notice, however, that this particular function does not need anything beyond the list of atmosphere elements, so one could have written the function so that it accepts either the scene.atmosphere or even scene.atmosphere.atm_elements as its argument. In general, most functions inside RetrievalToolbox are set up this way such that functions will operate on the “highest” required object, which in this case would be the atmosphere. Sometimes it is simply more convenient to have a function act on an object much higher in the hierarchy, for example when it reduces typing. To illustrate this on our example here, we define our function again, but let it act on the atmosphere object:

function list_all_spec_fnames(atmosphere::RE.EarthAtmosphere)
    # Loop through all atmosphere elements in the atmosphere
    for ele in atmosphere.atm_elements
        # Check if this atmosphere element is a GasAbsorber type
        if ele isa RE.GasAbsorber
            # Finally, print out the gas name and the spectroscopy file name
            println("Gas $(ele.gas_name) is read from: $(ele.spectroscopy.file_name)")
        end
    end
end

list_all_spec_fnames (generic function with 2 methods)

Now, we can create a very simply wrapper function that calls this for a scene object:

# This now overwrites our very fist function
list_all_spec_fnames(scene::EarthScene) = list_all_spec_fnames(scene.atmosphere)

list_all_spec_fnames (generic function with 2 methods)

This way, we now have two functions at our disposal. We can use this for an atmosphere object (when no scene has yet been created), and we can also use this for the scene itself to which the atmosphere is attached to. Further, the actual code that does the work is only implemented once, so we will only have to edit the code in one place to extend or improve the function, and the changes will occur for any other wrapper function!

list_all_spec_fnames(scene.atmosphere)
list_all_spec_fnames(scene)

Gas O2 is read from: data/o2_spectroscopy_demo.h5
Gas O2 is read from: data/o2_spectroscopy_demo.h5

To finish up, we have one last step to do before we calculate optical properties. Optical depths calculations require knowledge of at least the local gravity at the various pressure levels, so we must perform this calculation before doing any further steps:

RE.calculate_altitude_and_gravity!(scene)

The above function performs the calculation, stores the results in-place with the correct units, and then also provides those values at the mid-layer pressures. We can have a quick look at the profiles:

p1 = plot(atm.altitude_levels * atm.altitude_unit, collect(1:atm.N_met_level),
    lw=2.0, marker=:circle, label="Altitude profile", legend=:bottomright)
ylabel!("Met profile level #");
yflip!()
p2 = plot(atm.gravity_levels * atm.gravity_unit, collect(1:atm.N_met_level),
    lw=2.0, marker=:circle, label="Gravity profile", legend=:bottomleft)
yflip!()
plot(p1, p2, layout=(1, 2))

4 Optical Properties

With the EarthScene object in place (including all objects within), we likely have everything needed that define an atmospheric and a surface state, from the point of view of an atmospheric retrieval. Between those states and an observable measurement lies the most major component of a retrieval algorithm: the radiative transfer - the mathematical description that provides a way of calculating how light interacts with an absorbing and potentially scattering atmosphere. Between a scene and the output of a radiative transfer calculation usually lies a more fundamental set of quantities, which we shall call optical properties. These represent mostly the raw inputs to the radiative transfer solvers. For now, RetrievalToolbox can utilize the XRTM (atmopsheric) radiative transfer model, which takes as inputs total optical properties, meaning that the RT code does not care about what molecule causes absorption, or what type of scatterer causes scattering. We have to break down our atmospheric state and calculate the resulting total optical properties to then feed into the RT model.

Since it is also very useful to be able to inspect these optical properties, RetrievalToolbox has an explicit type that holds all of this information: EarthAtmosphereOpticalProperties. Right below this paragraph is the documentation of the type including its fields.

EarthAtmosphereOpticalProperties

struct EarthAtmosphereOpticalProperties{T} <: AbstractOpticalProperties

Type to hold optical property data that contain all necessary information to produce the RT inputs for a spectral window spectral_window.

spectral_window::AbstractSpectralWindow: Spectral window for which to calculate optical properties
gas_tau::Dict{GasAbsorber{T}, Matrix{T}} where T: Dictionary to map GasAbsorber -> (wavelength, layer) array
gas_derivatives::Dict{GasAbsorber{T}, Dict{String, AbstractArray}} where T: Dictionary to map GasAbsorber -> Dict of string -> (wavelength, layer) array
aerosol_tau::Dict{<:AbstractAerosolType, Matrix{T}} where T: Dictionary to map AbstractAerosol -> (wavelength, layer) array for optical depth
aerosol_omega::Dict{<:AbstractAerosolType, Matrix{T}} where T: Dictionary to map AbstractAerosol -> (wavelength, layer) array for single scatter albedo
rayleigh_tau::Matrix: Rayleigh optical extinction array (wavelength, layer)
rayleigh_derivatives::Matrix: Rayleigh optical extinction derivatives ∂τray/∂psurf (wavelength, layer)
total_tau::Matrix: Total extinction array (wavelength, layer)
total_omega::Matrix: Total single scatter albedo (wavelength, layer)
total_coef::Union{Nothing, Array{T, 3}} where T: Total phase function (matrix) expansion coefficient array (moment, element(s) - 1 for scalar, 6 for polarized, layer)
nair_dry::Vector: Number of dry air molecules
nair_wet::Vector: Number of wet air molecules
tmp_Nhi1::Vector
tmp_Nhi2::Vector
tmp_Nlay1::Vector
tmp_Nlay2::Vector
tmp_coef::Union{Nothing, Array{T, 3}} where T
tmp_coef_scalar::Union{Nothing, Array{T, 3}} where T

In the above documentation, we see that there are in fact quite a few fields inside an EarthAtmosphereOpticalProperties object, and we have not yet discussed the meaning of most. For now, we just note some important aspects:

Currently, an EarthAtmosphereOpticalProperties object contains explicit fields for following atmospheric constituents or properties: gases, aerosols and Rayleigh scattering
- Should any new constituent or property be included in the future, it will likely result in a change of the type definition and have an impact on applications
Some of fields are dictionary types that use objects as keys to link the corresponding optical properties to them. For example, if we want to access the layer-resolved optical depth for a specific gas object, we would use the gas_tau dictionary like so: gas_tau[gas].
The EarthAtmosphereOpticalProperties type also defines a few “temporary” vectors and arrays. These are used for computations that require vectors or arrays of a certain shape, and are very convenient since we do not have to allocate a new vector or array to act as storage. Note that by design, the contents of these vectors and arrays have no meaning outside any function.

We could instantiate a new EarthAtmosphereOpticalProperties object in the usual way by creating all required quantities and then call EarthAtmosphereOpticalProperties(...) with a lengthy list of arguments. However, a slightly more elegant way exists via a convenience function that creates the object with the correct shapes and dictonary keys:

# First we have to create a spectral window that fits into the range of
# the demo spectroscopy data we have. The example we made before, unfortuantely
# does not work here.

swin = RE.spectralwindow_from_ABSCO(
    "my_window", # name
    0.76455, # wavelength range start
    0.76485, # wavelength range end
    0.7645, # reference wavelength (can be anything really)
    0.00001, # buffer (we can make this small here, true applications might need a big buffer..)
    absco_o2, # The ABSCO object
    u"µm" # Using micro meter units
);

# Then, we have to make sure to create surface and the surface dictionary:
new_surface = RE.LambertianPolynomialSurface(swin, [0.15])
new_surf_dict = Dict(swin => new_surface)

# And swap out the one used in the scene:
scene.surfaces = new_surf_dict

opt = RE.create_empty_EarthAtmosphereOpticalProperties(
    swin, # The spectral window that this optical property object is assigned to
    scene, # The scene that contains the needed atmosphere elements
    RE.RetrievalStateVector([]), # We can leave this empty for now
    RE.ScalarRadiance, # The type of radiance used - either a RE.ScalarRadiance or RE.VectorRadiance
    Float64 # The number type for the arrays
);

The opt object at this point is iniitialized with mostly zeros, so there are no meaningful values present in the various containers. Calculations are performed in-place by calling the calculate_earth_optical_properties! function with opt as the leading argument, the scene we want to calculate the optical properties for - and it also needs a state vector, which for now we keep empty (the state vector is needed to make the function aware of which derivatives should be computed, if any).

RE.calculate_earth_optical_properties!(opt, scene, RE.RetrievalStateVector([]))

Tip

Note that there is no history attached to objects. There is (unfortunately) no way to track whether the optical properties have been already computed or not!

A quick reminder as to what we have just done, and how it fits into the whole work flow:

We created an empty atmosphere object with create_empty_EarthAtmosphere, defining the number of levels for the retrieval grid and the meteorological grid, as well as the physical units for each of the profiles. This creates the EarthAtmosphere object with correctly-sized vectors, that are all zeros.
With the ingest! function, we filled the profiles with sensible values, taking into account units.
We added a GasAbsorber object for molecular oxygen by adding gas_o2 into atmosphere’s atm_elements list.
We created an EarthScene object, which contains the atmosphere we created before (including the gas_o2 constituent), as well as a SatelliteObserver and an EarthLocation. Further, the scene contains the solar angles and the time which this scene represents.
Important! We have to calculate (or supply) the altitude and gravity profiles. The calculate_altitude_and_gravity! function allows to conveniently perform this calculation consistently with the pressure and humidity profiles.
For optical properties, we must have a SpectralWindow - since all the optical property quantities will share its spectral grid.
We then instantiated the EarthAtmosphereOpticalProperties object, that brings together the spectral window and the scene.
Since we have already done all the work to fill the various objects with meaningful data (met profiles etc.), we can call calculate_earth_optical_properties! and let the function compute all of the quantities needed later on for radiative transfer calculations. In our example, this is only the optical depth profile due to gas_o2.

We can visualize the layer optical depths as a function of wavelength in the figure below.

p = plot()
for l in 1:atm.N_layer
    plot!(p, swin.wavelength_grid, opt.gas_tau[gas_o2][:,l], label="Layer $(l)")
end
ylabel!("$(gas_o2.gas_name) layer optical depth")
xlabel!("Wavelength [$(swin.wavelength_unit)]")
display(p)

What is the usefulness of this object type? Other retrieval frameworks might not store optical properties in an explicit container that can be inspected at any given time. The general design is that new objects are used for some explicit points, at which intermediate results of significance are stored. Optical properties are the fundamental inputs to the type of radiative transfer solvers that we use with RetrievalToolbox, and we want to give users full control over the calculation. Users might want to perform additional calculations on top of what calculate_earth_optical_properties! provides - and having the results stored this way allows to easily introduce a new function into users’ forward models that performs whatever computation they want.

We will round up this tutorial with an example of how the optical property object can be used to perform some simple radiative transfer. Suppose we want to obtain the sun-normalized top-of-atmosphere radiance at nadir given the parameters of our scene, only considering absorption. The equation to do that is the following

I^\mathrm{TOA}_\lambda = \exp(\sum_{l=1}^{N_\mathrm{lay}} -\frac{\tau_{\lambda,l}}{\cos\theta_0}) \cdot \frac{\rho\cdot\cos\theta_0}{\pi} \cdot \exp(\sum_{l=1}^{N_\mathrm{lay}} -\frac{\tau_l}{\cos 0})

with \tau_l being the optical depth for layer l, \theta_0 being the solar zenith angle (we assume 0 for the viewing zenith) and \rho being the Lambertian surface albedo. The wavelength index \lambda signifies a spectral dependence.

Inside the EarthAtmosphereOpticalProperties object, the per-gas layer-resolved optical depths are found in the dictionary .gas_tau[GAS_OBJECT]. however, it is more convenient to get the total layer-resolved optical depths, which can be accessed by .total_tau, which containts the extinction profiles of all constituents in the model atmosphere. The code snipped below shows how the Beer-Lambert absorption-only model would be implemented, and is not very different from what the interal radiative solver does.

# Placeholder for the result
I_toa = zeros(swin.N_hires)

# Spectral loop
for λ in 1:swin.N_hires
    ρ = evaluate_surface_at_idx(new_surface, λ)
    I_toa[λ] = exp(-sum(opt.total_tau[λ,:]) / cosd(scene.solar_zenith)) *
        ρ * cosd(scene.solar_zenith) / pi * exp(-sum(opt.total_tau[λ,:]) / cosd(0.0))
end

plot(
    swin.wavelength_grid,
    I_toa,
    label=nothing,
)
ylabel!("Sun-normalized TOA")
xlabel!("Wavelength [$(swin.wavelength_unit)]")

The figure above shows the high-resolution TOA radiance as produced by the native resolution of the underlying spectroscopy, which our SpectralWindow derived from.