Example: Lysimeter Readings
In this tutorial, we will be composing a four-panel plot for multiscale visualization of rainfall time series data in Texas made available by Evett et al. via the USDA. Our data comprises recordings from a pair of rain gauges positioned in opposite corners of the study area.
We’ll need to tackle two key challenges in visualizing this rainfall time series: 1) co-visualizing dueling readings from our twin gauges and 2) dealing with scrunched time/rainfall scales.
Challenge 1
To address the first challenge, we will use matplotlib’s stackplot to create area plots with a transparency (“alpha”) effect. (For those unfamiliar, area plots are line plots with the area between the line and the x-axis filled in.) Because the gauges mostly agree, the majority of plotted area will be overlap from both gauges. However, where they differ one area plot will show through.
Challenge 2
The second challenge in visualizing this data arises because, in the particular study area, large amounts of rain fall in short spurts. So, when we zoom out to see the whole month and the maximum rainfall rate, large spikes in the data cause the rest of the data to be scrunched into obscurity.
To plot our data without losing information about low-magnitude rainfall and the short-time events, we will use the outset package to draw supplementary views of the data at alternate magnifications. These auxiliary plots will be combined with the main plot (overall view) as an axes grid.
Setup
Begin by importing necessary packages.
Notably:
- datetime for converting day-of-year values to month and day
- pandas for data management
- matplotlib for plotting area charts using stackplot
- outset for managing the multi-zoom grid and zoom indicators
To install dependencies for this exercise,
python3 -m pip install \
matplotlib `# ==3.8.2`\
numpy `# ==1.26.2` \
outset `# ==0.1.4` \
opytional `# ==0.1.0` \
pandas `# ==2.1.3` \
seaborn `# ==0.13.0`
Data Preparation
Next, fetch our data and do a little work on it: rename columns and subset data down to just the month of March.
| | Year | Decimal DOY | NW dew/frost in mm | SW dew/frost in mm | NW precip in mm | SW precip in mm | NW irrig. in mm | SW irrig. in mm | NW ET in mm | SW ET in mm | NW Lysimeter\n(35.18817624°N, -102.09791°W) | SW Lysimeter\n(35.18613985°N, -102.0979187°W) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
5568 | 2019 | 59.010417 | 0.005681 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | -0.001523 | -0.005245 | 0.0 | 0.0 |
5569 | 2019 | 59.020833 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.018815 | -0.005285 | 0.0 | 0.0 |
5570 | 2019 | 59.031250 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | -0.009881 | 0.002503 | 0.0 | 0.0 |
5571 | 2019 | 59.041667 | 0.000000 | 0.002464 | 0.0 | 0.0 | 0.0 | 0.0 | -0.006134 | 0.002146 | 0.0 | 0.0 |
5572 | 2019 | 59.052083 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.005352 | 0.000119 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
8538 | 2019 | 89.947917 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | -0.013010 | -0.005682 | 0.0 | 0.0 |
8539 | 2019 | 89.958333 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.001976 | -0.003616 | 0.0 | 0.0 |
8540 | 2019 | 89.968750 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | -0.001029 | 0.002821 | 0.0 | 0.0 |
8541 | 2019 | 89.979167 | 0.000000 | 0.003894 | 0.0 | 0.0 | 0.0 | 0.0 | 0.001976 | 0.002940 | 0.0 | 0.0 |
8542 | 2019 | 89.989583 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | -0.004899 | -0.001907 | 0.0 | 0.0 |
2975 rows × 12 columns
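As a rough sketch of the preparation step described above, here is one way it might look in pandas. The data frame below is synthetic (it is not the real lysimeter data), and the variable names nwls, swls, and march_df are our own picks for illustration. Since the table keeps the original precipitation columns alongside the relabeled ones, the sketch copies columns rather than renaming them in place.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the lysimeter table (NOT the real data):
# one reading every 15 minutes, i.e. 96 per day, over days 50 to 100.
steps = np.arange(50 * 96, 100 * 96)
rng = np.random.default_rng(seed=1)
df = pd.DataFrame(
    {
        "Year": 2019,
        "Decimal DOY": steps / 96,
        "NW precip in mm": rng.random(steps.size) * 0.1,
        "SW precip in mm": rng.random(steps.size) * 0.1,
    }
)

# Descriptive per-gauge labels (shortened here; hypothetical names).
nwls, swls = "NW Lysimeter", "SW Lysimeter"
df[nwls] = df["NW precip in mm"]
df[swls] = df["SW precip in mm"]

# Keep only readings whose decimal day-of-year falls within [59, 90).
in_march = (df["Decimal DOY"] >= 59) & (df["Decimal DOY"] < 90)
march_df = df[in_march]

print(march_df.shape)
```

The 59 and 90 cutoffs mirror the range of Decimal DOY values visible in the table above.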
Here’s a preliminary look at the time series.
[Figure: _images/example-rain_10_0.png]
We’ve certainly got some work to do to nice this up!
Our visualization will focus on showing three details that are difficult to make out in a naive visualization: 1) a little shower around day 72, 2) the big rainstorm around day 82, and 3) light precipitation events over the course of the entire month. We’ll create a zoom panel to show each of these components.
Set Up Axes Grid
Our first plotting step is to initialize an outset.OutsetGrid to manage content and layout of our planned axes grid. This class operates analogously to seaborn’s FacetGrid (https://seaborn.pydata.org/generated/seaborn.FacetGrid.html), if you’re familiar with that.
We’ll provide a list of the main plot regions we want to magnify through the data kwarg. Other kwargs provide styling and layout information, including how we want plots to be shaped and how many columns we want to have.
[Figure: _images/example-rain_14_0.png]
Set Up Plot Content
Next, we’ll set up the content of our plots — overlapped area plots showing the two rain gauges’ readings.
Matplotlib’s stackplot (https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.stackplot.html) is designed to create area plots with areas “stacked” on top of each other instead of overlapping. To get an overlap, we’ll call stackplot twice so that each “stack” contains only one of our variables.
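To make the two-call trick concrete, here is a minimal sketch on a single plain matplotlib axes. The two series are synthetic stand-ins for the gauges, and the colors and alpha value are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the two gauges' readings (not the real data).
t = np.linspace(59, 90, 200)
gauge_a = np.exp(-((t - 72) ** 2))          # bump near day 72
gauge_b = 0.8 * np.exp(-((t - 72.3) ** 2))  # slightly offset bump

fig, ax = plt.subplots()
for y, color in [(gauge_a, "fuchsia"), (gauge_b, "aquamarine")]:
    # One stackplot call per series: each "stack" holds a single layer,
    # so the filled areas overlap instead of piling on top of each other.
    ax.stackplot(t, y, colors=[color], alpha=0.4)
```

Passing both series to a single stackplot call would instead draw the second area on top of the first, which is exactly what we want to avoid here.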
We will use OutsetGrid.broadcast to draw the same content across all four axes in our grid. This method takes a plotter function as its first argument, then calls it on each axis with the subsequent arguments forwarded to it.
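If the broadcast pattern is unfamiliar, the idea can be sketched in plain matplotlib. The broadcast_sketch function below is a hypothetical stand-in written for this illustration; it is not outset’s implementation and makes no claim about how OutsetGrid.broadcast works internally.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def broadcast_sketch(axes, plotter, *args, **kwargs):
    """Call plotter(*args, **kwargs) once with each axes made current.

    Hypothetical stand-in illustrating the pattern; not outset's code.
    """
    results = []
    for ax in axes:
        plt.sca(ax)  # pyplot-level plotters draw on the current axes
        results.append(plotter(*args, **kwargs))
    return results


t = np.linspace(59, 90, 200)
y = np.abs(np.sin(t))  # synthetic readings, not the real data

fig, axs = plt.subplots(2, 2)
broadcast_sketch(axs.flat, plt.stackplot, t, y, alpha=0.4)
```

Every panel receives identical content; what differs between panels is only the axis limits each one is later given.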
[Figure: _images/example-rain_17_0.png]
To pretty things up, we’ll lay down a white underlay around the stackplots for better contrast against background fills.
We can do this by drawing another stackplot that tracks the maximum reading between our rain gauges at each point in time. Specifying a lower zorder for this plot causes it to be drawn below the other stackplots.
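Here is a small sketch of the underlay idea on a single axes, again with synthetic series. The gray background and the particular zorder values are assumptions chosen to make the effect visible; only their relative order matters.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the two gauges' readings (not the real data).
t = np.linspace(59, 90, 200)
gauge_a = np.exp(-((t - 72) ** 2))
gauge_b = 0.8 * np.exp(-((t - 72.3) ** 2))

fig, ax = plt.subplots()
ax.set_facecolor("lightgray")  # stand-in for a background fill

# Opaque white area tracking the pointwise maximum of both series,
# drawn with a lower zorder so it sits beneath the colored areas.
underlay = ax.stackplot(
    t, np.maximum(gauge_a, gauge_b), colors=["white"], zorder=1
)

overlays = []
for y, color in [(gauge_a, "fuchsia"), (gauge_b, "aquamarine")]:
    overlays += ax.stackplot(t, y, colors=[color], alpha=0.4, zorder=2)
```

Because the translucent areas now sit on white rather than on the background color, their hues stay consistent wherever they are drawn.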
[Figure: _images/example-rain_19_0.png]
Add Zoom Indicators
Now it’s time to add zoom indicator boxes, a.k.a. outset “marquees,” to show how the scales of our auxiliary plots relate to the scale of the main plot. Note that we pass a kwarg to allow aspect ratios to vary between the main plot and outset plots. That way, zoom areas can be expanded along their smaller dimension to take full advantage of available space.
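To see what a zoom indicator conveys, here is a bare-bones version built by hand in plain matplotlib. This is only an illustration of the concept, not outset’s marquee drawing; the region bounds and the synthetic series are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Rectangle

# Synthetic rainfall-like series (not the real data): one small and one
# large spike on an otherwise flat baseline.
t = np.linspace(59, 90, 2000)
y = 1.5 * np.exp(-(((t - 72) / 0.1) ** 2)) + 15 * np.exp(-(((t - 82) / 0.2) ** 2))

# Hypothetical zoom regions as (x0, y0, x1, y1) in data coordinates.
regions = [(71.5, 0.0, 72.5, 2.0), (81.0, 0.0, 83.0, 16.0)]

fig, axs = plt.subplots(1, 3, figsize=(9, 3))
main_ax, zoom_axes = axs[0], axs[1:]
for ax in axs:
    ax.stackplot(t, y, alpha=0.4)

for zoom_ax, (x0, y0, x1, y1) in zip(zoom_axes, regions):
    # Outline the magnified region on the main plot ...
    main_ax.add_patch(
        Rectangle((x0, y0), x1 - x0, y1 - y0, fill=False, edgecolor="teal")
    )
    # ... and give the zoom panel exactly that region's limits, so its
    # aspect ratio is free to differ from the main plot's.
    zoom_ax.set_xlim(x0, x1)
    zoom_ax.set_ylim(y0, y1)
```

Each rectangle on the main plot marks the data-coordinate box that one zoom panel fills, which is the relationship the marquees communicate.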
[Figure: _images/example-rain_22_0.png]
Replace Numeric Tick Labels with Human-readable Timestamps
We’re almost there! But the x-axis tick labels are still numeric “day of year” values, which is not very intuitive. I don’t know off the top of my head what day 42 of the year corresponds to, do you?
Let’s fix that. To replace the existing tick labels with timestamps, we’ll define a function that takes a numeric day of the year and returns a human-readable timestamp. We’ll always include the time of day, but we’ll only include the date on between-day transitions. We’ll also need a helper function to convert a numeric day of the year to a Python datetime object.
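A minimal sketch of these two helpers follows, using only the standard library. The function names and the label format are our own choices for illustration. The conversion assumes that decimal day 0.0 is midnight on 1 January 2019, which is the convention under which days 59 through 90 line up with March; if the dataset counts 1 January as day 1, the offset would need adjusting.

```python
from datetime import datetime, timedelta


def to_datetime(day_of_year: float, year: int = 2019) -> datetime:
    """Convert a decimal day of year to a datetime.

    ASSUMPTION: day 0.0 is midnight on 1 January of `year`.
    """
    return datetime(year, 1, 1) + timedelta(days=day_of_year)


def format_tick(day_of_year: float, prev_day_of_year=None) -> str:
    """Label a tick with its time of day, prefixed by the date only when
    the calendar day differs from the previous tick's (or there is none)."""
    current = to_datetime(day_of_year)
    label = current.strftime("%H:%M")
    if (
        prev_day_of_year is None
        or to_datetime(prev_day_of_year).date() != current.date()
    ):
        label = current.strftime("%b %d") + " " + label
    return label


ticks = [71.5, 71.75, 72.0, 72.25]
labels = [format_tick(t, p) for t, p in zip(ticks, [None] + ticks[:-1])]
print(labels)  # ['Mar 13 12:00', '18:00', 'Mar 14 00:00', '06:00']
```

Only the first tick and the tick that crosses midnight carry a date, which keeps labels short in the tightly zoomed panels.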
With this out of the way, we can loop over the axes in our grid and perform the label replacement.
[Figure: _images/example-rain_27_0.png]
Final Details
The last order of business is to add a legend to the upper left corner of the main plot.
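For reference, placing a legend in that corner takes one call in plain matplotlib. The sketch below uses synthetic series and made-up label strings on a standalone axes; in the tutorial’s figure the same call would target the main plot’s axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the two gauges' readings (not the real data).
t = np.linspace(59, 90, 200)
gauge_a = np.exp(-((t - 72) ** 2))
gauge_b = 0.8 * np.exp(-((t - 72.3) ** 2))

fig, ax = plt.subplots()
# stackplot takes one label per layer, so each call gets a one-item list.
ax.stackplot(t, gauge_a, labels=["NW gauge"], alpha=0.4)
ax.stackplot(t, gauge_b, labels=["SW gauge"], alpha=0.4)

legend = ax.legend(loc="upper left")
```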
Et Voilà!
[Figure: _images/example-rain_32_0.png]
Want Insets Instead?
Just call outset.inset_outsets! In this case, we’ll also use outset.util.layout_inset_axes to tweak inset sizing and positioning.
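For readers who haven’t met insets before, here is what one looks like when built by hand with matplotlib’s own Axes.inset_axes and Axes.indicate_inset_zoom. This is a plain matplotlib illustration of the concept rather than the outset calls named above, and the inset position and zoom bounds are arbitrary values picked for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic rainfall-like series (not the real data).
t = np.linspace(59, 90, 2000)
y = 1.5 * np.exp(-(((t - 72) / 0.1) ** 2)) + 15 * np.exp(-(((t - 82) / 0.2) ** 2))

fig, ax = plt.subplots()
ax.stackplot(t, y, alpha=0.4)

# Inset position as [x, y, width, height] in axes-fraction units.
inset = ax.inset_axes([0.05, 0.5, 0.4, 0.45])
inset.stackplot(t, y, alpha=0.4)
inset.set_xlim(71.5, 72.5)  # zoom in on the small spike
inset.set_ylim(0, 2)

# Draw a box on the parent axes around the region the inset shows.
ax.indicate_inset_zoom(inset, edgecolor="teal")
```

Unlike the grid layout, the zoomed view here sits inside the main plot’s frame, trading some main-plot real estate for a more compact figure.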
[Figure: _images/example-rain_35_0.png]
Citations
Evett, Steven R.; Marek, Gary W.; Copeland, Karen S.; Howell, Terry A. Sr.; Colaizzi, Paul D.; Brauer, David K.; Ruthardt, Brice B. (2023). Evapotranspiration, Irrigation, Dew/frost - Water Balance Data for The Bushland, Texas Soybean Datasets. Ag Data Commons. https://doi.org/10.15482/USDA.ADC/1528713. Accessed 2023-12-26.
Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, 9(3), 90-95. https://doi.org/10.1109/MCSE.2007.55
Marek, G. W., Evett, S. R., Colaizzi, P. D., & Brauer, D. K. (2021). Preliminary crop coefficients for late planted short-season soybean: Texas High Plains. Agrosystems, Geosciences & Environment, 4(2). https://doi.org/10.1002/agg2.20177
Matthew Andres Moreno. (2023). mmore500/outset. Zenodo. https://doi.org/10.5281/zenodo.10426106
McKinney, W. (2010). Data structures for statistical computing in Python. Proceedings of the 9th Python in Science Conference, Volume 445. https://doi.org/10.25080/Majora-92bf1922-00a
Waskom, M. L., (2021). seaborn: statistical data visualization. Journal of Open Source Software, 6(60), 3021, https://doi.org/10.21105/joss.03021.