We've been in the depths of a desperate drought.
Last year's monsoon season never happened, and then the winter snow season
didn't happen either.
Dave and I aren't believers in tropical garden foliage that requires
a lot of water; but even piñons and junipers and other native
plants need some water.
You know it's bad when you find yourself carrying a watering can down
to the cholla and prickly pear to keep them alive.
This year, the Forest Service closed all the trails for about a month
-- too much risk of one careless cigarette-smoking hiker, or at least I think
that was the reason (they never really explained it) -- and most of the other
trail agencies followed suit. But then in early July,
the forecasts started predicting the monsoon at last.
We got some cloudy afternoons, some humid days (what qualifies
as humid in New Mexico, anyway -- sometimes all the way up to 40%),
and the various agencies opened their trails again. Which came as
a surprise, because those clouds and muggy days didn't actually include
any significant rain. Apparently mere air humidity is enough to
mitigate a lot of the fire risk?
Tonight the skies finally let loose. When the thunder and lightning
started in earnest, a little after dinner, Dave and I went out to the
patio to soak in the suddenly cool and electric air and some spectacular
lightning bolts while watching the hummingbirds squabble over territory.
We could see rain to the southwest, toward Albuquerque, and more rain
to the east, toward the Sangres, but nothing where we were.
Then a sound began -- a distant humming/roaring, like the tires of a
big truck passing on the road. "Are we hearing rain approaching?" we both
asked at the same time. Since moving to New Mexico we've become familiar with
being able to see rain a long way away; and of course everyone has
heard rain as it falls around them, either as a light pitter-patter
or the louder sound from a real storm; but we'd never been able to
hear the movement of a rainstorm as it gradually moved toward us.
Sure enough, the sound got louder and louder, almost unbearably loud
-- and then suddenly we were inundated with giant-sized drops, blowing
way in past the patio roof to where we were sitting.
I've heard of rain dances, and songs sung to bring the rain,
but I didn't know it could sing back.
We ran for the door, but not soon enough.
But that was okay; we didn't mind getting drenched.
After a drought this long, water from the sky is cause only for celebration.
The squall dumped over a third of an inch in only a few minutes.
(This according to our shiny new weather station with a sensitive
tipping-bucket rain gauge that measures in hundredths of an inch.)
Then it eased up to a light drizzle for a while, the lightning moved
farther away, and we decided it was safe to run down the trail to
"La Cienega" (Spanish for swamp) at the bottom of the property and
see if any water had accumulated. Sure enough! Lake La Senda (our
humorous moniker for a couple of little puddles that sometimes persist
as long as a couple of days) was several inches deep. Across the road,
we could hear a canyon tree frog starting to sing his ratchety song --
almost more welcome than the sound of the rain itself.
As I type this, we're reading a touch over half an inch and we're
down to a light drizzle.
The thunder has receded but there's still plenty of lightning.
More rain! Keep it coming!
[ 20:38 Jul 23, 2018
More nature |
permalink to this entry |
I've been trying to learn more about weather from a friend who
used to work in the field -- in particular, New Mexico's notoriously
windy spring. One of the reasons behind our spring winds relates to
the location of the jet stream. But I couldn't find many
good references showing how the jet stream moves throughout the year.
So I decided to try to plot it myself -- if I could find the data.
Getting weather data can be surprisingly hard.
In my search, I stumbled across Geert Barentsen's excellent video of
variations in the jet stream. It wasn't quite what I wanted --
it shows the position of the jet stream in December in successive
years -- but the important thing is that he provides a Python script
on GitHub that shows how he produced his beautiful animation.
Well -- mostly. It turns out his data sources are no longer available,
and he didn't go into a lot of detail on where he got his data, only
saying that it was from the ECMWF ERA re-analysis model (with a
link that's now 404).
That led me on a merry chase through the ECMWF website trying to
figure out which part of which database I needed. ECMWF has
a number of publicly available databases
(and even more that aren't public),
and they even have Python libraries to access them;
and they even have a lot of documentation, but somehow none of
the documentation addresses questions like which database includes
which variables and how to find and fetch the data you're after,
and a lot of the sample code doesn't actually work.
I ended up using the "ERA Interim, Daily" dataset and requesting
data for only specific times and only the variables and pressure levels
I was interested in. It's a great source of data once you figure out
how to request it.
Sign up for an ECMWF API Key
Go to the Access ECMWF Public Datasets page (there's also one for
access via MARS, and I'm not sure what the difference is),
which has links you can click on to register for an API key.
Once you get the email with your initial password, log in using the URL
in the email, and change the password.
That gave me a "next" button that, when I clicked it, took me to
a page warning me that the page was obsolete and I should update
whatever bookmark I had used to get there.
That page also doesn't offer a link to the new page where you can
get your key details, so go here:
Your API key.
The API Key page gives you some lines you can paste into ~/.ecmwfapirc.
You'll also have to accept the
license terms for the databases you want to use.
Install the Python API
That sets you up to use the ECMWF API. They have a web
API and a Python library, as well as some
other Python packages,
but after struggling with a bunch of Magics tutorial examples that mostly
crashed or couldn't find data, I decided I was better off sticking to
the basic Python downloader API and plotting the results with Matplotlib.
The Python data-fetching API works well. To install it,
activate your preferred Python virtualenv or whatever you use
for pip packages, then run the pip command shown on the
API Downloads page.
As always with pip packages, you'll have to decide on a Python version
(they support both 2 and 3) and whether to use a virtualenv, the
much-disrecommended sudo pip, pip3, etc. I used pip3 in a virtualenv
and it worked fine.
Specify a dataset and parameters
That's great, but how do you know which dataset you want to load?
There doesn't seem to be anything that just lists which datasets have
which variables. The only way I found is to go to the Web API page
for a particular dataset to see the form where you can request
different variables. For instance, I ended up using the "ERA Interim,
Daily" database, where you can choose date ranges and lists of parameters.
There are more choices in the sidebar: for instance, clicking on
"Pressure levels" lets you choose from a list of barometric pressures
ranging from 1000 all the way down to 1. No units are specified, but
they're millibars, also known as hectoPascals (hPa): 1000 is more or
less the pressure at ground level, 250 is roughly where the jet stream
is, and Los Alamos is roughly at 775 hPa (you can find charts of
pressure vs. altitude on the web).
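Instead of looking up a chart, you can approximate the conversion yourself: the standard-atmosphere barometric formula maps pressure to altitude well enough for this kind of sanity check. A minimal sketch, assuming the ICAO standard atmosphere with 1013.25 hPa at sea level (real altitudes shift with the weather):

```python
def pressure_to_altitude_m(p_hpa, p0_hpa=1013.25):
    """Approximate altitude (meters) for a barometric pressure (hPa),
    using the standard-atmosphere formula h = 44330 * (1 - (p/p0)^(1/5.255))."""
    return 44330.0 * (1.0 - (p_hpa / p0_hpa) ** (1.0 / 5.255))

# 775 hPa works out to roughly 2200 m, close to Los Alamos' elevation;
# 250 hPa lands around 10 km, jet stream territory.
```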
When you go to any of the Web API pages, it will show you a dialog suggesting
you read about
retrieval efficiency, which you should definitely do if you're
expecting to request a lot of data, then click on the details for
the database you're using to find out how data is grouped in "tape files".
For instance, in the ERA-interim database, tapes are grouped by date,
so if you're requesting multiple parameters for multiple months,
request all the parameters for a given month together, rather than
making one request for level 250, another request for level 1000, etc.
Once you've checked the boxes for the data you want, you can fetch the
data via the web interface, or click on "View the MARS request" to get
parameters you can plug into a Python script.
If you choose the Python script option as I did, you can start with the
basic data retrieval example.
Use the second example, the one that uses
'format' : "netcdf",
which will (eventually) give you a file ending in .nc.
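Put together, the retrieval ends up looking something like this. This is a sketch rather than my exact script: the keywords follow what the "View the MARS request" button generates, but the dates, grid, and target filename here are only illustrative, and 131.128/132.128 are the ECMWF parameter codes for the u and v wind components:

```python
# Hypothetical MARS request: ERA-Interim u and v wind at three
# pressure levels for April 2015, saved as netCDF.
request = {
    "class": "ei",
    "dataset": "interim",
    "date": "2015-04-01/to/2015-04-30",
    "stream": "oper",
    "type": "an",
    "time": "12:00:00",
    "step": "0",
    "levtype": "pl",
    "levelist": "250/775/1000",
    "param": "131.128/132.128",   # U and V components of wind
    "grid": "0.75/0.75",
    # "area": "75/-20/10/60",     # optional North/West/South/East subset
    "format": "netcdf",
    "target": "jetstream.nc",
}

# The actual fetch needs your API key in ~/.ecmwfapirc:
# from ecmwfapi import ECMWFDataServer
# ECMWFDataServer().retrieve(request)
```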
Requesting a specific area
You can request only a limited area,
but they're not very forthcoming on the syntax of that, and it's
particularly confusing since "75/-20/10/60" supposedly means "Europe".
It's hard to figure out how those numbers, as latitudes and longitudes,
correspond to Europe, which doesn't go down to 10 degrees latitude,
let alone -20 degrees. Another documentation
page gives more information: it's North/West/South/East,
which still makes no sense for Europe,
until you expand the Area examples
tab on that
page and find out that by "Europe" they mean Europe plus Saudi Arabia and
most of North Africa.
Using the data: What's in it?
Once you have the data file, assuming you requested data in netcdf format,
you can parse the .nc file with the netCDF4 Python module -- available
as Debian package "python3-netcdf4", or via pip -- to read that file:
import netCDF4
data = netCDF4.Dataset('filename.nc')
But what's in that Dataset?
Try running the preceding two lines in the
interactive Python shell, then:
>>> for key in data.variables:
...     print(key)
...
longitude
latitude
level
time
u
v
You can find out more about a parameter, like its units, type,
and shape (array dimensions). Let's look at "level":
>>> data['level']
<class 'netCDF4._netCDF4.Variable'>
int32 level(level)
    units: millibars
unlimited dimensions:
current shape = (3,)
filling on, default _FillValue of -2147483647 used
>>> data['level'][:]
array([ 250,  775, 1000], dtype=int32)
"level" has shape
(3,): it's a one-dimensional array with
three elements: 250, 775 and 1000.
Those are the three levels I requested from the web API and in my
Python script. The units are millibars.
More complicated variables
How about something more complicated? u and v are the two
components of wind speed.
>>> data['u']
<class 'netCDF4._netCDF4.Variable'>
int16 u(time, level, latitude, longitude)
units: m s**-1
long_name: U component of wind
unlimited dimensions: time
current shape = (30, 3, 241, 480)
u (v is the same) has a shape of (30, 3, 241, 480): it's a
4-dimensional array. Why? Looking at the numbers in the shape gives a clue.
The second dimension has 3 rows: they correspond to the three levels,
because there's a wind speed at every level. The first dimension has
30 rows: it corresponds to the dates I requested (the month of April 2015).
I can verify that:
>>> data['time'].shape
(30,)
Sure enough, there are 30 times, so that's what the first dimension
of u and v corresponds to. The other dimensions, presumably, are
latitude and longitude. Let's check that:
>>> data['latitude'].shape
(241,)
>>> data['longitude'].shape
(480,)
Sure enough! So, although it would be nice if it actually told you
which dimension corresponded with which parameter, you can probably
figure it out. If you're not sure, print the shapes of all the
variables and work out which dimensions correspond to what:
>>> for key in data.variables:
... print(key, data[key].shape)
Iterating over times
data['time'] has all the times for which you have data
(30 data points for my initial test of the days in April 2015).
The easiest way to plot anything is to iterate over those values:
timeunits = data['time'].units
cal = data['time'].calendar
for i, t in enumerate(data['time']):
    thedate = netCDF4.num2date(t, units=timeunits, calendar=cal)
Then you can use thedate like a datetime,
thedate.strftime or whatever you need.
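That num2date call is just offset arithmetic from the epoch named in the units string. Here's a rough pure-Python sketch of what it does in the common case, assuming units like "hours since 1900-01-01 00:00:00.0" and a standard calendar (the real function supports many other calendars and unit strings):

```python
from datetime import datetime, timedelta

def hours_to_date(t, units="hours since 1900-01-01 00:00:00.0"):
    """Rough stand-in for netCDF4.num2date: parse the epoch out of a
    netCDF-style units string, then add the offset in hours."""
    epoch = datetime.strptime(units.split("since")[1].strip(),
                              "%Y-%m-%d %H:%M:%S.%f")
    return epoch + timedelta(hours=float(t))
```

With ERA-Interim's hours-since-1900 convention, an offset of 1010244 hours comes out to noon UTC on April 1, 2015.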
So that's how to access your data. All that's left is to plot it --
and in this case I had Geert Barentsen's script to start with, so I
just modified it a little to work with slightly changed data format,
and then added some argument parsing and runtime options.
Converting to Video
I already wrote about how to take the still images the program produces
and turn them into a video:
Videos (that work in Firefox) from a Series of Images.
However, it turns out ffmpeg can't handle files that are named with
timestamps, like jetstream-2017-06-14-250.png. It can only
handle one sequential integer. So I thought, what if I removed the
dashes from the name, and used names like jetstream-20170614-250.png
with %8d? No dice: ffmpeg also has the limitation that the
integer can have at most four digits.
So I had to rename my images. A shell loop works: I ran this in
zsh but I think it should work in bash too.
i=0
for fil in *.png; do
    i=$((i+1))
    newname=$(printf "%04d.png" $i)
    ln -s ../$fil moviedir/$newname
done
ffmpeg -i moviedir/%04d.png -filter:v "setpts=2.5*PTS" -pix_fmt yuv420p jetstream.mp4
The setpts=2.5*PTS filter controls the delay between
frames: it multiplies each frame's presentation timestamp,
so this plays back 2.5 times slower than the default.
When I uploaded the video to YouTube, I got a warning,
"Your videos will process faster if you encode into a streamable file format."
I then spent half a day trying to find a combination of ffmpeg arguments
that avoided that warning, and eventually gave up. As far as I can tell,
the warning only affects the 20 seconds or so of processing that happens
after the 5-10 minutes it takes to upload the video, so I'm not sure
it's terribly important.
Here's the resulting video of the jet stream from
2012 to early 2018, and an earlier effort with a
much longer 6.0x delay.
And here's the script, updated from the original Barentsen script
and with a bunch of command-line options to let you plot different
collections of data:
jetstream.py on GitHub.
[ 14:18 May 14, 2018
More programming |
permalink to this entry |
We've been considering the possibility of moving out of the Bay Area
to somewhere less crowded, somewhere in the desert southwest we so
love to visit. But that also means moving to somewhere
with much harsher weather.
How harsh? It's pretty easy to search for a specific location and get
average temperatures. But what if I want to make a table to compare
several different locations? I couldn't find any site that made
that easy.
No problem, I say. Surely there's a Python library, I say.
Well, no, as it turns out. There are Python APIs to get the current
weather anywhere; but if you want historical weather data, or weather
data averaged over many years, you're out of luck.
NOAA purports to have historical climate data, but the only dataset I
found was spotty and hard to use. There's an
FTP site containing
directories by year; inside are gzipped files with names like
723710-03162-2012.op.gz. The first two numbers are station numbers,
and there's a file at the top level called ish-history.txt
with a list of the station codes and corresponding numbers.
Once you figure out the station codes, the files themselves are easy to
parse, with lines like
STN--- WBAN YEARMODA TEMP DEWP SLP STP VISIB WDSP MXSPD GUST MAX MIN PRCP SNDP FRSHTT
724945 23293 20120101 49.5 24 38.8 24 1021.1 24 1019.5 24 9.9 24 1.5 24 4.1 999.9 68.0 37.0 0.00G 999.9 000000
Each line represents one day (20120101 is January 1st, 2012),
and the codes are explained in another file in the same FTP area.
For instance, MAX is the daily high temperature, and SNDP is snow depth.
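Parsing those lines really is straightforward. A sketch of the approach using the sample line above (the token positions follow the header row; the function name is made up, and real files may tack '*' flags onto MAX and MIN, which this strips):

```python
def parse_gsod_line(line):
    """Pull the fields used here out of one NOAA GSOD daily line.
    Assumes whitespace-separated tokens in the order of the sample
    header: STN, WBAN, YEARMODA, TEMP, count, DEWP, count, ... MAX,
    MIN, PRCP, SNDP, FRSHTT. Temperatures are in degrees F."""
    t = line.split()
    return {
        "station": t[0],
        "date": t[2],                      # YEARMODA, e.g. 20120101
        "temp": float(t[3]),               # daily mean temperature
        "max": float(t[17].rstrip("*")),   # daily high
        "min": float(t[18].rstrip("*")),   # daily low
        "snow_depth": float(t[20]),        # SNDP; 999.9 means missing
    }

sample = ("724945 23293 20120101 49.5 24 38.8 24 1021.1 24 1019.5 24 "
          "9.9 24 1.5 24 4.1 999.9 68.0 37.0 0.00G 999.9 000000")
day = parse_gsod_line(sample)
# day["max"] -> 68.0, day["min"] -> 37.0
```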
So all I needed was to decode the station names, download the right files
and parse them. That took about a day to write (including a lot of
time wasted futzing with mysterious incantations for matplotlib).
Little accessibility refresher: I showed it to Dave -- "Neat, look at
this, San Jose is the blue pair, Flagstaff is green and Page is red."
"This makes no sense. They all look the same to me. I have no idea
which is which."
Oops -- right. Don't use color as your only visual indicator. I knew that,
supposedly! So I added markers in different shapes for each site.
(I wish somebody would teach that lesson to Google Maps, which uses
color as its only indicator on the traffic layer, so it's useless
for red-green colorblind people.)
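Giving each site a distinct marker shape alongside its color is a small change. A minimal sketch (the site list is hypothetical, and the Matplotlib call is shown only as a comment):

```python
import itertools

# Cycle through distinct Matplotlib marker shapes so each site stays
# identifiable without relying on color alone. 'o', 's', '^', etc.
# are standard Matplotlib marker codes.
markers = itertools.cycle(['o', 's', '^', 'v', 'D', '*', 'x'])
sites = ['KSJC', 'KFLG', 'KPGA']
styles = {site: next(markers) for site in sites}

# Then each plot call gets a shape as well as a color:
# plt.plot(dates, temps, marker=styles[site], label=site)
```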
Back to the data --
it turns out NOAA doesn't actually have that much historical data
available for download. If you search on most of these locations,
you'll find sites that claim to have historical temperatures dating
back 50 years or more, sometimes back to the 1800s. But NOAA typically
only has files starting at about 2005 or 2006. I don't know where
sites are getting this older data, or how reliable it is.
Still, averages since 2006 are interesting to compare.
Here's a run of
noaatemps.py KSJC KFLG KSAF KLAM KCEZ KPGA KCNY.
It's striking how moderate California weather is compared
to any of these inland sites. No surprise there. More surprising
was that Los Alamos, despite its high elevation, has more moderate weather
than most of the others -- lower highs, higher lows. I was a bit
disappointed at how sparse the site list was -- no site in Moab?
Really? So I used Canyonlands Field instead.
Anyway, it's fun for a data junkie to play around with, and it prints
data on other weather factors, like precipitation and snowpack, although
it doesn't plot them yet.
The code is on my
scripts page, under Weather.
Anyone found a better source for historical weather information?
I'd love to have something that went back far enough to do some
climate research, see what sites are getting warmer, colder, or
seeing greater or lesser spreads between their extreme temperatures.
The NOAA dataset obviously can't do that, so there must be something
else that weather researchers use. Data on other countries would be
interesting, too. Is there anything that's available to the public?
[ 22:57 Apr 13, 2013
More programming |
permalink to this entry |