Circumventing the Mathematica & Illustrator Font Problem

I often have trouble when exporting Mathematica graphics to PDF or EPS for later post processing in Adobe Illustrator (or InkScape). I am not exactly sure what causes the problem, but the incompatibility is font dependent, so I have this workaround.

I want the final graphics to include Helvetica text, so I choose a non-problematic font that is similar to Helvetica, like “Microsoft Sans Serif”, and use it for Mathematica graphics. The key characteristic of the non-problematic font is that it imports into Illustrator as text (not outlines, not non-printg characters that render as boxes or symbols, etc.). So something like this works:

BaseStyle -> {21, FontFamily -> "Microsoft Sans Serif"}

Then I can replace the temporary font in Illustrator using the Find-Font dialog box.

The bug is really annoying, but always shows itself when you are working on a figure and don’t have time to track down exactly what is going on.

Writing Data to Simple HDF5 Files with Mathematica

HDF5 files (https://support.hdfgroup.org) are a convenient format for storing related data and information. HDF5 is the basis of the netCDF format used in Generic Mapping Tools and the native format for the most recent versions of Matlab. Storing your data or computations in HDF5 allows you to import the data into many different tools and into the most common computer languages (Python handles them very well (see http://www.h5py.org) , C and Fortran are more tedious, but well supported – see the hdfgroup pages for libraries and documentation).

Mathematica can be used to store data in HDF5 files as long as you keep things simple. To my knowledge, you cannot export HDF5 attributes using Mathematica. But you can easily create and use relatively simple HDF5 file. Simple is often a good choice. The following lines can get you started. You simple use the Export command and it handles all the details (and there are many details, for example, if you are using fortran).

hdf5FileName = "~/test.h5";
(* the first call creates the hdf5 file if it doesn't exist *)
Export[hdf5FileName, theModel, {"Datasets", "TheVModel"}]
(* this call appends a new data set to the hdf5 file *)
Export[hdf5FileName, dsprsn, {"Datasets", "/Love/Dispersion"}, "Append" -> True ]
(* writing text to the file *)
Export[hdf5FileName, infoString, {"Datasets", "/00README.txt"}, "Append" -> True ]

The file suffix “.h5” on the end of the filename identifies the HDF5 file format. The variables theModel and dsprsn are Mathematica lists containing numerical values. I tend to think of the lists as tables of values like you would see in a spreadsheet. I include a description of the data by exporting a string (infoString) into the 00README.txt dataset. This is not an ideal approach, but it is easy, simple, and works for my internal research applications.

Once HDF5 files are created, they can be loaded as easily and displayed, used in computations, etc. You can also use simple tools to explore (and even edit) the HDF5 file to make sure that the process works. Definitely get the command line tools from https://support.hdfgroup.org. One handy, simple interactive tool is HDFView, which I often use to double check things in the data that I’ve written.

Increasing Axes Tick Length With Mathematica

One of my long-standing criticisms of Mathematica’s graphics is the ultra small default tick marks used most graphics. I assume the goal is to not have people focus on the tick marks but on the plot content, but the default tick marks are so small that any analysis of the graphic is difficult. You can find numerous posts online asking how to increase the tick length, but few simple solutions (that don’t rely on external packages). As I have described in earlier posts you can define you own tick marks and pass those to any common graphics command, which is fine for custom final figures, but for general applications the requirement that you know the best major and minor tick spacings makes it tedious.

While searching for a fix for a bug in the LogLogPlot command (which doesn’t even let you change tick thickness) I came across some examples on stackexchange.com that used some private Charting functions to compute the major and minor tick lists. I haven’t found any detailed help on these functions, but the stackexchange.com descriptions were enough to get something working.

Linear Axes

Here’s my function for asking Mathematica for nicely spaced major and minor ticks but increasing the length of the tick marks.

(* arguments: min and max axis value, scale factor for tick length *)
GetScaledLinearTicks[min_, max_, scale_] := Module[{ticks}, 
  ticks = #[min, max] & /@ 
    {Charting`ScaledTicks[{Identity, Identity}], 
     Charting`ScaledFrameTicks[{Identity, Identity}]};
  (*scale the tick length*)
  Table[{#[[1]], #[[2]], scale*#[[3]]} & /@ ticks[[i]], {i, 2}]]

Here’s an example usage,

lticks = {GetScaledLinearTicks[-1.1, 1.1, 2.5],
          GetScaledLinearTicks[-3.2, 3.2, 2.5]};
(**)
pd = Plot[Cos[x], {x, -Pi, Pi}, Frame -> True, ImageSize -> 300,
    PlotLabel -> "Default Tick Length"];
(**)
pl = Plot[Cos[x], {x, -Pi, Pi}, Frame -> True, 
   FrameTicks -> lticks, PlotLabel -> "Tick Length Scaled by 2.5"];
(**)
GraphicsRow[{pd, pl}] // Print

and here is the default and scaled tick length outputs,

Default and scaled ticks using the function listed above. The axes limits are not exactly the same, but are close enough to produce the same major and minor tick spacing.

Continue reading

Generating Sequential File Names with Mathematica

One common task when outputting scientific computation results, figures, etc. is the need to include numbers in a file name. For example, including leading zeros in a name can ensure that the files are ordered sequentially when you list the directory contents. If the numerical values correspond to real numbers (as opposed to integers) you may want to include decimals after the decimal point.

The function below generates a name that starts with the string “T_” and then includes a left and right zero padded number at the end of the file name.

BuildNameForPeriod[period_, n_] := 
 Module[{}, 
  "T_" <> StringTake[ToString[NumberForm[period, {n, 2}, 
       NumberPadding -> {"0", "0"}]], -n]
 ]

You can test it with

Table[BuildNameForPeriod[i,6],{i,0,200,5}]

which produces the output

{T_000.00,T_005.00,T_010.00,T_015.00,T_020.00,T_025.00,
T_030.00,T_035.00,T_040.00,T_045.00,T_050.00,T_055.00,
T_060.00,T_065.00,T_070.00,T_075.00,T_080.00,T_085.00,
T_090.00,T_095.00,T_100.00,T_105.00,T_110.00,T_115.00,
T_120.00,T_125.00,T_130.00,T_135.00,T_140.00,T_145.00,
T_150.00,T_155.00,T_160.00,T_165.00,T_170.00,T_175.00,
T_180.00,T_185.00,T_190.00,T_195.00,T_200.00}

Of course there ought to be an easier way (and probably is), but this seems to work.

Don’t Study Large Earthquakes with Mathematica’s EarthquakeData

Mathematica contains information on earthquakes that is accessible using the built-in EarthquakeData function. The function includes useful search approaches that allow users to select earthquakes by magnitude, depth, date, etc. The geographic selection tools are particularly useful – you can select earthquakes by geographic regions, polygons, distance from a reference location, etc.

Unfortunately, something is wrong with the data. The earthquake information are not what most earthquake seismologists would agree is accurate. I have not combed through their entire data set, but the Wolfram data seem to have a problem with at least earthquake magnitudes. Although magnitude is the most commonly quoted estimate of an earthquake size, it is not a quantity easy for nonspecialists to understand and to package for general use.

The development of earthquake magnitude scales has a complex history because observational seismology developed before the firm theoretical basis of faulting and earthquakes was established. Many magnitudes may be associated with a large earthquake and not all are of equal value or even accurately reflect the size of the earthquake. Seismologists continue to measure and use a suite of magnitudes because that’s how we can compare old and new earthquakes.

The data for large earthquakes returned by the EarthquakeData command are inaccurate. I emailed Wolfram several years ago to describe problems I saw on Wolfram Alpha and they responded, but serious, obvious problems remain.  This is not a big deal to me, I use Mathematica to analyze and display earthquake data often, but I never use Mathematica’s data. But two things concern me: (1) Do all the data included with Mathematica suffer from such substantial errors (I don’t think so, but I wouldn’t know the details); (2) I worry about students exploring earthquakes on their own who perform calculations with the Mathematica’s data and who then are disappointed and discouraged that the effort was wasted because the input was flawed.

In the following, I evaluate some of the EarthquakeData information for large earthquakes. Problems appear quickly.

A simple search for a list of great earthquakes

Great earthquakes are earthquakes with magnitudes greater than or equal to 8.0. Mathematica makes searching for earthquakes with a minimum magnitude convenient. The search results are a list earthquake entities that contain properties related to each earthquake in the list. Since we know much more about recent earthquakes than historical events, some properties for the older events may be missing.

Here is an example query to get a list of all great earthquakes in the Wolfram data base.

(* Select earthquakes with magnitudes greater than or equal to 8.0 *)
eqs = EarthquakeData[All, 8.0];
Print["Returned " <> ToString[Length[eqs]] <> " earthquake entities."]

Continue reading

Adding FrameTicks to Maps with Mathematica

As I’ve illustrated in a number of posts, Mathematica is developing a number of useful and relatively easy way to construct maps. One of the most frustrating issues with the implementation is that the default latitude tick marks are not in degrees (along with the ever-present problem that they have no idea how long a tick mark should be in a quality plot). In this post I describe a simple function to create tick marks for the Mercator projection maps, and the code is relatively easily generalized to work with other projections.

GeoGraphics – Incorrect Ticks Marks

Let’s start with a simple map of the location of the 2015 Nepal (Gorkha) Earthquake.

(* set up the earthquake information *)
cat = {607208674, "CSEM", "2015-04-25", "06:11:26.30", 28.28, 84.79, 
       "     ", "", "GCMT", "MW", 7.9};
eloc = {#[[5]], #[[6]], #[[11]]} & /@ {cat}
eqradius[m_] := N[10^(0.5*m - 2.25)]
scaleFactor = 1;
(* plot the map *)
GeoGraphics[{Polygon[Entity["Country", "Nepal"]], 
  GeoStyling[Opacity[0.25], EdgeForm[Black], FaceForm[Red]],
  GeoDisk[{#[[1]], #[[2]]}, (scaleFactor*Quantity[eqradius[#[[3]]], "km"]) ] & /@ eloc},
  GeoRange -> {{26.25, 29.75}, {82.5, 88.5}}, Frame -> True,
  FrameTicks -> {Automatic, Automatic},
  BaseStyle -> {18, FontFamily -> "Helvetica"},
  PlotLabel -> Style["Gorkha, Nepal Earthquake, 2015-04-25 06:11:26.30 UTC, Mw 7.9", {16}],
  GeoZoomLevel -> 8, FrameStyle -> AbsoluteThickness[1.2],
  GeoScaleBar -> {"Imperial", "Metric"},
  ImageSize -> 600] // Print

Here is the output

Mathematica map of the 2015 Nepal earthquake epicenter plotted with the default FrameTicks. Those are not latitudes on the vertical axis.

The map looks pretty good, but those are not latitude tick marks. Worse yet, in this example, they happen to be close enough to actual latitudes that they could fool you. The latitude of the center of the red circle is 28.28N, the tick marks lead you to believe the location is at roughly 29.5N. Technically, the tick marks are not wrong, they are simply in the practically useless transformation coordinate system. To get the correct tick marks, you need to specify them yourself, convert to projection coordinates, and label appropriately.

Continue reading

Manipulating Elevation/Bathymetry Profiles with Mathematica

Introduction

One of the earliest posts that I made on this site described how to use Mathematica to create elevation/bathymetry profiles for the Earth, at a specified latitude. I use those profiles to have students in my 100-level non-science majors course on earthquakes and society become familiar with Earth’s surface topography and its relationship to plate tectonic features. We are now studying tsunami, and one of the activities that we complete is an estimate of the travel time from an earthquake in the eastern Atlantic to a set of cities along the east coast of the United States. The source is located near the assumed source region of the 1755 Lisbon Earthquake.

We compute the rough speed of the tsunami, \(c\) using the simple formula \(c = \sqrt{d \cdot g}\), where \(d\) is the mean depth of the water along the great-circle arc connecting the source region and the point of interest, and \(g\) is the acceleration of gravity (about \(10~m/s^2\)). In large classes I provide a map with depth of the ocean contoured and including great circle arcs connecting the source and locations of interest and students estimate the mean depth along the path. They generally do ok, but estimating the mean depth is a challenge (one worthwhile). In my smaller courses, which are taught in computer laboratories, we can do the analysis using Mathematica notebooks. Then students can plot the great circle arcs, look at the profile, estimate the mean depth, and then compute the mean depth.

The Code

Here’s the code listing to extract the elevation/bathymetry along an arc connecting points 0 and 1, specified by the latitude and longitudes (e.g. lat1,lon1)

(* city *)
lat0 = 40.66;
lon0 = -73.94;
(* eq *)
lat1 = 36;
lon1 = -10;
(* get the great-circle arc and the distance along the arc *)
gcarc = GeoPath[{{lat0, lon0}, {lat1, lon1}}, "GreatCircle"];
gcarcDistance = GeoDistance[{{lat0, lon0}, {lat1, lon1}}, UnitSystem -> "Metric"]
(**)
p0 = Framed[GeoGraphics[{Red, AbsoluteThickness[4], gcarc}, 
     GeoRange -> {{0, 70}, {-120, 20}},
     GeoProjection -> "CylindricalEquidistant", GeoZoomLevel -> 6, 
     GeoBackground -> "ReliefMap", ImageSize -> Full]];
(* extract the elevations along the great-circle arc *)
profile = GeoElevationData[gcarc, Automatic, "GeoPosition", GeoZoomLevel -> 4];
(* extract the pts = {lat,lon,elevation} from the results *)
pts = profile[[1]][[1]];
depths = #[[3]] & /@ pts;
distances =  QuantityMagnitude[GeoDistance[{pts[[1]][[1;;2]], #[[1;;2]]}, 
             UnitSystem -> "Metric"]] & /@ pts;
avg = Mean[depths];
pts = Transpose[{distances, depths}];
(* plot the elevations along the line and a dashed line for sea level *)
label =  "Mean depth along the path: " <> ToString[Round[-avg, 5]] <> " m";
(**)
p1 = ListPlot[pts, Joined -> True, 
     Frame -> {True, True, False, False}, Axes -> False, 
     Filling -> Bottom, ImageSize -> Full,
     PlotLabel -> Style[label, 18], AspectRatio -> 0.4, BaseStyle -> 14,
     Prolog -> {Blue, Dashed,
     Line[{{distances[[1]], 0}, {distances[[-1]], 0}}], Black, 
     Line[{{distances[[1]], avg}, {distances[[-1]], avg}}]}];
(**)
Column[{p0, p1}] // Print

The Output

The graphics that result are show below – the dashed blue line is sea-level, the dashed black line is the mean value. A minor goal of the computation is to have students see the relationship between the mean depth and the profile. Continue reading

Mapping Countries, States, & Cities with Mathematica

The last few updates to Mathematica have added some useful cartographic tools for map making. Some of my earlier posts included some simple maps. For any serious cartography, you need to be able to plot the boundaries of various “administrative” units that make up a region – these can be states, provinces, counties, cities, etc. Mathematica, like python can do just about anything that you really set your mind to, so if you have geographic information in existing files, you can parse it and display it with the GeoGraphics command. In this note I show how to acquire information on basic administrative units, in particular, their coordinates, so they can be displayed on the maps you use to show your own data (earthquake locations, seismic stations, etc).

When I make a map, I usually have a set of bounds that define the region of interest (influenced strong by the Generic Mapping Tools) by specifying the west, east, south, and north edges. Let’s start with an example defining a geographic rectangle and requesting a list of the countries that overlap our region. In  the example I was exploring seismic stations available in northeast China so the box covers that region. Instead of plotting seismometer locations, I’ll plot country boarders and large city locations. We’ll need the country list, country polygons, and a list of large cities and their populations.

(* define the geographic region of interest *)
west = 112;
east = 137;
south = 35;
north = 51;

(* acquire a list of countries that are contained in the region *)
regionalCountries = 
 GeoEntities[Rectangle[GeoPosition[{south, west}], GeoPosition[{north, east}]],
        "Country"]

(* acquire the coordinate polygons for each country in the list *)
regCountryPolygons = CountryData[#, "Polygon"] & /@ regionalCountries;

regionalCities = GeoEntities[Rectangle[GeoPosition[{south, west}], GeoPosition[{north, east}]], 
 "City"]
(* create a list of large cities in the region - 
          minimum population is 1,000,000 *)
largeRegionalCities = Select[regionalCities,
     QuantityMagnitude[CityData[#, "Population"]] >= 1000000 &];

You can see what properties are available from CityData using

In[1]:= CityData["Properties"]

Out[1]= {"AlternateNames", "Coordinates", "Country", "Elevation", "FullName", 
"Latitude", "LocationLink", "Longitude", "Name", "Population", "Region", 
"RegionName", "StandardName", "TimeZone"}

Here’s what the output in the notebook looks like. The “data” requests return Mathematica “Entities“, which are information objects that can be queried for additional information, as we did asking the countries for cities and asking the cities for populations and coordinates. Entities are represented by the orange rounded rectangles in the notebook. Continue reading

Understanding FFT Scaling with Matlab (or Python, or…)

The Fourier Transform is one of the most frequently used computational tools in earthquake seismology. Using an FFT requires some understanding of the way the information is encoded (frequency ordering, complex values, real values, etc) and these are generally well documented in the various software packages used in the field. I’ll assume that you have a good handle on these issues and that you know the basic idea about using forward and inverse transforms. I want to focus on an aspect of FFT use that often confuses students, the scaling provided by the transforms. Scaling can be confusing because to make the functions more efficient and flexible, the scaling is often omitted, or it is implicitly included by assuming that you plan to forward and inverse transform the same signal.

In seismology we usually start with a signal that has a specified sample rate, \(\delta t\), and a length, \(N\). I’ll assume that we are dealing with a signal for which \(N\) is a power of two, which used to be the case whenever we used FFT’s but is less required now that computers are much faster. FFT routines don’t use a specific value of sample rate, \(\delta t\), at all, but they don’t always completely ignore it – they sometimes include \(\delta t\) implicitly in their scaling. For most cases, when you compute a forward transform, you have to scale the result by the sample rate to get the correct values (this \(\delta t\) arises from the \(\delta t\) in the Fourier Transform that the DFT is approximating).

Consider the case in Matlab. We’ll construct a unit area “spike” in Matlab by creating a signal with a single non-zero value. So we can keep track of the physical units, which are also ignored in the computation but are essential to keep track of when doing science, I’ll assume that our signal is a displacement seismogram with units of meters and that the units of \(\delta t\) are seconds. To create a signal with unit area, we must set the amplitude equal to \(1 / \delta t ~\mbox{meters}\). That is the spike amplitude has the same numerical value as \(1/\delta t\), but units of length. For a spike with an amplitude \(1/\delta t\), the area under the curve in the time domain is (using a the formula for the area of a triangle with width 2 dt):

\[Area = \frac{1}{2} \cdot 2 \times \delta t~\mbox{[s]} \cdot \frac{1}{\delta t}~\mbox{[m]}= 1 \mbox{ [m-s] .} \]

The Fourier Transform of a unit area spike should have a value of unity. Let’s see what we get in Matlab.

 >> dt = 0.1;
 >> s = [1/dt,0,0,0,0,0,0,0];
 >> shat = fft(s) % amplitudes are wrong without the correct scaling
 shat = 10 10 10 10 10 10 10 10

The spectrum is a factor of \(1/\delta t\) too large. We get the correct result  by multiplying by \(\delta t\).

>> shat = fft(s) * dt % you must scale by dt to get the correct values shat = 1 1 1 1 1 1 1 1

When we inverse transform, we have to account for the \(\delta f\) in the Fourier Transform Integral. That means we multiply by \(\delta f\). For a signal of length \(N\) and sample rate \(\delta t\), the frequency sampling rate, \(\delta f\) is

\[\delta f = \frac{1}{N \cdot \delta t}~. \]

Here’s the tricky part, Matlab multiplies the inverse transform by \(1/N = \delta t \cdot \delta f\). So we have to correct for that by either dividing by \(\delta t\) or simply remove the \(1/N\) factor and apply the \(\delta f\), which makes things easier to understand. Here’s a summary, I explicitly multiply the results of the inverse transform (ifft) by \(N\) to remove Matlab’s scaling, then I scale by \(\delta f\).

dt = 0.1;
N = 8;
df = 1/(N*dt);
% define a unit area signal
s = [1/dt,0,0,0,0,0,0,0]
% forward transform - scale by dt
shat = fft(s) * dt
% inverse transform - remove 1/N factor, then scale by df
ifs = (N * ifft(shat)) * df

Summary

The first thing that I do when I start using a new FFT, is to test the amplitudes using a unit-area delta-function approximation (which is really a triangle). The unit area time domain signal must have a value of unity at the zero frequency, and if it’s a spike, the spectrum should be flat. If you think this is not that critical, then you need to think about deconvolutions, earthquake-source spectra calculations, etc. To get the correct amplitudes, you have to apply your scaling (unless you do a forward and then inverse transform without applying any operations using physical quantities to the original signal).

There’s nothing special about Matlab here – you can use the same tests to understand the scaling in Mathematica, Python, Fortran, C, JavaScript, …

Searching the USGS Earthquake Catalog with Mathematica

This note documents a simple in-class activity that I use in my GEOSC 109H earthquakes and society class. In class I describe the Gutenberg-Richter Relation that represents the distribution of earthquakes as a function of magnitude (a nice paper-based activity is available from SCEC). I show the students a plot from the GCMT catalog in class, in this activity, they use Mathematica to download data for a few randomly chosen years and see if the pattern I showed in class holds up.

The Mathematica notebook begins with a few utility functions that allow students to use web services to acquire seismicity data and a few functions to list some events (which is partly a check to see that things are working). Below is a listing of the search script (an example usage is provided below).

comcatRectangularSearchPath[start_, end_, minlat_, maxlat_, minlon_, maxlon_] := Module[{},
  "http://earthquake.usgs.gov/fdsnws/event/1/query?format=csv&starttime=" <> 
   DateString[start, "ISODateTime"] <> "&endtime=" <> 
   DateString[end, "ISODateTime"] <> 
   "&minmagnitude=4.0&minlatitude=" <> 
   ToString[Round[minlat, 0.001]] <>
   "&maxlatitude=" <> 
   ToString[Round[maxlat, 0.001]] <>
   "&minlongitude=" <> 
   ToString[Round[minlon, 0.001]] <> "&maxlongitude=" <> 
   ToString[Round[maxlon, 0.001]] <> "&orderby=time-asc"
  ]

comcatRectangularSearch[start_, end_, minlat_: - 90, maxlat_: 90, 
  minlon_: - 180, maxlon_: 180] := Module[{path},
   path = comcatRectangularSearchPath[start, end, minlat, maxlat, minlon, maxlon];
  {path, Import[path, "CSV"]}
  ]

The scripts are simplistic, and to avoid errors for some of the more active years, I limit the magnitude range of the analysis to events greater than or equal to magnitude 4.0. The code below is a function to list the largest events in the search results.

Continue reading