Data Preparation
The example ships with everything needed to launch the inversion immediately. This page explains the data files, the travel-time CSV schema, and how to optionally build a 3-D initial model from CSEM.
Travel-time data (src_rec_data_wus.csv)
The travel-time table uses the standard SurfATT CSV schema with station and event elevation columns:
tt,staname,stla,stlo,stel,evtname,evla,evlo,evel,period,weight,dist,vel
48.8896,109C-TA,32.889,-117.105,0,BBR-CI,34.262,-116.921,0,5.0,1,153.22,3.134
62.2573,109C-TA,32.889,-117.105,0,BC3-CI,33.655,-115.454,0,5.0,1,175.69,2.822
52.048,109C-TA,32.889,-117.105,0,BEL-CI,34.001,-115.998,0,5.0,1,160.62,3.086
...

| Column | Description |
|---|---|
| tt | Travel time (s) |
| staname | Receiver station code (e.g. 109C-TA) |
| stla / stlo / stel | Receiver latitude, longitude, elevation (m) |
| evtname | Virtual-source station code |
| evla / evlo / evel | Source latitude, longitude, elevation (m) |
| period | Rayleigh-wave period (s) |
| weight | Measurement weight |
| dist | Inter-station distance (km) |
| vel | Measured phase velocity (km/s) |
The measurements come from USArray-TA and co-located regional networks (CI, US, etc.). The shortest period, 5 s, is sensitive mainly to the uppermost crust; the longer periods, up to 40 s, sample the Moho and uppermost mantle.
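In the three sample rows quoted earlier, tt agrees with dist / vel to within rounding. The sketch below checks that relation using only the standard library. The relation is inferred from those rows, not from a documented guarantee, and both the helper name travel_time_residuals and the 0.01 s tolerance are assumptions made for illustration.

```python
import csv
import io

# Header and the three sample rows quoted on this page.
SAMPLE = """\
tt,staname,stla,stlo,stel,evtname,evla,evlo,evel,period,weight,dist,vel
48.8896,109C-TA,32.889,-117.105,0,BBR-CI,34.262,-116.921,0,5.0,1,153.22,3.134
62.2573,109C-TA,32.889,-117.105,0,BC3-CI,33.655,-115.454,0,5.0,1,175.69,2.822
52.048,109C-TA,32.889,-117.105,0,BEL-CI,34.001,-115.998,0,5.0,1,160.62,3.086
"""


def travel_time_residuals(csv_text):
    """Return tt - dist / vel (seconds) for every row of a travel-time CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        float(row["tt"]) - float(row["dist"]) / float(row["vel"])
        for row in reader
    ]


residuals = travel_time_residuals(SAMPLE)
# Assumption: 0.01 s is loose enough to absorb rounding of dist and vel.
inconsistent = [r for r in residuals if abs(r) > 0.01]
print(f"{len(residuals)} rows checked, {len(inconsistent)} inconsistent")
```

Rows flagged by a check like this would be worth inspecting before inversion, since they may indicate unit or column mix-ups.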
Quick summary
import pandas as pd
df = pd.read_csv("src_rec_data_wus.csv")
print(f"Total measurements : {len(df):,}")
print(f"Unique stations : {pd.concat([df['staname'], df['evtname']]).nunique()}")
print(f"Period range : {df['period'].min():.1f} – {df['period'].max():.1f} s")
print(f"Periods : {sorted(df['period'].unique().tolist())}")
print(f"Velocity range : {df['vel'].min():.2f} – {df['vel'].max():.2f} km/s")
print(f"Distance range : {df['dist'].min():.0f} – {df['dist'].max():.0f} km")

Running this on the bundled src_rec_data_wus.csv prints:
Total measurements : 334,554
Unique stations : 689
Period range : 5.0 – 40.0 s
Periods : [5.0, 6.0, 8.0, 10.0, 12.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
Velocity range : 1.36 – 4.10 km/s
Distance range : 42 – 600 km

So the dataset provides roughly 335,000 phase-velocity measurements among 689 stations at 11 discrete periods (5, 6, 8, 10, 12, 15, 20, 25, 30, 35, 40 s), with inter-station distances of 42–600 km.
Optional: build a 3-D initial model from CSEM
The default configuration sets init_model_type: 1, which builds the starting model from a 1-D inversion of the average travel times and converges robustly without extra setup. To start from a higher-quality 3-D reference instead, convert the bundled CSEM (Collaborative Seismic Earth Model) NetCDF file to HDF5:
import h5py
import numpy as np
from scipy.io import netcdf_file


def read_nc(fname):
    # mmap=False reads the data into memory, so the arrays stay valid after close()
    f = netcdf_file(fname, mode="r", mmap=False)
    try:
        vsv = np.asarray(f.variables["vsv"][:])
        vsh = np.asarray(f.variables["vsh"][:])
        x = np.asarray(f.variables["longitude"][:])
        y = np.asarray(f.variables["latitude"][:])
        z = np.asarray(f.variables["depth"][:])
    finally:
        f.close()
    # Voigt-average isotropic Vs; .T reverses the axis order of the array
    vs = np.sqrt((2.0 * vsv**2 + vsh**2) / 3.0).T
    return x, y, z, vs


def write_h5(fname, x, y, z, vs):
    with h5py.File(fname, "w") as f:
        f.create_dataset("x", data=x)
        f.create_dataset("y", data=y)
        f.create_dataset("z", data=z)
        f.create_dataset("vs", data=vs)


if __name__ == "__main__":
    x, y, z, vs = read_nc("csem.nc")
    write_h5("csem.h5", x, y, z, vs)

Then switch init_model_type to 2 in input_params.yml:
model:
  init_model_type: 2
  init_model_path: csem.h5

CSEM is distributed in NetCDF classic format (CDF1/CDF2), which is not HDF5-compatible, so use scipy.io.netcdf_file for the read step rather than h5py.