Python toolbox for Probabilistic Acoustic Sediment Mapping
PriSM is a program for substrate mapping from multibeam acoustic backscatter data using task-specific probabilistic approaches.
It works with both single-frequency (monospectral) and multi-frequency (multispectral) backscatter data inputs.
Two models are implemented:
1) Gaussian Mixture Model (GMM); and
2) fully-connected Conditional Random Field (CRF).
The program consists of 6 tabs, which should be navigated in order (read, classify, plot, export).
Read Backscatter and Ground Truth Data
Select backscatter GeoTIFF file(s). Each file contains a raster of gridded backscatter values at a certain frequency. One or multiple files may be selected. For monospectral data, this is a single .tiff file containing a 2D grid. For multispectral data, this is multiple .tiff files, each containing a 2D grid.
Select bed observations file. This is either a CSV file containing bed observations (codes) referenced by WGS84 (longitude, latitude) coordinates or an ESRI shapefile containing the same information.
Set output grid resolution (m)
Set buffer distance (m). This is the ground distance over which each bed observation is assumed to apply. For example, if the buffer distance is 10 m, then the bed code at a point is assumed to apply to all grid nodes within 10 m. The larger this distance, the more data the model has to train with.
Set weight for Chambolle filter (0 for no filter). This filter despeckles the backscatter data. The principle of total variation denoising is explained in http://en.wikipedia.org/wiki/Total_variation_denoising
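To illustrate the buffer distance setting described above, here is a minimal sketch of how a bed code might be spread to nearby grid nodes. It is not PriSM's own code. The function name `label_nodes_within_buffer` is hypothetical, coordinates are assumed to be already projected to metres, and the rule that the nearest observation wins when buffers overlap is an assumption.

```python
import math

def label_nodes_within_buffer(nodes, observations, buffer_m):
    """Give each grid node the code of the nearest observation within buffer_m.

    nodes: list of (x, y) positions in projected metres (assumed).
    observations: list of (x, y, code) tuples in the same coordinates.
    Returns a dict mapping node index -> bed code; nodes outside every
    buffer are left out (unlabelled).
    """
    labelled = {}
    for i, (nx, ny) in enumerate(nodes):
        best = None  # (distance, code) of the nearest observation so far
        for ox, oy, code in observations:
            d = math.hypot(nx - ox, ny - oy)
            if d <= buffer_m and (best is None or d < best[0]):
                best = (d, code)
        if best is not None:
            labelled[i] = best[1]
    return labelled

# A 3 x 3 grid with 5 m spacing and two bed observations at opposite corners.
nodes = [(x, y) for y in (0.0, 5.0, 10.0) for x in (0.0, 5.0, 10.0)]
observations = [(0.0, 0.0, 1), (10.0, 10.0, 2)]

small = label_nodes_within_buffer(nodes, observations, buffer_m=1.0)
large = label_nodes_within_buffer(nodes, observations, buffer_m=6.0)
print(len(small), len(large))
```

With the larger buffer more nodes receive a bed code, which is the sense in which a larger distance gives the model more training data.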
Substrate Classification using a Gaussian Mixture Model (GMM)
Applies the GMM model to the data, generating grids of 1) substrate class and 2) posterior probability of substrate class. Model parameters are set using sliding switches.
'Prob. Threshold' is the threshold probability below which a classification is considered indeterminate ('unknown'). In general, the higher this number, the fewer grid cells are classified, but the greater the confidence in those grid cells that are classified. 'Test proportion' is the proportion of the data used for testing the model. The remaining proportion is used to train the model.
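The two settings above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the program's implementation: the `UNKNOWN` code of -1, the function names, and the shuffled split are all hypothetical choices.

```python
import random

UNKNOWN = -1  # hypothetical code for indeterminate cells

def classify_with_threshold(posteriors, prob_threshold):
    """Pick the most probable class per cell, or UNKNOWN if below threshold."""
    labels = []
    for probs in posteriors:
        best = max(range(len(probs)), key=lambda k: probs[k])
        labels.append(best if probs[best] >= prob_threshold else UNKNOWN)
    return labels

def train_test_split(samples, test_proportion, seed=0):
    """Shuffle the samples and hold out test_proportion of them for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_proportion * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

# Posterior probabilities for three grid cells and two substrate classes.
posteriors = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
low = classify_with_threshold(posteriors, 0.5)
high = classify_with_threshold(posteriors, 0.7)

train, test = train_test_split(range(10), 0.3)
print(low, high, len(train), len(test))
```

Raising the threshold from 0.5 to 0.7 turns the ambiguous middle cell into an unknown, while the two confident cells keep their labels.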
'Covariance type' refers to how the Sigma parameter is specified in the model. The 4 options are explained in the instructions on the tab, and also in this paper.
'Tolerance' is a numerical value that determines when the model stops iterating (very generally, smaller = more accurate model). The model stops when the average gain in posterior probability from the previous iteration falls below this number.
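The role of the tolerance can be illustrated with a small expectation-maximization loop for a one-dimensional Gaussian mixture. This is a sketch only: it monitors the gain in mean log-likelihood per sample, which is an assumption standing in for whatever quantity the program tracks, and the function name, starting means, and backscatter values are made up for the example.

```python
import math

def fit_gmm_1d(x, means, tol=1e-3, max_iter=200):
    """EM for a 1-D Gaussian mixture; stops when the gain in mean
    log-likelihood from the previous iteration falls below tol."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    previous = -math.inf
    for iteration in range(1, max_iter + 1):
        # E-step: responsibility of each component for each sample.
        resp, loglik = [], 0.0
        for xi in x:
            dens = [
                weights[j]
                * math.exp(-0.5 * (xi - means[j]) ** 2 / variances[j])
                / math.sqrt(2.0 * math.pi * variances[j])
                for j in range(k)
            ]
            total = sum(dens)
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        mean_loglik = loglik / len(x)
        if mean_loglik - previous < tol:
            break  # gain is below the tolerance: stop iterating
        previous = mean_loglik
        # M-step: update weights, means and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(x)
            means[j] = sum(r[j] * xi for r, xi in zip(resp, x)) / nj
            variances[j] = max(
                sum(r[j] * (xi - means[j]) ** 2 for r, xi in zip(resp, x)) / nj,
                1e-6,  # floor to keep the variance strictly positive
            )
    return means, iteration

# Made-up backscatter values (dB) drawn around two substrate types.
x = [-30.5, -30.0, -29.5, -30.2, -29.8, -20.4, -20.0, -19.6, -20.1, -19.9]
means_loose, n_loose = fit_gmm_1d(x, [-28.0, -22.0], tol=1.0)
means_tight, n_tight = fit_gmm_1d(x, [-28.0, -22.0], tol=1e-6)
print(means_tight, n_loose, n_tight)
```

On the same data and starting point, a looser tolerance can never stop later than a tighter one, because any gain small enough to satisfy the tight tolerance also satisfies the loose one.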
Substrate Classification using a Fully connected Conditional Random Field (CRF)
Applies the CRF model to the data, generating grids of 1) substrate class and 2) posterior probability of substrate class. The Theta and Mu parameter values may be specified, as well as the number of iterations.
Theta: controls the degree of allowable similarity in backscatter between graph nodes in the CRF model. Relatively large values mean that backscatter features with relatively large differences in amplitude may be assigned the same substrate label.
Mu: specifies the distance between pairs of pixels beyond which they are considered far enough apart to have similar backscattering but different substrate labels.
Number of iterations is the number of times the model iterates to refine the estimate.
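Fully-connected CRFs of the kind wrapped by pydensecrf typically couple pairs of nodes through a Gaussian kernel that decays with both spatial separation and feature difference. The sketch below shows that general idea only. The names `spatial_scale` and `feature_scale` are hypothetical, and how PriSM's Theta and Mu map onto them is not stated in this document, so no such mapping is implied.

```python
import math

def pairwise_affinity(distance_m, backscatter_diff, spatial_scale, feature_scale):
    """Gaussian affinity between two grid nodes, between 0 and 1.

    High affinity encourages the two nodes to share a substrate label.
    It decays as the nodes get farther apart (relative to spatial_scale)
    or as their backscatter differs more (relative to feature_scale).
    """
    return math.exp(
        -(distance_m ** 2) / (2.0 * spatial_scale ** 2)
        - (backscatter_diff ** 2) / (2.0 * feature_scale ** 2)
    )

# Illustrative values only: distances in metres, backscatter differences in dB.
near_similar = pairwise_affinity(2.0, 1.0, spatial_scale=10.0, feature_scale=5.0)
far_similar = pairwise_affinity(50.0, 1.0, spatial_scale=10.0, feature_scale=5.0)
near_different = pairwise_affinity(2.0, 15.0, spatial_scale=10.0, feature_scale=5.0)
tolerant = pairwise_affinity(2.0, 15.0, spatial_scale=10.0, feature_scale=20.0)
print(near_similar, far_similar, near_different, tolerant)
```

Increasing the feature scale lets nodes with quite different backscatter still pull toward the same label, while a small spatial scale means distant nodes barely influence each other even when their backscatter is similar.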
Make presentation-ready plots
This module allows the user to generate .png format plots of GMM and/or CRF model outputs. The user can specify a color map for the substrates (which consists of assigning a unique color to each substrate present). Other available plots are each backscatter channel, the backscatter distribution per substrate, and a confusion matrix to evaluate model performance.
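For readers unfamiliar with the confusion matrix mentioned above, the following sketch shows how one can be tallied from observed and predicted substrate codes. The class names and the sample data are made up for illustration, and the helper names are hypothetical.

```python
def confusion_matrix(true_codes, predicted_codes, classes):
    """Count cells by (observed class, predicted class).

    Rows are observed classes and columns are predicted classes,
    both in the order given by `classes`.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_codes, predicted_codes):
        matrix[index[t]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

classes = ["sand", "gravel", "rock"]
true_codes = ["sand", "sand", "gravel", "rock", "rock", "gravel"]
predicted = ["sand", "gravel", "gravel", "rock", "sand", "gravel"]

matrix = confusion_matrix(true_codes, predicted, classes)
accuracy = overall_accuracy(matrix)
for name, row in zip(classes, matrix):
    print(name, row)
```

Off-diagonal counts show which substrates the model confuses with each other, which is often more informative than the single accuracy figure.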
Export data for use in GIS or other programs
Substrate classification maps may be exported in GeoTIFF format, which is readily imported into a GIS or other analysis package. Bed observations, which have been filtered to only those within the surveyed extents of the multibeam data, may also be exported in ESRI shapefile (.shp extension) or comma-separated text file (.csv or .txt extension) formats.
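As a rough illustration of the bed observation export, the sketch below filters observations to a rectangular extent and writes them as comma-separated text using only the standard library. The column names (`longitude`, `latitude`, `code`), the rectangular extent, and the coordinates are assumptions for the example, not the program's actual output schema.

```python
import csv
import io

def filter_to_extent(observations, extent):
    """Keep observations inside (xmin, ymin, xmax, ymax), bounds inclusive."""
    xmin, ymin, xmax, ymax = extent
    return [
        o for o in observations
        if xmin <= o["longitude"] <= xmax and ymin <= o["latitude"] <= ymax
    ]

def write_observations_csv(observations, stream):
    """Write observations to an open text stream as comma-separated text."""
    writer = csv.DictWriter(
        stream, fieldnames=["longitude", "latitude", "code"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(observations)

# Illustrative observations; the third lies outside the surveyed extent.
observations = [
    {"longitude": -123.45, "latitude": 48.65, "code": 1},
    {"longitude": -123.40, "latitude": 48.70, "code": 2},
    {"longitude": -120.00, "latitude": 48.66, "code": 3},
]
extent = (-123.50, 48.60, -123.35, 48.75)

kept = filter_to_extent(observations, extent)
buffer = io.StringIO()
write_observations_csv(kept, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory stream keeps the example self-contained. Passing a file opened with `open(path, "w", newline="")` instead would produce a .csv or .txt file on disk.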
Contributing & Credits
The software has been developed in the Python programming language by Dr. Daniel Buscombe, Northern Arizona University, Flagstaff, AZ 86011, firstname.lastname@example.org, using the CRF subfunctions written in C++ by Philipp Krähenbühl, accessed through the pydensecrf wrapper.
Two data sets are provided with the toolbox for users to experiment with. The data come from 1) Patricia Bay, British Columbia, Canada, and 2) lower Portsmouth Harbor, New Hampshire, USA. All example backscatter data (.tiff files) originate from data collected by R2Sonic and distributed for use as part of the R2Sonic 2017 Multispectral Backscatter competition. Bed observation data from Patricia Bay are digitized from data presented in:
- B. Biffard. Seabed remote sensing by single-beam echosounder: models, methods and applications. Doctoral dissertation, University of Victoria, Canada, 2011.
Bed observation data from Portsmouth (NEWBEX) are digitized from data presented in:
- T. Weber and L. Ward. Observations of backscatter from sand and gravel seafloors between 170 and 250 kHz. Journal of the Acoustical Society of America, vol. 138, no. 4, pp. 2169-2180, 2015.
Use of this toolbox is permitted under a GPL v.3 license - see full terms and conditions here.