Preface

Digital Soil Mineralogy relates to the data-driven analysis of soil X-ray powder diffraction (XRPD) data. Such data are considered to be precise digital signatures of a given soil’s mineralogy, within which all of the information required to identify and quantify the various mineral components within these complex mixtures is encoded.

In recent years, various methods for Digital Soil Mineralogy have been developed and published in the peer-reviewed literature. These methods include the use of supervised and unsupervised machine learning to predict and interpret soil properties from XRPD data, the application of novel multivariate statistical methods, and automated approaches for mineral quantification. Each chapter in this course will detail one such method, providing code and data for reproducible examples that readers can adapt for their own projects and research.

Whilst all data and methods presented herein relate to soil samples, the methods can be considered transferable to all aspects of environmental mineralogy and beyond!

Prerequisites

To run the examples provided throughout this document, it is recommended that you have R and RStudio installed on your machine. Once these are set up, the additional extensions (packages) required along the way can be installed and loaded. R and its extensions are designed to be multi-platform, so all material presented here should work on Windows, Mac, or Linux. The only package needed from the very start of the document is powdR; subsequent packages will be introduced in later chapters. To install powdR, use:

install.packages("powdR")
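If you re-run scripts often, you may prefer to install a package only when it is missing. The following is a minimal sketch of that idea rather than part of powdR itself: the helper name ensure_package and the choice of the cloud CRAN mirror as the default repository are assumptions made for illustration.

```r
# Hypothetical helper (not part of powdR): install a package only if it is
# not already available, then report whether it can be loaded.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Only reached when the package is missing; needs an internet connection
    install.packages(pkg, repos = repos)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usage (uncomment to run):
# ensure_package("powdR")
# library(powdR)
```

The helper returns TRUE invisibly when the package is available, so it can also be used as a simple check at the top of a script.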

Code Conventions

This document contains many chunks of R code that provide reproducible examples that can be copied and pasted to run on your own computer, for instance:

#Summarise a vector of integers 1 to 10
summary(1:10)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    1.00    3.25    5.50    5.50    7.75   10.00

The R session information when compiling this book is shown below:

sessionInfo()
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United Kingdom.utf8
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## loaded via a namespace (and not attached):
##  [1] bookdown_0.27    codetools_0.2-18 digest_0.6.28    R6_2.5.1
##  [5] jsonlite_1.7.2   magrittr_2.0.1   evaluate_0.15    highr_0.9
##  [9] stringi_1.7.6    rlang_0.4.11     rstudioapi_0.13  jquerylib_0.1.4
## [13] bslib_0.3.0      rmarkdown_2.14   tools_4.2.1      stringr_1.4.0
## [17] xfun_0.31        yaml_2.2.1       fastmap_1.1.0    compiler_4.2.1
## [21] htmltools_0.5.2  knitr_1.39       sass_0.4.0
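To compare your own setup against the session information above, a short check along the following lines may help. It is only a sketch: the variable names are illustrative, and the version string is copied from the first line of the sessionInfo() output.

```r
# R version reported in the session information above
book_r_version <- "4.2.1"

# getRversion() returns the running R version as a comparable object
current <- as.character(getRversion())
is_recent <- getRversion() >= book_r_version

if (is_recent) {
  message("R ", current, " is at least as recent as R ", book_r_version, ".")
} else {
  message("R ", current, " is older than R ", book_r_version,
          "; some examples may behave differently.")
}
```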

Text outputs associated with R code are denoted by two hashes (##) by default, as you can see from the example above. This is for your convenience when you want to copy and run the code: because the text output is commented out, R ignores it. Package names are in bold text (e.g., powdR), and inline code and filenames are formatted in a typewriter font (e.g., summary(1:10)). Function names can easily be identified by the parentheses that follow them (e.g., mean(1:10)).
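As a small illustration of why the hashed output is harmless when copied, the sketch below parses the summary() example exactly as it appears in this section, output lines included. The object names copied and exprs are arbitrary choices for this example.

```r
# Lines as they would be copied from the summary() example, output included
copied <- c(
  "#Summarise a vector of integers 1 to 10",
  "summary(1:10)",
  "##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.",
  "##    1.00    3.25    5.50    5.50    7.75   10.00"
)

# parse() treats every line starting with a hash as a comment,
# so only the summary() call remains as an expression
exprs <- parse(text = copied)
length(exprs)

# Evaluating the remaining expression reproduces the summary
eval(exprs[[1]])
```

Here length(exprs) is 1, which confirms that the comment and the two output lines were skipped.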

What to expect

This document is divided into chapters that each detail specific aspects of Digital Soil Mineralogy. It begins with the basics of handling XRPD data in R and progresses to more advanced manipulation of such data that cannot be achieved with proprietary XRPD software. Subsequently, specific examples of methods for Digital Soil Mineralogy are provided, including high-throughput quantitative analysis, data mining, and cluster analysis. As such, the documentation is separated into the following chapters:

• Chapter 2: Quantitative analysis of XRPD data using full pattern summation
• Chapter 3: The use of machine learning to predict and interpret soil properties from XRPD data
• Chapter 4: The application of cluster analysis to identify discrete groups of soils based on mineralogy
• Chapter 5: Identifying soil analogues for Martian mineralogy based on XRPD data

Each chapter contains reproducible R code along with written explanations. For those who prefer video tutorials, there are a number of embedded YouTube videos throughout the course material that describe and explain the R code. With the exception of Chapter 1, all chapters are standalone, so there is no need to read everything!