A multifunctional matching algorithm for sample design in agricultural plots

N. Ohana-Levi; A. Derumigny; A. Peeters; A. Ben-Gal; I. Bahat; L. Katz; Y. Netzer; A. Naor; Y. Cohen

doi:10.1016/j.compag.2021.106262

Corresponding author: N. Ohana-Levi

Statistics

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Collection of accurate and representative data from agricultural fields is required for efficient crop management. Since growers have limited available resources, there is a need for advanced methods to select representative points within a field in order to best satisfy sampling or sensing objectives. The main purpose of this work was to develop a data-driven method for selecting locations across an agricultural field given observations of some covariates at every point in the field. These chosen locations should be representative of the distribution of the covariates in the entire population and represent the spatial variability in the field. They can then be used to sample an unknown target feature whose sampling is expensive and cannot be realistically done at the population scale. An algorithm for determining these optimal sampling locations, namely the multifunctional matching (MFM) criterion, was based on matching of moments (functionals) between sample and population. The selected functionals in this study were standard deviation, mean, and Kendall's tau. An additional algorithm defined the minimal number of observations that could represent the population according to a desired level of accuracy. The MFM was applied to datasets from two agricultural plots: a vineyard and a peach orchard. The data from the plots included measured values of slope, topographic wetness index, normalized difference vegetation index, and apparent soil electrical conductivity. The MFM algorithm selected the number of sampling points according to a representation accuracy of 90% and determined the optimal location of these points. The algorithm was validated against values of vine or tree water status measured as crop water stress index (CWSI). Algorithm performance was then compared to two other sampling methods: the conditioned Latin hypercube sampling (cLHS) model and a uniform random sample with spatial constraints. 
Comparison among sampling methods was based on measures of similarity between the target variable population distribution and the distribution of the selected sample. MFM represented CWSI distribution better than the cLHS and the uniform random sampling, and the selected locations showed smaller deviations from the mean and standard deviation of the entire population. The MFM functioned better in the vineyard, where spatial variability was larger than in the orchard. In both plots, the spatial pattern of the selected samples captured the spatial variability of CWSI. MFM can be adjusted and applied using other moments/functionals and may be adopted by other disciplines, particularly in cases where small sample sizes are desired.
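
The matching idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact loss, weighting, or optimiser, so the sum-of-absolute-differences loss and the plain random search below are assumptions, as are all function names (`kendall_tau`, `functionals`, `mismatch`, `select_sample`). The sketch also assumes covariates are already on comparable scales and ignores the spatial constraints and the sample-size selection step mentioned above.

```python
import random
import statistics
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def functionals(rows):
    """Mean and standard deviation per covariate, tau per covariate pair."""
    cols = list(zip(*rows))
    values = [statistics.fmean(c) for c in cols]
    values += [statistics.pstdev(c) for c in cols]
    values += [kendall_tau(a, b) for a, b in combinations(cols, 2)]
    return values


def mismatch(sample_rows, target):
    """Assumed loss: sum of absolute differences between functionals."""
    return sum(abs(s - t) for s, t in zip(functionals(sample_rows), target))


def select_sample(population, n, trials=2000, seed=0):
    """Assumed optimiser: random search keeping the best-matching subset."""
    rng = random.Random(seed)
    target = functionals(population)
    best_idx, best_loss = None, float("inf")
    for _ in range(trials):
        idx = rng.sample(range(len(population)), n)
        loss = mismatch([population[i] for i in idx], target)
        if loss < best_loss:
            best_idx, best_loss = sorted(idx), loss
    return best_idx, best_loss
```

Here `population` would be a list of covariate tuples, one per field location (for example slope and NDVI), and `select_sample(population, n=10)` returns the indices of the chosen locations together with their mismatch score. A smaller score means the sample's mean, standard deviation, and Kendall's tau lie closer to those of the whole field.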

Original language	English
Article number	106262
Number of pages	14
Journal	Computers and Electronics in Agriculture
Volume	187
DOIs	https://doi.org/10.1016/j.compag.2021.106262
Publication status	Published - 2021

Keywords

Agricultural sampling
Partially-observed data
Representative sampling given covariates
Spatial autocorrelation
Two-phase study

Access to Document

10.1016/j.compag.2021.106262 (Licence: CC BY-NC-ND)

1-s2.0-S0168169921002799-main: Final published version, 5.54 MB (Licence: CC BY-NC-ND)

Cite this

@article{4c2f680041064605834b5bb2d33dec29,

title = "A multifunctional matching algorithm for sample design in agricultural plots",

abstract = "Collection of accurate and representative data from agricultural fields is required for efficient crop management. Since growers have limited available resources, there is a need for advanced methods to select representative points within a field in order to best satisfy sampling or sensing objectives. The main purpose of this work was to develop a data-driven method for selecting locations across an agricultural field given observations of some covariates at every point in the field. These chosen locations should be representative of the distribution of the covariates in the entire population and represent the spatial variability in the field. They can then be used to sample an unknown target feature whose sampling is expensive and cannot be realistically done at the population scale. An algorithm for determining these optimal sampling locations, namely the multifunctional matching (MFM) criterion, was based on matching of moments (functionals) between sample and population. The selected functionals in this study were standard deviation, mean, and Kendall's tau. An additional algorithm defined the minimal number of observations that could represent the population according to a desired level of accuracy. The MFM was applied to datasets from two agricultural plots: a vineyard and a peach orchard. The data from the plots included measured values of slope, topographic wetness index, normalized difference vegetation index, and apparent soil electrical conductivity. The MFM algorithm selected the number of sampling points according to a representation accuracy of 90% and determined the optimal location of these points. The algorithm was validated against values of vine or tree water status measured as crop water stress index (CWSI). Algorithm performance was then compared to two other sampling methods: the conditioned Latin hypercube sampling (cLHS) model and a uniform random sample with spatial constraints. 
Comparison among sampling methods was based on measures of similarity between the target variable population distribution and the distribution of the selected sample. MFM represented CWSI distribution better than the cLHS and the uniform random sampling, and the selected locations showed smaller deviations from the mean and standard deviation of the entire population. The MFM functioned better in the vineyard, where spatial variability was larger than in the orchard. In both plots, the spatial pattern of the selected samples captured the spatial variability of CWSI. MFM can be adjusted and applied using other moments/functionals and may be adopted by other disciplines, particularly in cases where small sample sizes are desired.",

keywords = "Agricultural sampling, Partially-observed data, Representative sampling given covariates, Spatial autocorrelation, Two-phase study",

author = "N. Ohana-Levi and A. Derumigny and A. Peeters and A. Ben-Gal and I. Bahat and L. Katz and Y. Netzer and A. Naor and Y. Cohen",

year = "2021",

doi = "10.1016/j.compag.2021.106262",

language = "English",

volume = "187",

journal = "Computers and Electronics in Agriculture",

issn = "0168-1699",

publisher = "Elsevier",

}

TY - JOUR

T1 - A multifunctional matching algorithm for sample design in agricultural plots

AU - Ohana-Levi, N.

AU - Derumigny, A.

AU - Peeters, A.

AU - Ben-Gal, A.

AU - Bahat, I.

AU - Katz, L.

AU - Netzer, Y.

AU - Naor, A.

AU - Cohen, Y.

PY - 2021

Y1 - 2021

N2 - Collection of accurate and representative data from agricultural fields is required for efficient crop management. Since growers have limited available resources, there is a need for advanced methods to select representative points within a field in order to best satisfy sampling or sensing objectives. The main purpose of this work was to develop a data-driven method for selecting locations across an agricultural field given observations of some covariates at every point in the field. These chosen locations should be representative of the distribution of the covariates in the entire population and represent the spatial variability in the field. They can then be used to sample an unknown target feature whose sampling is expensive and cannot be realistically done at the population scale. An algorithm for determining these optimal sampling locations, namely the multifunctional matching (MFM) criterion, was based on matching of moments (functionals) between sample and population. The selected functionals in this study were standard deviation, mean, and Kendall's tau. An additional algorithm defined the minimal number of observations that could represent the population according to a desired level of accuracy. The MFM was applied to datasets from two agricultural plots: a vineyard and a peach orchard. The data from the plots included measured values of slope, topographic wetness index, normalized difference vegetation index, and apparent soil electrical conductivity. The MFM algorithm selected the number of sampling points according to a representation accuracy of 90% and determined the optimal location of these points. The algorithm was validated against values of vine or tree water status measured as crop water stress index (CWSI). Algorithm performance was then compared to two other sampling methods: the conditioned Latin hypercube sampling (cLHS) model and a uniform random sample with spatial constraints. 
Comparison among sampling methods was based on measures of similarity between the target variable population distribution and the distribution of the selected sample. MFM represented CWSI distribution better than the cLHS and the uniform random sampling, and the selected locations showed smaller deviations from the mean and standard deviation of the entire population. The MFM functioned better in the vineyard, where spatial variability was larger than in the orchard. In both plots, the spatial pattern of the selected samples captured the spatial variability of CWSI. MFM can be adjusted and applied using other moments/functionals and may be adopted by other disciplines, particularly in cases where small sample sizes are desired.

KW - Agricultural sampling

KW - Partially-observed data

KW - Representative sampling given covariates

KW - Spatial autocorrelation

KW - Two-phase study

UR - http://www.scopus.com/inward/record.url?scp=85110179725&partnerID=8YFLogxK

U2 - 10.1016/j.compag.2021.106262

DO - 10.1016/j.compag.2021.106262

M3 - Article

AN - SCOPUS:85110179725

SN - 0168-1699

VL - 187

JO - Computers and Electronics in Agriculture

JF - Computers and Electronics in Agriculture

M1 - 106262

ER -
