RT info:eu-repo/semantics/article
T1 Potato yield prediction using machine learning techniques and Sentinel 2 data
A1 Gómez Aragón, Diego
A1 Salvador González, Pablo
A1 Sanz Justo, María Julia
A1 Casanova Roque, José Luis
K1 Machine learning
K1 Aprendizaje automático
K1 Potato yield
K1 Patata - Cultivo
K1 Precision agriculture
K1 Agricultura de precisión
K1 Satellite remote sensing
K1 Teledetección satelital
AB Traditional potato growth models have certain limitations, such as the cost of obtaining the input data required to run the models, the lack of spatial information in some instances, or the actual quality of the input data. To address these issues, we develop a model to predict potato yield using satellite remote sensing. In an effort to offer a good predictive model that improves the state of the art in potato precision agriculture, we use images from the twin Sentinel 2 satellites (European Space Agency, Copernicus Programme) over three growing seasons, applying different machine learning models. First, we fitted nine machine learning algorithms with various pre-processing scenarios using variables from July, August and September based on the red, red-edge and infrared bands of the spectrum. Second, we selected the best performing models and evaluated them against independent test data. Finally, we repeated the previous two steps using only variables corresponding to July and August. Our results showed that the feature selection step proved vital during data pre-processing in order to reduce multicollinearity among predictors. The Regression Quantile Lasso model (11.67% Root Mean Square Error, RMSE; R2 = 0.88 and 9.18% Mean Absolute Error, MAE) and the Leap Backwards model (10.94% RMSE, R2 = 0.89 and 8.95% MAE) performed better when predictors with a correlation coefficient > 0.5 were removed from the dataset. In contrast, the Support Vector Machine Radial (svmRadial) model performed better with no feature selection method (11.7% RMSE, R2 = 0.93 and 8.64% MAE). In addition, we used a random forest model to predict potato yields in Castilla y León (Spain) 1–2 months prior to harvest, and obtained satisfactory results (11.16% RMSE, R2 = 0.89 and 8.71% MAE). These results demonstrate the suitability of our models to predict potato yields in the region studied.
PB MDPI
SN 2072-4292
YR 2019
FD 2019
LK https://uvadoc.uva.es/handle/10324/55863
UL https://uvadoc.uva.es/handle/10324/55863
LA eng
NO Remote Sensing, 2019, vol. 11, n. 15, 1745
NO Producción Científica
DS UVaDOC
RD 15-Jan-2025