Title: Performance of soybean (Glycine max) cultivars in Virginia field sites: identifying shoot phenotypes through manual and automated image analysis methods to predict yield
This study assesses soybean (Glycine max) shoot architectural traits that influence seed yield in determinate and indeterminate cultivars. Nine soybean cultivars were planted at two field sites in Virginia, representing clay-loam and sandy-loam soil profiles, in spring 2017 using a randomized block design. Plants were harvested and processed near the end of seed filling. De-leaved R7-stage shoots were photographed with a Nikon D-90 digital SLR camera fitted with a 28-85 mm zoom lens, at an aperture of f/8 and a shutter speed of 1/60 s, with flash enabled. The camera was mounted on a Benro FTA28CV1 carbon fiber tripod with a B03 head and a boom so that images could be acquired from directly above the plants against a white background. Seed yield was quantified as the number and weight of pods and seeds on the main stem and branches. Shoots were manually classified into one of four phenotype categories based on the density of pod placement on the main stem and branches. A Random Forest model in MATLAB was used to generate permutation feature importance and partial dependence plots for assessing the effects of phenotype and field site on seed yield. The data showed that cultivar had a larger effect on shoot phenotype expression and yield than did soil differences between sites. The cultivar with the highest number of seeds per plant produced mainly two phenotypes: high pod density along both the main stem and branches, or high pod density on the main stem with lower density on the branches. The cultivar with the highest average seed weight produced either high pod density on the main stem or high pod density in the lower region of the main stem (but lower density on the branches). To accelerate phenotyping, we also developed automated image-processing methods to segment the plants in the images. Machine learning (ML) was used to identify the pods on the whole plants.
The ML approach shows promise as the basis of an automated image analysis protocol with the potential to save labor in yield quantification and prediction. However, more work is needed to develop the training sets required to determine pod parameters more accurately and to ensure that manual and automated yield predictions are comparable. Interim results from the algorithms, which were developed using WEKA machine learning within the ImageJ/FIJI software platform, will be shown.
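As an illustration of the permutation feature importance step mentioned in the abstract, the following is a minimal Python sketch and not the study's MATLAB Random Forest workflow. It shuffles one predictor at a time and records how much the prediction error increases. The data are synthetic, and the `predict` function is a hypothetical stand-in for a fitted model. The feature coding (phenotype category 0 to 3, site 0 or 1) and the effect sizes are assumptions chosen only to make the example run.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other columns untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic, illustrative data only (not the study's measurements).
# Column 0: phenotype category (0-3); column 1: field site (0 or 1).
data_rng = random.Random(1)
X = [[data_rng.randrange(4), data_rng.randrange(2)] for _ in range(200)]
# Hypothetical yield in which phenotype matters far more than site.
y = [10.0 * row[0] + 1.0 * row[1] for row in X]


def predict(row):
    """Hypothetical stand-in for a fitted model such as a random forest."""
    return 10.0 * row[0] + 1.0 * row[1]


importances = permutation_importance(predict, X, y)
for name, value in zip(["phenotype", "site"], importances):
    print(f"{name}: mean increase in MSE = {value:.2f}")
```

In this toy setup the phenotype column receives the larger importance because the synthetic yield depends on it more strongly. With real data, the same procedure would be applied to a model fitted on the measured traits.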