Expanding the Breeding Toolbox to Develop Soybean Cultivars
The soybean breeding program at Iowa State University, through an interdisciplinary effort, is developing strategies and building tools to improve the breeding pipeline. This presentation will explore a few emerging technologies and data analytics developed or utilized in our program to study plant traits, and describe their application in phenotyping and cultivar development.
[00:00:00.810]The following presentation is part
[00:00:02.730]of the agronomy and horticulture seminar series
[00:00:05.840]at the University of Nebraska-Lincoln.
[00:00:08.320]Good afternoon everyone, welcome
[00:00:10.400]to the agronomy and horticulture seminar.
[00:00:12.830]My name is Vikas Belamkar
[00:00:14.660]and I'm a research assistant professor here
[00:00:16.410]in the department.
[00:00:18.710]If you're attending the seminar for the first time
[00:00:20.810]we generally have a 40 to 50 minute seminar
[00:00:23.890]and it will be followed by the Q and A session.
[00:00:26.700]It's my pleasure to introduce our speaker today,
[00:00:28.920]Dr. Asheesh K Singh, who goes by Dr. Danny Singh
[00:00:33.380]from the department of agronomy
[00:00:34.940]at Iowa State University in Ames, Iowa.
[00:00:38.150]Danny is a soybean breeder.
[00:00:39.870]His professional interest is to help improve
[00:00:42.350]agricultural production and use research
[00:00:45.040]and breeding activities for the benefit of farmers
[00:00:47.940]and the agriculture industry.
[00:00:50.253]He is committed to developing superior soybean cultivars
[00:00:53.080]and germplasm for farmers and other stakeholders.
[00:00:56.930]A little bit about his education.
[00:00:59.000]Danny completed his MSc degree in plant breeding
[00:01:02.090]from University of Saskatchewan in 2001
[00:01:06.030]and obtained his PhD in plant breeding in 2007
[00:01:09.620]from the University of Guelph in Canada.
[00:01:13.120]Danny has had an excellent career so far,
[00:01:16.680]he began his professional career working
[00:01:19.450]as a durum wheat breeder
[00:01:21.210]at Agriculture and Agri-Food Canada in 2007.
[00:01:24.740]He joined the department of agronomy
[00:01:26.420]at Iowa State University as an assistant professor in 2013
[00:01:30.400]and was promoted to associate professor in 2017
[00:01:32.967]and a full professor in 2020.
[00:01:35.450]He's also the director of graduate education
[00:01:39.030]for plant breeding at Iowa State University.
[00:01:42.710]Over the years he has developed three soybean varieties
[00:01:46.120]more than 40 wheat varieties,
[00:01:48.290]and more than 14 germplasm releases.
[00:01:50.920]And these are being grown in more
[00:01:52.460]than 10 million acres each year.
[00:01:56.390]He has published over 124 peer-reviewed articles
[00:01:59.430]and these have been cited more than 3,000 times.
[00:02:02.480]He's presented over 60 invited talks at national
[00:02:05.710]and international meetings and more
[00:02:07.810]than a 100 media engagements with farm magazines,
[00:02:10.460]farm radio stations, and extension talks.
[00:02:13.160]He's also won many, many awards, and I don't have the time
[00:02:16.140]to go through all of them.
[00:02:17.150]But I mentioned one, he was awarded the Monsanto Chair
[00:02:20.710]in soybean breeding, I think it was in 2013
[00:02:25.530]and he's a PI/co-PI on 59 funded grants.
[00:02:31.460]Two things that I personally admire about Danny
[00:02:35.010]are he's a breeder who has practically used
[00:02:38.390]some of these emerging technologies,
[00:02:40.250]especially like the digital phenotyping tools
[00:02:43.170]to improve breeding pipelines
[00:02:45.700]and many in the community would agree
[00:02:48.610]that he's an excellent mentor to students
[00:02:50.920]and also to scientists.
[00:02:53.070]So we are really excited to have Danny present today
[00:02:55.970]at our seminar series.
[00:02:58.010]Danny, welcome to UNL.
[00:03:00.924]Thank you so much for such a kind introduction Vikas
[00:03:03.870]and for the invitation to present today.
[00:03:06.090]It's my pleasure to be here and in today's presentation
[00:03:09.770]I'll go over some of our current work that aims
[00:03:12.120]to expand the breeding toolbox for variety development.
[00:03:15.520]And that was the request, because the aim
[00:03:17.580]of this presentation is to give diverse examples
[00:03:20.480]of interdisciplinary projects that we have going
[00:03:22.600]on in our program here.
[00:03:24.970]And I wanna make sure that I let you know
[00:03:27.510]that I present on behalf of my students post-docs
[00:03:30.710]and staff that get the job done.
[00:03:32.860]And I also acknowledge my Soynomics team
[00:03:36.470]Dr. Baskar Ganapathysubramanian, Dr. Arti Singh
[00:03:39.550]and Dr. Soumik Sarkar
[00:03:41.230]with whom I collaborate closely on most
[00:03:42.920]of the interdisciplinary work that I'll present today.
[00:03:45.810]So several years ago, at an early stage of our group's start,
[00:03:50.650]all staff, postdocs, and students,
[00:03:53.210]we came together as a team
[00:03:54.480]and we developed a mission statement and our research goals
[00:03:58.740]I would say it was a very enlightening exercise for us
[00:04:01.090]about self-awareness and self discovery for our young team
[00:04:04.510]at that time. Essentially, we see ourselves
[00:04:06.670]as an interdisciplinary group that is focused
[00:04:09.320]on advancing science and empowering farmers through our work
[00:04:12.670]this mission motivated our research goals
[00:04:14.580]that look at the development of new products, discoveries
[00:04:17.700]and insights on traits and outcomes.
[00:04:20.040]And here I've included a word cloud that was made
[00:04:23.410]by soliciting feedback from our team members
[00:04:26.660]during that exercise on things we work
[00:04:29.040]on and are considered important to us.
[00:04:30.900]And as you can naturally see
[00:04:33.950]some of our interests, they motivate us to establish
[00:04:36.280]the interdisciplinary collaborations
[00:04:37.710]which you'll see through the presentation.
[00:04:43.340]So as most of you in the audience probably know,
[00:04:46.820]plant breeding involves intricate processes
[00:04:49.120]and combines learnings from several disciplines
[00:04:52.190]to develop varieties that have high yield, desirable
[00:04:55.240]quality characteristics, resistance to biotic
[00:04:57.330]and abiotic stresses, and efficient resource utilization.
[00:05:01.410]Therefore, once we developed our mission statement
[00:05:03.840]and our research goals
[00:05:05.680]the next step we took was to plan
[00:05:07.930]and prioritize our research projects.
[00:05:10.570]We want to strive to be breeders
[00:05:12.670]and scientists who are aware and in sync with farmer needs.
[00:05:16.060]And as a plant breeding group, our most immediate outputs
[00:05:20.140]are next-generation cultivars
[00:05:21.470]that are less resource intensive, have some value added
[00:05:24.500]and meet the present and future needs
[00:05:26.610]of farmers and projected climate challenges.
[00:05:30.120]This requires us to integrate economics, environment,
[00:05:32.970]and genetics, which creates this phenomic cube
[00:05:36.703]that is an important component for us.
[00:05:39.720]However, having the phenomic cube implies
[00:05:42.080]high-throughput phenotyping on ground
[00:05:44.290]and aerial platforms that generates large and complex data.
[00:05:47.610]Therefore we require advanced and nonlinear models
[00:05:51.990]and analyses to extract meaningful and timely information.
[00:05:55.350]So keeping that in mind
[00:05:56.940]we set up our breeding and research pipeline
[00:05:59.760]which you see here on the bottom right side of the slide,
[00:06:04.390]and this is inherently interdisciplinary
[00:06:06.360]and somewhat unconventional.
[00:06:08.280]We rely a lot on digital tools
[00:06:10.040]and machine learning methods to meet our research goals,
[00:06:12.370]including in cultivar development; it also focuses
[00:06:15.280]on the study of above- and below-ground traits.
[00:06:18.960]For any work we do, we need to justify
[00:06:21.690]how it can directly or indirectly help
[00:06:23.547]the cultivar development pipeline.
[00:06:25.540]So our students are getting engaged in both sides
[00:06:31.272]of this figure that you see,
[00:06:33.650]and we need to be able to increase the response to selection
[00:06:36.220]through better utilization of genetic diversity,
[00:06:38.460]improved accuracy of selection and faster selection time.
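[Editor's note] The three levers just mentioned map directly onto the terms of the breeder's equation; a standard textbook form (not stated in the talk) is:

```latex
R = \frac{i \, r \, \sigma_A}{L}
```

where \(R\) is the response to selection per unit time, \(i\) the selection intensity, \(r\) the accuracy of selection, \(\sigma_A\) the additive genetic standard deviation (reflecting the genetic diversity available), and \(L\) the cycle length. Better utilization of diversity raises \(\sigma_A\), improved selection accuracy raises \(r\), and faster selection shrinks \(L\).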
[00:06:45.750]We are interested in speeding up of breeding
[00:06:48.040]for resilient crops that can withstand the challenges
[00:06:50.900]of an increasingly variable environment.
[00:06:52.977]In this figure, we notice that there are several regions
[00:06:55.360]of the US, shown here in the red shades,
[00:06:58.930]that are projected to experience almost two-thirds of days
[00:07:02.900]in a year with drought
[00:07:04.224]and high temperature stress in the next several decades.
[00:07:08.200]Therefore, we are very interested
[00:07:09.860]in our research to develop tools that enable selection
[00:07:12.580]for complex traits, such as drought and heat tolerance,
[00:07:15.320]and the ability to select
[00:07:16.530]for future climate scenarios that are decades away.
[00:07:19.920]Therefore, we continue to explore artificial
[00:07:21.690]intelligence based approaches in our breeding pipeline.
[00:07:28.160]These are examples of cyber enabled breeding
[00:07:30.820]and they give a glimpse of where field breeding
[00:07:32.780]is heading in my opinion.
[00:07:34.640]In the left side video,
[00:07:35.960]you see a breeder with several methods of phenotyping
[00:07:38.500]including a pushcart right here, autonomous
[00:07:43.030]ground robots, aerial drones, and just collecting data
[00:07:47.820]that was previously inconceivable or difficult,
[00:07:50.550]in the right side video you see a ground robot
[00:07:52.990]collecting images under the canopy.
[00:07:55.170]And yes, these are soybean plants, not an orchard species
[00:07:58.490]but with the combination
[00:07:59.380]of ground and aerial based phenotyping system
[00:08:02.550]we can obtain a lot of useful information
[00:08:04.950]at a scale that was previously considered impossible
[00:08:07.890]and also on organs that were previously
[00:08:10.270]very difficult to phenotype.
[00:08:12.200]So we work relentlessly to keep streamlining
[00:08:14.320]our breeding pipelines.
[00:08:16.400]We look at maximizing returns
[00:08:18.020]and minimizing costs and errors
[00:08:19.460]while we continually update and modernize our tools.
[00:08:26.130]In our breeding pipeline, we have five main stages
[00:08:30.000]and four main decision-making steps.
[00:08:32.770]As you can see the main stages are hybridization,
[00:08:36.380]generation advancement and selection,
[00:08:38.720]progeny row testing, preliminary
[00:08:40.260]and advanced yield testing and commercialization.
[00:08:43.250]In each of these steps, we decided
[00:08:45.290]to include four main decision-making opportunities.
[00:08:48.470]Together, the stages and decision-making choices
[00:08:51.060]help guide our research project
[00:08:52.970]identification and methods.
[00:08:55.170]Since we have a large research footprint in area
[00:08:57.730]under experiments, and time series data on multiple traits,
[00:09:01.940]high-throughput phenotyping, genomics, robotics, automation,
[00:09:07.400]and ML are integral components of the work we do.
[00:09:13.300]In today's presentation,
[00:09:14.400]I'll give several examples on the use
[00:09:16.340]of image based phenotyping and machine learning methods
[00:09:19.360]which are helping improve
[00:09:20.410]the plant breeding pipeline for us.
[00:09:22.460]These will include tools for ground robots
[00:09:24.730]and drone based phenotyping, various sensors,
[00:09:27.310]stress phenotyping, and the study of root traits.
[00:09:30.600]More specifically, in today's presentation
[00:09:32.930]I'll give examples of four main areas of our work.
[00:09:36.530]These are yield related projects,
[00:09:39.470]tools or methods development, stress related,
[00:09:42.710]and then root-related projects.
[00:09:45.190]The output from these projects
[00:09:46.730]they directly or indirectly relate
[00:09:48.750]to one of the four decision-making opportunities
[00:09:51.810]which are choice of parent, choice of selection method,
[00:09:54.110]choice of data analysis, and choice of placement.
[00:09:57.170]In today's presentation I'll
[00:09:58.460]primarily present projects that relate
[00:10:00.210]to the first three types of choices.
[00:10:07.050]Before we get into the four main types of projects
[00:10:10.000]and examples of ongoing things, these are examples
[00:10:13.360]of projects from our team in the last couple of years
[00:10:15.890]you probably are not able to see these
[00:10:18.230]because of the font size showing up on your screen.
[00:10:20.850]But red outline is for projects and tool method
[00:10:23.240]development related to automated drone based phenotyping.
[00:10:27.020]The green outline boxes are for projects that focus
[00:10:29.490]on stress related topics, the purple outline
[00:10:32.470]are projects that relate to yield and root-related topics.
[00:10:35.620]And if anyone has an interest in exploring some
[00:10:37.950]of our prior published work in the last couple of years
[00:10:41.050]on ground robots, stress phenotyping,
[00:10:43.430]and yield and root related work,
[00:10:45.420]please access them through my Google scholar page.
[00:10:49.910]Today, as I said earlier,
[00:10:52.030]I'll present our unpublished work.
[00:10:53.990]Although several of the projects I'll present today
[00:10:56.550]they are accessible as arXiv papers.
[00:10:58.870]And again, can be found on my Google scholar page.
[00:11:01.750]And if you're interested in reading more about
[00:11:03.090]phenomic assisted breeding, especially students
[00:11:05.730]if you have an interest, please check chapter 28,
[00:11:09.400]which is on phenomics and machine learning
[00:11:11.280]in crop improvement,
[00:11:12.390]in my recently published co-authored book
[00:11:15.560]Plant Breeding and Cultivar Development.
[00:11:19.920]In the first of the four main topics
[00:11:22.240]I'll present two projects that relate to yield.
[00:11:27.020]This first example is our work on soybean pod counting.
[00:11:31.910]It is led by Louis Rivera, a PhD student
[00:11:34.760]of Dr. Sarkar, and Matt Carroll,
[00:11:36.610]who's a PhD student in my group.
[00:11:39.350]We were motivated by our aim to reduce the phenotyping effort
[00:11:43.570]associated with harvesting each plot.
[00:11:46.230]Ultimately combine yield is the best data
[00:11:48.120]for a row crop breeder.
[00:11:49.060]However, we also know that yield component
[00:11:52.730]traits are effective predictors of seed yield,
[00:11:55.640]particularly pod counts;
[00:11:56.840]they correlate very well with seed yield.
[00:11:59.760]So our thought was, if we can estimate pod count in plots
[00:12:02.510]we can reduce the need to harvest plots
[00:12:05.020]which is an expensive and time-consuming task.
[00:12:07.610]So I'm not saying that we will not harvest any plot.
[00:12:09.760]I'm just simply saying that
[00:12:11.750]if we can accurately estimate yield
[00:12:13.630]can we forgo harvesting at all locations?
[00:12:17.940]As a first step, we started working on ground robots
[00:12:20.530]for phenotyping through an NSF USDA funding on robotics.
[00:12:24.120]We wanted to explore the ground robots, their utility
[00:12:27.130]along with ML methods to estimate pods in standing plots.
[00:12:32.030]In this video, you can see the robot traveling
[00:12:34.920]through the rows, getting videos of diverse plants.
[00:12:38.580]We used several hundred different accessions
[00:12:41.120]from the USDA germplasm bank to build our
[00:12:45.030]ML model to estimate pod counts.
[00:12:47.660]We develop detailed protocols
[00:12:49.410]for the use of ground-based robots, that record videos
[00:12:52.280]while traversing these breeding plots.
[00:12:55.370]To deploy the yield estimation framework
[00:12:57.890]on board this robotic platform, we need to detect
[00:13:02.140]and keep track of individual plots in real time.
[00:13:05.460]Therefore our two-step framework,
[00:13:07.250]which is built on the RetinaNet architecture
[00:13:09.320]with a VGG-19 backbone,
[00:13:11.680]consisted of a feature extraction module followed
[00:13:14.100]by a regression module.
[00:13:15.820]So we built a plot detection tracking framework
[00:13:19.720]that can provide the necessary video frames
[00:13:21.770]for all plants in a specific plot that feeds
[00:13:24.950]into the yield estimation module.
[00:13:27.070]We then developed a deep learning based multi-view image
[00:13:29.870]fusion framework for pod detection
[00:13:31.860]and localization that can identify pods
[00:13:34.210]from multiple views,
[00:13:35.860]and then created a formula to predict not just the pods
[00:13:39.230]that the robot could see, but also account
[00:13:41.470]for pods that were hidden due to occlusion.
[00:13:43.930]So that's very exciting from a breeding perspective
[00:13:46.580]and initial results are promising
[00:13:48.350]and they can generate reasonable ranks for selection
[00:13:52.180]and this paper is currently under review
[00:13:53.900]but it's also available as an arXiv paper.
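[Editor's note] The multi-view counting idea can be sketched in a few lines; note that the max-fusion rule and the fixed occlusion multiplier below are illustrative assumptions for this sketch, not the published formula, which learns its occlusion correction from data.

```python
def fuse_pod_counts(per_view_counts, occlusion_factor=1.15):
    """Illustrative multi-view pod-count fusion for one plot.

    per_view_counts: pod detections counted in each camera view
    occlusion_factor: assumed multiplier (> 1) compensating for pods
        hidden behind leaves and stems; the actual model learns this
        correction rather than using a fixed constant.
    """
    if not per_view_counts:
        raise ValueError("need at least one view")
    visible = max(per_view_counts)  # best single view is a lower bound
    return visible * occlusion_factor

# Toy usage: three camera views of the same plot
estimate = fuse_pod_counts([112, 98, 105])
```

Taking the maximum over views is the simplest fusion choice; a learned regression over all views, as described in the talk, would replace both the `max` and the constant factor.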
[00:13:58.710]It is also important to mention
[00:14:00.160]that these ground robots can collect data
[00:14:02.750]even in somewhat wet field conditions
[00:14:04.990]before combines can go into the field.
[00:14:07.490]And our future work will focus on increasing data set sizes
[00:14:10.330]as well as improving the automation
[00:14:11.830]of plot detection, data collection, as well
[00:14:14.770]as integration with aerial based phenotyping systems.
[00:14:19.420]The second project on yield-related work
[00:14:21.750]highlights the effort of Tryambak,
[00:14:25.160]who is a PhD student with Dr. Sarkar,
[00:14:27.050]and John, who was a PhD student with me
[00:14:29.600]and now works at Corteva.
[00:14:31.810]We recently presented this work at a workshop and it was one
[00:14:34.870]of the first works on attention based model
[00:14:37.230]for spatio-temporal interpretability in yield prediction.
[00:14:45.610]We used an LSTM model with dual attention mechanism
[00:14:48.750]to accurately predict annual crop yield
[00:14:51.830]using weather variables.
[00:14:53.610]It can learn the most significant weather variables
[00:14:56.660]across 30 weeks in the growing season.
[00:14:59.980]After the spatial attention phase
[00:15:01.710]and encoding with the LSTM layers, the temporal attention
[00:15:05.490]learns the most significant weeks for that yield prediction.
[00:15:09.890]The most important feature of this approach
[00:15:11.940]is that it has both temporal attention, which highlights
[00:15:15.270]the most significant weeks along the growing season
[00:15:18.870]and also a spatial attention, which highlights
[00:15:22.230]which weather variables are most significant in each week.
[00:15:25.870]So what I'm saying is that this method allows breeders
[00:15:28.290]to not only highlight the most important variables
[00:15:30.950]but also figure out the time steps at which
[00:15:34.340]they best predict the annual crop yield.
[00:15:36.400]This is useful to us breeders because not only
[00:15:38.860]do we learn what is or are the most
[00:15:41.870]important predictor variables,
[00:15:43.140]but also when it is most important.
[00:15:45.690]The interpretations provided by our model
[00:15:47.630]can help in understanding the impact of weather variability
[00:15:51.030]on agricultural production, in the presence
[00:15:52.790]of climate change,
[00:15:54.050]and also to formulate breeding strategies
[00:15:55.730]to circumvent these climate challenges
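[Editor's note] The dual-attention idea described above can be sketched in numpy. Everything here is illustrative: the weights are random rather than trained, and a one-layer tanh encoder stands in for the LSTM layers; only the two softmax attention steps (spatial over weather variables, temporal over weeks) mirror the described model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Toy weather data: 30 weeks x 7 variables (precip, temp, irradiance, ...)
weeks, n_vars, hidden = 30, 7, 16
X = rng.normal(size=(weeks, n_vars))

# Randomly initialized stand-ins for learned parameters
W_spatial = rng.normal(size=(n_vars,))           # scores each weather variable
W_enc = rng.normal(size=(n_vars, hidden)) * 0.1  # stand-in for the LSTM encoder
W_temporal = rng.normal(size=(hidden,))          # scores each week
w_out = rng.normal(size=(hidden,))

# 1) Spatial attention: per-week weights over the weather variables
alpha = softmax(X * W_spatial, axis=1)           # (weeks, n_vars), rows sum to 1
X_weighted = alpha * X

# 2) Encode each week (the real model uses LSTM hidden states here)
H = np.tanh(X_weighted @ W_enc)                  # (weeks, hidden)

# 3) Temporal attention: weights over the 30 weeks of the season
beta = softmax(H @ W_temporal)                   # (weeks,), sums to 1
context = beta @ H                               # (hidden,)

yield_pred = context @ w_out                     # scalar yield estimate
```

In a trained model, inspecting `alpha` answers "which variable mattered each week" and `beta` answers "which weeks mattered", which is the interpretability the speaker highlights.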
[00:16:00.740]For this work we included more
[00:16:02.080]than 100,000 unique data points
[00:16:04.030]and about 6,000 unique varieties across 13 years
[00:16:06.870]in the US using the uniform trials.
[00:16:10.480]If we look at the top right graph,
[00:16:13.107]the temporal attention weights highlight
[00:16:15.220]the increasing importance of the features in the time period
[00:16:18.320]of week eight to 10 right here, which falls in about June.
[00:16:23.480]If you look at the graph at the bottom right
[00:16:26.380]you'll notice that average precipitation right here
[00:16:29.910]which is the most important variable for most of the weeks
[00:16:34.130]in the growing season, and irradiance is also highlighted
[00:16:37.130]as important by the dual attention model for a few weeks.
[00:16:40.890]So we're now working to look
[00:16:42.550]at additional data and additional traits.
[00:16:44.620]Also the next iteration needs
[00:16:45.980]to include molecular information in prediction
[00:16:48.330]as well as some of the biophysical models.
[00:16:51.110]One of the most useful features of this method, I would say,
[00:16:55.880]is that it can generate insights
[00:16:57.160]by understanding predictions from the perspective
[00:16:59.250]of what and when.
[00:17:01.060]The insights obtained
[00:17:01.920]by using the model can help to understand
[00:17:04.970]the impacts of weather on yield prediction
[00:17:06.740]and strategize breeding activities, as I mentioned earlier,
[00:17:09.510]and have more data-driven decision-making,
[00:17:11.790]particularly on when and where plant plasticity is required.
[00:17:16.150]But I would have to say this
[00:17:17.070]that much more work needs to be done
[00:17:18.820]that builds on this project.
[00:17:22.258]Next, I'll present two projects that focus on tool
[00:17:24.310]and method development.
[00:17:25.860]The first example, a method development
[00:17:29.150]and testing project, is led by Koushik,
[00:17:31.900]who's a PhD student with Dr. Baskar
[00:17:34.920]and Zaki Jubery who was a postdoc fellow
[00:17:37.270]with me and Baskar.
[00:17:39.300]So machine learning models and methods
[00:17:41.420]achieve really good performance
[00:17:43.140]in image based phenotyping, and a lot
[00:17:45.230]of you folks at UNL are already doing this work,
[00:17:48.780]especially if you have a large amount of labeled data.
[00:17:52.100]There is a review article that we put out
[00:17:54.480]in 2018 in Trends in Plant Science on this topic,
[00:17:58.040]but collecting large sets of labeled data
[00:18:01.230]or datasets has been a big challenge
[00:18:03.280]for the plant phenotyping community.
[00:18:05.460]Data labeling can be costly and laborious.
[00:18:08.380]It can be time consuming, and it needs domain expertise,
[00:18:12.580]especially if it is a more complex task
[00:18:14.450]with more complex phenotypes
[00:18:16.260]that requires additional resources;
[00:18:18.370]crowdsourcing has been proposed and used,
[00:18:20.460]but it also has its own challenges.
[00:18:22.930]Therefore, we were motivated to ask the question,
[00:18:26.970]can we reduce the amount of labeling needed
[00:18:29.800]for these deep learning models to achieve good performance
[00:18:33.200]in an image based classification task?
[00:18:36.700]And this is where active learning is very useful
[00:18:38.760]because the goal of active learning
[00:18:40.520]is to achieve maximum predictive performance
[00:18:43.200]with a fixed labeling budget.
[00:18:45.230]This makes it desirable for plant science applications
[00:18:48.160]which are related to classification tasks,
[00:18:50.720]such as these plant stress ratings.
[00:18:52.980]These active learning methods
[00:18:54.140]select samples for labeling from a pool
[00:18:56.680]of unlabeled samples to maximize the predictive performance
[00:19:00.290]of the deep learning model.
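[Editor's note] The acquisition step of such an active learning loop, here least-confidence sampling, can be sketched as follows; the pool probabilities and budget are made-up toy values, not from the study.

```python
import numpy as np

def least_confidence_query(probs, budget):
    """Pick the `budget` unlabeled samples whose top-class probability
    is lowest, i.e. where the current model is least confident.

    probs: (n_samples, n_classes) predicted class probabilities
    """
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:budget]

# Toy pool of 5 unlabeled images, 3 stress classes
probs = np.array([
    [0.95, 0.03, 0.02],   # confident
    [0.40, 0.35, 0.25],   # uncertain
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],   # most uncertain
    [0.80, 0.10, 0.10],
])
picked = least_confidence_query(probs, budget=2)  # indices 3 and 1
```

Entropy-based sampling would rank by `-(probs * np.log(probs)).sum(axis=1)` instead; both are uncertainty-based methods, in contrast to representation-based methods like core-set.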
[00:19:02.070]So in our work we compared four active learning methods,
[00:19:05.770]least confidence, entropy, deep Bayesian active learning
[00:19:09.983]or DBAL, and core-set, on two very disparate
[00:19:13.470]plant phenotyping problems.
[00:19:14.850]One was soybean stress identification,
[00:19:19.470]from a paper that we published
[00:19:21.270]in 2018, Ghosal et al. in PNAS,
[00:19:24.190]and then a weed species classification dataset
[00:19:27.074]that was published in 2019 by Olsen et al.
[00:19:30.960]And as you can see, both these datasets
[00:19:33.110]are quite large, as you can see in the table,
[00:19:36.320]with diverse types of classes as well.
[00:19:42.340]So we used the MobileNet
[00:19:44.310]architecture for the soybean stress classification
[00:19:47.360]and the ResNet-50 architecture
[00:19:48.910]for the weed species classification, broadly
[00:19:51.430]because MobileNet is a smaller, more compact network
[00:19:54.690]and it's appropriate for the soybean dataset
[00:19:56.580]because we used controlled condition imaging.
[00:19:59.410]ResNet is a larger network
[00:20:01.230]and therefore it is more appropriate
[00:20:02.860]for the weed dataset, which was data collected in the field
[00:20:07.070]with imaging and it's more diverse compared
[00:20:09.360]to the controlled environment set for the soybean stress.
[00:20:13.420]And we found that for both the soybean stress and weed datasets,
[00:20:16.890]the uncertainty based active learning methods,
[00:20:19.410]that's least confidence, entropy, and DBAL,
[00:20:22.220]performed better than random sampling.
[00:20:24.410]This is a presentation of that information;
[00:20:29.890]representation based sampling, the core-set method,
[00:20:33.230]is the other type of method.
[00:20:35.810]And they minimize the amount of labeling that is needed.
[00:20:38.420]And least confidence sampling was the best
[00:20:40.730]performing active learning method for both datasets
[00:20:44.020]when considering the different labeling budgets,
[00:20:48.110]and DBAL was found to be the best performing
[00:20:50.320]method when the labeling budget was small.
[00:20:51.980]So up to 4,000 samples for the soybean
[00:20:54.110]and 2000 samples for the weed dataset
[00:20:57.000]and these results are helpful
[00:20:58.770]because they help us to determine the amount of labeling
[00:21:02.810]or labeled data that we need to feed in
[00:21:05.900]to develop our ML models.
[00:21:07.830]And it helps us better manage our resources.
[00:21:10.910]So in the second project, Koushik again
[00:21:13.940]worked to determine the usefulness
[00:21:15.570]of interpretability methods to explain deep learning
[00:21:18.960]based plant stress phenotyping.
[00:21:20.850]This is also an arXiv paper,
[00:21:22.720]and in this particular work we compared
[00:21:25.170]some of the most popular interpretability methods,
[00:21:27.610]including saliency maps, SmoothGrad,
[00:21:30.240]guided backpropagation, deep Taylor decomposition,
[00:21:33.990]integrated gradients, layer-wise relevance propagation,
[00:21:36.850]and input times gradients, so a whole
[00:21:40.670]suite of different interpretability methods,
[00:21:43.500]to interpret the deep learning model,
[00:21:45.980]that is, to find pixels
[00:21:47.250]that are important for classification.
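[Editor's note] For intuition, the simplest of these attributions can be computed by hand for a linear classifier, where the gradient of a class score with respect to the input is just that class's weight row. This toy sketch (not the talk's DenseNet pipeline, which needs autodiff) shows saliency and input-times-gradient attributions on four "pixel" features.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy "image": 4 pixel features; 3 stress classes; linear classifier
W = np.array([[ 2.0, -1.0, 0.0,  0.5],
              [-0.5,  1.5, 1.0,  0.0],
              [ 0.0,  0.0, 2.5, -1.0]])
x = np.array([0.8, 0.1, 0.6, 0.4])

pred = int(np.argmax(softmax(W @ x)))  # predicted class

# For a linear score s_c = W[c] @ x, the gradient d s_c / d x is W[c]:
grad = W[pred]

saliency = np.abs(grad)     # saliency map: |gradient| per input feature
input_x_grad = x * grad     # input-times-gradient attribution
```

Methods like integrated gradients and layer-wise relevance propagation refine this basic gradient signal for deep nonlinear networks, which is why they can disagree with each other, as discussed next.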
[00:21:50.090]And interpretability, I should mention it's becoming very
[00:21:52.810]very important, because until recently deep learning
[00:21:55.090]methods were considered a black box
[00:21:56.650]and practitioners were unable to understand
[00:21:59.430]what the computer used to generate the outcomes.
[00:22:02.140]For example we can develop a good model that can identify
[00:22:05.550]and quantify a soybean stress,
[00:22:07.540]but all we see are model performance metrics
[00:22:09.670]and no reason as to what the computer learned.
[00:22:12.910]And then we may be less inclined to believe the computer.
[00:22:15.370]So therefore us and other groups
[00:22:17.320]they're very interested in investigating interpretability
[00:22:20.090]for the models that we are developing and using.
[00:22:23.450]So in this project, we trained a DenseNet-121 network
[00:22:26.410]for classification of eight different soybean stresses,
[00:22:29.110]both biotic and abiotic stresses,
[00:22:32.930]using a dataset consisting of over 16,000
[00:22:37.020]RGB images of healthy and stressed
[00:22:39.220]soybean leaflets that were captured
[00:22:41.250]under controlled conditions.
[00:22:42.290]As I explained earlier as well, we find a really
[00:22:45.390]high overall classification accuracy.
[00:22:49.097]That's quite good.
[00:22:50.700]And as you can see in the confusion matrix here
[00:22:53.750]on the left side, for the majority of the class differentiations,
[00:22:57.870]the model worked quite well.
[00:22:59.280]However, in some instances, for example,
[00:23:01.890]the bacterial pustule and the bacterial blight here,
[00:23:06.870]the model struggled to correctly classify
[00:23:09.170]the image, and even for a human rater
[00:23:11.050]some of these diseases are not easily classified.
[00:23:13.300]So broadly speaking, I would say that
[00:23:16.390]given a new sample, the program can accurately
[00:23:18.700]identify the stress from an image, but of course
[00:23:22.010]more improvement can be made.
[00:23:24.130]We then used a diverse subset
[00:23:25.910]of the test data to compare these important model features
[00:23:29.710]with those identified by the human expert.
[00:23:31.770]And we observed that most interpretability methods
[00:23:34.880]identify the infected region of the leaflet
[00:23:37.500]as important feature for some
[00:23:39.860]but not all of the correctly classified images.
[00:23:44.080]So you can see some of the examples here.
[00:23:46.110]If you look at potassium deficiency here
[00:23:48.780]most of these methods,
[00:23:50.010]these are examples of those interpretability methods,
[00:23:52.840]you can see that most
[00:23:53.810]of the methods identified similar important features,
[00:23:56.730]which is good.
[00:23:57.563]You know, there's a very distinct type
[00:23:59.790]of potassium deficiency symptom that appears.
[00:24:02.940]In other cases,
[00:24:03.773]for example sudden death syndrome,
[00:24:05.800]different methods identified somewhat different
[00:24:08.550]or dissimilar important regions.
[00:24:11.120]And this is perhaps to be expected,
[00:24:13.630]as different methods are built on somewhat
[00:24:15.520]different bases to generate these pixel importances.
[00:24:18.370]But the unresolved challenge here
[00:24:19.890]is that none of the interpretability methods worked
[00:24:22.240]in all different stress classes.
[00:24:24.720]And therefore at this stage, I would say that we advocate
[00:24:28.870]that these interpretability methods should be used
[00:24:31.850]for hypothesis generation that can drive some
[00:24:35.143]of the scientific insight,
[00:24:35.976]and more robust models need to be developed.
[00:24:38.840]It is important to clarify
[00:24:40.023]that the model is able to accurately classify;
[00:24:43.960]the outstanding current challenge
[00:24:45.820]is that no universal interpretability method works
[00:24:48.710]in all situations.
[00:24:49.900]So the model works.
[00:24:51.030]What we're talking about is that we don't yet
[00:24:53.040]have a universally acceptable interpretability method
[00:24:56.700]but with further resolution breeders
[00:24:58.930]should be able to use these deep learning based methods
[00:25:01.940]for more accurate phenotyping.
[00:25:03.740]And also check that the model interpretability matches
[00:25:06.250]with the visual with visual reading
[00:25:09.430]heuristic reading as well.
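As an illustration of the cross-method agreement check described above, here is a toy sketch in Python. The linear model, the two attribution rules, and the top-4 overlap metric are all simplified stand-ins for illustration, not the group's actual deep learning pipeline.

```python
import numpy as np

# Hypothetical sketch: two gradient-based saliency maps for a linear
# classifier score s(x) = w . x.  For a linear model, "vanilla gradient"
# saliency is |w|, and "gradient x input" saliency is |w * x| -- two
# common attribution methods that may or may not agree.

rng = np.random.default_rng(0)
w = rng.normal(size=16)          # model weights (stand-in for a trained CNN)
x = rng.normal(size=16)          # one input image, flattened

saliency_grad = np.abs(w)        # vanilla gradient attribution
saliency_gxi = np.abs(w * x)     # gradient-times-input attribution

# Agreement between methods: do they rank the same features as important?
top_grad = set(np.argsort(saliency_grad)[-4:])
top_gxi = set(np.argsort(saliency_gxi)[-4:])
overlap = len(top_grad & top_gxi) / 4.0
print(f"top-4 feature overlap between methods: {overlap:.2f}")
```

Low overlap between attribution methods, as for the sudden death syndrome example above, is exactly the kind of disagreement that motivates treating these maps as hypothesis generators.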
[00:25:14.050]So for the third topic, plant stress phenotyping,
[00:25:18.740]I'll give two examples.
[00:25:23.140]There is prior work that has been done in UAV based phenotyping
[00:25:28.670]for crop health and yield prediction,
[00:25:30.480]and it has shown its potential for use in a breeding program.
[00:25:33.850]It's being done in the wheat program
[00:25:36.700]and the soybean program at UNL,
[00:25:38.130]and in other programs as well.
[00:25:40.940]For this specific type of stress, iron deficiency chlorosis,
[00:25:44.520]there is a really good paper that came out
[00:25:46.550]of Dr. Lorenz's lab, Dobbels et al., where they showed
[00:25:51.890]that iron deficiency chlorosis phenotyping
[00:25:53.820]can be done successfully using drones.
[00:25:56.890]So our work is aimed to look both
[00:25:59.530]at the ability to predict iron deficiency chlorosis,
[00:26:03.330]but also to standardize flight parameters for best practices
[00:26:07.510]and to look into the effect of IDC
[00:26:09.420]on canopy growth and development as a science question,
[00:26:13.550]and also with an interest
[00:26:15.480]to conduct time series genetic studies.
[00:26:17.760]So we flew at 60 and 30 meters.
[00:26:19.970]So that would be about 200 feet
[00:26:21.760]for 60 and then about 100 feet for the 30 meters.
[00:26:24.940]And we found that accuracy and mean
[00:26:27.360]per class accuracy were not affected
[00:26:30.390]by flight altitude in our experiments.
[00:26:32.210]So those are the different dates at the 30 meter height
[00:26:36.750]and 60 meter height.
[00:26:38.510]And we are looking at both overall accuracy
[00:26:40.680]but also mean per class accuracy,
[00:26:42.550]to check that we are getting good accuracy
[00:26:45.230]in each of the five rating classes.
[00:26:48.540]And this is good news because there's a large increase
[00:26:51.980]in efficiency in flight time,
[00:26:53.910]as well as the area that can be covered,
[00:26:55.610]with increases in these flight parameters.
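The reason for tracking mean per-class accuracy alongside overall accuracy can be sketched with toy numbers (the class counts and predictions below are invented, not the IDC results):

```python
import numpy as np

# Illustrative sketch: why mean per-class accuracy matters for the five
# IDC rating classes -- overall accuracy can look good even when a rare
# class is always misclassified.

def overall_accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def mean_per_class_accuracy(y_true, y_pred, n_classes=5):
    accs = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            accs.append(np.mean(y_pred[mask] == c))
    return float(np.mean(accs))

# Imbalanced toy ratings: class 4 (severe IDC) is rare and always missed.
y_true = np.array([0]*40 + [1]*30 + [2]*20 + [3]*8 + [4]*2)
y_pred = y_true.copy()
y_pred[y_true == 4] = 3   # severe plots mislabeled as class 3

print(overall_accuracy(y_true, y_pred))        # 0.98 -- looks great
print(mean_per_class_accuracy(y_true, y_pred)) # 0.8 -- exposes the miss
```

Seeing both metrics hold up across the 30 m and 60 m flights is what supports the claim that altitude did not degrade performance on any rating class.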
[00:26:59.350]In addition to these, we also looked at canopy growth rates
[00:27:03.020]across time points and found some very interesting SNPs
[00:27:06.920]when we did the GWAS.
[00:27:09.950]We also grew a companion study
[00:27:11.550]on the same farm plot without IDC symptoms,
[00:27:14.630]and we did not find similar regions
[00:27:16.150]of interest when using this as the control set.
[00:27:18.490]So that seems to give some evidence
[00:27:20.470]that there is diversity in how soybeans recover
[00:27:22.760]from IDC stress, generating time series based insight
[00:27:26.670]on differential response to IDC in soybean varieties.
[00:27:29.720]So this project is helping us to set guidelines
[00:27:32.090]for our drone based phenotyping,
[00:27:34.010]as well as give some scientific insights
[00:27:35.970]on the impact of IDC on canopy growth and development.
[00:27:39.510]And we hope to upload a preprint this year.
[00:27:42.490]In the second project
[00:27:43.560]I'll present work of Sarah Jones on drought tolerance.
[00:27:49.270]Sarah is a PhD student in the group.
[00:27:52.090]And Jay you'll remember if you're in the audience.
[00:27:55.550]Sarah was a student when Jay was here at Iowa state.
[00:28:00.040]So drought tolerance is a complex trait,
[00:28:02.400]and you all know this,
[00:28:04.140]it's going to become increasingly important.
[00:28:07.400]So our team is working to develop drought tolerant
[00:28:09.500]soybean lines that can withstand these weather extremes.
[00:28:12.800]With a lot of effort that went behind this,
[00:28:15.870]we were finally able to set up a good drought
[00:28:19.110]screening field site in Iowa, and we were able
[00:28:22.420]to screen a large germplasm panel consisting
[00:28:24.830]of about 450 PI lines and checks in a replicated trial.
[00:28:29.490]Also, breeding populations were planted
[00:28:31.390]in the drought evaluation nursery.
[00:28:35.270]And what we have done is, in addition to wilt score
[00:28:38.100]and some of the other leaf and canopy response ratings,
[00:28:41.650]we are also interested in some of the other measurements.
[00:28:44.150]For example, researchers have shown the usefulness
[00:28:46.440]of hyperspectral imaging
[00:28:47.890]to accurately phenotype plant stress response
[00:28:50.530]and thermal imaging that measures canopy temperature
[00:28:53.110]has been used to estimate soybean transpiration
[00:28:55.910]in the field.
[00:28:56.860]So we've used ground and aerial based thermal imagery
[00:29:00.020]with some success, the thermal canopy imagery
[00:29:02.940]showing the dark purple
[00:29:04.790]as the cooler temperatures compared
[00:29:06.570]to the more magenta, warmer canopy temperatures.
[00:29:10.500]With an ML approach, we are also interested
[00:29:12.950]in being able to estimate wilt score from drone images.
[00:29:16.620]The sensors mounted on these drones
[00:29:18.790]enable high-throughput phenotyping,
[00:29:21.220]which is necessary for field scale time series experiments.
[00:29:24.750]We are also looking to standardize phenotyping protocols
[00:29:27.230]working with different sensors
[00:29:29.020]and building on past years' experience.
[00:29:31.000]This year we are going to perform
[00:29:33.390]another large germplasm screening
[00:29:36.100]and breeding family screening
[00:29:37.870]for drought tolerance using wilt score, thermal, RGB
[00:29:41.220]and hyperspectral cameras and a radiometer,
[00:29:44.610]both through drone and ground data collection.
[00:29:48.890]And this is a visual representation
[00:29:50.960]of all the reflectance data collected across different ranges
[00:29:54.720]of the electromagnetic spectrum.
[00:29:59.080]So just on the wavebands:
[00:30:02.710]the RGB imagery and visual wilt score
[00:30:05.590]cover from 400 to 750 nanometers, the hyperspectral
[00:30:09.540]from 350 to 2,500 nanometers,
[00:30:12.060]and then the thermal camera
[00:30:13.900]from 7,500 to 13,500 nanometers.
[00:30:17.780]And the graphs in the figure show the wilt scores
[00:30:21.200]across wavebands.
[00:30:22.180]There is a differentiation, and that's good news.
[00:30:25.400]One of the goals is to replace ground based imaging
[00:30:28.390]with drone imagery to facilitate more accurate
[00:30:30.870]and more rapid rating for larger drought screening nurseries,
[00:30:34.710]as well as to pick up water stress symptoms
[00:30:37.150]before they're visible to the human eye.
[00:30:39.290]And you can see here, we are taking a hierarchical
[00:30:41.950]approach to improve our phenotyping
[00:30:44.830]of drought stress,
[00:30:45.663]starting from more simple approaches and then getting more
[00:30:48.197]and more integration of different sensors
[00:30:51.070]to get as close to accurate phenotyping as possible.
[00:30:57.600]And the next and final project group I'll talk about:
[00:31:02.470]the root system architecture and nodulation
[00:31:04.580]related projects that we are working on.
[00:31:07.180]And those of you who've heard me talk know
[00:31:10.700]I'm very, very passionate about roots.
[00:31:12.250]They are absolutely critical for plants because they help
[00:31:14.510]with water and nutrient uptake, structural strength,
[00:31:17.910]and also, in the case of soybean,
[00:31:20.560]with the symbiotic association with bacteria.
[00:31:22.980]It's an often overlooked area of plant science,
[00:31:26.460]but our group is quite passionate about this topic.
[00:31:29.020]So currently a large chunk of research effort
[00:31:32.390]in our team is dedicated to root-related research.
[00:31:38.120]Breeders have not directly integrated root traits
[00:31:40.390]in their breeding objectives, but more recently
[00:31:42.490]there is a renewed interest in root-related traits.
[00:31:44.487]And this has happened, I would say, partly
[00:31:46.560]because of advances in phenotyping capability.
[00:31:48.916]And there's some really cool work that's going on
[00:31:51.580]across different institutions in the US
[00:31:53.570]and overseas as well in developing root imaging software.
[00:31:58.220]And we have built an Automated Root Imaging Analysis,
[00:32:01.910]ARIA in short. This was initially developed
[00:32:04.400]by Baskar's group, focused on maize roots,
[00:32:09.520]and then more recently Zaki
[00:32:11.140]and Kevin, my former PhD student who now works for Corteva,
[00:32:16.250]have developed ARIA version 2, which is focused
[00:32:20.318]on the soybean phenotyping pipeline in early stages.
[00:32:24.180]So this was published
[00:32:25.200]recently in the Plant Methods journal:
[00:32:27.390]a mobile, low-cost, high resolution
[00:32:29.910]root phenotyping system
[00:32:31.540]consisting of an imaging platform with computer vision
[00:32:34.500]and ML based approaches to establish
[00:32:37.060]an end to end pipeline with minimal human intervention.
[00:32:41.370]This high-throughput phenotyping system
[00:32:43.460]has given us the capability and capacity to handle hundreds
[00:32:47.430]to thousands of plants at the same time
[00:32:50.830]and to integrate time series data capture.
[00:32:53.920]We extract about 45 different root traits.
[00:32:55.820]Not all of them are important,
[00:32:57.180]but they capture different aspects of the root.
[00:33:01.700]And this is an example of the output of a select trait.
[00:33:05.020]We can study multiple accessions and across time points,
[00:33:07.720]as I mentioned, for numerous stages,
[00:33:09.460]and you can see here just the time series
[00:33:12.040]for the same genotype, and these trait outputs
[00:33:17.740]can be visualized.
[00:33:18.760]This is just an example
[00:33:20.000]of a visualization for three different genotypes
[00:33:23.330]for different traits in a time series.
[00:33:25.090]So just a lot of information that can be gleaned
[00:33:28.440]from the output of the software.
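As a hedged sketch of what can be computed from such time series output, here is how per-day growth rates might be derived from a trait trajectory (the trait name, values, and imaging days are made up for illustration, not ARIA's actual numbers):

```python
import numpy as np

# Hypothetical sketch: deriving growth rates from time-series root trait
# output.  The days and total-root-length values below are toy data.
days = np.array([5, 7, 9, 11, 13, 15])                # imaging days after planting
total_root_length = np.array([3.1, 6.0, 10.4, 16.2, 23.5, 31.9])  # cm, invented

# Finite-difference growth rate between consecutive imaging days.
rates = np.diff(total_root_length) / np.diff(days)
print("growth rate (cm/day):", np.round(rates, 2))

# A single linear-fit rate over the whole window, useful for comparing
# genotypes with one number.
slope, intercept = np.polyfit(days, total_root_length, 1)
print(f"mean rate: {slope:.2f} cm/day")
```

Comparing these per-interval and fitted rates across genotypes is one simple way to turn the time series curves shown on the slide into testable hypotheses about differential root development.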
[00:33:30.610]And again, as I said,
[00:33:31.510]there is other really good software available as well
[00:33:34.610]for root phenotyping.
[00:33:36.530]So building on the ARIA version 2 that I showed
[00:33:39.820]in the previous slide, Kevin studied
[00:33:43.770]close to 300 diverse lines representing a sample
[00:33:46.790]of genetic diversity in the USDA germplasm collection.
[00:33:50.400]And the main objective there was to study genetic diversity
[00:33:52.860]in root traits for early soybean growth stages.
[00:33:56.800]And as you would expect, genotypes were detected
[00:33:59.270]to be a significant source of variation,
[00:34:02.170]but no set pattern was detected
[00:34:03.940]when we explored categories based
[00:34:05.420]on maturity group and growth habit.
[00:34:08.520]We noted that primary root length and surface area
[00:34:11.420]of the primary root were significantly better
[00:34:13.500]in elite than unadapted accessions,
[00:34:15.920]and large trait diversity was evident
[00:34:17.750]for root traits in the germplasm panel.
[00:34:19.920]And in this paper we also studied the previously
[00:34:23.724]described root ideotypes, and we call them iRoot, for informative root.
[00:34:26.740]These are examples: the nutrient foraging category is based
[00:34:32.850]on a previous paper that describes this phenotype
[00:34:34.930]as maximizing the distribution of lateral roots
[00:34:37.030]in topsoil at a minimal metabolic cost
[00:34:39.840]to outperform competitors in a nutrient poor soil.
[00:34:45.290]Then we have the drought tolerant iRoot category
[00:34:47.600]right here, which follows
[00:34:50.940]the steep, cheap, and deep paradigm: long primary root
[00:34:54.100]with steep lateral roots with minimal solidity.
[00:34:57.980]Then the beard type, again maximizing total root length, number
[00:35:02.440]of lateral root branches, and traits that give it
[00:35:06.800]that distinct beard shape, getting really
[00:35:10.460]high maximum root density; and the umbrella type
[00:35:14.160]that maximizes root length, root width, convex area.
[00:35:18.320]So again a distinct shape, as the name suggests.
[00:35:22.270]And finally the maximum iroot category,
[00:35:24.570]which is an effort to identify genotypes
[00:35:26.710]that maximize the phenotypic potential
[00:35:29.110]without a particular environment in mind.
[00:35:31.200]So when we studied these root trait correlations,
[00:35:34.580]we noted that there was a large set
[00:35:36.350]of traits that provide more general metrics, like length,
[00:35:39.180]width, and area, showing good, strong correlations
[00:35:43.270]and forming single hierarchical clusters,
[00:35:46.324]where these relationships held at the different study
[00:35:49.780]timepoints that we looked at.
[00:35:52.310]At the bottom here and the top here,
[00:35:56.510]examples with the highest numerical values are shown
[00:35:59.630]for some of the iRoot categories,
[00:36:02.950]the distinct types.
[00:36:05.270]And this is a graph that shows these distinct types
[00:36:09.650]through ranking of the top 10 genotypes
[00:36:11.920]in each category, to show
[00:36:14.280]differences between these accessions for different traits.
[00:36:19.600]This here is a clustered heat map
[00:36:21.920]to show the comparisons of genotypes
[00:36:23.630]and phenotypic performance. The dendrogram
[00:36:26.010]on the left side groups
[00:36:28.545]the about 300 different genotypes based
[00:36:31.082]on genetic relatedness using SNP data.
[00:36:33.240]And in this first branching,
[00:36:35.010]the letters indicate the genotype based clusters.
[00:36:37.780]G and F here are genetically distinct
[00:36:40.610]from the other six, and genotypes
[00:36:42.490]in these two clusters generally showed very
[00:36:45.230]low root trait values, indicating their poor performance.
[00:36:48.410]And that's based on this map here: black is good,
[00:36:52.000]red is bad,
[00:36:52.833]and then blue is good,
[00:36:53.760]orange is worse, in terms of ranking.
[00:36:57.260]The second dendrogram, above the heat map,
[00:36:59.600]connects the phenotypic traits based
[00:37:01.290]on their relatedness in the numerical values.
[00:37:04.550]And these are the iRoot types that I explained earlier.
[00:37:08.900]So when you look at the red and black heat map here,
[00:37:13.790]anything that's black, the harder black, would be closer
[00:37:16.900]to rank one, and the most red would be closer
[00:37:19.930]to rank 300 for the category.
[00:37:23.267]And this genotype group here,
[00:37:25.820]C, displayed high root trait values, black,
[00:37:29.350]for three of the iRoot categories.
[00:37:31.610]The next is a sub-branch within cluster B,
[00:37:35.930]right here in B, which is also strong.
[00:37:41.550]Then genotypes in cluster C are also
[00:37:43.300]strong performers. Cluster A,
[00:37:44.790]which is mostly made up of American genotypes,
[00:37:47.960]wasn't particularly good or bad,
[00:37:50.230]but this cluster did have the longest taproot.
[00:37:52.700]So it gives us some insights
[00:37:54.470]into where different genetic diversity exists
[00:37:57.400]and what is still unused
[00:37:59.630]from a US perspective for breeding.
[00:38:02.750]It also showed the relationship of traits to each other,
[00:38:05.490]that's right here, and the performance of these 13 root traits
[00:38:08.930]from good to worse rankings, and these observations
[00:38:12.100]match those of the genotype based clusters.
[00:38:14.430]But again, new insights were generated
[00:38:16.360]from these sub-branches.
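The good-to-worse ranking underlying such a heat map can be sketched as follows (toy trait values for three hypothetical genotypes; the real analysis ranks about 300 genotypes over many traits):

```python
import numpy as np

# Illustrative sketch: ranking genotypes within each root trait, as in a
# heat map where rank 1 (black) is best and rank N (red) is worst.
trait_values = np.array([
    # length, width, convex_area  (larger = better here; invented numbers)
    [120.0, 14.0, 300.0],   # genotype A
    [ 95.0, 18.0, 410.0],   # genotype B
    [150.0, 11.0, 280.0],   # genotype C
])

# Rank within each column: 1 = highest trait value.
order = np.argsort(-trait_values, axis=0)
ranks = np.empty_like(order)
for j in range(trait_values.shape[1]):
    ranks[order[:, j], j] = np.arange(1, trait_values.shape[0] + 1)

print(ranks)
# genotype C is rank 1 for length; genotype B is rank 1 for width and convex area
```

Coloring such a rank matrix, with the rows ordered by a genetic dendrogram and the columns by trait relatedness, gives exactly the kind of two-way clustered heat map described on the slide.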
[00:38:17.910]So what we are doing now is working
[00:38:20.020]on breeding activities to diversify the gene pool
[00:38:22.150]for these root traits.
[00:38:25.940]This project, led by PhD student Clayton Carley,
[00:38:29.130]builds on Kevin's work by doing GWAS for RSA traits,
[00:38:34.580]leveraging automated trait extraction
[00:38:36.320]from ARIA to expand our genetic mapping work.
[00:38:39.449]Here is an example of the trait diversity within
[00:38:42.330]the panel, from the maximum to the minimum of the same trait.
[00:38:45.570]And there's a lot of trait variability,
[00:38:47.650]which is quite encouraging for us.
[00:38:50.160]And this shows the output from ARIA, just to give a sense of
[00:38:53.750]the kinds of traits that can be explored, coming
[00:38:56.550]from the original image.
[00:39:01.420]The final project I'm showing here:
[00:39:04.430]past the root system architecture study, Clayton
[00:39:07.410]and Zaki
[00:39:08.760]continued to explore nodulation
[00:39:11.122]in genotypes across time.
[00:39:12.770]There's a lot known about nodulation,
[00:39:14.430]but we still don't fully grasp the spatial map
[00:39:17.890]and also the temporal map of nodulation.
[00:39:20.670]So we felt there was a need to acquire quality,
[00:39:22.600]consistent, and resource efficient data on nodulation.
[00:39:25.870]One of the scales that has been used
[00:39:28.420]for nodulation is a helpful one
[00:39:30.820]that includes visual rating on a scale of zero to five
[00:39:33.590]to evaluate nodules
[00:39:34.590]on taproots compared to the whole root.
[00:39:38.060]With advances in image based phenotyping,
[00:39:39.960]our interest is that we want
[00:39:42.330]to develop a more accurate
[00:39:44.354]and more nimble approach for nodule phenotyping.
[00:39:48.514]You can see that these nodules range from really small
[00:39:51.160]to really large and are spread all over the place.
[00:39:54.560]So you can see that they're hard to rate,
[00:39:57.030]and they range from mature
[00:40:00.670]to very young, and it's very time consuming
[00:40:03.570]to remove nodules for phenotypic quantification.
[00:40:07.090]So with the SNAP software that was developed
[00:40:10.910]to get more automated measurements and counts for nodules,
[00:40:15.260]we observed an R squared of 0.91 when comparing SNAP
[00:40:18.410]to the actual ground truth nodule counts in the validation,
[00:40:21.990]and we had a higher R squared of 0.99 when we compared SNAP
[00:40:26.390]with the image based count. And this discrepancy
[00:40:30.230]is likely due to the nodules being knocked
[00:40:32.450]off between imaging and counting,
[00:40:34.270]as well as the nodule disintegration that may have happened
[00:40:36.800]in the bag, making them nearly impossible
[00:40:39.800]to identify and count by hand.
[00:40:42.620]We manually removed the nodules
[00:40:44.090]from about 700 root samples
[00:40:48.280]and took note of the time required to do so.
[00:40:50.550]And we noticed that SNAP can increase
[00:40:52.970]the rate of phenotyping
[00:40:53.820]to 20 times greater than manual hand quantification.
[00:40:56.930]It allows us to study nodule related traits in soybean.
[00:41:00.100]And when comparing the time it takes SNAP
[00:41:02.030]to validate an image versus hand removal of the nodules,
[00:41:04.800]there's a drastic reduction in labor.
[00:41:07.630]And SNAP also gives more quantifiable
[00:41:10.340]information that can be collected on each nodule.
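The R squared comparison described above can be sketched with simulated counts (the numbers below are invented, not the study's data; `r_squared` is the standard coefficient of determination, not the study's exact validation code):

```python
import numpy as np

# Minimal sketch of the validation comparison: R^2 between automated
# nodule counts and two references.  Losing nodules between imaging and
# hand counting naturally lowers R^2 against the hand count.

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
image_count = rng.integers(20, 120, size=50).astype(float)  # nodules visible per image
snap_count = image_count + rng.normal(0, 2, size=50)        # automated count: small error
ground_truth = image_count - rng.integers(0, 8, size=50)    # nodules lost before hand count

print(f"vs image-based count R^2: {r_squared(image_count, snap_count):.2f}")
print(f"vs hand count R^2:        {r_squared(ground_truth, snap_count):.2f}")
```

The simulated gap between the two R squared values mirrors the 0.99 versus 0.91 pattern reported: agreement is naturally tighter against the image-based reference than against a hand count taken after some nodules were lost.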
[00:41:13.170]This is available as a preprint, and it is currently under review.
[00:41:16.090]Hopefully it will be out soon,
[00:41:18.160]but we are already deploying SNAP
[00:41:19.360]for several biological questions
[00:41:21.610]and an extensive time study.
[00:41:25.630]So you can just see an example here to show you
[00:41:28.600]that you can acquire traits such as nodule size
[00:41:31.600]and location on the root and overall root,
[00:41:34.140]and then the root growth zones.
[00:41:36.250]This data set comes from several different genotypes.
[00:41:39.710]It's not biased towards one
[00:41:41.060]or two different genotypes, so it's more robust,
[00:41:44.420]at least that's our intent. And this also
[00:41:49.090]allows us to get information on the size and, you know,
[00:41:53.480]where the nodules are placed, as well
[00:41:55.660]as the spatial aspect that we were interested in.
[00:41:59.710]So we can also evaluate the different growth
[00:42:03.180]stages, which helps us figure out what
[00:42:06.810]the breeding target needs to be.
[00:42:11.410]Here are just some preliminary results, showing
[00:42:16.000]six different accessions with diverse RSA traits
[00:42:19.010]that we put in a study. Root sampling was done
[00:42:22.100]at different vegetative stages,
[00:42:23.520]and this was done with 12 to 14 replications
[00:42:26.520]per genotype for each of the stages.
[00:42:28.990]This was done over two years, three locations.
[00:42:32.110]And there's little deviation
[00:42:33.590]in the first growth stage in terms of nodules,
[00:42:36.950]but then you can start seeing
[00:42:38.540]that differences arise. Much
[00:42:41.640]of this initial nodulation increase is happening
[00:42:43.620]on the taproot, but some genotypes are also
[00:42:45.700]showing increased nodulation on lateral roots.
[00:42:47.690]And the greatest variation
[00:42:50.080]and divergence are happening in the later stages.
[00:42:52.440]And this is what I meant by
[00:42:54.647]the taproot versus the whole root.
[00:42:57.200]So looking at the overall year-location mean rank
[00:43:01.340]between genotypes for the total number of nodules
[00:43:04.480]and total number of taproot nodules, certain genotypes
[00:43:07.810]seem to be putting more nodules
[00:43:09.320]on the taproot compared to the other genotypes.
[00:43:11.970]And sometimes these trends held
[00:43:13.340]across stages too.
[00:43:15.600]And you can see genotype two
[00:43:16.850]in the table here adds more nodules
[00:43:20.210]on the lateral roots than on its taproot,
[00:43:22.800]and it's lower in taproot nodules.
[00:43:23.970]Therefore we notice variation
[00:43:26.040]among genotypes in the placement
[00:43:27.170]of the nodules on the main and lateral roots,
[00:43:29.400]posing some really good research questions
[00:43:31.810]that we are exploring now.
[00:43:33.800]And this is just the output that SNAP can generate
[00:43:37.540]from the image that was taken.
[00:43:41.010]It gives us a lot of power.
[00:43:42.980]And finally, through this experiment
[00:43:44.947]we are exploring the relationship between roots and shoots
[00:43:47.680]by investigating correlations between above ground traits,
[00:43:51.360]growth and development, and then below ground traits.
[00:43:54.680]And this video here shows an example of an experiment
[00:43:57.802]we conducted in the field that we are using to connect these
[00:44:01.240]above and below ground traits.
[00:44:02.560]And this is an example
[00:44:03.740]of a plant where the above part is shown
[00:44:07.330]and then its root is shown here.
[00:44:10.720]So we are interested to understand
[00:44:12.150]and connect root structural architectural traits
[00:44:14.600]and yield and other above ground traits.
[00:44:17.330]And ultimately our interest is to map this relationship
[00:44:20.610]so we can breed for root traits.
[00:44:24.260]So I'd like to wrap up with some of the learnings.
[00:44:29.220]I've given several different examples
[00:44:30.980]of ongoing projects that directly
[00:44:32.810]or indirectly help our breeding pipeline,
[00:44:35.250]and they're expanding our toolbox
[00:44:36.780]in new ways. To expand the research output,
[00:44:42.060]we are now attempting to make our data
[00:44:44.200]and code publicly and easily accessible
[00:44:46.000]for all of our projects so that researchers can use them.
[00:44:48.980]If there's an interest, this is our GitHub account.
[00:44:52.630]As new papers come out, the information on code
[00:44:56.070]and data should be available there.
[00:44:58.520]And then, while I've given
[00:45:02.450]different examples in a short time,
[00:45:04.890]there are two important aspects I want to communicate.
[00:45:08.180]They're listed here,
[00:45:09.013]but the two that I want to communicate are that effective
[00:45:12.610]and meaningful collaborations are absolutely essential
[00:45:15.150]for complex projects. Some
[00:45:18.920]of the projects I explained are quite complex,
[00:45:21.690]and it's very difficult to find a single approach that works.
[00:45:26.973]So there's no single approach that will solve
[00:45:29.070]or work on each problem.
[00:45:31.630]And second, these research projects are complex,
[00:45:33.730]and what that means is that greater attention
[00:45:36.590]to detail is needed, with continuous adjustments
[00:45:39.530]and improvements to the methods and analysis.
[00:45:42.230]And that will ensure that the outcomes are relevant
[00:45:44.300]and applicable to the wider community of researchers.
[00:45:47.870]And I'll say that there is so much more that we can do
[00:45:51.940]if plant scientists and engineers
[00:45:56.330]join hands on collaborative projects.
[00:46:01.580]And with this I'll wrap
[00:46:02.480]up by expressing thanks to the soynomics team
[00:46:05.670]including the PIs, staff, graduate students,
[00:46:08.770]undergraduate students that are currently in the group
[00:46:11.210]and past team members. Currently Brian, Jennifer and Ryan,
[00:46:15.790]a new employee who just recently started,
[00:46:17.840]are staff members that help enable
[00:46:19.560]these complex projects for our team.
[00:46:22.350]And sincere appreciation to the funding agencies:
[00:46:24.510]the Iowa Soybean Association, United Soybean Board,
[00:46:27.510]North Central Soybean Research Program,
[00:46:29.460]Bayer Chair, Iowa Crop Improvement Association,
[00:46:32.910]and then support from USDA and NSF.
[00:46:35.450]And all institutional support is greatly appreciated.
[00:46:39.550]Thank you for your time.
[00:46:41.610]Thank you, Danny.
[00:46:42.443]Thanks for that excellent talk.
[00:46:45.950]One thing that caught my attention was,
[00:46:48.180]you talked about drought phenotyping, and you said
[00:46:50.870]there's now the infrastructure to do drought phenotyping.
[00:46:54.248]And we kind of understand the challenges, right?
[00:46:56.710]Because doing drought phenotyping,
[00:47:01.790]if it's in the field, it can be challenging.
[00:47:03.750]And if there is more rain that season,
[00:47:08.280]then that can disturb your experiment and things like that.
[00:47:11.731]So one question was, when you talked
[00:47:14.490]about drought phenotyping,
[00:47:15.570]is it in the field, and are you thinking
[00:47:18.460]of rainout shelters or something like that?
[00:47:23.150]The second question along the same lines is, if I'm not wrong,
[00:47:28.950]in the same project you mentioned biparental families
[00:47:31.720]and also screening germplasm accessions.
[00:47:34.500]And I was wondering that,
[00:47:36.820]for genomic selection, kinship really matters,
[00:47:39.520]and the accuracy is sort of dependent
[00:47:44.750]to some extent on that kinship,
[00:47:46.640]but with ML methods,
[00:47:48.760]my understanding is it's somewhat independent of that.
[00:47:51.620]So do you see better results with drought traits
[00:47:54.390]with biparental families versus germplasm accessions
[00:47:58.170]or diverse accessions?
[00:48:00.900]Yeah, thanks for those questions Vikas.
[00:48:04.260]So, yes, the study that I presented,
[00:48:08.170]there is a field site infrastructure there.
[00:48:14.720]And your point is a good point.
[00:48:17.900]With drought screening, you may be successful
[00:48:25.250]in one year, but you may not be successful in another year,
[00:48:28.120]because you're just dependent
[00:48:31.010]on the rain, and rainout shelters may circumvent
[00:48:36.000]some of the challenges, but inherently
[00:48:41.890]they have their own problems, because at the scale
[00:48:43.940]that we want to use in a breeding nursery,
[00:48:49.017]rainout shelters would probably
[00:48:49.910]not be an economical solution for us.
[00:48:52.460]So right now we are still
[00:48:54.440]at the mercy of mother nature,
[00:48:56.540]and we may get a good year
[00:48:59.851]where there is an extended period of no rain
[00:49:02.490]that allows us to get some good quality data.
[00:49:09.470]Both for abiotic stresses, but also biotic stresses,
[00:49:12.870]there is a challenge to connect greenhouse
[00:49:15.090]or controlled environment conditions
[00:49:16.980]and their translation to the field.
[00:49:18.360]And that's been shown over and over again.
[00:49:21.960]So there's a lot of effort that still goes on
[00:49:25.360]in order to make sure that we can have some
[00:49:31.010]easier, more controlled types
[00:49:34.270]of situations where we can still get meaningful data.
[00:49:39.260]And that's an active area of work.
[00:49:42.500]But to your point about whether it's in the genomic studies
[00:49:49.120]or in the phenomic studies: the reason
[00:49:53.450]for ensuring that we are accounting
[00:49:56.910]for any dependencies is to ensure
[00:50:00.140]that our results are not biased
[00:50:04.460]because of any pre-existing dependencies,
[00:50:07.710]whether it's in terms of a relationship
[00:50:10.270]or maybe something else.
[00:50:12.610]So in our approach on the genomic side
[00:50:17.220]of things, yes, those things have to be accounted for.
[00:50:19.450]On the ML side of things,
[00:50:23.240]there are some ML tools that will allow
[00:50:30.180]more intelligent sampling.
[00:50:31.580]One of the examples I gave was that active learning example.
[00:50:34.360]That's probably what you're referring to.
[00:50:37.000]If you take a random sample
[00:50:39.330]then random would serve as that baseline.
[00:50:41.530]But if you're using some of those active learning methods
[00:50:44.440]they're using different types of functions
[00:50:47.820]that allow them to pick the samples
[00:50:51.360]such that they are able to sample all different classes.
[00:50:56.300]Let's say it's a classification task.
[00:50:58.560]And that gives the model better ability to perform
[00:51:04.630]for the tasks that it is being generated for.
[00:51:07.390]So in ML they aren't necessarily removed,
[00:51:11.180]because if we are not accounting for those relationships,
[00:51:15.480]the model will probably be a little more myopic
[00:51:18.740]in where it can do a good job.
[00:51:22.530]And then outside of that, it will probably fail
[00:51:25.640]because it has not seen that data. Of course,
[00:51:27.670]there are some more advanced techniques
[00:51:29.760]if you have a lot of data,
[00:51:31.440]and so the deep learning methods
[00:51:33.750]may not have the same type of issues.
[00:51:38.220]I don't know if that's what you were thinking.
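The sampling idea in this answer, picking a batch that covers all classes rather than sampling at random, can be sketched as follows (illustrative only, not the group's active learning code; real acquisition functions use model uncertainty rather than the true labels used here):

```python
import numpy as np

# Toy sketch: class-aware batch selection versus a random baseline on an
# imbalanced pool, so every stress class ends up represented in the batch.
labels = np.repeat(np.arange(5), [500, 300, 100, 70, 30])  # imbalanced classes
rng = np.random.default_rng(0)

# Random baseline: a small batch may miss the rare classes entirely.
random_batch = rng.choice(labels.size, size=10, replace=False)

# Class-balanced acquisition: pick equally from each class.
balanced_batch = np.concatenate(
    [rng.choice(np.where(labels == c)[0], size=2, replace=False) for c in range(5)]
)

print("random batch classes:  ", sorted(labels[random_batch].tolist()))
print("balanced batch classes:", sorted(labels[balanced_batch].tolist()))
```

A model trained on the balanced batch has seen every class, which is the sense in which intelligent sampling makes it less myopic than one trained on a random draw from an imbalanced pool.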
[00:51:41.480]That makes sense.
[00:51:42.420]Yeah, thank you, George, please go ahead.
[00:51:49.340]Thanks Vikas, hey Danny.
[00:51:50.560]Thank you for coming.
[00:51:51.820]Good to see you. Too bad
[00:51:53.010]you can't be here in person, but maybe shortly.
[00:51:57.970]Thanks for the presentation.
[00:51:59.050]It was really interesting.
[00:52:00.070]I had a question related to the root imaging and root data.
[00:52:04.770]Just wondering, just a curiosity question.
[00:52:07.410]You were talking mostly
[00:52:08.243]about architecture and that's really interesting.
[00:52:10.260]I was wondering if you were able to evaluate any differences
[00:52:13.760]in growth rate and if there's any relationship
[00:52:16.480]between rate and architecture.
[00:52:20.900]Sure, thanks George.
[00:52:26.157]With the time series experiment
[00:52:28.290]with the ARIA software that I presented,
[00:52:32.000]the current methodology only works till about 15 days,
[00:52:38.750]and we can adapt the background paper
[00:52:44.070]that we are currently
[00:52:44.990]using to go beyond it, but it starts getting
[00:52:48.060]into that question where the cynic
[00:52:49.760]in field breeders like yourself and myself, George,
[00:52:53.180]starts thinking, how does it really relate
[00:52:56.100]to what happens in the field?
[00:52:58.270]So we are using those sorts
[00:53:01.130]of experiments to really understand what is happening
[00:53:04.510]in the architectural traits.
[00:53:07.890]And it's still in 2D, it's a two-dimensional setup.
[00:53:11.210]We tried the 3D, and in the early stages
[00:53:14.750]it does look like it can work,
[00:53:15.840]and in the later stages, where the root is very structurally robust,
[00:53:19.060]it works, but otherwise
[00:53:20.790]it just becomes like a blob.
[00:53:21.890]And we have some difficulty
[00:53:24.960]in getting it resolved because there's a lot
[00:53:27.112]of occlusion; it just becomes like a big blob.
[00:53:29.430]But in the time series that we did,
[00:53:32.020]up until that 15 days, yes,
[00:53:33.720]we were able to determine the rates
[00:53:37.470]for a number of traits, which gives us again
[00:53:40.230]an opportunity to at least start formulating some hypotheses
[00:53:43.690]that maybe we can test in the field: that certain genotypes
[00:53:46.450]have a differential rate in terms of their development.
[00:53:50.860]But then it comes back
[00:53:51.860]to how do we relate it back to the performance of the plant?
[00:53:55.820]So it's a very vast area, George,
[00:53:59.380]and I hope a lot more people can get
[00:54:02.410]into studying those root traits.
[00:54:05.350]Okay, thank you. Jim, please go ahead.
[00:54:09.300]Okay, thank you.
[00:54:10.400]I enjoyed the seminar, by the way, Danny,
[00:54:14.230]a nice overview of everything.
[00:54:15.870]I appreciate your willingness to take the time
[00:54:17.680]to present this year.
[00:54:19.410]I was amazed by one of your (indistinct):
[00:54:24.670]were you able to detect nodules at the V1 stage?
[00:54:30.240]Normally, when we talk about a field situation,
[00:54:33.470]they're very hard to see until about the V3 stage,
[00:54:36.860]and at the V1 stage, going by Fehr and Caviness,
[00:54:40.400]the first trifoliolate is just unfolding.
[00:54:46.140]So is your software that good
[00:54:49.650]that you can detect nodules that are barely
[00:54:55.440]visible to the human eye?
[00:54:56.900]I saw where they were,
[00:54:58.870]and I thought, man, that's pretty good detection.
[00:55:02.054]Can you comment on that?
[00:55:04.130]Yeah, Jim, thanks for making time.
[00:55:07.190]It's good to hear you.
[00:55:09.890]So these experiments, they were based
[00:55:15.680]on two different sites that we used.
[00:55:19.860]At one of the sites,
[00:55:24.050]all the care was taken to extract the roots
[00:55:28.120]at these stages and carefully count
[00:55:32.160]every single nodule that could be found.
[00:55:36.320]So I don't know why we didn't see much difference
[00:55:45.010]at that V stage, but obviously we did see some nodules.
[00:55:48.720]In this particular image-based software,
[00:55:52.530]if we can give it the labeled data,
[00:55:57.080]then it is able to recognize the nodules.
[00:55:59.160]And for each
[00:56:01.580]of the boxes that it draws automatically,
[00:56:03.400]once the model is trained,
[00:56:06.080]it does give a confidence level
[00:56:08.170]of whether or not the model feels that, yes,
[00:56:10.900]I feel comfortable that this actually is a nodule.
[00:56:15.250]We also, we were interested to see
[00:56:17.620]whether there is any mixing up with cysts,
[00:56:22.300]soybean cyst nematode, for some reason,
[00:56:24.970]and in the paper that hopefully will be coming out,
[00:56:28.320]we have covered that aspect as well.
[00:56:31.150]So I'll have to look into a little bit more
[00:56:37.150]the comment that they are not found
[00:56:40.940]at V1, but yes, at least in the imaging that we did,
[00:56:47.400]the models did get
[00:56:49.680]that labeled data fed into them.
[00:56:54.100]And we also had every sample go through two people.
[00:56:57.890]So one person did it initially, and the other person
[00:57:01.360]did the validation on the actual physical root.
[00:57:06.560]And if there was a mismatch,
[00:57:08.470]then a third person counted again.
[00:57:10.720]And then it was also done
[00:57:12.210]on the image that was taken of the sample.
[00:57:14.560]So I feel fairly confident that if it is counted
[00:57:22.360]or annotated as a nodule,
[00:57:26.360]it has gone through three people.
[00:57:29.460]Yeah, I don't have any doubts about the finding.
[00:57:31.966]I just thought it was kind of amazing, because with imagery
[00:57:35.460]you can probably detect things beyond what the human eye
[00:57:39.190]can detect.
[00:57:40.023]So you might be detecting even just a budding nodule
[00:57:43.090]that is pretty hard for the eye to see.
[00:57:45.047]And you answered my second question
[00:57:46.560]about the cysts. From this seminar I've learned very much.
[00:57:51.670]Yeah, thanks, Jim.
[00:57:52.560]I want to ask a question
[00:57:53.670]and this is probably unrelated to the research,
[00:57:56.940]but just in terms of graduate education
[00:57:59.440]I know that you've been interacting with students
[00:58:01.420]and you're also the director for graduate education.
[00:58:04.730]And just listening to the seminar today
[00:58:06.760]we were just talking about so many technologies,
[00:58:09.110]data-driven science, phenomics,
[00:58:12.630]and we still have the integral part of plant breeding teaching.
[00:58:15.880]So how is plant breeding education evolving in academia?
[00:58:20.130]And what are your thoughts on where it is headed?
[00:58:23.330]Yeah, that's a deep, deep question,
[00:58:25.130]because I think there are some breeders
[00:58:31.180]that are just, they're gifted,
[00:58:33.380]and George, a good friend of mine,
[00:58:35.960]he's one of those just gifted breeders.
[00:58:38.670]And they're probably some of the best breeders
[00:58:43.680]that are in the country, and the knowledge
[00:58:48.320]that field breeders bring, that is something
[00:58:54.450]that courses alone cannot give an up-and-coming breeder;
[00:59:00.430]they can't really get that breadth and depth of the knowledge
[00:59:06.040]that someone like George has.
[00:59:09.850]So these courses are helpful,
[00:59:12.000]but spending time with the experts
[00:59:15.500]makes a lot of difference to the up-and-coming breeders.
[00:59:20.940]And they're using some of the tools
[00:59:23.880]that George and I use; some of those tools
[00:59:26.510]are becoming more routine,
[00:59:29.730]like things with markers, marker-based selection,
[00:59:33.200]and even some of the drone-based things.
[00:59:36.100]But our students are also learning things that maybe George
[00:59:40.840]and I are not, that we're not experts in,
[00:59:43.410]things that are just so new, and they may come
[00:59:48.830]from a non-plant-science discipline.
[00:59:52.150]And that means that those students
[00:59:54.030]not only have to take courses in it,
[00:59:56.830]but they have to embed themselves
[00:59:58.800]in those other groups to learn the nuances
[01:00:03.450]of that particular discipline, understand it,
[01:00:05.890]and get the hard skills associated
[01:00:07.950]with that particular discipline, that particular topic.
[01:00:11.200]And then also being able to integrate;
[01:00:14.100]like, it's one thing to get the knowledge,
[01:00:15.680]but it's another thing to be able to integrate it
[01:00:17.560]into some actionable outcomes.
[01:00:19.690]And that's where the education component comes in.
[01:00:24.780]There may be a lot more co-advised students.
[01:00:28.260]They may have a major advisor,
[01:00:30.640]let's say I'm the major advisor,
[01:00:31.960]but then they have a co-advisor who may be in the college
[01:00:34.920]of engineering or maybe in computer science.
[01:00:38.380]So plant breeding is a fascinating discipline
[01:00:41.810]because it integrates every other discipline in itself.
[01:00:45.840]And that's why I think it's a great
[01:00:50.400]I'm not going to say an easy
[01:00:51.430]but I think it's just rightly placed
[01:00:54.700]to get more interdisciplinary aspects in the education.
[01:00:58.070]And that comes from the coursework, but also from spending time
[01:01:01.510]learning with some really capable researchers and leaders,
[01:01:06.708]and then also spending time outside of their discipline.
[01:01:11.230]So I think it takes a lot of effort
[01:01:18.550]to develop these breeders for the future,
[01:01:20.910]because some of the challenges that are coming our way,
[01:01:24.610]I know I'm not equipped to handle,
[01:01:30.780]maybe not to address those challenges,
[01:01:32.080]but I think the next generation of plant breeders,
[01:01:34.930]they will be able to, because they will be a better version
[01:01:37.530]of breeders than I can be.
[01:01:40.910]Sure, now that's a difficult question
[01:01:43.960]and not an easy one to answer, but thanks for sharing those thoughts.
[01:01:47.510]Thank you again for the great seminar.
[01:01:50.020]We really enjoyed it.