Transcriptome Analysis to Increase Soybean Oil- a Biofuel Source
Author: Pranav Palli
Added: 08/03/2021
Description
This experiment hopes to understand differentially expressed genes between multiple genetically altered lines of soybean plants. Our goal is to find the genes that are key to increased oil production. Feel free to email me for questions or concerns.
Email- ppalli2@huskers.unl.edu
Searchable Transcript
- [00:00:00.540]Hey everyone. I'm Pranav and I'm a rising junior at UNL
- [00:00:03.960]this upcoming fall, majoring in computer science. This summer, I worked under Dr.
- [00:00:08.460]Cheesing to conduct research to identify differentially expressed genes between
- [00:00:12.720]multiple genetically altered lines of soybean plants.
- [00:00:15.870]Let's dive into the background.
- [00:00:18.030]The main goal of this experiment was to isolate genes that increase the oil
- [00:00:21.150]content in soybean seeds, as soybean oil is a potential source of biofuel.
- [00:00:25.800]Plants capture carbon dioxide to synthesize macromolecules,
- [00:00:29.040]such as carbohydrates and lipids, and soybeans possessing an efficient pathway
- [00:00:33.060]for this is essential to increase oil content in their seeds.
- [00:00:36.930]While we know that links exist between pathways and oil production,
- [00:00:40.350]this experiment hopes to understand the relations between specific genes,
- [00:00:43.800]which is key to developing plants with better yield.
- [00:00:47.010]The experiment contains four lines of soybean plants, each with three replicas, as
- [00:00:51.120]shown in table one,
- [00:00:52.710]the experimental lines being the plants that have the WRI1 gene,
- [00:00:56.280]the WRI1 plus DGAT genes,
- [00:00:58.440]and the WRI1 plus cast two gene treatments edited into their genome.
- [00:01:03.030]It's worth noting that WRI1 is an enabler protein that increases lipid
- [00:01:07.110]generation,
- [00:01:07.920]hence why it is also being tested with the other genes to see its effects on them.
- [00:01:12.870]Each of these plants has its gene counts measured using a high-throughput RNA
- [00:01:16.980]sequencing tool at three distinct time points, R5, R5/6,
- [00:01:21.990]and R6, as shown here,
- [00:01:23.820]which are stages of seed development. Gene counts represent the number of times a
- [00:01:28.050]certain gene is present in the genome in parts per million.
- [00:01:31.590]These have varying impact on the plants. For instance,
- [00:01:34.470]a large count of an oil production
- [00:01:36.480]gene means that more oil is produced, and vice versa.
- [00:01:40.440]Our goal is to find genes with counts that have large differences between
- [00:01:44.340]genotypes. Each sample of obtained data follows a strict naming convention,
- [00:01:48.720]such as B3_2, where B represents the time period,
- [00:01:53.160]the 3 represents the genetic treatment, and the 2 represents the replica
- [00:01:56.970]number. Second, let's take a look at our methods.
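The naming convention described above can be split into its parts mechanically. What follows is a minimal sketch in Python, not code from the project: the helper `parse_sample_name` and its field names are hypothetical, and because the talk does not say which letter corresponds to which time point or which number to which treatment, the codes are kept as-is rather than translated.

```python
import re
from typing import NamedTuple


class SampleName(NamedTuple):
    time_code: str   # letter for the time period, e.g. "B"
    treatment: int   # number for the genetic treatment, e.g. 3
    replicate: int   # replica number, e.g. 2


# One letter, one or more digits, an underscore, one or more digits.
_PATTERN = re.compile(r"([A-Za-z])(\d+)_(\d+)")


def parse_sample_name(name: str) -> SampleName:
    """Split a name such as 'B3_2' into its three parts (hypothetical helper)."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected sample name: {name!r}")
    letter, treatment, replicate = match.groups()
    return SampleName(letter.upper(), int(treatment), int(replicate))


sample = parse_sample_name("B3_2")
print(sample)
```

Rejecting names that do not match the pattern, instead of guessing, is a deliberate choice here: a mislabeled sample would otherwise silently end up in the wrong comparison group.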
- [00:02:00.690]The raw counts obtained from the experiment are shown in figure one. We, however,
- [00:02:05.040]cannot compare these for two main reasons. Reason number one:
- [00:02:08.430]the gene counts for each sample could have been acquired from different parts of
- [00:02:11.700]the plant, thus unbalancing the whole sample itself. For example,
- [00:02:15.960]the gene reads between a leaf and a stem
- [00:02:18.000]would obviously be very different, as their tissues are different.
- [00:02:21.510]And reason number two:
- [00:02:24.660]the data is so large that even a small p-value would still lead to huge amounts
- [00:02:28.890]of false positives. For instance,
- [00:02:31.020]if we're experimenting with 50,000 genes and have a p-value of
- [00:02:35.550]0.01, which is conventionally very good,
- [00:02:37.830]that result still has 500 false positives, which is not great. To combat this,
- [00:02:42.360]we use Bioconductor's DESeq2 package,
- [00:02:45.000]which uses logarithmic normalization methods to compare the genes.
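The false-positive arithmetic in this part of the talk can be checked in a few lines of Python. This is only a sketch under a simplifying assumption that is not stated in the talk: that none of the genes truly differ, so every gene passing the cutoff is a false positive. The function name is hypothetical.

```python
def expected_false_positives(n_genes: int, p_cutoff: float) -> int:
    """Expected number of genes passing the cutoff by chance alone.

    Assumes no gene truly differs, so each test passes with probability p_cutoff.
    """
    return round(n_genes * p_cutoff)


# With 50,000 genes, a cutoff of 0.01 gives the 500 false positives quoted above.
print(expected_false_positives(50_000, 0.01))
```

The count scales with the number of genes tested, which is why a cutoff that looks strict for a single test is not strict enough for a whole transcriptome.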
- [00:02:48.870]Logarithmic methods are much better because they reduce the variability in the
- [00:02:52.260]data. Figure two does a great job explaining this. Here,
- [00:02:56.010]we're comparing the log of all the gene counts of three samples:
- [00:02:59.890][inaudible], A1_3,
- [00:03:01.780]and B4_2. The more clustered a graph between two samples is, the more similar
- [00:03:06.730]the two samples are.
- [00:03:08.050]So it makes sense that the [inaudible] samples are closer, as seen on
- [00:03:13.000]the left in figure 2A.
- [00:03:15.910]These are closer than figure 2B, which contains the comparison between
- [00:03:19.240][inaudible] and B4_2,
- [00:03:21.250]because the former is a comparison between replicas while the latter is a
- [00:03:25.240]comparison between two different samples at two different time points.
- [00:03:28.960]If you compared any sample to itself,
- [00:03:30.730]it would be a perfect line across the diagonal.
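The idea of comparing log-transformed counts between samples can be illustrated with a small Python sketch. It is not DESeq2's internal procedure: the `+ 1` pseudocount, the use of Pearson correlation as a stand-in for how "clustered" a scatter plot looks, and all of the counts are assumptions made up for illustration.

```python
import math


def log2_counts(counts):
    """Log-transform raw counts; the +1 pseudocount (an assumption) avoids log(0)."""
    return [math.log2(c + 1) for c in counts]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up counts for five genes, for illustration only (not the experiment's data).
sample = [0, 12, 150, 3000, 42000]
replicate_like = [1, 10, 170, 2800, 45000]   # behaves like a replica of `sample`
other_like = [30, 400, 5, 90, 2000]          # behaves like a different condition

logged = log2_counts(sample)
print(pearson(logged, logged))                       # a sample against itself
print(pearson(logged, log2_counts(replicate_like)))  # against a replica-like sample
print(pearson(logged, log2_counts(other_like)))      # against a dissimilar sample
```

A sample compared with itself falls exactly on the diagonal, so its correlation is 1; the replica-like sample stays close to that, while the dissimilar one scatters much further from the diagonal.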
- [00:03:34.270]In addition to logarithmic normalization, DESeq2
- [00:03:37.210]also utilizes a design formula, as shown in figure three. Here,
- [00:03:41.350]it controls for the time values across samples while analyzing the differentially
- [00:03:45.660]expressed genes among different genotypes.
- [00:03:48.130]The design formula is the third parameter of the function shown in figure
- [00:03:51.880]three, where the first value, time, is being controlled for,
- [00:03:55.390]and the second value, genotype, is the one being tested for. Each of these
- [00:03:59.290]comparisons has two groups of samples, one control
- [00:04:02.260]and one experimental, and their designations are as shown in table two.
- [00:04:06.460]We've isolated the samples so that in each comparison there's only one
- [00:04:09.700]independent variable that we're testing for. DESeq2
- [00:04:13.690]also uses an adjusted p-value calculation through the Benjamini-Hochberg
- [00:04:18.070]method that is way more accurate than the regular one,
- [00:04:20.530]thus reducing the number of false positives. Third, let's discuss the results.
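For readers unfamiliar with the p-value adjustment mentioned above, here is a minimal sketch of the Benjamini-Hochberg procedure in Python. It is written from the standard textbook definition of the method, not taken from DESeq2's source, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never increase
    # as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by the number of tests divided by its rank, so the adjusted value is never smaller than the raw one, and genes that only barely pass a raw cutoff tend to fall out once the adjustment is applied.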
- [00:04:25.090]The results from running the data through the DESeq2 model are shown in
- [00:04:28.810]figure four. For the purpose of this presentation,
- [00:04:31.540]we will be discussing the comparisons between the wild type and the WRI1 gene
- [00:04:35.350]treatments, although the same methods were applied to all other treatments.
- [00:04:39.250]There are two important values that we will focus on for each gene.
- [00:04:43.030]Those are the log2 fold change and the adjusted p-value.
- [00:04:46.150]The log2 fold change is the difference between the logs of gene counts and tells
- [00:04:49.960]us how different two genes are.
- [00:04:51.940]I.e., if the log2 fold change for gene X is four,
- [00:04:55.600]that means the WRI1 sample has two to the power of four times
- [00:04:59.770]as many reads of that gene as a wild type sample does. The adjusted p-value is
- [00:05:04.540]calculated internally so that it is more accurate than the original p-value,
- [00:05:08.350]as I mentioned before.
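The relationship between a log2 fold change and the ratio of reads can be shown directly. The Python sketch below is a simplification: DESeq2 estimates fold changes from a fitted model on normalized counts, whereas this just takes the difference of two logs. The function names and the counts of 1600 and 100 are hypothetical.

```python
import math


def log2_fold_change(treated_count: float, control_count: float) -> float:
    """Difference between the log2 of two (nonzero) counts."""
    return math.log2(treated_count) - math.log2(control_count)


def fold_from_log2(lfc: float) -> float:
    """Convert a log2 fold change back into a plain ratio of reads."""
    return 2 ** lfc


print(fold_from_log2(4))            # 16
print(log2_fold_change(1600, 100))  # a 16-fold difference in made-up counts
```

So a log2 fold change of four corresponds to sixteen times as many reads in the treated sample as in the control.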
- [00:05:10.570]We have sorted the results from figure four to only show genes with a log2
- [00:05:13.810]fold change that is greater than 0.7 and an adjusted p-value that is less than
- [00:05:17.800]0.005.
- [00:05:19.360]This way we can ensure that only the most significantly different genes
- [00:05:22.630]and the ones with the highest probability of being true positives are
- [00:05:25.480]considered.
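The two thresholds described above amount to a simple filter over the results table. The Python sketch below uses hypothetical names (`GeneResult`, `is_significant`) and placeholder rows rather than anything from the experiment. It applies the cutoffs exactly as stated in the talk, so it keeps only genes with a positive log2 fold change above 0.7; catching strongly decreased genes as well would require comparing the absolute value, which the talk does not mention.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene_id: str
    log2_fold_change: float
    padj: float  # adjusted p-value


def is_significant(result: GeneResult,
                   lfc_cutoff: float = 0.7,
                   padj_cutoff: float = 0.005) -> bool:
    """Keep a gene only if it passes both thresholds quoted in the talk."""
    return result.log2_fold_change > lfc_cutoff and result.padj < padj_cutoff


# Placeholder rows for illustration only; not results from the experiment.
results = [
    GeneResult("gene_a", 4.0, 0.0001),
    GeneResult("gene_b", 0.5, 0.0001),  # fold change too small
    GeneResult("gene_c", 2.1, 0.03),    # adjusted p-value too large
]

significant = [r.gene_id for r in results if is_significant(r)]
print(significant)
```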
- [00:05:26.440]We then use SoyBase, an online soybean genomics tool, to analyze these
- [00:05:30.580]significant genes to find their biological function.
- [00:05:33.430]A pie chart is displayed in figure five.
- [00:05:36.730]The three biggest parts of this pie chart are the carbohydrate/lipid
- [00:05:40.210]synthesis and transport,
- [00:05:42.160]which corroborates the idea that WRI1 is responsible for increased oil
- [00:05:46.090]production and maintenance, among other functions. To conclude, differential
- [00:05:50.890]gene analysis in soybeans can help us find the intimate relations that each gene has
- [00:05:55.330]with the overall plant and each other.
- [00:05:57.980]The results that we have displayed show the way that genes impact several
- [00:06:01.820]others. Another aspect of differential analysis that we can explore in future
- [00:06:05.720]experiments is how the genes change over time periods and how that varies across
- [00:06:10.340]different genotypes.
- [00:06:11.630]This can help us construct better plants and better plant lines that can have
- [00:06:15.350]increased oil yield in the future. Finally,
- [00:06:18.410]I would like to acknowledge the Nebraska Center for Energy Sciences Research for
- [00:06:22.280]directly supporting this research. I would also like to thank Dr.
- [00:06:25.280]Cheesing for mentoring and guiding me through this project.
- [00:06:28.460]Thank you everyone for listening to this presentation,
- [00:06:30.410]and I hope you have a great day.
- Tags:
- genes