Identifying, Analyzing, and Using Discriminatory Variables for Classification of Neutrino Signal and Background Noise in Multivariate Analysis in the Askaryan Radio Array Experiment
Author: Jesse Osborn
Added: 04/05/2021
Description
Developing a noise filter for data in the Askaryan Radio Array Experiment by developing discriminatory variables based on the physical characteristics of the data.
Searchable Transcript
- [00:00:01.940]Hi, my name is Jesse Osborn.
- [00:00:03.990]And today I'm gonna be talking about my UCARE project,
- [00:00:07.050]Identifying, Analyzing and Using Discriminatory Variables
- [00:00:12.820]for Classification of Neutrino Signal
- [00:00:15.190]and Background Noise in Multivariate Analysis
- [00:00:17.570]in the Askaryan Radio Array Experiment.
- [00:00:21.320]So the Askaryan Radio Array experiment
- [00:00:23.750]is a neutrino detector built near the South Pole,
- [00:00:28.640]which operates by trying to record radio wave signals
- [00:00:35.110]that occur when high energy neutrinos come from outer space
- [00:00:38.480]and interact with ice molecules at the South Pole.
- [00:00:41.900]So the experiment is made up of five stations.
- [00:00:46.380]Each of which has 16 radio wave antennas
- [00:00:49.730]buried about 200 meters deep into the ice.
- [00:00:53.860]So the Askaryan Radio Array experiment, or ARA for short,
- [00:00:56.960]records millions of events per year, but only a handful
- [00:01:02.180]of these events are expected to come from actual neutrinos
- [00:01:06.250]from outer space that the experiment is trying to analyze.
- [00:01:11.240]The rest of them are expected to come from background noise,
- [00:01:16.910]from the instrumentation, electronics or construction
- [00:01:20.560]going on at the South Pole, that sort of thing.
- [00:01:24.000]So my goal with this research was to eliminate
- [00:01:27.790]some of this background noise.
- [00:01:29.810]Now, there currently exist three filters for the experiment
- [00:01:34.450]that already work to get rid of background noise,
- [00:01:37.490]based on the physical characteristics
- [00:01:40.510]of an actual neutrino signal that we're trying to measure.
- [00:01:44.390]So the first filter is the hit filter
- [00:01:47.620]which just requires that four of the 16 antennas
- [00:01:50.800]for a given station record something for a given event.
- [00:01:56.560]The second filter is the time sequence filter
- [00:01:59.460]which just requires that the hits these antennas
- [00:02:02.640]are recording come in an order that resembles a planar wave
- [00:02:08.580]moving across the antennas from a single point source.
- [00:02:13.110]So wherever a neutrino would have interacted
- [00:02:15.310]with an ice molecule.
- [00:02:17.770]And the third filter is the arrival angle filter
- [00:02:21.140]which restricts the stations to only recording stuff
- [00:02:25.110]that comes from within the ice and not from above the ice.
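The first of the three filters described above is simple enough to sketch in a few lines. This is only an illustration, not the ARA software: the function name `passes_hit_filter`, the boolean-flag input format, and the reading of "four of the 16 antennas" as "at least four" are all assumptions.

```python
from typing import Sequence

N_ANTENNAS = 16  # antennas per station, as stated in the talk
MIN_HITS = 4     # hit-filter requirement, as stated in the talk


def passes_hit_filter(hit_flags: Sequence[bool], min_hits: int = MIN_HITS) -> bool:
    """Return True if at least `min_hits` antennas recorded a hit.

    `hit_flags` is a hypothetical per-antenna list of booleans for one event;
    "at least" is an assumed reading of the requirement described in the talk.
    """
    if len(hit_flags) != N_ANTENNAS:
        raise ValueError(f"expected {N_ANTENNAS} antenna flags, got {len(hit_flags)}")
    return sum(bool(flag) for flag in hit_flags) >= min_hits


# Usage: an event with hits on four antennas passes, one with three does not.
four_hits = [True] * 4 + [False] * 12
three_hits = [True] * 3 + [False] * 13
print(passes_hit_filter(four_hits), passes_hit_filter(three_hits))
```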
- [00:02:30.460]So these three filters get rid of about 99% of all the data
- [00:02:39.180]that a given station might record for a single year.
- [00:02:42.810]And for this particular experiment
- [00:02:44.890]or for my particular research, I worked with 10% of the data
- [00:02:49.770]taken by a single station for a given year
- [00:02:52.590]which was about 10,400,000 events.
- [00:02:55.730]And after these three filters had done their work
- [00:02:58.610]on that number of events, we're down to about 2,700 events
- [00:03:03.490]which I am taking for what I'm talking about today
- [00:03:06.870]to be a background sample of events.
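As a quick check on the numbers quoted above, the surviving and rejected fractions follow directly from the two event counts. The counts are the approximate figures given in the talk; the variable names are only for illustration.

```python
# Approximate event counts quoted in the talk.
total_events = 10_400_000   # 10% of one station's data for one year
surviving_events = 2_700    # events left after the three existing filters

surviving_fraction = surviving_events / total_events
rejected_fraction = 1.0 - surviving_fraction

print(f"surviving: {surviving_fraction * 100:.3f}%")  # surviving: 0.026%
print(f"rejected:  {rejected_fraction * 100:.3f}%")   # rejected:  99.974%
```

Taken at face value, these counts mean the existing filters remove well over the "about 99%" mentioned earlier, yet a few thousand background events still remain in this sample.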
- [00:03:09.980]And I'm comparing these background events
- [00:03:13.470]because almost none of these are expected
- [00:03:15.870]to be actual neutrino events with simulated neutrino events
- [00:03:20.750]from a particular software package
- [00:03:22.530]that another student developed.
- [00:03:24.610]And I'm trying to draw distinctions
- [00:03:27.530]between these two sets of events so I can get rid
- [00:03:30.330]of all the remaining background events.
- [00:03:34.720]So in particular, here's an example of an event
- [00:03:38.750]that made it through all three of the existing filters.
- [00:03:42.270]And here's an example of a simulated neutrino signal event.
- [00:03:46.560]So, you know, looking at these two you can see
- [00:03:48.790]there are a lot of physical characteristics
- [00:03:51.240]that are different between them.
- [00:03:53.550]And my goal was to try and build a collection of variables
- [00:03:59.620]that can discriminate between these two different sets
- [00:04:02.730]of data and plug these variables into what's called TMVA,
- [00:04:08.370]which is just a computer algorithm that can take a bunch
- [00:04:12.610]of input variables and turn them
- [00:04:14.370]into a single output variable that's very good
- [00:04:17.550]for distinguishing between actual signal
- [00:04:21.040]and background noise.
- [00:04:23.160]And then I wanted to see if discrimination
- [00:04:26.010]on this new variable could be effectively used
- [00:04:28.890]for a new filter for the ARA data.
- [00:04:33.940]So for my research, instead of working directly
- [00:04:38.250]with the radio wave forms as they exist,
- [00:04:42.950]I would run an envelope over the wave forms
- [00:04:45.960]which is defined up here in the top right.
- [00:04:49.220]Just taking the data points and sort of smoothing them out
- [00:04:52.390]and making them all positive as you can see here
- [00:04:54.840]in this figure.
- [00:04:56.260]And then by breaking the wave form up
- [00:04:58.080]into four different quadrants and using
- [00:05:00.650]the two smallest quadrants, which would be green
- [00:05:04.090]and orange here, we can define a threshold level
- [00:05:09.330]for interesting stuff in the wave form as I'll call it.
- [00:05:14.190]So, as an example of this on an actual wave form,
- [00:05:18.260]you can see here in red is an envelope that's been drawn
- [00:05:21.560]and a line has been drawn at the level
- [00:05:23.580]of what I call a threshold.
- [00:05:26.090]And anything above this threshold is something interesting
- [00:05:29.100]in the wave form that I wanna look at.
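The envelope-and-threshold idea described above can be sketched as follows. The exact definitions were shown on a slide and are not in the transcript, so everything specific here is an assumption: the envelope is taken to be a moving average of the absolute value, the "quadrants" are taken to be four equal time segments, the "two smallest" are the two with the lowest mean envelope, and the threshold is their mean plus a hypothetical `n_sigma` standard deviations.

```python
import random
from statistics import mean, pstdev
from typing import List, Sequence


def envelope(samples: Sequence[float], window: int = 5) -> List[float]:
    """Make the waveform positive, then smooth it with a centered moving average."""
    rectified = [abs(x) for x in samples]
    half = window // 2
    smoothed = []
    for i in range(len(rectified)):
        lo = max(0, i - half)
        hi = min(len(rectified), i + half + 1)
        smoothed.append(sum(rectified[lo:hi]) / (hi - lo))
    return smoothed


def threshold_from_quiet_quarters(env: Sequence[float], n_sigma: float = 3.0) -> float:
    """Estimate a threshold from the two quarters with the smallest mean envelope."""
    quarter = len(env) // 4
    if quarter == 0:
        raise ValueError("need at least four samples")
    quarters = [list(env[i * quarter:(i + 1) * quarter]) for i in range(3)]
    quarters.append(list(env[3 * quarter:]))  # last quarter takes any remainder
    quietest_two = sorted(quarters, key=mean)[:2]
    baseline = [x for q in quietest_two for x in q]
    return mean(baseline) + n_sigma * pstdev(baseline)


# Usage on a synthetic waveform: low-level noise plus a short pulse.
rng = random.Random(1)
waveform = [rng.gauss(0.0, 0.1) for _ in range(40)]
for i in range(12, 18):
    waveform[i] += 2.0 * (-1) ** i  # hypothetical pulse, much larger than the noise

env = envelope(waveform)
thr = threshold_from_quiet_quarters(env)
above = [i for i, value in enumerate(env) if value > thr]
print(f"threshold = {thr:.3f}, samples above threshold: {above}")
```

Using the quietest parts of the waveform to set the level means the threshold tracks the noise floor of each individual waveform rather than relying on one fixed value.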
- [00:05:31.860]So using this definition, I developed a bunch of variables
- [00:05:36.850]some of which you can see here,
- [00:05:39.770]and calculated these variables for both signal
- [00:05:43.420]and background events to create these
- [00:05:46.060]different distributions of the variables and their values
- [00:05:50.610]for the two different samples,
- [00:05:52.820]blue being the signal and red being the background samples.
- [00:05:56.470]And you can see from these different plots
- [00:05:58.550]that some of them have more or less disagreement
- [00:06:03.550]between signal and background samples.
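One way to put a number on "more or less disagreement" between two distributions is the separation measure defined in the TMVA documentation, which is 0 for identical distributions and 1 for distributions with no overlap. The talk does not say whether this measure was used; the sketch below is a plain-Python version for binned data, with hypothetical function names and toy values.

```python
from typing import List, Sequence


def histogram_fractions(values: Sequence[float], lo: float, hi: float, n_bins: int) -> List[float]:
    """Fraction of in-range values falling in each of `n_bins` equal-width bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values inside the histogram range")
    return [c / total for c in counts]


def separation(signal: Sequence[float], background: Sequence[float],
               lo: float, hi: float, n_bins: int = 20) -> float:
    """Binned separation: 0.5 * sum over bins of (s - b)^2 / (s + b)."""
    s = histogram_fractions(signal, lo, hi, n_bins)
    b = histogram_fractions(background, lo, hi, n_bins)
    return 0.5 * sum((si - bi) ** 2 / (si + bi) for si, bi in zip(s, b) if si + bi > 0)


# Usage with toy values of one hypothetical variable.
signal_values = [0.1, 0.2]
background_values = [0.8, 0.9]
print(separation(signal_values, signal_values, 0.0, 1.0, n_bins=10))      # identical samples
print(separation(signal_values, background_values, 0.0, 1.0, n_bins=10))  # non-overlapping samples
```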
- [00:06:06.810]But by using this TMVA computer algorithm,
- [00:06:10.870]I can merge all of this information together
- [00:06:13.930]into one very effective variable
- [00:06:19.020]that I can use for discrimination.
- [00:06:21.890]Here are some of the other variables that I worked with.
- [00:06:24.960]So by taking all of these different distributions
- [00:06:30.450]and plugging them into a computer algorithm,
- [00:06:32.780]I can develop a single output variable which is seen here.
- [00:06:37.840]Again with the background in red and the signal in blue.
- [00:06:41.350]And here you can see there's a much wider separation
- [00:06:44.720]between the two samples.
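To illustrate how several input variables can be merged into one output variable, here is a minimal Fisher linear discriminant for two variables. TMVA offers a number of methods and the talk does not say which one was used, so this is a stand-in technique, not the method from the project. The toy data points and all names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _mean_vec(points: Sequence[Point]) -> Point:
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def _scatter(points: Sequence[Point], mu: Point) -> Tuple[float, float, float]:
    """Elements (sxx, sxy, syy) of the 2x2 scatter matrix about `mu`."""
    sxx = sum((x - mu[0]) ** 2 for x, _ in points)
    sxy = sum((x - mu[0]) * (y - mu[1]) for x, y in points)
    syy = sum((y - mu[1]) ** 2 for _, y in points)
    return sxx, sxy, syy


def fit_fisher(signal: Sequence[Point], background: Sequence[Point]) -> Tuple[Point, float]:
    """Return weights w = W^-1 (mu_s - mu_b) and an offset that centers the output."""
    mu_s, mu_b = _mean_vec(signal), _mean_vec(background)
    s, b = _scatter(signal, mu_s), _scatter(background, mu_b)
    sxx, sxy, syy = s[0] + b[0], s[1] + b[1], s[2] + b[2]  # within-class scatter W
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mu_s[0] - mu_b[0], mu_s[1] - mu_b[1]
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    # Offset places zero halfway between the projected class means.
    offset = w[0] * (mu_s[0] + mu_b[0]) / 2 + w[1] * (mu_s[1] + mu_b[1]) / 2
    return w, offset


def fisher_output(point: Point, w: Point, offset: float) -> float:
    """Single output variable: positive values are more signal-like."""
    return w[0] * point[0] + w[1] * point[1] - offset


# Usage with toy values of two hypothetical input variables.
signal_pts: List[Point] = [(2.0, 1.0), (3.0, 2.0), (2.5, 1.2), (3.5, 1.8)]
background_pts: List[Point] = [(0.0, 0.1), (0.5, -0.2), (1.0, 0.3), (0.2, 0.0)]
weights, offset = fit_fisher(signal_pts, background_pts)
signal_out = [fisher_output(p, weights, offset) for p in signal_pts]
background_out = [fisher_output(p, weights, offset) for p in background_pts]
print(signal_out, background_out)
```

With this choice of offset, the projected signal mean lands above zero and the projected background mean below it, so a cut near zero is a natural first guess.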
- [00:06:46.440]So any simple restriction on this variable,
- [00:06:49.920]say you want your events to have this variable
- [00:06:55.500]be greater than zero, then you could cut all
- [00:06:58.110]of this background in red over here on the left of the plot
- [00:07:01.250]and keep what appears to be all of your signal in blue
- [00:07:04.530]on the right of the plot.
- [00:07:06.500]And to just quantify this a bit further,
- [00:07:09.400]if you did perform these linear cuts
- [00:07:12.390]on this distribution plot, you would see
- [00:07:16.820]at one particular value, if you were willing to give up 4.9%
- [00:07:22.000]of your total signal, you could remove upwards of 99.75%
- [00:07:27.420]of all the background events.
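The trade-off quoted above, signal given up versus background removed at a particular cut value, can be computed from the output variable like this. The scores are toy numbers, not the ARA results, and the function name is hypothetical; events are kept when their score is strictly greater than the cut.

```python
from typing import Sequence, Tuple


def cut_performance(signal_scores: Sequence[float],
                    background_scores: Sequence[float],
                    cut: float) -> Tuple[float, float]:
    """Return (fraction of signal lost, fraction of background rejected) for `score > cut`."""
    if not signal_scores or not background_scores:
        raise ValueError("both samples must be non-empty")
    signal_lost = sum(s <= cut for s in signal_scores) / len(signal_scores)
    background_rejected = sum(b <= cut for b in background_scores) / len(background_scores)
    return signal_lost, background_rejected


# Usage with toy scores: one signal event falls below the cut, one background event survives.
toy_signal = [0.9, 0.7, 0.4, 0.2, -0.1]
toy_background = [-0.8, -0.5, -0.3, 0.1]
lost, rejected = cut_performance(toy_signal, toy_background, cut=0.0)
print(f"signal lost: {lost:.0%}, background rejected: {rejected:.0%}")
# signal lost: 20%, background rejected: 75%
```

Scanning `cut` over a range of values and recording both fractions at each step gives the kind of table from which a working point such as the one quoted in the talk can be read off.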
- [00:07:29.640]So it appears that by building a bunch of variables
- [00:07:37.040]on individual wave forms for background and signal
- [00:07:40.600]and plotting the distributions of these variables
- [00:07:43.490]for a signal sample and a background sample,
- [00:07:46.220]we can merge all these distributions into one
- [00:07:49.310]very efficient variable to cut out a lot of background noise
- [00:07:54.350]while keeping all of our good neutrino signal.
- [00:07:58.590]So in essence, this was my research project for ARA.
- [00:08:05.050]ARA already has three filters that have been developed
- [00:08:07.710]to get rid of background noise,
- [00:08:09.720]and they've done a lot of work,
- [00:08:11.860]but there's still quite a bit of work left to be done.
- [00:08:15.840]So I developed some discrimination variables
- [00:08:18.550]and by using all of these variables
- [00:08:21.210]and merging them together with a computer algorithm,
- [00:08:25.280]I was able to create a single highly efficient variable
- [00:08:30.950]for background rejection, greater than 90%
- [00:08:34.690]background rejection in the samples that I looked at.
- [00:08:39.230]And my hopes going forward are to develop this process
- [00:08:44.410]into a fourth data filter for the ARA experiment.
- [00:08:49.890]So I would like to thank my advisor, Dr. Ilya Kravchenko,
- [00:08:53.640]who I've worked with for the past three years.
- [00:08:57.810]I'd also like to thank UCARE
- [00:08:59.200]for funding all of this research.
- [00:09:01.200]I'd like to thank a former UNL student Andrew Schultz,
- [00:09:03.970]who helped develop some of the software for this project.
- [00:09:07.080]And I'd also like to thank the ARA collaboration
- [00:09:10.230]and the UNL Physics Department's High Energy Group
- [00:09:13.020]for giving me valuable feedback
- [00:09:15.520]during this research project.