Lightning McQueen Goes to School: Learning Based Autonomous Driving for F1TENTH Vehicles
Ethan Clark
Author
07/28/2021
Description
Intelligent and autonomous vehicle taught through supervised learning and reinforcement learning in a simulated environment.
- [00:00:06.140]Hello, my name is Ethan.
- [00:00:08.060]And the project that I'll be discussing today is titled Lightning McQueen Goes to
- [00:00:12.410]School: Learning-Based Autonomous Driving for F1TENTH Vehicles.
- [00:00:17.090]My supervisor for this project was Dr.
- [00:00:19.160]Tran from the Computer Science and Engineering department.
- [00:00:22.730]And before I go any further, I would like to point out, in the bottom right,
- [00:00:26.060]that this is the F1TENTH vehicle,
- [00:00:27.980]for those of you who are not familiar with it.
- [00:00:35.120]To provide some motivation for this project:
- [00:00:37.490]autonomous vehicles have enormous potential for decreasing vehicle-related
- [00:00:42.080]accidents. However, before they can become available to all consumers,
- [00:00:47.150]they must first guarantee a better safety approval rating than human drivers.
- [00:00:53.900]For some background into this problem,
- [00:00:56.030]autonomous driving research has been in the works for 70 to 80 years now.
- [00:01:00.680]However, there has still yet to be any
- [00:01:03.740]Level 5, or fully autonomous, vehicle.
- [00:01:08.780]Also, the advent of supervised learning,
- [00:01:11.480]along with reinforcement learning, in the modern era of computing
- [00:01:15.020]has produced a resurgence in the field of research for autonomous
- [00:01:19.010]driving.
- [00:01:20.570]Below we have five of the leading companies in the field,
- [00:01:25.010]starting from the left: General Motors' Cruise, Tesla
- [00:01:30.080]Autopilot, Google's Waymo,
- [00:01:33.230]Amazon's Aurora, and comma.ai.
- [00:01:38.380]The main contributions of this project were lane following and advanced emergency
- [00:01:43.120]braking. To begin with lane following: this just means the vehicle is able to stay
- [00:01:48.040]in its position within the lane. And with advanced emergency braking,
- [00:01:52.810]the agent is taught to perceive when there is going to be a crash.
- [00:01:57.760]And if it suspects it is going to crash, then it applies its brakes,
- [00:02:02.920]allowing it to stay safe.
- [00:02:07.780]Here, I have a short clip demonstrating the advanced emergency braking system in
- [00:02:12.310]action.
- [00:02:25.040]So this clip was done in the CARLA simulation environment.
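The braking behavior described above was learned by the agent, but the underlying idea can be illustrated with a minimal rule-based sketch: estimate the time to collision from the distance to an obstacle and the closing speed, and brake when it drops below a safety threshold. The function name, parameters, and threshold here are hypothetical, not taken from the project.

```python
def should_brake(distance_m, speed_mps, ttc_threshold_s=2.0):
    """Return True if the predicted time-to-collision falls below a threshold.

    A hypothetical rule-based stand-in for the learned braking policy:
    time-to-collision = distance to obstacle / closing speed.
    """
    if speed_mps <= 0:  # not closing on the obstacle, no need to brake
        return False
    time_to_collision = distance_m / speed_mps
    return time_to_collision < ttc_threshold_s
```

An RL agent, by contrast, learns when to trigger this behavior from reward feedback rather than from a hand-set threshold.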
- [00:02:35.380]The method used in this project was a Dueling Deep Q Network,
- [00:02:39.850]which is an extension of the Deep Q Network.
- [00:02:42.640]And this essentially takes in the state of an agent
- [00:02:46.420]and passes it through some linear layers.
- [00:02:48.700]A linear layer is a linear transformation:
- [00:02:51.520]you apply a linear transformation to the input values
- [00:02:54.730]and then output the linearly transformed values.
- [00:02:58.480]Then, on these linearly transformed values,
- [00:03:00.460]you apply an activation function, most commonly
- [00:03:03.700]a rectified linear unit. I'm not going to get into the details of this,
- [00:03:07.660]but it is very common in the field.
- [00:03:11.110]And then once the state has passed through all these layers,
- [00:03:13.930]the network outputs a Q value for each action.
- [00:03:17.080]The Q value essentially represents how advantageous it is to take that
- [00:03:21.220]action. So if you pass
- [00:03:23.320]in a state, and this state has, say, five actions,
- [00:03:28.060]you have five Q values corresponding to each action. And through this,
- [00:03:32.050]you're able to determine which action is the optimal action.
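The forward pass just described can be sketched in a few lines of numpy. This follows the standard dueling formulation, in which the network splits into a state-value stream and an advantage stream that are recombined into Q values; the layer sizes and the randomly initialized weights here are placeholder assumptions, not the project's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # rectified linear unit: max(0, x), applied element-wise
    return np.maximum(0.0, x)

# Hypothetical dimensions: a 4-dimensional state and 5 discrete actions.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 16, 5

# Randomly initialized weights stand in for trained parameters.
W1 = rng.normal(size=(STATE_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W_value = rng.normal(size=(HIDDEN, 1))        # state-value stream V(s)
W_adv = rng.normal(size=(HIDDEN, N_ACTIONS))  # advantage stream A(s, a)

def q_values(state):
    """Dueling Q-network forward pass: linear layer -> ReLU -> two streams."""
    h = relu(state @ W1 + b1)  # linear transformation + activation
    value = h @ W_value        # scalar value of being in this state
    advantage = h @ W_adv      # per-action advantage
    # Combine streams; subtracting the mean advantage keeps V and A identifiable.
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

state = rng.normal(size=STATE_DIM)
q = q_values(state)                # five Q values, one per action
best_action = int(np.argmax(q))    # the action with the highest Q value
```

Picking the action with the highest Q value is exactly the "determine which action is optimal" step from the explanation above.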
- [00:03:37.450]This chart
- [00:03:39.610]depicts the results from the experiment. On the X axis
- [00:03:43.690]you have the episodes; on the Y axis, you have the reward.
- [00:03:47.530]In blue we have the episodic reward, which is the
- [00:03:52.330]reward received for every episode of training. As you can see, in the beginning
- [00:03:56.980]the range is very large because the agent is not aware
- [00:04:01.010]which action is better, so it's just taking random actions in the beginning.
- [00:04:05.950]And then, as you can see, over time the range decreases.
- [00:04:10.210]Now, if you look at the average reward in red,
- [00:04:13.240]this is the average over the last 50 episodes.
- [00:04:16.030]So it's a very smooth increase as opposed to the very large range.
- [00:04:20.770]And by looking at the average reward, you're able to
- [00:04:26.020]better observe the agent as it learns and how it's able to
- [00:04:30.970]learn the optimal braking policy over time
- [00:04:33.910]by reducing its punishment and, in a sense, increasing its reward.
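The red curve described above is a trailing average: at each episode, average the rewards of the most recent 50 episodes. A minimal sketch of that smoothing (window size 50, as in the chart; the function name is an assumption):

```python
def running_average(rewards, window=50):
    """Average of the last `window` episodic rewards at each episode.

    For early episodes, before `window` rewards exist, average over
    everything seen so far.
    """
    out = []
    for i in range(len(rewards)):
        start = max(0, i - window + 1)
        out.append(sum(rewards[start:i + 1]) / (i + 1 - start))
    return out
```

This is why the red curve looks smooth: each point blends 50 noisy episodic rewards, washing out the large range of individual episodes.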
- [00:04:41.230]So now for some conclusions: this experiment,
- [00:04:45.190]or this project, demonstrated that the agent was able to learn the optimal
- [00:04:48.700]braking policy. And this was done through a reward function.
- [00:04:53.620]This reward function heavily punished the agent when it crashed,
- [00:04:58.030]and the punishment was increased with respect to its speed
- [00:05:02.710]prior to the crash. This has parallels in real life:
- [00:05:06.790]you can think about how, if you crash into a car,
- [00:05:09.190]then depending on how fast you're going prior to your crash,
- [00:05:12.100]you'll cause more damage. And this was captured with the reward function.
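A reward function in the spirit of this description might look like the sketch below: a crash penalty that scales with the pre-impact speed, and a small positive reward for each safe step. The function name, scale factor, and step reward are illustrative assumptions, not the project's actual values.

```python
def crash_reward(crashed, speed_mps, crash_scale=10.0, step_reward=1.0):
    """Hypothetical reward shaping in the spirit described above.

    A small positive reward is given for each safe step; a crash is
    punished in proportion to the speed just before impact, mirroring
    how faster crashes cause more real-world damage.
    """
    if crashed:
        return -crash_scale * speed_mps  # harsher punishment at higher speed
    return step_reward
```

Because faster crashes are punished more, the agent is pushed not only to avoid crashing but to shed speed before any unavoidable impact.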
- [00:05:17.620]The contributions made in this project are building blocks for
- [00:05:22.240]developing a fully autonomous driving system.
- [00:05:25.000]Lane following and advanced emergency braking are two of the most
- [00:05:28.240]fundamental capabilities,
- [00:05:29.140]because from those you're able to build on top and develop more features.
- [00:05:36.650]This was the poster that was used for this presentation,
- [00:05:41.330]which was developed from our project.
- [00:05:45.500]And these are my references and my acknowledgments.
- [00:05:48.380]I would like to thank the National Science Foundation for funding this project
- [00:05:51.410]and allowing this to be possible. And that is the end.
- [00:05:55.880]Thank you very much. And now I'm open for questions.