Image: Tesla
In recent years, companies have increasingly sought to develop autonomous driving systems. While others rely on Lidar, Tesla has taken its own path, aiming to create a "Smart Autopilot" that interprets the world in four dimensions: three of space, plus time. To achieve this, Tesla's hardware uses only eight cameras, a forward radar, and ultrasonic sensors, whose output is ultimately fused into surround video. This approach has a number of advantages that could help Tesla Autopilot drive like a Superhuman.
A person has only two eyes (effectively two cameras) and cannot simultaneously monitor everything around the car. As humans, we do our best to drive safely and avoid accidents, but we are already doing this about as well as a human can; in other words, we have reached our limit.
Autonomous driving technologies, on the other hand, can still be improved many times over, to the point where vehicles become nearly 100% safe. Tesla's cars already have what it takes to become Superhuman and make autonomous driving at least 10X safer than a human pilot.
Tesla is steadily moving all NNs to 8 camera surround video. This will enable superhuman self-driving.
— Elon Musk (@elonmusk) January 25, 2021
In February 2019, Tesla filed a patent application for "Generating ground truth for machine learning from time series elements," which discloses a training technique for producing highly accurate ground truth labels. In effect, it reveals the methodology behind the system's work in 4D. Tesla CEO Elon Musk tried to explain what 4D means:
"You're thinking about the world in three dimensions and the fourth dimension being time.
It's capable of things that if you just look - looking at things as individual pictures as opposed to video - basically, like you could go from like individual pictures to surround video. So, it's fundamental.
So, that architectural change, which has been underway for some time but has not really been rolled out to anyone in the production fleet, is what really matters for full self-driving."
Musk was trying to explain that when we look at separate images of an event, we see only isolated snapshots, but when we watch a video, we get a complete understanding of what happened and can assess it correctly and objectively. The same is true for Tesla's cars, which "see" in 4D: when Tesla's cameras capture images, the system combines them with time (the fourth dimension) to create surround video.
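To make the idea concrete, here is a minimal sketch of what "combining images with time" could look like as a data structure. The camera names, resolutions, and clip length below are assumptions chosen for illustration, not Tesla's actual pipeline:

```python
import numpy as np

# Hypothetical illustration: fusing per-camera image streams into one
# "surround video" input. Names, shapes, and the stacking scheme are
# assumptions for this sketch, not Tesla's real architecture.

CAMERAS = ["front_main", "front_wide", "front_narrow", "left_repeater",
           "right_repeater", "left_pillar", "right_pillar", "rear"]
HEIGHT, WIDTH = 96, 128  # per-camera resolution (downscaled for the sketch)
TIME_STEPS = 12          # consecutive frames per clip (assumed)

def make_surround_clip(frames_by_camera: dict) -> np.ndarray:
    """Stack 8 synchronized camera streams over time.

    frames_by_camera maps camera name -> array of shape (T, H, W, 3).
    Returns a tensor of shape (T, num_cameras, H, W, 3): the time axis
    is the "fourth dimension" on top of the 3D spatial scene.
    """
    streams = [frames_by_camera[name] for name in CAMERAS]
    return np.stack(streams, axis=1)  # (T, cameras, H, W, 3)

# Dummy arrays standing in for real video frames.
frames = {name: np.zeros((TIME_STEPS, HEIGHT, WIDTH, 3), dtype=np.uint8)
          for name in CAMERAS}
clip = make_surround_clip(frames)
print(clip.shape)  # (12, 8, 96, 128, 3)
```

A network trained on clips like this sees motion across all eight views at once, instead of judging each still image in isolation.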
This is the key to correctly recognizing and accounting for the trajectories of dynamically occluded objects, which is especially important in places with dense vehicle and pedestrian traffic. Tesla is rewriting all of its labeling software for 4D, and within a year (or so), Dojo (Tesla's supercomputer) will begin contributing to NN training, making FSD even safer.
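The core idea behind labeling from a time series is that what happens later in a clip can generate ground truth for earlier frames, including frames where an object was hidden. The sketch below illustrates that principle with simple linear interpolation; the function, data, and method are hypothetical stand-ins, not the technique disclosed in Tesla's patent:

```python
import numpy as np

# Illustrative sketch of "ground truth from a time series": when labeling
# offline, an object's positions observed before and after an occlusion
# can be interpolated to label the frames where it was hidden. Linear
# interpolation is an assumption made for this example.

def fill_occluded_track(times, positions):
    """Given per-frame (x, y) detections with NaN where the object was
    occluded, fill the missing positions from the visible sightings."""
    times = np.asarray(times, dtype=float)
    positions = np.asarray(positions, dtype=float)
    filled = positions.copy()
    for axis in range(positions.shape[1]):
        col = positions[:, axis]
        visible = ~np.isnan(col)
        # np.interp labels each occluded timestamp using its neighbors.
        filled[:, axis] = np.interp(times, times[visible], col[visible])
    return filled

# A pedestrian tracked across 6 frames, hidden behind a truck at t=2 and t=3.
times = [0, 1, 2, 3, 4, 5]
track = [(0.0, 0.0), (1.0, 0.5), (np.nan, np.nan),
         (np.nan, np.nan), (4.0, 2.0), (5.0, 2.5)]
print(fill_occluded_track(times, track))
# Frames 2 and 3 receive labels consistent with the full trajectory.
```

A labeler working frame by frame could never produce those two labels; only the time series as a whole makes them recoverable.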
© 2021, Eva Fox. All rights reserved.
_____________________________
We appreciate your readership! Please share your thoughts in the comment section below.
Article edited by @SmokeyShorts; you can follow him on Twitter.