
Getting started with deep learning 12 years ago: Karpathy set off a wave of memories of the AlexNet era, and LeCun, Goodfellow, and others all joined in

王林 | Released: 2024-07-16 | Original
Unexpectedly, 12 years have passed since the deep learning revolution started by AlexNet in 2012.

And now, we have also entered the era of large models.

Recently, a post by the well-known AI research scientist Andrej Karpathy sent many of the big names behind this wave of the deep learning revolution down memory lane. From Turing Award winner Yann LeCun to Ian Goodfellow, the father of GANs, they all reminisced about the old days.

This post has 630,000+ views so far.


In the post, Karpathy noted an interesting fact: many people have heard of the 2012 ImageNet/AlexNet moment and the deep learning revolution it kicked off, but far fewer know that the code behind that competition's winning entry was written from scratch by Alex Krizhevsky in CUDA/C++. The repository was called cuda-convnet, and it was hosted on Google Code at the time:


https://code.google.com/archive/p/cuda-convnet/

Karpathy recalled that Google Code has (he believes) since been shut down, but he found newer versions on GitHub that other developers built on top of the original code, for example:


https://github.com/ulrichstern/cuda-convnet

"AlexNet is one of the earliest famous examples of using CUDA for deep learning." Karpathy recalled that it is precisely because of the use of CUDA and GPU that AlexNet can process such large-scale data (ImageNet) and achieve great results on image recognition tasks. Such a great performance. "AlexNet not only simply uses GPUs, but is also a multi-GPU system. For example, AlexNet uses a technology called model parallelism to divide the convolution operation into two parts and run them on two GPUs respectively."

Karpathy reminded everyone: this was 2012! "In 2012 (about 12 years ago), most deep learning research was done in Matlab, running on CPUs, endlessly iterating on learning algorithms, network architectures, and optimization ideas over toy-scale datasets," he wrote. But Alex, Ilya, and Geoff, the authors of AlexNet, did something completely at odds with the mainstream research style of the time: "stop obsessing over algorithm details, take a fairly standard convolutional neural network (ConvNet), make it very large, train it on a large-scale dataset (ImageNet), and implement the whole thing in CUDA/C++."

Alex Krizhevsky wrote all the code directly in CUDA and C++, including basic deep learning operations such as convolution and pooling. This approach was highly innovative and challenging, demanding a deep understanding of the algorithms, the hardware architecture, and the programming languages involved.
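
For a flavor of what writing these operations from scratch involves, here is a naive valid-mode 2D convolution in CUDA, with one thread per output pixel. This is a minimal sketch for illustration only, not code from cuda-convnet, whose kernels are far more aggressively optimized (shared-memory tiling, register blocking, and so on):

```cuda
// Minimal illustrative sketch: naive valid-mode 2D convolution,
// one thread per output pixel. Not cuda-convnet's actual kernel.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void conv2d_valid(const float* in, int inH, int inW,
                             const float* filt, int k,
                             float* out, int outH, int outW) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // output column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // output row
    if (x >= outW || y >= outH) return;
    float acc = 0.f;
    for (int fy = 0; fy < k; ++fy)                  // slide the k x k filter
        for (int fx = 0; fx < k; ++fx)
            acc += in[(y + fy) * inW + (x + fx)] * filt[fy * k + fx];
    out[y * outW + x] = acc;
}

int main() {
    const int H = 8, W = 8, K = 3, OH = H - K + 1, OW = W - K + 1;
    float h_in[H * W], h_filt[K * K], h_out[OH * OW];
    for (int i = 0; i < H * W; ++i) h_in[i] = 1.f;
    for (int i = 0; i < K * K; ++i) h_filt[i] = 1.f / (K * K);  // box filter

    float *d_in, *d_filt, *d_out;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_filt, sizeof(h_filt));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    cudaMemcpy(d_filt, h_filt, sizeof(h_filt), cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((OW + block.x - 1) / block.x, (OH + block.y - 1) / block.y);
    conv2d_valid<<<grid, block>>>(d_in, H, W, d_filt, K, d_out, OH, OW);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("out[0][0] = %f (expect 1.0: all-ones input, box filter)\n", h_out[0]);
    return 0;
}
```

Even this toy version forces you to think about the thread-to-output mapping and the memory layout of inputs and filters, exactly the kind of hardware-level detail the paragraph above refers to.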

Programming at this low level is complex and tedious, but it lets you optimize performance to the limit and extract the hardware's full computing power. It was exactly this return to fundamentals that injected powerful momentum into deep learning and marked a turning point in its history.

Interestingly, this description sent many people down memory lane, digging up which tools they used for deep learning projects before 2012. Alfredo Canziani, a professor of computer science at New York University, was using Torch at the time: "I've never heard of anyone using Matlab for deep learning research..."


Yann LeCun agreed: most of the important deep learning work in 2012 was done in Torch and Theano.


Karpathy saw it differently. He added that most projects used Matlab, and that he never used Theano; he used Torch in 2013-2014.


Some netizens also pointed out that Hinton used Matlab as well.


It seems quite a few people were using Matlab back then:


Ian Goodfellow, the well-known father of GANs, also showed up, saying that Yoshua's lab was using Theano at the time. He added that before ImageNet was released, he had written a Theano wrapper for Alex's cuda-convnet.


Douglas Eck, a director at Google DeepMind, chimed in to say he never used Matlab; he used C++ and later switched to Python/Theano.


New York University professor Kyunghyun Cho said that in 2010, when he was still on the other side of the Atlantic, he was using the CUV library built by Hannes Schulz and others, which helped him switch from Matlab to Python.


Lamini co-founder Gregory Diamos said the paper that convinced him was "Deep learning with COTS HPC systems" by Andrew Ng et al.


The paper showed that a Frankenstein-style CUDA cluster could beat a MapReduce cluster of 10,000 CPUs.


Paper link: https://proceedings.mlr.press/v28/coates13.pdf

However, AlexNet's great success was not an isolated event but a microcosm of where the whole field was heading at the time. Some researchers had already realized that deep learning needed greater scale and more computing power, and that GPUs were a promising direction. Karpathy wrote: "Of course, even before AlexNet appeared, the field was showing some signs of moving toward scale. For example, Matlab had begun to offer initial GPU support, and a lot of work in Andrew Ng's lab at Stanford was moving toward GPU-based large-scale deep learning. There were other parallel efforts in this direction as well." At the end of this bit of archaeology, Karpathy mused: "Writing C/C++ code and CUDA kernels gives me a funny feeling, as if I've gone back to the AlexNet era, the cuda-convnet era."

Today's "back to basics" approach echoes what the AlexNet authors did back then: switching from Matlab to CUDA/C++ in pursuit of higher performance and larger scale. High-level frameworks exist now, but when they can't easily deliver the last drop of performance, you still have to go down to the bottom layer and write the CUDA/C++ yourself.

By the way, what were researchers in China using back then? Feel free to leave a comment and discuss.

Source: jiqizhixin.com