This article explains how to create a GPU computing instance, following the general procedure described in Creating an ECS Instance.
GPU computing instances include gn4, gn5, gn5i and gn6v.
Create an instance
You can create a GPU computing instance as described in Creating an ECS Instance. Pay attention to the following settings when you create it.
Region: Each instance specification family is available in different regions and availability zones, as listed below:
gn4: North China 2 (Availability Zone A), East China 2 (Availability Zone B), South China 1 (Availability Zone C)
gn5: North China 2 (Availability Zone C, E), North China 5 (Availability Zone A), East China 1 (Availability Zone G, F), East China 2 (Availability Zone D, B, E), South China 1 (Availability Zone D), Hong Kong (Availability Zone C, B), Asia Pacific Southeast 1 (Availability Zone B, A), Asia Pacific Southeast 2 (Availability Zone A), Asia Pacific Southeast 3 (Availability Zone A), Asia Pacific Southeast 5 (Availability Zone A), US West 1 (Availability Zone B, A), US East 1 (Availability Zone B, A), Central Europe 1 (Availability Zone A)
Description
If you want to deploy an NGC (NVIDIA GPU Cloud) environment on a gn5 instance, see Deploy an NGC environment on gn5 instances when selecting a region.
gn5i: North China 2 (Availability Zone C, E, A), East China 1 (Availability Zone B), East China 2 (Availability Zone D, B), South China 1 (Availability Zone A)
gn6v: East China 2 (Availability Zone F)
If the region and availability zone information displayed on the ECS creation page does not match the above description, the information displayed on the ECS creation page shall prevail.
Image:
If you need to install the GPU driver and CUDA library, you can choose any of the following methods:
Under system images, select CentOS 64-bit (all currently provided versions are supported), Ubuntu 16.04 64-bit, or SUSE Linux Enterprise Server 12 SP2 64-bit, and choose to automatically install the GPU driver. Then select the required CUDA library and GPU driver versions.
Description
Choose the GPU driver version that fits your business needs. For a new business system, we recommend selecting the latest GPU driver version from the drop-down menu.
If you choose to automatically install the GPU driver, the instance custom data (a shell script that installs the CUDA library and GPU driver) is generated automatically under Advanced Options in the system configuration. When the instance starts for the first time, cloud-init runs the script and installs the GPU driver. For more information, see Notes on Automatically Installing GPU Driver Scripts.
Select the image marketplace, search for NVIDIA, and select the required image in the search results. Currently only CentOS 7.3 and Ubuntu 16.04 are supported.
If the GPU computing instance will be used for deep learning, you can choose an image with a deep learning framework pre-installed: select the image marketplace, search for deep learning, and select the required image in the search results. Currently only Ubuntu 16.04 and CentOS 7.3 are supported.
For any other image, download and install the GPU driver yourself after the instance is created.
Instance: Select Heterogeneous Computing GPU/FPGA > GPU Computing Type, and select the appropriate instance specification according to your needs.
Network: Select VPC (private network).
Public network bandwidth: Select the bandwidth according to your actual needs.
Note
If you use a Windows 2008 R2 image, the remote connection function of the console cannot connect to the GPU computing instance once the GPU driver installation takes effect. You must therefore choose to allocate a public IP address when creating the instance, or bind a public IP address to the instance after creating it.
Login credentials: Set the login credentials according to actual needs.
Description
We recommend that you do not postpone setting the credentials until after creation. If you do, and you need to log in after the instance is created but before the GPU driver has finished installing, you must reset the password or bind an SSH key pair and then restart the instance for the change to take effect. Restarting the instance causes the GPU driver installation to fail.
Instance custom data: If you choose to automatically install the GPU driver, the shell script for automatically installing the CUDA library and GPU driver will be displayed here. Please read the script content and precautions carefully.
View the automatic GPU driver installation process
If you choose to automatically install the GPU driver, you can connect to the instance remotely after it is created and follow the GPU driver installation process in the installation log /root/nvidia_install.log.
Note
Do not operate the GPU or install other GPU-related software until the GPU driver installation is complete; otherwise the automatic installation may fail.
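As a minimal sketch of checking the log mentioned above, the helper below prints the last lines of the installation log. Only the path /root/nvidia_install.log comes from this article; the function name show_install_log and the default of 50 lines are illustrative assumptions.

```shell
# Print the tail of the GPU driver installation log.
# $1: log path (default from this article), $2: number of lines (arbitrary default).
show_install_log() {
    log_file="${1:-/root/nvidia_install.log}"
    lines="${2:-50}"
    if [ ! -r "$log_file" ]; then
        echo "log not readable yet: $log_file" >&2
        return 1
    fi
    tail -n "$lines" "$log_file"
}

# Usage on the instance (as root):
#   show_install_log                     # last 50 lines of the log
#   tail -f /root/nvidia_install.log     # follow the log while the script runs
```

Reading the log does not touch the GPU, so it is compatible with the note above about leaving the GPU alone during installation.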
Download and install the GPU driver
If you use an image that does not have a pre-installed GPU driver, you must install the GPU driver for the instance. The steps are as follows:
Obtain the GPU driver installation package:
Go to the NVIDIA official website.
Manually find the driver for your instance and click Search. The filter information is described in the following table.
After confirming that it is correct, click the Download button.
Install the GPU driver:
Windows instance: Double-click the installation package to install the GPU driver.
Linux instance: Follow the steps below to install the driver:
Download and install the kernel-devel and kernel-headers packages that correspond to the kernel version.
Run the following command to confirm that the kernel-devel and kernel-headers packages have been downloaded and installed:
sudo rpm -qa | grep $(uname -r)
Taking CentOS 7.3 as an example, output similar to the following means that the installation is complete.
kernel-3.10.0-514.26.2.el7.x86_64
kernel-headers-3.10.0-514.26.2.el7.x86_64
kernel-tools-libs-3.10.0-514.26.2.el7.x86_64
python-perf-3.10.0-514.26.2.el7.x86_64
kernel-tools-3.10.0-514.26.2.el7.x86_64
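On CentOS, the package installation step is commonly done with yum. The sketch below is one possible way to do it and is not part of the original procedure: both function names are hypothetical, and it assumes the configured yum repositories still carry packages that match the running kernel release.

```shell
# Print the package names that match a kernel release.
# $1: kernel release (defaults to the running kernel reported by uname -r).
kernel_build_packages() {
    kver="${1:-$(uname -r)}"
    printf 'kernel-devel-%s kernel-headers-%s\n' "$kver" "$kver"
}

# Install those packages with yum. Defined only; call it on the instance.
install_kernel_build_packages() {
    # The command substitution is left unquoted on purpose so that
    # the two package names are passed to yum as separate arguments.
    sudo yum install -y $(kernel_build_packages "$@")
}

# Usage on a CentOS instance:
#   install_kernel_build_packages
#   sudo rpm -qa | grep "$(uname -r)"    # same check as in the step above
```

Pinning the package names to the output of uname -r keeps the headers in step with the kernel that the driver will be built against.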
Install the GPU driver by following the Additional Information section of the GPU driver download page on the NVIDIA official website.
Take Linux 64-bit Ubuntu 14.04 as an example:
Install GRID driver
If a gn5, gn5i, or gn6v instance needs to support OpenGL graphics display, you must install the GRID driver. For details, see Installing the GRID Driver in a GPU Instance.
Notes
Remote connection function
For Windows 2008 R2 and earlier, the remote connection function of the console is unavailable once the GPU driver installation takes effect: the management terminal shows a black screen or stays on the startup screen. Connect to the system through another protocol instead, such as the Remote Desktop (RDP) connection built into Windows.
The RDP connection built into Windows does not support applications that rely on DirectX, OpenGL, and similar APIs. For those, install a VNC server and client yourself, or use another protocol that supports them, such as PCoIP or XenDesktop HDX 3D.
Automatically install GPU driver script
Regarding the shell script that automatically installs the GPU driver, please note the following:
This script automatically downloads and installs the NVIDIA GPU driver and CUDA library.
Because intranet bandwidth and vCPU count differ between instance specifications, the automatic installation takes between 4.5 and 10 minutes. While the GPU driver is being installed, do not operate the GPU or install other GPU-related software; otherwise the automatic installation may fail.
After the automatic installation is complete, the instance restarts automatically so that the driver takes effect.
The script will automatically turn on the Persistence Mode of the GPU driver and add this setting to the system auto-start script to ensure that this mode can be turned on by default after the instance is restarted. The GPU driver works more stably in this mode.
When changing the operating system:
If the original image is Ubuntu 16.04 64-bit or SUSE Linux Enterprise Server 12 SP2 64-bit, the GPU driver cannot be installed automatically after you change to a different image.
If the original image is a version of CentOS, the GPU driver can still be installed normally after you change to another CentOS version.
If you change to an image that does not support the automatic GPU driver installation script, the GPU driver cannot be installed automatically.
An installation log is generated during installation and stored at /root/nvidia_install.log. Check the log to see whether the driver installation succeeded and, if it failed, why.
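Once the instance has restarted, one way to confirm that the driver loaded and that Persistence Mode is on is to query nvidia-smi. The sketch below assumes nvidia-smi is on the PATH after installation; the helper name all_gpus_persistent is hypothetical and simply checks the text that nvidia-smi prints.

```shell
# Return 0 only if stdin has at least one line and every
# non-empty line is exactly "Enabled" (one line per GPU).
all_gpus_persistent() {
    seen=0
    while IFS= read -r mode; do
        case "$mode" in
            Enabled) seen=1 ;;
            '')      ;;             # ignore blank lines
            *)       return 1 ;;    # any other value means the mode is off
        esac
    done
    [ "$seen" -eq 1 ]
}

# Usage on the instance:
#   nvidia-smi        # lists the GPUs and driver version if the driver loaded
#   nvidia-smi --query-gpu=persistence_mode --format=csv,noheader \
#       | all_gpus_persistent && echo "Persistence Mode is on for all GPUs"
#   sudo nvidia-smi -pm 1    # turn Persistence Mode on manually if needed
```

Reading the mode from stdin keeps the helper independent of the GPU, so it can be tried on any machine with sample input.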