Detailed explanation of HTTPS

Article synchronized with Github/blog

People use web transactions for some very important things. Without strong security guarantees, people cannot shop or bank online with confidence, and companies cannot place important documents on a web server without tightly restricting access. The web needs a secure form of HTTP.

Some lightweight methods already provide authentication (basic authentication and digest authentication) and message integrity checking (digest authentication with qop="auth-int"). These methods work well for many network transactions, but they are not strong enough for large-scale shopping, banking transactions, or access to confidential data. These more important transactions require a combination of HTTP and digital encryption to be secure.

The secure version of HTTP should be efficient, portable, and easy to manage. It should not only be able to adapt to changing situations but also meet the requirements of society and government. We need an HTTP security technology that provides the following capabilities.

Server authentication (clients know they are talking to a real server and not a fake one).
Client authentication (servers know they are talking to a real client and not a fake client).
Integrity (client and server data will not be modified).
Encryption (the conversation between client and server is private, no need to worry about being eavesdropped).
Efficiency (an algorithm that runs fast enough to be used by low-end clients and servers).
Universality (basically all clients and servers support these protocols).
Managed scalability (anyone, anywhere can communicate instantly and securely).
Adaptability (capable of supporting the most well-known security methods currently available).
Feasibility in society (meeting the political and cultural needs of society).

Digital Encryption

Terminology

In this introduction to digital encryption technology, we will discuss the following terms.

Key: A numeric parameter that changes the behavior of a cipher.
Symmetric key encryption system: An algorithm that uses the same key for encoding/decoding.
Asymmetric key encryption system: Encoding/decoding algorithms using different keys.
Public key encryption system: A system that enables millions of computers to send confidential messages easily.
Digital signature: A checksum used to verify that the message has not been forged or tampered with.
Digital Certificate: Identification information verified and issued by a trusted organization.

Cryptography

Cryptography is based on secret codes called ciphers. A cipher is an encoding scheme: a combination of a specific way of encoding a message and a corresponding way of decoding it later. The original message before encryption is usually called plaintext (or cleartext). The encoded message produced by applying the cipher is usually called ciphertext.

Ciphers have been used to produce confidential messages for thousands of years. Legend has it that Julius Caesar used a three-character cyclic shift cipher, in which each character in the message was replaced by the character three positions later in the alphabet. In the modern alphabet, "A" would be replaced by "D", "B" by "E", and so on.
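The three-position shift described above can be sketched in a few lines of Python. This is only an illustration of the idea; the function name `caesar_shift` and the sample message are made up for this example.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26-letter modern alphabet


def caesar_shift(text: str, shift: int) -> str:
    """Rotate each letter by `shift` positions; leave other characters alone."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


ciphertext = caesar_shift("ATTACK AT DAWN", 3)  # encode: rotate forward by 3
plaintext = caesar_shift(ciphertext, -3)        # decode: rotate back by 3
print(ciphertext)  # DWWDFN DW GDZQ
print(plaintext)   # ATTACK AT DAWN
```

Decoding is simply the same rotation applied in the opposite direction.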

Key

As technology advanced, people began to build machines that could quickly and accurately encode and decode messages using much more complex ciphers. These cipher machines could do more than simple rotations: they could also substitute characters, change the order of characters, and slice and dice messages to make the code harder to crack.

In practice, both the encoding algorithm and the encoding machine may fall into enemy hands, so most machines had dials that could be set to a large number of different values to change how the cipher worked. Even if the machine was stolen, the decoder would not work without the correct dial settings (key values).

These cipher parameters are called keys. The correct key must be entered into the cipher machine for decryption to proceed correctly. Cipher keys make a single cipher machine behave like a set of many virtual cipher machines, each with a different key value and therefore different behavior.

Given a plaintext message P, an encoding function E and a digital encoding key e, an encoded ciphertext C can be generated. Through the decoding function D and the decoding key d, the ciphertext C can be decoded into the original plaintext P. Of course, the encoding/decoding functions are inverse functions of each other. Decoding the encoding of P will return to the original message P.

Symmetric key encryption technology

Many digital encryption algorithms are called symmetric-key ciphers because they use the same key value for encoding as they do for decoding (e = d). We will simply call this key k.
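To make e = d = k concrete, here is a toy symmetric cipher in Python. It is a sketch for illustration only, not DES or RC4 and not suitable for real use: the keystream construction and the names `keystream` and `xor_cipher` are assumptions of this example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stretch `key` into `length` pseudo-random bytes (SHA-256 over key + counter)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Encode or decode: XOR with the keystream is its own inverse, so e = d = k."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


k = b"shared secret key"
c = xor_cipher(b"meet me at noon", k)  # sender encodes with k
p = xor_cipher(c, k)                   # receiver decodes with the same k
```

The same function and the same key are used in both directions, which is exactly what makes the scheme symmetric.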

Popular symmetric key encryption algorithms include: DES, Triple-DES, RC2 and RC4.

It is important to keep keys confidential. In many cases the encoding/decoding algorithm is publicly known, so the key is the only thing that is secret. A good encryption algorithm forces an attacker to try every possible key in order to break the code. Trying all key values by brute force is called an enumeration attack.
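The following sketch shows an enumeration attack against a deliberately weak toy cipher with a 16-bit key. The cipher, the key value, and the assumption that the attacker knows the message starts with "GET " are all invented for this example.

```python
from typing import Optional


def toy_encrypt(data: bytes, key: int) -> bytes:
    """Toy cipher with a 16-bit key: XOR alternating bytes with the key's two bytes.

    XOR is its own inverse, so the same function also decrypts.
    """
    kb = key.to_bytes(2, "big")
    return bytes(b ^ kb[i % 2] for i, b in enumerate(data))


def enumerate_keys(ciphertext: bytes, known_prefix: bytes) -> Optional[int]:
    """Try every possible 16-bit key until the decoded text starts with `known_prefix`."""
    for key in range(2 ** 16):
        if toy_encrypt(ciphertext, key).startswith(known_prefix):
            return key
    return None


secret_key = 0xBEEF
ciphertext = toy_encrypt(b"GET /account HTTP/1.1", secret_key)
recovered = enumerate_keys(ciphertext, b"GET ")  # at most 65,536 attempts
```

With only 65,536 possible keys the search finishes almost instantly. Each additional key bit doubles the number of candidates, which is why real ciphers use much longer keys.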

Disadvantages

One disadvantage of symmetric key encryption is that the sender and receiver must share a secret key before they can talk to each other.

For example, Alice (A), Bob (B) and Chris (C) all want to talk to Joe’s Hardware Store (J). A, B and C each need to establish a secret key with J: A may need the key KAJ, B the key KBJ, and C the key KCJ. Every pair of communicating entities requires its own private key. If there are N nodes and each node has to talk securely to all N-1 other nodes, there will be roughly N² secret keys in total (N(N-1)/2 distinct pairs, to be exact): a management nightmare.
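A quick way to see how fast the number of pairwise keys grows is to count the pairs. The helper names below are chosen for this illustration.

```python
from itertools import combinations


def pairwise_keys(nodes):
    """Return one key label per unordered pair of nodes, e.g. K_AJ for A and J."""
    return [f"K_{a}{b}" for a, b in combinations(nodes, 2)]


def key_count(n: int) -> int:
    """Number of distinct pairs among n nodes: n(n-1)/2."""
    return n * (n - 1) // 2


print(pairwise_keys(["A", "B", "C", "J"]))
# ['K_AB', 'K_AC', 'K_AJ', 'K_BC', 'K_BJ', 'K_CJ']
print(key_count(1000))  # 499500
```

Four parties already need six keys if everyone talks to everyone, and a thousand nodes need nearly half a million.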

Public key encryption technology

Instead of a separate encryption/decryption key for each pair of hosts, public key encryption uses two asymmetric keys: one for encoding messages sent to a host, and the other for decoding those messages. The encoding key is public and known to everyone, but only the host knows its private decoding key.

RSA

The common challenge faced by all public key asymmetric encryption systems is to ensure that even if someone has all the following clues, they cannot calculate the secret private key:

Public key (it is public and available to everyone);
A small piece of intercepted ciphertext (which can be obtained by sniffing the network);
A message and its related ciphertext (you can get it by running the encryptor on any piece of text).

The RSA algorithm is a popular public key encryption system that meets all of these criteria. It was invented at MIT and later commercialized by RSA Data Security. Even given the public key, an arbitrary piece of plaintext, the ciphertext obtained by encoding that plaintext with the public key, the RSA algorithm itself, and even the source code of an RSA implementation, cracking the code to find the corresponding private key is believed to be as hard as factoring a very large number into its prime factors, a computation considered one of the hardest problems in computer science. So if you discover a fast way to factor very large numbers into primes, you will not only be able to break into Swiss bank accounts, but also win a Turing Award.
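A textbook-sized RSA example makes the role of the two primes visible. This is a sketch with tiny primes chosen for readability; real keys use primes hundreds of digits long plus padding schemes that are omitted here.

```python
# Toy RSA with tiny primes (illustration only).
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120, computable only if you know p and q
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: 2753 (modular inverse, Python 3.8+)


def E(m: int) -> int:
    """Encode with the public key (e, n)."""
    return pow(m, e, n)


def D(c: int) -> int:
    """Decode with the private key (d, n)."""
    return pow(c, d, n)


message = 65
assert D(E(message)) == message
```

Anyone who can factor n = 3233 back into 61 and 53 can recompute phi and therefore d. With numbers this small that is trivial; the security of real RSA rests on n being far too large to factor.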

Hybrid encryption system and session key

Anyone who knows a public server's public key can send it a secure message, which makes asymmetric public key encryption very useful. Two nodes do not need to exchange private keys first in order to communicate securely.

But public key algorithms tend to be computationally slow, so in practice a mix of symmetric and asymmetric strategies is used. A common approach is to establish secure communication between two nodes with convenient public key encryption, use that secure channel to generate and send a temporary random symmetric key, and then encrypt the rest of the data with faster symmetric encryption. (Both SSH and HTTPS work this way.)
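The hybrid approach can be sketched end to end with toy building blocks. Everything here is an assumption made for illustration: the RSA key is built from two small Mersenne primes, the symmetric cipher is a simple hash-based XOR stream, and no padding or authentication is applied, so this is not how a production system should be written.

```python
import hashlib
import secrets

# Toy RSA key pair for the server (two Mersenne primes, far too small for real use).
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E_PUB = 65537
D_PRIV = pow(E_PUB, -1, (P - 1) * (Q - 1))  # modular inverse, Python 3.8+


def symmetric(data: bytes, key: int) -> bytes:
    """Fast toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: pick a temporary random session key and wrap it with the public key.
session_key = secrets.randbits(64)
wrapped_key = pow(session_key, E_PUB, N)                  # slow asymmetric step, done once
body = symmetric(b"GET /account HTTP/1.1", session_key)   # fast symmetric step

# Server: unwrap the session key with the private key, then decode the body.
unwrapped = pow(wrapped_key, D_PRIV, N)
request = symmetric(body, unwrapped)
```

The expensive public key operation touches only the short session key, while the bulk data goes through the cheap symmetric cipher.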

Digital signature

In addition to encrypting and decrypting messages, an encryption system can be used to sign messages, showing who wrote a message and proving that it has not been tampered with. This technique is called digital signing.

A digital signature is a special cryptographic checksum attached to a message. Digital signatures are usually generated with asymmetric public key technology. Because only the owner knows the private key, the author's private key can serve as a kind of "fingerprint."

The RSA encryption system uses the decoding function D as the signature function because D has already used the private key as input. Note that the decode function is just a function, so you can use it with any input. Likewise, in the RSA encryption system, the D and E functions cancel each other out when applied in any order. Therefore E(D(stuff)) = stuff, just like D(E(stuff)) = stuff.
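Signing with D and checking with E can be sketched with a toy RSA key. The key size, the reduction of the SHA-256 digest modulo N, and the function names are simplifications assumed for this example; real signature schemes add padding and use much larger keys.

```python
import hashlib

# Toy RSA key pair for the signer (illustration only).
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E_PUB = 65537
D_PRIV = pow(E_PUB, -1, (P - 1) * (Q - 1))  # modular inverse, Python 3.8+


def digest(message: bytes) -> int:
    """Checksum of the message, reduced so that it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Signature = D(checksum): only the private key holder can compute it."""
    return pow(digest(message), D_PRIV, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone can apply E with the public key and compare against the checksum."""
    return pow(signature, E_PUB, N) == digest(message)


signature = sign(b"pay Joe's Hardware 10 dollars")
```

Here `verify` relies on E(D(checksum)) = checksum, the "cancel each other out" property described above. Changing a single byte of the message changes the checksum, so the old signature no longer verifies.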

Note

The private key and the public key form a pair: either one can encrypt, and the other one of the pair then decrypts. RSA relies on the fact that the product n of two large primes p and q is hard to factor back into p and q. Just as p and q play interchangeable roles in that product, the public key and the private key are mathematically interchangeable.

Encrypting with the private key and decrypting with the public key proves the identity of the "private key owner" and is used for signatures.
Encrypting with the public key and decrypting with the private key ensures that only the "private key owner" can read the message. (If the data were encrypted with the private key instead, anyone holding the public key, and there may be many such holders, could decrypt it, so the message would not be protected.)

Digital certificate

Digital certificates are the "ID cards" of the Internet. A digital certificate (often called a "cert", a bit like the Certs brand of mints) contains information about a user or company that is vouched for by a trusted organization.

A digital certificate also contains a set of information, all of which is digitally signed by an official Certificate Authority (CA).

Furthermore, digital certificates usually include the subject's public key, as well as descriptive information about the subject and the signature algorithm used. Anyone can create a digital certificate, but not everyone has a respected signing authority to vouch for the certificate's information and sign the certificate with its private key. A typical certificate structure is shown in the figure.

[Figure: structure of a typical digital certificate]

X.509 v3 Certificate

Unfortunately, there is no single global standard for digital certificates. Just like not all printed ID cards contain the same information in the same places, digital certificates come in many slightly different forms. But the good news is that most certificates in use today store their information in a standard format - X.509 v3. X.509 v3 certificates provide a standard way to formalize certificate information into a number of parsable fields. Different types of certificates have different field values, but most follow the X.509 v3 structure.

There are several types of X.509-based certificates, including web server certificates, client email certificates, software code signing certificates, and certificate authority certificates.

Using certificates to authenticate servers

When a secure web transaction is established over HTTPS, modern browsers automatically fetch the digital certificate of the server they are connecting to. If the server does not have a certificate, the secure connection fails.

When the browser receives the certificate, it checks the signing authority. If the authority is a well-respected public signing authority, the browser probably already knows its public key (browsers ship with the certificates of many signing authorities pre-installed).

If nothing is known about the signing authority, the browser cannot tell whether it should be trusted, and it usually displays a dialog box asking the user whether to trust the signer. The signer might be, for example, a local IT department or a software vendor.

HTTPS

HTTPS is the most popular form of HTTP security. It was pioneered by Netscape and is supported by all major browsers and servers.

When HTTPS is used, all HTTP request and response data is encrypted before it is sent over the network. HTTPS works by providing a transport-level cryptographic security layer underneath HTTP, using either SSL or its successor, Transport Layer Security (TLS). Because SSL and TLS are very similar, we use the term SSL loosely to refer to both.

SSL/TLS

HTTP communication that does not use SSL/TLS is unencrypted. Transmitting everything in plain text brings three major risks.

Eavesdropping: A third party can learn the content of the communication.
Tampering: A third party can modify the content of the communication.
Pretending: A third party can impersonate someone else and take part in the communication.

The SSL/TLS protocol is designed to address these three risks, with the aim that:

All information is transmitted encrypted, so third parties cannot eavesdrop.
There is a verification mechanism, so the communicating parties notice immediately if the data is tampered with.
Identity certificates are used to prevent impersonation.

SSL stands for Secure Sockets Layer and TLS for Transport Layer Security. TLS is built on the SSL 3.0 protocol specification and is the successor to SSL 3.0. SSL was not widely deployed until version 3.0. The visible difference between TLS and SSL is the set of encryption algorithms they support. At the time of writing, the latest version in use was TLS 1.2, with version 1.3 still in draft; TLS 1.3 has since been published as RFC 8446.

HTTPS Scenario

Secure HTTP is currently optional. When a client (such as a web browser) is asked to perform a transaction on a web resource, it checks the URL scheme:

If the URL scheme is http, the client opens a connection to port 80 on the server (by default) and sends it plain old HTTP commands.
If the URL scheme is https, the client opens a connection to port 443 on the server (by default), then "handshakes" with the server, exchanging some SSL security parameters in binary format, followed by the encrypted HTTP commands.
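The scheme check can be sketched with Python's standard `urllib.parse` module. The function name `connection_plan` and the example URLs are made up for this illustration, and only the two schemes discussed here are handled.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_plan(url: str):
    """Return (host, port, use_tls) for an http or https URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]  # an explicit port overrides the default
    use_tls = parts.scheme == "https"                 # https means: handshake before any HTTP
    return parts.hostname, port, use_tls


print(connection_plan("http://www.example.com/index.html"))
# ('www.example.com', 80, False)
print(connection_plan("https://www.example.com/cart"))
# ('www.example.com', 443, True)
```

Any other scheme raises a `KeyError` in this sketch; a real client would handle that case explicitly.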

SSL is a binary protocol, completely different from HTTP, so its traffic is carried on a different port (usually port 443). If both SSL and HTTP traffic arrived on port 80, most web servers would interpret the binary SSL traffic as malformed HTTP and close the connection. Integrating security services more tightly into the HTTP layer would have removed the need for multiple destination ports, but in practice the extra port does not cause serious problems.

SSL Handshake

Before encrypted communication starts, the client and server must first establish a connection and exchange parameters. This process is called the handshake. If we call the client Alice and the server Bob, the whole handshake can be illustrated by the following figure.

[Figure: SSL/TLS handshake between Alice (client) and Bob (server)]

The handshake phase is divided into five steps:

Alice sends the protocol version number, a random number generated by the client (client random), and the encryption methods the client supports.
Bob confirms the encryption method both parties will use, and sends his digital certificate and a random number generated by the server (server random).
Alice confirms that the digital certificate is valid, generates a new random number (the premaster secret), encrypts it with the public key from the digital certificate, and sends it to Bob.
Bob uses his private key to recover the random number sent by Alice (the premaster secret).
Using the agreed encryption method, Alice and Bob each derive the session key from the three random numbers. The session key is used to encrypt the rest of the conversation.
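The last step, turning three random numbers into shared key material, can be sketched as follows. The sketch is modeled on the TLS 1.2 pseudorandom function described in RFC 5246 (HMAC-SHA256 expansion with the label "master secret"); the function names are chosen for this example, and a real stack would go on to expand the result into separate encryption and integrity keys.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """HMAC-SHA256 expansion: A(0) = seed, A(i) = HMAC(secret, A(i-1))."""
    out = b""
    a = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()             # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()   # HMAC(secret, A(i) + seed)
    return out[:length]


def master_secret(premaster: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Combine the three random values into a 48-byte master secret."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster, seed, 48)


client_random = secrets.token_bytes(32)   # step 1, sent in the clear by Alice
server_random = secrets.token_bytes(32)   # step 2, sent in the clear by Bob
premaster = secrets.token_bytes(48)       # step 3, sent encrypted by Alice

alice_key = master_secret(premaster, client_random, server_random)
bob_key = master_secret(premaster, client_random, server_random)
```

Both sides run the same computation on the same three inputs, so they arrive at the same key without that key ever crossing the network.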

The five steps above are drawn as a figure:

[Figure: the five handshake steps]

There are three points to note about the handshake phase:

A total of three random numbers are needed to generate the session key.
The conversation after the handshake is encrypted with the session key (symmetric encryption). The server's public and private keys are used only to encrypt and decrypt the premaster secret from which the session key is derived (asymmetric encryption) and play no other role.
The server's public key is carried in the server's digital certificate.

Handshake phase of DH algorithm

The entire handshake phase is unencrypted (it cannot be encrypted); everything is sent in plain text. Anyone eavesdropping on the communication can therefore learn the encryption method both parties chose, as well as two of the three random numbers. The security of the whole conversation depends only on whether the third random number (the premaster secret) can be cracked.

In theory, as long as the server's public key is long enough (for example, 2048 bits), the premaster secret cannot be cracked. For additional security, however, the handshake can be switched from the default RSA algorithm to the Diffie-Hellman algorithm (DH for short).

With the DH algorithm, the premaster secret does not need to be transmitted. The two parties only exchange their own parameters, and each can then compute the same random number.

[Figure: handshake using the DH algorithm]

In the figure above, the third and fourth steps change from passing the premaster secret to passing the parameters required by the DH algorithm, after which both parties compute the premaster secret themselves. This improves security.
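A Diffie-Hellman exchange with textbook-sized numbers shows how both parties reach the same value without sending it. The parameters below (p = 23, g = 5 and the two private values) are tiny illustrative choices; real deployments use very large primes or elliptic curves.

```python
# Public parameters, known to everyone including eavesdroppers.
p, g = 23, 5

# Private values, never transmitted.
a = 6    # Alice's secret
b = 15   # Bob's secret

# Values exchanged in the clear during the handshake.
A = pow(g, a, p)   # Alice sends 8
B = pow(g, b, p)   # Bob sends 19

# Each side combines its own secret with the other side's public value.
alice_secret = pow(B, a, p)
bob_secret = pow(A, b, p)

print(A, B, alice_secret, bob_secret)  # 8 19 2 2
```

An eavesdropper sees p, g, A and B, but recovering a or b from them is the discrete logarithm problem, which is infeasible at realistic sizes.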

Server Certificate

SSL supports mutual authentication: the server's certificate is carried to the client, and the client's certificate is then sent back to the server. Today, however, client certificates are rarely used when browsing, and most users do not even have a personal client certificate. A server can request a client certificate, but in practice this seldom happens.

Validity of site certificate

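SSL itself does not require users to check the web server's certificate, but most modern browsers perform some simple sanity checks on certificates and give users the means to investigate further. A web server certificate validity algorithm proposed by Netscape is the basis of most browsers' validation techniques. The verification steps are described below:

Date check: First, the browser checks the certificate's start and end dates to make sure the certificate is still valid. If the certificate has expired or is not yet active, validation fails and the browser displays an error message.
Signer trust check: Every certificate is signed by some Certificate Authority (CA) that vouches for the server. Certificates come in different levels, each requiring a different level of background verification. For example, applying for an e-commerce server certificate usually requires legal proof that the business is incorporated. Anyone can generate a certificate, but some CAs are well-known organizations with well-defined procedures for verifying the identity and business legitimacy of certificate applicants. For this reason, browsers ship with a list of trusted signing authorities. If a browser receives a certificate signed by an unknown (and possibly malicious) authority, it usually displays a warning.
Signature check: Once the signing authority is judged to be trustworthy, the browser applies the signing authority's public key to the signature and compares the result with the checksum to verify the certificate's integrity.
Site identity check: To prevent a server from copying someone else's certificate or intercepting someone else's traffic, most browsers try to verify that the domain name in the certificate matches the domain name of the server they are talking to. A server certificate usually contains a single domain name, but some CAs create certificates containing a list of server names or a wildcard domain name for a group or cluster of servers. If the hostname does not match the identity in the certificate, a user-facing client must either notify the user or terminate the connection with a bad-certificate error.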
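Two of the checks above, the date check and the site identity check, are simple enough to sketch directly. The code below works on a dictionary shaped like the one returned by Python's `ssl.SSLSocket.getpeercert()`; the sample certificate is invented for this example, the wildcard matching is a simplified assumption, and the signer trust and signature checks are left to the TLS library.

```python
import ssl


def dates_valid(cert: dict, now: float) -> bool:
    """Date check: `now` (seconds since the epoch) must lie inside the validity period."""
    start = ssl.cert_time_to_seconds(cert["notBefore"])
    end = ssl.cert_time_to_seconds(cert["notAfter"])
    return start <= now <= end


def host_matches(cert: dict, hostname: str) -> bool:
    """Site identity check: compare the hostname with the certificate's DNS names."""
    host_labels = hostname.lower().split(".")
    for kind, pattern in cert.get("subjectAltName", ()):
        if kind != "DNS":
            continue
        labels = pattern.lower().split(".")
        if len(labels) != len(host_labels):
            continue
        # In this simplified matcher a leading "*" stands for exactly one label.
        if all(p == h or (i == 0 and p == "*")
               for i, (p, h) in enumerate(zip(labels, host_labels))):
            return True
    return False


# Hypothetical certificate data, made up for this example.
sample_cert = {
    "notBefore": "Jan  1 00:00:00 2024 GMT",
    "notAfter": "Jan  1 00:00:00 2025 GMT",
    "subjectAltName": (("DNS", "example.com"), ("DNS", "*.example.com")),
}
mid_2024 = ssl.cert_time_to_seconds("Jul  1 00:00:00 2024 GMT")
```

In real clients these checks should come from the TLS library rather than hand-written code; the sketch only shows what the library is doing on the user's behalf.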

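HTTPS client

SSL is a complicated binary protocol. Unless you are a cryptography expert, you should not send raw SSL traffic directly. Fortunately, several commercial and open source libraries make it fairly easy to write SSL clients and servers.

OpenSSL is the most common open source implementation of SSL and TLS. The OpenSSL project is a collaborative volunteer effort to develop a robust, full-featured, commercial-grade toolkit that implements the SSL and TLS protocols along with a full-strength general-purpose cryptography library. Information about OpenSSL, and the software itself, is available from http://www.openssl.org.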
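As one example of relying on a library instead of raw SSL traffic, here is a minimal HTTPS client sketch using Python's standard `ssl` module, which in CPython is built on top of OpenSSL. The function name `https_get`, the timeout, and the bare-bones request are assumptions of this example; production code would normally use a full HTTP client library.

```python
import socket
import ssl


def https_get(host: str, path: str = "/", port: int = 443) -> bytes:
    """Fetch `path` from `host` over TLS and return the raw HTTP response."""
    # The default context loads the system's trusted CAs and enables
    # certificate and hostname verification.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # wrap_socket performs the handshake; server_hostname drives the identity check.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `https_get("www.example.com")` would open the connection, run the handshake and certificate checks inside the library, and return the response bytes. `ssl.OPENSSL_VERSION` reports which OpenSSL build the interpreter is linked against.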

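Postscript

I strongly recommend one book: HTTP: The Definitive Guide

Reference

Super detailed analysis | CDN HTTPS optimization practice, effective for the entire network in one minute
Illustrated SSL/TLS protocol
HTTP: The Definitive Guide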

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.