This article collects 28 PHP interview questions, with answers, to help you review the fundamentals.
After the New Year I plan to look for a new job, but I found that my understanding of many interview fundamentals was not deep enough. To keep myself moving forward, I recently started studying and summarizing the relevant material from forums and search engines. Some of the questions and answers below were shared by more experienced developers on forums, and some are questions I ran into in recent interviews. I have archived them here based on my own understanding and on what others have shared, in the hope that they help other readers. I also welcome corrections from experts wherever I have misunderstood something, and I will keep updating this list.
1. The underlying implementation is a hash table plus a doubly linked list (used to resolve hash collisions)
Hash table: a mapping function computes a hash value (Bucket->h) for each key, and that value indexes directly to the corresponding Bucket
The hash table stores the pointer of the current iteration position, so foreach is faster than for
Bucket: stores the key and value of an array element, as well as the hash value h
2. How to ensure orderliness
1. A mapping table with the same size as the element storage array is added between the hash function and the element array (the Bucket array).
2. The mapping table stores the index of each element in the actual storage array.
3. Elements are appended to the actual storage array in insertion order, and the mapping table records where each one landed.
4. The mapping table is only a conceptual idea; there is no separate table. Instead, when the Bucket memory is allocated during initialization, the same number of uint32_t-sized slots is allocated along with it, and arData is then offset to the position where the element array begins.
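The ordering guarantee described above is easy to observe from userland. The following is a minimal sketch (the variable names are only illustrative) showing that foreach visits keys in insertion order rather than in key order:

```php
<?php
// Keys are inserted in a deliberately unsorted order.
$arr = [];
$arr['b'] = 1;
$arr['a'] = 2;
$arr[5]   = 3;
$arr[1]   = 4;

// foreach walks the elements in insertion order, not in key order.
$seen = [];
foreach ($arr as $key => $value) {
    $seen[] = $key;
}
// $seen is ['b', 'a', 5, 1]
```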
3. Resolving hash collisions (PHP uses the linked list method):
1. Linked list method (separate chaining): when different keys map to the same slot, a linked list holds those keys, and a lookup walks the list to match the key
2. Open addressing: when a key maps to a slot that already holds data, probing continues through other slots until a free one is found (this occupies slots that other keys would hash to, so collisions become more likely and performance degrades)
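The linked list method can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: `ChainedHashTable` is a hypothetical class, `crc32()` stands in for the engine's real hash function, and a plain PHP array plays the role of each chain. It is not how the Zend engine lays out its HashTable.

```php
<?php
// Hypothetical illustration class, not PHP's internal HashTable.
final class ChainedHashTable
{
    /** @var array slot index => chain of [key, value] pairs */
    private $slots = [];
    /** @var int number of slots */
    private $size;

    public function __construct(int $size = 8)
    {
        $this->size = $size;
    }

    private function slotFor(string $key): int
    {
        // crc32() is an assumption standing in for the real hash function.
        return crc32($key) % $this->size;
    }

    public function set(string $key, $value): void
    {
        $slot = $this->slotFor($key);
        foreach ($this->slots[$slot] ?? [] as $i => $pair) {
            if ($pair[0] === $key) {
                // Key already present in the chain: overwrite its value.
                $this->slots[$slot][$i][1] = $value;
                return;
            }
        }
        // Collision or empty slot: append to the chain.
        $this->slots[$slot][] = [$key, $value];
    }

    public function get(string $key)
    {
        // Walk the chain and match on the key itself, not just the hash.
        foreach ($this->slots[$this->slotFor($key)] ?? [] as $pair) {
            if ($pair[0] === $key) {
                return $pair[1];
            }
        }
        return null;
    }

    public function chainLength(string $key): int
    {
        return count($this->slots[$this->slotFor($key)] ?? []);
    }
}

// With a single slot every key collides, which forces the chain to be used.
$table = new ChainedHashTable(1);
$table->set('a', 1);
$table->set('b', 2);
$table->set('a', 3); // overwrites the earlier value for 'a'
```

Using one slot is a deliberate worst case: every lookup degrades to a linear walk of the chain, which is why a good hash function and enough slots matter.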
4. Basic knowledge
Linked-list structures: queue, stack, doubly linked list
Linked list: each element holds a pointer to the next element
Doubly linked list: each element holds a pointer to the previous element and a pointer to the next element
2. Time complexity and space complexity of bubble sort
    $arr = [2, 4, 1, 5, 3, 6];
    $n = count($arr);
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            // Swap when the later element is larger, so the largest remaining value moves to position $i
            if ($arr[$i] < $arr[$j]) {
                $temp = $arr[$i];
                $arr[$i] = $arr[$j];
                $arr[$j] = $temp;
            }
        }
    }
    // Result: [6, 5, 4, 3, 2, 1]
First round: compare the first element of the array with every other element and swap whenever the other element is larger, so the largest element bubbles up to the first position
Second round: compare the second element with every element after it (the largest element has already been placed, so it is not compared again) and swap whenever the other element is larger, so the second-largest element ends up in the second position
Average time complexity: O(n^2)
Best-case time complexity: O(n); this requires an extra check: if the first pass makes no swaps, exit the loop immediately
Space complexity: O(1), the space taken by the temporary variable used when swapping elements
Best-case space complexity: O(1), when the array is already sorted and nothing needs to be swapped
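The O(n) best case only holds for a variant that compares adjacent elements and stops as soon as a full pass makes no swap. A minimal sketch of that variant follows; the function name `bubbleSortDesc` is an assumption for illustration, and it sorts in descending order to match the result shown earlier.

```php
<?php
// Adjacent-comparison bubble sort (descending) with an early-exit flag.
function bubbleSortDesc(array $arr): array
{
    $n = count($arr);
    for ($pass = 0; $pass < $n - 1; $pass++) {
        $swapped = false;
        // After each pass the smallest remaining value has sunk to the end.
        for ($j = 0; $j < $n - 1 - $pass; $j++) {
            if ($arr[$j] < $arr[$j + 1]) {
                [$arr[$j], $arr[$j + 1]] = [$arr[$j + 1], $arr[$j]];
                $swapped = true;
            }
        }
        if (!$swapped) {
            // No swap in a full pass: the array is already sorted, so stop.
            break;
        }
    }
    return $arr;
}

$sorted = bubbleSortDesc([2, 4, 1, 5, 3, 6]);
```

On input that is already in descending order, the first pass makes no swaps and the function returns after n - 1 comparisons.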
3. Time complexity and space complexity
Representation method: Big O notation
Constant order O(1)
Linear order O(n)
Square order O(n²)
Cubic order O(n³)
K-th power order O(n^k)
Exponential order O(2^n)
Logarithmic order O(log n)
Linearithmic order O(n log n)
Types of time complexity:
Best time complexity
Worst time complexity
Average time complexity
Amortized time complexity
Space complexity: short for asymptotic space complexity; it estimates how an algorithm's memory use grows (it describes the trend of the storage space the algorithm occupies, not the actual space occupied; the same is true of time complexity)
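To make the difference between linear order and logarithmic order concrete, the sketch below counts how many loop iterations a linear scan and a binary search need to find the last element of a sorted array. The function names are assumptions chosen for this illustration.

```php
<?php
// Counts loop iterations of a linear scan: O(n).
function linearSearchSteps(array $sorted, int $target): int
{
    $steps = 0;
    foreach ($sorted as $value) {
        $steps++;
        if ($value === $target) {
            break;
        }
    }
    return $steps;
}

// Counts loop iterations of a binary search: O(log n).
function binarySearchSteps(array $sorted, int $target): int
{
    $steps = 0;
    $lo = 0;
    $hi = count($sorted) - 1;
    while ($lo <= $hi) {
        $steps++;
        $mid = intdiv($lo + $hi, 2);
        if ($sorted[$mid] === $target) {
            break;
        }
        if ($sorted[$mid] < $target) {
            $lo = $mid + 1;   // target is in the upper half
        } else {
            $hi = $mid - 1;   // target is in the lower half
        }
    }
    return $steps;
}

$data   = range(1, 1024);
$linear = linearSearchSteps($data, 1024);  // visits every element
$binary = binarySearchSteps($data, 1024);  // roughly log2(1024) iterations
```

Doubling the array size doubles the work of the linear scan but adds only about one iteration to the binary search.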
Application layer, presentation layer, session layer, transport layer, network layer, (data) link layer, physical layer
Mnemonic:
The original mnemonic uses the first Chinese character of each layer name: 应 表 会 传 (物 链 网)
The application layer comes first (it appears so often that it is easy to remember)
The first four run top-down: application, presentation, session, transport
The last three are memorized bottom-up (physical, link, network), because 物链网 sounds like 物联网 ("Internet of Things") and is easier to remember than the top-down order
1. Both are transport layer protocols
2. TCP
Connection-oriented, so it can only be one-to-one
Byte-stream oriented
Reliable delivery: data is not lost
Full-duplex communication
3. UDP (the reverse of the TCP characteristics)
Connectionless; supports one-to-one, one-to-many, and many-to-many
Message-oriented (datagram) transmission
Small header overhead; delivery is not guaranteed to be reliable, but it is faster
1. Three-way handshake:
1) First: the client sends SYN = 1, seq = client_isn
Effect:
Client: none
Server: confirms that its own receiving and the client's sending work
2) Second: the server sends SYN = 1, seq = server_isn, ACK = client_isn + 1
Effect:
Client: confirms that its own sending and receiving work, and that the server's receiving and sending work
Server: confirms that its own receiving works and that the client's sending works (at this point the server cannot yet confirm that the client can receive)
3) Third: the client sends SYN = 0, ACK = server_isn + 1, seq = client_isn + 1
Effect: both sides have confirmed that each other's receiving and sending work, and the connection is established
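The sequence and acknowledgment numbers in the three steps follow a simple arithmetic pattern. The sketch below only models that arithmetic; `threeWayHandshake` is a hypothetical helper and no real sockets are involved.

```php
<?php
// Returns the three handshake segments as plain arrays (a model, not real networking).
function threeWayHandshake(int $clientIsn, int $serverIsn): array
{
    return [
        // 1) client -> server: SYN with the client's initial sequence number
        ['from' => 'client', 'SYN' => 1, 'seq' => $clientIsn, 'ack' => null],
        // 2) server -> client: SYN with the server's ISN, acknowledging client_isn + 1
        ['from' => 'server', 'SYN' => 1, 'seq' => $serverIsn, 'ack' => $clientIsn + 1],
        // 3) client -> server: no SYN, acknowledging server_isn + 1
        ['from' => 'client', 'SYN' => 0, 'seq' => $clientIsn + 1, 'ack' => $serverIsn + 1],
    ];
}

$segments = threeWayHandshake(100, 300);
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends learn that the other direction works.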
2. Four-way wave (closing the connection)
1) First: the client sends FIN
Effect: tells the server that the client has no more data to send (it can still receive data)
2) Second: the server sends ACK
Effect: tells the client that the request has been received. The server may still have data to send, so on receiving the ACK the client enters the FIN_WAIT_2 state and waits for the server to finish sending its data
3) Third: the server sends FIN once its own data has been sent
4) Fourth: the client replies with ACK
1. Status code classification
2. Common status codes
401 Unauthorized: the request lacks valid authentication credentials
403 Forbidden: the server understood the request but refuses to carry it out
3. Causes and solutions for 502
Cause: nginx forwards the request to the gateway (php-fpm), and the gateway fails while handling it
1) The fastcgi buffer is set too small
fastcgi_buffers 8 16k;
fastcgi_buffer_size 32k;
2) Too few php-cgi processes: count the running processes and compare the result with max_children
netstat -anpo | grep "php-cgi" | wc -l
max_children
3) The max_requests parameter specifies the maximum number of requests each child handles; a child is restarted once it reaches that number. Setting it too small causes frequent restarts: php-fpm distributes requests across the children in turn, so under heavy traffic every child reaches the limit at almost the same time. If the value is too small, many children may shut down together, nginx cannot forward requests to php-fpm, CPU utilization drops, and the load rises. Setting it too large lets memory leaks accumulate
4) PHP execution time exceeds the nginx timeout
fastcgi_connect_timeout
fastcgi_send_timeout
fastcgi_read_timeout
max_execution_time
1. Implementation:
Locking: setnx
Unlocking: del
Lock timeout: expire
2. Possible problems
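Two well-known pitfalls of the setnx + expire + del recipe are that setting the key and setting its expiry are separate steps, and that a plain del can remove a lock that now belongs to someone else. The sketch below illustrates the usual remedy (set-if-absent with an expiry in one step, and delete only when a per-owner token matches). `FakeRedis`, `setNxEx`, and `delIfEquals` are hypothetical in-memory stand-ins so the example runs without a server; against a real Redis this would typically be a single `SET key token NX EX ttl` command and a small compare-and-delete script.

```php
<?php
// Hypothetical in-memory stand-in for Redis; time is passed in explicitly for clarity.
final class FakeRedis
{
    /** @var array key => [value, expiresAt] */
    private $data = [];

    // Mimics "SET key value NX EX ttl": set only if absent (or expired), with an expiry, in one step.
    public function setNxEx(string $key, string $value, int $ttl, int $now): bool
    {
        if (isset($this->data[$key]) && $this->data[$key][1] > $now) {
            return false; // someone else holds a live lock
        }
        $this->data[$key] = [$value, $now + $ttl];
        return true;
    }

    // Mimics a compare-and-delete: remove the key only if the stored token matches.
    public function delIfEquals(string $key, string $value, int $now): bool
    {
        $live = isset($this->data[$key]) && $this->data[$key][1] > $now;
        if ($live && $this->data[$key][0] === $value) {
            unset($this->data[$key]);
            return true;
        }
        return false;
    }
}

$redis   = new FakeRedis();
$aLocked = $redis->setNxEx('lock:order', 'worker-a', 10, 1000); // A acquires the lock
$bLocked = $redis->setNxEx('lock:order', 'worker-b', 10, 1001); // B is refused
$bUnlock = $redis->delIfEquals('lock:order', 'worker-b', 1002); // B cannot release A's lock
$aUnlock = $redis->delIfEquals('lock:order', 'worker-a', 1003); // A releases its own lock
$bRetry  = $redis->setNxEx('lock:order', 'worker-b', 10, 1004); // now B can acquire it
```

The token check matters when a lock expires while its holder is still working: without it, the slow holder's del would release a lock that another worker has since acquired.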
1. string:
Ordinary key/value storage
2. hash:
Hashmap: a collection of key-value pairs, used to store object information
3. list:
Doubly linked list: message queues
4. set:
A hashmap whose values are always null: unordered and without duplicates; used for intersection, union, difference, and deduplication
5. zset:
Ordered set without duplicates: hashmap (deduplication) plus skiplist (ordering); used for leaderboards
1. RDB persistence (snapshot): a snapshot of the in-memory data set is written to disk (dump.rdb) at a specified interval; once the child process has finished writing the snapshot, the new file replaces the old one
2) The entire Redis database is contained in a single backup file
3) Performance is maximized: only a forked child process is needed to do the persistence work, which reduces disk I/O
4) A crash before the next snapshot may lose data
2. AOF persistence: records all of the server's write and delete operations as a log
1) Each time a write command is received, it is appended to the file appendonly.aof using the write function
2) The persistence file keeps growing and holds many redundant entries (incrementing a value from 0 to 100 one step at a time produces 100 log records)
3) Different fsync policies can be configured
12. Flash sale design process and difficulties
2. nginx load balancing
Three methods: DNS round robin, IP load balancing, CDN
3. Rate limiting
Methods: per-IP rate limiting, per-interface token rate limiting, per-user rate limiting, dynamic token in the header (encrypted on the front end, decrypted on the back end)
4. Distributed lock
Method:
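The article does not say which algorithm backs the rate limiting step, so the following is just one common choice: a token bucket. `TokenBucket` is a hypothetical class, and the clock is passed in explicitly so the behavior is deterministic; a real deployment would more likely keep the counters in Redis, keyed by IP or user id.

```php
<?php
// Hypothetical token bucket: holds up to $capacity tokens and refills at a fixed rate.
final class TokenBucket
{
    /** @var float */
    private $capacity;
    /** @var float tokens added per second */
    private $refillPerSecond;
    /** @var float */
    private $tokens;
    /** @var float */
    private $lastRefill;

    public function __construct(float $capacity, float $refillPerSecond, float $now)
    {
        $this->capacity        = $capacity;
        $this->refillPerSecond = $refillPerSecond;
        $this->tokens          = $capacity; // start full
        $this->lastRefill      = $now;
    }

    // Returns true and consumes one token if the request is allowed at time $now.
    public function allow(float $now): bool
    {
        $elapsed = max(0.0, $now - $this->lastRefill);
        $this->tokens = min($this->capacity, $this->tokens + $elapsed * $this->refillPerSecond);
        $this->lastRefill = $now;

        if ($this->tokens >= 1.0) {
            $this->tokens -= 1.0;
            return true;
        }
        return false;
    }
}

// Capacity 2, refilling one token per second.
$bucket  = new TokenBucket(2.0, 1.0, 0.0);
$results = [
    $bucket->allow(0.0), // true: first token
    $bucket->allow(0.0), // true: second token
    $bucket->allow(0.0), // false: bucket is empty
    $bucket->allow(1.0), // true: one token refilled after a second
    $bucket->allow(1.0), // false: empty again
];
```

A bucket allows short bursts up to its capacity while capping the sustained rate, which suits flash sale traffic that arrives in spikes.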
Method:
3. Validate data types and formats
4. Use prepared statements and bind variables
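Items 3 and 4 can be sketched with PDO. This example assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely so it runs without setup; the table and column names are made up for the illustration, and the same prepare/execute pattern applies to MySQL.

```php
<?php
// Assumption: pdo_sqlite is installed. The schema below is illustrative only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");
$pdo->exec("INSERT INTO users (name) VALUES ('bob')");

// Item 3: validate type and format before the value reaches the query.
$validId   = filter_var('42', FILTER_VALIDATE_INT);     // int 42
$invalidId = filter_var('42abc', FILTER_VALIDATE_INT);  // false

// Item 4: the statement is prepared once and the input is bound as data, never as SQL.
$stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');

$stmt->execute([':name' => "alice' OR '1'='1"]);
$injected = $stmt->fetchAll(PDO::FETCH_ASSOC); // the whole string is compared as a name

$stmt->execute([':name' => 'alice']);
$legit = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

Because the bound value is never parsed as SQL, the classic `' OR '1'='1` payload matches no rows instead of matching all of them.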
1. How the standard SQL isolation levels are implemented
Read uncommitted: other transactions can read uncommitted data directly (dirty reads)
The transaction does not lock the data it is currently reading
A row-level shared lock is added at the moment of an update and released at the end of the transaction
Read committed: data read between the start and the end of a transaction may be inconsistent, because other transactions may modify it in the meantime (non-repeatable reads)
The transaction adds a row-level shared lock to the data it is reading (only while it is being read) and releases it once the read completes
A row-level exclusive lock is added at the moment of an update and released at the end of the transaction
Repeatable read: data read between the start and the end of a transaction is consistent, because other transactions cannot modify it in the meantime (repeatable reads)
The transaction adds a row-level shared lock to the data it reads, held from the start of the transaction
A row-level exclusive lock is added at the moment of an update and released at the end of the transaction
Other transactions may still insert new rows while the transaction runs, which causes phantom reads
Serializable
The transaction adds a table-level shared lock when reading data
The transaction adds a table-level exclusive lock when updating data
2. InnoDB's transaction isolation levels and how they are implemented (note the difference from the above: one is the standard isolation level, the other is InnoDB's transaction isolation level)
1) Basic concepts
An index is a storage structure that helps the database find data efficiently. It is stored on disk, so using it requires disk I/O
1. Storage engine
MyISAM supports table locks; indexes and data are stored separately; suitable for cross-server migration
InnoDB supports row locks; indexes and data are stored in the same file
2. Index type
Hash index
Suitable for exact-match queries, with high efficiency
Cannot be sorted; not suitable for range queries
On a hash collision the linked list must be traversed (similar to how PHP arrays and Redis zset are implemented)
B-tree, B+ tree
Differences between the B-tree and the B+ tree
In a B+ tree all data is stored in the leaf nodes and internal nodes store only keys, so one disk I/O can fetch more nodes
In a B-tree both internal nodes and leaf nodes store keys and data, so a lookup does not have to reach a leaf node; an internal node can return the data directly
The B+ tree adds pointers from each leaf node to its adjacent leaf nodes, which makes range query traversal easier
1. The index and the data are stored together, and data on the same page is cached in the buffer (memory), so reading other data on the same page only needs a memory lookup
2. After data is updated, only the primary key index needs to be maintained; auxiliary indexes are not affected
3. Auxiliary indexes store the primary key value, which takes up more physical space
4. With random UUIDs the data is distributed unevenly, which causes the clustered index to scan the whole table and reduces efficiency, so prefer an auto-increment primary key id
Evaluate capacity and the number of shard tables -> choose the sharding key according to the business -> choose the sharding rule (hash, modulo, range) -> execute -> plan for future expansion
2. Horizontal split
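The hash and modulo sharding rules mentioned above amount to mapping a sharding key to a table suffix. A minimal sketch follows; the helper names and the `orders_NN` naming scheme are assumptions for illustration, and `crc32()` is just one convenient hash.

```php
<?php
// Modulo rule: suitable when the sharding key is already a numeric id.
function shardByUserId(string $baseTable, int $userId, int $shardCount): string
{
    if ($shardCount < 1) {
        throw new InvalidArgumentException('shardCount must be at least 1');
    }
    return sprintf('%s_%02d', $baseTable, $userId % $shardCount);
}

// Hash rule: suitable for string keys such as order numbers.
function shardByHash(string $baseTable, string $shardKey, int $shardCount): string
{
    if ($shardCount < 1) {
        throw new InvalidArgumentException('shardCount must be at least 1');
    }
    // crc32() is non-negative on 64-bit PHP builds, which this sketch assumes.
    return sprintf('%s_%02d', $baseTable, crc32($shardKey) % $shardCount);
}

$byId   = shardByUserId('orders', 10007, 4);    // 10007 % 4 = 3
$byHash = shardByHash('orders', 'SN-2023-0001', 4);
```

Both rules tie the routing to the shard count, which is why the expansion step in the process above needs planning: changing the count moves most keys to a different table unless something like consistent hashing is used.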
Server layer: connector -> cache -> analyzer (preprocessor) -> optimizer -> executor
Engine layer: queries and stores data
2. Select execution process
The client sends a request and establishes a connection
The server layer checks the cache and returns directly on a hit; otherwise it continues
The analyzer parses the SQL statement and preprocesses it (verifies field legality, types, and so on)
The optimizer generates the execution plan
The executor calls the engine API to fetch the results
The query results are returned
3. Update execution process
18. The functions and three formats of binlog
Function:
Format (binary file):
1) statement
2. To delete a table, only one SQL statement needs to be recorded; the change to each row does not have to be recorded. This saves I/O, improves performance, and reduces the log volume
3. Master-slave inconsistency may occur (stored procedures, functions, and so on)
4. At the RC (read committed) isolation level, binlog entries are recorded in transaction commit order, which may lead to master-slave replication inconsistency. This can be solved by the gap locks introduced at the repeatable read level
2) row
1. Records the modification of each row; the context of the SQL statement does not need to be recorded
2. Produces a large volume of binlog
3. Deleting a table: every deleted row is recorded
3) mixed
1. A mix of the first two formats
2. Chooses automatically which one to use based on the statement:
Ordinary SQL statement modifications use statement
Changes to table structure, functions, stored procedures, and similar operations use row
update and delete still record the change to every row
19. Principle of master-slave synchronization (master-slave replication), its problems, and read/write separation
1. Problems solved
Data distribution
2. Supported replication types (the three binlog formats)
SQL statement-based replication
3. Principle
1) Basic concept
The slave creates two threads
I/O thread
Log dump thread (created on the master)
1. After the start slave command is run on the slave node, an I/O thread is created to connect to the master node
4. Master-slave replication modes
1) Asynchronous mode (the default)
1. May lead to master-slave inconsistency (when there is replication delay)
2) Fully synchronous mode
1. More reliable, but affects the response time of the master
3) Semi-synchronous mode
1. Adds some reliability at the cost of some extra response time on the master
2. After the master receives a transaction committed by the client, it waits until the binlog has been sent to at least one slave and saved to that slave's local relay log; only then does the master commit the transaction and return to the client
4) Configuration of server-id and server-uuid
1. server-id identifies a database instance and prevents SQL statements from looping endlessly in chained master-slave and multi-master, multi-slave topologies
2. The default value of server-id is 0. For a master, the binary log is still written, but all slave connections are rejected. For a slave, server-id = 0 means it refuses to connect to other instances
3. server-id is a global variable; the service must be restarted after it is modified
4. When the server-id of the master and a slave are the same
With the default replicate-same-server-id = 0, the slave skips all replicated data, which leaves master and slave inconsistent
With replicate-same-server-id = 1, SQL may be executed in an infinite loop
When two slaves (B and C) share the same server-id, the master-slave connection becomes abnormal and keeps dropping and reconnecting
When the master (A) sees the same server-id, it disconnects the earlier connection and registers the new one
The connections from slaves B and C are therefore re-established over and over
The MySQL service automatically creates and generates the server-uuid configuration
During master-slave synchronization, if the master and slave instances have the same server-uuid, an error is reported and replication exits. The error can be avoided by setting replicate-same-server-id = 1 (not recommended)
5. Read/write separation
1) Implemented in code, which reduces hardware costs
2) Implemented through an intermediate proxy
3) Master-slave delay
The slave's performance is worse than the master's
A large number of queries puts heavy pressure on the slave and consumes a lot of CPU, which slows synchronization: use one master with multiple slaves
Large transactions: the binlog is not written until the transaction finishes, so the slave reads it late
DDL on the master (alter, drop, create)
20. Deadlock
1. The four necessary conditions
1. Mutual exclusion
2. Hold and wait (to break it: allocate all resources at once, or none at all)
3. No preemption (to break it: a process that holds some resources and is waiting for others releases the ones it holds)
4. Circular wait
Understanding: a resource can be held by only one process at a time; a process can request new resources while holding others; resources already held cannot be taken away; and several processes are each waiting for resources held by the others
2. Resolving deadlock
1. Terminate the processes (kill them all)
2. Terminate them one by one (kill one and check whether the deadlock is resolved)
21. MySQL optimization for large-offset paging queries: limit 100000 (offset), 10 (page_size)
1. Reason
When MySQL runs a paging query it does not skip the offset (100000) directly; it fetches offset + page_size = 100000 + 10 = 100010 rows and then discards the first 100,000, so efficiency drops
2. Optimization plan
Deferred join: use a covering index
Primary key threshold method: when the primary key is auto-incrementing, compute the maximum and minimum primary key values that satisfy the conditions (using a covering index)
Record the position of the last row on the previous page and avoid using OFFSET
22. Redis cache and MySQL data consistency
Method:
1. Update Redis first, then update the database
Scenario: update set value = 10 where value = 9
1) Redis update succeeds: redis value = 10
2) Database update fails: mysql value = 9
3) The data is inconsistent
2. Update the database first, then update Redis
Scenario: process A runs update set value = 10 where value = 9; process B runs update set value = 11 where value = 9;
1) Process A updates the database first and has not yet written the cache: mysql value = 10; redis value = 9
2) Process B updates the database, commits the transaction, and writes the cache: mysql value = 11; redis value = 11;
3) Process A finishes the request, commits the transaction, and writes the cache: redis value = 10;
4) Final state: mysql value = 11; redis value = 10
3. Delete the cache first, then update the database
Scenario: process A runs update set value = 10 where value = 9; process B queries value;
1) Process A deletes the cache first and has not yet modified the data, or its transaction has not been committed
2) Process B queries, misses the cache, reads the database, and writes the cache: redis value = 9
3) Process A finishes updating the database: mysql value = 10
4) Final state: mysql value = 10; redis value = 9
Solution:
1. Delayed double deletion
Scenario: process A runs update set value = 10 where value = 9; process B queries value;
1) Process A deletes the cache first and has not yet modified the data, or its transaction has not been committed
2) Process B queries, misses the cache, reads the database, and writes the cache: redis value = 9
3) Process A finishes updating the database: mysql value = 10
4) Process A estimates a delay, sleeps, and then deletes the cache again
5) Final state: mysql value = 10; redis value is empty (the next read goes to the database)
6) The reason for the delay is to keep process B from writing the cache after process A has finished its update
2. Request serialization
1) Create two queues: an update queue and a query queue
2) When the cache misses and the database must be queried, store the key in the update queue
3) If a new request arrives before the query completes and finds the key still in the update queue, put the key into the query queue and wait; if the key is not there, repeat step 2
4) If the key being queried is already in the query queue, it does not need to be written to the queue again
5) After the data update completes, rpop the update queue and also rpop the query queue to release the waiting query requests
6) A query request can poll the cache with while plus sleep up to a maximum delay; if the data is still not ready by then, it returns empty
23. connect and pconnect in Redis
1. connect: the connection is released after the script ends
1. close: releases the connection
2. pconnect (persistent connection): the connection is not released when the script ends; it is kept in the php-fpm process and lives as long as that process does
1. close does not release the connection
It only prevents the current php-cgi process from sending further Redis requests
Later connections in the same php-cgi process can still reuse it, until php-fpm ends its life cycle
2. Reduces the cost of establishing Redis connections
3. Keeps one php-fpm process from establishing multiple connections
4. Consumes more memory, and the number of connections keeps growing
5. An earlier request handled by the same php-fpm worker subprocess (php-cgi) may affect the next request
3. The connection reuse problem with pconnect
Variable A runs select db 1; variable B runs select db 2; this changes the db used by variable A
Solution: create one connection instance per db
24. How the Redis zset ordered set uses skiplist
1. Basic concepts
1. A skiplist is a randomized data structure that stores elements in order in layered linked lists (it can only be used when the elements are ordered)
2. The skiplist evolved from ordered linked lists and multi-level linked lists
3. Duplicate values are allowed, so comparisons must check not only the key but also the value
4. Each node has a backward pointer of height 1, used for iterating from the tail toward the head
5. Time complexity O(log n), space complexity O(n)
2. Comparison between the skiplist and a balanced tree
1) Range query efficiency
Skiplist range queries are more efficient: after finding the minimum value, only the first-level linked list needs to be traversed for as long as the values stay below the maximum
A balanced tree, after finding the minimum value, has to do an in-order traversal to find the other nodes that do not exceed the maximum
2) Memory usage
skiplist: the number of pointers per node is 1/(1-p)
Balanced tree: each node has 2 pointers
3) Insertion and deletion
skiplist only needs to modify the pointers of adjacent nodes
A change in a balanced tree triggers adjustments to the subtree
25. Redis expiration deletion and eviction mechanism
1. Conventional expired-key deletion strategies
1) Timed deletion
A timer deletes the key immediately when it expires
Memory is released promptly but more CPU is consumed; under high concurrency this uses CPU resources and slows request processing
Memory-friendly, CPU-unfriendly
2) Lazy deletion
The key is left alone when it expires; the next time it is accessed, it is checked and deleted if expired
A large number of expired keys that are never accessed again may accumulate and cause memory overflow
Memory-unfriendly, CPU-friendly
3) Periodic deletion
Checks at intervals and deletes expired keys
How many keys to check and how many to delete is decided by the algorithm
2. Redis uses lazy deletion plus periodic deletion
Periodically, a random sample of keys that have an expiration time is tested, and expired ones are deleted
Each cleanup run must not use more than 25% of CPU time; when the limit is reached the check exits
Keys that periodic deletion has missed and that will never be accessed again remain in memory, so an eviction policy is needed as well
3. Eviction policies (applied when memory is insufficient to write new data)
volatile-lru: among keys with an expiration time, the least recently used are evicted first
volatile-ttl: among keys with an expiration time, those expiring soonest are evicted first
volatile-random: among keys with an expiration time, keys are evicted at random
26. Redis common problems and solutions
1. Cache avalanche: a large number of cache entries expire at the same time, so requests query the database directly; database memory and CPU pressure rises and the database may even go down
2. Cache penetration: the data exists in neither the cache nor the database; under a large number of requests they all go straight through to the database and may bring it down
3. Cache breakdown: the data exists in the database, but the cache entry suddenly expires and a large number of requests arrive, which raises database pressure and may bring it down
27. Detailed explanation and life cycle of php-fpm
1. Basic knowledge
1) CGI protocol
Code files in a dynamic language must go through the corresponding parser before the server can recognize them
2) CGI program
php-cgi is a CGI program that complies with the CGI protocol
3) FastCGI protocol
Like CGI it is a protocol/standard, but it is optimized on the basis of CGI and is more efficient. It is used to improve the performance of CGI programs
It implements the management of CGI processes
4) FastCGI program = php-fpm
php-fpm is a FastCGI program that complies with the FastCGI protocol
How a FastCGI program manages CGI programs
Start a master process, parse the configuration file, and initialize the environment
Start multiple worker subprocesses
After a request is received, pass it to a worker process for execution
process_control_timeout: the timeout for child processes to react to signals from the master (requests are processed within the specified time and abandoned if they cannot finish)
It sets how long php-fpm leaves the fastcgi processes to respond to a restart signal
process_control_timeout = 0 means it is disabled, and a smooth restart cannot be guaranteed
Setting process_control_timeout too large may block requests to the system
With process_control_timeout = 10, if the code logic takes 11 s, a restart may cause code that is partway through execution to exit
Recommended value: request_terminate_timeout
Graceful restart
Forced restart
2. php-fpm life cycle: to be updated
PHP-FPM life cycle: https://www.abelzhou.com/php/php-fpm-lifespan/
28. Communication between Nginx and PHP
1. Communication method: fastcgi_pass
1) tcp socket
Works across servers; when nginx and PHP are not on the same machine, this is the only option
A connection-oriented protocol, which better guarantees the correctness and integrity of the communication
2) unix socket
Does not go through the network protocol stack, packing and unpacking, and so on
Avoids TCP overhead and is more efficient than a tcp socket
Unstable under high concurrency: a sudden surge in connections produces a large amount of long-lived cache, and large data packets may directly return errors
29. Web vulnerabilities and problems
1. SQL injection
2. XSS attack: Recommended reading: [How to prevent XSS attacks?](https://tech.meituan.com/2018/09/27/fe-security.html)
3. CSRF attack: Recommended reading: [Front-end security series (2): How to prevent CSRF attacks?](https://tech.meituan.com/2018/10/11/fe-security-csrf.html)
4. File upload vulnerability: Recommended reading: [A brief analysis of file upload vulnerabilities](https://xz.aliyun.com/t/7365)
5. Cross-domain issues:
1) jsonp
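For the XSS item in this section, the most basic server-side defense in PHP is to escape untrusted text at the point where it is written into HTML. The sketch below is a minimal example of that one measure; `escapeHtml` is a hypothetical wrapper name, and output escaping alone does not cover every context (URLs, inline JavaScript, and so on) discussed in the recommended reading.

```php
<?php
// Escapes untrusted text for use inside HTML element content or quoted attributes.
function escapeHtml(string $input): string
{
    return htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$payload = '<script>alert("x")</script>';
$escaped = escapeHtml($payload);
// $escaped is '&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;'
```

The browser renders the escaped string as visible text instead of executing it as a script.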