When trying to scrape page content using cURL, you may encounter issues with redirections or "page moved" errors, particularly if the query string contains special characters.
To resolve this, make sure the query string is properly URL-encoded before the request is sent, and configure cURL to follow redirects. The following snippet sets the relevant options:
<code class="php">/**
 * Function to retrieve a web page using cURL.
 */
function get_web_page(string $url): array
{
    $user_agent = 'Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0';

    $options = [
        CURLOPT_CUSTOMREQUEST  => "GET",        // Set request type as GET
        CURLOPT_POST           => false,        // Set to GET
        CURLOPT_USERAGENT      => $user_agent,  // Set user agent
        CURLOPT_COOKIEFILE     => "cookie.txt", // Set cookie file
        CURLOPT_COOKIEJAR      => "cookie.txt", // Set cookie jar
        CURLOPT_RETURNTRANSFER => true,         // Return web page
        CURLOPT_HEADER         => false,        // Don't return headers
        CURLOPT_FOLLOWLOCATION => true,         // Follow redirects
        CURLOPT_ENCODING       => "",           // Handle all encodings
        CURLOPT_AUTOREFERER    => true,         // Set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,          // Timeout on connect
        CURLOPT_TIMEOUT        => 120,          // Timeout on response
        CURLOPT_MAXREDIRS      => 10,           // Stop after 10 redirects
    ];

    $ch = curl_init($url);
    curl_setopt_array($ch, $options);
    $content = curl_exec($ch);
    $err     = curl_errno($ch);
    $errmsg  = curl_error($ch);
    $header  = curl_getinfo($ch);
    curl_close($ch);

    $header['errno']   = $err;
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;

    return $header;
}

// Example of using the function to get a web page:
$result = get_web_page('https://www.example.com/page');

if ($result['errno'] != 0) {
    // Handle error: bad url, timeout, redirect loop
}

if ($result['http_code'] != 200) {
    // Handle error: no page, no permissions, no service
}

$page = $result['content'];</code>
With these options set, in particular CURLOPT_FOLLOWLOCATION to follow redirects, a browser-like user agent, and CURLOPT_ENCODING to accept compressed responses, the request should return the content of the target page in most cases.
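The function above sends the URL exactly as it is given, so any special characters in the query string need to be encoded before calling it. The following is a minimal sketch of one way to do that with PHP's built-in `http_build_query()`. The helper name `build_url` and the example parameters are assumptions made for illustration and are not part of the original snippet.

```php
<?php
/**
 * Hypothetical helper (not from the original snippet): appends a
 * percent-encoded query string to a base URL.
 */
function build_url(string $base, array $params): string
{
    if ($params === []) {
        return $base; // Nothing to append
    }

    // PHP_QUERY_RFC3986 encodes spaces as %20 instead of "+"
    $query = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

    // Use "&" if the base URL already carries a query string
    $separator = (strpos($base, '?') === false) ? '?' : '&';

    return $base . $separator . $query;
}

// Example values are illustrative only
$url = build_url('https://www.example.com/search', [
    'q'    => 'red & blue shoes',
    'lang' => 'en',
]);

echo $url, PHP_EOL;
// prints https://www.example.com/search?q=red%20%26%20blue%20shoes&lang=en
```

The resulting `$url` can then be passed to `get_web_page()`. Encoding each value this way keeps characters such as `&`, spaces, and `#` from being misread as URL delimiters, which is one plausible cause of the unexpected redirects described earlier.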