When we don't know how many positional arguments will be passed to a function (for example, when passing along a list or tuple of values), we use *args:
def func(*args):
    for i in args:
        print(i)

func(3, 2, 1, 4, 7)
3
2
1
4
7
When we don’t know how many keyword arguments to pass, use **kwargs to collect keyword arguments:
def func(**kwargs):
    for i in kwargs:
        print(i, kwargs[i])

func(a=1, b=2, c=7)
a 1
b 2
c 7
Use the command os.remove(filename) or os.unlink(filename)
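A minimal sketch (the filename here is made up for the example):

```python
import os

# Create a throwaway file so there is something to delete (hypothetical name).
with open("temp_demo.txt", "w") as f:
    f.write("to be deleted")

os.remove("temp_demo.txt")              # os.unlink("temp_demo.txt") is equivalent
print(os.path.exists("temp_demo.txt"))  # the file is gone
```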
You can access a module written in Python from C via:
module = PyImport_ImportModule("<modulename>");
It is the floor division operator: it divides one operand by the other and returns the quotient rounded down to the nearest whole number (toward negative infinity), discarding the fractional part.
For example, 10 // 5 = 2 and 10.0 // 5.0 = 2.0.
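A quick sketch in the interpreter; note that floor division rounds toward negative infinity, which matters for negative operands:

```python
print(10 // 5)      # 2
print(10.0 // 5.0)  # 2.0
print(7 // 2)       # 3, the fractional part is dropped
print(-7 // 2)      # -4, rounded down (toward negative infinity), not toward zero
```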
Leading spaces in a string are spaces that appear before the first non-space character in the string.
We use the lstrip() method to remove them from the string.
' Data123 '.lstrip()
Result:
'Data123 '
The original string contains both leading and trailing spaces; calling lstrip() removes only the leading ones. To remove trailing spaces, we can use the rstrip() method.
'Data123 '.rstrip()
'Data123'
a, b = 0, 1
while b < 100:
    print(b)
    a, b = b, a + b
If the string contains only numeric characters, you can use the function int() to convert it to an integer.
int('22')
Let’s check the variable type:
type('22')
<class 'str'>
type(int('22'))
<class 'int'>
To generate random numbers, we can import the function random() from the random module.
from random import random
random()
0.013501571090371978
We can also use the function randint(), which takes two parameters defining an interval and returns a random integer within that interval (both endpoints included).
from random import randint
randint(2, 7)
4
The simplest way is to use the capitalize() method.
'daxie'.capitalize()
'Daxie'
For this problem, we can use the isalnum() method.
'DATA123'.isalnum()
True
'DATA123!'.isalnum()
False
We can also use some other methods:
'123'.isdigit()    # is the string made up of digits only?
True
'123'.isnumeric()  # like isdigit(), but also covers other Unicode numeric characters
True
'data'.islower()   # are all cased characters lowercase?
True
'Data'.isupper()   # are all cased characters uppercase?
False
Concatenation in Python joins two sequences together. We use the + operator to do it:
'22' + '33'
'2233'
[1, 2, 3] + [4, 5, 6]
[1, 2, 3, 4, 5, 6]
(2, 3) + (4)
TypeError                                 Traceback (most recent call last)
<ipython-input-7-69a1660f2fc5> in <module>
----> 1 (2,3)+(4)
TypeError: can only concatenate tuple (not "int") to tuple
The error occurs because (4) is treated as an integer, not a tuple. Add a trailing comma and rerun:
(2, 3) + (4,)
(2, 3, 4)
A function is recursive when it calls itself, directly or indirectly, during its execution. To avoid infinite recursion, there must be a termination condition. For example:
def facto(n):
    if n == 1:
        return 1
    return n * facto(n - 1)

facto(5)
120
The generator will generate a series of values for iteration, so it is an iterable object.
It continuously calculates the next element during the for loop and ends the for loop under appropriate conditions.
We define a function that can "yield" values one by one, and then use a for loop to iterate over it.
def squares(n):
    i = 1
    while i <= n:
        yield i ** 2
        i += 1

for i in squares(5):
    print(i)
1
4
9
16
25
Iterator is a way to access the elements of a collection.
The iterator object starts accessing from the first element of the collection until all elements have been accessed.
The iterator can only move forward, not backward. We create an iterator using the iter() function.
odds = iter([1, 2, 3, 4, 5])
# each time we want the next object, we call next()
next(odds)
1
next(odds)
2
next(odds)
3
next(odds)
4
next(odds)
5
1) With a generator we define a function; with an iterator we use the built-in functions iter() and next();
2) In a generator, the keyword 'yield' generates/returns an object each time;
3) A generator can contain as many 'yield' statements as you like;
4) A generator saves the state of its local variables each time the loop pauses; an iterator only needs an iterable object to iterate over and keeps no local-variable state;
5) You can implement your own iterator using a class, but you cannot implement a generator that way;
6) Generators run fast and have concise, simpler syntax;
7) Iterators can save memory.
Newbies to Python may not be very familiar with this function. zip() can return an iterator of tuples.
list(zip(['a', 'b', 'c'], [1, 2, 3]))
[('a', 1), ('b', 2), ('c', 3)]
Here the zip() function pairs up the data items of the two lists and creates a tuple from each pair.
We can use the getcwd() function imported from the os module.
import os
os.getcwd()
'C:\\Users\\37410\\Desktop\\code'
This is also simple: just call the function len() on the string whose length we want:
len('Data 123')
8
list.pop([index]) removes and returns the element at the given index; called with no argument, it removes and returns the last element of the list.
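A short sketch of both forms:

```python
nums = [1, 2, 3, 4]
last = nums.pop()    # no argument: removes and returns the last element
first = nums.pop(0)  # an index removes and returns that position instead
print(last, first, nums)  # 4 1 [2, 3]
```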
Sometimes, when we want to traverse a list, a few built-in functions come in handy.
1)filter()
filter() lets us keep only the values that satisfy some condition.
list(filter(lambda x: x > 5, range(8)))
[6, 7]
2) map()
map() applies a function to every element of an iterable.
list(map(lambda x: x ** 2, range(8)))
[0, 1, 4, 9, 16, 25, 36, 49]
3) reduce()
reduce() repeatedly combines the elements of a sequence until a single value remains.
from functools import reduce
reduce(lambda x, y: x - y, [1, 2, 3, 4, 5])
-13
def list_sum(num_List):
    if len(num_List) == 1:
        return num_List[0]
    else:
        return num_List[0] + list_sum(num_List[1:])

print(list_sum([3, 4, 5, 6, 11]))
29
import random

def random_line(fname):
    lines = open(fname).read().splitlines()
    return random.choice(lines)

print(random_line('test.txt'))
def file_lengthy(fname):
    with open(fname) as f:
        for i, l in enumerate(f):
            pass
    return i + 1

print("file of lines:", file_lengthy("test.txt"))
import os

os.chdir(r'C:\Users\lifei\Desktop')  # raw string so the backslashes are not escapes
with open('Today.txt') as today:
    count = 0
    for i in today.read():
        if i.isupper():
            count += 1
    print(count)
The following code can be used to sort a list of numeric strings in Python:
nums = ["1", "4", "0", "6", "9"]  # a name other than list, to avoid shadowing the built-in
nums = [int(i) for i in nums]
nums.sort()
print(nums)

About Django
The Django framework follows an MVC-style design but uses its own term for it: MVT.
M stands for Model, the same role as M in MVC: it is responsible for data handling and embeds the ORM framework;
V stands for View, the same role as C in MVC: it receives the HttpRequest, performs the business logic, and returns an HttpResponse;
T stands for Template, the same role as V in MVC: it is responsible for building the HTML to be returned and embeds the template engine.
Flask is a "microframework" mainly intended for small applications with simpler requirements.
Pyramid suits larger applications: it is flexible and lets developers choose the right tools for their project, including the database, URL structure, and templating style.
Django, like Pyramid, can also be used for larger applications. It includes an ORM.
Django architecture
Developers supply the models, views, and templates, then map them to URLs, and Django serves them to users.
Django uses SQLite as its default database; it stores the data as a single file on the filesystem.
If you have a database server (PostgreSQL, MySQL, Oracle, MSSQL) and want to use it instead of SQLite, create a new database for your Django project with that database's administration tools.
Either way, with your (empty) database in place, all that remains is telling Django how to use it.
That is what the project's settings.py file is for.
We add lines like the following to settings.py:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}
This is how we write a view in Django:
from django.http import HttpResponse
import datetime

def current_datetime(request):
    now = datetime.datetime.now()
    html = "<html><body>It is now %s</body></html>" % now
    return HttpResponse(html)
It returns the current date and time as an HTML document.
A template is a simple text file.
It can produce any text-based format, such as XML, CSV, HTML, and so on.
A template contains variables, which are replaced with values when the template is evaluated, and tags ({% tag %}) that control the template's logic.
Sessions, provided by Django, let you store and retrieve data on a per-site-visitor basis.
Django abstracts the process of sending and receiving cookies by placing a session ID cookie on the client side and storing all the related data on the server side.
So the data itself is never stored on the client.
From a security standpoint, this is good.
In Django, there are three possible inheritance styles:
Abstract base classes: used when you only want the parent class to hold information that you don't want to type out for every child model;
Multi-table inheritance: used when subclassing an existing model where each model needs its own database table;
Proxy models: used when you only want to modify a model's Python-level behavior without changing its fields.
Data analysis
map() executes the function given as its first argument on every element of the iterable given as its second argument. If the function takes more than one argument, pass the same number of iterables.
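For example, a two-argument function takes two iterables, and map consumes them in step:

```python
# One iterable for a one-argument function:
print(list(map(lambda x: x * 2, [1, 2, 3])))                   # [2, 4, 6]

# Two iterables for a two-argument function:
print(list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30])))  # [11, 22, 33]
```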
We can get the indices of the N largest values in a NumPy array with the code below:
import numpy as np

arr = np.array([1, 3, 2, 4, 5])
print(arr.argsort()[-3:][::-1])
[4 3 1]
Q86. How do you compute a percentile with Python/NumPy?
import numpy as np

a = np.array([1, 2, 3, 4, 5])
p = np.percentile(a, 50)  # returns the 50th percentile, i.e. the median
print(p)
3.0
1) Python lists are efficient general-purpose containers.
They support (fairly) efficient insertion, deletion, appending, and concatenation, and Python's list comprehensions make them easy to construct and manipulate.
2) They have certain limitations.
They do not support "vectorized" operations such as element-wise addition and multiplication, and because they may contain objects of different types, Python must store type information for every element and execute type-dispatch code for every operation on every element.
3) NumPy is not only more efficient but also more convenient.
You get a lot of vector and matrix operations, which can sometimes avoid unnecessary work.
4) NumPy arrays are faster
You get FFT, convolution, fast search, basic statistics, linear algebra, histograms, and more built in.
Decorators in Python are used to modify or inject code in functions or classes.
Using decorators, you can wrap a class or function method call so that a section of code is executed before or after the original code is executed.
Decorators can be used to check permissions, modify or track parameters passed to methods, log calls to specific methods, etc.
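A minimal sketch of a logging decorator (the names log_call and add are illustrative):

```python
import functools

def log_call(func):
    """Wrap func so extra code runs before and after the original call."""
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__} with {args}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned {result}")
        return result
    return wrapper

@log_call
def add(a, b):
    return a + b

add(2, 3)  # logs the call before and after, then returns 5
```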
1) In an ideal world, NumPy would contain only the basic array data type and the most basic operations: indexing, sorting, reshaping, basic element-wise functions, and so on.
2) All numerical code would reside in SciPy. Even so, NumPy keeps backward compatibility as a goal and strives to retain all the features supported by its predecessors.
So, although they belong more appropriately in SciPy, NumPy still includes some linear algebra functions. In any case, SciPy contains a more comprehensive version of the linear algebra module, plus many other numerical algorithms.
If you use Python for scientific computing, it is advisable to install both NumPy and SciPy. Most new features belong in SciPy rather than NumPy.
As with 2D plotting, 3D graphics are beyond the scope of NumPy and SciPy, but just like the 2D case, there are packages that integrate with NumPy.
Matplotlib provides basic 3D plotting in the mplot3d subpackage, while Mayavi uses the powerful VTK engine to provide a variety of high-quality 3D visualization functions.
Crawlers and the Scrapy framework
Scrapy is a Python crawler framework with extremely high crawling efficiency and strong customizability, but it does not natively support distributed crawling.
And scrapy-redis is a set of components based on the redis database that runs on top of the scrapy framework, allowing scrapy to support a distributed strategy: the Slaver side shares the item queue, request queue, and request-fingerprint set stored in the Master side's redis database.
Because redis supports master-slave synchronization and data is cached in memory, distributed crawlers based on redis are very efficient in high-frequency reading of requests and data.
Python comes with: urllib, urllib2
Third party: requests
Framework: Scrapy
Both the urllib and urllib2 modules perform operations related to requesting URLs, but they provide different functionality:
urllib2.urlopen can accept a Request object or a URL (when given a Request object, you can set the headers of the request), while urllib.urlopen only accepts a URL;
urllib has urlencode and urllib2 does not, which is why the two are often used together.
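(In Python 3, these modules were merged into the urllib package; urlencode now lives in urllib.parse.) A quick sketch:

```python
from urllib.parse import urlencode

# Build a query string from a dict of parameters.
query = urlencode({"q": "python", "page": 2})
print(query)  # q=python&page=2
```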
Scrapy is a packaged framework. It includes a downloader, a parser, logging, and exception handling, and is based on multithreading and Twisted. It has advantages for crawling a fixed, single website; but for crawling 100 sites across many domains it is not flexible enough for concurrent and distributed processing, and it is inconvenient to adjust and extend.
requests is an HTTP library used only to make requests; for HTTP requests it is a powerful library, but downloading and parsing are handled by you. This gives higher flexibility, supports high concurrency and distributed deployment well, and makes it easier to implement exactly the functionality you need.
There are two main engines, MyISAM and InnoDB. The main differences are as follows:
1) InnoDB supports transactions, but MyISAM does not. This is very important: a transaction is a high-level processing method; for example, in a series of inserts, deletes, and updates, if anything fails you can roll back and restore, which MyISAM cannot do;
2) MyISAM is more suitable for applications dominated by queries and inserts, while InnoDB suits applications that require frequent modification or higher security;
3)InnoDB supports foreign keys, but MyISAM does not;
4) MyISAM is the default engine, InnoDB needs to be specified;
5) InnoDB does not support FULLTEXT type index;
6) InnoDB does not store the table's row count. A select count(*) from table therefore requires InnoDB to scan the whole table to count the rows, while MyISAM simply reads the stored row count.
Note that when the count(*) statement contains the where condition, MyISAM also needs to scan the entire table;
7) For an auto-increment field, InnoDB must have an index containing only that field, while a MyISAM table can build a joint index together with other fields;
8) When clearing an entire table, InnoDB deletes the rows one by one, which is very slow, while MyISAM rebuilds the table;
9) InnoDB supports row-level locks (though in some cases the whole table is locked, e.g. update table set a=1 where user like '%lee%').
Q94. Describe how the scrapy framework operates.
The scheduler hands the requests in the request queue to the downloader, which fetches the response resource for each request and passes the response to the parsing method you wrote for extraction:
1) If the required data is extracted, it is handed over to the pipeline files for processing;
2) If URLs are extracted, the previous steps are repeated (send the URL request, and the engine hands the request to the scheduler to be queued...) until there are no requests left in the queue and the program ends.
Combine multiple tables in a query, mainly with inner join, left join, right join, and full (outer) join.
For IO-bound code (file processing, web crawlers, etc.), multithreading can effectively improve efficiency: with a single thread, IO operations cause IO waits and unnecessary lost time, while with multithreading enabled, execution automatically switches to thread B while thread A is waiting, so CPU resources are not wasted and program efficiency improves.
In the actual data collection process, you need to consider not only the network speed and response issues, but also the hardware conditions of your own machine to set up multi-process or multi-thread.
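A small sketch with concurrent.futures, using time.sleep as a stand-in for a blocking IO call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(n):
    time.sleep(0.1)  # stands in for a download or a file read
    return n * n

start = time.time()
with ThreadPoolExecutor(max_workers=5) as pool:
    # The five 0.1 s waits overlap, so the total is far less than 5 * 0.1 s.
    results = list(pool.map(fake_io, range(5)))
elapsed = time.time() - start

print(results)  # [0, 1, 4, 9, 16]
```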
1) Optimize indexes, SQL statements, and analyze slow queries;
2) Optimize hardware: use SSDs, use disk-array technology (RAID0, RAID1, RAID5), etc.;
3) Use MySQL’s own table partitioning technology to layer data into different files, which can improve disk reading efficiency;
4) Choose the appropriate table engine and optimize parameters;
5) Carry out architecture-level caching, staticization and distribution;
6) Use faster storage methods, such as NoSQL to store frequently accessed data
1) IP
2) Bandwidth
3) CPU
4) IO
1) Scrapy comes with
2) Paid interface
1) Headers-based anti-crawling. Anti-crawling based on the request Headers is the most common anti-crawler strategy.
You can add Headers directly to the crawler and copy the browser's User-Agent to the crawler's Headers; or modify the Referer value to the target website domain name.
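A sketch with the standard library; the URL and header values here are placeholders, and no request is actually sent:

```python
import urllib.request

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # copied from a browser
    "Referer": "https://example.com/",                          # target site's domain
}
req = urllib.request.Request("https://example.com/page", headers=headers)

# urllib.request.urlopen(req) would now send the request with these headers.
print(req.get_header("User-agent"))
```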
2) Anti-crawler based on user behavior
The site detects user behavior, such as the same IP visiting the same page many times in a short period, or the same account performing the same operation many times in a short period.
Most websites are in the former situation. For this situation, using an IP proxy can solve it.
You can write a special crawler to crawl the public proxy IPs on the Internet, and save them all after detection.
After you have a large number of proxy IPs, you can change one IP every few requests. This is easy to do in requests or urllib2, so that you can easily bypass the first anti-crawler.
For the second case, you can randomly wait a few seconds after each request before making the next request.
Some websites with logical loopholes can bypass the restriction that the same account cannot make the same request multiple times in a short period of time by requesting several times, logging out, logging in again, and continuing to request.
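A rough sketch of rotating proxies with the standard library (the proxy addresses are made up, and no request is actually sent here):

```python
import random
import urllib.request

# A hypothetical pool of previously validated proxy IPs.
proxy_pool = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

def opener_with_random_proxy():
    proxy = random.choice(proxy_pool)  # pick a different proxy each time
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)

opener = opener_with_random_proxy()
# opener.open(url) would route the request through the chosen proxy.
```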
3) Anti-crawler for dynamic pages
First use Fiddler to analyze the network requests. If we can find the ajax request and work out its specific parameters and the specific meaning of the response, we can use the method above:
Use requests or urllib2 to simulate an ajax request and parse the response's JSON format to get the required data.
But some websites encrypt all the parameters of the ajax request, making it impossible to construct the request for the data you need.
In that case, use selenium with PhantomJS to drive a browser kernel, and let PhantomJS execute JS to simulate human operations and trigger the page's scripts.
The above is the detailed content of What are the frequently asked interview questions in Python?. For more information, please follow other related articles on the PHP Chinese website!