What is the role of the Code object in the Python virtual machine?

PHPz
Release: 2023-05-10 17:46:06

Code Object data structure

typedef struct {
    PyObject_HEAD
    int co_argcount;		/* #arguments, except *args */
    int co_kwonlyargcount;	/* #keyword only arguments */
    int co_nlocals;		/* #local variables */
    int co_stacksize;		/* #entries needed for evaluation stack */
    int co_flags;		/* CO_..., see below */
    PyObject *co_code;		/* instruction opcodes */
    PyObject *co_consts;	/* list (constants used) */
    PyObject *co_names;		/* list of strings (names used) */
    PyObject *co_varnames;	/* tuple of strings (local variable names) */
    PyObject *co_freevars;	/* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest aren't used in either hash or comparisons, except for
       co_name (used in both) and co_firstlineno (used only in
       comparisons).  This is done to preserve the name and line number
       for tracebacks and debuggers; otherwise, constant de-duplication
       would collapse identical functions/lambdas defined on different lines.
    */
    unsigned char *co_cell2arg; /* Maps cell vars which are arguments. */
    PyObject *co_filename;	/* unicode (where it was loaded from) */
    PyObject *co_name;		/* unicode (name, for reference) */
    int co_firstlineno;		/* first source line number */
    PyObject *co_lnotab;	/* string (encoding addr<->lineno mapping) See
				   Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;     /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist;   /* to support weakrefs to code objects */
} PyCodeObject;

The following is the role of each field in the code object:

  • First, you need to understand the concept of a code block. A code block is a piece of Python code that is executed as a unit. Common code blocks in Python include a function body, a class definition, and a module.

  • co_argcount is the number of parameters of a code block (not counting *args). It is only meaningful for function-body code blocks, because only functions take parameters. For example, pycdemo.py is a module rather than a function, so its value is 0.

  • co_code is a byte sequence that stores the actual Python bytecode executed by the virtual machine. This article does not analyze it in detail.

  • co_consts holds the constants used by the code block, mainly string and numeric constants such as "__main__" and 100 in the pycdemo.py example. The struct comment calls it a list, but at the Python level the value is a tuple.

  • co_filename is the name of the source file the code was loaded from.

  • co_firstlineno is the line number in the Python source file on which the code block starts. This field is very important when debugging.

  • co_flags identifies the characteristics of the code object. For example, 0x0080 indicates that the block is a coroutine and 0x0010 indicates that the code object is nested.

  • co_lnotab is used to compute the source line number that corresponds to each bytecode instruction.

  • co_varnames contains the names defined locally in the code object, including its parameters.

  • co_names is the counterpart of co_varnames: it contains names that are used in the code object but not defined locally, such as global variables and attribute names.

  • co_nlocals is the number of local variables used in the code object.

  • co_stacksize: because the Python virtual machine is a stack machine, this value is the maximum depth of the evaluation stack that the code block needs.

  • co_cellvars and co_freevars relate to nested functions and closures. Both are analyzed in the freevars & cellvars section of this article.
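The fields listed above can be inspected from Python itself. The following is a minimal sketch, assuming a recent CPython 3 interpreter. The source string, the file name demo.py and the function add are hypothetical and exist only for illustration. It compiles a small module and shows that the module and the function body are two separate code blocks, each with its own code object.

```python
import types

# Hypothetical module source, used only for this illustration.
SOURCE = '''
x = 100

def add(a, b=2, *, scale=1):
    total = (a + b) * scale
    return total + x
'''

# Compiling in "exec" mode yields the code object of the module-level block.
module_code = compile(SOURCE, "demo.py", "exec")

# The function body is a separate code block; its code object is stored
# among the constants of the enclosing module code object.
func_code = next(c for c in module_code.co_consts
                 if isinstance(c, types.CodeType))

print(module_code.co_name, module_code.co_argcount)
print(func_code.co_name, func_code.co_argcount, func_code.co_kwonlyargcount)
print(func_code.co_varnames)   # parameters first, then other locals
print(func_code.co_names)      # names used but not defined locally
print(func_code.co_filename, func_code.co_firstlineno)
```

The module code object reports zero arguments, while the function code object counts a and b as positional parameters and scale as a keyword-only parameter.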

CodeObject Detailed Analysis

Now we use some practical examples to analyze specific code objects.

import dis
import binascii
import types

d = 10


def test_co01(c):
    a = 1
    b = 2
    return a + b + c + d

In the previous article we mentioned that a function contains a code object. The code object of test_co01 (see co01 for the complete code) is printed as follows:

code
   argcount 1
   nlocals 3
   stacksize 2
   flags 0043 0x43
   code b'6401007d01006402007d02007c01007c0200177c0000177400001753'
  9           0 LOAD_CONST               1 (1)
              3 STORE_FAST               1 (a)

 10           6 LOAD_CONST               2 (2)
              9 STORE_FAST               2 (b)

 11          12 LOAD_FAST                1 (a)
             15 LOAD_FAST                2 (b)
             18 BINARY_ADD
             19 LOAD_FAST                0 (c)
             22 BINARY_ADD
             23 LOAD_GLOBAL              0 (d)
             26 BINARY_ADD
             27 RETURN_VALUE
   consts
      None
      1
      2
   names ('d',)
   varnames ('c', 'a', 'b')
   freevars ()
   cellvars ()
   filename '/tmp/pycharm_project_396/co01.py'
   name 'test_co01'
   firstlineno 8
   lnotab b'000106010601'
  • argcount is 1: the function test_co01 has exactly one parameter, c.

  • nlocals is 3: test_co01 has three local variables, a, b and c. The parameter c counts as a local variable.

  • names corresponds to co_names in the struct. As defined earlier, it lists names that are used but not defined in the function; here that is the global variable d.

  • varnames lists the locally defined variables, here c, a and b. Parameters come first, which is why c has index 0.

  • filename is the path of the Python source file.

  • firstlineno shows that the function definition starts on line 8 of the source file.
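The firstlineno and lnotab values in the output above are enough to rebuild the line numbers shown in the left column of the disassembly. The sketch below is a hand-written decoder for the old co_lnotab format used by Python 3.5, modeled on the logic of dis.findlinestarts from that era. The function name decode_lnotab is hypothetical, and the sketch assumes unsigned line increments. Newer CPython releases use a different line table format, so it is applied here to the literal bytes from the output rather than to a live code object.

```python
def decode_lnotab(lnotab, firstlineno):
    """Return (bytecode_offset, source_line) pairs for an old-style co_lnotab.

    The table is a sequence of (offset increment, line increment) byte pairs,
    counted from offset 0 and from the line before the first statement.
    """
    result = []
    last_line = None
    line = firstlineno
    offset = 0
    for offset_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if offset_incr:
            # A new run of bytecode starts: record the line it belongs to.
            if line != last_line:
                result.append((offset, line))
                last_line = line
            offset += offset_incr
        line += line_incr
    if line != last_line:
        result.append((offset, line))
    return result


# Values copied from the test_co01 output above.
pairs = decode_lnotab(bytes.fromhex("000106010601"), 8)
for offset, line in pairs:
    print("offset", offset, "-> line", line)
```

For test_co01 this yields offsets 0, 6 and 12 for lines 9, 10 and 11, which agrees with the disassembly.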

Flags field detailed analysis

We use the Python 3.5 source code for this analysis. CPython defines the flags as follows (Include/code.h):

/* Masks for co_flags above */
#define CO_OPTIMIZED	0x0001
#define CO_NEWLOCALS	0x0002
#define CO_VARARGS	0x0004
#define CO_VARKEYWORDS	0x0008
#define CO_NESTED       0x0010
#define CO_GENERATOR    0x0020
/* The CO_NOFREE flag is set if there are no free or cell variables.
   This information is redundant, but it allows a single flag test
   to determine whether there is any extra work to be done when the
   call frame it setup.
*/
#define CO_NOFREE       0x0040

/* The CO_COROUTINE flag is set for coroutine functions (defined with
   ``async def`` keywords) */
#define CO_COROUTINE            0x0080
#define CO_ITERABLE_COROUTINE   0x0100

To test a flag, apply the & operation to the flags field and the corresponding macro. A nonzero result means that the flag is set.

The macros have the following meanings:

  • CO_OPTIMIZED: the code object is optimized and uses fast storage for the local variables defined in the function.

  • CO_NEWLOCALS: when the code object is executed, a new dict object is created for the f_locals object in the stack frame.

  • CO_VARARGS: the code object accepts a variable number of positional arguments (*args).

  • CO_VARKEYWORDS: the code object accepts arbitrary keyword arguments (**kwargs).

  • CO_NESTED: the code object is a nested function.

  • CO_GENERATOR: the code object is a generator function.

  • CO_COROUTINE: the code object is a coroutine function defined with async def.

  • CO_ITERABLE_COROUTINE: the code object is an iterable (generator-based) coroutine function.

  • CO_NOFREE: there are no freevars and no cellvars, that is, no closure is involved.
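The flag test described above can be tried on live functions. This is a minimal sketch, assuming a recent CPython 3 interpreter. The functions plain, with_varargs, gen and coro and the helper has_flag are hypothetical names introduced for illustration. The mask values come from the standard inspect module, which exposes the same constants as the header.

```python
import inspect


def plain(a, b):
    return a + b


def with_varargs(*args, **kwargs):
    return args, kwargs


def gen():
    yield 1


async def coro():
    return 1


def has_flag(func, mask):
    """Return True if the function's code object has the given flag bit set."""
    return bool(func.__code__.co_flags & mask)


for func in (plain, with_varargs, gen, coro):
    print(func.__name__, hex(func.__code__.co_flags))

print(has_flag(with_varargs, inspect.CO_VARARGS))
print(has_flag(with_varargs, inspect.CO_VARKEYWORDS))
print(has_flag(gen, inspect.CO_GENERATOR))
print(has_flag(coro, inspect.CO_COROUTINE))
```

The exact hexadecimal values may differ between interpreter versions, which is why the sketch tests individual bits instead of comparing whole numbers.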

Now let's analyze the flags of the earlier function test_co01. The value is 0x43, which is 0x0040 | 0x0002 | 0x0001, so the function has three characteristics: CO_NOFREE, CO_NEWLOCALS and CO_OPTIMIZED.
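The same decomposition can be done mechanically. The sketch below copies the mask values from the Python 3.5 header quoted above into a table and lists the names of all bits set in a given flags value. The names CO_FLAGS and decode_flags are hypothetical, and the table only reflects that one header version.

```python
# Mask values copied from the Python 3.5 Include/code.h excerpt above.
CO_FLAGS = {
    0x0001: "CO_OPTIMIZED",
    0x0002: "CO_NEWLOCALS",
    0x0004: "CO_VARARGS",
    0x0008: "CO_VARKEYWORDS",
    0x0010: "CO_NESTED",
    0x0020: "CO_GENERATOR",
    0x0040: "CO_NOFREE",
    0x0080: "CO_COROUTINE",
    0x0100: "CO_ITERABLE_COROUTINE",
}


def decode_flags(flags):
    """Return the names of all flag bits set in flags, lowest bit first."""
    return [name for mask, name in CO_FLAGS.items() if flags & mask]


print(decode_flags(0x43))
```

For 0x43 this lists CO_OPTIMIZED, CO_NEWLOCALS and CO_NOFREE.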

freevars & cellvars

We use the following function to analyze these two fields:

def test_co02():
    a = 1
    b = 2

    def g():
        return a + b
    return a + b + g()

The information for the function above is as follows (see co02 for the complete code):

code
   argcount 0
   nlocals 1
   stacksize 3
   flags 0003 0x3
   code
      b'640100890000640200890100870000870100660200640300640400860000'
      b'7d0000880000880100177c00008300001753'
 15           0 LOAD_CONST               1 (1)
              3 STORE_DEREF              0 (a)

 16           6 LOAD_CONST               2 (2)
              9 STORE_DEREF              1 (b)

 18          12 LOAD_CLOSURE             0 (a)
             15 LOAD_CLOSURE             1 (b)
             18 BUILD_TUPLE              2
             21 LOAD_CONST               3 (<code object g at 0x7f133ff496f0, file "/tmp/pycharm_project_396/co01.py", line 18>)
             24 LOAD_CONST               4 ('test_co02.<locals>.g')
             27 MAKE_CLOSURE             0
             30 STORE_FAST               0 (g)

 20          33 LOAD_DEREF               0 (a)
             36 LOAD_DEREF               1 (b)
             39 BINARY_ADD
             40 LOAD_FAST                0 (g)
             43 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             46 BINARY_ADD
             47 RETURN_VALUE
   consts
      None
      1
      2
      code
         argcount 0
         nlocals 0
         stacksize 2
         flags 0013 0x13
         code b'8800008801001753'
 19           0 LOAD_DEREF               0 (a)
              3 LOAD_DEREF               1 (b)
              6 BINARY_ADD
              7 RETURN_VALUE
         consts
            None
         names ()
         varnames ()
         freevars ('a', 'b')
         cellvars ()
         filename '/tmp/pycharm_project_396/co01.py'
         name 'g'
         firstlineno 18
         lnotab b'0001'
      'test_co02.<locals>.g'
   names ()
   varnames ('g',)
   freevars ()
   cellvars ('a', 'b')
   filename '/tmp/pycharm_project_396/co01.py'
   name 'test_co02'
   firstlineno 14
   lnotab b'0001060106021502'

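From the output above we can see that the cellvars of test_co02 are ('a', 'b') and the freevars of g are ('a', 'b'). cellvars are locally defined variables that are used by other (nested) functions; freevars are variables that are used locally but defined by another (enclosing) function.

Now look at the flags of test_co02. They equal 0x3: because a closure is involved, CO_NOFREE is not set, so the value 0x0040 is missing.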
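The relationship between the two fields can be checked directly on a live interpreter. This is a minimal sketch, assuming a recent CPython 3 release. It reuses test_co02 from above and looks up the code object of g among the constants of the outer code object; the variable names outer and inner are introduced only for illustration.

```python
import types


def test_co02():
    a = 1
    b = 2

    def g():
        return a + b
    return a + b + g()


outer = test_co02.__code__
# The nested function g is compiled into its own code object, which is
# stored as a constant of the enclosing function's code object.
inner = next(c for c in outer.co_consts if isinstance(c, types.CodeType))

print("outer cellvars:", outer.co_cellvars, "freevars:", outer.co_freevars)
print("inner cellvars:", inner.co_cellvars, "freevars:", inner.co_freevars)
```

The names a and b appear as cellvars of the outer function and as freevars of g, matching the printed dump.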

stacksize

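This field stores the maximum amount of stack space the function needs while the virtual machine executes it. It is also an optimization: because the maximum is known in advance, space of that size can be allocated once when the function starts executing, and the stack never has to grow during execution.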

def test_stack():
    a = 1
    b = 2
    return a + b

The bytecode and related information for the code above are as follows:

code
   argcount 0
   nlocals 2
   stacksize 2
   flags 0043 0x43
   code b'6401007d00006402007d01007c00007c01001753'
   #            bytecode instruction     # instruction argument # value the argument refers to
 24           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (a)

 25           6 LOAD_CONST               2 (2)
              9 STORE_FAST               1 (b)

 26          12 LOAD_FAST                0 (a)
             15 LOAD_FAST                1 (b)
             18 BINARY_ADD
             19 RETURN_VALUE
   consts
      None # constant at index 0
      1    # constant at index 1
      2    # constant at index 2
   names ()
   varnames ('a', 'b')
   freevars ()
   cellvars ()

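Now let us simulate the execution. Before doing so, here is what each of the bytecode instructions above does, together with the state of the stack after each step:

LOAD_CONST pushes the object at index i of the constant table onto the stack. In the code above the argument of LOAD_CONST is i = 1, so the constant loaded is 1. The stack now holds one element: 1.

STORE_FAST pops the top of the stack and stores it in the local variable at the corresponding index of co_varnames. The argument here is 0, so 1 is stored in the variable named by co_varnames[0], which is a. The stack is empty again.

LOAD_CONST pushes the constant at index 2 onto the stack. The stack now holds one element: 2.

STORE_FAST pops the top of the stack and stores it in the variable at index 1 of varnames, which is b. The stack is empty again.

LOAD_FAST fetches the value of the variable at the corresponding index of co_varnames and pushes it onto the stack. After the two consecutive LOAD_FAST instructions the stack holds two elements: 1 at the bottom and 2 on top.

BINARY_ADD pops the two topmost elements of the stack, adds them, and pushes the result back onto the stack. The stack now holds one element: 3.

RETURN_VALUE pops the top of the stack and returns it as the return value.

Over the whole execution the stack never holds more than 2 elements, so stacksize = 2.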
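The walkthrough above can be replayed in code. The sketch below is a toy interpreter, not CPython's evaluation loop. It decodes the Python 3.5 byte string of test_stack shown earlier and executes the five instructions while recording the deepest stack it sees. The opcode numbers are read off that hex dump (0x64, 0x7d, 0x7c, 0x17, 0x53), and the pre-3.6 layout of one opcode byte plus a two-byte little-endian argument for opcodes of 90 and above is assumed. The names decode and run are hypothetical.

```python
# Opcode numbers as they appear in the Python 3.5 hex dump above.
OPNAMES = {100: "LOAD_CONST", 125: "STORE_FAST", 124: "LOAD_FAST",
           23: "BINARY_ADD", 83: "RETURN_VALUE"}
HAVE_ARGUMENT = 90  # assumed pre-3.6 rule: opcodes >= 90 carry a 2-byte argument


def decode(code):
    """Yield (instruction name, argument or None) from pre-3.6 bytecode."""
    i = 0
    while i < len(code):
        op = code[i]
        if op >= HAVE_ARGUMENT:
            yield OPNAMES[op], code[i + 1] | (code[i + 2] << 8)
            i += 3
        else:
            yield OPNAMES[op], None
            i += 1


def run(code, consts, nlocals):
    """Execute the bytecode; return (return value, maximum stack depth)."""
    stack = []
    fast_locals = [None] * nlocals  # one slot per entry of co_varnames
    max_depth = 0
    for name, arg in decode(code):
        if name == "LOAD_CONST":
            stack.append(consts[arg])
        elif name == "STORE_FAST":
            fast_locals[arg] = stack.pop()
        elif name == "LOAD_FAST":
            stack.append(fast_locals[arg])
        elif name == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif name == "RETURN_VALUE":
            return stack.pop(), max_depth
        max_depth = max(max_depth, len(stack))
    raise ValueError("bytecode ended without RETURN_VALUE")


code = bytes.fromhex("6401007d00006402007d01007c00007c01001753")
result, depth = run(code, consts=(None, 1, 2), nlocals=2)
print("return value:", result, "max stack depth:", depth)
```

The simulation returns 3 with a maximum depth of 2, the same value as the stacksize field in the dump.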

source:yisu.com