Home >Backend Development >Python Tutorial >Detailed explanation of the implementation principle of Python probes
This article will briefly describe the implementation principle of Python probes. At the same time, in order to verify this principle, we will also implement a simple probe program that counts the execution time of a specified function.
The implementation of the probe mainly involves the following knowledge points:
sys.meta_path
sitecustomize.py
sys.meta_path
sys.meta_path This is simple In other words, the function of import hook can be realized.
When import-related operations are performed, the objects defined in the sys.meta_path list will be triggered.
For more detailed information about sys.meta_path, please refer to the sys.meta_path related content in the python documentation and
PEP 0302.
The objects in sys.meta_path need to implement a find_module method,
This find_module method returns None or an object that implements the load_module method
(The code can be downloaded from github part1):
import sys class MetaPathFinder: def find_module(self, fullname, path=None): print('find_module {}'.format(fullname)) return MetaPathLoader() class MetaPathLoader: def load_module(self, fullname): print('load_module {}'.format(fullname)) sys.modules[fullname] = sys return sys sys.meta_path.insert(0, MetaPathFinder()) if __name__ == '__main__': import http print(http) print(http.version_info)
The load_module method returns a module object, which is the module object of the import.
For example, I replaced http with the sys module as I did above.
$ python meta_path1.py
find_module http
load_module http
sys.version_info(major=3, minor=5, micro=1, releaselevel='final', serial =0)
Through sys.meta_path we can realize the function of import hook:
When importing the scheduled module, the object in this module will be replaced with a civet cat,
so as to obtain the function or method Execution time and other detection information.
The above mentioned the civet cat for the prince, so how to perform the operation of civet cat for the prince on an object?
For function objects, we can use decorators to replace function objects (the code can be downloaded from github part2):
import functools import time def func_wrapper(func): @functools.wraps(func) def wrapper(*args, **kwargs): print('start func') start = time.time() result = func(*args, **kwargs) end = time.time() print('spent {}s'.format(end - start)) return result return wrapper def sleep(n): time.sleep(n) return n if __name__ == '__main__': func = func_wrapper(sleep) print(func(3))
Execution results:
$ python func_wrapper.py start func spent 3.004966974258423s 3
Let's implement a function to calculate the execution time of a specified function of a specified module (the code can be downloaded from github part3).
Suppose our module file is hello.py:
import time def sleep(n): time.sleep(n) return n
Our import hook is hook.py:
import functools import importlib import sys import time _hook_modules = {'hello'} class MetaPathFinder: def find_module(self, fullname, path=None): print('find_module {}'.format(fullname)) if fullname in _hook_modules: return MetaPathLoader() class MetaPathLoader: def load_module(self, fullname): print('load_module {}'.format(fullname)) # ``sys.modules`` 中保存的是已经导入过的 module if fullname in sys.modules: return sys.modules[fullname] # 先从 sys.meta_path 中删除自定义的 finder # 防止下面执行 import_module 的时候再次触发此 finder # 从而出现递归调用的问题 finder = sys.meta_path.pop(0) # 导入 module module = importlib.import_module(fullname) module_hook(fullname, module) sys.meta_path.insert(0, finder) return module sys.meta_path.insert(0, MetaPathFinder()) def module_hook(fullname, module): if fullname == 'hello': module.sleep = func_wrapper(module.sleep) def func_wrapper(func): @functools.wraps(func) def wrapper(*args, **kwargs): print('start func') start = time.time() result = func(*args, **kwargs) end = time.time() print('spent {}s'.format(end - start)) return result return wrapper
Test code:
>>> import hook >>> import hello find_module hello load_module hello >>> >>> hello.sleep(3) start func spent 3.0029919147491455s 3 >>>
In fact, the above code has realized the basic functions of the probe. However, there is a problem that the above code needs to execute the import hook operation to register the hook we defined.
The answer is that this function can be achieved by defining sitecustomize.py.
To put it simply, when the python interpreter is initialized, it will automatically import the sitecustomize and usercustomize modules that exist under PYTHONPATH:
.
├── sitecustomize.py
└── usercustomize.py
sitecustomize.py:
print('this is sitecustomize')
usercustomize.py:
print('this is usercustomize')
Change the current directory Add it to PYTHONPATH, and then see the effect:
$ export PYTHONPATH=. $ python this is sitecustomize <---- this is usercustomize <---- Python 3.5.1 (default, Dec 24 2015, 17:20:27) [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>>You can see that it is indeed imported automatically. So we can change the previous detection program to support automatic execution of import hook (the code can be downloaded from github part5). Directory structure:$ tree
.
├── hello.py
├── hook.py
├── sitecustomize.py
sitecustomize.py:
$ cat sitecustomize.py import hookResult:
$ export PYTHONPATH=. $ python find_module usercustomize Python 3.5.1 (default, Dec 24 2015, 17:20:27) [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin Type "help", "copyright", "credits" or "license" for more information. find_module readline find_module atexit find_module rlcompleter >>> >>> import hello find_module hello load_module hello >>> >>> hello.sleep(3) start func spent 3.005002021789551s 3But the above detection program In fact, there is another problem, that is, PYTHONPATH needs to be modified manually. Friends who have used probe programs will remember that to use probes such as newrelic, you only need to execute one command: newrelic-admin run-program python hello.py In fact, the operation of modifying PYTHONPATH is in the newrelic-admin program. completed in. Next we will also implement a similar command line program, let's call it agent.py. agent
is still modified based on the previous program. First adjust a directory structure and put the hook operation in a separate directory so that there will be no other interference after setting PYTHONPATH (the code can be downloaded from github part6).
$ mkdir bootstrap $ mv hook.py bootstrap/_hook.py $ touch bootstrap/__init__.py $ touch agent.py $ tree . ├── bootstrap │ ├── __init__.py │ ├── _hook.py │ └── sitecustomize.py ├── hello.py ├── test.py ├── agent.pyThe content of bootstrap/sitecustomize.py is modified to:$ cat bootstrap/sitecustomize.py
import _hook
The content of agent.py is as follows:
<span class="kn">import</span> <span class="nn">os</span> <span class="kn">import</span> <span class="nn">sys</span> <span class="n">current_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="n">__file__</span><span class="p">))</span> <span class="n">boot_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">current_dir</span><span class="p">,</span> <span class="s">'bootstrap'</span><span class="p">)</span> <span class="k">def</span> <span class="nf">main</span><span class="p">():</span> <span class="n">args</span> <span class="o">=</span> <span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">:]</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s">'PYTHONPATH'</span><span class="p">]</span> <span class="o">=</span> <span class="n">boot_dir</span> <span class="c"># 执行后面的 python 程序命令</span> <span class="c"># sys.executable 是 python 解释器程序的绝对路径 ``which python``</span> <span class="c"># >>> sys.executable</span> <span class="c"># '/usr/local/var/pyenv/versions/3.5.1/bin/python3.5'</span> <span class="n">os</span><span class="o">.</span><span class="n">execl</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">executable</span><span class="p">,</span> <span class="n">sys</span><span class="o">.</span><span class="n">executable</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span> <span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span> <span class="n">main</span><span class="p">()</span>The content of test.py is:
$ cat test.py import sys import hello print(sys.argv) print(hello.sleep(3))Usage:
$ python agent.py test.py arg1 arg2 find_module usercustomize find_module hello load_module hello ['test.py', 'arg1', 'arg2'] start func spent 3.005035161972046s 3At this point, we have implemented a simple python probe program. Of course, there is definitely a big gap compared with the actual probe program. This article mainly explains the implementation principle behind the probe. If you are interested in the specific implementation of commercial probe programs, you can take a look at the source code of commercial python probes from foreign New Relic or domestic OneAPM, TingYun and other APM manufacturers. I believe you will find Some very interesting things.
For more detailed explanations of the implementation principles of Python probes, please pay attention to the PHP Chinese website!