线程 - Python的Gensim模块实现一个进程占用300%的CPU,怎样做到的?
高洛峰
高洛峰 2017-04-17 17:48:24
0
1
509

使用gensim模块建立word2vec模型,代码:


from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

f = open('wiki.zh.txt.work_cutted', 'r')
work2vec = Word2Vec(LineSentence(f), min_count=5)
return work2vec

看到了Python进程占用了300%的进程:

因为有GIL存在,一个Python进程最多跑到100%的CPU,而看到gensim库可以跑到300%。请问是怎么做到的?

机器的CPU信息:/proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 4
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 5
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 6
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU           E5506  @ 2.13GHz
stepping        : 5
cpu MHz         : 2128.076
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm rep_good aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm dts
bogomips        : 4256.15
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
高洛峰
高洛峰

拥有18年软件开发和IT教学经验。曾任多家上市公司技术总监、架构师、项目经理、高级软件工程师等职务。 网络人气名人讲师,...

reply all(1)
刘奇

gensim relies on Numpy and pandas at the bottom level, and the latter two are not pure python, but mixed with C++, fortan, etc. Therefore, in order to speed up the calculation of gensim, the underlying implementation should be the newly created Posix Thread sub-thread and run at full load.

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template
About us Disclaimer Sitemap
php.cn:Public welfare online PHP training,Help PHP learners grow quickly!