python - How to name the IP extracted through regular expressions-PHP Chinese Network Q&A

python - How to name the IP extracted through regular expressions

仅有的幸福 2017-05-18 11:00:19

    import re

    source_ip = line.split('- -')[0].strip()
    # Dots are escaped to match a literal '.'; all four octets are checked
    if re.match(r'[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$', source_ip):
        # Start at 0 for an unseen IP, then add one hit
        source_ip_dict[source_ip] = source_ip_dict.get(source_ip, 0) + 1

The code above extracts the client IP from each Apache log line and counts how many times each distinct IP appears.
The result is a list of IP addresses with their hit counts.

So how can I name and classify these IP addresses?
For example, 202.108.11.103 and 220.181.32.137 are both Baidu Spider IPs.
The result I want is for both IPs to be labeled Baidu Spider and their counts added together, that is 4336 + 3411:
Baidu Spider 7747

How can I do this?

All replies (4)

仅有的幸福2017-05-18 11:02:19 4 floor

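    from itertools import groupby

    NAME_IP_MAPPING = {
        '202.108.11.103': 'Baidu Spider',
        '220.181.32.137': 'Baidu Spider',
    }
    spiders = [
        {'ip': '202.108.11.103', 'count': 123},
        {'ip': '220.181.32.137', 'count': 345},
    ]

    def name_of(spider):
        # Look up the name for an IP; unmapped IPs go into 'other'
        return NAME_IP_MAPPING.get(spider['ip'], 'other')

    # Map each IP to its name, group the items in spiders by that name,
    # then sum each group into a new dict. groupby only merges adjacent
    # items, so the list is sorted by name first.
    totals = {k: sum(s['count'] for s in g)
              for k, g in groupby(sorted(spiders, key=name_of), key=name_of)}
    print(totals)  # output: {'Baidu Spider': 468}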
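One thing worth checking with this approach: itertools.groupby only groups consecutive items that share a key, so the input generally has to be sorted by that key first. The sketch below illustrates the difference. The sample counts and the extra 192.168.1.1 entry are made up for illustration, and the helper names (name_of, totals_by_name) are hypothetical.

```python
from itertools import groupby

# Hypothetical sample data: the two Baidu IPs are separated by an unrelated one.
NAME_BY_IP = {
    '202.108.11.103': 'Baidu Spider',
    '220.181.32.137': 'Baidu Spider',
}
hits = [
    {'ip': '202.108.11.103', 'count': 4336},
    {'ip': '192.168.1.1', 'count': 10},
    {'ip': '220.181.32.137', 'count': 3411},
]


def name_of(hit):
    # IPs missing from the mapping fall into a catch-all 'other' group.
    return NAME_BY_IP.get(hit['ip'], 'other')


def totals_by_name(items):
    # Sorting by the grouping key makes equal names adjacent for groupby.
    ordered = sorted(items, key=name_of)
    return {name: sum(h['count'] for h in group)
            for name, group in groupby(ordered, key=name_of)}


# Without sorting, the second 'Baidu Spider' run overwrites the first one.
unsorted_totals = {name: sum(h['count'] for h in group)
                   for name, group in groupby(hits, key=name_of)}
sorted_totals = totals_by_name(hits)

print(unsorted_totals)
print(sorted_totals)
```

With the unsorted input, the dict comprehension sees 'Baidu Spider' twice and keeps only the later group, which is why the sorted version is the safer default.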

黄舟2017-05-18 11:02:19 3 floor

You can try building a large dictionary with the IP as the key and the crawler name as the value:

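    ip_map = {
        '202.108.11.103': 'baidu-spider',
        '220.181.32.137': 'baidu-spider',
        '192.168.1.1': 'other',
        # ....
    }

    totals = {}
    # source_ip_dict maps each IP to its hit count (built by the code in the question)
    for ip, count in source_ip_dict.items():
        # IPs missing from ip_map are counted under 'other'
        name = ip_map.get(ip, 'other')
        totals[name] = totals.get(name, 0) + count
    print(totals)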
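The same idea can also be applied while reading the log, so that hits are counted per name directly. The following is a minimal sketch, assuming Apache common log format lines like the ones the question splits on '- -'. The sample log lines, the count_by_name helper, and the 'other' default are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical log lines in Apache common log format.
LOG_LINES = [
    '202.108.11.103 - - [18/May/2017:11:00:19 +0800] "GET / HTTP/1.1" 200 512',
    '220.181.32.137 - - [18/May/2017:11:00:20 +0800] "GET /a HTTP/1.1" 200 128',
    '202.108.11.103 - - [18/May/2017:11:00:21 +0800] "GET /b HTTP/1.1" 200 256',
    '10.0.0.7 - - [18/May/2017:11:00:22 +0800] "GET /c HTTP/1.1" 404 0',
]

# IP -> crawler name, as suggested in the reply above.
IP_NAMES = {
    '202.108.11.103': 'Baidu Spider',
    '220.181.32.137': 'Baidu Spider',
}

# Four dot-separated groups of 1 to 3 digits (a loose IPv4 shape check).
IPV4 = re.compile(r'\d{1,3}(?:\.\d{1,3}){3}$')


def count_by_name(lines, ip_names, default='other'):
    """Count hits per crawler name; unmapped IPs are counted under `default`."""
    totals = Counter()
    for line in lines:
        ip = line.split('- -')[0].strip()
        if IPV4.match(ip):
            totals[ip_names.get(ip, default)] += 1
    return totals


totals = count_by_name(LOG_LINES, IP_NAMES)
print(totals.most_common())
```

Counter removes the need for the "is this key already present" check, and most_common() returns the names ordered by hit count.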

滿天的星座2017-05-18 11:02:19 2 floor

Use a pandas pivot table.
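A rough sketch of what that could look like, assuming pandas is installed and the per-IP counts are already in a dict like the question's source_ip_dict. The 192.168.1.1 entry, the ip_names mapping, and the column names are made up for illustration.

```python
import pandas as pd

# Hypothetical per-IP counts, shaped like the question's source_ip_dict.
source_ip_dict = {
    '202.108.11.103': 4336,
    '220.181.32.137': 3411,
    '192.168.1.1': 12,
}
# IP -> crawler name (an assumed mapping, maintained by hand).
ip_names = {
    '202.108.11.103': 'Baidu Spider',
    '220.181.32.137': 'Baidu Spider',
}

df = pd.DataFrame(list(source_ip_dict.items()), columns=['ip', 'count'])
# Label each row; IPs without a known name become 'other'.
df['name'] = df['ip'].map(ip_names).fillna('other')

# One row per name, with the counts of all its IPs summed.
table = df.pivot_table(index='name', values='count', aggfunc='sum')
print(table)
```

For a single grouping column, df.groupby('name')['count'].sum() gives the same totals; the pivot table becomes more useful once a second dimension, such as the date, is added.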

阿神2017-05-18 11:02:19 1 floor

That is a lot of work!
Why not create a separate table for the IP groups, named IPGroup (id, ip, groupName):

id	ip	groupName
1	202.108.11.103	Baidu Spider
2	220.181.32.137	Baidu Spider

After that, a single SQL query does the job, which is much easier (assuming the poster's statistics table is named IPStastics):

    SELECT b.groupName, SUM(a.count)
    FROM IPStastics a
    INNER JOIN IPGroup b ON a.ip = b.ip
    GROUP BY b.groupName
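To try the query from Python without setting up a database server, an in-memory SQLite database is enough. This is only a sketch: the column layout of IPStastics (ip, count) is an assumption inferred from the query, and the 192.168.1.1 row is made up to show that the INNER JOIN drops IPs that have no group.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE IPGroup (id INTEGER PRIMARY KEY, ip TEXT, groupName TEXT);
    -- Assumed layout: one row per IP with its hit count.
    CREATE TABLE IPStastics (ip TEXT, count INTEGER);
""")
conn.executemany(
    'INSERT INTO IPGroup (ip, groupName) VALUES (?, ?)',
    [('202.108.11.103', 'Baidu Spider'), ('220.181.32.137', 'Baidu Spider')],
)
conn.executemany(
    'INSERT INTO IPStastics (ip, count) VALUES (?, ?)',
    [('202.108.11.103', 4336), ('220.181.32.137', 3411), ('192.168.1.1', 12)],
)

# The query from the reply above; ungrouped IPs are excluded by the INNER JOIN.
rows = conn.execute("""
    SELECT b.groupName, SUM(a.count)
    FROM IPStastics a
    INNER JOIN IPGroup b ON a.ip = b.ip
    GROUP BY b.groupName
""").fetchall()
conn.close()

print(rows)
```

If ungrouped IPs should also appear in the result, a LEFT JOIN from IPStastics with COALESCE(b.groupName, 'other') would be one way to keep them.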
