How do I deduplicate with Redis?
过去多啦不再A梦 2017-04-25 09:02:08

I am crawling data from a few fixed websites.
To deduplicate URLs, should I store them as strings or in a set?

The number of URLs to store will initially be roughly 100k to 1,000k.


All replies (3)
世界只因有你

Use a Redis set.

    巴扎黑

    Use a set. Set members are unique, which is exactly what URL deduplication needs.
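
    The set approach can be sketched as follows. This is only an illustration under some assumptions: the key name `crawler:seen_urls` and the helper `markIfNew` are made up for the example, and `FakeRedisSet` is a tiny in-memory stand-in so the snippet runs without a server. With phpredis you would pass a connected `Redis` object instead, whose `sAdd` returns the number of newly added members.

```php
<?php
// In-memory stand-in for the single set method used below (assumption: it
// mimics SADD for one member). Swap in a connected \Redis object in practice.
final class FakeRedisSet
{
    private array $sets = [];

    // Returns 1 if the member was newly added, 0 if it was already present.
    public function sAdd(string $key, string $member): int
    {
        if (isset($this->sets[$key][$member])) {
            return 0;
        }
        $this->sets[$key][$member] = true;
        return 1;
    }
}

// Hypothetical helper: true only the first time a URL is seen.
function markIfNew(object $redis, string $url, string $key = 'crawler:seen_urls'): bool
{
    // SADD both checks membership and inserts in one atomic command,
    // so no separate "exists?" round trip is needed.
    return $redis->sAdd($key, md5($url)) === 1;
}

$redis = new FakeRedisSet();
var_dump(markIfNew($redis, 'http://example.com/a')); // bool(true): first sighting
var_dump(markIfNew($redis, 'http://example.com/a')); // bool(false): duplicate
```

    Hashing each URL with `md5` keeps every member at a fixed 32 characters, which may help keep memory predictable at the 100k to 1,000k scale mentioned in the question.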

      PHPzhong
      $key = 'URL_HASH';
      if (!$redis->hGet($key, md5($url))) {
          // do something ...
          // after fetching $url, record it as seen
          $redis->hSet($key, md5($url), true);
      }

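      Note that if several threads or processes crawl at the same time, the check and the write above are two separate steps, so another worker may already be handling the same URL. In that case, store a status value (an enumeration such as "in progress" or "done") instead of a bool.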
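
      One way to sketch the status-value idea for concurrent workers is to claim a URL atomically with `HSETNX` and mark it done afterwards. The status strings, the helpers `claimUrl` and `finishUrl`, and the in-memory `FakeRedisHash` stand-in are assumptions made for this example; with phpredis, `hSetNx` and `hSet` on a connected `Redis` object would take their place.

```php
<?php
// In-memory stand-in for the hash methods used below, so the sketch runs
// without a server. Replace with a connected \Redis object in practice.
final class FakeRedisHash
{
    private array $hashes = [];

    // Mimics HSETNX: writes only if the field is absent; returns whether it wrote.
    public function hSetNx(string $key, string $field, string $value): bool
    {
        if (isset($this->hashes[$key][$field])) {
            return false;
        }
        $this->hashes[$key][$field] = $value;
        return true;
    }

    // Mimics HSET: always writes; returns 1 for a new field, 0 for an update.
    public function hSet(string $key, string $field, string $value): int
    {
        $isNew = !isset($this->hashes[$key][$field]);
        $this->hashes[$key][$field] = $value;
        return $isNew ? 1 : 0;
    }

    // Mimics HGET: returns the value, or false when the field is missing.
    public function hGet(string $key, string $field)
    {
        return $this->hashes[$key][$field] ?? false;
    }
}

// Hypothetical status values replacing the bool.
const STATUS_PENDING = 'pending';
const STATUS_DONE = 'done';

// Returns true for exactly one caller per URL; everyone else gets false.
function claimUrl(object $redis, string $url, string $key = 'URL_HASH'): bool
{
    return $redis->hSetNx($key, md5($url), STATUS_PENDING);
}

// Records that the URL has been fetched.
function finishUrl(object $redis, string $url, string $key = 'URL_HASH'): void
{
    $redis->hSet($key, md5($url), STATUS_DONE);
}

$redis = new FakeRedisHash();
$url = 'http://example.com/a';
if (claimUrl($redis, $url)) {
    // fetch and process $url here ...
    finishUrl($redis, $url);
}
var_dump(claimUrl($redis, $url)); // bool(false): already claimed
```

      A worker that crashes between the claim and the finish leaves the URL stuck in the pending state, so a real crawler would probably also need a timeout or a cleanup pass; that part is not covered here.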
