Python Forum post - Zeuux

Python Forum - Discussion board


Subject: [python-chinese] Does anyone have design ideas for building a web crawler bot?


徐继哲

Original poster, Thursday, 2006-02-09 16:51

Steve Chu devforum at gmail.com
Thu Feb 9 16:51:24 HKT 2006

I'm a bit lost on this one, so I'd like to hear everyone's thoughts :-)

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Reply 0, Thursday, 2006-02-09 17:20

清风 paradise.qingfeng at gmail.com
Thu Feb 9 17:20:19 HKT 2006

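Put simply, you follow the links and pull the site's pages down as you go. If you need to extract specific content, read up on regular expressions.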
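
To make that idea concrete, here is a minimal sketch of a link-following crawler with regex-based extraction. It is not code from this thread: the names `crawl`, `extract`, `fetch_page` and `HREF_RE`, the `max_pages` limit, and the choice to stay on the starting host are all assumptions made for illustration. The `fetch` argument can be swapped out, so the logic can be tried without touching the network.

```python
import re
from collections import deque
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen

# Naive pattern for quoted href values. Good enough for a sketch,
# but it will miss unquoted attributes and other unusual markup.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def fetch_page(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=50):
    """Breadth-first crawl that stays on the start URL's host."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        pages[url] = html
        for href in HREF_RE.findall(html):
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def extract(pattern, pages):
    """Apply a regular expression to every fetched page."""
    return {url: re.findall(pattern, html) for url, html in pages.items()}
```

For example, `extract(r"<title>(.*?)</title>", crawl("http://example.test/"))` would collect page titles from a (hypothetical) site. A real crawler would also want to respect robots.txt and throttle its requests.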

On 2/9/06, Steve Chu <devforum at gmail.com> wrote:
> I'm a bit lost on this one, so I'd like to hear everyone's thoughts :-)


--
Blog:http://qingfeng.ushared.com/blog/


李迎辉

Reply 0, Thursday, 2006-02-09 17:22

limodou limodou at gmail.com
Thu Feb 9 17:22:23 HKT 2006

On 2/9/06, Steve Chu <devforum at gmail.com> wrote:
> I'm a bit lost on this one, so I'd like to hear everyone's thoughts :-)
>

I wrote a crawler called Crawl a while back. It is available from http://pyrecord.freezope.org/download/crawl.zip/down

It mainly uses htmllib to parse the pages and multiple threads to do the fetching.
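
The parse-then-fetch-in-threads approach can be sketched roughly as below. This is not the Crawl code from the zip above, which is not reproduced in this thread. Since htmllib was removed in Python 3, the sketch substitutes the standard `html.parser.HTMLParser`; `LinkParser`, `parse_links`, `fetch_page` and `threaded_crawl`, along with the `workers` and `max_pages` parameters, are hypothetical names chosen for illustration.

```python
import queue
import threading
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag (stand-in for htmllib)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def parse_links(base_url, html):
    """Return the absolute URLs of all links found in html."""
    parser = LinkParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def fetch_page(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def threaded_crawl(start_url, fetch=fetch_page, workers=4, max_pages=50):
    """Fetch pages under start_url using a pool of worker threads."""
    todo = queue.Queue()
    todo.put(start_url)
    seen = {start_url}
    pages = {}
    lock = threading.Lock()  # guards seen and pages

    def worker():
        while True:
            url = todo.get()
            if url is None:  # sentinel: no more work
                todo.task_done()
                return
            try:
                html = fetch(url)
                links = parse_links(url, html)
                with lock:
                    pages[url] = html
                    for link in links:
                        if (link.startswith(start_url) and link not in seen
                                and len(seen) < max_pages):
                            seen.add(link)
                            todo.put(link)
            except Exception:
                pass  # a failed download should not kill the worker
            finally:
                todo.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(workers)]
    for t in threads:
        t.start()
    todo.join()  # wait until every queued URL has been processed
    for _ in threads:
        todo.put(None)
    for t in threads:
        t.join()
    return pages
```

New links are queued before the current URL is marked done, so `todo.join()` cannot return while work is still pending. Threads suit this job because the workers spend most of their time waiting on the network.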

--
I like python!
My Blog: http://www.donews.net/limodou
NewEdit Maillist: http://groups.google.com/group/NewEdit

