Python Forum - Discussion Board

Subject: [python-chinese] Extracting a page

Original poster, Sunday, November 12, 2006, 11:53

wxx wangxinxi at cs.hit.edu.cn
Sun Nov 12 11:53:51 HKT 2006

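A fairly basic question:

import urllib
page = urllib.urlopen('http://acm.hit.edu.cn/ojs/authorstatus.php?Author=AndyWang&Contestid=0')
page.read()

Why can't these three lines fetch the page correctly? The server responds that access is forbidden, yet the page opens normally in a browser.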

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Huang Yi (黄毅)

Reply 0, Sunday, November 12, 2006, 12:03

yi huang yi.codeplayer at gmail.com
Sun Nov 12 12:03:03 HKT 2006

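It works fine here; everything is normal.
At first I thought the HTML I fetched was incomplete, but when I opened the URL in a browser it turned out that the page's HTML is incomplete to begin with!! = =""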


-- 
http://codeplayer.blogspot.com/

Zhou Qi (周琦)

Reply 0, Sunday, November 12, 2006, 12:06

Zoom.Quiet zoom.quiet at gmail.com
Sun Nov 12 12:06:34 HKT 2006

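On 11/12/06, wxx <wangxinxi at cs.hit.edu.cn> wrote:
> A fairly basic question:
>
> import urllib
> page = urllib.urlopen('http://acm.hit.edu.cn/ojs/authorstatus.php?Author=AndyWang&Contestid=0')
> page.read()
>
> Why can't these three lines fetch the page correctly? The server responds
> that access is forbidden, yet the page opens normally in a browser.

It's an authentication issue. The experts at Woodpecker answered this long ago:
http://wiki.woodpecker.org.cn/moin/PythonClientCookie

Heh heh heh...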
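
For readers on Python 3, here is a minimal sketch of the cookie-based approach that the linked page points toward. It uses the standard-library http.cookiejar and urllib.request modules rather than the ClientCookie module named in the link. The thread never confirms why the server refused the request, so both the cookie handling and the browser-like User-Agent header are assumptions about what the server checks. The names build_opener, fetch, and BROWSER_USER_AGENT are hypothetical helpers made up for this example.

```python
import http.cookiejar
import urllib.request

# Hypothetical header value; any browser-like string could be substituted.
BROWSER_USER_AGENT = "Mozilla/5.0 (compatible; example-fetcher)"


def build_opener(user_agent=BROWSER_USER_AGENT):
    """Return (opener, cookie_jar): an opener that stores cookies between
    requests and sends a custom User-Agent header."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # Replace the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, cookie_jar


def fetch(url, opener, timeout=10):
    """Fetch url through the opener and return the raw response bytes."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()


# Example usage (needs network access, so it is left commented out):
# opener, jar = build_opener()
# html = fetch("http://acm.hit.edu.cn/ojs/authorstatus.php"
#              "?Author=AndyWang&Contestid=0", opener)
# print(len(html), "bytes;", len(jar), "cookies stored")
```

If the server still answers with an error, urllib.request raises urllib.error.HTTPError, whose code attribute shows whether the response was 403 or something else.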


-- 
"""Time is unimportant, only life important!
blog@  http://blog.zoomquiet.org/pyblosxom/
wiki@    http://wiki.woodpecker.org.cn/moin/ZoomQuiet
douban@ http://www.douban.com/people/zoomq/
____________________________________
Please use OpenOffice.org to stand for M$ office.
Please use 7-zip to stand for WinRAR.
You can get realy freedom from software.
"""
