Python Forum - Discussion Board

Title: [python-chinese] Which library is most convenient for extracting meta tag content from HTML source?

Friday, January 18, 2008, 19:17

刀巴虫子 acestrong at gmail.com
Fri Jan 18 19:17:14 HKT 2008

DIP (Dive Into Python) covers SGMLParser from sgmllib. Is there a more convenient library?
-- 
Best Regards!

Ace Strong

==================================================
Nanjing University of Aeronautics and Astronautics.
College of Civil Aviation
Tao Cheng
E-mail: acestrong at gmail.com; acestrong at nuaa.edu.cn
Tel: 86-025-84892273
==================================================

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

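Friday, January 18, 2008, 19:22

Jiahua Huang jhuangjiahua at gmail.com
Fri Jan 18 19:22:07 HKT 2008

For scraping web pages, use 美丽的汤 ("beautiful soup", that is, the Beautiful Soup library).

That said, if you only need a few pieces of metadata, writing your own regular expression works too.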
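
A minimal sketch of the regular-expression route, using only Python's standard `re` module. The sample HTML, the pattern, and the `extract_meta` helper are illustrative assumptions, not something posted in the thread. The pattern expects `name` to come before `content` and both values to be quoted, so it only suits simple, predictable markup.

```python
import re

# Hypothetical sample page; not taken from the thread.
SAMPLE_HTML = """
<html><head>
<meta name="keywords" content="python, html, parsing">
<meta name="description" content="A small demo page">
<title>Demo</title>
</head><body></body></html>
"""

# Assumption: name="..." precedes content="..." and both values are quoted.
META_RE = re.compile(
    r'<meta\s+name=["\']([^"\']+)["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)


def extract_meta(html):
    """Return a dict mapping each meta name (lowercased) to its content."""
    return {name.lower(): content for name, content in META_RE.findall(html)}


if __name__ == "__main__":
    print(extract_meta(SAMPLE_HTML))
```

Tags whose attributes appear in another order, or whose values are unquoted, are silently skipped by this pattern, which is the usual argument for a real parser once the input gets messy.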


Friday, January 18, 2008, 19:59

刀巴虫子 acestrong at gmail.com
Fri Jan 18 19:59:18 HKT 2008

I'm trying out Beautiful Soup now. It works really well, and I've already extracted what I needed.
Thanks a lot!
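
For readers who want to see roughly what such an extraction looks like, here is a minimal sketch. It assumes the current `beautifulsoup4` package (imported as `bs4`) rather than the 2008-era release the poster would have used, and the sample HTML and the `meta_contents` helper are hypothetical.

```python
from bs4 import BeautifulSoup  # assumption: installed via `pip install beautifulsoup4`

# Hypothetical sample page; not taken from the thread.
SAMPLE_HTML = """
<html><head>
<meta charset="utf-8">
<meta name="keywords" content="python, html, parsing">
<meta content="A small demo page" name="Description">
<title>Demo</title>
</head><body></body></html>
"""


def meta_contents(html):
    """Return {name: content} for every <meta> tag that has both attributes."""
    # "html.parser" selects Python's built-in parser, so no extra parser is needed.
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for tag in soup.find_all("meta"):
        name = tag.get("name")
        content = tag.get("content")
        # Skip tags such as <meta charset=...> that lack name or content.
        if name is not None and content is not None:
            result[name.lower()] = content
    return result


if __name__ == "__main__":
    print(meta_contents(SAMPLE_HTML))
```

Unlike a hand-written regular expression, this does not care about attribute order or quoting style, since the parser normalizes both before the attributes are read.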


Saturday, January 19, 2008, 09:37

beck917 beck917 at gmail.com
Sat Jan 19 09:37:31 HKT 2008

Nice, nice. I'll give Beautiful Soup a try too when I get the chance... haven't used it yet. :-)

 


Sunday, January 20, 2008, 13:52

tairan wang python at tairan.net
Sun Jan 20 13:52:34 HKT 2008

Pardon my asking, but what is 美丽的汤 ("beautiful soup")?




Sunday, January 20, 2008, 13:56

cunheise cunheise at hotmail.com
Sun Jan 20 13:56:02 HKT 2008

beautifulsoup, google it



Sunday, January 20, 2008, 22:12

realfun realfun at gmail.com
Sun Jan 20 22:12:51 HKT 2008

Is it this one?
http://www.crummy.com/software/BeautifulSoup/

I haven't been able to reach the site for the past two days.



-- 
http://www.2maomao.com/blog

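Sunday, January 20, 2008, 23:07

憨狗 hackgou at gmail.com
Sun Jan 20 23:07:32 HKT 2008

Beautiful Soup really does taste great. I strongly recommend that everyone try a bowl.
Especially in the dead of winter: one mouthful and the hassle of parsing HTML is greatly reduced.
You can get to bed early and warm up under the covers!
:P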





-- 
A personal blog covering the LAMP platform, security, and web development: http://hackgou.itbbq.com
PGP KeyID: hackgou#Gmail.com
PGP KeyServ: subkeys.pgp.net
