Python Forum - Discussion Board

Subject: Re: [python-chinese] Question: using regular expressions on HTML files

Thursday, April 15, 2004, 11:45

Zoom.Quiet zoomq at infopro.cn
Thu Apr 15 11:45:33 HKT 2004

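Hello info:

  If the song information is part of the page content, you have to download the page first and then parse it!
Something like """
        import urllib2
        shRequest = urllib2.Request(url)  # url is a placeholder for the address of the results page
        shRequest.add_header("Accept-Language","zh-cn")
        shRequest.add_header("Content-Type","text/html; charset=gb2312")
        shRequest.add_header("User-Agent","Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)")
        fload = urllib2.urlopen(shRequest)
        _fobj = fload.read()
"""
Then run your matching against _fobj, which holds the page content that read() returns as a string!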
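
As an illustration of that matching step, here is a minimal sketch in Python 3 that pulls out both the address and the song name in one pass by using two capture groups. It assumes each result is linked as `<a href="...mp3">title</a>`; the actual markup of the results page is not shown in this thread, so `LINK_RE`, `extract_songs`, and the sample HTML are hypothetical and the pattern would need adjusting to the real page.

```python
import re

# Hypothetical anchor pattern: group 1 is the .mp3 URL, group 2 is the link text.
LINK_RE = re.compile(
    r'<a\s[^>]*href="(http://[^"]+\.mp3)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
# Used to drop any markup (such as <b>...</b>) nested inside the link text.
TAG_RE = re.compile(r"<[^>]+>")


def extract_songs(html):
    """Return a list of (url, title) pairs for every matching link in html."""
    songs = []
    for url, raw_title in LINK_RE.findall(html):
        title = TAG_RE.sub("", raw_title).strip()
        songs.append((url, title))
    return songs


# Made-up sample page; a real page fetched as bytes would first need
# decoding, e.g. page_bytes.decode("gb2312", errors="replace").
sample = (
    '<a href="http://example.com/music/a.mp3">前门情深大碗茶</a>'
    '<a href="http://example.com/about.html">About</a>'
    '<a target="_blank" href="http://example.com/music/b.mp3"><b>Song B</b></a>'
)

for url, title in extract_songs(sample):
    print(title, url)
```

With more than one group in the pattern, `re.findall` returns one tuple per match, which is what keeps each name paired with its address. For anything beyond simple, regular markup an HTML parser is usually more robust than a regular expression.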

/******** [2004-04-15]11:43:41 ; you wrote:

info at xichen.com> Hello everyone!

info at xichen.com> 	     I use the following program to fetch content over HTTP:
info at xichen.com> import re, string, urllib
info at xichen.com> def urlify(txt):
info at xichen.com>      # Find the URLs that match the pattern below (intended to catch .mp3 links).
info at xichen.com>      txt = re.findall(r"http://[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.[0-9a-zA-Z_-]{1,}.['mp3']{2,3}", txt)
info at xichen.com>      return txt
info at xichen.com> def openurl(song):
info at xichen.com>     source = 'http://mp3.baidu.com/m?rn=&tn=baidump3&ct=134217728&word='+song
info at xichen.com>     try:
info at xichen.com>          fhin = urllib.urlopen(source)
info at xichen.com>     except IOError:
info at xichen.com>          print(source+' could not be opened!')
info at xichen.com>          return []  # nothing to parse if the page could not be opened

info at xichen.com>     doc = ''
info at xichen.com>     for line in fhin.readlines(): # Need to normalize line endings!
info at xichen.com>         tmp=string.rstrip(line)
info at xichen.com>         if tmp.find('baidu.com')==-1:
info at xichen.com>             doc = doc+tmp
info at xichen.com>     return urlify(doc)
info at xichen.com> print openurl('前门情深大碗茶')


info at xichen.com> The result contains only the addresses found in the HTML. I would also like to extract the song names. How can I do that with a regular expression?

info at xichen.com>         Best regards!


info at xichen.com>         info
info at xichen.com>         info at xichen.com
info at xichen.com>           2004-04-15

********************************************/

-- 
Free as in Freedom

 Zoom.Quiet                           

#=========================================#
]Time is unimportant, only life important![
#=========================================#

sender is the Bat!2.02 CE



[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Thursday, April 15, 2004, 14:13

info at xichen.com info at xichen.com
Thu Apr 15 14:13:02 HKT 2004

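Hello Zoom.Quiet!

	Even if I fetch the page your way, I still have the problem of extracting the song names with a regular expression, because a search for "的" returns every song whose title contains "的". Thank you; I hope you can help me solve this.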
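
One way to deal with a search that returns too many songs is to filter the results by title once the names have been extracted alongside the addresses. The sketch below is in Python 3 and assumes the results are already available as (url, title) pairs; `pick_exact` and the sample data are hypothetical and only illustrate the idea.

```python
def pick_exact(songs, wanted):
    """Keep only the (url, title) pairs whose title equals wanted exactly."""
    wanted = wanted.strip()
    return [(url, title) for url, title in songs if title.strip() == wanted]


# Made-up search results: a keyword search returns every title containing it.
results = [
    ("http://example.com/a.mp3", "前门情深大碗茶"),
    ("http://example.com/b.mp3", "大碗茶"),
    ("http://example.com/c.mp3", " 前门情深大碗茶 "),
]

for url, title in pick_exact(results, "前门情深大碗茶"):
    print(url)
```

Comparing whole titles rather than checking whether the keyword occurs in the title is what separates the wanted song from the others that merely contain the same characters.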

= = = = = = = = = = = = = = = = = = = =
			

        Best regards!
 
				 
        info
        info at xichen.com
          2004-04-15



Thursday, April 15, 2004, 20:52

Anew Anewboy at citiz.net
Thu Apr 15 20:52:50 HKT 2004

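Hello Zoom.Quiet!

	I like vi too, but lately I have come to feel that emacs really is a very powerful editor. I use vi for editing text because it is more efficient, while emacs is better suited to programming.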

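======= 2004-04-15 11:13:00 you wrote: =======

>Hello Anew:
>
>  http://xemacs.cosoft.org.cn/pmwiki/pmwiki.php/Emacs/HomePage
>
>is a Chinese-language site dedicated to Emacs; it can serve as a starting point for exploring!
>
>That said, I would recommend Vim: lightweight, fast, and hassle-free!
>
>Especially when paired with Cream for Vim!
>http://cream.sourceforge.net/features.html
>
>
>/******** [2004-04-15]11:10:05 ; you wrote:
>
>Anew> Hello everyone:
>
>Anew> 	Has anyone set up an emacs environment? I want to use it to debug Python.
>Anew> python-mode seems too weak for that (or perhaps I just have not found the right features), sob...
>Anew> 	Does anyone have good suggestions? I want to develop on Linux, but I have no good IDE environment.
>	
>
>Anew>         Best regards!
> 				
>
>Anew>         Anew
>Anew>         Anewboy at citiz.net
>Anew>           2004-04-15
>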
>
>********************************************/

= = = = = = = = = = = = = = = = = = = =
			

        Best regards!
 
				 
        Anew
        Anewboy at citiz.net
          2004-04-15



Thursday, April 15, 2004, 22:50

zhang yancy yancy_zhang at hotmail.com
Thu Apr 15 22:50:23 HKT 2004

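As the subject says: on Windows I use EditPlus, and its remote editing feels great; plenty of Windows editors offer this feature.
Does anyone know of software for Linux that does the same?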


>From: "Anew"<Anewboy at citiz.net>
>Reply-To: python-chinese at lists.python.cn
>To: "python-chinese" <python-chinese at lists.python.cn>
>Subject: Re: Re: [python-chinese] Who has set up an emacs environment
>Date: Thu, 15 Apr 2004 20:52:50 +0800