Python Forum - Discussion Board

Subject: HTML extraction

Friday, June 15, 2007, 06:39

Hui Wang jackie_stata at yahoo.ca
Fri Jun 15 06:39:28 HKT 2007

I want to extract the professors' basic information from the web page below, including name, position, degree, graduating institution, phone number, email address, and research fields.
  
http://www.economics.utoronto.ca/index.php/index/person/faculty
   
  Below is the HTML I excerpted for two of the professors:
   
    
  
  http://ww2.economics.utoronto.ca/photos/AGUIRREGABIRIA.JPG" alt="Aguirregabiria, Victor">
  http://www.economics.utoronto.ca/index.php/index/person/person/faculty/746" class="name">Aguirregabiria, Victor, Associate Professor
  Ph.D.
  CEMFI - Universidad Complutense, Madrid, 1995
  416-978-4358
  Research fields: Applied econometrics, Industrial organization

  http://ww2.economics.utoronto.ca/photos/AIVAZIAN.JPG" alt="Aivazian, Varouj A.">
  http://www.economics.utoronto.ca/index.php/index/person/person/faculty/90" class="name">Aivazian, Varouj A., Professor; Chair, University of Toronto at Mississauga; Director, MFE Program
  Ph.D.
  Ohio State, 1975
  416-978-2375
  Research fields: Financial economics, Law and economics
   
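  I have the following questions:
1. The degree and the graduating school do not seem to follow any pattern. How should I extract them with a regular expression? (That is, how do I extract "Ph.D.", "CEMFI - Universidad Complutense, Madrid, 1995" and "Ph.D.", "Ohio State, 1975" from the HTML above?)
2. How do I extract victor.aguirregabiria at utoronto.ca from the HTML?
3. I want to save the result in a txt file, with one professor per line and all of that professor's fields on the line (some fields may be empty). How can I do this?

  Sorry for the long message. Many thanks for your guidance!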
   
  Jackie

       
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20070614/e9acd522/attachment.html 

[Imported from Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Friday, June 15, 2007, 07:02

Andelf andelf at gmail.com
Fri Jun 15 07:02:31 HKT 2007

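On 07-6-15, Hui Wang <jackie_stata at yahoo.ca> wrote:
>
>
> I have the following questions:
> 1. The degree and the graduating school do not seem to follow any pattern. How should I extract them
> with a regular expression? (That is, how do I extract "Ph.D.", "CEMFI - Universidad Complutense, Madrid, 1995" and "Ph.D.", "Ohio State, 1975" from the HTML above?)
>

There are only a handful of possible degrees, or else there is none at all, so this does not look hard.
Besides, this information sits in a specific node with a specific format; it can be identified by the opening tag ending in valign="top" colspan="2"> and its closing tag.
I don't think you have to use re; I like to use split.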
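
A rough sketch of the split approach, under stated assumptions: the real markup is not visible in this thread, so the `<td>` and `<br>` tags, the `SAMPLE` string, and the helper names `cells` and `degree_and_school` are all made up for illustration. Only the attribute fragment `valign="top" colspan="2">` comes from the thread.

```python
# Assumed markup: only the fragment valign="top" colspan="2"> is known;
# the <td>, <br> and <tr> tags here are guesses for illustration.
SAMPLE = (
    '<tr><td valign="top" colspan="2">Ph.D.<br>'
    'CEMFI - Universidad Complutense, Madrid, 1995</td></tr>'
    '<tr><td valign="top" colspan="2">Ph.D.<br>Ohio State, 1975</td></tr>'
)

OPEN_MARK = 'valign="top" colspan="2">'
CLOSE_MARK = "</td>"  # assumed closing tag


def cells(html):
    """Return the text between each OPEN_MARK and the next CLOSE_MARK."""
    found = []
    # The [1:] slice drops everything before the first marker.
    for chunk in html.split(OPEN_MARK)[1:]:
        found.append(chunk.split(CLOSE_MARK)[0].strip())
    return found


def degree_and_school(cell):
    """Split one cell into (degree, school) on the assumed <br> separator."""
    parts = [p.strip() for p in cell.split("<br>")]
    degree = parts[0]
    school = parts[1] if len(parts) > 1 else ""
    return degree, school


for cell in cells(SAMPLE):
    print(degree_and_school(cell))
```

If the real page separates the degree and the school differently, only `degree_and_school` would need to change.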

> 2. How do I extract victor.aguirregabiria at utoronto.ca from the HTML?

Extract the two strings separately first, then reverse them in Python and join them.
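
A minimal sketch of that idea. The markup that carried the address is not shown in this thread, so the obfuscation scheme here is purely an assumption: two quoted strings, domain first, each written backwards. The `snippet` text, the `mail(...)` wrapper, and the placeholder address are hypothetical.

```python
import re

# Hypothetical obfuscated fragment; the real page layout is unknown.
snippet = 'mail("gro.elpmaxe", "eod.enaj")'


def recover_email(text):
    """Pull the two quoted strings, reverse each one, and join them with '@'."""
    domain_rev, user_rev = re.findall(r'"([^"]*)"', text)
    return user_rev[::-1] + "@" + domain_rev[::-1]


print(recover_email(snippet))
```

If the page instead stores the two parts in swapped order without reversing the characters, the slicing with `[::-1]` would simply be dropped.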

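> 3. I want to save the result in a txt file, with one professor per line and all of that professor's fields on the line (some fields may be empty). How can I do this?
>
>

I recommend using a CSV file. It is a plain-text format, and Python has a library for working with CSV.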
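
A small sketch with the standard `csv` module, one professor per row. The column names in `FIELDS` and the helper `write_rows` are illustrative choices, not something fixed by the thread; missing fields are written as empty strings.

```python
import csv
import io

# Illustrative column names.
FIELDS = ["name", "position", "degree", "school", "phone", "email", "fields"]

professors = [
    {
        "name": "Aguirregabiria, Victor",
        "position": "Associate Professor",
        "degree": "Ph.D.",
        "school": "CEMFI - Universidad Complutense, Madrid, 1995",
        "phone": "416-978-4358",
        "fields": "Applied econometrics, Industrial organization",
    },
    # Some fields left out on purpose to show that they come out empty.
    {"name": "Aivazian, Varouj A.", "degree": "Ph.D.", "school": "Ohio State, 1975"},
]


def write_rows(rows, out):
    """Write a header plus one line per professor; absent keys become ''."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_rows(professors, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass `open("faculty.csv", "w", newline="", encoding="utf-8")` (the file name is arbitrary) as `out`. Values that contain commas are quoted automatically.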

Friday, June 15, 2007, 10:09

haur hekun06 at gmail.com
Fri Jun 15 10:09:31 HKT 2007

Saving it as XML would be best.
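
One possible shape for that, sketched with the standard `xml.etree.ElementTree` module. The element names `faculty` and `professor`, and the helper `to_xml`, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def to_xml(rows):
    """Build <faculty><professor>...</professor></faculty> from a list of dicts."""
    root = ET.Element("faculty")
    for row in rows:
        prof = ET.SubElement(root, "professor")
        for key, value in row.items():
            # Each dict key becomes a child element holding the value as text.
            ET.SubElement(prof, key).text = value
    return ET.tostring(root, encoding="unicode")


print(to_xml([{"name": "Aivazian, Varouj A.", "degree": "Ph.D.", "school": "Ohio State, 1975"}]))
```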


Friday, June 15, 2007, 11:53

Hui Wang jackie_stata at yahoo.ca
Fri Jun 15 11:53:50 HKT 2007

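Thanks.

I think identifying the degree and the graduating school by the opening tag ending in valign="top" colspan="2"> and its closing tag is problematic. For example:

---------------
http://www.economics.utoronto.ca/index.php/index/person/person/faculty/746" class="name">Aguirregabiria, Victor, Associate Professor
Ph.D.
CEMFI - Universidad Complutense, Madrid, 1995
------------------------

"..." and "CEMFI - Universidad Complutense, Madrid, 1995" both sit between such an opening tag and its closing tag. How do I write a regular expression that extracts the latter?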


Friday, June 15, 2007, 12:10

Andelf andelf at gmail.com
Fri Jun 15 12:10:10 HKT 2007

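On 07-6-15, Hui Wang <jackie_stata at yahoo.ca> wrote:
>
> Thanks.
> "..." and "CEMFI - Universidad Complutense, Madrid, 1995" both sit between an opening tag ending in valign="top" colspan="2"> and its closing tag. How do I write a regular expression that extracts the latter?
>

If it were me, I would not mind the extra work: extract everything first, then check each piece afterwards.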
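
A hedged sketch of "extract first, check afterwards". The `<td>` markup in `SAMPLE` is assumed, since the real tags are not visible in this thread. The two checks are also assumptions: a short list of known degrees (`KNOWN_DEGREES`, illustrative) and the guess that a school line ends with a four-digit year.

```python
import re

# Assumed markup: several cells share the same opening tag.
SAMPLE = (
    '<td valign="top" colspan="2"><a href="#" class="name">'
    'Aguirregabiria, Victor, Associate Professor</a></td>'
    '<td valign="top" colspan="2">Ph.D.</td>'
    '<td valign="top" colspan="2">CEMFI - Universidad Complutense, Madrid, 1995</td>'
)

CELL_RE = re.compile(r'<td valign="top" colspan="2">(.*?)</td>', re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
KNOWN_DEGREES = {"Ph.D.", "M.A.", "B.A."}  # illustrative, extend as needed
YEAR_RE = re.compile(r"\b(19|20)\d{2}$")   # heuristic: school line ends in a year


def classify(html):
    """Grab every matching cell, then decide what each one holds."""
    record = {"degree": "", "school": ""}
    for raw in CELL_RE.findall(html):
        text = TAG_RE.sub("", raw).strip()  # drop nested tags such as <a>
        if text in KNOWN_DEGREES:
            record["degree"] = text
        elif YEAR_RE.search(text):
            record["school"] = text
    return record


print(classify(SAMPLE))
```

Cells that pass neither check, such as the name cell here, are ignored, and both fields stay empty when nothing matches.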
