2008-01-02, Wednesday, 10:52
I'm planning to write a desktop Google search program. What I found online says to use the Google API, but after logging in I discovered that Google has already discontinued that service.

Does anyone have a workaround? Thanks.

--
deSign thE fuTure
2008-01-02, Wednesday, 10:58
Google Desktop?

On Jan 2, 2008 1:52 PM, 小龙 <freefis at gmail.com> wrote:
> I'm planning to write a desktop Google search program. What I found online says to use the Google API, but after logging in I discovered that Google has already discontinued that service.
>
> Does anyone have a workaround? Thanks.
>
> --
> deSign thE fuTure
> _______________________________________________
> python-chinese
> Post: send python-chinese at lists.python.cn
> Subscribe: send subscribe to python-chinese-request at lists.python.cn
> Unsubscribe: send unsubscribe to python-chinese-request at lists.python.cn
> Detail Info: http://python.cn/mailman/listinfo/python-chinese

--
Any complex technology which doesn't come with documentation must be the best available.
2008-01-02, Wednesday, 11:24
I'm not that ambitious. Some basic web search functionality is all I need. Do you have a solution?

On 08-1-2, xxmplus <xxmplus at gmail.com> wrote:
> Google Desktop?

--
deSign thE fuTure
2008-01-02, Wednesday, 11:44
If that doesn't work, just post the query straight to Google over HTTP and scrape the results out of the response.

On 1/2/08, 小龙 <freefis at gmail.com> wrote:
> I'm not that ambitious. Some basic web search functionality is all I need. Do you have a solution?

--
http://www.flyaflya.com
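A minimal sketch of the "send the query over HTTP" step, written for Python 3 and not taken from anyone's post in this thread. The endpoint, the User-Agent string and the function name build_search_request are assumptions chosen for illustration. A search query of this kind is normally sent as a GET with the terms in the query string, which is what the sketch builds; it only constructs the request and sends nothing, so it runs offline.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration; neither is taken from the thread.
SEARCH_ENDPOINT = "http://www.google.com/search"
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_search_request(query, start=0, num=10):
    """Return a urllib Request for one page of results (nothing is sent)."""
    # urlencode percent-encodes non-ASCII terms as UTF-8 and joins the pairs.
    params = urlencode({"q": query, "num": num, "start": start})
    return Request(SEARCH_ENDPOINT + "?" + params,
                   headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    req = build_search_request("python 中文", start=10)
    print(req.get_method(), req.full_url)
    # Fetching the page would be urllib.request.urlopen(req).read();
    # it is left out so the sketch runs without network access.
```

Setting a browser-like User-Agent is a common step when scraping, because some sites answer the default library agent differently. Whether scraping is permitted is up to the site's terms of service.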
2008-01-02, Wednesday, 11:50
How does Google actually display its results? Does it call JS, or does it use a frame? Can the page be embedded?

--
deSign thE fuTure
2008-01-02, Wednesday, 11:52
Use Lucene to build the index. You can simply make one yourself.

On Jan 2, 2008 11:44 AM, flyaflya <flyaflya at gmail.com> wrote:
> If that doesn't work, just post the query straight to Google over HTTP and scrape the results out of the response.

--
regards
j.L
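To give a rough idea of what "building the index yourself" involves, here is a toy inverted index in Python 3. It is not Lucene and does not use Lucene's API; the class name TinyIndex and its methods are made up for this sketch. It only illustrates the core idea that an index maps each term to the documents containing it, so a query becomes a set intersection.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class TinyIndex:
    """Toy inverted index: maps each term to the set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index one document under every term it contains."""
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return sorted ids of the documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        # .get() is used so that unknown terms do not create empty entries.
        hits = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.postings.get(term, set())
        return sorted(hits)


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1, "Python web search")
    index.add(2, "Lucene search index")
    print(index.search("search"))
    print(index.search("lucene index"))
```

A real engine adds ranking, stemming and on-disk storage, and Chinese text would need a proper word segmenter instead of the simple regular expression used here.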
2008-01-02, Wednesday, 11:58
That is way over my head. I have no concept of any of this. I'll go search around. Could you give me a few pointers? Heh.

--
deSign thE fuTure
2008-01-02, Wednesday, 14:28
On Jan 2, 2008 11:24 AM, 小龙 <freefis at gmail.com> wrote:
> I'm not that ambitious. Some basic web search functionality is all I need. Do you have a solution?

What is this software for?
2008-01-02, Wednesday, 14:55
Scraping the web page is of course the most convenient way. Here is web_google.py for you:

    huahua at huahua:googlekids$ python web_google.py python.cn
    title: FrontPage — Portal for CPUG.org
    url: http://python.cn/
    summary: PYTHON.CN 域名及邮件服务由Exoweb捐赠, 主机空间由啄木鸟开源社区捐赠。 Plone and its visual design is Copyright (c) 2000-2008 by Alexander Limi, Alan Runyan, ...

    title: python-cn:CPyUG | Google Groups
    url: http://groups.google.com/group/python-cn
    summary: 同质列表: http://python.cn/mailman/listinfo/python-chinese ... Qu... at gmail.com - Dec 16. XML Send email to this group: python-cn at googlegroups.com ...

    title: 啄木鸟Pythonic 开源社区资源图谱
    url: http://www.woodpecker.org.cn/
    summary: python-cn. leaf Otter. leaf OGNS-Compass. hide kaddressbook 即时讨论. hide UC 群组. leaf 4070424. hide QQ群组. leaf 1073669. leaf 2567903. leaf 在线语音会课 ...

Also, I suggest using python-cn at googlegroups.com in place of python-chinese at lists.python.cn.

On Jan 2, 2008 10:52 AM, 小龙 <freefis at gmail.com> wrote:
> I'm planning to write a desktop Google search program. What I found online says to use the Google API, but after logging in I discovered that Google has already discontinued that service.
>
> Does anyone have a workaround? Thanks.

Attachment: web_google.py (text/x-python, 1740 bytes)
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/a2d0df6a/attachment.py
2008-01-02, Wednesday, 14:57
On Jan 2, 2008 2:55 PM, Jiahua Huang <jhuangjiahua at gmail.com> wrote:
> Scraping the web page is of course the most convenient way.
> Here is web_google.py for you.

    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
    """Scrape Google search results
    @version: $Id$
    @author: U{Jiahua Huang}
    @license: LGPL
    @see: urllib2
    """

    import re
    import urllib2

    ## Set the User-Agent sent by urllib2
    opener = urllib2.build_opener()
    opener.addheaders = [('User-Agent','Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)')]

    ## Search URL, needs %(key, page)
    search_url = 'http://www.google.cn/search?q=%s&num=100&complete=1&hl=zh-CN&newwindow=1&client=firefox&rls=com.ubuntu:en-US:official&start=%s&sa=N'

    def _html2txt(s):
        '''Strip HTML tags
        '''
        return re.sub(r'<[^>]+>', '', s)

    def _gethtml(key, page=0):
        '''Fetch a Google search results page
        return the page source
        '''
        key = urllib2.quote(key)
        page = 100*int(page)
        url = search_url%(key, page)
        try:
            return opener.open(url).read()
        except:
            return ''

    def _dorev(s):
        '''Parse a fetched Google search results page
        return [[url, title, summary],]
        '''
        # NOTE: three string literals below held HTML tags that the list
        # archive stripped. They are left empty here, so this copy will not
        # run until the delimiters are restored from attachment.py.
        if not s: return []
        rev=[]
        s1 = s.split(r'')[1:]   # tag literal lost: separator between results
        for i in s1:
            # Skip non-HTML hits, which Google labels '文件格式:' ("File format:")
            if i.rfind('文件格式:')>-1:
                continue
            #url = i.split('"')[1]
            url = re.findall('href=".*?"', i)[0][6:-1]
            title = _html2txt(i.split('')[0])   # tag literal lost: end of title
            try:
                # tag pattern lost: the regex that matched the summary block
                summary = _html2txt(re.findall(r'', i)[0]).replace('"','').replace("'","")
            except:
                continue
            rev.append([url, title, summary])
        return rev

    def getrev(key):
        '''Get the search results
        return [[url, title, summary],]
        '''
        return _dorev(_gethtml(key))

    if __name__=="__main__":
        import sys
        for url, title, summary in getrev(' '.join(sys.argv[1:])):
            print 'title:', title
            print 'url:', url
            print 'summary:', summary
            print
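The script above is Python 2 and pulls links out of the page with string splits and regular expressions. As an alternative sketch, assuming Python 3, the standard library's html.parser can collect links and their text without depending on exact tag spellings. The class LinkCollector, the helper extract_links and the SAMPLE markup are made up for illustration; the sample is not real Google output and this is not the author's method.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> being read, or None
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the markup."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up markup for illustration only.
SAMPLE = '<div><a href="http://python.cn/">Front<b>Page</b></a> <a>no href</a></div>'

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE):
        print(href, text)
```

Nested tags inside a link, such as the bold part of the sample title, are handled because only text nodes are accumulated, which does the same job as _html2txt in the original script.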
2008-01-02, Wednesday, 16:30
Thanks! You really are an expert.

On 08-1-2, Jiahua Huang <jhuangjiahua at gmail.com> wrote:
> Scraping the web page is of course the most convenient way.
> Here is web_google.py for you.

--
deSign thE fuTure