Python Forum - Discussion Board

Subject: [python-chinese] Planning to write a desktop Google search program

Wednesday, January 2, 2008, 10:52

小龙 freefis at gmail.com
Wed Jan 2 10:52:17 HKT 2008

I'm planning to write a desktop Google search program. What I found online says to use the Google API, but after logging in I discovered that Google has already discontinued that service.

Does anyone have a workaround? Thanks.

-- 
deSign thE  fuTure
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/04504eda/attachment.html

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 10:58

xxmplus xxmplus at gmail.com
Wed Jan 2 10:58:15 HKT 2008

google desktop?

-- 
Any complex technology which doesn't come with documentation must be the best
available.

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 11:24

小龙 freefis at gmail.com
Wed Jan 2 11:24:21 HKT 2008

I'm not that ambitious. Getting some basic web search working would be enough. Do you have a solution?

-- 
deSign thE  fuTure
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/5afe4a8c/attachment.htm

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 11:44

flyaflya flyaflya at gmail.com
Wed Jan 2 11:44:22 HKT 2008

If that doesn't work, just post the query to Google directly over HTTP and then scrape the results out of the response.

-- 
http://www.flyaflya.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/8c963173/attachment.htm

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 11:50

小龙 freefis at gmail.com
Wed Jan 2 11:50:00 HKT 2008

>
> How does Google actually display its results? Does it call JS, or use a frame?


Can it be embedded?



-- 
deSign thE  fuTure
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/e069ac4c/attachment.html

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 11:52

j.L liuping.james at gmail.com
Wed Jan 2 11:52:21 HKT 2008

Use Lucene to build an index... you can just make one yourself.
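
Lucene itself is a Java library, and what follows is not Lucene's API. It is a toy Python sketch of the core idea behind such an index: an inverted index that maps each word to the documents containing it. The names `tokenize`, `build_index`, and `search`, and the sample documents, are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Made-up sample documents.
docs = {
    1: "Python web search with urllib",
    2: "Desktop search program written in Python",
    3: "Lucene builds an index for full-text search",
}
index = build_index(docs)
print(search(index, "python search"))
```

Note that this only searches documents you already have; it does not replace fetching results from a web search engine.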

-- 
regards
j.L
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/912b6b00/attachment.htm 

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 11:58

小龙 freefis at gmail.com
Wed Jan 2 11:58:54 HKT 2008

That's way over my head~ I have no idea what that involves. I'll go search for it~ Could you give me some pointers? Heh.
-- 
deSign thE  fuTure
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/ac1d9cf4/attachment.html

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 14:28

Ben Luo benluo at gmail.com
Wed Jan 2 14:28:37 HKT 2008

On Jan 2, 2008 11:24 AM, 小龙 <freefis at gmail.com> wrote:

> I'm not that ambitious. Getting some basic web search working would be enough. Do you have a solution?
>
What is this software for?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/b66abe7b/attachment.html

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 14:55

Jiahua Huang jhuangjiahua at gmail.com
Wed Jan 2 14:55:49 HKT 2008

Scraping the web page is of course the most convenient way.
Here's a web_google.py for you.

huahua at huahua:googlekids$ python web_google.py  python.cn
title: FrontPage — Portal for CPUG.org
url: http://python.cn/
summary: PYTHON.CN 域名及邮件服务由Exoweb捐赠, 主机空间由啄木鸟开源社区捐赠。 Plone and its
visual design is Copyright (c) 2000-2008 by Alexander Limi, Alan
Runyan, ...

title: python-cn:CPyUG | Google Groups
url: http://groups.google.com/group/python-cn
summary: 同质列表: http://python.cn/mailman/listinfo/python-chinese ...
Qu... at gmail.com - Dec 16. XML Send email to this group:
python-cn at googlegroups.com ...

title: 啄木鸟Pythonic 开源社区资源图谱
url: http://www.woodpecker.org.cn/
summary: python-cn. leaf Otter. leaf OGNS-Compass. hide kaddressbook
即时讨论. hide UC 群组. leaf 4070424. hide QQ群组. leaf 1073669. leaf 2567903.
leaf 在线语音会课 ...



Also, I'd suggest using the address python-cn at googlegroups.com instead of python-chinese at lists.python.cn

-------------- next part --------------
A non-text attachment was scrubbed...
Name: web_google.py
Type: text/x-python
Size: 1740 bytes
Desc: not available
Url : http://python.cn/pipermail/python-chinese/attachments/20080102/a2d0df6a/attachment.py 

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 14:57

Jiahua Huang jhuangjiahua at gmail.com
Wed Jan 2 14:57:49 HKT 2008

On Jan 2, 2008 2:55 PM, Jiahua Huang <jhuangjiahua at gmail.com> wrote:
> Scraping the web page is of course the most convenient way.
> Here's a web_google.py for you.


#!/usr/bin/python
# -*- coding: UTF-8 -*-
"""Scrape Google search results
@version: $Id$
@author: U{Jiahua Huang }
@license: LGPL
@see: urllib2
"""

import re
import urllib2

## Set the User-Agent for urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-Agent','Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)')]

## Search URL, requires %(key,page)
search_url = 'http://www.google.cn/search?q=%s&num=100&complete=1&hl=zh-CN&newwindow=1&client=firefox&rls=com.ubuntu:en-US:official&start=%s&sa=N'

def _html2txt(s):
	'''Strip HTML tags
	'''
	return re.sub(r'<[^>]+>', '', s)

def _gethtml(key, page=0):
	'''Fetch a Google search results page
	return the page source
	'''
	key = urllib2.quote(key)
	page = 100*int(page)
	url = search_url%(key, page)
	try:
		return opener.open(url).read()
	except:
		return ''

def _dorev(s):
	'''Parse a fetched Google search results page
	return [[url, title, summary],]
	'''
	if not s: return []
	rev=[]
	# The three literals marked "lost" below held HTML markup that the
	# archive stripped; their original contents are not preserved here.
	s1 = s.split(r'')[1:]  # result separator literal lost
	for i in s1:
		# Skip results labelled 文件格式: ("File format:")
		if i.rfind('文件格式:')>-1:
			continue
		#url = i.split('"')[1]
		url = re.findall('href=".*?"', i)[0][6:-1]
		title = _html2txt(i.split('')[0])  # title terminator literal lost
		try:
			summary = _html2txt(re.findall(r'', i)[0]).replace('"','').replace("'","")  # summary pattern lost
		except:
			continue
		rev.append([url, title, summary])
	return rev

def getrev(key):
	'''Get search results
	return [[url, title, summary],]
	'''
	return _dorev(_gethtml(key))

if __name__=="__main__":
	import sys
	for url, title, summary in getrev(' '.join(sys.argv[1:])):
		print 'title:', title
		print 'url:', url
		print 'summary:', summary
		print
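
The parsing step in the script above relies on string splitting against Google's 2008 markup. As a hedged Python 3 sketch of the same idea, the standard-library `html.parser` module can pull link targets and link text out of a page. The sample HTML below is invented, and real result pages use different (and changing) markup, so the class `LinkCollector` is an illustration rather than a working Google parser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect [url, text] pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished [url, text] pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append([self._href, "".join(self._text).strip()])
            self._href = None


# Invented sample markup, standing in for a fetched results page.
sample = (
    '<div><a href="http://python.cn/">Front<b>Page</b></a></div>'
    '<div><a href="http://www.woodpecker.org.cn/">Woodpecker</a></div>'
)
parser = LinkCollector()
parser.feed(sample)
parser.close()
for url, title in parser.links:
    print("title:", title)
    print("url:", url)
```

Compared with splitting on fixed strings, a tag-aware parser keeps working when whitespace or attribute order changes, although it still has to be adapted whenever the page structure itself changes.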

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, January 2, 2008, 16:30

小龙 freefis at gmail.com
Wed Jan 2 16:30:31 HKT 2008

Thanks, you really are an expert~~

-- 
deSign thE  fuTure
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://python.cn/pipermail/python-chinese/attachments/20080102/bcb82b84/attachment.html

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]
