Python Forum - Discussion Board

Subject: [python-chinese] A question about encodings when reading from an arbitrary data source

Thursday, November 8, 2007, 16:51

clfff.peter clfff.peter at gmail.com
Thu Nov 8 16:51:20 HKT 2007

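The code below reads from an arbitrary data source and returns the data. I used to have a script that read data from a file and decoded the text in it according to its encoding (using codecs.open). Now I want to use the code below to improve my script so that it supports several kinds of data sources, but how can I open an arbitrary data source with a specific encoding?
Thanks.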

def openAnything(source):
    if hasattr(source, "read"):
        return source

    if source == '-':
        import sys
        return sys.stdin

    # try to open with urllib (if source is http, ftp, or file URL)
    import urllib
    try:
        return urllib.urlopen(source)
    except (IOError, OSError):
        pass

    # try to open with native open function (if source is pathname)
    try:
        return open(source)
    except (IOError, OSError):
        pass

    # treat source as string
    import StringIO
    return StringIO.StringIO(str(source))
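
One possible way to get decoded text out of a function like openAnything is to make it produce a byte stream and then wrap that stream in a codecs reader for the encoding you want. The sketch below is not from the thread. It assumes Python 3 (the code above is Python 2), the name open_anything_decoded and its parameters are made up for illustration, and any file-like object passed in is assumed to yield bytes rather than text.

```python
import codecs
import io
import sys
import urllib.request


def open_anything_decoded(source, encoding="utf-8", errors="strict"):
    """Return a text reader for a stream, "-", URL, pathname, bytes, or literal text.

    Hypothetical helper for illustration; not part of the original script.
    """
    if hasattr(source, "read"):
        raw = source                    # assumption: a binary file-like object
    elif isinstance(source, bytes):
        raw = io.BytesIO(source)        # literal bytes that still need decoding
    elif source == "-":
        raw = sys.stdin.buffer          # undecoded bytes from standard input
    else:
        source = str(source)
        raw = None
        try:
            # try urllib first (http, ftp, or file URL)
            raw = urllib.request.urlopen(source)
        except (OSError, ValueError):
            try:
                # then try the source as a pathname, opened in binary mode
                raw = open(source, "rb")
            except (OSError, ValueError):
                pass
        if raw is None:
            # a plain string is already text, so there is nothing to decode
            return io.StringIO(source)
    # wrap the byte stream so that read() returns decoded text
    return codecs.getreader(encoding)(raw, errors=errors)
```

For example, open_anything_decoded("data.txt", "gbk").read() would return the file's contents as text decoded from GBK, where "data.txt" is a placeholder pathname. The key point is that the encoding is applied once, at the outer layer, so every kind of source is treated the same way.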

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Thursday, November 8, 2007, 16:57

Zoom.Quiet zoom.quiet at gmail.com
Thu Nov 8 16:57:58 HKT 2007

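On Nov 8, 2007 4:51 PM, clfff. peter <clfff.peter at gmail.com> wrote:
> The code below reads from an arbitrary data source and returns the data. I used to have a script that read data from a file and decoded the text in it according to its encoding (using codecs.open). Now I want to use the code below to improve my script so that it supports several kinds of data sources, but how can I open an arbitrary data source with a specific encoding?
> Thanks.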
http://openbookproject.googlecode.com/svn/trunk/LovelyPython/PyDays/pyd-0/cdctools.py
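def _smartcode(ustring):
    '''Smart string encoding conversion function
    @note: uses chardet.detect() to guess the string's encoding, then converts everything to UTF-8
    @param ustring: a Chinese string in some valid encoding
    @todo: more accurate guessing
    '''
Hehe, this is the simplest approach, taken from 可爱的Python (Lovely Python)...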
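
The body of _smartcode is not shown in this message, so what follows is not that function. As a rough illustration of the "guess the encoding, then normalize to UTF-8" idea, here is a standard-library-only stand-in that tries a short list of candidate encodings in order instead of calling chardet.detect(). The name to_utf8, the candidate list, and its order are all assumptions made for this sketch, which targets Python 3.

```python
def to_utf8(raw, candidates=("utf-8", "gb18030")):
    """Decode raw bytes with the first candidate encoding that fits.

    Returns a tuple (utf8_bytes, encoding_used).
    Hypothetical helper; a trial-decoding stand-in, not chardet.
    """
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue                    # this candidate does not fit, try the next
        return text.encode("utf-8"), enc
    raise ValueError("none of the candidate encodings fit the data")
```

Order matters here: strict encodings such as UTF-8 should come first, because permissive ones like GB18030 accept many byte sequences that were never meant as Chinese text. A statistical detector such as chardet avoids having to fix that order by hand, at the cost of sometimes guessing wrong on short inputs.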




Thursday, November 8, 2007, 17:15

clfff.peter clfff.peter at gmail.com
Thu Nov 8 17:15:18 HKT 2007

Eeeeee, that doesn't seem to be the same thing I was asking about, does it? :(

