Python Forum post - Zeuux

Python Forum - Discussion Board

Subject: [python-chinese] Why does adding a unicode string and a Chinese string raise an error

徐继哲

Original post, Thursday, August 10, 2006 01:06

bird devdoer devdoer at gmail.com
Thu Aug 10 01:06:18 HKT 2006

Test 1:
s1='a'
s2='中国'
s=s1+s2
No problem.
Test 2:
s1=u'a'
s2='中国'
s=s1+s2
Raises an error.
-- 
devdoer
devdoer at gmail.com
http://project.mytianwang.cn/cgi-bin/blog

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

徐继哲

Reply 0, Thursday, August 10, 2006 06:35

shhgs shhgs.efhilt at gmail.com
Thu Aug 10 06:35:27 HKT 2006

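Of course it raises an error. unicode is unicode, while the Chinese text is actually still an ordinary string, that is, raw bytes. The two are different types, so of course they cannot be added.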

> _______________________________________________
> python-chinese
> Post: send python-chinese at lists.python.cn
> Subscribe: send subscribe to
> python-chinese-request at lists.python.cn
> Unsubscribe: send unsubscribe to
> python-chinese-request at lists.python.cn
> Detail Info:
> http://python.cn/mailman/listinfo/python-chinese

徐继哲

Reply 0, Thursday, August 10, 2006 13:48

bird devdoer devdoer at gmail.com
Thu Aug 10 13:48:40 HKT 2006

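I see. So this is the effect of Python's type checking.
What I would like to know is how to ignore their types, treat both simply as binary data, and concatenate the two.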

李迎辉

Reply 0, Thursday, August 10, 2006 13:51

limodou limodou at gmail.com
Thu Aug 10 13:51:27 HKT 2006

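On 8/10/06, bird devdoer <devdoer at gmail.com> wrote:
>
> I see. So this is the effect of Python's type checking.
> What I would like to know is how to ignore their types, treat both simply as binary data, and concatenate the two.
>
You need to do an encoding conversion; both operands have to be converted to the same type first.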
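
A minimal sketch of that conversion, written for Python 3 as an assumption (the thread uses Python 2, where `str` holds bytes and `unicode` holds text; in Python 3 those roles belong to `bytes` and `str`). The choice of UTF-8 and the variable names are illustrative and not something stated in the thread.

```python
# Python 3 sketch: bytes plays the role of the Python 2 str,
# and str plays the role of the Python 2 unicode.
text = "a"                     # text, a sequence of code points
raw = "中国".encode("utf-8")   # raw bytes, assumed here to be UTF-8 encoded

# Option 1: decode the bytes so that both operands are text.
as_text = text + raw.decode("utf-8")

# Option 2: encode the text so that both operands are bytes.
as_bytes = text.encode("utf-8") + raw

print(ascii(as_text))   # 'a\u4e2d\u56fd'
print(as_bytes)         # b'a\xe4\xb8\xad\xe5\x9b\xbd'
```

Which option fits depends on what the result is for: text for further string processing, bytes for writing to a file or socket.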

-- 
I like python!
My Blog: http://www.donews.net/limodou
My Django Site: http://www.djangocn.org
NewEdit Maillist: http://groups.google.com/group/NewEdit

徐继哲

Reply 0, Thursday, August 10, 2006 13:54

bird devdoer devdoer at gmail.com
Thu Aug 10 13:54:23 HKT 2006

I simply want to treat the unicode object as raw bytes. Are you saying unicode has no physical existence?

李迎辉

Reply 0, Thursday, August 10, 2006 14:29

limodou limodou at gmail.com
Thu Aug 10 14:29:57 HKT 2006

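On 8/10/06, bird devdoer <devdoer at gmail.com> wrote:
> I simply want to treat the unicode object as raw bytes. Are you saying unicode has no physical existence?
>
It is represented as integers. The \uxxxx you see is really just a 16-bit or 32-bit integer. To turn it into a byte string you convert it to utf-8, or utf-16, or some other encoding. You can look up the earlier unicode discussions on this list. An encoding such as utf-8 is byte-oriented, whereas unicode is integer-oriented. They are different: unicode is not represented as a sequence of individual bytes at all, so the only option is to convert.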
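
The distinction between integer code points and byte-oriented encodings can be sketched as follows. This uses Python 3 syntax as an assumption (the thread is about Python 2), and the variable names are illustrative.

```python
s = "中国"

# Each character is an integer code point, not a sequence of bytes.
code_points = [ord(ch) for ch in s]
print([hex(cp) for cp in code_points])   # ['0x4e2d', '0x56fd']

# Bytes only appear once an encoding is chosen.
utf8 = s.encode("utf-8")        # 3 bytes per character for these two
utf16 = s.encode("utf-16-le")   # 2 bytes per character, no byte order mark
print(len(utf8), len(utf16))    # 6 4
```

The same two code points produce different byte sequences under each encoding, which is why the bytes cannot be concatenated meaningfully without knowing which encoding they use.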

徐继哲

Reply 0, Thursday, August 10, 2006 14:34

bird devdoer devdoer at gmail.com
Thu Aug 10 14:34:15 HKT 2006

So internally it stores integers. I understand now. Thanks, limodou.

徐继哲

Reply 0, Thursday, August 10, 2006 15:19

helium helium.sun at gmail.com
Thu Aug 10 15:19:28 HKT 2006

>>> u'a'+'b'
u'ab'
>>>

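An ordinary string and a unicode string can be added, and the result is unicode. What actually happens is that the ordinary string is decoded to unicode. But the default encoding is usually ascii, and Chinese text contains bytes with values greater than 127, so the decode fails.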
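
The implicit decode described here can be imitated explicitly. The sketch below uses Python 3 syntax as an assumption (Python 3 no longer decodes implicitly, so the ascii decode is spelled out by hand), and it assumes the bytes are UTF-8 encoded; the variable names are illustrative.

```python
raw = "中国".encode("utf-8")   # bytes with values above 127

# Pure ASCII bytes decode without trouble, mirroring u'a' + 'b'.
ok = "a" + b"b".decode("ascii")

# Roughly what Python 2 attempted implicitly for u'a' + '中国'.
error_name = None
try:
    result = "a" + raw.decode("ascii")
except UnicodeDecodeError as exc:
    result = None
    error_name = type(exc).__name__

# Naming the actual encoding explicitly succeeds.
fixed = "a" + raw.decode("utf-8")

print(ok, error_name, ascii(fixed))
```

In Python 2 the usual remedy was the same idea: call `decode` on the byte string with its real encoding before adding it to a unicode object.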


李迎辉

Reply 0, Thursday, August 10, 2006 15:23

limodou limodou at gmail.com
Thu Aug 10 15:23:44 HKT 2006

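On 8/10/06, helium <helium.sun at gmail.com> wrote:
> >>> u'a'+'b'
> u'ab'
> >>>
>
> An ordinary string and a unicode string can be added, and the result is unicode. What actually happens is that the ordinary string is decoded to unicode. But the default encoding is usually ascii, and Chinese text contains bytes with values greater than 127, so the decode fails.
>
That is still a conversion, performed automatically; it is just not done by you. And the result has been converted to unicode.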
