Post from a Python forum:

Wed Dec 6 09:43:57 HKT 2006

I think a message like this is better posted to the mailing list, where
everyone can discuss it.

On Tue, 05 Dec 2006 18:50:01 +0800, boyeestudio <boyee118 at gmail.com> wrote:

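> Hi, I'm a beginner in this area. I read your reply and tried it myself,
> but I don't understand the reasoning behind it, so I'd like to ask you
> about it. Thanks in advance.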
>
>
> 2006/12/5, Leira Hua <lhua at altigen.com.cn>:
>>
>> 1. 12 wide characters make 24s; a C string is terminated by 0.
>
>
> 1. Did you find this wide-character count with a hex editor?

Your struct definition contains WCHAR  wchPtName[12]; a WCHAR is really
just a 16-bit short that takes two bytes, so this array takes 24 bytes.
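
To make that arithmetic concrete, here is a minimal sketch in Python 3 (the code in this thread is Python 2). It assumes a 2-byte WCHAR, as on Windows, and uses the name from the sample record only as a hypothetical stand-in for the real file contents:

```python
NAME_LEN = 12      # from WCHAR wchPtName[12]
WCHAR_SIZE = 2     # assumption: 16-bit WCHAR, as on Windows

field_size = NAME_LEN * WCHAR_SIZE

# Hypothetical sample: a 5-character name, zero-padded like a C array.
raw = '甘家口大厦'.encode('utf-16-le')
field = raw.ljust(field_size, b'\x00')

# Size of the whole field, bytes used by the name, size after padding.
print(field_size, len(raw), len(field))
```

The name itself uses only part of the field; the rest is zero padding, which is why the string has to be cut at the terminator before decoding.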

>
> 2. The string is encoded as 'utf-16le'.
>
>
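> 2. How did you know that this is the encoding? If I didn't know in
> advance that the file came from VC, how would I work out the exact
> encoding?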

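Detecting an encoding is a hard problem in general. For this case, just
try a few encodings in Python and you will see. Since you said the name
is stored in VC's Unicode encoding, and it is 10 bytes of data (5
characters at 2 bytes each), it has to be UTF-16; the only open question
is le/be. Try both and you will know.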
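
The "try a few encodings" step could look roughly like this in Python 3. This is only a sketch: try_decodings is a hypothetical helper, and the 10 bytes are rebuilt from the known name rather than read from the attachment.

```python
def try_decodings(raw, encodings):
    """Return {encoding: decoded text, or None if decoding failed}."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results

# Stand-in for the 10 name bytes taken from the file (assumption).
raw = '甘家口大厦'.encode('utf-16-le')

results = try_decodings(raw, ['utf-16-le', 'utf-16-be', 'utf-8'])
for enc, text in results.items():
    print(enc, repr(text))
```

A wrong guess either fails outright or yields unrelated characters, so a human still has to look at the output and decide which candidate is readable.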

>
> 3. The records in this file end with \r\n, right?
>>
>> import struct
>>
>> fp = open('poi1.dat', 'rb')
>> rec = fp.readline()
>>
>> fmt = '24sdd'
>
>
> 3. What does '24sdd' mean? I couldn't figure it out; please explain!

The struct has 3 fields: the first is a 24-byte string, and the second
and third are both double-precision floating-point numbers.
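
Here is the same format string in a self-contained Python 3 sketch. The '<' prefix (little-endian, no padding) is my assumption; the thread uses plain '24sdd'. The record is built in memory instead of being read from poi1.dat, and decode_wchar_field is a hypothetical helper that looks for the terminator two bytes at a time, so it cannot cut a character in half.

```python
import struct

FMT = '<24sdd'   # 24-byte string, then two doubles; '<' is an assumption

# Build a hypothetical record in memory instead of reading poi1.dat.
name_field = '甘家口大厦'.encode('utf-16-le').ljust(24, b'\x00')
record = struct.pack(FMT, name_field, 116.1313, 39.12345)

def decode_wchar_field(raw):
    """Decode a zero-terminated UTF-16LE field, scanning 2 bytes at a time."""
    for i in range(0, len(raw) - 1, 2):
        if raw[i:i + 2] == b'\x00\x00':
            raw = raw[:i]   # drop the terminator and the padding after it
            break
    return raw.decode('utf-16-le')

raw_name, lon, lat = struct.unpack(FMT, record)
name = decode_wchar_field(raw_name)

print(struct.calcsize(FMT), name, lon, lat)
```

Scanning on 2-byte boundaries matters for names that contain ASCII letters: in UTF-16LE the high byte of 'A' is already 0, so a plain byte-level split on two zero bytes can land in the middle of a character.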


>
> pyrec = struct.unpack(fmt, rec)
>>
>> name = unicode(pyrec[0].split('\x00\x00')[0], 'utf-16le')
>>
>> print name.encode('gbk'), pyrec[1], pyrec[2]
>>
>>
>>
>> Read succeeded; the content is: 甘家口大厦 116.1313 39.12345
>>
>> On Tue, 05 Dec 2006 12:17:59 +0800, yang haijun
>> <veldtwolf at gmail.com> wrote:
>>
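>> > It doesn't work. The attachment is a sample file with only one record.
>> > Could you help me read it?
>> > Also, the struct module has no format code for unicode; 's' and 'p'
>> > are both for char.
>> > Are there other modules that can read a Unicode-encoded binary file?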
>> >
>> >
>> >
>> > On 06-12-5, 刘鑫 <march.liu at gmail.com> wrote:
>> >>
>> >> After reading it, try decoding with wchPtName.decode("utf-16"); if
>> >> that doesn't work, try utf-8, mbcs, or gbk.
>> >>
>> >> On 06-12-5, yang haijun <veldtwolf at gmail.com > wrote:
>> >> >
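>> >> > The problem I ran into is this:
>> >> > I have a binary file written with VC that contains multiple struct
>> >> > records. The struct is roughly as follows:
>> >> > struct POI
>> >> > {
>> >> > WCHAR  wchPtName[12];
>> >> > double  dLongitude;
>> >> > double  dLatitude;
>> >> > };
>> >> >
>> >> > Note that the wchPtName field is stored in VC's Unicode encoding
>> >> > rather than the usual ANSI, and its content is Chinese characters.
>> >> >
>> >> > My code is roughly as follows: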
>> >> > import struct
>> >> > fp = open('poi.dat', 'rb')
>> >> >
>> >> > fmt = '8sdd'
>> >> > count = struct.calcsize(fmt)
>> >> >
>> >> > rec = fp.read(count)
>> >> >
>> >> > pyrec = struct.unpack(fmt, rec)
>> >> >
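>> >> > The contents of pyrec then come out garbled. If wchPtName is
>> >> > changed to ANSI encoding, there is no problem. I think some
>> >> > encoding conversion is probably needed, but I could not get the
>> >> > conversion to work.
>> >> >
>> >> > Requirements: wchPtName must not be converted to ANSI, and the
>> >> > poi.dat file must not be read through a Python C/C++ extension.
>> >> > If I don't use the struct module, are there other modules that can
>> >> > read Unicode-encoded data?
>> >> > Could everyone please take a look?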
>> >> >
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > python-chinese
>> >> > Post: send python-chinese at lists.python.cn
>> >> > Subscribe: send subscribe to python-chinese-request at lists.python.cn
>> >> > Unsubscribe: send unsubscribe to
>> >> > python-chinese-request at lists.python.cn
>> >> > Detail Info: http://python.cn/mailman/listinfo/python-chinese
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Feel free to visit:
>> >> http://blog.csdn.net/ccat
>> >>
>> >> 刘鑫
>> >> March.Liu
>> >>
>> >>
>>
>>
>>
>> --
>> Leira Hua
>> http://my.opera.com/Leira
>>


-- 
Leira Hua
http://my.opera.com/Leira

Title: [python-chinese] A problem that has troubled me for over a month: reading a VC-written binary struct file in Python