Sunday, 28 October 2007, 11:01
1) Why do the following statements raise an error?
a> a=u"为什么下面语句会出错"
b=a.encode('utf-8')
c=b.encode('utf-16')
-----------------------------------------------------
b> a="为什么下面语句会出错"
b=a.encode('utf-8')
2) Why is the type of b "str"? Did b lose its type during the conversion? Or, put another way, how is "utf-8" reflected in the result?
a=u"b的类型"
b=a.encode('utf-8')
print type(b)
The result is: str
3) How do I get the default encoding?
4) For encode and decode, which side is the target encoding?
Thanks in advance.
Sunday, 28 October 2007, 11:38
On 07-10-28, ??? ?? <clfff.peter at gmail.com> wrote:
>
> 1) Why do the following statements raise an error?
> a> a=u"为什么下面语句会出错"
> b=a.encode('utf-8')
> c=b.encode('utf-16')
b is already an ordinary string.
> -----------------------------------------------------
> b> a="为什么下面语句会出错"
> b=a.encode('utf-8')
a is not a unicode string.
>
> 2) Why is the type of b "str"? Did b lose its type during the conversion? Or, put another way, how is "utf-8" reflected in the result?
> a=u"b的类型"
> b=a.encode('utf-8')
> print type(b)
> The result is: str
b is an ordinary string encoded as utf-8.
>
> 3) How do I get the default encoding?
sys.getdefaultencoding()
This is Python's default encoding.
>
> 4) For encode and decode, which side is the target encoding?
encode converts a unicode string into a string in the given encoding.
decode does the opposite.
>
> Thanks in advance.
> _______________________________________________
> python-chinese
> Post: send python-chinese at lists.python.cn
> Subscribe: send subscribe to python-chinese-request at lists.python.cn
> Unsubscribe: send unsubscribe to python-chinese-request at lists.python.cn
> Detail Info: http://python.cn/mailman/listinfo/python-chinese
--
Tao Fei (陶飞)
My Blog: blog.filia.cn
My Summer Of Code Blog: filiasoc.blogspot.com
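The four answers above can be checked with a short sketch. The thread is about Python 2; the code below is an assumed Python 3 translation, in which Python 2's unicode type is called str and Python 2's str type is called bytes. The variable names other than a and b are illustrative only.

```python
import sys

# Python 3 naming: text is `str` (Python 2 `unicode`),
# encoded data is `bytes` (Python 2 `str`).
a = "\u4e3a\u4ec0\u4e48\u4e0b\u9762\u8bed\u53e5\u4f1a\u51fa\u9519"  # the 10-character test string

b = a.encode("utf-8")            # text -> bytes
b_type = type(b).__name__        # the codec name is not recorded on the result
round_trip = b.decode("utf-8")   # bytes -> text, using the same codec

# Case a>: b is already encoded, so it cannot be encoded a second time.
# Python 3 makes this explicit: bytes has no .encode() method at all.
try:
    b.encode("utf-16")
    second_encode_error = None
except AttributeError as exc:
    second_encode_error = type(exc).__name__

# Question 3: the interpreter-wide default codec.
default_encoding = sys.getdefaultencoding()

print(b_type, len(b), second_encode_error, default_encoding)
```

In Python 2 the second encode failed for a related reason: str.encode first tried an implicit decode with the default ASCII codec, and the non-ASCII bytes made that step raise UnicodeDecodeError. The same implicit step explains case b>.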
Monday, 29 October 2007, 13:30
However, consider the following statements:
a="为什么下面语句会出错"
a.decode('utf-16')
a.decode('utf-8') # why is this the only line that raises an error?
a.decode('big5')
a.decode('gb2312')
Also, you said:
encode converts a unicode string into a string in the given encoding.
decode does the opposite.
But str and unicode both have decode and encode.
Monday, 29 October 2007, 13:55
Your text probably contains some special characters. Add an argument, a.decode('utf-8','ignore'), to ignore the characters that cannot be converted.
--
Blog http://vicalloy.spaces.live.com/
My googlepage http://vicalloy.googlepages.com/
OldPhoto http://www.lzpian.com/
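A possible explanation for why only the utf-8 line fails, offered as an assumption rather than something stated in the thread: the bytes in a came from a GB2312/GBK console, and UTF-8 has strict rules about which byte sequences are valid, whereas UTF-16 will pair up almost any bytes. The sketch below uses Python 3, where bytes plays the role of Python 2's str; all names are illustrative.

```python
text = "\u4e3a\u4ec0\u4e48\u4e0b\u9762\u8bed\u53e5\u4f1a\u51fa\u9519"
raw = text.encode("gb2312")   # assumed: what a GB2312/GBK console stored in `a`

# Strict UTF-8 decoding rejects these bytes.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = type(exc).__name__

# 'ignore' silences the error by dropping undecodable bytes, so text is lost.
lossy = raw.decode("utf-8", "ignore")

# UTF-16 accepts the same bytes, but pairs them into unrelated characters.
wrong = raw.decode("utf-16-le")

# Only the codec that matches the data gives the original text back.
right = raw.decode("gb2312")

print(utf8_error, len(lossy), wrong == text, right == text)
```

Under this assumption, 'ignore' hides the symptom without recovering the text; decoding with the codec that actually produced the bytes is the lossless option.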
Monday, 29 October 2007, 14:35
A question:
a="为什么下面语句会出错"
a.decode('gb2312') # what exactly does this line do?
u'\u4e3a\u4ec0\u4e48\u4e0b\u9762\u8bed\u53e5\u4f1a\u51fa\u9519' # this is the result
Is it converting from gbk to gb2312?
But why does the result start with u, and why does every character start with '\u'? Can you explain why, or what this means?
Monday, 29 October 2007, 14:47
GBK simply covers a larger range than GB2312; it seems to include Traditional Chinese characters as well.
a.decode('gb2312') says that a is encoded in gb2312 and should now be turned into unicode.
faitherQ
2007-10-29
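The remark that GBK covers more than GB2312 can be illustrated with a small Python 3 sketch. The sample Traditional character (語) is my own choice and does not appear in the thread.

```python
simplified = "\u8bed"    # 语, Simplified form, taken from the test string
traditional = "\u8a9e"   # 語, Traditional form, an assumed sample character

# For characters that GB2312 contains, GBK produces the same bytes.
same_bytes = simplified.encode("gb2312") == simplified.encode("gbk")

# GBK can encode the Traditional form as well.
gbk_bytes = traditional.encode("gbk")

# GB2312 has no code for it, so encoding fails.
try:
    traditional.encode("gb2312")
    gb2312_error = None
except UnicodeEncodeError as exc:
    gb2312_error = type(exc).__name__

print(same_bytes, len(gbk_bytes), gb2312_error)
```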
Monday, 29 October 2007, 14:51
I took a look at the documentation.
Unicode strings are stored internally as sequences of codepoints (to be precise as Py_UNICODE arrays). ...........................
Once a Unicode object is used outside of CPU and memory, CPU endianness and how these arrays are stored as bytes become an issue. Transforming a unicode object into a sequence of bytes is called encoding and recreating the unicode object from the sequence of bytes is known as decoding.
On 07-10-29, ??? ?? <clfff.peter at gmail.com> wrote:
> A question:
> a="为什么下面语句会出错"
> a.decode('gb2312') # what exactly does this line do?
a itself is a string encoded in gb2312, that is, "the sequence of bytes" mentioned above. len(a) = 20; it is stored byte by byte.
> u'\u4e3a\u4ec0\u4e48\u4e0b\u9762\u8bed\u53e5\u4f1a\u51fa\u9519'
This is a unicode object, a Py_UNICODE array. You can try len(a.decode('gb2312')) = 10.
> # this is the result
> Is it converting from gbk to gb2312?
> But why does the result start with u, and why does every character start with '\u'? Can you explain why, or what this means?
--
Tao Fei (陶飞)
My Blog: blog.filia.cn
My Summer Of Code Blog: filiasoc.blogspot.com
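The two lengths quoted above (20 bytes, 10 characters) and the \u notation can be reproduced with a Python 3 sketch. Here bytes stands in for Python 2's str, and the built-in ascii() is used as an assumed stand-in for how the Python 2 console displayed a unicode object.

```python
text = "\u4e3a\u4ec0\u4e48\u4e0b\u9762\u8bed\u53e5\u4f1a\u51fa\u9519"
raw = text.encode("gb2312")

byte_count = len(raw)                    # GB2312 uses 2 bytes per Chinese character
char_count = len(raw.decode("gb2312"))   # code points, independent of any codec

# ascii() shows each non-ASCII code point as a \uXXXX escape:
# four hex digits giving the code point number.
escaped = ascii(text)
first_code_point = hex(ord(text[0]))

print(byte_count, char_count, escaped, first_code_point)
```

Each \uXXXX in the output is the hexadecimal number of one code point, which is why every character in the result from the thread begins with \u.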
Monday, 29 October 2007, 15:18
On 07-10-29, ??? ?? <clfff.peter at gmail.com> wrote:
> A question:
> a="为什么下面语句会出错"
> a.decode('gb2312') # what exactly does this line do?
It decodes a, treating it as gb2312.
> u'\u4e3a\u4ec0\u4e48\u4e0b\u9762\u8bed\u53e5\u4f1a\u51fa\u9519' # this is the result
> Is it converting from gbk to gb2312?
No. It converts from gb2312 to unicode.
> But why does the result start with u, and why does every character start with '\u'? Can you explain why, or what this means?
\u stands for unicode.
1 str is a string that carries an encoding
2 unicode is a string that does not carry an encoding
The two are converted into each other like this:
str ------> unicode --------> str
    decode            encode
--
wayne
http://blog.csdn.net/wayne92
Kingsoft(Zhuhai)
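The decode/encode chain in the diagram above can be sketched in Python 3 (bytes for Python 2's str, str for Python 2's unicode). The sample data and variable names are illustrative assumptions.

```python
text = "\u4e3a\u4ec0\u4e48"            # 为什么
gb_bytes = text.encode("gb2312")       # encoded data, e.g. as read from a GB2312 file

# bytes --decode--> text --encode--> bytes: text is always the middle step.
utf8_bytes = gb_bytes.decode("gb2312").encode("utf-8")

# Python 3 removed the extra methods that prompted the earlier question:
# bytes cannot be encoded again and text cannot be decoded again.
bytes_can_encode = hasattr(gb_bytes, "encode")
text_can_decode = hasattr(text, "decode")

print(len(gb_bytes), len(utf8_bytes), bytes_can_encode, text_can_decode)
```

In Python 2 both types did offer both methods, but str.encode and unicode.decode first ran an implicit conversion through the default ASCII codec, so they only worked on pure ASCII data.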
Monday, 29 October 2007, 15:25
Thanks, everyone. It makes more sense now. ^____^