Converting UTF-8 Encoded Text to Chinese - NSString Methods
Published: 2019-06-24


Method 1:

The code is below; if you know a better way, please post it. This method decodes the string by running JavaScript in a web view.

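    // Decode by letting the web view's JavaScript engine run decodeURIComponent.
    UIWebView *web = [[UIWebView alloc] init];
    NSString *tsw = @"%E4%B8%AD%E5%9B%BD";   // percent-encoded UTF-8 bytes of "中国"
    NSString *sc = [NSString stringWithFormat:@"decodeURIComponent('%@')", tsw];
    NSString *st = [web stringByEvaluatingJavaScriptFromString:sc];
    NSLog(@"%@", st);   // pass the string as an argument, never as the format string
    [web release];      // manual reference counting only; omit under ARC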
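
If the web view exists only to run decodeURIComponent, the same idea can be sketched with JavaScriptCore's JSContext (iOS 7 and later), which needs no view at all. This is only a sketch: DecodeWithJavaScript is a hypothetical helper name, and passing the input in as a JavaScript variable (instead of splicing it into the script text) is an assumption about what the caller wants, not something the original code did.

```objc
#import <Foundation/Foundation.h>
#import <JavaScriptCore/JavaScriptCore.h>

// Hypothetical helper: percent-decodes `encoded` with JavaScript's
// decodeURIComponent. Returns nil if the script throws (for example on
// a malformed escape sequence) or does not produce a string.
static NSString *DecodeWithJavaScript(NSString *encoded) {
    JSContext *context = [[JSContext alloc] init];
    // Expose the input as a global variable so that quote characters
    // inside it cannot break out of the script.
    context[@"input"] = encoded;
    JSValue *result = [context evaluateScript:@"decodeURIComponent(input)"];
    if (context.exception != nil || ![result isString]) {
        return nil;
    }
    return [result toString];
}
```

Calling DecodeWithJavaScript(@"%E4%B8%AD%E5%9B%BD") should give back the two characters 中国. The target must link against JavaScriptCore.framework.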

Method 2:

After some testing, this works: NSString's stringByReplacingPercentEscapesUsingEncoding: method is all you need. Use it like this:

    NSString* strAfterDecodeByUTF8AndURI = [@"%E4%B8%AD%E5%9B%BD" stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

    NSLog(@"strAfterDecodeByUTF8AndURI=%@", strAfterDecodeByUTF8AndURI);    
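The essence of the problem is that the text was first encoded as UTF-8, and the resulting bytes were then URL-encoded. Decoding therefore runs in the opposite order: URL-decode first to recover the bytes, then decode those bytes as UTF-8.
URL encoding (percent-encoding) writes a byte as a % sign followed by two hexadecimal digits. For example, %E4%B8%AD stands for the three bytes E4 B8 AD, which are the UTF-8 encoding of 中.
So stringByReplacingPercentEscapesUsingEncoding: performs the URL decode, and its NSUTF8StringEncoding argument tells it to interpret the decoded bytes as UTF-8.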
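
To make the two steps visible, here is a minimal sketch that performs them by hand and, next to it, the one-line Foundation call. DecodePercentEncodedUTF8, DecodeWithFoundation and HexValue are hypothetical helper names. Note that Apple deprecated stringByReplacingPercentEscapesUsingEncoding: in iOS 9; stringByRemovingPercentEncoding (iOS 7 and later) is its replacement and always assumes UTF-8.

```objc
#import <Foundation/Foundation.h>

// Returns the value of one hex digit, or -1 if `c` is not a hex digit.
static int HexValue(unichar c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper that spells out the two steps.
// Step 1 (URL decode): turn every %XX into the raw byte 0xXX.
// Step 2 (UTF-8 decode): interpret the collected bytes as UTF-8 text.
// Returns nil if an escape is malformed or the bytes are not valid UTF-8.
static NSString *DecodePercentEncodedUTF8(NSString *input) {
    NSMutableData *bytes = [NSMutableData data];
    NSUInteger length = input.length;
    NSUInteger i = 0;
    while (i < length) {
        unichar c = [input characterAtIndex:i];
        if (c == '%') {
            if (i + 2 >= length) return nil;   // need two digits after '%'
            int hi = HexValue([input characterAtIndex:i + 1]);
            int lo = HexValue([input characterAtIndex:i + 2]);
            if (hi < 0 || lo < 0) return nil;
            uint8_t byte = (uint8_t)(hi * 16 + lo);
            [bytes appendBytes:&byte length:1];
            i += 3;
        } else {
            // Unescaped text contributes its own UTF-8 bytes. A high
            // surrogate is taken together with the unit that follows it.
            NSUInteger n = (c >= 0xD800 && c <= 0xDBFF && i + 1 < length) ? 2 : 1;
            NSString *piece = [input substringWithRange:NSMakeRange(i, n)];
            NSData *pieceBytes = [piece dataUsingEncoding:NSUTF8StringEncoding];
            if (pieceBytes == nil) return nil;
            [bytes appendData:pieceBytes];
            i += n;
        }
    }
    return [[NSString alloc] initWithData:bytes encoding:NSUTF8StringEncoding];
}

// The built-in route on iOS 7 and later.
static NSString *DecodeWithFoundation(NSString *input) {
    return [input stringByRemovingPercentEncoding];
}
```

Both functions should turn @"%E4%B8%AD%E5%9B%BD" into 中国. In real code the Foundation call is the one to use; the manual version is here only to show where the URL decode ends and the UTF-8 decode begins.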
=================================
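I also analyzed the related implementation, as follows.
First, an explanation of how this approach works.
The encoding is defined in item c) below.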
A: There are three or four options for making Unicode fit into an 8-bit format.
a) Use UTF-8. This preserves ASCII, but not Latin-1, because the characters >127 are different from Latin-1. UTF-8 uses the bytes in the ASCII only for ASCII characters. Therefore, it works well in any environment where ASCII characters have a significance as syntax characters, e.g. file name syntaxes, markup languages, etc., but where all other characters may use arbitrary bytes.
Example: “Latin Small Letter s with Acute” (015B) would be encoded as two bytes: C5 9B.
b) Use Java or C style escapes, of the form \uXXXXX or \xXXXXX. This format is not standard for text files, but well defined in the framework of the languages in question, primarily for source files.
Example: The Polish word “wyjście” with character “Latin Small Letter s with Acute” (015B) in the middle (ś is one character) would look like: “wyj\u015Bcie”.
c) Use the &#xXXXX; or &#DDDDD; numeric character escapes as in HTML or XML. Again, these are not standard for plain text files, but well defined within the framework of these markup languages.
Example: “wyjście” would look like “wyj&#x015B;cie”
d) Use SCSU. This format compresses Unicode into 8-bit format, preserving most of ASCII, but using some of the control codes as commands for the decoder. However, while ASCII text will look like ASCII text after being encoded in SCSU, other characters may occasionally be encoded with the same byte values, making SCSU unsuitable for 8-bit channels that blindly interpret any of the bytes as ASCII characters.
Example: “<SC2> wyjÛcie” where <SC2> indicates the byte 0x12 and “Û” corresponds to byte 0xDB.
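As item c) says, this is not a standard encoding for plain text, although it is widely used and well defined within HTML and XML. You could call it a knock-off encoding :-)
So the encoding process is:
string -> Unicode code points -> &#xXXXX; or &#DDDDD;
Decoding simply reverses these steps.
Note: because this is a "knock-off" rather than a standard text encoding, the iPhone SDK does not support it the way it supports the UTF-8 case above, so you have to handle it yourself (it is not very hard; see dboylx's implementation).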
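
As a rough illustration of "handling it yourself", the sketch below converts in both directions with NSRegularExpression and plain UTF-16 arithmetic. It is not dboylx's implementation, which is not reproduced here; StringFromCodePoint, DecodeNumericCharacterReferences and EncodeNumericCharacterReferences are hypothetical names, and the choice to leave ASCII untouched when encoding is an assumption.

```objc
#import <Foundation/Foundation.h>

// Builds a string holding the single Unicode code point `cp`,
// or returns nil if `cp` is not a valid scalar value.
static NSString *StringFromCodePoint(uint32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return nil;
    if (cp < 0x10000) {
        unichar unit = (unichar)cp;
        return [NSString stringWithCharacters:&unit length:1];
    }
    // Code points above U+FFFF need a UTF-16 surrogate pair.
    uint32_t v = cp - 0x10000;
    unichar pair[2] = { (unichar)(0xD800 + (v >> 10)), (unichar)(0xDC00 + (v & 0x3FF)) };
    return [NSString stringWithCharacters:pair length:2];
}

// Decoding: &#xXXXX; or &#DDDDD; -> Unicode code point -> string.
// References that do not name a valid code point are left unchanged.
static NSString *DecodeNumericCharacterReferences(NSString *input) {
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:@"&#(?:[xX]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));"
                             options:0
                               error:NULL];
    NSMutableString *output = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    // Walk backwards so the ranges of earlier matches stay valid.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSRange hexRange = [match rangeAtIndex:1];
        NSRange decRange = [match rangeAtIndex:2];
        BOOL isHex = (hexRange.location != NSNotFound);
        NSString *digits = [input substringWithRange:(isHex ? hexRange : decRange)];
        uint32_t cp = (uint32_t)strtoul(digits.UTF8String, NULL, isHex ? 16 : 10);
        NSString *replacement = StringFromCodePoint(cp);
        if (replacement != nil) {
            [output replaceCharactersInRange:match.range withString:replacement];
        }
    }
    return output;
}

// Encoding: string -> Unicode code points -> &#xXXXX;
// ASCII is copied through; everything else becomes a hexadecimal reference.
static NSString *EncodeNumericCharacterReferences(NSString *input) {
    NSMutableString *output = [NSMutableString string];
    NSUInteger length = input.length;
    for (NSUInteger i = 0; i < length; i++) {
        uint32_t cp = [input characterAtIndex:i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < length) {
            unichar low = [input characterAtIndex:i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                // Combine the surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                i++;
            }
        }
        if (cp < 0x80) {
            [output appendFormat:@"%c", (char)cp];
        } else {
            [output appendFormat:@"&#x%04X;", cp];
        }
    }
    return output;
}
```

With these helpers, the FAQ's example round-trips: encoding wyjście should give wyj&#x015B;cie, and decoding that gives the word back. Because the encoder copies a literal & through unchanged, text that already contains something shaped like a reference will not survive a round trip; escaping & as well would close that gap.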

Reposted from: https://www.cnblogs.com/pandas/p/4214642.html
