Handling WeChat emoji characters in Java: database storage and escaping solutions
Solution 1: convert emoji with the emoji-java library
<dependency>
    <groupId>com.vdurmont</groupId>
    <artifactId>emoji-java</artifactId>
    <version>5.1.1</version>
</dependency>
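import java.util.ArrayList;
import java.util.Collection;

import com.vdurmont.emoji.Emoji;
import com.vdurmont.emoji.EmojiManager;
import com.vdurmont.emoji.EmojiParser;

public class TestUtils {
    public static void main(String[] args) {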
        // Sample text containing several emoji written as UTF-16 surrogate pairs.
        String strs = "苍天厚土\uD83D\uDE01 \uD83D\uDC36 \uD83E\uDD14 \uD83D\uDC7B \uD83D\uDE92";
        System.out.println("Original string:\n" + strs);
        System.out.println("After parseToAliases:\n" + EmojiParser.parseToAliases(strs));
        System.out.println("Contains emoji: " + EmojiManager.containsEmoji(strs));
        System.out.println("Consists only of emoji: " + EmojiManager.isOnlyEmojis(strs));

        // Aliases such as :grinning: are converted to the emoji characters they name.
        String str = "An :grinning:awesome :smiley:string 😄with a few :wink:emojis!";
        String result = EmojiParser.parseToUnicode(str);
        System.out.println("Aliases to emoji (EmojiParser.parseToUnicode): " + result);

        // Emoji characters are converted to text aliases, which are plain ASCII.
        String str1 = "An 😀awesome 😃string with a few 😉emojis!";
        String result1 = EmojiParser.parseToAliases(str1);
        System.out.println("Emoji to aliases (EmojiParser.parseToAliases): " + result1);

        // Removing emoji: all of them, all except a given set, or only a given set.
        String str2 = "An 😀awesome 😃string with a few 😉emojis!";
        Collection<Emoji> collection = new ArrayList<>();
        collection.add(EmojiManager.getForAlias("wink"));
        System.out.println("Remove all emoji: " + EmojiParser.removeAllEmojis(str2));
        System.out.println("Remove all emoji except the given ones: "
                + EmojiParser.removeAllEmojisExcept(str2, collection));
        System.out.println("Remove only the given emoji: " + EmojiParser.removeEmojis(str2, collection));

        // The first emoji is a boy (U+1F466) followed by a Fitzpatrick skin-tone modifier (U+1F3FF).
        String str3 = "\uD83D\uDC66\uD83C\uDFFF 😀😃😉";
        System.out.println("parseToHtmlDecimal: " + EmojiParser.parseToHtmlDecimal(str3));
        System.out.println("parseToAliases: " + EmojiParser.parseToAliases(str3));
        System.out.println("parseToUnicode: " + EmojiParser.parseToUnicode(str3));
        System.out.println("parseToHtmlHexadecimal: " + EmojiParser.parseToHtmlHexadecimal(str3));

        // FitzpatrickAction controls how the skin-tone modifier is handled:
        // PARSE matches it with the preceding emoji, REMOVE strips it, IGNORE leaves it as-is.
        System.out.println("parseToHtmlDecimal with each FitzpatrickAction:");
        System.out.println(EmojiParser.parseToHtmlDecimal(str3, EmojiParser.FitzpatrickAction.PARSE));
        System.out.println(EmojiParser.parseToHtmlDecimal(str3, EmojiParser.FitzpatrickAction.REMOVE));
        System.out.println(EmojiParser.parseToHtmlDecimal(str3, EmojiParser.FitzpatrickAction.IGNORE));
    }
}
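To make the escaping idea concrete without the library, here is a minimal sketch that uses only the JDK. It is a simplified stand-in, not emoji-java's implementation: it escapes every code point above U+FFFF as an HTML decimal entity, whereas emoji-java works from its own emoji table and also handles multi-code-point emoji. The class name SupplementaryEscaper and its methods are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical JDK-only helper; a simplified stand-in, not emoji-java's implementation.
final class SupplementaryEscaper {

    private static final Pattern ENTITY = Pattern.compile("&#(\\d{1,7});");

    // Replaces every code point above U+FFFF with an HTML decimal entity (&#NNN;).
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    // Turns &#NNN; entities back into characters; invalid code points are left untouched.
    static String unescape(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1));
            String replacement = Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("Nice \uD83D\uDE00");
        System.out.println(escaped);            // Nice &#128512;
        System.out.println(unescape(escaped));  // prints the original text again
    }
}
```

The escaped form is pure ASCII, so it can be stored in a column that does not support 4-byte characters. One caveat of this sketch: unescape also converts entities that were already present in the user's text, so a real application would need a less ambiguous marker.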
Solution 2:
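Make sure the database version supports the utf8mb4 character set (in MySQL, 5.5.3 or later), and set the character set of the database, the tables, and the columns to utf8mb4. Use utf8mb4_unicode_ci as the column collation. This is needed because MySQL's legacy utf8 character set stores at most 3 bytes per character, while emoji take 4 bytes in UTF-8.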
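As a rough illustration of why a 3-byte character set is not enough, the following sketch checks whether a string contains characters that UTF-8 encodes in 4 bytes. It uses only the JDK; the class name Utf8mb4Check and its methods are hypothetical and not part of any database driver.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of emoji-java or any JDBC driver.
final class Utf8mb4Check {

    // True if the text contains a code point above U+FFFF, which UTF-8 encodes in 4 bytes.
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String plain = "WeChat ";
        String withEmoji = "WeChat \uD83D\uDE01";
        System.out.println(needsUtf8mb4(plain));      // false
        System.out.println(needsUtf8mb4(withEmoji));  // true
        System.out.println(utf8Length(plain));        // 7
        System.out.println(utf8Length(withEmoji));    // 11
    }
}
```

A check like this could be used to log or reject input before it reaches a column that has not yet been migrated to utf8mb4.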
Reference blog: https://blog.csdn.net/Yan_Less/article/details/104969206?utm_medium=distribute.pc_relevant.none-task-blog-title-5&spm=1001.2101.3001.4242