What is the regex to extract all the emojis from a string?

I have a String encoded in UTF-8. For example:

Thats a nice joke 😆😆😆 😛 

I have to extract all the emojis present in the sentence, and the emoji could be any emoji.

When this sentence is viewed in the terminal using the command less text.txt, it is displayed as:

 Thats a nice joke <U+1F606><U+1F606><U+1F606> <U+1F61B>

This is the corresponding UTF code for the emoji. All the codes for emojis can be found at emojitracker.

To find all the occurrences, I used a regular expression pattern () but it did not work for the UTF-8 encoded string.

Following is my code:

    String s = "Thats a nice joke 😆😆😆 😛";
    Pattern pattern = Pattern.compile("()");
    Matcher matcher = pattern.matcher(s);
    List<String> matchList = new ArrayList<String>();
    while (matcher.find()) {
        matchList.add(matcher.group());
    }
    for (int i = 0; i < matchList.size(); i++) {
        System.out.println(matchList.get(i));
    }

This pdf says Range: 1F300–1F5FF for Miscellaneous Symbols and Pictographs. So I want to capture any character lying within this range.

"The pdf that you just mentioned says Range: 1F300–1F5FF for Miscellaneous Symbols and Pictographs. So let's say I want to capture any character lying within this range. Now what to do?"

Okay, but I'll just note that the emoji in your question are outside that range! 🙂

The fact that these are above 0xFFFF complicates things, because Java strings store UTF-16. So we can't just use one simple character class for it. We're going to have surrogate pairs. (More: http://www.unicode.org/faq/utf_bom.html )

U+1F300 in UTF-16 ends up being the pair \uD83C\uDF00; U+1F5FF ends up being \uD83D\uDDFF. Note that the first character went up, so we cross at least one boundary. So we have to know what ranges of surrogate pairs we're looking for.
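The pair values quoted above can be checked with the standard library. This is a minimal sketch, not part of the original answer: the class name SurrogateDemo and the helper toEscapes are made up for illustration, and the conversion relies on Character.toChars.

```java
final class SurrogateDemo {
    // Renders a code point as the backslash-u escapes of its UTF-16 code units.
    static String toEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First and last code points of the 1F300-1F5FF block.
        System.out.println(toEscapes(0x1F300));
        System.out.println(toEscapes(0x1F5FF));
    }
}
```

A code point at or below 0xFFFF comes back as a single escape, while anything above it comes back as a high surrogate followed by a low surrogate.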

Not being steeped in knowledge about the inner workings of UTF-16, I wrote a program to find out (source at the end; I'd double-check it if I were you, rather than trusting me). It tells me we're looking for \uD83C followed by anything in the range \uDF00-\uDFFF (inclusive), or \uD83D followed by anything in the range \uDC00-\uDDFF (inclusive).

So armed with that knowledge, in theory we could now write a pattern:

    // This is wrong, keep reading
    Pattern p = Pattern.compile("(?:\uD83C[\uDF00-\uDFFF])|(?:\uD83D[\uDC00-\uDDFF])");

That's an alternation of two non-capturing groups, the first group for the pairs starting with \uD83C, and the second group for the pairs starting with \uD83D.

But that fails (it doesn't find anything). I'm fairly sure it's because we're trying to specify half of a surrogate pair in various places:

    Pattern p = Pattern.compile("(?:\uD83C[\uDF00-\uDFFF])|(?:\uD83D[\uDC00-\uDDFF])");
    // Half of a pair --------------^------^------^-----------^------^------^

We can't just split up surrogate pairs like that; they're called surrogate pairs for a reason. 🙂

Consequently, I don't think we can use regular expressions (or indeed, any string-based approach) for this at all. I think we have to search through char arrays.

char arrays hold UTF-16 values, so we can find those half-pairs in the data if we look for them the hard way:

    String s = new StringBuilder()
            .append("Thats a nice joke ")
            .appendCodePoint(0x1F606)
            .appendCodePoint(0x1F606)
            .appendCodePoint(0x1F606)
            .append(" ")
            .appendCodePoint(0x1F61B)
            .toString();
    char[] chars = s.toCharArray();
    int index;
    char ch1;
    char ch2;

    index = 0;
    while (index < chars.length - 1) { // -1 because we're looking for two-char-long things
        ch1 = chars[index];
        if ((int)ch1 == 0xD83C) {
            ch2 = chars[index+1];
            if ((int)ch2 >= 0xDF00 && (int)ch2 <= 0xDFFF) {
                System.out.println("Found emoji at index " + index);
                index += 2;
                continue;
            }
        }
        else if ((int)ch1 == 0xD83D) {
            ch2 = chars[index+1];
            if ((int)ch2 >= 0xDC00 && (int)ch2 <= 0xDDFF) {
                System.out.println("Found emoji at index " + index);
                index += 2;
                continue;
            }
        }
        ++index;
    }

Obviously, that's just debug-level code, but it does the job. (In your given string, with its emoji, of course it won't find anything, as they're outside the range. But if you change the upper bound on the second pair to 0xDEFF instead of 0xDDFF, it will. No idea whether that would also include non-emojis, though.)
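The same scan can also be written in terms of code points instead of individual char values, which avoids working out surrogate ranges by hand. This is a swapped-in alternative rather than the answer's own code: a minimal sketch assuming Java 8 or later (for String.codePoints), with the class name CodePointScan and the method extractInRange invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointScan {
    // Collects every code point of s that lies in [low, high], one String per hit.
    static List<String> extractInRange(String s, int low, int high) {
        List<String> found = new ArrayList<>();
        s.codePoints()
         .filter(cp -> cp >= low && cp <= high)
         .forEach(cp -> found.add(new String(Character.toChars(cp))));
        return found;
    }

    public static void main(String[] args) {
        // The sample sentence, with U+1F606 three times and U+1F61B once.
        String s = "Thats a nice joke \uD83D\uDE06\uD83D\uDE06\uD83D\uDE06 \uD83D\uDE1B";
        // 1F300-1F5FF is the block from the question; the sample emoji sit in 1F600-1F64F.
        System.out.println(extractInRange(s, 0x1F300, 0x1F5FF).size()); // 0
        System.out.println(extractInRange(s, 0x1F600, 0x1F64F).size()); // 4
    }
}
```

Because the filter works on whole code points, widening or adding ranges only means changing the numeric bounds.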


Source of my program to find out what the surrogate ranges were:

    public class FindRanges {

        public static void main(String[] args) {
            char last0 = '\0';
            char last1 = '\0';
            for (int x = 0x1F300; x <= 0x1F5FF; ++x) {
                char[] chars = new StringBuilder().appendCodePoint(x).toString().toCharArray();
                if (chars[0] != last0) {
                    if (last0 != '\0') {
                        System.out.println("-\\u" + Integer.toHexString((int)last1).toUpperCase());
                    }
                    System.out.print("\\u" + Integer.toHexString((int)chars[0]).toUpperCase() + " \\u" + Integer.toHexString((int)chars[1]).toUpperCase());
                    last0 = chars[0];
                }
                last1 = chars[1];
            }
            if (last0 != '\0') {
                System.out.println("-\\u" + Integer.toHexString((int)last1).toUpperCase());
            }
        }
    }

Output:

    \uD83C \uDF00-\uDFFF
    \uD83D \uDC00-\uDDFF

Using emoji-java, I wrote a simple method that removes all emojis, including Fitzpatrick modifiers. It requires an external library, but that is easier to maintain than those monster regexes.

Use:

    String input = "A string 😄with a \uD83D\uDC66\uD83C\uDFFFfew 😉emojis!";
    String result = EmojiParser.removeAllEmojis(input);

emoji-java maven installation:

    <dependency>
      <groupId>com.vdurmont</groupId>
      <artifactId>emoji-java</artifactId>
      <version>3.1.3</version>
    </dependency>

gradle:

 compile 'com.vdurmont:emoji-java:3.1.3' 

EDIT: the previously submitted answer was pulled into the emoji-java source code.

I had a similar problem. The following served me well and matches surrogate pairs:

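    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class SplitByUnicode {
        public static void main(String[] argv) throws Exception {
            String string = "Thats a nice joke 😆😆😆 😛";
            System.out.println("Original String:" + string);
            String regexPattern = "[\uD83C-\uDBFF\uDC00-\uDFFF]+";
            byte[] utf8 = string.getBytes("UTF-8");
            String string1 = new String(utf8, "UTF-8");
            Pattern pattern = Pattern.compile(regexPattern);
            Matcher matcher = pattern.matcher(string1);
            List<String> matchList = new ArrayList<String>();
            while (matcher.find()) {
                matchList.add(matcher.group());
            }
            // The original listing is cut off after "for(int i=0;i"; the rest of
            // this loop is an assumption reconstructed from the output shown below.
            for (int i = 0; i < matchList.size(); i++) {
                System.out.println(i + ":" + matchList.get(i));
            }
        }
    }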

The output is:

    Original String:Thats a nice joke 😆😆😆 😛
    0:😆😆😆
    1:😛

Found the regex at https://stackoverflow.com/a/24071599/915972

This worked for me in Java 8:

    public static String mysqlSafe(String input) {
        if (input == null) return null;
        StringBuilder sb = new StringBuilder();

        for (int i = 0; i < input.length(); i++) {
            if (i < (input.length() - 1)) { // Emojis are two characters long in java, e.g. a rocket emoji is "\uD83D\uDE80";
                if (Character.isSurrogatePair(input.charAt(i), input.charAt(i + 1))) {
                    i += 1; //also skip the second character of the emoji
                    continue;
                }
            }
            sb.append(input.charAt(i));
        }

        return sb.toString();
    }
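One thing to keep in mind with this approach is that it drops every character stored as a surrogate pair, not only emoji, and it keeps emoji that fit in a single char (U+263A, for example). The sketch below shows the same filter expressed on code points; it assumes Java 8 or later, and the names SupplementaryFilter and dropSupplementary are made up for illustration.

```java
final class SupplementaryFilter {
    // Removes every code point above U+FFFF; everything else is kept as-is.
    static String dropSupplementary(String input) {
        if (input == null) return null;
        StringBuilder sb = new StringBuilder();
        input.codePoints()
             .filter(cp -> !Character.isSupplementaryCodePoint(cp))
             .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Rocket (U+1F680) is removed, the smiley U+263A survives.
        String cleaned = dropSupplementary("a\uD83D\uDE80b\u263A");
        System.out.println(cleaned.length()); // 3
    }
}
```

Whether that behaviour is acceptable depends on the data: supplementary CJK ideographs would be removed along with the emoji.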

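You can do it like this:

    String s = "Thats a nice joke 😆😆😆 😛";
    Pattern pattern = Pattern.compile("[\ud83c\udc00-\ud83c\udfff]|[\ud83d\udc00-\ud83d\udfff]|[\u2600-\u27ff]",
            Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(s);
    List<String> matchList = new ArrayList<String>();
    while (matcher.find()) {
        matchList.add(matcher.group());
    }
    // The original listing is cut off after "for(int i=0;i"; the loop below is an
    // assumption modelled on the printing loop in the question.
    for (int i = 0; i < matchList.size(); i++) {
        System.out.println(matchList.get(i));
    }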

Assuming you are asking about the standard Unicode emoji ranges (different vendors use different blocks), consider these three ranges:

  • 0x20a0 – 0x32ff
  • 0x1f000 – 0x1ffff
  • 0xfe4e5 – 0xfe4ee
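These ranges can be applied per code point. The sketch below is only an illustration under stated assumptions: the class name EmojiRanges and its methods are invented here, and Java 8 or later is assumed for String.codePoints. The first range is broad (it also covers arrows, currency signs, and Japanese kana), so false positives are to be expected.

```java
final class EmojiRanges {
    // True if cp falls in one of the three ranges listed above.
    static boolean inListedRanges(int cp) {
        return (cp >= 0x20A0 && cp <= 0x32FF)
            || (cp >= 0x1F000 && cp <= 0x1FFFF)
            || (cp >= 0xFE4E5 && cp <= 0xFE4EE);
    }

    // Drops every code point that falls in the listed ranges.
    static String strip(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints()
         .filter(cp -> !inListedRanges(cp))
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "Thats a nice joke \uD83D\uDE06\uD83D\uDE06\uD83D\uDE06 \uD83D\uDE1B";
        System.out.println("[" + strip(s) + "]");
        // Hiragana letter A (U+3042) is inside the first range: a false positive.
        System.out.println(inListedRanges(0x3042)); // true
    }
}
```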

In addition to all the careful explanation that T.J. Crowder shared with us, it must be said that, starting with Java 7, it is possible to match UTF-16 encoded surrogate pairs easily.

Take a look at the docs:

http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

A Unicode character can also be represented in a regular expression by using its hex notation (hexadecimal code point value) directly, as described in construct \x{...}; for example, a supplementary character U+2011F can be specified as \x{2011F}, instead of two consecutive Unicode escape sequences of the surrogate pair \uD840\uDD1F.
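As a rough illustration of the \x{...} construct, the sketch below matches by code point range. It is not taken from the docs: the class name CodePointRegex, the EMOJI constant, and the choice of the two ranges (the 1F300–1F5FF block from the question plus 1F600–1F64F, where the sample emoji live) are assumptions made for this example. Java 7 or later is required.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CodePointRegex {
    // Ranges are written as code points, so no surrogate arithmetic is needed.
    static final Pattern EMOJI =
        Pattern.compile("[\\x{1F300}-\\x{1F5FF}\\x{1F600}-\\x{1F64F}]");

    // Returns each matched emoji as its own String.
    static List<String> extract(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = EMOJI.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Thats a nice joke \uD83D\uDE06\uD83D\uDE06\uD83D\uDE06 \uD83D\uDE1B";
        System.out.println(extract(s).size()); // 4
    }
}
```

Each match is a full surrogate pair, so m.group() returns a two-char String per emoji.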

However, if you cannot switch to Java 7, you can extend the valuable UnicodeEscaper provided by Guava.

Here is an example implementation:

    import com.google.common.escape.UnicodeEscaper;

    public class SimpleEscaper extends UnicodeEscaper {
        @Override
        protected char[] escape(int codePoint) {
            // Replace code points in 0x1f000-0x1ffff with their hex value
            if (codePoint >= 0x1f000 && codePoint <= 0x1ffff) {
                return Integer.toHexString(codePoint).toCharArray();
            }
            return Character.toChars(codePoint);
        }
    }

The best regex to extract all emoji is this:

 (?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32-\ude3a]|[\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff]) 

It identifies many single-character emojis that the other answers do not account for. For more information on how this regex works, take a look at this post: https://medium.com/@thekevinscott/emojis-in-javascript-f693d0eb79fb#.enomgcu63

You can also use the emoji4j library.

    String emojiText = "A 🐱, 🐱 and a 🐭 became friends. For 🐶's birthday party, they all had 🍔s, 🍟s, 🍪s and 🍰.";
    EmojiUtils.removeAllEmojis(emojiText); // returns "A , and a became friends. For 's birthday party, they all had s, s, s and ."

This is what I use to remove emojis, and so far it has proven to allow all other alphabets.

    private static String remove_Emojis(String name) {
        // we will store all the non-emoji characters in this list
        ArrayList<Character> nonEmoji = new ArrayList<>();
        // and when we rebuild the name we will put it in here
        String newName = "";
        // loop through, checking each character to see if it's an emoji or not
        for (int i = 0; i < name.length(); i++) {
            if (Character.isLetterOrDigit(name.charAt(i))) {
                nonEmoji.add(name.charAt(i));
            } else {
                // this is just a 2nd check in case the other method didn't allow some letter
                // (Build is android.os.Build)
                if (Build.VERSION.SDK_INT > 18) {
                    if (Character.isAlphabetic(name.charAt(i))) {
                        nonEmoji.add(name.charAt(i));
                    }
                }
            }
            if (name.charAt(i) == ' ') { // may want to consider adding or '-' or '\''
                nonEmoji.add(name.charAt(i)); // just add it
            }
            if (name.charAt(i) == '@' && !name.contains(" ")) { // I put this in for email addresses
                nonEmoji.add('@');
            }
        }
        // finally just loop through building it back out
        for (int i = 0; i < nonEmoji.size(); i++) {
            newName += nonEmoji.get(i);
        }
        return newName;
    }

There are two ways to solve this persistent problem.

The first is to use third-party libraries such as emoji-java and emoji4j, which are mentioned above. You can easily use methods such as containsEmoji or removesEmoji, but in your own apps you then need to keep up to date with these libraries.

As for me, I wanted a simple solution to this problem.

After a whole day of research, I found a magic regex:

"(?:[\uD83C\uDF00-\uD83D\uDDFF]|[\uD83E\uDD00-\uD83E\uDDFF]|[\uD83D\uDE00-\uD83D\uDE4F]|[\uD83D\uDE80-\uD83D\uDEFF]|[\u2600-\u26FF]\uFE0F?|[\u2700-\u27BF]\uFE0F?|\u24C2\uFE0F?|[\uD83C\uDDE6-\uD83C\uDDFF]{1,2}|[\uD83C\uDD70\uD83C\uDD71\uD83C\uDD7E\uD83C\uDD7F\uD83C\uDD8E\uD83C\uDD91-\uD83C\uDD9A]\uFE0F?|[\u0023\u002A\u0030-\u0039]\uFE0F?\u20E3|[\u2194-\u2199\u21A9-\u21AA]\uFE0F?|[\u2B05-\u2B07\u2B1B\u2B1C\u2B50\u2B55]\uFE0F?|[\u2934\u2935]\uFE0F?|[\u3030\u303D]\uFE0F?|[\u3297\u3299]\uFE0F?|[\uD83C\uDE01\uD83C\uDE02\uD83C\uDE1A\uD83C\uDE2F\uD83C\uDE32-\uD83C\uDE3A\uD83C\uDE50\uD83C\uDE51]\uFE0F?|[\u203C\u2049]\uFE0F?|[\u25AA\u25AB\u25B6\u25C0\u25FB-\u25FE]\uFE0F?|[\u00A9\u00AE]\uFE0F?|[\u2122\u2139]\uFE0F?|\uD83C\uDC04\uFE0F?|\uD83C\uDCCF\uFE0F?|[\u231A\u231B\u2328\u23CF\u23E9-\u23F3\u23F8-\u23FA]\uFE0F?)"

which tested OK in Java. It solved my problem perfectly.

You can see it on the GitHub page:

https://github.com/zly394/EmojiRegex

Notes:

The answer provided by @Eric Nakagawa contains some errors, so it does not work correctly as written.

Emoji regex

    public static final String sEmojiRegex =
        "(?:[\\u2700-\\u27bf]|"
        + "(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|"
        + "[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?"
        + "(?:\\u200d(?:[^\\ud800-\\udfff]|"
        + "(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|"
        + "[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?)*|"
        + "[\\u0023-\\u0039]\\ufe0f?\\u20e3|\\u3299|\\u3297|\\u303d|\\u3030|\\u24c2|[\\ud83c\\udd70-\\ud83c\\udd71]|[\\ud83c\\udd7e-\\ud83c\\udd7f]|\\ud83c\\udd8e|[\\ud83c\\udd91-\\ud83c\\udd9a]|[\\ud83c\\udde6-\\ud83c\\uddff]|[\\ud83c\\ude01-\\ud83c\\ude02]|\\ud83c\\ude1a|\\ud83c\\ude2f|[\\ud83c\\ude32-\\ud83c\\ude3a]|[\\ud83c\\ude50-\\ud83c\\ude51]|\\u203c|\\u2049|[\\u25aa-\\u25ab]|\\u25b6|\\u25c0|[\\u25fb-\\u25fe]|\\u00a9|\\u00ae|\\u2122|\\u2139|\\ud83c\\udc04|[\\u2600-\\u26FF]|\\u2b05|\\u2b06|\\u2b07|\\u2b1b|\\u2b1c|\\u2b50|\\u2b55|\\u231a|\\u231b|\\u2328|\\u23cf|[\\u23e9-\\u23f3]|[\\u23f8-\\u23fa]|\\ud83c\\udccf|\\u2934|\\u2935|[\\u2190-\\u21ff]";

some emojis (1627)

 // count = 1627 public static final String sEmojiTest = "😀😃😄😁😆😅😂🤣☺️😊😇🙂🙃😉😌😍😘😗😙😚😋😜😝😛🤑🤗🤓😎🤡🤠😏😒😞😔😟😕🙁☹️😣😖😫😩😤😠😡😶😐😑😯😦😧😮😲😵😳😱😨😰😢😥🤤😭😓😪😴🙄🤔🤥😬🤐🤢🤧😷🤒🤕😈👿👹👺💩👻💀☠️👽👾🤖🎃😺😸😹😻😼😽🙀😿😾👐🙌👏🙏🤝👍👎👊✊🤛🤜🤞✌️🤘👌👈👉👆👇☝️✋🤚🖐🖖👋🤙💪🖕✍️🤳💅💍💄💋👄👅👂👃👣👁👀🗣👤👥👶👦👧👨👩👱‍♀👱👴👵👲👳‍♀👳👮‍♀👮👷‍♀👷💂‍♀💂🕵️‍♀️🕵👩‍⚕👨‍⚕👩‍🌾👨‍🌾👩‍🍳👨‍🍳👩‍🎓👨‍🎓👩‍🎤👨‍🎤👩‍🏫👨‍🏫👩‍🏭👨‍🏭👩‍💻👨‍💻👩‍💼👨‍💼👩‍🔧👨‍🔧👩‍🔬👨‍🔬👩‍🎨👨‍🎨👩‍🚒👨‍🚒👩‍✈👨‍✈👩‍🚀👨‍🚀👩‍⚖👨‍⚖🤶🎅👸🤴👰🤵👼🤰🙇‍♀🙇💁💁‍♂🙅🙅‍♂🙆🙆‍♂🙋🙋‍♂🤦‍♀🤦‍♂🤷‍♀🤷‍♂🙎🙎‍♂🙍🙍‍♂💇💇‍♂💆💆‍♂🕴💃🕺👯👯‍♂🚶‍♀🚶🏃‍♀🏃👫👭👬💑👩‍❤️‍👩👨‍❤️‍👨💏👩‍❤️‍💋‍👩👨‍❤️‍💋‍👨👪👨‍👩‍👧👨‍👩‍👧‍👦👨‍👩‍👦‍👦👨‍👩‍👧‍👧👩‍👩‍👦👩‍👩‍👧👩‍👩‍👧‍👦👩‍👩‍👦‍👦👩‍👩‍👧‍👧👨‍👨‍👦👨‍👨‍👧👨‍👨‍👧‍👦👨‍👨‍👦‍👦👨‍👨‍👧‍👧👩‍👦👩‍👧👩‍👧‍👦👩‍👦‍👦👩‍👧‍👧👨‍👦👨‍👧👨‍👧‍👦👨‍👦‍👦👨‍👧‍👧👚👕👖👔👗👙👘👠👡👢👞👟👒🎩🎓👑⛑🎒👝👛👜💼👓🕶🌂☂️🐶🐱🐭🐹🐰🦊🐻🐼🐨🐯🦁🐮🐷🐽🐸🐵🙈🙉🙊🐒🐔🐧🐦🐤🐣🐥🦆🦅🦉🦇🐺🐗🐴🦄🐝🐛🦋🐌🐚🐞🐜🕷🕸🐢🐍🦎🦂🦀🦑🐙🦐🐠🐟🐡🐬🦈🐳🐋🐊🐆🐅🐃🐂🐄🦌🐪🐫🐘🦏🦍🐎🐖🐐🐏🐑🐕🐩🐈🐓🦃🕊🐇🐁🐀🐿🐾🐉🐲🌵🎄🌲🌳🌴🌱🌿☘️🍀🎍🎋🍃🍂🍁🍄🌾💐🌷🌹🥀🌻🌼🌸🌺🌎🌍🌏🌕🌖🌗🌘🌑🌒🌓🌔🌚🌝🌞🌛🌜🌙💫⭐️🌟✨⚡️🔥💥☄☀️🌤⛅️🌥🌦🌈☁️🌧⛈🌩🌨☃️⛄️❄️🌬💨🌪🌫🌊💧💦☔️🍏🍎🍐🍊🍋🍌🍉🍇🍓🍈🍒🍑🍍🥝🥑🍅🍆🥒🥕🌽🌶🥔🍠🌰🥜🍯🥐🍞🥖🧀🥚🍳🥓🥞🍤🍗🍖🍕🌭🍔🍟🥙🌮🌯🥗🥘🍝🍜🍲🍥🍣🍱🍛🍚🍙🍘🍢🍡🍧🍨🍦🍰🎂🍮🍭🍬🍫🍿🍩🍪🥛🍼☕️🍵🍶🍺🍻🥂🍷🥃🍸🍹🍾🥄🍴🍽⚽️🏀🏈⚾️🎾🏐🏉🎱🏓🏸🥅🏒🏑🏏⛳️🏹🎣🥊🥋⛸🎿⛷🏂🏋️‍♀️🏋🤺🤼‍♀🤼‍♂🤸‍♀🤸‍♂⛹️‍♀️⛹🤾‍♀🤾‍♂🏌️‍♀️🏌🏄‍♀🏄🏊‍♀🏊🤽‍♀🤽‍♂🚣‍♀🚣🏇🚴‍♀🚴🚵‍♀🚵🎽🏅🎖🥇🥈🥉🏆🏵🎗🎫🎟🎪🤹‍♀🤹‍♂🎭🎨🎬🎤🎧🎼🎹🥁🎷🎺🎸🎻🎲🎯🎳🎮🎰🚗🚕🚙🚌🚎🏎🚓🚑🚒🚐🚚🚛🚜🛴🚲🛵🏍🚨🚔🚍🚘🚖🚡🚠🚟🚃🚋🚞🚝🚄🚅🚈🚂🚆🚇🚊🚉🚁🛩✈️🛫🛬🚀🛰💺🛶⛵️🛥🚤🛳⛴🚢⚓️🚧⛽️🚏🚦🚥🗺🗿🗽⛲️🗼🏰🏯🏟🎡🎢🎠⛱🏖🏝⛰🏔🗻🌋🏜🏕⛺️🛤🛣🏗🏭🏠🏡🏘🏚🏢🏬🏣🏤🏥🏦🏨🏪🏫🏩💒🏛⛪️🕌🕍🕋⛩🗾🎑🏞🌅🌄🌠🎇🎆🌇🌆🏙🌃🌌🌉🌁⌚️📱📲💻⌨️🖥🖨🖱🖲🕹🗜💽💾💿📀📼📷📸📹🎥📽🎞📞☎️📟📠📺📻🎙🎚🎛⏱⏲⏰🕰⌛️⏳📡🔋🔌💡🔦🕯🗑🛢💸💵💴💶💷💰💳💎⚖️🔧🔨⚒🛠⛏🔩⚙️⛓🔫💣🔪🗡⚔️🛡🚬⚰️⚱️🏺🔮📿💈⚗️🔭🔬🕳💊💉🌡🚽🚰🚿🛁🛀🛎🔑🗝🚪🛋🛏🛌🖼🛍🛒🎁🎈🎏🎀🎊🎉🎎🏮🎐✉️📩📨📧💌📥📤📦🏷📪📫📬📭📮📯📜📃📄📑📊📈📉🗒🗓📆📅📇🗃🗳🗄📋📁📂🗂🗞📰📓📔📒📕📗📘📙📚📖🔖🔗📎🖇📐📏📌📍✂️🖊🖋✒️🖌🖍📝✏️🔍🔎🔏🔐🔒🔓❤️💛💚💙💜🖤💔❣️💕💞💓💗💖💘💝💟☮️✝️☪️🕉☸️✡️🔯🕎☯️☦️🛐⛎♈️♉️♊️♋️♌️♍️♎️♏️♐️♑️♒️♓️🆔⚛️🉑☢️☣️📴📳🈶🈚️🈸🈺🈷️✴️🆚💮🉐㊙️㊗️🈴🈵🈹🈲🅰️🅱️🆎🆑🅾️🆘❌⭕️🛑⛔️📛🚫💯💢♨️🚷🚯🚳🚱🔞📵🚭❗️❕❓❔‼️⁉️🔅🔆〽️⚠️🚸🔱⚜️🔰♻️✅🈯️💹❇️✳️❎🌐💠Ⓜ️🌀💤🏧🚾♿️🅿️🈳🈂️🛂🛃🛄🛅🚹🚺🚼🚻🚮🎦📶🈁🔣ℹ️🔤🔡🔠🆖🆗🆙🆒🆕🆓0️⃣1️⃣2️⃣3️⃣4️⃣5️⃣6️⃣7️⃣8️⃣9️⃣🔟🔢#️⃣*️⃣▶️⏸⏯⏹⏺⏭⏮⏩⏪⏫⏬◀️🔼🔽➡️⬅️⬆️⬇️↗️↘️↙️↖️↕️↔️↪️↩️⤴️⤵️🔀🔁🔂🔄🔃🎵🎶➕➖➗✖️💲💱™️©️®️〰️➰➿🔚🔙🔛🔝🔜✔️☑️🔘⚪️⚫️🔴🔵🔺🔻🔸🔹🔶🔷🔳🔲▪️▫️◾️◽️◼️◻️⬛️⬜️🔈🔇🔉🔊🔔🔕📣📢👁‍🗨💬💭🗯♠️♣️♥️♦️🃏🎴🀄️🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛🕜🕝🕞🕟🕠🕡🕢🕣🕤🕥🕦🕧🏳️🏴🏁🚩🏳️‍🌈🇦🇫🇦🇽🇦🇱🇩🇿🇦🇸🇦🇩🇦🇴🇦🇮🇦🇶🇦🇬🇦🇷🇦🇲🇦🇼🇦🇺🇦🇹🇦🇿🇧🇸🇧🇭🇧🇩🇧🇧🇧🇾🇧🇪🇧🇿🇧🇯🇧🇲🇧🇹🇧🇴🇧🇶🇧🇦🇧🇼🇧🇷🇮🇴🇻🇬🇧🇳🇧🇬🇧🇫🇧🇮🇨🇻🇰🇭🇨🇲🇨🇦🇮🇨🇰🇾🇨🇫🇹🇩🇨🇱🇨🇳🇨🇽🇨🇨🇨🇴🇰🇲🇨🇬🇨🇩🇨🇰🇨🇷🇨🇮🇭🇷🇨🇺🇨🇼🇨🇾🇨🇿🇩🇰🇩🇯🇩🇲🇩🇴🇪🇨🇪🇬🇸🇻🇬🇶🇪🇷🇪🇪🇪🇹🇪🇺🇫🇰🇫🇴🇫🇯🇫🇮🇫🇷🇬🇫🇵🇫🇹🇫🇬🇦🇬🇲🇬🇪🇩🇪🇬🇭🇬🇮🇬🇷🇬🇱🇬🇩🇬🇵🇬🇺🇬🇹🇬🇬🇬🇳🇬🇼🇬🇾🇭🇹🇭🇳🇭🇰🇭🇺🇮🇸🇮🇳🇮🇩🇮🇷🇮🇶🇮🇪🇮🇲🇮🇱🇮🇹🇯🇲🇯🇵🎌🇯🇪🇯🇴🇰🇿🇰🇪🇰🇮🇽🇰🇰🇼🇰🇬🇱🇦🇱🇻🇱🇧🇱🇸🇱🇷🇱🇾🇱🇮🇱🇹🇱🇺🇲🇴🇲🇰🇲🇬🇲🇼🇲🇾🇲🇻🇲🇱🇲🇹🇲🇭🇲🇶🇲🇷🇲🇺🇾🇹🇲🇽🇫🇲🇲🇩🇲🇨🇲🇳🇲🇪🇲🇸🇲🇦🇲🇿🇲🇲🇳🇦🇳🇷🇳🇵🇳🇱🇳🇨🇳
🇿🇳🇮🇳🇪🇳🇬🇳🇺🇳🇫🇲🇵🇰🇵🇳🇴🇴🇲🇵🇰🇵🇼🇵🇸🇵🇦🇵🇬🇵🇾🇵🇪🇵🇭🇵🇳🇵🇱🇵🇹🇵🇷🇶🇦🇷🇪🇷🇴🇷🇺🇷🇼🇧🇱🇸🇭🇰🇳🇱🇨🇵🇲🇻🇨🇼🇸🇸🇲🇸🇹🇸🇦🇸🇳🇷🇸🇸🇨🇸🇱🇸🇬🇸🇽🇸🇰🇸🇮🇸🇧🇸🇴🇿🇦🇬🇸🇰🇷🇸🇸🇪🇸🇱🇰🇸🇩🇸🇷🇸🇿🇸🇪🇨🇭🇸🇾🇹🇼🇹🇯🇹🇿🇹🇭🇹🇱🇹🇬🇹🇰🇹🇴🇹🇹🇹🇳🇹🇷🇹🇲🇹🇨🇹🇻🇺🇬🇺🇦🇦🇪🇬🇧🇺🇸🇻🇮🇺🇾🇺🇿🇻🇺🇻🇦🇻🇪🇻🇳🇼🇫🇪🇭🇾🇪🇿🇲🇿🇼⚽️🏀🏈⚾️🎾🏐🏉🎱🏓🏸🥅🏒🏑🏏⛳️🏹🎣🥊🥋⛸🎿⛷🏂🏋️‍♀️🏋🏻‍♀️🏋🏼‍♀️🏋🏽‍♀️🏋🏾‍♀️🏋🏿‍♀️🏋️🏋🏻🏋🏼🏋🏽🏋🏾🏋🏿🤺🤼‍♀️🤼‍♂️🤸‍♀️🤸🏻‍♀️🤸🏼‍♀️🤸🏽‍♀️🤸🏾‍♀️🤸🏿‍♀️🤸‍♂️🤸🏻‍♂️🤸🏼‍♂️🤸🏽‍♂️🤸🏾‍♂️🤸🏿‍♂️⛹️‍♀️⛹🏻‍♀️⛹🏼‍♀️⛹🏽‍♀️⛹🏾‍♀️⛹🏿‍♀️⛹️⛹🏻⛹🏼⛹🏽⛹🏾⛹🏿🤾‍♀️🤾🏻‍♀️🤾🏼‍♀️🤾🏽‍♀️🤾🏾‍♀️🤾🏿‍♀️🤾‍♂️🤾🏻‍♂️🤾🏼‍♂️🤾🏽‍♂️🤾🏾‍♂️🤾🏿‍♂️🏌️‍♀️🏌🏻‍♀️🏌🏼‍♀️🏌🏽‍♀️🏌🏾‍♀️🏌🏿‍♀️🏌️🏌🏻🏌🏼🏌🏽🏌🏾🏌🏿🏄‍♀️🏄🏻‍♀️🏄🏼‍♀️🏄🏽‍♀️🏄🏾‍♀️🏄🏿‍♀️🏄🏄🏻🏄🏼🏄🏽🏄🏾🏄🏿🏊‍♀️🏊🏻‍♀️🏊🏼‍♀️🏊🏽‍♀️🏊🏾‍♀️🏊🏿‍♀️🏊🏊🏻🏊🏼🏊🏽🏊🏾🏊🏿🤽‍♀️🤽🏻‍♀️🤽🏼‍♀️🤽🏽‍♀️🤽🏾‍♀️🤽🏿‍♀️🤽‍♂️🤽🏻‍♂️🤽🏼‍♂️🤽🏽‍♂️🤽🏾‍♂️🤽🏿‍♂️🚣‍♀️🚣🏻‍♀️🚣🏼‍♀️🚣🏽‍♀️🚣🏾‍♀️🚣🏿‍♀️🚣🚣🏻🚣🏼🚣🏽🚣🏾🚣🏿🏇🏇🏻🏇🏼🏇🏽🏇🏾🏇🏿🚴‍♀️🚴🏻‍♀️🚴🏼‍♀️🚴🏽‍♀️🚴🏾‍♀️🚴🏿‍♀️🚴🚴🏻🚴🏼🚴🏽🚴🏾🚴🏿🚵‍♀️🚵🏻‍♀️🚵🏼‍♀️🚵🏽‍♀️🚵🏾‍♀️🚵🏿‍♀️🚵🚵🏻🚵🏼🚵🏽🚵🏾🚵🏿🎽🏅🎖🥇🥈🥉🏆🏵🎗🎫🎟🎪🤹‍♀️🤹‍♂️🎭🎨🎬🎤🎧🎼🎹🥁🎷🎺🎸🎻🎲🎯🎳🎮🎰"; 

function to test emojis

    public void checkMatchingEmojis() {
        final Pattern pattern = Pattern.compile(sEmojiRegex);
        final Matcher matcher = pattern.matcher(sEmojiTest);
        int foundEmojiCount = 0;
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            foundEmojiCount++;
        }
        System.out.println("*******************************************");
        System.out.println("Input Emoji count = 1627");
        System.out.println("Captured Emoji count = " + foundEmojiCount);
        System.out.println("*******************************************");
    }

Here is the gist, tested on all Unicode 10 emojis

Thanks to Kevin Scott for writing the best example

Just use a regex to solve it:

 s = s.replaceAll("\\p{So}+", ""); 

You can find it at

http://www.regular-expressions.info/unicode.html

https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#OTHER_SYMBOL
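To make the behaviour of \p{So} more concrete, here is a small sketch; the class name OtherSymbolDemo and the method stripOtherSymbols are invented for illustration. \p{So} is the Unicode general category "Symbol, other", which is broader than emoji in some ways (it includes the copyright sign) and narrower in others (skin tone modifiers belong to a different category, so they are left behind).

```java
final class OtherSymbolDemo {
    // Removes every run of characters in general category So (Symbol, other).
    static String stripOtherSymbols(String s) {
        return s.replaceAll("\\p{So}+", "");
    }

    public static void main(String[] args) {
        String s = "Thats a nice joke \uD83D\uDE06\uD83D\uDE06\uD83D\uDE06 \uD83D\uDE1B";
        System.out.println("[" + stripOtherSymbols(s) + "]");
        // The copyright sign U+00A9 is also category So, so it is removed too.
        System.out.println("[" + stripOtherSymbols("\u00A9 2020") + "]");
    }
}
```

If sequences with modifiers, variation selectors, or zero-width joiners matter, this one-liner is probably not enough on its own.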



You can generate your own regex whenever the spec changes, using this tool (screenshot here).

For utf-8/32 mode (stringed), expanded mode:

 " # Use the 'Mega-Conversion' tool to change into other syntaxes" " # -------------------------------------------------------------" " " " [#*0-9] \\x{FE0F} \\x{20E3}" " | [\\x{A9}\\x{AE}\\x{203C}\\x{2049}\\x{2122}\\x{2139}\\x{2194}-\\x{2199}\\x{21A9}\\x{21AA}\\x{231A}\\x{231B}\\x{2328}\\x{23CF}\\x{23E9}-\\x{23F3}\\x{23F8}-\\x{23FA}\\x{24C2}\\x{25AA}\\x{25AB}\\x{25B6}\\x{25C0}\\x{25FB}-\\x{25FE}\\x{2600}-\\x{2604}\\x{260E}\\x{2611}\\x{2614}\\x{2615}\\x{2618}]" " | \\x{261D} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{2620}\\x{2622}\\x{2623}\\x{2626}\\x{262A}\\x{262E}\\x{262F}\\x{2638}-\\x{263A}\\x{2640}\\x{2642}\\x{2648}-\\x{2653}\\x{265F}\\x{2660}\\x{2663}\\x{2665}\\x{2666}\\x{2668}\\x{267B}\\x{267E}\\x{267F}\\x{2692}-\\x{2697}\\x{2699}\\x{269B}\\x{269C}\\x{26A0}\\x{26A1}\\x{26AA}\\x{26AB}\\x{26B0}\\x{26B1}\\x{26BD}\\x{26BE}\\x{26C4}\\x{26C5}\\x{26C8}\\x{26CE}\\x{26CF}\\x{26D1}\\x{26D3}\\x{26D4}\\x{26E9}\\x{26EA}\\x{26F0}-\\x{26F5}\\x{26F7}\\x{26F8}]" " | \\x{26F9}" " (?:" " \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{26FA}\\x{26FD}\\x{2702}\\x{2705}\\x{2708}\\x{2709}]" " | [\\x{270A}-\\x{270D}] [\\x{1F3FB}-\\x{1F3FF}]?" 
" | [\\x{270F}\\x{2712}\\x{2714}\\x{2716}\\x{271D}\\x{2721}\\x{2728}\\x{2733}\\x{2734}\\x{2744}\\x{2747}\\x{274C}\\x{274E}\\x{2753}-\\x{2755}\\x{2757}\\x{2763}\\x{2764}\\x{2795}-\\x{2797}\\x{27A1}\\x{27B0}\\x{27BF}\\x{2934}\\x{2935}\\x{2B05}-\\x{2B07}\\x{2B1B}\\x{2B1C}\\x{2B50}\\x{2B55}\\x{3030}\\x{303D}\\x{3297}\\x{3299}\\x{1F004}\\x{1F0CF}\\x{1F170}\\x{1F171}\\x{1F17E}\\x{1F17F}\\x{1F18E}\\x{1F191}-\\x{1F19A}]" " | \\x{1F1E6} [\\x{1F1E8}-\\x{1F1EC}\\x{1F1EE}\\x{1F1F1}\\x{1F1F2}\\x{1F1F4}\\x{1F1F6}-\\x{1F1FA}\\x{1F1FC}\\x{1F1FD}\\x{1F1FF}]" " | \\x{1F1E7} [\\x{1F1E6}\\x{1F1E7}\\x{1F1E9}-\\x{1F1EF}\\x{1F1F1}-\\x{1F1F4}\\x{1F1F6}-\\x{1F1F9}\\x{1F1FB}\\x{1F1FC}\\x{1F1FE}\\x{1F1FF}]" " | \\x{1F1E8} [\\x{1F1E6}\\x{1F1E8}\\x{1F1E9}\\x{1F1EB}-\\x{1F1EE}\\x{1F1F0}-\\x{1F1F5}\\x{1F1F7}\\x{1F1FA}-\\x{1F1FF}]" " | \\x{1F1E9} [\\x{1F1EA}\\x{1F1EC}\\x{1F1EF}\\x{1F1F0}\\x{1F1F2}\\x{1F1F4}\\x{1F1FF}]" " | \\x{1F1EA} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}\\x{1F1EC}\\x{1F1ED}\\x{1F1F7}-\\x{1F1FA}]" " | \\x{1F1EB} [\\x{1F1EE}-\\x{1F1F0}\\x{1F1F2}\\x{1F1F4}\\x{1F1F7}]" " | \\x{1F1EC} [\\x{1F1E6}\\x{1F1E7}\\x{1F1E9}-\\x{1F1EE}\\x{1F1F1}-\\x{1F1F3}\\x{1F1F5}-\\x{1F1FA}\\x{1F1FC}\\x{1F1FE}]" " | \\x{1F1ED} [\\x{1F1F0}\\x{1F1F2}\\x{1F1F3}\\x{1F1F7}\\x{1F1F9}\\x{1F1FA}]" " | \\x{1F1EE} [\\x{1F1E8}-\\x{1F1EA}\\x{1F1F1}-\\x{1F1F4}\\x{1F1F6}-\\x{1F1F9}]" " | \\x{1F1EF} [\\x{1F1EA}\\x{1F1F2}\\x{1F1F4}\\x{1F1F5}]" " | \\x{1F1F0} [\\x{1F1EA}\\x{1F1EC}-\\x{1F1EE}\\x{1F1F2}\\x{1F1F3}\\x{1F1F5}\\x{1F1F7}\\x{1F1FC}\\x{1F1FE}\\x{1F1FF}]" " | \\x{1F1F1} [\\x{1F1E6}-\\x{1F1E8}\\x{1F1EE}\\x{1F1F0}\\x{1F1F7}-\\x{1F1FB}\\x{1F1FE}]" " | \\x{1F1F2} [\\x{1F1E6}\\x{1F1E8}-\\x{1F1ED}\\x{1F1F0}-\\x{1F1FF}]" " | \\x{1F1F3} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}-\\x{1F1EC}\\x{1F1EE}\\x{1F1F1}\\x{1F1F4}\\x{1F1F5}\\x{1F1F7}\\x{1F1FA}\\x{1F1FF}]" " | \\x{1F1F4} \\x{1F1F2}" " | \\x{1F1F5} [\\x{1F1E6}\\x{1F1EA}-\\x{1F1ED}\\x{1F1F0}-\\x{1F1F3}\\x{1F1F7}-\\x{1F1F9}\\x{1F1FC}\\x{1F1FE}]" " | \\x{1F1F6} \\x{1F1E6}" " | \\x{1F1F7} 
[\\x{1F1EA}\\x{1F1F4}\\x{1F1F8}\\x{1F1FA}\\x{1F1FC}]" " | \\x{1F1F8} [\\x{1F1E6}-\\x{1F1EA}\\x{1F1EC}-\\x{1F1F4}\\x{1F1F7}-\\x{1F1F9}\\x{1F1FB}\\x{1F1FD}-\\x{1F1FF}]" " | \\x{1F1F9} [\\x{1F1E6}\\x{1F1E8}\\x{1F1E9}\\x{1F1EB}-\\x{1F1ED}\\x{1F1EF}-\\x{1F1F4}\\x{1F1F7}\\x{1F1F9}\\x{1F1FB}\\x{1F1FC}\\x{1F1FF}]" " | \\x{1F1FA} [\\x{1F1E6}\\x{1F1EC}\\x{1F1F2}\\x{1F1F3}\\x{1F1F8}\\x{1F1FE}\\x{1F1FF}]" " | \\x{1F1FB} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}\\x{1F1EC}\\x{1F1EE}\\x{1F1F3}\\x{1F1FA}]" " | \\x{1F1FC} [\\x{1F1EB}\\x{1F1F8}]" " | \\x{1F1FD} \\x{1F1F0}" " | \\x{1F1FE} [\\x{1F1EA}\\x{1F1F9}]" " | \\x{1F1FF} [\\x{1F1E6}\\x{1F1F2}\\x{1F1FC}]" " | [\\x{1F201}\\x{1F202}\\x{1F21A}\\x{1F22F}\\x{1F232}-\\x{1F23A}\\x{1F250}\\x{1F251}\\x{1F300}-\\x{1F321}\\x{1F324}-\\x{1F384}]" " | \\x{1F385} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F386}-\\x{1F393}\\x{1F396}\\x{1F397}\\x{1F399}-\\x{1F39B}\\x{1F39E}-\\x{1F3C1}]" " | \\x{1F3C2} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F3C3}\\x{1F3C4}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F3C5}\\x{1F3C6}]" " | \\x{1F3C7} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F3C8}\\x{1F3C9}]" " | \\x{1F3CA}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F3CB}\\x{1F3CC}]" " (?:" " \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F3CD}-\\x{1F3F0}]" " | \\x{1F3F3}" " (?: \\x{FE0F} \\x{200D} \\x{1F308} )?" " | \\x{1F3F4}" " (?:" " \\x{200D} \\x{2620} \\x{FE0F}" " | \\x{E0067} \\x{E0062}" " (?:" " \\x{E0065} \\x{E006E} \\x{E0067}" " | \\x{E0073} \\x{E0063} \\x{E0074}" " | \\x{E0077} \\x{E006C} \\x{E0073}" " )" " \\x{E007F}" " )?" " | [\\x{1F3F5}\\x{1F3F7}-\\x{1F440}]" " | \\x{1F441}" " (?: \\x{FE0F} \\x{200D} \\x{1F5E8} \\x{FE0F} )?" 
" | [\\x{1F442}\\x{1F443}] [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F444}\\x{1F445}]" " | [\\x{1F446}-\\x{1F450}] [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F451}-\\x{1F465}]" " | [\\x{1F466}\\x{1F467}] [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F468}" " (?:" " \\x{200D}" " (?:" " [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}" " | \\x{2764} \\x{FE0F} \\x{200D}" " (?: \\x{1F48B} \\x{200D} )?" " \\x{1F468}" " | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}]" " | \\x{1F466}" " (?: \\x{200D} \\x{1F466} )?" " | \\x{1F467}" " (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?" " | [\\x{1F468}\\x{1F469}] \\x{200D}" " (?:" " \\x{1F466}" " (?: \\x{200D} \\x{1F466} )?" " | \\x{1F467}" " (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?" " )" " | [\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]" " )" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?:" " \\x{200D}" " (?:" " [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}" " | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]" " )" " )?" " )?" " | \\x{1F469}" " (?:" " \\x{200D}" " (?:" " [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}" " | \\x{2764} \\x{FE0F} \\x{200D}" " (?: \\x{1F48B} \\x{200D} )?" " [\\x{1F468}\\x{1F469}]" " | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}]" " | \\x{1F466}" " (?: \\x{200D} \\x{1F466} )?" " | \\x{1F467}" " (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?" " | \\x{1F469} \\x{200D}" " (?:" " \\x{1F466}" " (?: \\x{200D} \\x{1F466} )?" " | \\x{1F467}" " (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?" " )" " | [\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]" " )" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?:" " \\x{200D}" " (?:" " [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}" " | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]" " )" " )?" " )?" 
" | [\\x{1F46A}-\\x{1F46D}]" " | \\x{1F46E}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F46F}" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " | \\x{1F470} [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F471}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F472} [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F473}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F474}-\\x{1F476}] [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F477}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F478} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F479}-\\x{1F47B}]" " | \\x{1F47C} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F47D}-\\x{1F480}]" " | [\\x{1F481}\\x{1F482}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F483} [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F484}" " | \\x{1F485} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F486}\\x{1F487}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F488}-\\x{1F4A9}]" " | \\x{1F4AA} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F4AB}-\\x{1F4FD}\\x{1F4FF}-\\x{1F53D}\\x{1F549}-\\x{1F54E}\\x{1F550}-\\x{1F567}\\x{1F56F}\\x{1F570}\\x{1F573}]" " | \\x{1F574} [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F575}" " (?:" " \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F576}-\\x{1F579}]" " | \\x{1F57A} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F587}\\x{1F58A}-\\x{1F58D}]" " | [\\x{1F590}\\x{1F595}\\x{1F596}] [\\x{1F3FB}-\\x{1F3FF}]?" 
" | [\\x{1F5A4}\\x{1F5A5}\\x{1F5A8}\\x{1F5B1}\\x{1F5B2}\\x{1F5BC}\\x{1F5C2}-\\x{1F5C4}\\x{1F5D1}-\\x{1F5D3}\\x{1F5DC}-\\x{1F5DE}\\x{1F5E1}\\x{1F5E3}\\x{1F5E8}\\x{1F5EF}\\x{1F5F3}\\x{1F5FA}-\\x{1F644}]" " | [\\x{1F645}-\\x{1F647}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F648}-\\x{1F64A}]" " | \\x{1F64B}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F64C} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F64D}\\x{1F64E}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F64F} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F680}-\\x{1F6A2}]" " | \\x{1F6A3}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F6A4}-\\x{1F6B3}]" " | [\\x{1F6B4}-\\x{1F6B6}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F6B7}-\\x{1F6BF}]" " | \\x{1F6C0} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F6C1}-\\x{1F6C5}\\x{1F6CB}]" " | \\x{1F6CC} [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F6CD}-\\x{1F6D2}\\x{1F6E0}-\\x{1F6E5}\\x{1F6E9}\\x{1F6EB}\\x{1F6EC}\\x{1F6F0}\\x{1F6F3}-\\x{1F6F9}\\x{1F910}-\\x{1F917}]" " | [\\x{1F918}-\\x{1F91C}] [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F91D}" " | [\\x{1F91E}\\x{1F91F}] [\\x{1F3FB}-\\x{1F3FF}]?" " | [\\x{1F920}-\\x{1F925}]" " | \\x{1F926}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F927}-\\x{1F92F}]" " | [\\x{1F930}-\\x{1F936}] [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F937}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" 
" | [\\x{1F938}\\x{1F939}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | \\x{1F93A}" " | \\x{1F93C}" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " | [\\x{1F93D}\\x{1F93E}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F940}-\\x{1F945}\\x{1F947}-\\x{1F970}\\x{1F973}-\\x{1F976}\\x{1F97A}\\x{1F97C}-\\x{1F9A2}\\x{1F9B0}-\\x{1F9B4}]" " | [\\x{1F9B5}\\x{1F9B6}] [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F9B7}" " | [\\x{1F9B8}\\x{1F9B9}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F9C0}-\\x{1F9C2}\\x{1F9D0}]" " | [\\x{1F9D1}-\\x{1F9D5}] [\\x{1F3FB}-\\x{1F3FF}]?" " | \\x{1F9D6}" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F9D7}-\\x{1F9DD}]" " (?:" " \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}" " | [\\x{1F3FB}-\\x{1F3FF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " )?" " | [\\x{1F9DE}\\x{1F9DF}]" " (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?" " | [\\x{1F9E0}-\\x{1F9FF}]" 

For utf-16 mode (stringed), compressed mode:

 "[#*0-9]\\uFE0F\\u20E3|[\\u00A9\\u00AE\\u203C\\u2049\\u2122\\u2139\\u2" "194-\\u2199\\u21A9\\u21AA\\u231A\\u231B\\u2328\\u23CF\\u23E9-\\u23F3\\" "u23F8-\\u23FA\\u24C2\\u25AA\\u25AB\\u25B6\\u25C0\\u25FB-\\u25FE\\u260" "0-\\u2604\\u260E\\u2611\\u2614\\u2615\\u2618]|\\u261D(?:\\uD83C[\\uDF" "FB-\\uDFFF])?|[\\u2620\\u2622\\u2623\\u2626\\u262A\\u262E\\u262F\\u26" "38-\\u263A\\u2640\\u2642\\u2648-\\u2653\\u265F\\u2660\\u2663\\u2665\\u" "2666\\u2668\\u267B\\u267E\\u267F\\u2692-\\u2697\\u2699\\u269B\\u269C\\" "u26A0\\u26A1\\u26AA\\u26AB\\u26B0\\u26B1\\u26BD\\u26BE\\u26C4\\u26C5\\" "u26C8\\u26CE\\u26CF\\u26D1\\u26D3\\u26D4\\u26E9\\u26EA\\u26F0-\\u26F5" "\\u26F7\\u26F8]|\\u26F9(?:\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640" "\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uFE0F)?|[\\u26FA\\u" "26FD\\u2702\\u2705\\u2708\\u2709]|[\\u270A-\\u270D](?:\\uD83C[\\uDFF" "B-\\uDFFF])?|[\\u270F\\u2712\\u2714\\u2716\\u271D\\u2721\\u2728\\u273" "3\\u2734\\u2744\\u2747\\u274C\\u274E\\u2753-\\u2755\\u2757\\u2763\\u27" "64\\u2795-\\u2797\\u27A1\\u27B0\\u27BF\\u2934\\u2935\\u2B05-\\u2B07\\u" "2B1B\\u2B1C\\u2B50\\u2B55\\u3030\\u303D\\u3297\\u3299]|\\uD83C(?:[\\u" "DC04\\uDCCF\\uDD70\\uDD71\\uDD7E\\uDD7F\\uDD8E\\uDD91-\\uDD9A]|\\uDDE" "6\\uD83C[\\uDDE8-\\uDDEC\\uDDEE\\uDDF1\\uDDF2\\uDDF4\\uDDF6-\\uDDFA\\u" "DDFC\\uDDFD\\uDDFF]|\\uDDE7\\uD83C[\\uDDE6\\uDDE7\\uDDE9-\\uDDEF\\uDD" "F1-\\uDDF4\\uDDF6-\\uDDF9\\uDDFB\\uDDFC\\uDDFE\\uDDFF]|\\uDDE8\\uD83C" "[\\uDDE6\\uDDE8\\uDDE9\\uDDEB-\\uDDEE\\uDDF0-\\uDDF5\\uDDF7\\uDDFA-\\u" "DDFF]|\\uDDE9\\uD83C[\\uDDEA\\uDDEC\\uDDEF\\uDDF0\\uDDF2\\uDDF4\\uDDF" "F]|\\uDDEA\\uD83C[\\uDDE6\\uDDE8\\uDDEA\\uDDEC\\uDDED\\uDDF7-\\uDDFA]" "|\\uDDEB\\uD83C[\\uDDEE-\\uDDF0\\uDDF2\\uDDF4\\uDDF7]|\\uDDEC\\uD83C[" "\\uDDE6\\uDDE7\\uDDE9-\\uDDEE\\uDDF1-\\uDDF3\\uDDF5-\\uDDFA\\uDDFC\\uD" "DFE]|\\uDDED\\uD83C[\\uDDF0\\uDDF2\\uDDF3\\uDDF7\\uDDF9\\uDDFA]|\\uDD" "EE\\uD83C[\\uDDE8-\\uDDEA\\uDDF1-\\uDDF4\\uDDF6-\\uDDF9]|\\uDDEF\\uD8" 
"3C[\\uDDEA\\uDDF2\\uDDF4\\uDDF5]|\\uDDF0\\uD83C[\\uDDEA\\uDDEC-\\uDDE" "E\\uDDF2\\uDDF3\\uDDF5\\uDDF7\\uDDFC\\uDDFE\\uDDFF]|\\uDDF1\\uD83C[\\u" "DDE6-\\uDDE8\\uDDEE\\uDDF0\\uDDF7-\\uDDFB\\uDDFE]|\\uDDF2\\uD83C[\\uD" "DE6\\uDDE8-\\uDDED\\uDDF0-\\uDDFF]|\\uDDF3\\uD83C[\\uDDE6\\uDDE8\\uDD" "EA-\\uDDEC\\uDDEE\\uDDF1\\uDDF4\\uDDF5\\uDDF7\\uDDFA\\uDDFF]|\\uDDF4\\" "uD83C\\uDDF2|\\uDDF5\\uD83C[\\uDDE6\\uDDEA-\\uDDED\\uDDF0-\\uDDF3\\uD" "DF7-\\uDDF9\\uDDFC\\uDDFE]|\\uDDF6\\uD83C\\uDDE6|\\uDDF7\\uD83C[\\uDD" "EA\\uDDF4\\uDDF8\\uDDFA\\uDDFC]|\\uDDF8\\uD83C[\\uDDE6-\\uDDEA\\uDDEC" "-\\uDDF4\\uDDF7-\\uDDF9\\uDDFB\\uDDFD-\\uDDFF]|\\uDDF9\\uD83C[\\uDDE6" "\\uDDE8\\uDDE9\\uDDEB-\\uDDED\\uDDEF-\\uDDF4\\uDDF7\\uDDF9\\uDDFB\\uDD" "FC\\uDDFF]|\\uDDFA\\uD83C[\\uDDE6\\uDDEC\\uDDF2\\uDDF3\\uDDF8\\uDDFE\\" "uDDFF]|\\uDDFB\\uD83C[\\uDDE6\\uDDE8\\uDDEA\\uDDEC\\uDDEE\\uDDF3\\uDD" "FA]|\\uDDFC\\uD83C[\\uDDEB\\uDDF8]|\\uDDFD\\uD83C\\uDDF0|\\uDDFE\\uD8" "3C[\\uDDEA\\uDDF9]|\\uDDFF\\uD83C[\\uDDE6\\uDDF2\\uDDFC]|[\\uDE01\\uD" "E02\\uDE1A\\uDE2F\\uDE32-\\uDE3A\\uDE50\\uDE51\\uDF00-\\uDF21\\uDF24-" "\\uDF84]|\\uDF85(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDF86-\\uDF93\\uDF9" "6\\uDF97\\uDF99-\\uDF9B\\uDF9E-\\uDFC1]|\\uDFC2(?:\\uD83C[\\uDFFB-\\u" "DFFF])?|[\\uDFC3\\uDFC4](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\" "uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDFC5\\uDFC6" "]|\\uDFC7(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDFC8\\uDFC9]|\\uDFCA(?:\\" "u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2" "640\\u2642]\\uFE0F)?)?|[\\uDFCB\\uDFCC](?:\\uD83C[\\uDFFB-\\uDFFF](" "?:\\u200D[\\u2640\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uF" "E0F)?|[\\uDFCD-\\uDFF0]|\\uDFF3(?:\\uFE0F\\u200D\\uD83C\\uDF08)?|\\u" "DFF4(?:\\u200D\\u2620\\uFE0F|\\uDB40\\uDC67\\uDB40\\uDC62\\uDB40(?:\\" "uDC65\\uDB40\\uDC6E\\uDB40\\uDC67|\\uDC73\\uDB40\\uDC63\\uDB40\\uDC74" "|\\uDC77\\uDB40\\uDC6C\\uDB40\\uDC73)\\uDB40\\uDC7F)?|[\\uDFF5\\uDFF7" 
"-\\uDFFF])|\\uD83D(?:[\\uDC00-\\uDC40]|\\uDC41(?:\\uFE0F\\u200D\\uD8" "3D\\uDDE8\\uFE0F)?|[\\uDC42\\uDC43](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\" "uDC44\\uDC45]|[\\uDC46-\\uDC50](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDC" "51-\\uDC65]|[\\uDC66\\uDC67](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC68(?" ":\\u200D(?:[\\u2695\\u2696\\u2708]\\uFE0F|\\u2764\\uFE0F\\u200D\\uD83" "D(?:\\uDC8B\\u200D\\uD83D)?\\uDC68|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDF" "A4\\uDFA8\\uDFEB\\uDFED]|\\uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?" "|\\uDC67(?:\\u200D\\uD83D[\\uDC66\\uDC67])?|[\\uDC68\\uDC69]\\u200D\\" "uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?|\\uDC67(?:\\u200D\\uD83D[" "\\uDC66\\uDC67])?)|[\\uDCBB\\uDCBC\\uDD27\\uDD2C\\uDE80\\uDE92])|\\uD" "83E[\\uDDB0-\\uDDB3])|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D(?:[\\u2695" "\\u2696\\u2708]\\uFE0F|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uD" "FEB\\uDFED]|\\uD83D[\\uDCBB\\uDCBC\\uDD27\\uDD2C\\uDE80\\uDE92]|\\uD8" "3E[\\uDDB0-\\uDDB3]))?)?|\\uDC69(?:\\u200D(?:[\\u2695\\u2696\\u2708" "]\\uFE0F|\\u2764\\uFE0F\\u200D\\uD83D(?:\\uDC8B\\u200D\\uD83D)?[\\uDC" "68\\uDC69]|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uDFEB\\uDFED]" "|\\uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?|\\uDC67(?:\\u200D\\uD83" "D[\\uDC66\\uDC67])?|\\uDC69\\u200D\\uD83D(?:\\uDC66(?:\\u200D\\uD83D" "\\uDC66)?|\\uDC67(?:\\u200D\\uD83D[\\uDC66\\uDC67])?)|[\\uDCBB\\uDCB" "C\\uDD27\\uDD2C\\uDE80\\uDE92])|\\uD83E[\\uDDB0-\\uDDB3])|\\uD83C[\\u" "DFFB-\\uDFFF](?:\\u200D(?:[\\u2695\\u2696\\u2708]\\uFE0F|\\uD83C[\\u" "DF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uDFEB\\uDFED]|\\uD83D[\\uDCBB\\uDCB" "C\\uDD27\\uDD2C\\uDE80\\uDE92]|\\uD83E[\\uDDB0-\\uDDB3]))?)?|[\\uDC6" "A-\\uDC6D]|\\uDC6E(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-" "\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDC6F(?:\\u200D[\\u2" "640\\u2642]\\uFE0F)?|\\uDC70(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC71(?" 
":\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\" "u2640\\u2642]\\uFE0F)?)?|\\uDC72(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC" "73(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u20" "0D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDC74-\\uDC76](?:\\uD83C[\\uDFFB-\\" "uDFFF])?|\\uDC77(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\" "uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDC78(?:\\uD83C[\\uDF" "FB-\\uDFFF])?|[\\uDC79-\\uDC7B]|\\uDC7C(?:\\uD83C[\\uDFFB-\\uDFFF])" "?|[\\uDC7D-\\uDC80]|[\\uDC81\\uDC82](?:\\u200D[\\u2640\\u2642]\\uFE0" "F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uD" "C83(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC84|\\uDC85(?:\\uD83C[\\uDFFB-" "\\uDFFF])?|[\\uDC86\\uDC87](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C" "[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDC88-\\uD" "CA9]|\\uDCAA(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDCAB-\\uDCFD\\uDCFF-\\" "uDD3D\\uDD49-\\uDD4E\\uDD50-\\uDD67\\uDD6F\\uDD70\\uDD73]|\\uDD74(?:" "\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD75(?:\\uD83C[\\uDFFB-\\uDFFF](?:\\u2" "00D[\\u2640\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uFE0F)?" 
"|[\\uDD76-\\uDD79]|\\uDD7A(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDD87\\uD" "D8A-\\uDD8D]|[\\uDD90\\uDD95\\uDD96](?:\\uD83C[\\uDFFB-\\uDFFF])?|[" "\\uDDA4\\uDDA5\\uDDA8\\uDDB1\\uDDB2\\uDDBC\\uDDC2-\\uDDC4\\uDDD1-\\uDD" "D3\\uDDDC-\\uDDDE\\uDDE1\\uDDE3\\uDDE8\\uDDEF\\uDDF3\\uDDFA-\\uDE44]|" "[\\uDE45-\\uDE47](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\" "uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDE48-\\uDE4A]|\\uDE" "4B(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u20" "0D[\\u2640\\u2642]\\uFE0F)?)?|\\uDE4C(?:\\uD83C[\\uDFFB-\\uDFFF])?|" "[\\uDE4D\\uDE4E](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\u" "DFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDE4F(?:\\uD83C[\\uDFF" "B-\\uDFFF])?|[\\uDE80-\\uDEA2]|\\uDEA3(?:\\u200D[\\u2640\\u2642]\\uF" "E0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[" "\\uDEA4-\\uDEB3]|[\\uDEB4-\\uDEB6](?:\\u200D[\\u2640\\u2642]\\uFE0F|" "\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDE" "B7-\\uDEBF]|\\uDEC0(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDEC1-\\uDEC5\\u" "DECB]|\\uDECC(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDECD-\\uDED2\\uDEE0-" "\\uDEE5\\uDEE9\\uDEEB\\uDEEC\\uDEF0\\uDEF3-\\uDEF9])|\\uD83E(?:[\\uDD" "10-\\uDD17]|[\\uDD18-\\uDD1C](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD1D|" "[\\uDD1E\\uDD1F](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDD20-\\uDD25]|\\uD" "D26(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u2" "00D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDD27-\\uDD2F]|[\\uDD30-\\uDD36](" "?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD37(?:\\u200D[\\u2640\\u2642]\\uFE0" "F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\u" "DD38\\uDD39](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFF" "F](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDD3A|\\uDD3C(?:\\u200D[\\" "u2640\\u2642]\\uFE0F)?|[\\uDD3D\\uDD3E](?:\\u200D[\\u2640\\u2642]\\u" "FE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|" "[\\uDD40-\\uDD45\\uDD47-\\uDD70\\uDD73-\\uDD76\\uDD7A\\uDD7C-\\uDDA2\\" 
"uDDB0-\\uDDB4]|[\\uDDB5\\uDDB6](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDDB" "7|[\\uDDB8\\uDDB9](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-" "\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDDC0-\\uDDC2\\uDDD" "0]|[\\uDDD1-\\uDDD5](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDDD6(?:\\u200D" "[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u" "2642]\\uFE0F)?)?|[\\uDDD7-\\uDDDD](?:\\u200D[\\u2640\\u2642]\\uFE0F" "|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uD" "DDE\\uDDDF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?|[\\uDDE0-\\uDDFF])"