Manipulation

Learn Unicode Character Types and Literals in Modern C++

C++11 brings many improvements, and one of the most important is the set of Unicode character types and literals, which improve support for strings in languages from around the world. They can be used in C++11, C++14, C++17, and above. This feature improved interactions in next-generation C++ applications, such as chat and social media applications, by allowing a more diverse set of language characters and symbols, including emoticons, to be displayed. In this post, we explain what Unicode character types and literals are in modern C++.

What are Unicode character types and literals in Modern C++ 11?

Unicode character types and literals allow more support for different languages, characters, and symbols in strings. C++11 introduces new character types to manipulate Unicode character strings. This feature improved language support in editor and design applications (e.g. RAD Studio uses Unicode strings). It is also why we can display smiley faces, Vulcan hand signals, and love hearts.

C++ Builder implements new character types and character literals for Unicode. These types are among the C++11 features added to the bcc32, bcc32c, and bcc64 compilers.

1. New character types
C++11 introduces new character types to manipulate Unicode character strings. For more information on this feature, see Unicode Character Types and Literals (C++11).

2. Unicode string literals
C++11 introduces Unicode string literals for the new character types. For more information on this feature, see Unicode Character Types and Literals (C++11).

3. Raw string literals

4. Universal character names in literals
To make C++ code less platform-dependent, C++11 lifts the prohibitions on control and basic source universal character names within character and string literals. Prohibitions against surrogate values in all universal character names are added. For more information on this feature, see the Universal character names in literals proposal document.

5. User-defined literals
C++11 introduces new forms of literals, with modified syntax and semantics, in order to provide user-defined literals. Using user-defined literals, user-defined classes can provide new literal syntax. For more information on this feature, see the User-defined literals proposal document.

What are the Unicode character types char16_t and char32_t in Modern C++?

With the C++11 standard, two new types were introduced to represent Unicode characters:

char16_t is a 16-bit character type (the standard guarantees at least 16 bits). char16_t is a C++ keyword. This type can be used for UTF-16 characters.

char32_t is a 32-bit character type (the standard guarantees at least 32 bits). char32_t is a C++ keyword. This type can be used for UTF-32 characters.

The existing wchar_t type is a type for a wide character in the execution wide-character set. A wchar_t wide-character literal begins with an uppercase L (such as L'c'). We have a very good post that explains how you can use character literals in modern C++.

What are the character literals u'character' and U'character' in Modern C++?

There are two new ways to create character literals of the new types:

u'character' is a literal for a single char16_t character, such as u'g'. A multicharacter literal such as u'kh' is ill-formed. The value of a char16_t literal is equal to its ISO 10646 code point value, provided that the code point is representable as a 16-bit value. Only characters in the Basic Multilingual Plane (BMP) can be represented.

U'character' is a literal for a single char32_t character, such as U't'. A multicharacter literal such as U'de' is ill-formed. […]
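The character types and literals described above can be sketched in a few declarations. This is a minimal illustration and not code from the original post: the constant names and the in_bmp helper are hypothetical, and the Vulcan salute code point (U+1F596) is chosen only as an example of a character outside the BMP.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Character literals of the two C++11 Unicode character types.
constexpr char16_t kLetterG = u'g';           // char16_t literal (u prefix)
constexpr char32_t kLetterT = U't';           // char32_t literal (U prefix)
constexpr char32_t kVulcan  = U'\U0001F596';  // outside the BMP, so it needs char32_t

// The value of each literal equals its ISO 10646 code point.
static_assert(kLetterG == 0x0067, "u'g' is U+0067");
static_assert(kLetterT == 0x0074, "U't' is U+0074");

// The standard guarantees minimum widths for both types.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t has at least 16 bits");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t has at least 32 bits");

// Hypothetical helper: true when a code point value fits in 16 bits,
// that is, when it lies in the Basic Multilingual Plane.
constexpr bool in_bmp(char32_t code_point) {
    return code_point <= 0xFFFF;
}

// Unicode string literals use the same prefixes. In UTF-16 a character
// outside the BMP is stored as a surrogate pair (two char16_t units).
inline std::u16string vulcan_utf16() { return u"\U0001F596"; }
inline std::u32string vulcan_utf32() { return U"\U0001F596"; }
```

Calling vulcan_utf16().size() from a main function returns 2, while vulcan_utf32().size() returns 1, which shows why only BMP characters fit in a single char16_t literal.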


What Is The ‘>>’ Right-Angle Bracket Support In C++?

C++11 brings a lot of improvements over C++98. In C++98, two consecutive right-angle brackets (>>) that close nested template argument lists give an error. C++11 accepts them, so the Clang-enhanced compilers, which follow the C++11 standard, no longer generate an error for them. In this post, we explain the right-angle bracket problem in C++ and how to solve it.

What is the right-angle bracket problem in C++?

Ever since the introduction of angle brackets for templates, C++ developers have been surprised by the fact that two consecutive right-angle brackets must be separated by whitespace. For example, suppose you declare two two-dimensional vector types (of int and of bool) as below:

    #include <vector>

    typedef std::vector<std::vector<int> > vec1;  // OK
    typedef std::vector<std::vector<bool>> vec2;  // Error

In C++98, the first declaration is OK, but the second declaration gives an error because of '>>' (right-angle brackets). However, both are OK in C++11 and above. The problem is an immediate consequence of the "maximum munch" principle and the fact that >> is a valid token (right shift) in C++: the lexer always forms the longest possible token, so it reads >> as a shift operator and not as two closing brackets.

This was a minor but persistent, annoying, and somewhat embarrassing problem in C++98. The cost of fixing it was reasonable, so it seemed worthwhile to eliminate the surprise. If you want more information, you can see details here.

How can I solve the right-angle bracket problem in C++?

If you have a C++98 compiler and come across the right-angle bracket problem, you need to add a space between the two '>' right-angle brackets. '>>' should be written as '> >', as shown in the example below.

    #include <vector>

    typedef std::vector<std::vector<bool> > vec2;  // OK

    int main() { }

Alternatively, change to a C++ compiler that supports C++11 or above. C++17 is recommended. Note that the latest RAD Studio, C++ Builder, and Clang compilers support C++17 features.

What is the right-angle bracket support in C++ 11?

In the Clang-enhanced C++ compilers, two consecutive right-angle brackets no longer generate an error, and these constructions are treated according to the C++11 standard. The example below, with '>>' right-angle brackets, compiles successfully with any compiler that supports C++11 and above.

    #include <vector>

    typedef std::vector<std::vector<int>> vec1;  // OK in C++11 and above

    int main() { }

For more information, see the C++11 proposal document, Right Angle Brackets.
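As a further sketch of the C++11 behavior, the declarations below use '>>' to close nested template argument lists. They also show the other side of the rule change: a genuine right shift inside a template argument list has to be parenthesized in C++11 and above. The names Grid, IntConstant, Four, and make_grid are hypothetical and used only for illustration; they do not come from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C++11 and above: ">>" closes both template argument lists.
typedef std::vector<std::vector<int>> Grid;

// Hypothetical helper template that stores an integer template argument.
template <int N>
struct IntConstant {
    static const int value = N;
};

// A real right shift used as a template argument must be wrapped in
// parentheses; otherwise the first '>' of ">>" would end the argument list.
typedef IntConstant<(16 >> 2)> Four;

// Builds a rows x cols grid filled with zeros.
inline Grid make_grid(std::size_t rows, std::size_t cols) {
    return Grid(rows, std::vector<int>(cols, 0));
}
```

With these declarations, make_grid(2, 3) yields two rows of three zeros, and Four::value is 4. The parenthesized form also compiles under C++98, so it is a safe habit in code that must build with both old and new compilers.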
