The QStringConverter class provides a base class for encoding and decoding text.
Header: | #include <QStringConverter> |
CMake: |
find_package(Qt6 REQUIRED COMPONENTS Core)
target_link_libraries(mytarget PRIVATE Qt6::Core) |
qmake: | QT += core |
Inherited By: | QStringDecoder and QStringEncoder |
Note: All functions in this class are reentrant.
enum | Encoding { Utf8, Utf16, Utf16BE, Utf16LE, Utf32, …, System } |
enum class | Flag { Default, ConvertInvalidToNull, WriteBom, ConvertInitialBom, Stateless } |
flags | Flags |
bool | hasError () const |
bool | isValid () const |
const char * | name () const |
void | resetState () |
QStringList | availableCodecs () |
std::optional<QStringConverter::Encoding> | encodingForData (QByteArrayView data , char16_t expectedFirstCharacter = 0) |
std::optional<QStringConverter::Encoding> | encodingForHtml (QByteArrayView data ) |
std::optional<QStringConverter::Encoding> | encodingForName (const char * name ) |
const char * | nameForEncoding (QStringConverter::Encoding e ) |
Qt uses UTF-16 to store, draw and manipulate strings. In many situations you may wish to deal with data that uses a different encoding. Most text data transferred over files and network connections is encoded in UTF-8.
The QStringConverter class is a base class for the QStringEncoder and QStringDecoder classes that help with converting between different text encodings. QStringDecoder can decode a string from an encoded representation into UTF-16, the format Qt uses internally. QStringEncoder does the opposite operation, encoding UTF-16 encoded data (usually in the form of a QString ) to the requested encoding.
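The following encodings are always supported: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, ISO-8859-1 (Latin-1), and the system encoding.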
QStringConverter may support more encodings depending on how Qt was compiled. If more codecs are supported, they can be listed using availableCodecs ().
QStringConverters can be used as follows to convert some encoded string to and from UTF-16.
Suppose you have some string encoded in UTF-8, and want to convert it to a QString. The simple way to do it is to use a QStringDecoder like this:
```cpp
QByteArray encodedString = "...";
auto toUtf16 = QStringDecoder(QStringDecoder::Utf8);
QString string = toUtf16(encodedString);
```
After this, string holds the text in decoded form. Converting a string from Unicode to an encoded byte array (UTF-8 in this example) is just as easy using the QStringEncoder class:
```cpp
QString string = "...";
auto fromUtf16 = QStringEncoder(QStringEncoder::Utf8);
QByteArray encodedString = fromUtf16(string);
```
To read or write text files in various encodings, use QTextStream and its setEncoding() function.
Some care must be taken when trying to convert the data in chunks, for example, when receiving it over a network. In such cases it is possible that a multi-byte character will be split over two chunks. At best this might result in the loss of a character and at worst cause the entire conversion to fail.
Both QStringEncoder and QStringDecoder handle this by keeping incomplete input in an internal state. Calling the encoder or decoder again with the next chunk of data automatically continues the conversion correctly:
```cpp
auto toUtf16 = QStringDecoder(QStringDecoder::Utf8);

QString string;
while (new_data_available()) {
    QByteArray chunk = get_new_data();
    string += toUtf16(chunk);
}
```
The QStringDecoder object maintains state between chunks and therefore works correctly even if a multi-byte character is split between chunks.
QStringConverter objects can't be copied because of their internal state, but can be moved.
See also QTextStream, QStringDecoder, and QStringEncoder.
enum QStringConverter::Encoding
Constant | Value | Description |
---|---|---|
QStringConverter::Utf8 | 0 | Create a converter to or from UTF-8. |
QStringConverter::Utf16 | 1 | Create a converter to or from UTF-16. When decoding, the byte order is detected automatically from a leading byte order mark. If none exists, or when encoding, the system byte order is assumed. |
QStringConverter::Utf16BE | 3 | Create a converter to or from big-endian UTF-16. |
QStringConverter::Utf16LE | 2 | Create a converter to or from little-endian UTF-16. |
QStringConverter::Utf32 | 4 | Create a converter to or from UTF-32. When decoding, the byte order is detected automatically from a leading byte order mark. If none exists, or when encoding, the system byte order is assumed. |
QStringConverter::Utf32BE | 6 | Create a converter to or from big-endian UTF-32. |
QStringConverter::Utf32LE | 5 | Create a converter to or from little-endian UTF-32. |
QStringConverter::Latin1 | 7 | Create a converter to or from ISO-8859-1 (Latin1). |
QStringConverter::System | 8 | Create a converter to or from the underlying encoding of the operating system's locale. This is always assumed to be UTF-8 for Unix-based systems. On Windows, this converts to and from the locale code page. |
enum class QStringConverter::Flag
flags QStringConverter::Flags
Constant | Value | Description |
---|---|---|
QStringConverter::Flag::Default | 0 | Default conversion rules apply. |
QStringConverter::Flag::ConvertInvalidToNull | 0x2 | If this flag is set, each invalid input character is output as a null character. If it is not set, invalid input characters are represented as QChar::ReplacementCharacter if the output encoding can represent that character, otherwise as a question mark. |
QStringConverter::Flag::WriteBom | 0x4 | When converting from a QString to an output encoding, write a QChar::ByteOrderMark as the first character if the output encoding supports this. This is the case for UTF-8, UTF-16 and UTF-32 encodings. |
QStringConverter::Flag::ConvertInitialBom | 0x8 | When converting from an input encoding to a QString, the QStringDecoder usually skips a leading QChar::ByteOrderMark. When this flag is set, the byte order mark is not skipped, but converted to UTF-16 and inserted at the start of the created QString. |
QStringConverter::Flag::Stateless | 0x1 | Ignore possible converter states between different function calls to encode or decode strings. This will also cause the QStringConverter to raise an error if an incomplete sequence of data is encountered. |
The Flags type is a typedef for QFlags<Flag>. It stores an OR combination of Flag values.
[static] QStringList QStringConverter::availableCodecs()
Returns a list of names of supported codecs. The names returned by this function can be passed to the constructors of QStringEncoder and QStringDecoder to create an encoder or decoder for the given codec.
This function may be used to obtain a listing of additional codecs beyond the standard ones. Support for additional codecs requires Qt to be compiled with support for the ICU library.
Note: The order of codecs is an internal implementation detail and not guaranteed to be stable.
[static noexcept] std::optional<QStringConverter::Encoding> QStringConverter::encodingForData(QByteArrayView data, char16_t expectedFirstCharacter = 0)
Returns the encoding for the content of data if it can be determined. expectedFirstCharacter can be passed as an additional hint to help determine the encoding.
The returned optional is empty if the encoding is unclear.
[static] std::optional<QStringConverter::Encoding> QStringConverter::encodingForHtml(QByteArrayView data)
Tries to determine the encoding of the HTML in data by looking at leading byte order marks or a charset specifier in the HTML meta tag. If the optional is empty, the encoding specified is not supported by QStringConverter. If no encoding is detected, the method returns Utf8.
See also QStringDecoder::decoderForHtml().
[static noexcept] std::optional<QStringConverter::Encoding> QStringConverter::encodingForName(const char *name)
Converts name to the corresponding Encoding member, if there is one.
If name is not the name of a codec listed in the Encoding enumeration, std::nullopt is returned. Such a name may, nonetheless, be accepted by the QStringConverter constructor when Qt is built with ICU, if ICU provides a converter with the given name.
name is expected to be UTF-8 encoded.
[noexcept] bool QStringConverter::hasError() const
Returns true if a conversion could not correctly convert a character. This could for example get triggered by an invalid UTF-8 sequence or when a character can't get converted due to limitations in the target encoding.
[noexcept] bool QStringConverter::isValid() const
Returns true if this is a valid string converter that can be used for encoding or decoding text.
Default-constructed string converters or converters constructed with an unsupported name are not valid.
[noexcept] const char *QStringConverter::name() const
Returns the canonical name of the encoding this QStringConverter can encode or decode. Returns nullptr if the converter is not valid. The returned name is UTF-8 encoded.
See also isValid().
[static] const char *QStringConverter::nameForEncoding(QStringConverter::Encoding e)
Returns the canonical name for encoding e.
[noexcept] void QStringConverter::resetState()
Resets the internal state of the converter, clearing potential errors or partial conversions.