Cross-Platform C++

ot
class Character

#include "ot/base/Character.h"

Represents a Unicode character using an internal sequence of one or more CharType values. It provides optimized routines for converting Unicode characters into a sequence of one or more CharType values and for decoding a CharType sequence into a single UCS4Char value.

The Character class also contains a number of convenient methods for querying the characteristics of the encoded Unicode character. These routines have counterparts in the standard C++ library, but the standard library routines rely on the capabilities of a locale which may not be available for Unicode.




Constructor/Destructor Summary
Character()
         Default constructor.
Character(const Character& rhs)
         Copy constructor.
Character(UCS4Char ch)
         Constructs a Character by transforming the code-point value into a sequence of CharType values representing the Unicode character encoded in the native OpenTop encoding.
Character(const CharType* pSeqStart, size_t len)
         Constructs a Character given a pointer to a sequence of CharType elements and its maximum length.

Method Summary
 void appendToString(String& str) const
         Appends the multi-character sequence controlled by this Character to the passed String str.
 const CharType* data() const
         Returns a pointer to the controlled CharType sequence buffer.
 CharType first() const
         Returns the first CharType value in the controlled sequence.
 bool isAscii() const
         Tests if the Unicode character represented by this Character is in the ASCII range U+0000-U+007F.
 bool isDigit() const
         Tests if the Unicode character represented by this Character represents an ASCII decimal digit 0-9.
 bool isEOF() const
         Tests if this Character is equal to the special Character: Character::EndOfFileCharacter.
 bool isHexDigit() const
         Tests if the Unicode character represented by this Character represents an ASCII hexadecimal digit [0-9], [A-F], [a-f].
 bool isSpace() const
         Tests if the Unicode character represented by this Character represents white-space according to common Windows and Unix conventions.
 size_t length() const
         Returns the number of CharType elements in the controlled sequence that are used to encode the represented Unicode character.
 bool operator!=(const Character& rhs) const
         Inequality operator.
 bool operator!=(CharType c) const
         Inequality operator.
 Character& operator=(const Character& rhs)
         Assignment operator.
 bool operator==(const Character& rhs) const
         Equality operator.
 bool operator==(CharType c) const
         Equality operator.
 String toString() const
         Returns a String containing the same sequence of CharType values as the sequence controlled and contained by this Character.
 UCS4Char toUnicode() const
         Converts the controlled multi-character sequence into a 32-bit Unicode code-point value.

Public Static Data Members

EndOfFileCharacter

Character EndOfFileCharacter

A Character representing the 'end of file' condition. This is a special Character that can be returned from functions that read a single Character when the end of file condition has been reached.


Constructor/Destructor Detail

Character

 Character()
Default constructor. Creates a Character that is equivalent to the EndOfFile character.


Character

 Character(const Character& rhs)
Copy constructor. Constructs a Character with the same value as rhs.

Parameters:
rhs - the Character to copy

Character

 Character(UCS4Char ch)
Constructs a Character by transforming the code-point value into a sequence of CharType values representing the Unicode character encoded in the native OpenTop encoding.

Parameters:
ch - the code-point value of the Unicode character.
Exceptions:
IllegalCharacterException - if ch is not a legal Unicode character in the range U+0000-U+10FFFF or, when OpenTop has been configured to use an incomplete Unicode character encoding (such as ISO-8859-1), if the character cannot be mapped in the configured native encoding.
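
The transformation from a code point to a CharType sequence can be illustrated with a minimal, standalone sketch. It assumes that CharType is char and that the native encoding is UTF-8; neither is guaranteed by this page. The typedefs and the names isLegalCodePoint and encodeUtf8 are hypothetical and are not part of OpenTop, and std::invalid_argument merely stands in for IllegalCharacterException. Rejecting surrogate values is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the OpenTop typedefs (assumed, not taken from
// the library headers).
using UCS4Char = std::uint32_t;
using CharType = char;

// True if ch lies in U+0000-U+10FFFF and is not a surrogate value.
bool isLegalCodePoint(UCS4Char ch)
{
    return ch <= 0x10FFFF && !(ch >= 0xD800 && ch <= 0xDFFF);
}

// Encodes one code point as a sequence of one to four UTF-8 code units.
std::basic_string<CharType> encodeUtf8(UCS4Char ch)
{
    if (!isLegalCodePoint(ch))
        throw std::invalid_argument("not a legal Unicode character");

    std::basic_string<CharType> out;
    if (ch < 0x80) {
        out += static_cast<CharType>(ch);
    } else if (ch < 0x800) {
        out += static_cast<CharType>(0xC0 | (ch >> 6));
        out += static_cast<CharType>(0x80 | (ch & 0x3F));
    } else if (ch < 0x10000) {
        out += static_cast<CharType>(0xE0 | (ch >> 12));
        out += static_cast<CharType>(0x80 | ((ch >> 6) & 0x3F));
        out += static_cast<CharType>(0x80 | (ch & 0x3F));
    } else {
        out += static_cast<CharType>(0xF0 | (ch >> 18));
        out += static_cast<CharType>(0x80 | ((ch >> 12) & 0x3F));
        out += static_cast<CharType>(0x80 | ((ch >> 6) & 0x3F));
        out += static_cast<CharType>(0x80 | (ch & 0x3F));
    }
    assert(!out.empty() && out.size() <= 4);
    return out;
}
```

Under these assumptions, U+0041 occupies one CharType element and U+20AC occupies three, which is the kind of difference that length() reports.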

Character

 Character(const CharType* pSeqStart,
           size_t len)
Constructs a Character given a pointer to a sequence of CharType elements and its maximum length. The input sequence consists of one or more CharType values that, when decoded, represent a single Unicode character.

The sequence, from the first CharType element through any trailing elements (the number of which is indicated by the value of the first element), is copied into the internal CharType sequence.

Parameters:
pSeqStart - a pointer to the first element of a CharType sequence that represents at least one Unicode character.
len - the number of CharType elements that are legally addressable within the array starting at pSeqStart
Exceptions:
NullPointerException - if pSeqStart is null.
IllegalCharacterException - if the array starting at pSeqStart does not represent a valid Unicode character in the native OpenTop encoding.
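
How the first element can indicate the number of trailing elements, and why len is needed, is sketched below. As before, the sketch assumes CharType is char and a UTF-8 native encoding. The names sequenceLength and decodeFirst are hypothetical, and the function returns 0 on failure where the real constructor is documented to throw.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the OpenTop typedefs (assumed).
using UCS4Char = std::uint32_t;
using CharType = char;

// Number of code units announced by a UTF-8 lead unit, or 0 if the unit
// cannot start a sequence.
std::size_t sequenceLength(CharType first)
{
    const unsigned char b = static_cast<unsigned char>(first);
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 0;
}

// Decodes the single character starting at pSeqStart. Returns the number of
// code units consumed, or 0 if the pointer is null or the sequence is
// truncated, malformed, overlong or outside U+0000-U+10FFFF.
std::size_t decodeFirst(const CharType* pSeqStart, std::size_t len,
                        UCS4Char& result)
{
    if (pSeqStart == nullptr || len == 0)
        return 0;

    const std::size_t n = sequenceLength(pSeqStart[0]);
    if (n == 0 || n > len)          // never read past the addressable range
        return 0;
    assert(n >= 1 && n <= 4);

    static const unsigned char leadMask[] = {0x00, 0x7F, 0x1F, 0x0F, 0x07};
    static const UCS4Char minValue[] = {0, 0, 0x80, 0x800, 0x10000};

    UCS4Char cp = static_cast<unsigned char>(pSeqStart[0]) & leadMask[n];
    for (std::size_t i = 1; i < n; ++i) {
        const unsigned char b = static_cast<unsigned char>(pSeqStart[i]);
        if ((b & 0xC0) != 0x80)     // trailing units must look like 10xxxxxx
            return 0;
        cp = (cp << 6) | (b & 0x3F);
    }

    if (cp < minValue[n] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    result = cp;
    return n;
}
```

Only the first character is consumed, so any elements that follow it in the array are ignored, which matches the "at least one Unicode character" wording for pSeqStart.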

Method Detail

appendToString

void appendToString(String& str) const
Appends the multi-character sequence controlled by this Character to the passed String str.

Parameters:
str - the String which will have this Character appended

data

const CharType* data() const
Returns a pointer to the controlled CharType sequence buffer.

Returns:
a pointer to the controlled CharType sequence.
See also:
length()

first

CharType first() const
Returns the first CharType value in the controlled sequence.

Returns:
the first CharType value in the controlled sequence.
Exceptions:
IllegalCharacterException - if this Character does not represent a valid Unicode character in the range U+0000-U+10FFFF.

isAscii

bool isAscii() const
Tests if the Unicode character represented by this Character is in the ASCII range U+0000-U+007F.

Returns:
true if this Character is in the ASCII range; false otherwise.
See also:
UnicodeCharacterType::IsAscii()

isDigit

bool isDigit() const
Tests if the Unicode character represented by this Character represents an ASCII decimal digit 0-9.

Returns:
true if this Character is a decimal digit [0-9]; false otherwise.
See also:
UnicodeCharacterType::IsDigit()

isEOF

bool isEOF() const
Tests if this Character is equal to the special Character: Character::EndOfFileCharacter. Functions that read a character stream and return a Character need a method to indicate that the end of stream has been reached. To achieve this they return a special Character with a unique value that is different from all valid Unicode characters.

Returns:
true if this Character is equal to the Character::EndOfFileCharacter; false otherwise.
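
The end-of-stream convention described above can be sketched as a read loop. The sketch is standalone: the sentinel value kEndOfFile, the AsciiReader class and countCharacters are all hypothetical names chosen for illustration, and the actual value used by Character::EndOfFileCharacter is not stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using UCS4Char = std::uint32_t;   // hypothetical stand-in for the OpenTop typedef

// Assumed sentinel: a value above U+10FFFF can never be a valid code point.
const UCS4Char kEndOfFile = 0xFFFFFFFFu;

// Hypothetical reader over ASCII-only text; it returns the sentinel once the
// input is exhausted and keeps returning it afterwards.
class AsciiReader {
public:
    explicit AsciiReader(const std::string& text) : m_text(text), m_pos(0) {}

    UCS4Char read()
    {
        if (m_pos >= m_text.size())
            return kEndOfFile;
        return static_cast<unsigned char>(m_text[m_pos++]);
    }

private:
    std::string m_text;
    std::size_t m_pos;
};

// Reads until the sentinel appears, the same shape as a loop that tests
// isEOF() on each Character returned by a reader.
std::size_t countCharacters(AsciiReader& reader)
{
    std::size_t count = 0;
    for (UCS4Char ch = reader.read(); ch != kEndOfFile; ch = reader.read())
        ++count;
    assert(reader.read() == kEndOfFile);   // the end-of-file state is sticky
    return count;
}
```

Because the sentinel lies outside the valid code-point range, the loop cannot mistake any legitimate character for the end of the stream.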

isHexDigit

bool isHexDigit() const
Tests if the Unicode character represented by this Character represents an ASCII hexadecimal digit [0-9], [A-F], [a-f].

Returns:
true if this Character is a hexadecimal digit; false otherwise.
See also:
UnicodeCharacterType::IsHexDigit()
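
The ASCII-range tests documented for isAscii(), isDigit() and isHexDigit() do not need a locale. A minimal standalone sketch, operating on a code-point value, is shown below. All function names are hypothetical and are not the OpenTop implementation; hexValue is an extra illustrative helper that has no counterpart on this page.

```cpp
#include <cassert>
#include <cstdint>

using UCS4Char = std::uint32_t;   // hypothetical stand-in for the OpenTop typedef

// True for U+0000-U+007F.
bool isAsciiCodePoint(UCS4Char c)
{
    return c <= 0x7F;
}

// True for the ASCII decimal digits 0-9 only; other Unicode digits are rejected.
bool isDigitCodePoint(UCS4Char c)
{
    return c >= U'0' && c <= U'9';
}

// True for [0-9], [A-F] and [a-f].
bool isHexDigitCodePoint(UCS4Char c)
{
    return isDigitCodePoint(c)
        || (c >= U'A' && c <= U'F')
        || (c >= U'a' && c <= U'f');
}

// Illustrative helper: numeric value of a hexadecimal digit.
unsigned hexValue(UCS4Char c)
{
    assert(isHexDigitCodePoint(c));            // precondition
    if (c <= U'9') return c - U'0';
    if (c >= U'a') return c - U'a' + 10;
    return c - U'A' + 10;
}
```

Unlike std::isdigit, whose behavior is undefined for arguments that are neither EOF nor representable as unsigned char, these predicates accept any 32-bit value and simply return false outside the tested ranges.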

isSpace

bool isSpace() const
Tests if the Unicode character represented by this Character represents white-space according to common Windows and Unix conventions. Space characters are:-

Returns:
true if this Character is a space character; false otherwise.
See also:
UnicodeCharacterType::IsSpace()

length

size_t length() const
Returns the number of CharType elements in the controlled sequence that are used to encode the represented Unicode character.

Returns:
the length of the controlled CharType sequence.
See also:
data()

operator!=

bool operator!=(const Character& rhs) const
Inequality operator. Tests if the Unicode character represented by this Character is not the same Unicode character as rhs.

Returns:
false if the Unicode character represented by this Character is equal to the Unicode character rhs; true otherwise

operator!=

bool operator!=(CharType c) const
Inequality operator. Tests if the internal multi-character sequence has a length other than 1 or the first member is not equal to c.

Returns:
false if the Unicode character represented by this Character is equal to the single CharType value c; true otherwise

operator=

Character& operator=(const Character& rhs)
Assignment operator. Sets this Character equal to rhs.

Returns:
a reference to this Character.

operator==

bool operator==(const Character& rhs) const
Equality operator. Tests if the Unicode character represented by this Character is the same Unicode character as rhs.

Returns:
true if the Unicode character represented by this Character is equal to the Unicode character rhs; false otherwise

operator==

bool operator==(CharType c) const
Equality operator. Tests if the internal multi-character sequence has a length of 1 and the first member is equal to c.

Returns:
true if the Unicode character represented by this Character is equal to the CharType value c; false otherwise
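
The rule for comparing against a single CharType value (a sequence length of 1 and an equal first element) can be sketched with a miniature stand-in. MiniCharacter and comparisonExample are hypothetical; the struct stores only the code units and is not the OpenTop class. The multi-element example assumes CharType is char with a UTF-8 native encoding.

```cpp
#include <cassert>
#include <string>

using CharType = char;   // hypothetical stand-in for the OpenTop typedef

// Hypothetical miniature of the class: it holds only the encoded code units.
struct MiniCharacter {
    std::basic_string<CharType> units;

    // Equal only when the sequence is exactly one element long and that
    // element matches c.
    bool operator==(CharType c) const
    {
        return units.size() == 1 && units[0] == c;
    }

    // Defined as the exact negation of operator==.
    bool operator!=(CharType c) const
    {
        return !(*this == c);
    }
};

// Usage: a character encoded with several elements never equals a single
// CharType, even when its first element matches.
inline void comparisonExample()
{
    const MiniCharacter letter{"A"};
    const MiniCharacter euro{"\xE2\x82\xAC"};   // U+20AC as three UTF-8 units
    assert(letter == 'A');
    assert(euro != '\xE2');
}
```

A consequence of this rule is that comparing with a CharType is only meaningful for characters that fit in one element of the native encoding; other characters should be compared as Character values.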

toString

String toString() const
Returns a String containing the same sequence of CharType values as the sequence controlled and contained by this Character.

Returns:
a String representing the same sequence of CharType values.

toUnicode

UCS4Char toUnicode() const
Converts the controlled multi-character sequence into a 32-bit Unicode code-point value.

Returns:
the Unicode character represented by this Character as a 32-bit value. A value of 0xFFFF is returned if this Character has not been initialized.
Exceptions:
IllegalCharacterException - if this Character does not represent a valid Unicode character.



Found a bug or missing feature? Please email us at support@elcel.com

Copyright © 2000-2005 ElCel Technology   Trademark Acknowledgements