Cross-Platform C++

ot
class StringUtils

#include "ot/base/StringUtils.h"

Class module containing functions to compare and manipulate strings.

See also:
NumUtils



Method Summary
static int CompareNoCase(const String& lhs, const String& rhs)
         Compares two strings without regard to case.
static int CompareNoCase(const char* pszLHS, const char* pszRHS)
         Compares two null-terminated sequences of char disregarding case differences.
static bool ContainsMultiCharSequence(const String& str)
         Tests to see if the String contains at least one Unicode code-point value which is encoded as a sequence of more than one CharType value.
static ByteString Format(const char* format, ... )
         A wrapper around the C library sprintf routine.
static String FromConsoleMBCS(const char* pStr)
         Translates a Multi-Byte Character String (MBCS) originating from the console into an OpenTop String.
static String FromLatin1(const char* pStr)
         Converts a null-terminated sequence of char characters into an OpenTop String.
static String FromLatin1(const ByteString& str)
         Converts a string of char characters into an OpenTop String.
static String FromLatin1(const char* pStr, size_t len)
         Converts an array of char characters into an OpenTop String.
static String FromNativeMBCS(const char* pStr)
         Translates a native (locale-dependent) Multi-Byte Character String (MBCS) into an OpenTop String.
static String FromUTF8(const ByteString& str)
         Converts a UTF-8 encoded Byte sequence into an internal OpenTop String.
static String FromUTF8(const Byte* pStr, size_t len)
         Converts a UTF-8 encoded Byte sequence into an internal OpenTop String.
static bool IsHexString(const ByteString& in)
         Tests if the passed string contains only hexadecimal characters [0-9], [a-f], [A-F].
static bool LessNoCase(const String& lhs, const String rhs)
        
static String NormalizeWhiteSpace(const String& in)
         Normalizes a String value by removing all leading and trailing white-space and converting sequences of more than one white-space character into a single space character (U+0020).
static bool ReplaceAll(String& in, CharType search, const String& replacement)
         Replaces all CharType values in a String that match search with a replacement String of CharType elements.
static bool ReplaceAll(String& in, const String& search, const String& replacement)
         Replaces all CharType sequences in a String that match the search String with a replacement String.
static String StripWhiteSpace(const String& in, eStripType type)
         Strips white-space from a String.
static ByteString ToAscii(const String& str)
         Converts the passed String into a char string encoded as US-ASCII.
static ByteString ToConsoleMBCS(const String& str)
         Translates an OpenTop string into a Multi-Byte Character String (MBCS) intended to be displayed on the console.
static String ToHexString(const String& str)
         Creates and returns a String containing each Unicode character from str in hexadecimal notation.
static ByteString ToLatin1(const String& str)
         Converts the passed String into a char string encoded as Latin-1.
static String ToLower(const String& str)
         Returns a String representation of str with all characters converted to lower-case.
static ByteString ToNativeMBCS(const String& str)
         Converts an OpenTop String into a native (locale-dependent) Multi-Byte character string (MBCS).
static String ToUpper(const String& str)
         Returns a String representation of str with all characters converted to upper-case.
static ByteString ToUTF8(const String& str)
         Converts the passed String into a char string encoded as UTF-8.
static WCharAutoPtr ToWideChar(const String& str)
         Converts an OpenTop String, consisting of Unicode characters encoded as a sequence of CharType values, into a null-terminated array of wchar_t values.

Typedefs

WCharAutoPtr

typedef ArrayAutoPtr< wchar_t > WCharAutoPtr

Enumerations

enum eStripType { leading, trailing, both }


Method Detail

CompareNoCase

static int CompareNoCase(const String& lhs,
                         const String& rhs)
Compares two strings without regard to case.

Parameters:
lhs - the first String
rhs - the second String
Returns:
-1 if lhs compares less than rhs, 0 if they are equal or +1 if lhs compares greater than rhs

CompareNoCase

static int CompareNoCase(const char* pszLHS,
                         const char* pszRHS)
Compares two null-terminated sequences of char disregarding case differences.

Parameters:
pszLHS - the first character sequence to compare
pszRHS - the second character sequence to compare
Returns:
-1 if pszLHS is less than pszRHS, 0 if they are equal or +1 if pszLHS is greater than pszRHS
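
The return convention above can be illustrated with a small standalone sketch. The function name compare_no_case is hypothetical and this is not OpenTop's implementation; it folds ASCII letters only and assumes both pointers are non-null.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical stand-in for CompareNoCase(const char*, const char*).
// Assumption: only ASCII case folding is performed.
int compare_no_case(const char* lhs, const char* rhs) {
    assert(lhs != nullptr && rhs != nullptr);
    for (;; ++lhs, ++rhs) {
        // Convert through unsigned char: passing a negative char to tolower is undefined.
        const int a = std::tolower(static_cast<unsigned char>(*lhs));
        const int b = std::tolower(static_cast<unsigned char>(*rhs));
        if (a != b) return a < b ? -1 : +1;
        if (a == 0) return 0;  // both sequences ended at the same position
    }
}
```

With this sketch, a sequence that is a prefix of the other compares less, because its terminating null is smaller than any character.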

ContainsMultiCharSequence

static bool ContainsMultiCharSequence(const String& str)
Tests to see if the String contains at least one Unicode code-point value which is encoded as a sequence of more than one CharType value.

Parameters:
str - the String to test
Returns:
true if str contains a Unicode character that is encoded using a sequence of more than one CharType element; false otherwise.
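
As an illustration only, the test is simple when the internal encoding is UTF-8 with a one-byte CharType, which is an assumption of this sketch (the library may be built with a different encoding). In UTF-8 every byte of a multi-byte sequence has its high bit set, so one such byte is enough to answer the question. The name contains_multi_char_sequence and the use of std::string are hypothetical.

```cpp
#include <string>

// Hypothetical sketch, assuming UTF-8 stored in std::string.
// Not OpenTop's implementation.
bool contains_multi_char_sequence(const std::string& utf8) {
    for (unsigned char c : utf8) {
        if (c >= 0x80) return true;  // lead or continuation byte of a multi-byte sequence
    }
    return false;
}
```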

Format

static ByteString Format(const char* format,
                         ... )
A wrapper around the C library sprintf routine.
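
One common way to build such a wrapper is a two-pass call to vsnprintf: the first pass measures the output, the second writes it. The sketch below shows that pattern with std::string standing in for ByteString. The name format_string is hypothetical, and whether OpenTop sizes its buffer this way is not stated here.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sketch of a printf-style formatter that returns a string.
std::string format_string(const char* format, ...) {
    va_list args;
    va_start(args, format);
    va_list args_copy;
    va_copy(args_copy, args);  // a va_list may only be traversed once

    // First pass: ask vsnprintf how many characters the result needs.
    const int needed = std::vsnprintf(nullptr, 0, format, args);
    va_end(args);
    if (needed < 0) {  // encoding error reported by the C library
        va_end(args_copy);
        return std::string();
    }

    // Second pass: format into a buffer with room for the terminating null.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), format, args_copy);
    va_end(args_copy);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

As with sprintf itself, the arguments must match the conversion specifiers in format; the compiler cannot check this for a user-defined variadic function without extra annotations.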


FromConsoleMBCS

static String FromConsoleMBCS(const char* pStr)
Translates a Multi-Byte Character String (MBCS) originating from the console into an OpenTop String.


FromLatin1

static String FromLatin1(const char* pStr)
Converts a null-terminated sequence of char characters into an OpenTop String. The input char sequence is encoded in Latin-1 (ISO-8859-1), i.e. each char is treated as an unsigned byte whose value is a Unicode code point in the range U+0000-U+00FF.


FromLatin1

static String FromLatin1(const ByteString& str)
Converts a string of char characters into an OpenTop String. The input string is encoded in Latin-1 (ISO-8859-1), i.e. each char member of str is treated as an unsigned byte whose value is a Unicode code point in the range U+0000-U+00FF.


FromLatin1

static String FromLatin1(const char* pStr,
                         size_t len)
Converts an array of char characters into an OpenTop String. The input char sequence is encoded in Latin-1 (ISO-8859-1), i.e. each char is treated as an unsigned byte whose value is a Unicode code point in the range U+0000-U+00FF.

Parameters:
pStr - pointer to the start of the char array
len - the number of char elements in the array
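
Because every Latin-1 byte value is also its Unicode code point, the conversion is mechanical. The sketch below shows it for the case where the target encoding is UTF-8, which is an assumption; the function name latin1_to_utf8 and the use of std::string in place of String are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: converts len Latin-1 bytes to UTF-8.
// Assumption: the internal String encoding is UTF-8.
std::string latin1_to_utf8(const char* data, std::size_t len) {
    std::string out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));  // U+0000-U+007F: one byte, unchanged
        } else {
            // U+0080-U+00FF: two bytes, 110xxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```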

FromNativeMBCS

static String FromNativeMBCS(const char* pStr)
Translates a native (locale-dependent) Multi-Byte Character String (MBCS) into an OpenTop String. Applications have to deal with MBCS strings when making system calls and accepting (some) system input. Internal OpenTop Strings are composed of a sequence of CharType values representing Unicode characters encoded into the native OpenTop encoding.


FromUTF8

static String FromUTF8(const ByteString& str)
Converts a UTF-8 encoded Byte sequence into an internal OpenTop String. If the native OpenTop encoding is already UTF-8, the input string is simply returned unchanged.

Parameters:
str - the sequence to convert.
Returns:
A String representing the UTF-8 encoded Byte sequence.
Exceptions:
IllegalCharacterException - if str contains an invalid UTF-8 sequence. This exception may also be thrown if the UTF-8 sequence contains an encoded character which cannot be represented in the native OpenTop encoding.

FromUTF8

static String FromUTF8(const Byte* pStr,
                       size_t len)
Converts a UTF-8 encoded Byte sequence into an internal OpenTop String. If the native OpenTop encoding is already UTF-8, the input sequence is simply returned as a String.

Parameters:
pStr - a pointer to the start of the sequence to convert.
len - the length of the sequence (in bytes)
Returns:
A Unicode String containing the characters from the UTF-8 encoded input sequence.
Exceptions:
IllegalCharacterException - if the input contains an invalid UTF-8 sequence. This exception may also be thrown if the UTF-8 sequence contains an encoded character which cannot be represented in the native OpenTop encoding.
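
The kinds of input that count as an invalid UTF-8 sequence can be illustrated with a standalone well-formedness check. This sketch applies the standard UTF-8 rules (correct continuation bytes, no overlong forms, no surrogates, nothing above U+10FFFF); whether OpenTop enforces exactly the same set is an assumption, and is_well_formed_utf8 is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of OpenTop: returns true when bytes is
// well-formed UTF-8 under the standard rules.
bool is_well_formed_utf8(const std::string& bytes) {
    const std::size_t len = bytes.size();
    std::size_t i = 0;
    while (i < len) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        if (lead < 0x80) { ++i; continue; }  // single-byte (ASCII) character

        std::size_t extra = 0;     // number of continuation bytes expected
        unsigned long cp = 0;      // code point being decoded
        unsigned long min_cp = 0;  // smallest code point allowed at this length
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (len - i - 1 < extra) return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, UTF-16 surrogates and values past U+10FFFF.
        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += extra + 1;
    }
    return true;
}
```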

IsHexString

static bool IsHexString(const ByteString& in)
Tests if the passed string contains only hexadecimal characters [0-9], [a-f], [A-F].

Parameters:
in - the string to test
Returns:
true if the passed String contains hexadecimal characters only; false otherwise
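
A minimal standalone sketch of the same test, using std::string in place of ByteString. The name is_hex_string is hypothetical, and returning true for an empty string is an assumption, since the behaviour for empty input is not stated above.

```cpp
#include <cctype>
#include <string>

// Hypothetical sketch: true when every character is one of [0-9a-fA-F].
// Assumption: an empty string is accepted (vacuously true).
bool is_hex_string(const std::string& in) {
    for (unsigned char c : in) {
        if (!std::isxdigit(c)) return false;
    }
    return true;
}
```

Note that a "0x" prefix is rejected by this sketch, because 'x' is not a hexadecimal digit.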

LessNoCase

static bool LessNoCase(const String& lhs,
                       const String rhs)


NormalizeWhiteSpace

static String NormalizeWhiteSpace(const String& in)
Normalizes a String value by removing all leading and trailing white-space and converting sequences of more than one white-space character into a single space character (U+0020). The definition of white-space is taken from UnicodeCharacterType::IsSpace().

Parameters:
in - the String to normalize
Returns:
the normalized String
See also:
UnicodeCharacterType::IsSpace()
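
The normalization can be sketched in a single pass. This version uses std::isspace as a stand-in for UnicodeCharacterType::IsSpace(), so it recognises ASCII white-space only. It also maps a lone tab or newline to a space, which is an assumption that goes beyond the description above. The name normalize_white_space is hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical sketch, ASCII white-space only. Not OpenTop's implementation.
std::string normalize_white_space(const std::string& in) {
    std::string out;
    bool pending_space = false;
    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Remember the run; leading white-space (out still empty) is dropped.
            pending_space = !out.empty();
        } else {
            if (pending_space) out.push_back(' ');  // collapse the run to one space
            pending_space = false;
            out.push_back(static_cast<char>(c));
        }
    }
    return out;  // a trailing run is never emitted
}
```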

ReplaceAll

static bool ReplaceAll(String& in,
                       CharType search,
                       const String& replacement)
Replaces every CharType value in a String that matches search with a replacement String of CharType elements.

Parameters:
in - a String containing a sequence of CharType elements
search - the CharType to match against each element of in
replacement - a String containing the sequence of CharType elements that will be used to replace matching elements from in
Returns:
true if at least one matching CharType value was found.

ReplaceAll

static bool ReplaceAll(String& in,
                       const String& search,
                       const String& replacement)
Replaces all CharType sequences in a String that match the search String with a replacement String.

Parameters:
in - a String containing a sequence of CharType elements
search - a String containing a sequence of CharType characters to match against sub-sequences of in
replacement - a String containing a sequence of CharType characters that will be used to replace matched sequences from in
Returns:
true if at least one matching sequence was found
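
A sketch of the in-place replace loop, with std::string standing in for String. The name replace_all is hypothetical, and treating an empty search string as "no match" is an assumption.

```cpp
#include <string>

// Hypothetical sketch: replaces every occurrence of search in-place and
// reports whether anything matched.
bool replace_all(std::string& in, const std::string& search,
                 const std::string& replacement) {
    if (search.empty()) return false;  // assumption: an empty pattern matches nothing
    bool found = false;
    std::string::size_type pos = 0;
    while ((pos = in.find(search, pos)) != std::string::npos) {
        in.replace(pos, search.size(), replacement);
        pos += replacement.size();  // resume after the inserted text so it is not rescanned
        found = true;
    }
    return found;
}
```

Resuming the search after the inserted text keeps the loop finite even when the replacement contains the search string, for example replacing "a" with "aa".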

StripWhiteSpace

static String StripWhiteSpace(const String& in,
                              eStripType type)
Strips white-space from a String. The definition of white-space is taken from UnicodeCharacterType::IsSpace().

Parameters:
in - the String to process
type - an enum value specifying from where the white-space should be removed (StringUtils::leading, StringUtils::trailing or StringUtils::both)
Returns:
a copy of in with the requested white-space removed
See also:
UnicodeCharacterType::IsSpace()
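
The effect of each eStripType value can be sketched as follows. The enum StripType and the function strip_white_space are hypothetical stand-ins, and std::isspace replaces UnicodeCharacterType::IsSpace(), so only ASCII white-space is recognised.

```cpp
#include <cctype>
#include <string>

// Hypothetical mirror of StringUtils::eStripType.
enum StripType { leading, trailing, both };

// Hypothetical sketch, ASCII white-space only.
std::string strip_white_space(const std::string& in, StripType type) {
    std::string::size_type first = 0;
    std::string::size_type last = in.size();  // one past the last kept character
    if (type == leading || type == both) {
        while (first < last && std::isspace(static_cast<unsigned char>(in[first]))) ++first;
    }
    if (type == trailing || type == both) {
        while (last > first && std::isspace(static_cast<unsigned char>(in[last - 1]))) --last;
    }
    return in.substr(first, last - first);
}
```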

ToAscii

static ByteString ToAscii(const String& str)
Converts the passed String into a char string encoded as US-ASCII. The US-ASCII encoding is a subset of Unicode with characters in the range U+0000-U+007F.

Returns:
A US-ASCII version of the String str
Exceptions:
IllegalCharacterException - if str contains a character that cannot be encoded as US-ASCII (any character with a Unicode code-point above 0x007F) or is not encoded in accordance with the internal OpenTop String encoding conventions.

ToConsoleMBCS

static ByteString ToConsoleMBCS(const String& str)
Translates an OpenTop string into a Multi-Byte Character String (MBCS) intended to be displayed on the console.


ToHexString

static String ToHexString(const String& str)
Creates and returns a String containing each Unicode character from str in hexadecimal notation.

Parameters:
str - a String containing Unicode characters encoded according to the internal OpenTop convention.
Returns:
a hexadecimal representation of the Unicode characters within str

ToLatin1

static ByteString ToLatin1(const String& str)
Converts the passed String into a char string encoded as Latin-1. The Latin-1 encoding (used throughout Europe) is officially known as ISO-8859-1.

Returns:
A Latin-1 encoded version of the String str
Exceptions:
IllegalCharacterException - if str contains a character that cannot be encoded as ISO-8859-1 (any character with a Unicode code-point above U+00FF) or is not encoded in accordance with the internal OpenTop String encoding conventions.
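
To show where the U+00FF limit comes from, the sketch below converts UTF-8 to Latin-1, assuming the internal encoding is UTF-8. Code points U+0080-U+00FF are exactly those encoded with lead byte 0xC2 or 0xC3, so anything else above ASCII is rejected. The names utf8_to_latin1 and IllegalCharacterError are hypothetical stand-ins, not the OpenTop types.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for IllegalCharacterException.
struct IllegalCharacterError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical sketch: UTF-8 in, Latin-1 bytes out.
std::string utf8_to_latin1(const std::string& utf8) {
    std::string out;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(utf8[i]);
        if (c < 0x80) {  // ASCII is identical in both encodings
            out.push_back(static_cast<char>(c));
            continue;
        }
        // U+0080-U+00FF: lead byte 0xC2 or 0xC3 followed by one continuation byte.
        if ((c == 0xC2 || c == 0xC3) && i + 1 < utf8.size()) {
            const unsigned char next = static_cast<unsigned char>(utf8[i + 1]);
            if ((next & 0xC0) == 0x80) {
                out.push_back(static_cast<char>(((c & 0x03) << 6) | (next & 0x3F)));
                ++i;  // consume the continuation byte
                continue;
            }
        }
        throw IllegalCharacterError("character cannot be encoded as ISO-8859-1");
    }
    return out;
}
```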

ToLower

static String ToLower(const String& str)
Returns a String representation of str with all characters converted to lower-case.

Parameters:
str - the String to convert to lower case.

ToNativeMBCS

static ByteString ToNativeMBCS(const String& str)
Converts an OpenTop String into a native (locale-dependent) Multi-Byte character string (MBCS). OpenTop Strings are stored in an encoded Unicode format. This function translates from the native OpenTop encoding into the MBCS encoding for the current locale.

Parameters:
str - the String to translate
Exceptions:
IllegalCharacterException - if str is not encoded according to the native OpenTop encoding or str contains a Unicode character that cannot be represented in the native MBCS format.

ToUpper

static String ToUpper(const String& str)
Returns a String representation of str with all characters converted to upper-case.

Parameters:
str - the String to convert to upper case.

ToUTF8

static ByteString ToUTF8(const String& str)
Converts the passed String into a char string encoded as UTF-8.

Parameters:
str - the String to convert.
Returns:
A UTF-8 encoded version of the String str.
Exceptions:
IllegalCharacterException - if str contains an illegal character or is not encoded in accordance with the internal OpenTop String encoding conventions.

ToWideChar

static WCharAutoPtr ToWideChar(const String& str)
Converts an OpenTop String, consisting of Unicode characters encoded as a sequence of CharType values, into a null-terminated array of wchar_t values. This function is not present in the wchar_t version of the library.




Copyright © 2000-2005 ElCel Technology