UTF-8 Conversion
[UTIL]

Classes

struct  SUnicodeTranslation
 Structure to keep substitutions for a particular Unicode character.

Typedefs

typedef SUnicodeTranslation TUnicodePlan [256]
typedef TUnicodePlan * TUnicodeTable [256]
typedef unsigned int TUnicode

Enumerations

enum  ESubstType {
  eSkip = 0, eAsIs, eString, eException,
  eHTML, ePicture, eOther
}
 Types of substitutors.
enum  EConversionResult { eConvertedFine, eDefaultTranslationUsed }
enum  EConversionStatus { eSuccess, eSkipChar, eOutrangeChar }

Functions

const SUnicodeTranslation * UnicodeToAscii (TUnicode character, const TUnicodeTable *table=NULL, const SUnicodeTranslation *default_translation=NULL)
 Convert Unicode character into ASCII string.
size_t UTF8ToUnicode (const char *utf, TUnicode *unicode)
 Convert UTF8 into Unicode character.
size_t UnicodeToUTF8 (TUnicode unicode, char *buffer, size_t buf_length)
 Convert Unicode character into UTF8.
string UnicodeToUTF8 (TUnicode unicode)
 Convert Unicode character into UTF8.
ssize_t UTF8ToAscii (const char *src, char *dst, size_t dst_len, const SUnicodeTranslation *default_translation, const TUnicodeTable *table=NULL, EConversionResult *result=NULL)
 Convert UTF8 into ASCII character buffer.
string UTF8ToAsciiString (const char *src, const SUnicodeTranslation *default_translation, const TUnicodeTable *table=NULL, EConversionResult *result=NULL)
 Convert UTF8 into ASCII string.
char StringToChar (const string &src, size_t *seq_len=0, bool ascii_table=true, EConversionStatus *status=0)
string StringToAscii (const string &src, bool ascii_table=true)
long StringToCode (const string &src, size_t *seq_len=0, EConversionStatus *status=0)
vector< long > StringToVector (const string &src)
char CodeToChar (const long src, EConversionStatus *status=0)

Variables

const char kOutrangeChar = '?'
const char kSkipChar = '\xFF'


Typedef Documentation

typedef unsigned int TUnicode
 

Definition at line 77 of file unicode.hpp.

typedef SUnicodeTranslation TUnicodePlan[256]
 

Definition at line 75 of file unicode.hpp.

typedef TUnicodePlan* TUnicodeTable[256]
 

Definition at line 76 of file unicode.hpp.


Enumeration Type Documentation

enum EConversionResult
 

Enumerator:
eConvertedFine 
eDefaultTranslationUsed 

Definition at line 62 of file unicode.hpp.

enum EConversionStatus
 

Enumerator:
eSuccess 
eSkipChar 
eOutrangeChar 

Definition at line 64 of file utf8.hpp.

enum ESubstType
 

Types of substitutors.

Enumerator:
eSkip  Unicode to be skipped in translation. Usually it is a combining mark.
eAsIs  Unicodes which should go into the text as is.
eString  String of symbols.
eException  Throw exception (CUtilException, with type eWrongData).
eHTML  HTML tag or, for example, HTML entity.
ePicture  Path to the picture, or maybe picture itself.
eOther  Something else.

Definition at line 50 of file unicode.hpp.


Function Documentation

char CodeToChar ( const long  src,
EConversionStatus *  status = 0 )
 

Definition at line 295 of file utf8.cpp.

References eOutrangeChar, eSkipChar, eSuccess, kOutrangeChar, kSkipChar, RETURN_S, tblTrans, and tblTransA.

Referenced by StringToChar().

string StringToAscii ( const string &  src,
bool  ascii_table = true )
 

Definition at line 187 of file utf8.cpp.

References kSkipChar, and StringToChar().

char StringToChar ( const string &  src,
size_t *  seq_len = 0,
bool  ascii_table = true,
EConversionStatus *  status = 0 )
 

Definition at line 149 of file utf8.cpp.

References CodeToChar(), eOutrangeChar, eSuccess, kOutrangeChar, RETURN_S, and StringToCode().

Referenced by StringToAscii().

long StringToCode ( const string &  src,
size_t *  seq_len = 0,
EConversionStatus *  status = 0 )
 

Definition at line 215 of file utf8.cpp.

References eOutrangeChar, eSkipChar, eSuccess, kOutrangeChar, kSkipChar, and RETURN_LS.

Referenced by StringToChar(), and StringToVector().

vector<long> StringToVector ( const string &  src )
 

Definition at line 268 of file utf8.cpp.

References StringToCode().

const SUnicodeTranslation* UnicodeToAscii ( TUnicode  character,
const TUnicodeTable *  table = NULL,
const SUnicodeTranslation *  default_translation = NULL )
 

Convert Unicode character into ASCII string.

Parameters:
character Character to translate
table Table to use in translation. If no table is specified, the internal default one is used.
default_translation Default translation of unknown Unicode symbols
Returns:
Pointer to substitute structure

Definition at line 96 of file unicode.cpp.

References eException, NCBI_THROW, and SUnicodeTranslation::Type.

Referenced by UTF8ToAscii(), and UTF8ToAsciiString().
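
The TUnicodePlan and TUnicodeTable typedefs suggest a two-level lookup: 256 plans of 256 entries each. The following is a minimal sketch of such a lookup, not the toolkit's code. It assumes that the high byte of a 16-bit code point selects the plan and the low byte selects the entry, and that a missing plan or an out-of-range code point yields the fallback; this page states neither. All `Sketch`-prefixed names and the field layout of the structure are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirrors of the documented types. The field layout is an
// assumption based on the Type and Subst members this page references.
enum ESketchSubstType { eSketchSkip = 0, eSketchAsIs, eSketchString };

struct SSketchTranslation {
    const char*      Subst;  // replacement text (may be null)
    ESketchSubstType Type;   // how the replacement is applied
};

typedef SSketchTranslation TSketchPlan[256];   // one entry per low byte
typedef TSketchPlan*       TSketchTable[256];  // one plan per high byte

// Look up a code point. Assumed scheme: the high byte selects the plan and
// the low byte selects the entry inside it. Returns `fallback` when the code
// point is above U+FFFF or when no plan is installed for its high byte.
const SSketchTranslation* SketchLookup(unsigned int character,
                                       const TSketchTable* table,
                                       const SSketchTranslation* fallback)
{
    assert(table != nullptr);
    if (character > 0xFFFF) {
        return fallback;
    }
    TSketchPlan* plan = (*table)[character >> 8];
    if (plan == nullptr) {
        return fallback;
    }
    return &(*plan)[character & 0xFF];
}
```

With a table whose plan 0 maps U+00E9 to "e", `SketchLookup(0xE9, &table, &fallback)` returns that entry, while U+20AC (no plan installed for high byte 0x20) returns the fallback.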

string UnicodeToUTF8 ( TUnicode  unicode )
 

Convert Unicode character into UTF8.

Parameters:
unicode Unicode character
Returns:
UTF8 buffer as a string

Definition at line 181 of file unicode.cpp.

size_t UnicodeToUTF8 ( TUnicode  unicode,
char *  buffer,
size_t  buf_length )
 

Convert Unicode character into UTF8.

Parameters:
unicode Unicode character
buffer UTF8 buffer to store the result
buf_length UTF8 buffer size
Returns:
Length of the generated UTF8 sequence

Definition at line 189 of file unicode.cpp.

Referenced by UnicodeToUTF8().
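
To illustrate what a function with this signature has to do, here is a minimal sketch of standard UTF-8 encoding of a single code point into a caller-supplied buffer. `SketchUnicodeToUTF8` is a hypothetical stand-in, not the toolkit's implementation. In particular, returning 0 for an invalid code point or a buffer that is too small is an assumption of the sketch; this page does not say how UnicodeToUTF8() reports those cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for UnicodeToUTF8(); not the toolkit's code.
// Writes the UTF-8 encoding of `code` into `buffer` and returns the number
// of bytes written. Returns 0 if the code point is not a valid Unicode
// scalar value or the buffer is too small (assumed convention).
std::size_t SketchUnicodeToUTF8(std::uint32_t code, char* buffer,
                                std::size_t buf_length)
{
    assert(buffer != nullptr);
    // Surrogates and values above U+10FFFF cannot be encoded.
    if (code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) {
        return 0;
    }
    const std::size_t len = code < 0x80 ? 1
                          : code < 0x800 ? 2
                          : code < 0x10000 ? 3 : 4;
    if (buf_length < len) {
        return 0;
    }
    switch (len) {
    case 1:  // 0xxxxxxx
        buffer[0] = static_cast<char>(code);
        break;
    case 2:  // 110xxxxx 10xxxxxx
        buffer[0] = static_cast<char>(0xC0 | (code >> 6));
        buffer[1] = static_cast<char>(0x80 | (code & 0x3F));
        break;
    case 3:  // 1110xxxx 10xxxxxx 10xxxxxx
        buffer[0] = static_cast<char>(0xE0 | (code >> 12));
        buffer[1] = static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        buffer[2] = static_cast<char>(0x80 | (code & 0x3F));
        break;
    default: // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        buffer[0] = static_cast<char>(0xF0 | (code >> 18));
        buffer[1] = static_cast<char>(0x80 | ((code >> 12) & 0x3F));
        buffer[2] = static_cast<char>(0x80 | ((code >> 6) & 0x3F));
        buffer[3] = static_cast<char>(0x80 | (code & 0x3F));
        break;
    }
    return len;
}
```

A four-byte buffer is enough for any valid code point. The sketch does not append a terminating NUL, so a caller that needs a C string must reserve one more byte and add it.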

ssize_t UTF8ToAscii ( const char *  src,
char *  dst,
size_t  dst_len,
const SUnicodeTranslation *  default_translation,
const TUnicodeTable *  table = NULL,
EConversionResult *  result = NULL )
 

Convert UTF8 into ASCII character buffer.

Decode UTF8 buffer and substitute all Unicodes with appropriate symbols or words from dictionary.

Parameters:
src UTF8 buffer to decode
dst Buffer to put the result in
dst_len Length of the destination buffer
default_translation Default translation of unknown Unicode symbols
table Table to use in translation. If no table is specified, the internal default one is used.
result Result of the conversion
Returns:
Length of decoded string or -1 if buffer is too small

Definition at line 223 of file unicode.cpp.

References eAsIs, eConvertedFine, eDefaultTranslationUsed, eSkip, SUnicodeTranslation::Subst, SUnicodeTranslation::Type, UnicodeToAscii(), and UTF8ToUnicode().
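
The return convention described above (output length, or -1 when the destination buffer is too small) can be sketched with a deliberately simplified converter. `SketchUTF8ToAscii` is hypothetical: it skips the translation table entirely and replaces every non-ASCII sequence with a plain string `default_subst`, where the real function takes an SUnicodeTranslation. It also assumes that the output is NUL-terminated and that the terminator is not counted in the returned length, and it returns `long` instead of `ssize_t` to stay portable. None of these details are confirmed by this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical simplification of UTF8ToAscii(); not the toolkit's code.
// ASCII bytes are copied as is; each non-ASCII UTF-8 sequence is replaced
// by `default_subst`. Returns the output length without the terminating
// NUL, or -1 if `dst` cannot hold the result plus the NUL.
long SketchUTF8ToAscii(const char* src, char* dst, std::size_t dst_len,
                       const char* default_subst)
{
    assert(src != nullptr && dst != nullptr && default_subst != nullptr);
    const std::size_t subst_len = std::strlen(default_subst);
    std::size_t out = 0;
    while (*src != '\0') {
        const unsigned char lead = static_cast<unsigned char>(*src);
        if (lead < 0x80) {
            if (out + 1 >= dst_len) {
                return -1;  // no room for this byte plus the NUL
            }
            dst[out++] = *src++;
            continue;
        }
        // Skip the lead byte and any continuation bytes (10xxxxxx).
        // The sequence is not validated in this sketch.
        ++src;
        while ((static_cast<unsigned char>(*src) & 0xC0) == 0x80) {
            ++src;
        }
        if (out + subst_len >= dst_len) {
            return -1;  // no room for the substitute plus the NUL
        }
        std::memcpy(dst + out, default_subst, subst_len);
        out += subst_len;
    }
    if (out >= dst_len) {
        return -1;  // only reachable for empty input with dst_len == 0
    }
    dst[out] = '\0';
    return static_cast<long>(out);
}
```

A caller would check for -1 before using `dst`, for example by retrying with a larger buffer.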

string UTF8ToAsciiString ( const char *  src,
const SUnicodeTranslation *  default_translation,
const TUnicodeTable *  table = NULL,
EConversionResult *  result = NULL )
 

Convert UTF8 into ASCII string.

Decode UTF8 buffer and substitute all Unicodes with appropriate symbols or words from dictionary.

Parameters:
src UTF8 buffer to decode
default_translation Default translation of unknown Unicode symbols
table Table to use in translation. If no table is specified, the internal default one is used.
result Result of the conversion
Returns:
String with decoded text

Definition at line 291 of file unicode.cpp.

References eAsIs, eConvertedFine, eDefaultTranslationUsed, eSkip, SUnicodeTranslation::Subst, SUnicodeTranslation::Type, UnicodeToAscii(), and UTF8ToUnicode().

size_t UTF8ToUnicode ( const char *  utf,
TUnicode *  unicode )
 

Convert UTF8 into Unicode character.

Parameters:
utf Start of UTF8 character buffer
unicode Pointer to Unicode character to store the result in
Returns:
Length of the translated UTF8 sequence, or 0 in case of error.

Definition at line 150 of file unicode.cpp.

Referenced by UTF8ToAscii(), and UTF8ToAsciiString().
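
The contract above (bytes consumed on success, 0 on error) can be illustrated with a minimal sketch of standard UTF-8 decoding. `SketchUTF8ToUnicode` is a hypothetical stand-in, not the toolkit's implementation. It assumes a NUL-terminated input buffer, and its rejection of overlong forms, surrogates, and values above U+10FFFF is a choice of the sketch; the toolkit function may be more or less strict.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for UTF8ToUnicode(); not the toolkit's code.
// Decodes one UTF-8 sequence starting at `utf` (NUL-terminated buffer
// assumed), stores the code point in *unicode, and returns the number of
// bytes consumed, or 0 on a malformed sequence.
std::size_t SketchUTF8ToUnicode(const char* utf, std::uint32_t* unicode)
{
    assert(utf != nullptr && unicode != nullptr);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf);
    std::size_t len;
    std::uint32_t code;
    if (p[0] < 0x80)                { len = 1; code = p[0]; }
    else if ((p[0] & 0xE0) == 0xC0) { len = 2; code = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; code = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; code = p[0] & 0x07; }
    else return 0;  // stray continuation byte or invalid lead byte

    for (std::size_t i = 1; i < len; ++i) {
        // Every trailing byte must look like 10xxxxxx. A NUL terminator
        // fails this test, so a truncated sequence is rejected before the
        // loop can read past the end of the buffer.
        if ((p[i] & 0xC0) != 0x80) {
            return 0;
        }
        code = (code << 6) | (p[i] & 0x3F);
    }
    // Smallest code point that legitimately needs `len` bytes.
    static const std::uint32_t kMin[] = { 0, 0, 0x80, 0x800, 0x10000 };
    if (code < kMin[len] || code > 0x10FFFF ||
        (code >= 0xD800 && code <= 0xDFFF)) {
        return 0;  // overlong form, out of range, or surrogate
    }
    *unicode = code;
    return len;
}
```

Because the return value is the number of bytes consumed, a caller can walk a whole string by advancing its pointer by that amount after each call and stopping on 0.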


Variable Documentation

const char kOutrangeChar = '?'
 

Definition at line 54 of file utf8.hpp.

Referenced by CodeToChar(), StringToChar(), and StringToCode().

const char kSkipChar = '\xFF'
 

Definition at line 61 of file utf8.hpp.

Referenced by CodeToChar(), StringToAscii(), and StringToCode().


Generated on Wed Dec 9 08:14:21 2009 for NCBI C++ ToolKit by  doxygen 1.4.6
Modified on Wed Dec 09 08:20:18 2009 by modify_doxy.py rev. 173732