libUEMF
A portable library for reading and writing WMF, EMF and EMF+ files
uemf_utf.c File Reference

Functions for manipulating UTF and various types of text.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <iconv.h>
#include <wchar.h>
#include <errno.h>
#include <limits.h>
#include <math.h>
#include "uemf_utf.h"

Functions

void wchar8show (const char *src)
 Dump a UTF8 string. Not for use in production code.

void wchar16show (const uint16_t *src)
 Dump a UTF16 string. Not for use in production code.

void wchar32show (const uint32_t *src)
 Dump a UTF32 string. Not for use in production code.

void wchartshow (const wchar_t *src)
 Dump a wchar_t string. Not for use in production code.

size_t wchar16len (const uint16_t *src)
 Find the number of (storage) characters in a 16 bit character string, not including terminator.

size_t wchar32len (const uint32_t *src)
 Find the number of (storage) characters in a 32 bit character string, not including terminator.

void wchar16strncpy (uint16_t *dst, const uint16_t *src, size_t nchars)
 Strncpy for wchar16 (UTF16).

void wchar16strncpypad (uint16_t *dst, const uint16_t *src, size_t nchars)
 Fill the output string with N characters, if the input string is shorter than N, pad with nulls.

uint16_t * U_Utf32leToUtf16le (const uint32_t *src, size_t max, size_t *len)
 Convert a UTF32LE string to a UTF16LE string.

uint32_t * U_Utf16leToUtf32le (const uint16_t *src, size_t max, size_t *len)
 Convert a UTF16LE string to a UTF32LE string.

uint32_t * U_Latin1ToUtf32le (const char *src, size_t max, size_t *len)
 Convert a Latin1 string to a UTF32LE string.

uint32_t * U_Utf8ToUtf32le (const char *src, size_t max, size_t *len)
 Convert a UTF8 string to a UTF32LE string.

char * U_Utf32leToUtf8 (const uint32_t *src, size_t max, size_t *len)
 Convert a UTF32LE string to a UTF8 string.

uint16_t * U_Utf8ToUtf16le (const char *src, size_t max, size_t *len)
 Convert a UTF-8 string to a UTF16-LE string.

char * U_Utf16leToUtf8 (const uint16_t *src, size_t max, size_t *len)
 Convert a UTF16LE string to a UTF8 string.

char * U_Utf16leToLatin1 (const uint16_t *src, size_t max, size_t *len)
 Convert a UTF16LE string to a LATIN1 string.

uint16_t U_Utf16le (const uint16_t src)
 Put a single 16 bit character into UTF-16LE form.

char * U_Utf8ToLatin1 (const char *src, size_t max, size_t *len)
 Convert a UTF8 string to a Latin1 string.

char * U_Latin1ToUtf8 (const char *src, size_t max, size_t *len)
 Convert a Latin1 string to a UTF8 string.

int U_Utf16leEdit (uint16_t *src, uint16_t find, uint16_t replace)
 Single character replacement in a UTF-16LE string.

char * U_strdup (const char *s)
 strdup for when strict C99 compliance is enforced.
 

Detailed Description

Functions for manipulating UTF and various types of text.

Compile with "U_VALGRIND" defined to enable code which lets valgrind check each record for uninitialized data.

Compile with "SOL8" defined for Solaris 8 or 9 (Sparc).

Function Documentation

uint32_t* U_Latin1ToUtf32le ( const char *  src,
size_t  max,
size_t *  len 
)

Convert a Latin1 string to a UTF32LE string.

Returns
pointer to new string or NULL if it fails
Parameters
src  Latin1 string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

U_EMR_EXTTEXTOUTA records are "8 bit ASCII". In theory that is ASCII stored in 8 bit characters, but numerous applications store Latin1 in them, and some may store UTF-8. Since very few Latin1 strings that contain non-ASCII characters are also valid UTF-8 strings, call U_Utf8ToUtf32le first, and if it fails, then call this function.
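
The "try UTF-8 first, then fall back to Latin1" rule works because a Latin1 byte of 0x80 or above almost never forms a well-formed UTF-8 sequence by accident. The sketch below illustrates that test in isolation. looks_like_utf8() is a hypothetical helper written for this example, not part of libUEMF, and it makes no claim about how U_Utf8ToUtf32le validates its input.

```c
#include <stddef.h>

/* Hypothetical helper (not a libUEMF function): returns 1 if the
   NUL-terminated byte string is well-formed UTF-8, 0 otherwise. */
int looks_like_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        int extra;              /* continuation bytes expected */
        unsigned long cp;       /* decoded code point */

        if (*p < 0x80)                { extra = 0; cp = *p; }
        else if ((*p & 0xE0) == 0xC0) { extra = 1; cp = *p & 0x1F; }
        else if ((*p & 0xF0) == 0xE0) { extra = 2; cp = *p & 0x0F; }
        else if ((*p & 0xF8) == 0xF0) { extra = 3; cp = *p & 0x07; }
        else return 0;          /* stray continuation or invalid lead byte */
        p++;

        for (int i = 0; i < extra; i++, p++) {
            /* a NUL here also fails the test, so we never read past the end */
            if ((*p & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (*p & 0x3F);
        }

        /* reject overlong encodings, surrogates, and values above U+10FFFF */
        if (extra == 1 && cp < 0x80) return 0;
        if (extra == 2 && cp < 0x800) return 0;
        if (extra == 3 && cp < 0x10000) return 0;
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) return 0;
    }
    return 1;
}
```

For the text "café", the UTF-8 bytes "caf\xC3\xA9" pass this check while the Latin1 bytes "caf\xE9" do not, so the latter would be routed to the Latin1 converter. With libUEMF itself the same decision is made by calling U_Utf8ToUtf32le(src, 0, &len) and, if that returns NULL, calling U_Latin1ToUtf32le(src, 0, &len).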

char* U_Latin1ToUtf8 ( const char *  src,
size_t  max,
size_t *  len 
)

Convert a Latin1 string to a UTF8 string.

Returns
pointer to new string or NULL if it fails
Parameters
src  Latin1 string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

WMF uses Latin1; others use UTF-8. Every Latin1 string should be convertible to UTF-8.
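
The reason every Latin1 string converts is that Latin1 bytes map directly onto code points U+0000 to U+00FF, and each of those has a UTF-8 encoding of one or two bytes. A minimal sketch of that mapping follows. latin1_to_utf8_sketch() is a hypothetical stand-in written for illustration; it is not the libUEMF implementation, and it omits the max parameter for brevity.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in (not libUEMF code). Returns a malloc'd UTF-8
   string that the caller must free, or NULL on allocation failure.
   *len receives the byte count, not including the terminator. */
char *latin1_to_utf8_sketch(const char *src, size_t *len)
{
    size_t n = strlen(src);
    size_t out = 0;
    char *dst = malloc(2 * n + 1);      /* worst case: every byte doubles */

    if (!dst) return NULL;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80) {
            dst[out++] = (char)c;                     /* ASCII: unchanged */
        } else {
            dst[out++] = (char)(0xC0 | (c >> 6));     /* lead byte 0xC2 or 0xC3 */
            dst[out++] = (char)(0x80 | (c & 0x3F));   /* continuation byte */
        }
    }
    dst[out] = '\0';
    if (len) *len = out;
    return dst;
}
```

The Latin1 byte 0xE9 ("é") becomes the two bytes 0xC3 0xA9, so a four-byte Latin1 "café" yields a five-byte UTF-8 string.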

char* U_strdup ( const char *  s)

strdup for when strict C99 compliance is enforced

Returns
duplicate string or NULL on error
Parameters
s  string to duplicate

uint16_t U_Utf16le ( const uint16_t  src)

Put a single 16 bit character into UTF-16LE form.

Used in conjunction with U_Utf16leEdit(), because the in-memory representation of the character would otherwise depend on machine endianness.

Returns
UTF16LE representation of the character.
Parameters
src  16 bit character

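
To make the endianness point concrete, the sketch below produces a uint16_t whose bytes in memory are always low byte first, whatever the host byte order. utf16le_sketch() is a hypothetical equivalent written for this example and is not taken from the libUEMF source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical equivalent (not the libUEMF source): return a value whose
   in-memory byte order is little-endian on any host. */
uint16_t utf16le_sketch(uint16_t c)
{
    unsigned char bytes[2];
    uint16_t out;

    bytes[0] = (unsigned char)(c & 0xFF);   /* low byte is stored first */
    bytes[1] = (unsigned char)(c >> 8);     /* high byte is stored second */
    memcpy(&out, bytes, sizeof out);        /* reinterpret without aliasing issues */
    return out;
}
```

On a little-endian host the result equals the input; on a big-endian host the two bytes are swapped. Either way the value can be compared directly against the elements of a UTF-16LE buffer.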
int U_Utf16leEdit ( uint16_t *  src,
uint16_t  find,
uint16_t  replace 
)

Single character replacement in a UTF-16LE string.

Used solely for the Description field, whose embedded nulls make it difficult to manipulate. Build the string with some other character in place of each null, then swap that character for null with this function.

Returns
number of substitutions, or -1 if src is not defined
Parameters
src      UTF16LE string to edit
find     character to replace
replace  substitute character

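
The placeholder technique described above can be sketched as follows. utf16_edit_sketch() is a hypothetical stand-in that mirrors the documented return values. It compares host-order values for simplicity, whereas real code would first pass find and replace through U_Utf16le(). The '|' placeholder and the "A|B||" layout are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in (not libUEMF code): replace every occurrence of
   find with replace, scanning up to the first original terminator.
   Returns the number of substitutions, or -1 if src is NULL. */
int utf16_edit_sketch(uint16_t *src, uint16_t find, uint16_t replace)
{
    int count = 0;

    if (!src) return -1;
    /* The pointer advances before the next terminator test, so a
       character just replaced by 0 does not end the scan early. */
    for (; *src; src++) {
        if (*src == find) {
            *src = replace;
            count++;
        }
    }
    return count;
}
```

A description assembled as "A|B||" with ordinary string tools becomes "A\0B\0\0" after one call with find = '|' and replace = 0, and the call reports three substitutions.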
char* U_Utf16leToLatin1 ( const uint16_t *  src,
size_t  max,
size_t *  len 
)

Convert a UTF16LE string to a LATIN1 string.

Returns
pointer to new Latin1 string or NULL if it fails
Parameters
src  UTF16LE string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

uint32_t* U_Utf16leToUtf32le ( const uint16_t *  src,
size_t  max,
size_t *  len 
)

Convert a UTF16LE string to a UTF32LE string.

Returns
pointer to new string or NULL if it fails
Parameters
src  UTF16LE string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

char* U_Utf16leToUtf8 ( const uint16_t *  src,
size_t  max,
size_t *  len 
)

Convert a UTF16LE string to a UTF8 string.

Returns
pointer to new UTF8 string or NULL if it fails
Parameters
src  UTF16LE string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

uint16_t* U_Utf32leToUtf16le ( const uint32_t *  src,
size_t  max,
size_t *  len 
)

Convert a UTF32LE string to a UTF16LE string.

Returns
pointer to new string or NULL if it fails
Parameters
src  UTF32LE string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

char* U_Utf32leToUtf8 ( const uint32_t *  src,
size_t  max,
size_t *  len 
)

Convert a UTF32LE string to a UTF8 string.

Returns
pointer to new string or NULL if it fails
Parameters
src  UTF32LE string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

char* U_Utf8ToLatin1 ( const char *  src,
size_t  max,
size_t *  len 
)

Convert a UTF8 string to a Latin1 string.

Returns
pointer to new string or NULL if it fails
Parameters
src  UTF8 string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

WMF uses Latin1; others use UTF-8. Only UTF-8 strings whose characters all exist in Latin1 can be converted.
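
In practice "can be converted" means every code point is U+00FF or below, which in UTF-8 is either a single ASCII byte or a two-byte sequence led by 0xC2 or 0xC3. The sketch below shows that restriction. utf8_to_latin1_sketch() is a hypothetical stand-in with its own caller-supplied buffer interface; it is not the libUEMF function and does not reproduce its allocation behaviour.

```c
#include <stddef.h>

/* Hypothetical stand-in (not libUEMF code). Writes Latin1 bytes plus a
   terminator into dst (capacity cap bytes). Returns the Latin1 length,
   or -1 if the input is malformed, contains a code point above U+00FF,
   or does not fit in dst. */
long utf8_to_latin1_sketch(const char *src, char *dst, size_t cap)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t out = 0;

    if (cap == 0) return -1;
    while (*p) {
        unsigned int cp;

        if (*p < 0x80) {
            cp = *p++;                                  /* ASCII */
        } else if ((*p == 0xC2 || *p == 0xC3) && (p[1] & 0xC0) == 0x80) {
            cp = ((unsigned int)(*p & 0x1F) << 6) | (p[1] & 0x3F);
            p += 2;                                     /* U+0080..U+00FF */
        } else {
            return -1;                                  /* not representable */
        }
        if (out + 1 >= cap) return -1;                  /* keep room for NUL */
        dst[out++] = (char)cp;
    }
    dst[out] = '\0';
    return (long)out;
}
```

The UTF-8 bytes for "café" convert to four Latin1 bytes, while the euro sign (U+20AC, bytes 0xE2 0x82 0xAC) is rejected because Latin1 has no such character.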

uint16_t* U_Utf8ToUtf16le ( const char *  src,
size_t  max,
size_t *  len 
)

Convert a UTF-8 string to a UTF16-LE string.

Returns
pointer to new string or NULL if it fails
Parameters
src  UTF8 string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

uint32_t* U_Utf8ToUtf32le ( const char *  src,
size_t  max,
size_t *  len 
)

Convert a UTF8 string to a UTF32LE string.

Returns
pointer to new string or NULL if it fails
Parameters
src  UTF8 string to convert
max  number of characters to convert, if 0, until terminator
len  number of characters in new string, NOT including terminator

size_t wchar16len ( const uint16_t *  src)

Find the number of (storage) characters in a 16 bit character string, not including terminator.

Parameters
src  string to examine

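
"Storage characters" here means 16 bit units rather than code points, so a character outside the Basic Multilingual Plane, which UTF-16 stores as a surrogate pair, counts as two. The sketch below shows that counting rule with a hypothetical helper, u16_storage_len(), which is not the libUEMF source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libUEMF code): count 16 bit units up to,
   but not including, the terminating 0. */
size_t u16_storage_len(const uint16_t *src)
{
    size_t n = 0;

    while (src[n]) n++;     /* each surrogate half is counted separately */
    return n;
}
```

For the sequence 0xD83D 0xDE00 'a' 0, which encodes U+1F600 followed by "a", the count is 3 even though the text holds only two code points.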
void wchar16show ( const uint16_t *  src)

Dump a UTF16 string. Not for use in production code.

Parameters
src  string to examine

void wchar16strncpy ( uint16_t *  dst,
const uint16_t *  src,
size_t  nchars 
)

Strncpy for wchar16 (UTF16).

Parameters
dst     destination (already allocated)
src     source
nchars  number of characters to copy

void wchar16strncpypad ( uint16_t *  dst,
const uint16_t *  src,
size_t  nchars 
)

Fill the output string with N characters, if the input string is shorter than N, pad with nulls.

Parameters
dst     destination (already allocated)
src     source
nchars  number of characters to copy

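
The padding behaviour suits fixed-width fields, where exactly N units must be written. A minimal sketch of the described semantics follows. u16_copy_pad_sketch() is a hypothetical stand-in, not the libUEMF source, and it assumes dst has room for at least nchars units.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in (not libUEMF code): write exactly nchars units
   to dst, copying from src and padding with 0 once src runs out. */
void u16_copy_pad_sketch(uint16_t *dst, const uint16_t *src, size_t nchars)
{
    size_t i = 0;

    for (; i < nchars && src[i]; i++) dst[i] = src[i];   /* copy phase */
    for (; i < nchars; i++) dst[i] = 0;                  /* pad phase */
}
```

As with strncpy, if src holds nchars or more units the output is not null-terminated, so the caller has to account for that when the field is read back.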
size_t wchar32len ( const uint32_t *  src)

Find the number of (storage) characters in a 32 bit character string, not including terminator.

Parameters
src  string to examine

void wchar8show ( const char *  src)

Dump a UTF8 string. Not for use in production code.

Parameters
src  string to examine

void wchartshow ( const wchar_t *  src)

Dump a wchar_t string. Not for use in production code.

Parameters
src  string to examine