Embedded/Character sets 2009. 3. 4. 16:40

FreeType tutorial Step 1 (rough translation)

FreeType 2 Tutorial
Step 1 — simple glyph loading

© 2003, 2006, 2007 David Turner (david@freetype.org)

Introduction
This is the first section of the FreeType 2 tutorial. It will teach you how to:

initialize the library

open a font file by creating a new face object

select a character size in points or in pixels

load a single glyph image and convert it to a bitmap

render a very simple string of text

render a rotated string of text easily

1. Header files
The following are instructions required to compile an application that uses the FreeType 2 library.

Locate the FreeType 2 `include` directory.

You have to add it to your compilation include path.

Note that on Unix systems, you can now run the `freetype-config` script with the `--cflags` option to retrieve the appropriate compilation flags. This script can also be used to check the version of the library that is installed on your system, as well as the required librarian and linker flags.

Include the file named `ft2build.h`.

It contains various macro declarations that are later used to `#include` the appropriate public FreeType 2 header files.

Include the main FreeType 2 API header file.

You should do that using the macro `FT_FREETYPE_H`, as in the following example:

  #include <ft2build.h>
  #include FT_FREETYPE_H

`FT_FREETYPE_H` is a special macro defined in the file `ftheader.h`. That file contains some installation-specific macros to name other public header files of the FreeType 2 API.

See the FreeType 2 API Reference for a complete listing of the header macros.

The use of macros in `#include` statements is ANSI-compliant. It is used for several reasons:

It avoids some painful conflicts with the FreeType 1.x public header files.

The macro names are not limited to the DOS 8.3 file naming limit; names like `FT_MULTIPLE_MASTERS_H` or `FT_SFNT_NAMES_H` are a lot more readable and explanatory than the real file names `ftmm.h` and `ftsnames.h`.

It allows special installation tricks that will not be discussed here.

NOTE: Starting with FreeType 2.1.6, the old header file inclusion scheme is no longer supported. This means that you now get an error if you do something like the following:

  #include <freetype/freetype.h>
  #include <freetype/ftglyph.h>
  ...

2. Initialize the library

Simply create a variable of type FT_Library named, for example, library, and call the function FT_Init_FreeType as in

  #include <ft2build.h>
  #include FT_FREETYPE_H

  FT_Library  library;

  ...

  error = FT_Init_FreeType( &library );
  if ( error )
  {
    ... an error occurred during library initialization ...
  }

This function is in charge of the following:

It creates a new instance of the FreeType 2 library, and sets the handle library to it.
It loads each module that FreeType knows about in the library. Among others, your new library object is able to handle TrueType, Type 1, CID-keyed & OpenType/CFF fonts gracefully.

As you can see, the function returns an error code, like most others in the FreeType API. An error code of 0 always means that the operation was successful; otherwise, the value describes the error, and library is set to NULL.

3. Load a font face

a. From a font file

Create a new face object by calling FT_New_Face. A face describes a given typeface and style. For example, ‘Times New Roman Regular’ and ‘Times New Roman Italic’ correspond to two different faces.

  FT_Library  library;   /* handle to library     */
  FT_Face     face;      /* handle to face object */

  error = FT_Init_FreeType( &library );
  if ( error ) { ... }

  error = FT_New_Face( library,
                       "/usr/share/fonts/truetype/arial.ttf",
                       0,
                       &face );
  if ( error == FT_Err_Unknown_File_Format )
  {
    ... the font file could be opened and read, but it appears
    ... that its font format is unsupported
  }
  else if ( error )
  {
    ... another error code means that the font file could not
    ... be opened or read, or simply that it is broken...
  }

As you can certainly imagine, FT_New_Face opens a font file, then tries to extract one face from it. Its parameters are

`library`	A handle to the FreeType library instance where the face object is created.
`filepathname`	The font file pathname (a standard C string).
`face_index`	Certain font formats allow several font faces to be embedded in a single file. This index tells which face you want to load. An error will be returned if its value is too large. Index 0 always works though.
`face`	A pointer to the handle that will be set to describe the new face object. It is set to NULL in case of error.

To know how many faces a given font file contains, simply load its first face (that is, face_index should be set to zero), then check the value of face->num_faces, which indicates how many faces are embedded in the font file.

b. From memory

In the case where you have already loaded the font file into memory, you can similarly create a new face object for it by calling FT_New_Memory_Face as in

  FT_Library  library;   /* handle to library     */
  FT_Face     face;      /* handle to face object */

  error = FT_Init_FreeType( &library );
  if ( error ) { ... }

  error = FT_New_Memory_Face( library,
                              buffer,    /* first byte in memory */
                              size,      /* size in bytes        */
                              0,         /* face_index           */
                              &face );
  if ( error ) { ... }

As you can see, FT_New_Memory_Face simply takes a pointer to the font file buffer and its size in bytes instead of a file pathname. Other than that, it has exactly the same semantics as FT_New_Face.

Note that you must not deallocate the memory before calling FT_Done_Face.

c. From other sources (compressed files, network, etc.)

There are cases where using a file pathname or preloading the file into memory is simply not sufficient. With FreeType 2, it is possible to provide your own implementation of i/o routines.

This is done through the FT_Open_Face function, which can be used to open a new font face with a custom input stream, select a specific driver for opening, or even pass extra parameters to the font driver when creating the object. We advise you to refer to the FreeType 2 reference manual in order to learn how to use it.

4. Accessing face content

A face object models all information that globally describes the face. Usually, this data can be accessed directly by dereferencing a handle, as in face->num_glyphs.

The complete list of available fields is in the FT_FaceRec structure description. However, we describe a few of them here in more detail:

`num_glyphs`	This variable gives the number of glyphs available in the font face. A glyph is simply a character image. It doesn't necessarily correspond to a character code though.
`flags`	A 32-bit integer containing bit flags used to describe some face properties. For example, the flag `FT_FACE_FLAG_SCALABLE` is used to indicate that the face's font format is scalable and that glyph images can be rendered for all character pixel sizes. For more information on face flags, please read the FreeType 2 API Reference.
`units_per_EM`	This field is only valid for scalable formats (it is set to 0 otherwise). It indicates the number of font units covered by the EM.
`num_fixed_sizes`	This field gives the number of embedded bitmap strikes in the current face. A strike is simply a series of glyph images for a given character pixel size. For example, a font face could include strikes for pixel sizes 10, 12 and 14. Note that even scalable font formats can have embedded bitmap strikes!
`fixed_sizes`	A pointer to an array of `FT_Bitmap_Size` elements. Each `FT_Bitmap_Size` indicates the horizontal and vertical character pixel sizes for each of the strikes that are present in the face. Note that, generally speaking, these are not the cell size of the bitmap strikes.
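
As a rough illustration of what units_per_EM is used for, the sketch below converts a distance measured in font units into pixels (in 1/64th of a pixel) for a given character pixel size. The function name and its parameters are hypothetical and not part of FreeType, which keeps its own precomputed scale factors and rounding rules; this only shows the proportion involved.

```c
#include <assert.h>

/* Hypothetical helper, not a FreeType function.
 * font_units:   a distance in font units (e.g. taken from an outline)
 * units_per_em: the face's units_per_EM value (non-zero for scalable fonts)
 * ppem:         the character pixel size (pixels per EM)
 * Returns the distance in 1/64th of a pixel, truncated toward zero. */
long font_units_to_pixels_26_6(long font_units, long units_per_em, long ppem)
{
    assert(units_per_em > 0);  /* 0 means the format is not scalable */

    /* pixels = font_units * ppem / units_per_em, times 64 for 1/64th units */
    return (font_units * ppem * 64) / units_per_em;
}
```

For a face with 2048 units per EM rendered at 16 pixels, a distance of 1024 font units comes out as 512, that is 8 pixels.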

5. Setting the current pixel size

FreeType 2 uses size objects to model all information related to a given character size for a given face. For example, a size object will hold the value of certain metrics like the ascender or text height, expressed in 1/64th of a pixel, for a character size of 12 points.

When the FT_New_Face function is called (or one of its cousins), it automatically creates a new size object for the returned face. This size object is directly accessible as face->size.

NOTE: A single face object can deal with one or more size objects at a time; however, this is something that few programmers really need to do. We have thus decided to simplify the API for the most common use (i.e., one size per face) while keeping this feature available through additional functions.

When a new face object is created, all elements are set to 0 during initialization. To populate the structure with sensible values, simply call FT_Set_Char_Size. Here is an example where the character size is set to 16pt for a 300×300dpi device:

  error = FT_Set_Char_Size(
            face,    /* handle to face object           */
            0,       /* char_width in 1/64th of points  */
            16*64,   /* char_height in 1/64th of points */
            300,     /* horizontal device resolution    */
            300 );   /* vertical device resolution      */

Some notes:

The character widths and heights are specified in 1/64th of points. A point is a physical distance, equaling 1/72th of an inch. Normally, it is not equivalent to a pixel.
The horizontal and vertical device resolutions are expressed in dots-per-inch, or dpi. Normal values are 72 or 96 dpi for display devices like the screen. The resolution is used to compute the character pixel size from the character point size.
A value of 0 for the character width means ‘same as character height’, a value of 0 for the character height means ‘same as character width’. Otherwise, it is possible to specify different character widths and heights.
A value of 0 for the horizontal resolution means ‘same as vertical resolution’, a value of 0 for the vertical resolution means ‘same as horizontal resolution’. If both values are zero, 72 dpi is used for both dimensions.
The first argument is a handle to a face object, not a size object.
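
The relation between point size, resolution and pixel size described in these notes can be sketched as follows. The helper is hypothetical (it is not a FreeType function), handles a single dimension, and rounds to the nearest 1/64th of a pixel, which is an assumption; FreeType's own rounding may differ.

```c
#include <assert.h>

/* Hypothetical helper, not a FreeType function.
 * char_size_26_6: character size in 1/64th of points (non-negative)
 * dpi:            device resolution; 0 is treated as 72 dpi here
 * Returns the character pixel size in 1/64th of a pixel. */
long char_size_to_pixel_size_26_6(long char_size_26_6, unsigned int dpi)
{
    assert(char_size_26_6 >= 0);  /* the rounding below assumes this */

    if (dpi == 0)
        dpi = 72;

    /* one point is 1/72 of an inch, so pixels = points * dpi / 72;
     * adding 36 before dividing rounds to the nearest unit */
    return (char_size_26_6 * (long)dpi + 36) / 72;
}
```

With the values used above (16pt at 300dpi), the result is 4267, roughly 66.7 pixels; 12pt at 96dpi gives exactly 16 pixels.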

This function computes the character pixel size that corresponds to the character width and height and device resolutions. However, if you want to specify the pixel sizes yourself, you can simply call FT_Set_Pixel_Sizes, as in

  error = FT_Set_Pixel_Sizes(
            face,   /* handle to face object */
            0,      /* pixel_width           */
            16 );   /* pixel_height          */

This example will set the character pixel sizes to 16×16 pixels. As previously, a value of 0 for one of the dimensions means ‘same as the other’.

Note that both functions return an error code. Usually, an error occurs with a fixed-size font format (like FNT or PCF) when trying to set the pixel size to a value that is not listed in the face->fixed_sizes array.

6. Loading a glyph image

a. Converting a character code into a glyph index

Usually, an application wants to load a glyph image based on its character code, which is a unique value that defines the character for a given encoding. For example, the character code 65 represents the ‘A’ in ASCII encoding.

A face object contains one or more tables, called charmaps, that are used to convert character codes to glyph indices. For example, most TrueType fonts contain two charmaps. One is used to convert Unicode character codes to glyph indices, the other is used to convert Apple Roman encoding into glyph indices. Such fonts can then be used both on Windows (which uses Unicode) and Macintosh (which uses Apple Roman). Note also that a given charmap might not map to all the glyphs present in the font.

By default, when a new face object is created, it selects a Unicode charmap. If the font doesn't contain such a charmap, FreeType tries to emulate one based on glyph names. Note that the emulation may miss glyphs if glyph names are non-standard. For some fonts, including symbol fonts and (older) fonts for Asian scripts, no Unicode emulation is possible at all.

We will describe later how to look for specific charmaps in a face. For now, we will assume that the face contains at least a Unicode charmap that was selected during a call to FT_New_Face. To convert a Unicode character code to a font glyph index, we use FT_Get_Char_Index, as in

glyph_index = FT_Get_Char_Index( face, charcode );

This will look up the glyph index corresponding to the given charcode in the charmap that is currently selected for the face. If no charmap was selected, the function simply returns the charcode.

Note that this is one of the rare FreeType functions that do not return an error code. However, when a given character code has no glyph image in the face, the value 0 is returned. By convention, it always corresponds to a special glyph image called the missing glyph, which is commonly displayed as a box or a space.

b. Loading a glyph from the face

Once you have a glyph index, you can load the corresponding glyph image. The latter can be stored in various formats within the font file. For fixed-size formats like FNT or PCF, each image is a bitmap. Scalable formats like TrueType or Type 1 use vectorial shapes, named outlines, to describe each glyph. Some formats may have even more exotic ways of representing glyphs (e.g., MetaFont, although this format is not supported). Fortunately, FreeType 2 is flexible enough to support any kind of glyph format through a simple API.

The glyph image is always stored in a special object called a glyph slot. As its name suggests, a glyph slot is simply a container that is able to hold one glyph image at a time, be it a bitmap, an outline, or something else. Each face object has a single glyph slot object that can be accessed as face->glyph. Its fields are explained by the FT_GlyphSlotRec structure documentation.

Loading a glyph image into the slot is performed by calling FT_Load_Glyph as in

  error = FT_Load_Glyph(
            face,          /* handle to face object */
            glyph_index,   /* glyph index           */
            load_flags );  /* load flags, see below */

The load_flags value is a set of bit flags used to indicate some special operations. The default value FT_LOAD_DEFAULT is 0.

This function will try to load the corresponding glyph image from the face:

If a bitmap is found for the corresponding glyph and pixel size, it will be loaded into the slot. Embedded bitmaps are always favored over native image formats, because we assume that they are higher-quality versions of the same glyph. This can be changed by using the FT_LOAD_NO_BITMAP flag.
Otherwise, a native image for the glyph will be loaded. It will also be scaled to the current pixel size, as well as hinted for certain formats like TrueType and Type 1.

The field face->glyph->format describes the format used to store the glyph image in the slot. If it is not FT_GLYPH_FORMAT_BITMAP, one can immediately convert it to a bitmap through FT_Render_Glyph as in:

  error = FT_Render_Glyph(
            face->glyph,    /* glyph slot  */
            render_mode );  /* render mode */

The parameter render_mode specifies how to render the glyph image. Set it to FT_RENDER_MODE_NORMAL, the default, to render a high-quality anti-aliased (256 gray levels) bitmap. You can alternatively use FT_RENDER_MODE_MONO if you want to generate a 1-bit monochrome bitmap.
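
A 1-bit monochrome bitmap packs eight pixels into each byte, with the leftmost pixel of each group in the most significant bit. The sketch below reads a single pixel from such a buffer. The function name is hypothetical, and it assumes a positive pitch, meaning rows are stored from top to bottom.

```c
#include <assert.h>

/* Hypothetical helper, not a FreeType function.
 * buffer: first byte of the topmost bitmap row
 * pitch:  number of bytes per row (assumed positive in this sketch)
 * x, y:   pixel column and row, counted from the top-left corner
 * Returns 1 if the pixel is set, 0 otherwise. */
int mono_pixel_is_set(const unsigned char *buffer, int pitch, int x, int y)
{
    assert(pitch > 0 && x >= 0 && y >= 0);

    /* x >> 3 selects the byte, 0x80 >> (x & 7) selects the bit, MSB first */
    return (buffer[y * pitch + (x >> 3)] & (0x80 >> (x & 7))) != 0;
}
```

The caller is responsible for keeping x and y within the bitmap's width and number of rows.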

Once you have a bitmapped glyph image, you can access it directly through glyph->bitmap (a simple bitmap descriptor), and position it through glyph->bitmap_left and glyph->bitmap_top.

Note that bitmap_left is the horizontal distance from the current pen position to the leftmost border of the glyph bitmap, while bitmap_top is the vertical distance from the pen position (on the baseline) to the topmost border of the glyph bitmap. It is positive to indicate an upwards distance.

The next section will give more details on the contents of a glyph slot and how to access specific glyph information (including metrics).

c. Using other charmaps

As said before, when a new face object is created, it will look for a Unicode charmap and select it. The currently selected charmap is accessed via face->charmap. This field is NULL when no charmap is selected, which typically happens when you create a new FT_Face object from a font file that doesn't contain a Unicode charmap (which is rather infrequent today).

There are two ways to select a different charmap with FreeType 2. The easiest is when the encoding you need already has a corresponding enumeration defined in FT_FREETYPE_H, for example FT_ENCODING_BIG5. In this case, you can simply call FT_Select_CharMap as in:

  error = FT_Select_CharMap(
            face,                /* target face object */
            FT_ENCODING_BIG5 );  /* encoding           */

Another way is to manually parse the list of charmaps for the face; this is accessible through the fields num_charmaps and charmaps (notice the ‘s’) of the face object. As you could expect, the first is the number of charmaps in the face, while the second is a table of pointers to the charmaps embedded in the face.

Each charmap has a few visible fields used to describe it more precisely. Mainly, one will look at charmap->platform_id and charmap->encoding_id that define a pair of values that can be used to describe the charmap in a rather generic way.

Each value pair corresponds to a given encoding. For example, the pair (3,1) corresponds to Unicode. The list is defined in the TrueType specification but you can also use the file FT_TRUETYPE_IDS_H which defines several helpful constants to deal with them.
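
To make the idea of value pairs concrete, here is a small lookup sketch covering a handful of well-known (platform_id, encoding_id) pairs. The struct, table and function names are hypothetical, the numeric values are written out instead of using the constants from FT_TRUETYPE_IDS_H, and the table is far from complete; the TrueType specification remains the authoritative list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: one (platform_id, encoding_id) pair. */
typedef struct {
    unsigned int platform_id;
    unsigned int encoding_id;
    const char  *description;
} EncodingPair;

/* A few commonly seen pairs; not an exhaustive list. */
static const EncodingPair known_pairs[] = {
    { 3,  1, "Unicode BMP (Microsoft platform)" },
    { 3, 10, "Unicode full repertoire (Microsoft platform)" },
    { 0,  3, "Unicode BMP (Apple Unicode platform)" },
    { 3,  0, "Symbol (Microsoft platform)" },
    { 1,  0, "Mac Roman (Macintosh platform)" },
};

/* Returns a description for a known pair, or NULL if it is not listed. */
const char *describe_encoding_pair(unsigned int platform_id,
                                   unsigned int encoding_id)
{
    size_t i;

    /* both identifiers are 16-bit values in the font file */
    assert(platform_id <= 0xFFFF && encoding_id <= 0xFFFF);

    for (i = 0; i < sizeof known_pairs / sizeof known_pairs[0]; i++) {
        if (known_pairs[i].platform_id == platform_id &&
            known_pairs[i].encoding_id == encoding_id)
            return known_pairs[i].description;
    }
    return NULL;
}
```

Note that Unicode alone already shows up under more than one pair in this table.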

To select a specific encoding, you need to find a corresponding value pair in the specification, then look for it in the charmaps list. Don't forget that there are encodings which correspond to several value pairs due to historical reasons. Here is some code to do it:

  FT_CharMap  found = 0;
  FT_CharMap  charmap;
  int         n;

  for ( n = 0; n < face->num_charmaps; n++ )
  {
    charmap = face->charmaps[n];
    if ( charmap->platform_id == my_platform_id &&
         charmap->encoding_id == my_encoding_id )
    {
      found = charmap;
      break;
    }
  }

  if ( !found ) { ... }

  /* now, select the charmap for the face object */
  error = FT_Set_CharMap( face, found );
  if ( error ) { ... }

Once a charmap has been selected, either through FT_Select_CharMap or FT_Set_CharMap, it is used by all subsequent calls to FT_Get_Char_Index.

d. Glyph transformations

It is possible to specify an affine transformation to be applied to glyph images when they are loaded. Of course, this will only work for scalable (vectorial) font formats.

To do that, simply call FT_Set_Transform, as in:

  /* FT_Set_Transform returns no error code */
  FT_Set_Transform(
    face,       /* target face object    */
    &matrix,    /* pointer to 2x2 matrix */
    &delta );   /* pointer to 2d vector  */

This function will set the current transform for a given face object. Its second parameter is a pointer to a simple FT_Matrix structure that describes a 2×2 affine matrix. The third parameter is a pointer to a FT_Vector structure that describes a simple two-dimensional vector that is used to translate the glyph image after the 2×2 transformation.

Note that the matrix pointer can be set to NULL, in which case the identity transform will be used. Coefficients of the matrix are otherwise in 16.16 fixed-point units.

The vector pointer can also be set to NULL (in which case a delta of (0,0) will be used). The vector coordinates are expressed in 1/64th of a pixel (also known as 26.6 fixed-point values).
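
The two fixed-point formats can be illustrated with a few conversion helpers. Everything in this sketch is hypothetical: the type names are stand-ins for FT_Fixed, FT_Pos and FT_Matrix so that the block compiles without FreeType, and rounding to the nearest unit is a choice made here, not something the library requires.

```c
#include <assert.h>

typedef long Fixed16_16;  /* stand-in for FT_Fixed (16.16 format) */
typedef long Fixed26_6;   /* stand-in for FT_Pos (26.6 format)    */

/* Stand-in with the same field order as FT_Matrix. */
typedef struct {
    Fixed16_16 xx, xy;
    Fixed16_16 yx, yy;
} Matrix2x2;

/* Convert a coefficient to 16.16: 1.0 becomes 0x10000. */
Fixed16_16 to_16_16(double v)
{
    assert(v > -32768.0 && v < 32768.0);  /* 16 bits of integer part */
    return (Fixed16_16)(v * 65536.0 + (v >= 0 ? 0.5 : -0.5));
}

/* Convert a pixel distance to 26.6: 1.0 pixel becomes 64. */
Fixed26_6 to_26_6(double pixels)
{
    return (Fixed26_6)(pixels * 64.0 + (pixels >= 0 ? 0.5 : -0.5));
}

/* Build a pure scaling matrix, e.g. (2.0, 2.0) doubles the glyph size. */
Matrix2x2 make_scale_matrix(double sx, double sy)
{
    Matrix2x2 m;

    m.xx = to_16_16(sx);
    m.xy = 0;
    m.yx = 0;
    m.yy = to_16_16(sy);
    return m;
}
```

With these helpers, a scale factor of 2 is stored as 0x20000 and a translation of 1.5 pixels as 96.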

NOTE: The transformation is applied to every glyph that is loaded through FT_Load_Glyph and is completely independent of any hinting process. This means that you won't get the same results if you load a glyph at the size of 24 pixels, or a glyph at the size of 12 pixels scaled by 2 through a transform, because the hints will have been computed differently (unless you have disabled hints).

If you ever need to use a non-orthogonal transformation with optimal hints, you first have to decompose your transformation into a scaling part and a rotation/shearing part. Use the scaling part to compute a new character pixel size, then the other one to call FT_Set_Transform. This is explained in detail in a later section of this tutorial.

Loading a glyph bitmap with a non-identity transformation works; the transformation is ignored in this case.

7. Simple text rendering

We will now present a very simple example used to render a string of 8-bit Latin-1 text, assuming a face that contains a Unicode charmap.

The idea is to create a loop that will, on each iteration, load one glyph image, convert it to an anti-aliased bitmap, draw it on the target surface, then increment the current pen position.

a. Basic code

The following code performs our simple text rendering with the functions previously described.

  FT_GlyphSlot  slot;
  int           pen_x, pen_y, n;

  ... initialize library ...
  ... create face object ...
  ... set character size ...

  slot = face->glyph;  /* a small shortcut */

  pen_x = 300;
  pen_y = 200;

  for ( n = 0; n < num_chars; n++ )
  {
    FT_UInt  glyph_index;

    /* retrieve glyph index from character code */
    glyph_index = FT_Get_Char_Index( face, text[n] );

    /* load glyph image into the slot (erase previous one) */
    error = FT_Load_Glyph( face, glyph_index, FT_LOAD_DEFAULT );
    if ( error )
      continue;  /* ignore errors */

    /* convert to an anti-aliased bitmap */
    error = FT_Render_Glyph( face->glyph, FT_RENDER_MODE_NORMAL );
    if ( error )
      continue;

    /* now, draw to our target surface */
    my_draw_bitmap( &slot->bitmap,
                    pen_x + slot->bitmap_left,
                    pen_y - slot->bitmap_top );

    /* increment pen position */
    pen_x += slot->advance.x >> 6;
    pen_y += slot->advance.y >> 6;  /* not useful for now */
  }

This code needs a few explanations:

We define a handle named slot that points to the face object's glyph slot. (The type FT_GlyphSlot is a pointer). That is a convenience to avoid using face->glyph->XXX every time.
We increment the pen position with the vector slot->advance, which corresponds to the glyph's advance width (also known as its escapement). The advance vector is expressed in 1/64th of pixels, and is truncated to integer pixels on each iteration.
The function my_draw_bitmap is not part of FreeType but must be provided by the application to draw the bitmap to the target surface. In this example, it takes a pointer to a FT_Bitmap descriptor and the position of its top-left corner as arguments.
The value of slot->bitmap_top is positive for an upwards vertical distance. Assuming that the coordinates taken by my_draw_bitmap use the opposite convention (increasing Y corresponds to downwards scanlines), we subtract it from pen_y, instead of adding to it.
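
Since my_draw_bitmap is left to the application, here is one possible shape for it, as a sketch only. It assumes an 8-bit gray target surface of fixed, made-up dimensions, and it uses a stand-in struct (GrayBitmap) carrying just the FT_Bitmap fields it needs so that the block compiles without FreeType; with the real library, the first parameter would be a pointer to FT_Bitmap.

```c
#include <assert.h>

#define SURFACE_WIDTH   64  /* made-up surface size for this sketch */
#define SURFACE_HEIGHT  48

/* Hypothetical target surface; row 0 is the topmost scanline. */
unsigned char surface[SURFACE_HEIGHT][SURFACE_WIDTH];

/* Stand-in for the FT_Bitmap fields used below. */
typedef struct {
    unsigned int   rows;    /* number of bitmap rows           */
    unsigned int   width;   /* number of pixels per row        */
    int            pitch;   /* number of bytes per row         */
    unsigned char *buffer;  /* first byte of the topmost row   */
} GrayBitmap;

/* Draw an 8-bit gray bitmap with its top-left corner at (x, y),
 * clipping whatever falls outside the surface. */
void my_draw_bitmap(const GrayBitmap *bitmap, int x, int y)
{
    unsigned int row, col;

    assert(bitmap->pitch > 0);  /* this sketch handles top-down rows only */

    for (row = 0; row < bitmap->rows; row++) {
        int sy = y + (int)row;

        if (sy < 0 || sy >= SURFACE_HEIGHT)
            continue;  /* clip vertically */

        for (col = 0; col < bitmap->width; col++) {
            int sx = x + (int)col;

            if (sx < 0 || sx >= SURFACE_WIDTH)
                continue;  /* clip horizontally */

            /* OR the coverage in so overlapping glyphs don't erase each other */
            surface[sy][sx] |= bitmap->buffer[row * (unsigned int)bitmap->pitch + col];
        }
    }
}
```

Combining with a bitwise OR is a simplification; a real renderer would more likely blend the gray level with the existing pixel.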

b. Refined code

The following code is a refined version of the example above. It uses features and functions of FreeType 2 that have not yet been introduced, and which are explained below:

  FT_GlyphSlot  slot;
  int           pen_x, pen_y, n;

  ... initialize library ...
  ... create face object ...
  ... set character size ...

  slot = face->glyph;  /* a small shortcut */

  pen_x = 300;
  pen_y = 200;

  for ( n = 0; n < num_chars; n++ )
  {
    /* load glyph image into the slot (erase previous one) */
    error = FT_Load_Char( face, text[n], FT_LOAD_RENDER );
    if ( error )
      continue;  /* ignore errors */

    /* now, draw to our target surface */
    my_draw_bitmap( &slot->bitmap,
                    pen_x + slot->bitmap_left,
                    pen_y - slot->bitmap_top );

    /* increment pen position */
    pen_x += slot->advance.x >> 6;
  }

We have reduced the size of our code, but it does exactly the same thing:

We use the function FT_Load_Char instead of FT_Load_Glyph. As you probably imagine, it is equivalent to calling FT_Get_Char_Index then FT_Load_Glyph.
We do not use FT_LOAD_DEFAULT for the loading mode, but the bit flag FT_LOAD_RENDER. It indicates that the glyph image must be immediately converted to an anti-aliased bitmap. This is of course a shortcut that avoids calling FT_Render_Glyph explicitly but is strictly equivalent.

Note that you can also specify that you want a monochrome bitmap instead by using the additional FT_LOAD_MONOCHROME load flag.

c. More advanced rendering

Let us try to render transformed text now (for example through a rotation). We can do this using FT_Set_Transform. Here is how to do it:

  FT_GlyphSlot  slot;
  FT_Matrix     matrix;  /* transformation matrix */
  FT_Vector     pen;     /* untransformed origin  */
  int           n;

  ... initialize library ...
  ... create face object ...
  ... set character size ...

  slot = face->glyph;  /* a small shortcut */

  /* set up matrix */
  matrix.xx = (FT_Fixed)( cos( angle ) * 0x10000L );
  matrix.xy = (FT_Fixed)(-sin( angle ) * 0x10000L );
  matrix.yx = (FT_Fixed)( sin( angle ) * 0x10000L );
  matrix.yy = (FT_Fixed)( cos( angle ) * 0x10000L );

  /* the pen position in 26.6 cartesian space coordinates */
  /* start at (300,200)                                   */
  pen.x = 300 * 64;
  pen.y = ( my_target_height - 200 ) * 64;

  for ( n = 0; n < num_chars; n++ )
  {
    /* set transformation */
    FT_Set_Transform( face, &matrix, &pen );

    /* load glyph image into the slot (erase previous one) */
    error = FT_Load_Char( face, text[n], FT_LOAD_RENDER );
    if ( error )
      continue;  /* ignore errors */

    /* now, draw to our target surface (convert position) */
    my_draw_bitmap( &slot->bitmap,
                    slot->bitmap_left,
                    my_target_height - slot->bitmap_top );

    /* increment pen position */
    pen.x += slot->advance.x;
    pen.y += slot->advance.y;
  }

Some remarks:

We now use a vector of type FT_Vector to store the pen position, with coordinates expressed in 1/64th of a pixel, hence the multiplication by 64. The position is expressed in cartesian space.
Glyph images are always loaded, transformed, and described in the cartesian coordinate system in FreeType (which means that increasing Y corresponds to upper scanlines), unlike the system typically used for bitmaps (where the topmost scanline has coordinate 0). We must thus convert between the two systems when we define the pen position, and when we compute the top-left position of the bitmap.
We set the transformation on each glyph to indicate the rotation matrix as well as a delta that will move the transformed image to the current pen position (in cartesian space, not bitmap space). As a consequence, the values of bitmap_left and bitmap_top correspond to the bitmap origin in target space pixels. We thus don't add pen.x or pen.y to their values when calling my_draw_bitmap.
The advance width is always returned transformed, which is why it can be directly added to the current pen position. Note that it is not rounded this time.
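The unit conversions mentioned in these remarks can be sketched in isolation. The helpers below are hypothetical (none of these names exist in FreeType), and plain long stands in for the library's fixed-point types so that the sketch compiles without the FreeType headers. It shows the 16.16 scaling used for the matrix, the 26.6 scaling used for the pen, and the flip between cartesian and bitmap rows.

```c
/* Hypothetical helpers for the two fixed-point formats used above.
   Plain long stands in for FT_Fixed and FT_Pos. */

/* Real number -> 16.16 fixed point (matrix coefficients). */
static long to_16_16( double v )
{
  return (long)( v * 0x10000L );
}

/* Whole pixels -> 26.6 fixed point (pen position, advance). */
static long pixels_to_26_6( long px )
{
  return px * 64;
}

/* 26.6 -> nearest whole pixel; written for non-negative input only. */
static long round_26_6( long v )
{
  return ( v + 32 ) / 64;
}

/* Cartesian y (up is positive) <-> bitmap row (down is positive). */
static long flip_y( long target_height, long y )
{
  return target_height - y;
}
```

For a target that is 480 rows high, the start position (300,200) in bitmap space becomes pen.x = pixels_to_26_6( 300 ) and pen.y = pixels_to_26_6( flip_y( 480, 200 ) ), which is the same arithmetic the loop performs inline.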

A complete source code example can be found here.

It is important to note that, while this example is a bit more complex than the previous one, it is strictly equivalent for the case where the transform is the identity. Hence it can be used as a replacement (but a more powerful one).
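The identity case can be checked with a small stand-alone sketch. The types and the function transform_point below are hypothetical stand-ins for a 16.16 matrix and a 26.6 vector, not the library's own code, and the products are truncated rather than rounded, which is exact for the values used here. The matrix is applied first and the delta added afterwards.

```c
/* Hypothetical stand-ins for a 16.16 matrix and a 26.6 vector. */
typedef struct { long  xx, xy, yx, yy; } matrix16_16;
typedef struct { long  x, y; }           vector26_6;

/* Applies m to v, then adds delta. Simplified: products are truncated,
   not rounded, so results may differ by one unit from a rounding
   implementation when the product is not a multiple of 0x10000. */
static vector26_6 transform_point( matrix16_16  m,
                                   vector26_6   delta,
                                   vector26_6   v )
{
  vector26_6  r;

  r.x = (long)( ( (long long)m.xx * v.x + (long long)m.xy * v.y ) / 0x10000 )
        + delta.x;
  r.y = (long)( ( (long long)m.yx * v.x + (long long)m.yy * v.y ) / 0x10000 )
        + delta.y;
  return r;
}
```

With the identity matrix { 0x10000, 0, 0, 0x10000 } and a zero delta, every point comes back unchanged; with a 90 degree rotation { 0, -0x10000, 0x10000, 0 }, the point (64, 0) maps to (0, 64) before the delta is added.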

The example does, however, have a few shortcomings that the next part of this tutorial explains and solves.

Conclusion

In this first section, you have learned the basics of FreeType 2, and you now know enough to render rotated text.

The next section will dive into more details of the API in order to let you access glyph metrics and images directly, as well as how to deal with scaling, hinting, kerning, etc.

The third section will discuss issues like modules, caching and a few other advanced topics like how to use multiple size objects with a single face. [This part hasn't been written yet.]

Posted by 삼스

Embedded/Character Sets, 2009. 2. 18. 17:30

Online sample of a CharSet property for conversion texts and files.

http://www.motobit.com/util/charset-codepage-conversion.asp

This online sample demonstrates how the ByteArray class converts between several code pages and character sets. With this script you can convert text or multibyte data in any available code page to another code page or to Unicode.
The form's size limit (Form.SizeLimit) is 1,000,000 bytes. Please do not post more source data than that.

Note: The source file is handled as text data in the specified character set. The textbox is handled as string data; its default character set is the same as the charset of this document.

Posted by 삼스

Embedded/Character Sets, 2009. 2. 18. 17:09

A site for looking up character set code charts

http://www.i18nguy.com/unicode/codepages.html#msftiso

Character Sets And Code Pages At The Push Of A Button

Code Pages, Character Encodings from Software Vendors and Standards Bodies

Here you can find character set and code page information from software vendors (Microsoft, HP, IBM, Sun, etc.) and international standards organizations (e.g. ISO, ECMA, INCITS, etc.). Push any "button" and you will be taken either to the chart of a code page provided by the vendor, or the vendor's web page of links to code page charts. This gives you fast access to popular code pages, as well as access to more complete lists of code page charts.

Content and Product Globalization

Organization

The links are (mostly) organized by vendor or standard organization. Some code pages are listed redundantly, usually because the code page is being described by different vendors. Sometimes the difference is important. For example, one vendor's view of a code page may be different from another's. Certainly character conversion or mapping tables may be very different. Sometimes a code page has been updated and one vendor is still referring to an earlier version of the code page.

Character Encodings, Transformation Formats, Double-Byte, Multi-byte, UTF...

Note that a "code page" is also known by various other names: codepage, encoding, charset, character set, coded character set (CCS), graphic character set, character map, and others. Some have more specific names, such as DBCS (double-byte character set) and MBCS (multi-byte character set). Some encodings are the result of transformations and are known as transformation formats; examples include the Unicode formats UTF-8, UTF-16, and UTF-32.
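To make the idea of a transformation format concrete, here is a minimal sketch of how one Unicode scalar value is turned into UTF-8 bytes. The function name utf8_encode is hypothetical; the bit layout follows the standard UTF-8 definition, and surrogate values and values above U+10FFFF are rejected.

```c
#include <stddef.h>

/* Hypothetical helper: encodes one Unicode scalar value as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 if cp is a
   surrogate or lies above U+10FFFF. */
static size_t utf8_encode( unsigned long  cp,
                           unsigned char  out[4] )
{
  if ( cp < 0x80 )
  {
    out[0] = (unsigned char)cp;
    return 1;
  }
  if ( cp < 0x800 )
  {
    out[0] = (unsigned char)( 0xC0 | ( cp >> 6 ) );
    out[1] = (unsigned char)( 0x80 | ( cp & 0x3F ) );
    return 2;
  }
  if ( cp >= 0xD800 && cp <= 0xDFFF )
    return 0;  /* surrogates are not scalar values */
  if ( cp < 0x10000 )
  {
    out[0] = (unsigned char)( 0xE0 | ( cp >> 12 ) );
    out[1] = (unsigned char)( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
    out[2] = (unsigned char)( 0x80 | ( cp & 0x3F ) );
    return 3;
  }
  if ( cp <= 0x10FFFF )
  {
    out[0] = (unsigned char)( 0xF0 | ( cp >> 18 ) );
    out[1] = (unsigned char)( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
    out[2] = (unsigned char)( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
    out[3] = (unsigned char)( 0x80 | ( cp & 0x3F ) );
    return 4;
  }
  return 0;
}
```

For example, U+AC00 (the Hangul syllable 가) becomes the three bytes EA B0 80, while U+0041 stays the single byte 41.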

Unicode UTF-16 Surrogate Code Points, or Supplementary Characters

If you are interested in UTF-16 surrogate code points, or supplementary characters, see "Setting up Microsoft Windows NT, 2000 or Windows XP to Support Unicode Supplementary Characters" and "Conversion Table: Unicode Surrogates to Scalar Value/UTF-32".
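The conversion between a surrogate pair and its scalar value is simple arithmetic, sketched below. Both function names are hypothetical; the constants are the standard UTF-16 ranges (high surrogates D800 to DBFF, low surrogates DC00 to DFFF).

```c
/* Hypothetical helper: combines a UTF-16 surrogate pair into a scalar
   value. Returns -1 if either code unit is outside its range. */
static long surrogates_to_scalar( unsigned  hi,
                                  unsigned  lo )
{
  if ( hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF )
    return -1;

  return 0x10000L + ( (long)( hi - 0xD800 ) << 10 ) + (long)( lo - 0xDC00 );
}

/* Hypothetical helper: splits a supplementary scalar value
   (U+10000 to U+10FFFF) into a surrogate pair.
   Returns 0 on success, -1 if cp is outside that range. */
static int scalar_to_surrogates( unsigned long  cp,
                                 unsigned*      hi,
                                 unsigned*      lo )
{
  if ( cp < 0x10000 || cp > 0x10FFFF )
    return -1;

  cp -= 0x10000;
  *hi = 0xD800 + (unsigned)( cp >> 10 );
  *lo = 0xDC00 + (unsigned)( cp & 0x3FF );
  return 0;
}
```

As a worked case, the pair D83D DE00 gives 0x10000 + (0x3D << 10) + 0x200 = 0x1F600.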

Other Unicode pages on this site that may be of interest include: Cheat Sheet: Unicode-Enabling Microsoft C/C++ Source Code, Hiragana Characters, Hebrew Characters, Benefits of the Unicode Standard, and the Compelling Unicode Demo.

TABLE OF CONTENTS
Unicode; Standards Organizations; Assorted web pages; The Go To Guys; Czyborra's Site; Great Sites; China's GB18030; Hong Kong Supplementary Character Set (HKSCS); Library of Congress MAchine Readable Catalog (MARC)
Microsoft's ISO code pages; Microsoft Windows code pages; Microsoft double-byte character sets; Microsoft DOS code pages
IBM ICU Character Conversion Data; IBM's ISO code pages; IBM Windows code pages; IBM Asian code pages; IBM DOS code pages

Push A Button To Get Code Page Information
Assorted Web Pages: I18n Guy's Hiragana Unicode Chart; Dik Winter's Character Set History; Piotr Trzcionkowski's Polish code page site (in Polish); Cyrillic.com Character Sets; I18nGuru's Character Sets page; VT320, VT102, VT52, Heath-19; DEC Terminals VT100, VT220, VT320; Kostis' Character Sets; Kostis' Apple Macintosh Roman; Japanese Encoding Differences; Koichi Yasuoka's Character Tables

Unicode Charts: Unicode Charts; Unicode character name index; UTF-32 (TR-19); Character Encoding Model (TR-17); Basic Latin; Latin-1 Supplement; Latin Extended-A; Combining Diacritical Marks; Greek; Cyrillic; Hebrew; I18n Guy's Hebrew Unicode Chart; Arabic; Currency Symbols; Hangul Jamo; Hiragana; I18n Guy's Hiragana Unicode Chart; Katakana

Standards Organizations: ISO; INCITS; ECMA Standards; ISO 6429 = ECMA-48 (pdf) (Control codes); ISO/IEC International register of coded character sets to be used with escape sequences (Links to many code page charts!); IANA Character Set Registry; RFC Index; RFC 1555 Hebrew Character Encoding for Internet Messages; RFC 1556 Handling of Bi-directional Texts in MIME (RFC 1556 defines ISO-8859-6-e, ISO-8859-6-i, ISO-8859-8-e, ISO-8859-8-i); Armenian Character Sets ArmSCII; Thai TIS 620-2533 (in Thai 620-2533); Annotated reference to the Thai implementations

The Go To Guys: Michael Everson's site; Ken Lunde's CJK.inf; Ken Lunde's Character set server; Mark Davis's site

Czyborra's Site: www.czyborra.com/charsets is offline. Fortunately, Kevin Atkinson has mirrored it at aspell.net/charsets. These buttons now link to his mirror. Thanks Kevin. Links: Roman Czyborra's site; Czyborra's Vendor Codepages; Czyborra's Vietnamese page; Czyborra's ASCII/ISO 646 page; Czyborra's ISO 8859 Alphabet Soup; So vat's Unicode? Chicken soup?

Great Sites: Frank da Cruz's Character Sets; Frank da Cruz's Character Set Tables; Korpela's Tutorial on character code issues; Korpela's Character and encoding site

GB18030 Web Pages: ICU's Markus Scherer on GB18030; Sun on GB18030-2000; Microsoft GB18030 Support Package (in GB2312); (Adobe) Dirk Meyer's Summary of GB18030

Hong Kong Supplementary Character Set (HKSCS): Hong Kong Supplementary Character Set (HKSCS); Hong Kong ITF on ISO 10646

MARC Bibliographic: MARC 21; MARC-8; MARC UCS (Unicode); MARC Code Tables

IBM ICU: Here are many transcoding tables expressed in XML files using the Character Mapping Markup Language (CharMapML, UTR 22). The encoding conversion data is used in the International Components for Unicode (ICU) open source library. Links: IBM ICU Character Conversion Data

IBM Character Data: IBM Code pages (Appendix F); IBM Character lists (Appendix I); IBM Sort Sequences (Appendix C)

IBM ISO Code Pages: CP 00819 (ISO 8859-1) Latin Alphabet No. 1; CP 00813 (ISO 8859-7) Greece; CP 00916 (ISO 8859-8) Hebrew; CP 00920 (ISO 8859-9) Turkey

IBM Windows Code pages: CP 01250 (Windows) Latin 2; CP 01252 (Windows) Latin 1; CP 01253 (Windows) Greek; CP 01254 (Windows) Turkish; CP 01255 (Windows) Hebrew; CP 01256 (Windows) Arabic; CP 01257 (Windows) Baltic Rim

Microsoft Double-Byte Character Sets: In the following web pages, lead bytes are indicated by light gray background shading. Each of these lead bytes links to a new page showing the 256-character block associated with that lead byte. Unused lead bytes are identified by a darker gray background. Links: I18n Guy's Hiragana Unicode Chart; Japanese Shift-JIS (CP 932); Conversion Problems CP932 & Unicode; Simplified Chinese GBK (CP 936); Korean (CP 949); Traditional Chinese Big5 (CP 950); Hong Kong Character Set (HKSCS)

Microsoft Windows Code Pages: Microsoft's Windows code pages; Microsoft's Windows code pages by country; Windows CP 1250 (Central Europe); Windows CP 1251 (Cyrillic); Windows CP 1252 (Latin I); Windows CP 1253 (Greek); Windows CP 1254 (Turkish); Windows CP 1255 (Hebrew); Windows CP 1256 (Arabic); Windows CP 1257 (Baltic); Windows CP 1258 (Viet Nam); Windows CP 874 (Thai)

Microsoft's ISO Code Page Charts: Globalization site: GlobalDev; ISO Code Pages at Microsoft's site; ISO/IEC 8859-1 (Latin 1); ISO/IEC 8859-2 (Latin 2); ISO/IEC 8859-3 (Latin 3); ISO/IEC 8859-4 (Baltic); ISO/IEC 8859-5 (Cyrillic); ISO/IEC 8859-6 (Arabic); ISO/IEC 8859-7 (Greek); ISO/IEC 8859-8 (Hebrew); ISO/IEC 8859-9 (Turkish); ISO/IEC 8859-15 (Latin 9)

IBM DOS Code pages: CP 00437 (IBM PC) USA; CP 00850 (IBM PC) Multilingual; CP 00851 (IBM PC) Greece; CP 00852 Latin-2 PC; CP 00855 (IBM PC) Cyrillic; CP 00856 (IBM PC) Hebrew; CP 00857 (IBM PC) Turkey; CP 00860 (IBM PC) Portugal; CP 00861 (IBM PC) Iceland; CP 00862 (IBM PC) Israel; CP 00863 (IBM PC) Canadian French; CP 00864 (IBM PC) Arabic; CP 00865 (IBM PC) Nordic; CP 00866 (IBM PC) Cyrillic #2; CP 00869 (IBM PC) Greece; CP 00870 Latin-2 Multilingual; CP 00874 (IBM PC) Thai Extended

Microsoft OEM (DOS) Code Pages: Microsoft's OEM code pages; DOS CP 437 (US); DOS CP 720 (Arabic); DOS CP 737 (Greek); DOS CP 775 (Baltic); DOS CP 850 (Western Europe); DOS CP 852 (Central Europe); DOS CP 855 (Cyrillic); DOS CP 857 (Turkish); DOS CP 862 (Hebrew); DOS CP 866 (Cyrillic II)

IBM Asian Code pages: I18n Guy's Hiragana Unicode Chart; CP 00290 (EBCDIC) Japanese (Katakana) Non-extended; CP 00290 (EBCDIC) Japanese (Katakana) Extended; CP 00833 (EBCDIC) Korea Extended; CP 00836 (EBCDIC) Simplified Chinese Extended; CP 00891 (IBM PC) Korea; CP 00895 Japan 7-Bit; CP 00897 (IBM PC) Japan PC #1; CP 00903 (IBM PC) People's Republic of China (PRC); CP 00904 (IBM PC) Republic of China (ROC); CP 00905 (EBCDIC) Turkey Extended; CP 01027 (EBCDIC) Japanese (Latin) Extended; CP 01040 (IBM PC) Korean Extended; CP 01041 (IBM PC) Japanese Extended; CP 01042 (IBM PC) Simplified Chinese Extended; CP 01043 (IBM PC) Traditional Chinese; CP 01088 (IBM PC) Korean; CP 01114 Traditional Chinese (Big5); CP 01115 Simplified Chinese (GB)
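
The lead bytes mentioned for the Microsoft double-byte character sets decide whether a byte stands alone or starts a two-byte character. The sketch below shows how a program might walk such a string. Both function names are hypothetical, and the lead-byte ranges are the commonly documented ones for these Windows code pages; treat them as an assumption and check the charts for the code page you actually use.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if b starts a two-byte character in
   the given Windows code page, else 0. The ranges are assumed, not
   taken from the charts linked above. */
static int is_lead_byte( unsigned       codepage,
                         unsigned char  b )
{
  switch ( codepage )
  {
  case 932:  /* Japanese Shift-JIS */
    return ( b >= 0x81 && b <= 0x9F ) || ( b >= 0xE0 && b <= 0xFC );
  case 936:  /* Simplified Chinese GBK */
  case 949:  /* Korean */
  case 950:  /* Traditional Chinese Big5 */
    return b >= 0x81 && b <= 0xFE;
  default:
    return 0;
  }
}

/* Hypothetical helper: counts the characters in a DBCS byte string of
   len bytes. Trail bytes are not validated, and a lead byte at the
   very end of the string is counted as one character. */
static size_t dbcs_char_count( unsigned              codepage,
                               const unsigned char*  s,
                               size_t                len )
{
  size_t  i     = 0;
  size_t  count = 0;

  while ( i < len )
  {
    i += ( is_lead_byte( codepage, s[i] ) && i + 1 < len ) ? 2 : 1;
    count++;
  }
  return count;
}
```

For instance, the four bytes 41 B0 A1 42 in code page 949 are three characters: "A", the Hangul syllable 가 (B0 A1), and "B".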

Posted by 삼스

Embedded/Character Sets, 2009. 2. 18. 16:48

Code page conversion

The Unicode conversion filter offers conversions between the following code pages:

For more information on code pages, please see Code-Page Identifiers.

(*) The list of available code pages may be different on your system. You can install additional code pages using Control Panel\Regional Options.

Identifier	Name
037	IBM EBCDIC - U.S./Canada
437	OEM - United States
500	IBM EBCDIC - International
708	Arabic - ASMO 708
709	Arabic - ASMO 449+, BCON V4
710	Arabic - Transparent Arabic
720	Arabic - Transparent ASMO
737	OEM - Greek (formerly 437G)
775	OEM - Baltic
850	OEM - Multilingual Latin I
852	OEM - Latin II
855	OEM - Cyrillic (primarily Russian)
857	OEM - Turkish
858	OEM - Multilingual Latin I + Euro symbol
860	OEM - Portuguese
861	OEM - Icelandic
862	OEM - Hebrew
863	OEM - Canadian-French
864	OEM - Arabic
865	OEM - Nordic
866	OEM - Russian
869	OEM - Modern Greek
870	IBM EBCDIC - Multilingual/ROECE (Latin-2)
874	ANSI/OEM - Thai
875	IBM EBCDIC - Modern Greek
932	ANSI/OEM - Japanese, Shift-JIS
936	ANSI/OEM - Simplified Chinese (PRC, Singapore)
949	ANSI/OEM - Korean (Unified Hangeul Code) -> EUC-KR
950	ANSI/OEM - Traditional Chinese (Taiwan; Hong Kong SAR, PRC)
1026	IBM EBCDIC - Turkish (Latin-5)
1047	IBM EBCDIC - Latin 1/Open System
1140	IBM EBCDIC - U.S./Canada (037 + Euro symbol)
1141	IBM EBCDIC - Germany (20273 + Euro symbol)
1142	IBM EBCDIC - Denmark/Norway (20277 + Euro symbol)
1143	IBM EBCDIC - Finland/Sweden (20278 + Euro symbol)
1144	IBM EBCDIC - Italy (20280 + Euro symbol)
1145	IBM EBCDIC - Latin America/Spain (20284 + Euro symbol)
1146	IBM EBCDIC - United Kingdom (20285 + Euro symbol)
1147	IBM EBCDIC - France (20297 + Euro symbol)
1148	IBM EBCDIC - International (500 + Euro symbol)
1149	IBM EBCDIC - Icelandic (20871 + Euro symbol)
1200	Unicode UCS-2 Little-Endian (BMP of ISO 10646)
1201	Unicode UCS-2 Big-Endian
1250	ANSI - Central European
1251	ANSI - Cyrillic
1252	ANSI - Latin I
1253	ANSI - Greek
1254	ANSI - Turkish
1255	ANSI - Hebrew
1256	ANSI - Arabic
1257	ANSI - Baltic
1258	ANSI/OEM - Vietnamese
1361	Korean (Johab)
10000	MAC - Roman
10001	MAC - Japanese
10002	MAC - Traditional Chinese (Big5)
10003	MAC - Korean
10004	MAC - Arabic
10005	MAC - Hebrew
10006	MAC - Greek I
10007	MAC - Cyrillic
10008	MAC - Simplified Chinese (GB 2312)
10010	MAC - Romania
10017	MAC - Ukraine
10021	MAC - Thai
10029	MAC - Latin II
10079	MAC - Icelandic
10081	MAC - Turkish
10082	MAC - Croatia
12000	Unicode UCS-4 Little-Endian
12001	Unicode UCS-4 Big-Endian
20000	CNS - Taiwan
20001	TCA - Taiwan
20002	Eten - Taiwan
20003	IBM5550 - Taiwan
20004	TeleText - Taiwan
20005	Wang - Taiwan
20105	IA5 IRV International Alphabet No. 5 (7-bit)
20106	IA5 German (7-bit)
20107	IA5 Swedish (7-bit)
20108	IA5 Norwegian (7-bit)
20127	US-ASCII (7-bit)
20261	T.61
20269	ISO 6937 Non-Spacing Accent
20273	IBM EBCDIC - Germany
20277	IBM EBCDIC - Denmark/Norway
20278	IBM EBCDIC - Finland/Sweden
20280	IBM EBCDIC - Italy
20284	IBM EBCDIC - Latin America/Spain
20285	IBM EBCDIC - United Kingdom
20290	IBM EBCDIC - Japanese Katakana Extended
20297	IBM EBCDIC - France
20420	IBM EBCDIC - Arabic
20423	IBM EBCDIC - Greek
20424	IBM EBCDIC - Hebrew
20833	IBM EBCDIC - Korean Extended
20838	IBM EBCDIC - Thai
20866	Russian - KOI8-R
20871	IBM EBCDIC - Icelandic
20880	IBM EBCDIC - Cyrillic (Russian)
20905	IBM EBCDIC - Turkish
20924	IBM EBCDIC - Latin-1/Open System (1047 + Euro symbol)
20932	JIS X 0208-1990 & 0121-1990
20936	Simplified Chinese (GB2312)
21025	IBM EBCDIC - Cyrillic (Serbian, Bulgarian)
21027	Extended Alpha Lowercase
21866	Ukrainian (KOI8-U)
28591	ISO 8859-1 Latin I
28592	ISO 8859-2 Central Europe
28593	ISO 8859-3 Latin 3
28594	ISO 8859-4 Baltic
28595	ISO 8859-5 Cyrillic
28596	ISO 8859-6 Arabic
28597	ISO 8859-7 Greek
28598	ISO 8859-8 Hebrew
28599	ISO 8859-9 Latin 5
28605	ISO 8859-15 Latin 9
29001	Europa 3
38598	ISO 8859-8 Hebrew
50220	ISO 2022 Japanese with no halfwidth Katakana
50221	ISO 2022 Japanese with halfwidth Katakana
50222	ISO 2022 Japanese JIS X 0201-1989
50225	ISO 2022 Korean
50227	ISO 2022 Simplified Chinese
50229	ISO 2022 Traditional Chinese
50930	Japanese (Katakana) Extended
50931	US/Canada and Japanese
50933	Korean Extended and Korean
50935	Simplified Chinese Extended and Simplified Chinese
50936	Simplified Chinese
50937	US/Canada and Traditional Chinese
50939	Japanese (Latin) Extended and Japanese
51932	EUC - Japanese
51936	EUC - Simplified Chinese
51949	EUC - Korean
51950	EUC - Traditional Chinese
52936	HZ-GB2312 Simplified Chinese
54936	Windows XP: GB18030 Simplified Chinese (4 Byte)
57002	ISCII Devanagari
57003	ISCII Bengali
57004	ISCII Tamil
57005	ISCII Telugu
57006	ISCII Assamese
57007	ISCII Oriya
57008	ISCII Kannada
57009	ISCII Malayalam
57010	ISCII Gujarati
57011	ISCII Punjabi
65000	Unicode UTF-7
65001	Unicode UTF-8
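
Identifiers 1200 and 1201 in the table name the same 16-bit code units and differ only in byte order. The sketch below illustrates that difference; ucs2_put is a hypothetical helper, not a Windows API.

```c
/* Hypothetical helper: writes one 16-bit code unit in the byte order of
   code page 1200 (little-endian) or 1201 (big-endian).
   Returns 0 on success, -1 for any other identifier. */
static int ucs2_put( unsigned       codepage,
                     unsigned       unit,
                     unsigned char  out[2] )
{
  unsigned char  lo = (unsigned char)( unit & 0xFF );
  unsigned char  hi = (unsigned char)( ( unit >> 8 ) & 0xFF );

  if ( codepage == 1200 )
  {
    out[0] = lo;
    out[1] = hi;
    return 0;
  }
  if ( codepage == 1201 )
  {
    out[0] = hi;
    out[1] = lo;
    return 0;
  }
  return -1;
}
```

The code unit U+AC00 is stored as the bytes 00 AC under 1200 and as AC 00 under 1201.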

Posted by 삼스

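Embedded/Character Sets, 2009. 2. 18. 16:36

UNICODE QnA

How does a multilingual (Unicode-based) system differ from an existing system?

The systems used so far rely on a character set based on KS C 5601.

Such a system basically supports 2,350 Hangul syllables, 4,888 Hanja, special characters, Hiragana, Katakana, and only a limited set of other scripts such as Russian.

Languages such as French, German, and Simplified Chinese could not, in principle, be used. Unicode overcomes this limit on character representation and supports a wide range of languages and a very large number of characters.

It supports 11,172 Hangul syllables and 20,902 CJK ideographs (Hanja).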
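
Both figures can be derived with simple arithmetic, sketched below. The 11,172 Hangul syllables are every combination of 19 leading consonants, 21 vowels, and 28 trailing positions (27 consonants plus "none"), laid out in order from U+AC00. The 20,902 ideographs correspond to the block from U+4E00 to U+9FA5. The names in the code are hypothetical.

```c
enum
{
  HANGUL_BASE      = 0xAC00,
  LEAD_COUNT       = 19,   /* leading consonants */
  VOWEL_COUNT      = 21,   /* vowels */
  TAIL_COUNT       = 28,   /* trailing consonants, including "none" */
  HANGUL_SYLLABLES = LEAD_COUNT * VOWEL_COUNT * TAIL_COUNT,

  CJK_FIRST        = 0x4E00,
  CJK_LAST         = 0x9FA5,
  CJK_IDEOGRAPHS   = CJK_LAST - CJK_FIRST + 1
};

/* Hypothetical helper: composes a precomposed Hangul syllable from the
   zero-based indices of its parts. Returns -1 if an index is out of
   range. */
static long hangul_compose( int  lead,
                            int  vowel,
                            int  tail )
{
  if ( lead  < 0 || lead  >= LEAD_COUNT  ||
       vowel < 0 || vowel >= VOWEL_COUNT ||
       tail  < 0 || tail  >= TAIL_COUNT  )
    return -1;

  return HANGUL_BASE + ( (long)lead * VOWEL_COUNT + vowel ) * TAIL_COUNT + tail;
}
```

For example, lead 18 (ㅎ), vowel 0 (ㅏ), and tail 4 (ㄴ) compose to U+D55C, the syllable 한.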

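In addition, most of the world's languages and scripts can now be used, including Greek, Latin-based scripts, Cyrillic, Hebrew, Thai, symbols, punctuation, and Arabic.

How do I enter multilingual text?

You need to install a multilingual input method editor (IME). The installation procedure depends on the OS. Windows 2000 and XP have the IMEs built into the operating system, so setup is easy; other versions such as ME, 95, and NT require a separate installer ( http://www.microsoft.com/windows/ie/downloads/recommended/ime/install.mspx ).

If you do not know the keyboard layout of the language you want to type, you can use the virtual (on-screen) keyboard provided by the operating system.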

Posted by 삼스

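Embedded/Character Sets, 2009. 2. 18. 16:29

What is Unicode?

Just as there are many countries and peoples in the world, there are many languages and scripts.

When these languages and scripts are represented on a computer, different character sets and encodings are used. This causes problems when data is exchanged between countries or organizations, or when several languages must be entered at the same time.

Korea uses the KS C 5601 character set, while China uses GB 2312 and Japan uses JIS X 0212.

As a result, documents or data created in one country can appear as garbled characters when viewed in another.

The effort to unify these character sets resulted in Unicode. Unicode was originally designed to represent every character within a 16-bit space, so in theory it could represent 65,536 characters. (Later versions extended the code space beyond 16 bits; in UTF-16, characters outside the first 65,536 are reached through surrogate pairs.)

The International Organization for Standardization (ISO) was also working on such a universal character set. Fortunately, the two efforts were unified in 1991. The official name of that standard is ISO/IEC 10646.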

Posted by 삼스
