Thursday, May 24, 2007

CakePHP: Donation

Yesterday, I donated 5 USD to the Cake Software Foundation. 5 USD = 5 meals (for me, in Thailand). So it is a lot of money :-P

Tuesday, April 17, 2007

For GNU/Linux only

I found that I had posted a lot of articles about human language technology etc. here. Thus, I created a new blog (and homepage) at www.vee-u.com. From now on I will try to post mostly GNU/Linux and free software related stuff here.

Saturday, March 31, 2007

Converting Orchid corpus to XML

The Orchid corpus is a Thai part-of-speech annotated corpus, which used to be freely available on Nectec's website. (I hope it will become available again.) It has a rather unique format, so it is quite inconvenient to handle. Therefore I wrote a script to convert it to XML. Now I can use an XML parser such as pulldom to handle it through a familiar API, e.g. (pull)DOM.
An example of the Orchid corpus format:
    %metadata
    %metadata
    #P1
    #1
    blaa blaa blaa//
    blaa/NNNN
    blaa/NNNN
    blaa/NNNN
    //
An example of the XML for the Orchid corpus format:
    <corpus>
      <document author="abcd" ...>
        <paragraph>
          <sentence raw_txt="blaa blaa blaa">
            <word surface="blaa" pos="NNNN"/>
            <word surface="blaa" pos="NNNN"/>
            <word surface="blaa" pos="NNNN"/>
            <word surface="blaa" pos="NNNN"/>
          </sentence>
        </paragraph>
      </document>
      ...
    </corpus>
The TEI format is probably suitable for this job, but I am just too lazy to read the specification.
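The conversion described above can be sketched roughly as follows. This is not my actual script, just a minimal Python illustration that assumes only the simplified layout shown in the example: `%` lines are metadata (skipped here), `#P<n>` starts a paragraph, `#<n>` starts a sentence, the raw text ends with `//`, each following line is `surface/POS`, and a lone `//` closes the sentence. The real corpus has more cases than this. The names `orchid_to_xml`, `iter_words` and `SAMPLE` are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import pulldom

# Tiny sample in the simplified layout described above (not real corpus data).
SAMPLE = """%metadata
%metadata
#P1
#1
blaa blaa blaa//
blaa/NNNN
blaa/NNNN
blaa/NNNN
//
"""


def orchid_to_xml(lines):
    """Convert Orchid-style lines to a <corpus> element (simplified format only)."""
    corpus = ET.Element("corpus")
    document = ET.SubElement(corpus, "document")
    paragraph = None
    sentence = None
    in_words = False  # True once the raw text of the current sentence has ended
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue
        if sentence is not None and in_words:
            if line == "//":
                # A lone "//" closes the sentence.
                sentence, in_words = None, False
            elif "/" in line:
                # Split on the LAST slash so a surface form may contain "/".
                surface, _, pos = line.rpartition("/")
                ET.SubElement(sentence, "word", surface=surface, pos=pos)
            continue
        if sentence is not None:
            # Raw text, possibly spread over several lines, ends with "//".
            raw = sentence.get("raw_txt")
            if line.endswith("//"):
                sentence.set("raw_txt", raw + line[:-2])
                in_words = True
            else:
                sentence.set("raw_txt", raw + line)
            continue
        if line.startswith("%"):
            continue  # metadata is ignored in this sketch
        if line.startswith("#P"):
            paragraph = ET.SubElement(document, "paragraph")
        elif line.startswith("#"):
            if paragraph is None:
                paragraph = ET.SubElement(document, "paragraph")
            sentence = ET.SubElement(paragraph, "sentence", raw_txt="")
    return corpus


def iter_words(xml_text):
    """Read the converted XML back with pulldom, yielding (surface, pos) pairs."""
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT and node.tagName == "word":
            yield node.getAttribute("surface"), node.getAttribute("pos")


if __name__ == "__main__":
    xml_text = ET.tostring(orchid_to_xml(SAMPLE.splitlines()), encoding="unicode")
    for surface, pos in iter_words(xml_text):
        print(surface, pos)
```

Splitting each word line on the last slash is a deliberate choice here, so that a token which is itself a slash still gets its tag.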

Wednesday, March 28, 2007

Displaying multilingual text in SVG using Firefox

In Khem's tree editor, SVG is used to display trees in Firefox. Firefox 2.x on Windows XP displays English and Thai text in SVG correctly. But when I tried Firefox 2.x on Mac OS X, the Thai, Bengali and Chinese text was rendered as boxes, as shown below.
[firefox screenshot]
(using the following code)
    <svg xmlns="http://www.w3.org/2000/svg"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="1.1" baseProfile="full">
      <text x="50" y="50" font-size="16" fill="blue">
        Wikipedia 維基百科 วิกิพีเดีย উইকিপিডিয়া
      </text>
    </svg>
Thus, I tried assigning a font family to the text, as in the following code:
    <svg xmlns="http://www.w3.org/2000/svg"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="1.1" baseProfile="full">
      <text x="50" y="50" font-family="Garuda" font-size="16" fill="blue">
        Wikipedia 維基百科 วิกิพีเดีย উইকিপিডিয়া
      </text>
    </svg>
It works: Firefox displays the Thai text correctly. However, Firefox still cannot display the Bengali and Chinese text, as shown below.
[firefox screenshot]
I also tried other font families, i.e. Times, Sans and Helvetica, but then only the English text was displayed.
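One thing that could be tried next is to give each script its own font instead of one font for the whole line, by wrapping each run in a `<tspan>` with its own `font-family`. I have not verified that this helps in Firefox 2.x on Mac OS X. The sketch below only generates such an SVG in Python. `build_svg` is a made-up helper, "Garuda" is the font from the test above, and the two "Example ..." font names are placeholders that must be replaced with fonts actually installed on the machine.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_svg(runs, x=50, y=50, font_size=16, fill="blue"):
    """Build an SVG string with one <tspan> per (text, font_family) run.

    A font_family of None leaves the run on the default font.
    """
    # Serialize the SVG namespace as the default namespace (no prefix).
    ET.register_namespace("", SVG_NS)
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"version": "1.1", "baseProfile": "full"})
    text = ET.SubElement(
        svg,
        f"{{{SVG_NS}}}text",
        {"x": str(x), "y": str(y), "font-size": str(font_size), "fill": fill},
    )
    for content, font_family in runs:
        span = ET.SubElement(text, f"{{{SVG_NS}}}tspan")
        if font_family:
            span.set("font-family", font_family)
        span.text = content + " "  # trailing space separates the runs
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    runs = [
        ("Wikipedia", None),
        ("維基百科", "Example CJK Font"),        # placeholder font name
        ("วิกิพีเดีย", "Garuda"),                 # font used in the test above
        ("উইকিপিডিয়া", "Example Bengali Font"),  # placeholder font name
    ]
    print(build_svg(runs))
```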

Sunday, February 25, 2007

A pure ruby ternary search tree implementation

source code. It takes 10 minutes to load the Yaitron dictionary. Thus, I will try ctst :-P. Thanks to lindever for introducing me to TST :-)
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.