Sunday, December 24, 2006

GIZA++ on Mac OS X (HFS+)

Today I found that foobar.a3.final and foobar.A3.final are the same file on HFS+ (the file system used on my iBook), because HFS+ is case-insensitive by default. That explains why foobar.A3.final in my working directory does not match what GIZA++'s README describes. A workaround is as follows:
diff -Nuar GIZA++-v2/model3.cc GIZA++-v2-osx/model3.cc
--- GIZA++-v2/model3.cc Tue Sep 30 21:24:18 2003
+++ GIZA++-v2-osx/model3.cc     Sat Dec 23 18:16:08 2006
@@ -318,8 +318,8 @@
     d4file = Prefix + ".d4." + number ;
     d4file2 = Prefix + ".D4." + number ;
     d5file = Prefix + ".d5." + number ;
-      alignfile = Prefix + ".A3." + number ;
-      test_alignfile = Prefix + ".tst.A3." + number ;
+      alignfile = Prefix + ".uA3." + number ;
+      test_alignfile = Prefix + ".tst.uA3." + number ;
     p0file = Prefix + ".p0_3." + number ;
   }
   // clear count tables
I noticed this after running GIZA++ on NetBSD, where the result was just as the README describes. Update: I have since switched from Mac OS X to Ubuntu: http://blog.vee-u.com/2008/03/02/giza_pp/
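
To check in advance whether a working directory is affected, a small probe can create a lowercase file name and see whether the uppercase spelling resolves to it. The following is only a minimal sketch, assuming a C++17 compiler with std::filesystem; the function names and the probe file name are made up for illustration and are not part of GIZA++. The name comparison folds ASCII letters only, which is a simplification of what HFS+ really does.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: true if two file names differ at most in ASCII
// letter case, i.e. they would collide on a case-insensitive file system.
bool same_name_ignoring_case(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb) return false;
    }
    return true;
}

// Hypothetical helper: probes whether the file system holding `dir`
// treats names that differ only in case as the same file.
// Returns false if the probe file cannot be created at all.
bool is_case_insensitive(const fs::path& dir) {
    const fs::path lower = dir / "giza_case_probe.a3.tmp";
    const fs::path upper = dir / "giza_case_probe.A3.tmp";
    std::error_code ec;  // use non-throwing overloads for cleanup
    fs::remove(lower, ec);
    fs::remove(upper, ec);
    {
        std::ofstream out(lower);  // create only the lowercase name
        out << "probe";
    }
    // If the uppercase name now exists, both names map to one file.
    const bool same = fs::exists(upper, ec);
    fs::remove(lower, ec);
    return same;
}
```

Calling is_case_insensitive(fs::current_path()) before a run tells whether the .a3/.A3 outputs would overwrite each other there, and same_name_ignoring_case shows why the patched names are safe: "foobar.a3.final" and "foobar.uA3.final" differ even when case is ignored.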

2 comments:

Kevin Brubeck Unhammer said...

I just ran into this problem myself. I'm glad I was running it on a small test corpus first! Macs really should come with a big red sticker saying "Warning! File system is case insensitive!" because I've had so much trouble with this... By the way, I may be running a different version of GIZA++: the relevant file is called "model3.cpp" in mine.

Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.