Convert MS Word document (*.doc) to text



zero_one
January 13th, 2006, 11:00 AM
Is there any way in PHP or ASP to convert a Microsoft Word document into plain text?

The text would later be saved in a database row so that I can do a fulltext search with MySQL. I don't need the formatting or images; the user can get those by opening the link to the original doc file.

(I'd like the user to just supply the doc file. The user also already has an archive of docs that would need to be added to this database.)

mpelland
January 13th, 2006, 11:18 AM
http://www.google.com/custom?hl=en&lr=&ie=ISO-8859-1&oe=ISO-8859-1&client=pub-2951707118576741&channel=5742870948&cof=FORID%3A1%3BL%3Ahttp%3A%2F%2Ffiles.phpclasses.org%2Fgraphics%2Fgooglesearch.jpg%3BLH%3A50%3BLW%3A256%3BGL%3A1%3BBGC%3AA3C5CC%3BT%3A%23000000%3BLC%3A%230000ff%3BVLC%3A%23663399%3BALC%3A%230000ff%3BGALT%3A%23663399%3BGFNT%3A%230000ff%3BGIMP%3A%230000ff%3BDIV%3A%23222222%3BLBGC%3AA3C5CC%3BAH%3Acenter%3BS%3Ahttp%3A%2F%2Fwww.phpclasses.org%2Fsearch.html%3B&domains=www.phpclasses.org&q=read+ms+word+files&btnG=Search&sitesearch=www.phpclasses.org

try there

λ
January 13th, 2006, 01:22 PM
Try running catdoc (http://www.45.free.net/~vitus/software/catdoc/) on the file using the exec (http://uk2.php.net/manual/en/function.exec.php) function.

Note that with open standards there'd be no need to resort to an even slightly hacky solution like this ;)
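
For anyone following along, here is a rough sketch of what calling catdoc through exec() might look like. It assumes catdoc is installed and on the PATH; the function name doc_to_text_catdoc and its $binary parameter are made up for illustration and are not part of catdoc or PHP.

```php
<?php
// Hypothetical helper: convert a .doc file to plain text by shelling out.
// Assumes the catdoc binary is installed and reachable on the PATH.
function doc_to_text_catdoc(string $docPath, string $binary = 'catdoc'): string
{
    if (!is_readable($docPath)) {
        throw new InvalidArgumentException("Cannot read file: $docPath");
    }

    // escapeshellarg() protects against spaces and shell metacharacters
    // in a user-supplied file name.
    $command = escapeshellcmd($binary) . ' ' . escapeshellarg($docPath);

    $lines = [];
    $status = 0;
    // exec() fills $lines with one entry per output line and sets $status
    // to the exit code of the command.
    exec($command, $lines, $status);

    if ($status !== 0) {
        throw new RuntimeException("$binary exited with status $status");
    }

    return implode("\n", $lines);
}
```

Something like $text = doc_to_text_catdoc('/path/to/upload.doc'); would then give a string that can be stored in the database row. Escaping the path matters here because the file name comes from the user.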

zero_one
January 13th, 2006, 05:47 PM
I'll take a look at it, thanks.
Do you suggest any other way I could tackle this problem?

ironikart
January 13th, 2006, 07:25 PM
If your site is running on a *nix server, you can use 'Antiword'.

zero_one
January 14th, 2006, 05:38 PM
What's a *nix server? I'll be hosting it on one of the hosting plans offered by godaddy.com.

Jeff Wheeler
January 14th, 2006, 05:53 PM
It's a Unix-like server, such as Linux (as opposed to a Windows server). You'll be a happy camper if you're on one. :)

zero_one
January 15th, 2006, 08:01 AM
Yes, I'll be on a Linux server.

Jeff Wheeler
January 15th, 2006, 09:21 AM
Then use antiword. :)

ironikart
January 15th, 2006, 05:14 PM
If you're on a shared hosting plan (which sounds like you will be), the host may have already installed the tools. I'd check with them on three things: whether the tools are installed, whether they will let you install them yourself if not, and whether the php.ini settings allow you to call exec(), which is (I think) what you will need to call antiword and pass its output back to PHP.
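
Some of those checks can probably be done from a small PHP script before asking the host. The sketch below is only a starting point: exec_is_available and find_tool are hypothetical helper names, and it assumes a POSIX shell where `command -v` works, which should be the case on a typical Linux host.

```php
<?php
// Returns true if exec() exists and is not listed in the
// disable_functions setting from php.ini.
function exec_is_available(): bool
{
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    return function_exists('exec') && !in_array('exec', $disabled, true);
}

// Returns the path the shell resolves for a command-line tool,
// or null if exec() is unavailable or the tool cannot be found.
// Assumes a POSIX shell that provides `command -v`.
function find_tool(string $name): ?string
{
    if (!exec_is_available()) {
        return null;
    }
    $lines = [];
    $status = 0;
    exec('command -v ' . escapeshellarg($name), $lines, $status);
    return ($status === 0 && isset($lines[0])) ? $lines[0] : null;
}

$antiword = find_tool('antiword');
if ($antiword === null) {
    echo "antiword is not available; ask the host whether it can be installed.\n";
} else {
    echo "antiword found at $antiword\n";
}
```

If the tool is found, antiword prints the document text to standard output by default, so a call such as exec($antiword . ' ' . escapeshellarg($path), $lines) should collect it line by line.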