Tuesday, November 10, 2009

How to extract text from a PDF

Option 1:
The best solution, AFAIK, is to use xpdf:
http://www.foolabs.com/xpdf/download.html

Under Linux (Debian/Ubuntu), just run:
apt-get install xpdf-utils
This installs the command-line tool pdftotext.

Then, in PHP, you execute the command and read back the text it produces.
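
Here is a minimal sketch of that step, assuming PHP 7 or later, that pdftotext is on the PATH, and that exec() is not disabled on your host. The function names build_pdftotext_command and pdf_to_text are made up for this example. Passing "-" as the output file makes pdftotext write the text to stdout, so no temporary file is needed.

```php
<?php
// Hypothetical helper: builds the pdftotext command line for one PDF.
function build_pdftotext_command(string $pdfPath): string
{
    // escapeshellarg() quotes the path so spaces or shell metacharacters are safe.
    // The trailing "-" tells pdftotext to write the text to stdout.
    return 'pdftotext -enc UTF-8 ' . escapeshellarg($pdfPath) . ' -';
}

// Hypothetical helper: runs pdftotext and returns the extracted text.
function pdf_to_text(string $pdfPath): string
{
    if (!is_readable($pdfPath)) {
        throw new InvalidArgumentException("Cannot read PDF: $pdfPath");
    }

    $lines = [];
    $exitCode = 0;
    // exec() appends each output line to $lines and sets the exit status.
    exec(build_pdftotext_command($pdfPath), $lines, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException("pdftotext failed with exit code $exitCode");
    }

    // exec() strips trailing whitespace from each line; rejoin with newlines.
    return implode("\n", $lines);
}
```

Usage would look like: $text = pdf_to_text('/path/to/file.pdf'); Checking the exit code matters because pdftotext returns a non-zero status for damaged or protected files, and you probably want an exception instead of an empty string.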

Option 2:
http://www.pdflib.com/products/tet/

It provides a compiled library that you call from within your PHP code, which eliminates the need to call command-line executables and parse their output.

Option 3:

Use Zend Framework, which includes the Zend_Pdf component for working with PDF documents.

Enjoy Programming !!!