-
April 8th, 2009, 07:58 PM
#1
Command Line?
Hello,
I downloaded this program to convert PDF to text, but I can't make head nor tail of the readme. I think it has something to do with the command line, which I also seem to have forgotten how to use. How do you edit the command line? How do you get to it? What do I type? I'm running Win '98 on this desktop; I have XP on my laptop.
Yours floundering in a fog,
Chas
*******************************
ATTENTION: pdftotext = pdf-txt1
*******************************
pdftotext(1) pdftotext(1)
NAME
pdftotext - Portable Document Format (PDF) to text converter (version 0.91)
SYNOPSIS
pdftotext [options] [PDF-file [text-file]]
DESCRIPTION
Pdftotext converts Portable Document Format (PDF) files to
plain text.
Pdftotext reads the PDF file, PDF-file, and writes a text
file, text-file. If text-file is not specified, pdftotext
converts file.pdf to file.txt. If text-file is '-', the
text is sent to stdout.
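The output rule in the DESCRIPTION above can be sketched in a few lines of Python. This is only an illustration of the rule as the readme states it, not the program's actual code; the function name `resolve_output` is made up for the example, and what happens when the input name has no `.pdf` suffix is an assumption, since the readme does not say.

```python
import os


def resolve_output(pdf_file, text_file=None):
    """Return where the text would go, following the readme's stated rule."""
    # '-' as the text-file means the text is sent to stdout.
    if text_file == "-":
        return "<stdout>"
    # An explicit text-file name is used as given.
    if text_file is not None:
        return text_file
    # No text-file given: file.pdf becomes file.txt.
    base, ext = os.path.splitext(pdf_file)
    if ext.lower() == ".pdf":
        return base + ".txt"
    # Assumption: the readme does not cover names without a .pdf suffix.
    return pdf_file + ".txt"


print(resolve_output("file.pdf"))
```

So if you only give the PDF name, the text file should turn up next to it with the same name and a `.txt` ending.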
OPTIONS
-f number
Specifies the first page to convert.
-l number
Specifies the last page to convert.
-ascii7
Convert the text to 7-bit ASCII; the default is to
use the 8-bit ISO Latin-1 character set.
-latin2
Convert the text to the Latin-2 (ISO-8859-2)
character set. (This will only be useful if the font
encodings are specified correctly in the PDF file.)
-latin5
Convert the text to the Latin-5 (ISO-8859-9)
character set. (This will only be useful if the font
encodings are specified correctly in the PDF file.)
-eucjp Convert Japanese text to EUC-JP. This is currently
the only option for converting Japanese text -- the
only effect is to switch to 7-bit ASCII for
non-Japanese text, in order to fit into the EUC-JP
encoding. (This option is only available if
pdftotext was compiled with Japanese support.)
-raw Keep the text in content stream order. This is a
hack which often "undoes" column formatting, etc.
This option will likely be replaced with something
more sophisticated when pdftotext is rewritten to
use a smarter text placement algorithm.
-upw password
Specify the user password for the PDF file.
-q Don't print any messages or errors.
-v Print copyright and version information.
-h Print usage information. (-help is equivalent.)
BUGS
Some PDF files contain fonts whose encodings have been
mangled beyond recognition. There is no way (short of
OCR) to extract text from these files.
AUTHOR
The pdftotext software and documentation are copyright
1996-2000 Derek B. Noonburg (derekn@foolabs.com).
SEE ALSO
xpdf(1), pdftops(1), pdfinfo(1), pdftopbm(1), pdfimages(1)
http://www.foolabs.com/xpdf/
Last edited by chaszal; April 8th, 2009 at 08:05 PM.
-
April 8th, 2009, 08:14 PM
#2
Which OS are you trying this in?
I've moved your thread to the Windows 98 forum as this OS still has DOS.
-
April 8th, 2009, 08:32 PM
#3
I see you list 98, so you are in the right forum now.
To generate a plain text file, run pdftotext:
http://www.foolabs.com/xpdf/README
I believe this is where you got it,
http://www.foolabs.com/xpdf/home.html
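To show how the SYNOPSIS line in the readme turns into something you actually type, here is a rough Python sketch that just assembles the command text. It does not run pdftotext. The function name `build_pdftotext_command` and the file names `manual.pdf` and `manual.txt` are made up for illustration; the options used (`-f`, `-l`, `-raw`, `-q`) are the ones listed in the readme quoted in the first post.

```python
def build_pdftotext_command(pdf_file, text_file=None, first_page=None,
                            last_page=None, raw=False, quiet=False):
    """Assemble 'pdftotext [options] [PDF-file [text-file]]' as a list."""
    args = ["pdftotext"]
    # Options come first, as in the SYNOPSIS.
    if first_page is not None:
        args += ["-f", str(first_page)]
    if last_page is not None:
        args += ["-l", str(last_page)]
    if raw:
        args.append("-raw")
    if quiet:
        args.append("-q")
    # Then the PDF file, then the optional text file.
    args.append(pdf_file)
    if text_file is not None:
        args.append(text_file)
    return args


# Hypothetical file names, pages 1 through 5 only.
cmd = build_pdftotext_command("manual.pdf", "manual.txt",
                              first_page=1, last_page=5)
print(" ".join(cmd))  # prints: pdftotext -f 1 -l 5 manual.pdf manual.txt
```

The printed line is what you would type at the prompt, from the folder that holds the program and the PDF. On Windows 98 the prompt is normally under Start > Programs > MS-DOS Prompt; on XP, use Start > Run and type `cmd`.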