July 3, 2009
Getting data out of PDF files is hard
I was helping a friend get data out of PDF files yesterday, and he had only limited tools to work with. In that situation your best bet is to use the text select tool while holding the ALT key, so you can select one column at a time. I use OmniPage, the OCR software that came with my scanner, and in general I find it pretty easy to get data out. When I worked in banking I didn't have access to such software, and I used to dread extracting this kind of data. My friend wrote to me this morning to mention that he had discovered PDF to Excel, which he really liked. I haven't tried it yet, but I will give it a shot on my next data extraction project. Fingers crossed that it can do something intelligent with footnotes and Greek symbols.
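If you end up pasting a whole table as text instead of going column by column, a few lines of Python can sometimes split it back into cells. This is only a rough sketch, not a tool I have used on real filings: the function names and the sample table are made up for illustration, and it assumes the pasted cells are separated by a tab or by two or more spaces, which will not hold for every PDF.

```python
import csv
import io
import re


def rows_from_pasted_text(text):
    """Split text pasted from a PDF into rows of cells.

    Assumption: cells are separated by a tab or by two or more
    whitespace characters. Single spaces stay inside a cell.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        rows.append(re.split(r"\s{2,}|\t", line))
    return rows


def to_csv(rows):
    """Render rows as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample, standing in for text copied out of a PDF.
sample = """Year   Revenue   Net income
2007   1,204     310
2008   1,150     275
"""

print(to_csv(rows_from_pasted_text(sample)))
```

The csv module quotes cells that contain commas, so a figure like 1,204 survives as one value when the file is opened in a spreadsheet. Footnote markers and Greek symbols would still need cleaning up by hand.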
Posted by OneEyedMan at July 3, 2009 8:20 AM