When I run:
import doc2text
doc = doc2text.Document()
doc.read('something.pdf')
doc.process()
I get:
Error in /usr/local/lib/python2.7/dist-packages/doc2text/page.py on line 23
dst is not a numpy array, neither a scalar
Error in /usr/local/lib/python2.7/dist-packages/doc2text/page.py on line 197
dst is not a numpy array, neither a scalar
Error in /usr/local/lib/python2.7/dist-packages/doc2text/page.py on line 77
dst is not a numpy array, neither a scalar
Then, when I call:
doc.extract_text()
I get:
AttributeError Traceback (most recent call last)
<ipython-input-5-57184997370d> in <module>()
----> 1 doc.extract_text()
/usr/local/lib/python2.7/dist-packages/doc2text/__init__.pyc in extract_text(self)
89 for page in self.processed_pages:
90 new = page
---> 91 text = new.extract_text()
92 self.page_content.append(text)
93 else:
/usr/local/lib/python2.7/dist-packages/doc2text/page.pyc in extract_text(self)
36 def extract_text(self):
37 temp_path = 'text_temp.png'
---> 38 cv2.imwrite(temp_path, self.image)
39 self.text = pytesseract.image_to_string(Image.open(temp_path))
40 os.remove(temp_path)
AttributeError: Page instance has no attribute 'image'
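The two failures look related. One possible explanation is that process() catches the underlying error and only prints it, so self.image is never assigned and extract_text() later fails. The sketch below reproduces that pattern in isolation. It is only an assumption about the cause: the Page class here is a hypothetical stand-in, not doc2text's actual code, and pages_missing_image is a helper name made up for illustration.

```python
class Page(object):
    """Hypothetical stand-in for doc2text's Page; not the library's real code."""

    def __init__(self, raw):
        self.raw = raw  # input as loaded; `image` is deliberately not set here

    def process(self):
        try:
            if self.raw is None:
                # Stand-in for the OpenCV call that fails inside page.py
                raise TypeError("dst is not a numpy array, neither a scalar")
            self.image = self.raw  # only reached when processing succeeds
        except Exception as exc:
            # Assumed behaviour: the error is printed, not re-raised
            print("Error in page.py: %s" % exc)

    def extract_text(self):
        # Raises AttributeError if process() never assigned self.image
        return "text from %r" % (self.image,)


def pages_missing_image(pages):
    """Return the indices of pages that have no `image` attribute."""
    return [i for i, page in enumerate(pages) if not hasattr(page, "image")]


# One page that processes cleanly and one that fails during process()
pages = [Page("pixels"), Page(None)]
for page in pages:
    page.process()

missing = pages_missing_image(pages)
print("pages without an image: %s" % missing)
```

If that assumption holds, running the same check on doc.processed_pages (the attribute named in the traceback) after doc.process() should show which pages failed, before extract_text() is called.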