Right now our extraction uses CyberNeko to parse HTML and provide a DOM.
However, since CyberNeko does not include a JS engine, content loaded through JS cannot be extracted.
One way to overcome this is to interface with PhantomJS (which provides both a DOM and a JS engine) for extraction.
This will require designing an extraction engine API in BSJava and implementing a wrapper for PhantomJS behind that API.
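
As a starting point for discussion, here is a minimal sketch of what such an API and wrapper might look like. Everything named here is an assumption for illustration, not existing BSJava code: the `ExtractionEngine` interface, the `PhantomJsEngine` class, and the `dump_dom.js` script (a hypothetical PhantomJS script that would open the URL given as its first argument and print the rendered page content to stdout). The only PhantomJS behavior relied on is its command-line form, `phantomjs <script> [args]`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical engine abstraction: returns the page's HTML after the engine has processed it. */
interface ExtractionEngine {
    String renderedHtml(String url) throws IOException, InterruptedException;
}

/** Hypothetical wrapper that shells out to a phantomjs binary (assumed to be installed). */
final class PhantomJsEngine implements ExtractionEngine {
    private final String binary;      // e.g. "phantomjs" if it is on the PATH
    private final String scriptPath;  // assumed script that prints the rendered DOM to stdout

    PhantomJsEngine(String binary, String scriptPath) {
        this.binary = binary;
        this.scriptPath = scriptPath;
    }

    /** Builds the command line: phantomjs <script> <url>. */
    List<String> command(String url) {
        return List.of(binary, scriptPath, url);
    }

    @Override
    public String renderedHtml(String url) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(url))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr out of the HTML
                .start();
        // Read stdout fully before waiting, so a large page cannot block the child process.
        byte[] out = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("phantomjs exited with code " + exitCode);
        }
        return new String(out, StandardCharsets.UTF_8);
    }
}
```

With an interface like this, the existing CyberNeko path could become a second `ExtractionEngine` implementation, so callers would choose an engine per page instead of depending on a specific parser. The returned HTML could then be fed to the current DOM-based extraction unchanged.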