How to scrape using html files if the site did not declare any "class" #74

@jhnferraris

Hello,

I'm trying to brush up on my JavaScript skills and would like to try out this neat scraper. I have this static website: http://www.phivolcs.dost.gov.ph/html/update_SOEPD/EQLatest.html, and I'm trying to scrape the 2017 table from it.

Compared to the Hacker News website, my target site doesn't have any CSS classes I can use to target the text I want to scrape.

Example: [screenshot, 2017-09-01]

For starters, I tried it this way:

var scraperjs = require('scraperjs');

router.get('/bulletin', function(request, response, next){
    scraperjs.StaticScraper.create('http://www.phivolcs.dost.gov.ph/html/update_SOEPD/EQLatest.html')
        .scrape(function($) {
            // This is similar to an inspector on a scrapinghub service.
            return $("html > body > div > table > tbody > tr > td").map(function() {
                console.log($(this));
                return $(this).text();
            }).get();
        })
        .then(function(news) {
            response.send(news);
        })
});

But I can't get any data from the static page. How can I achieve this?

Thanks for the assist!
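
For what it's worth, the `.map(...).get()` call in the attempt above returns one flat array of cell strings, so even once the selector matches, the cells would still need to be regrouped into rows. A minimal sketch of that step follows. The helper name `cellsToRows`, the column count, and the sample values are all assumptions made for illustration and are not taken from the target page.

```javascript
// Hypothetical helper: regroup a flat list of cell texts into table rows.
// `columns` is an assumption about the table layout; check it against the page.
function cellsToRows(cells, columns) {
  if (!Number.isInteger(columns) || columns < 1) {
    throw new RangeError('columns must be a positive integer');
  }
  const rows = [];
  for (let i = 0; i < cells.length; i += columns) {
    // Take the next `columns` cells and trim whitespace left over from the HTML.
    rows.push(cells.slice(i, i + columns).map(function (text) {
      return String(text).trim();
    }));
  }
  return rows;
}

// Made-up sample values, only to show the shape of the result.
const sample = [' cell 1 ', 'cell 2', 'cell 3', 'cell 4', 'cell 5', 'cell 6'];
console.log(cellsToRows(sample, 3));
```

One thing that may be worth checking for the empty result itself: browser inspectors typically display a `tbody` element even when the raw HTML has none, so a selector copied from an inspector might not match what a static parser sees. Trying a shorter selector such as `table tr td` is one possible experiment; this is a guess about the page, not something confirmed here.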
