Respect base tags #17

@noyainrain

If a document defines a base tag, take it into account when resolving extracted ad URLs.

Draft

class Company:
    """
    URL field of an ad. The extracted URL is resolved against the base URL of the document (for HTML
    see https://developer.mozilla.org/en-US/docs/Web/HTML/Element/base).
    """
    url_path: str

Implementation hints: Company._parse_html() could search for a <base> tag and resolve each
extracted ad URL against its href. A minimal HTML document containing a <base> tag could be used
for a unit test.
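
The hint above could be sketched roughly as follows, using only the standard library. This is an illustration under assumptions, not the project's actual code: `resolve_ad_urls()`, `_LinkCollector` and `MINIMAL_DOC` are hypothetical names, and ad URLs are assumed to come from plain `<a href>` elements, whereas the real extraction in `Company._parse_html()` may work differently.

```python
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Hypothetical helper: collect the first <base href> and all <a href> values."""

    def __init__(self) -> None:
        super().__init__()
        self.base_href: Optional[str] = None
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base" and self.base_href is None:
            # Only the first <base> element that has an href is taken into account
            self.base_href = href
        elif tag == "a":
            # Assumption: every link stands in for an extracted ad URL
            self.hrefs.append(href)


def resolve_ad_urls(html: str, document_url: str) -> List[str]:
    """Return all link URLs of *html*, resolved against the document's base URL."""
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    # The <base href> may itself be relative, so resolve it against the document URL
    # first. Without a <base> tag, the document URL is the base URL.
    base_url = urljoin(document_url, collector.base_href or "")
    return [urljoin(base_url, href) for href in collector.hrefs]


# Minimal document with a <base> tag, as it might be used in a unit test
MINIMAL_DOC = """
<!DOCTYPE html>
<html>
    <head><base href="https://example.org/ads/"></head>
    <body><a href="42">Ad</a> <a href="/jobs/7">Job</a></body>
</html>
"""

if __name__ == "__main__":
    print(resolve_ad_urls(MINIMAL_DOC, "https://example.com/company/index.html"))
```

Resolving the `<base>` href against the document URL before joining covers relative base URLs such as `<base href="/ads/">`. When the document has no `<base>` tag, links are resolved against the document URL itself.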

Labels: enhancement (New feature or request)