
Popular repositories

  1. solrwayback Public

    A search interface and wayback machine for the UKWA Solr based warc-indexer framework.

    Java · 133 stars · 28 forks

  2. heatmap Public

    A GitHub-inspired graph for visualising activity

    JavaScript · 35 stars · 4 forks

  3. netarchivesuite Public

    Netarchivesuite development

    Java · 22 stars · 23 forks

  4. netsearch Public

    Merged search-arctika and search-achon into a multi-module project

    Java · 14 stars · 2 forks

  5. dvenabler Public

    Adds DocValues to Solr index fields without full re-index

    Java · 9 stars · 1 fork

  6. so-me Public

    Social Media harvests

    Shell · 9 stars

Repositories

Showing 10 of 56 repositories
  • solrwayback Public

    A search interface and wayback machine for the UKWA Solr based warc-indexer framework.

    Java · 133 stars · Apache-2.0 · 28 forks · 63 issues (1 issue needs help) · 2 pull requests · Updated Jan 26, 2026
  • browsertrix-custom-behaviors Public

    Browsertrix Crawler supports automatically running customized behaviors on each page. Several types of behaviors are supported, including built-in, background, and site-specific behaviors. Fully user-defined custom behaviors can also be added to trigger specific actions on certain pages.

    JavaScript · 1 star · 0 forks · 0 issues · 0 pull requests · Updated Jan 16, 2026
  • warc-indexer Public

    Index ARC/WARC files into Solr for search purposes.

    Java · 2 stars · GPL-2.0 · 2 forks · 3 issues · 0 pull requests · Updated Jan 9, 2026
  • netarchivesuite Public

    Netarchivesuite development

    Java · 22 stars · 23 forks · 32 issues · 21 pull requests · Updated Dec 17, 2025
  • heritrix3-wrapper Public

    Small wrapper to start/stop and communicate with Heritrix 3.

    Java · 3 stars · Apache-2.0 · 2 forks · 1 issue · 7 pull requests · Updated Nov 17, 2025
  • netarchivesuite-docker-compose Public

    Quickstart for Netarchivesuite using docker-compose

    Jinja · 2 stars · 5 forks · 0 issues · 1 pull request · Updated Nov 5, 2025
  • heritrix3 Public Forked from Landsbokasafn/heritrix3

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java · 0 stars · 801 forks · 0 issues · 2 pull requests · Updated Nov 5, 2025
  • netarchiveclient Public

    Java library to extract large-scale data from a Solr server with an index built by the warc-indexer.

    Java · 2 stars · Apache-2.0 · 0 forks · 0 issues · 0 pull requests · Updated Jun 12, 2025
  • crawlrss Public Forked from Landsbokasafn/crawlrss

    Crawl RSS - Heritrix 3 add-on

    Java · 0 stars · 6 forks · 0 issues · 0 pull requests · Updated Oct 14, 2024
  • jwat-tools Public

    JWAT Tools

    Java · 5 stars · 2 forks · 2 issues · 1 pull request · Updated Dec 13, 2023

