refactor: separate stats endpoints into two, rename keys for more clarity #387

Open
bolinocroustibat wants to merge 5 commits into main from feat/better-stats-endpoints

@bolinocroustibat bolinocroustibat commented Feb 4, 2026

Closes #382 and more.

  1. Moved resource stats (total_count, deleted_count, statuses_count) from /api/status/crawler to /api/resources/stats. Kept total_eligible_count in crawler status (crawler-related and expensive).

    • More consistent scope for each stats endpoint (one for resource information, one for crawler status)
    • This makes /api/status/crawler less SQL-expensive and better balances the expensive queries between /api/status/crawler and /api/resources/stats
  2. Renamed status endpoint keys for clarity (fixes Rename and add keys/values in /api/status/crawler response for more clarity #382):

    • pending_count → needs_check_count
    • fresh_count → up_to_date_check_count
    • checked_percentage → needs_check_percentage
    • fresh_percentage → up_to_date_check_percentage
  3. Added in_progress_count and in_progress_percentage

  4. Performance optimization: Combined 2 SQL queries into 1 (reduced from 3 to 2 queries total in /api/status/crawler)

  5. Renamed variable: total_resources_filtered → total_eligible_resources (and JSON key total_filtered_count → total_eligible_count)

  6. Added tests for the new endpoint, adapted the test for /crawler/stats, and added a new test case (outdated check)
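
For API consumers, the renames in items 2 and 5 can be collected into a lookup table. The sketch below is only an illustration: `RENAMED_KEYS` and `stale_keys` are hypothetical names, not part of this PR. It reports which of the keys a client still reads have been renamed. It does not convert values, since the description does not say whether values carry over unchanged under the new names.

```python
# Old -> new JSON key names, copied from the rename list in this description.
RENAMED_KEYS = {
    "pending_count": "needs_check_count",
    "fresh_count": "up_to_date_check_count",
    "checked_percentage": "needs_check_percentage",
    "fresh_percentage": "up_to_date_check_percentage",
    "total_filtered_count": "total_eligible_count",
}


def stale_keys(keys):
    """Return {old_key: new_key} for every renamed key found in `keys`.

    Hypothetical helper for client code, not part of the PR.
    """
    return {key: RENAMED_KEYS[key] for key in keys if key in RENAMED_KEYS}


# Keys a client might still be reading from /api/status/crawler.
report = stale_keys(["pending_count", "in_progress_count", "total_filtered_count"])
for old, new in report.items():
    print(f"{old} is now {new}")
```

Keys that are new in this PR, such as in_progress_count, are not in the table and are therefore left out of the report.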

How it looks

GET /api/resources/stats:

{
  "total_count": 100,
  "deleted_count": 3,
  "statuses_count": {
    "null": 85,
    "BACKOFF": 2,
    "CRAWLING_URL": 1,
    "TO_ANALYSE_RESOURCE": 0,
    "ANALYSING_RESOURCE_HEAD": 0,
    "DOWNLOADING_RESOURCE": 0,
    "ANALYSING_DOWNLOADED_RESOURCE": 0,
    "TO_ANALYSE_CSV": 0,
    "ANALYSING_CSV": 0,
    "VALIDATING_CSV": 0,
    "INSERTING_IN_DB": 0,
    "CONVERTING_TO_PARQUET": 0,
    "TO_ANALYSE_GEOJSON": 0,
    "ANALYSING_GEOJSON": 0,
    "CONVERTING_TO_PMTILES": 0,
    "CONVERTING_TO_GEOJSON": 0,
    "TO_ANALYSE_PARQUET": 0,
    "ANALYSING_PARQUET": 0
  },
  "cors": {
    "external_resources_with_cors_data": 42,
    "external_resources_without_cors_data": 55,
    "external_resources_cors_coverage_percentage": 43.3,
    "external_resources_allow_origin_distribution": [
      {
        "access_status": "Accessible (Specific Whitelist)",
        "unique_resources_count": 20,
        "percentage": 47.62
      },
      {
        "access_status": "Accessible (Wildcard *)",
        "unique_resources_count": 15,
        "percentage": 35.71
      },
      {
        "access_status": "Blocked (Missing Header)",
        "unique_resources_count": 5,
        "percentage": 11.9
      },
      {
        "access_status": "Blocked (Other Domain Only)",
        "unique_resources_count": 2,
        "percentage": 4.76
      }
    ]
  }
}
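
The percentages in the cors block appear to follow directly from the counts. The sketch below recomputes them from the sample values above. The formulas and rounding (one decimal for coverage, two for the distribution) are assumptions inferred from this sample, not taken from the implementation, and `percentage` is a hypothetical helper.

```python
def percentage(part, total, digits=2):
    """Share of `part` in `total` as a rounded percentage (0.0 if total is 0)."""
    return round(part / total * 100, digits) if total else 0.0


# Counts copied from the sample response above.
with_cors = 42
without_cors = 55
distribution_counts = {
    "Accessible (Specific Whitelist)": 20,
    "Accessible (Wildcard *)": 15,
    "Blocked (Missing Header)": 5,
    "Blocked (Other Domain Only)": 2,
}

# Assumed: coverage = resources with CORS data / all external resources.
coverage = percentage(with_cors, with_cors + without_cors, digits=1)

# Assumed: each share is relative to the resources that have CORS data.
shares = {
    status: percentage(count, with_cors)
    for status, count in distribution_counts.items()
}

print(coverage)
print(shares)
```

Under these assumptions the results match the sample response: the four distribution counts sum to 42, the number of external resources with CORS data.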

GET /api/status/crawler:

{
  "checks": {
    "in_progress_count": 5,
    "in_progress_percentage": 4.76,
    "needs_check_count": 60,
    "needs_check_percentage": 63.16,
    "up_to_date_check_count": 35,
    "up_to_date_check_percentage": 36.84
  },
  "resources": {
    "total_eligible_count": 95
  }
}
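
As a reading aid, the sketch below checks how the sample values relate to each other. It assumes, based only on this sample, that needs_check_count and up_to_date_check_count together cover total_eligible_count and that their percentages are relative to it. The description does not state the denominator for in_progress_percentage, so that value is not recomputed here. `percentage` is a hypothetical helper.

```python
def percentage(part, total, digits=2):
    """Share of `part` in `total` as a rounded percentage (0.0 if total is 0)."""
    return round(part / total * 100, digits) if total else 0.0


# Sample response copied from the description above.
status = {
    "checks": {
        "in_progress_count": 5,
        "in_progress_percentage": 4.76,
        "needs_check_count": 60,
        "needs_check_percentage": 63.16,
        "up_to_date_check_count": 35,
        "up_to_date_check_percentage": 36.84,
    },
    "resources": {"total_eligible_count": 95},
}

checks = status["checks"]
total_eligible = status["resources"]["total_eligible_count"]

# Assumption inferred from the sample: the two buckets add up to the eligible total.
buckets_sum = checks["needs_check_count"] + checks["up_to_date_check_count"]

# Assumption: both percentages use total_eligible_count as the denominator.
needs_check_pct = percentage(checks["needs_check_count"], total_eligible)
up_to_date_pct = percentage(checks["up_to_date_check_count"], total_eligible)

print(buckets_sum == total_eligible, needs_check_pct, up_to_date_pct)
```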

@bolinocroustibat bolinocroustibat changed the title refactor: separate stats endpoints into two, rename keys for more cla… refactor: separate stats endpoints into two, rename keys for more crity Feb 4, 2026
@bolinocroustibat bolinocroustibat self-assigned this Feb 4, 2026
@bolinocroustibat bolinocroustibat marked this pull request as draft February 4, 2026 13:11
@bolinocroustibat bolinocroustibat changed the title refactor: separate stats endpoints into two, rename keys for more crity refactor: separate stats endpoints into two, rename keys for more clarity Feb 4, 2026
@bolinocroustibat bolinocroustibat marked this pull request as ready for review February 4, 2026 16:46

@maudetes maudetes left a comment

Thank you for the refactor!

Labels

enhancement New feature or request

Projects

Status: 👀 Review

Development

Successfully merging this pull request may close these issues.

Rename and add keys/values in /api/status/crawler response for more clarity

2 participants