Analyzer
Mia Sorola Yoshida edited this page May 7, 2025 · 1 revision
The Analyzer is a tool that estimates whether the profiles collected by the Crawler are likely to belong to NAF alumni.
- Collected data to train a machine learning (ML) model (data is available here: https://docs.google.com/spreadsheets/d/1BBxZNuxyXkUHUc9krwzmN3ktNMM6yVH4NNTZzVfhs9w/edit?usp=sharing)
- Built an ML model that assigns importance (weights) to different profile identifiers
- Created a way to calculate a confidence score based on these weights
- Connected the Analyzer to the database so it can read profiles from the Crawler and save results in the Analyzer tables
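The idea of turning identifier weights into a confidence score can be sketched as follows. This is a minimal illustration, not the Analyzer's actual code: the identifier names, the weight values, and the normalization (matched weight divided by total weight) are all assumptions made for the example. The real weights come from the trained model.

```typescript
// Hypothetical identifier names and weights, for illustration only.
// The real weights are produced by the trained ML model.
type Weights = Record<string, number>;

const exampleWeights: Weights = {
  mentionsNafAcademy: 0.5,
  schoolIsNafPartner: 0.3,
  careerThemeMatch: 0.2,
};

// Assumed formula: sum of the weights of the identifiers found on a profile,
// divided by the sum of all weights, giving a value between 0 and 1.
function confidenceScore(matched: string[], weights: Weights): number {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  if (total === 0) return 0; // no weights defined, so no evidence either way
  const hit = new Set(matched);
  let score = 0;
  for (const [name, w] of Object.entries(weights)) {
    if (hit.has(name)) score += w;
  }
  return score / total;
}

console.log(confidenceScore(["mentionsNafAcademy", "careerThemeMatch"], exampleWeights));
```

Normalizing by the total weight keeps the score comparable if identifiers are added or re-weighted later, which may make threshold tuning easier.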
The Analyzer takes profiles from the Crawler table, analyzes them, and stores results in the unconfirmed_alumni table.
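A rough in-memory sketch of this flow is shown below. Plain arrays stand in for the Crawler table and the unconfirmed_alumni table. The row shapes, the `scoreProfile` placeholder, and the function name `analyzeNewProfiles` are hypothetical; only `profile_url` and the table names come from this page. Profiles whose `profile_url` already has a result are skipped.

```typescript
// Hypothetical row shapes; the real tables likely have more columns.
interface CrawlerProfile {
  profile_url: string;
  text: string;
}

interface UnconfirmedAlumnus {
  profile_url: string;
  confidence: number;
}

// Placeholder scorer standing in for the ML-based confidence calculation.
function scoreProfile(profile: CrawlerProfile): number {
  return /\bNAF\b/i.test(profile.text) ? 0.9 : 0.1;
}

// Analyzes only profiles that have no result yet and returns how many were new.
function analyzeNewProfiles(
  crawler: CrawlerProfile[],
  unconfirmed: UnconfirmedAlumnus[],
): number {
  const seen = new Set(unconfirmed.map((row) => row.profile_url));
  let analyzed = 0;
  for (const profile of crawler) {
    if (seen.has(profile.profile_url)) continue; // already analyzed
    unconfirmed.push({
      profile_url: profile.profile_url,
      confidence: scoreProfile(profile),
    });
    seen.add(profile.profile_url);
    analyzed++;
  }
  return analyzed;
}
```

Running the function twice on the same input analyzes nothing the second time, which is what makes repeated runs safe in this sketch.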
- Open your terminal and go to the Analyzer folder:
    cd analyzer
- Build the TypeScript code into JavaScript:
    npm run build
- Start the Analyzer:
    npm start
What to expect
- The program will skip profiles already analyzed (based on the profile_url)
- It will show how many new profiles were analyzed
Example output:
- Collect more training data to improve the ML model, especially for rare cases
- Fine-tune confidence score thresholds to improve accuracy
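One possible way to approach threshold tuning is to compare precision and recall at several candidate thresholds on manually labeled profiles. The sketch below assumes a profile is flagged as a likely alumnus when its confidence is at or above the threshold; the `Labeled` shape and the function name `evaluateThreshold` are hypothetical and not part of the Analyzer.

```typescript
// Hypothetical shape: a confidence score plus a manually confirmed label.
interface Labeled {
  confidence: number;
  isAlumnus: boolean;
}

interface ThresholdResult {
  threshold: number;
  precision: number;
  recall: number;
}

// Counts true positives, false positives, and false negatives at one threshold.
function evaluateThreshold(data: Labeled[], threshold: number): ThresholdResult {
  let tp = 0;
  let fp = 0;
  let fn = 0;
  for (const item of data) {
    const predicted = item.confidence >= threshold;
    if (predicted && item.isAlumnus) tp++;
    else if (predicted && !item.isAlumnus) fp++;
    else if (!predicted && item.isAlumnus) fn++;
  }
  // By convention in this sketch, an empty denominator yields 1.
  const precision = tp + fp === 0 ? 1 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 1 : tp / (tp + fn);
  return { threshold, precision, recall };
}

// Evaluates several candidate thresholds so they can be compared side by side.
function sweepThresholds(data: Labeled[], thresholds: number[]): ThresholdResult[] {
  return thresholds.map((t) => evaluateThreshold(data, t));
}
```

A higher threshold usually trades recall for precision, so the right value depends on whether missed alumni or false matches are more costly for the project.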
All related documents are stored here: https://drive.google.com/drive/folders/1pZor1ZZ0oMw0NGoDMwU8nDXF-GsEUvMC?usp=drive_link