A Python script that converts satellite asset data from an Excel file into a structured YAML format with proper hierarchical organization.
This script reads satellite mission data from an Excel spreadsheet (specifically the EOAssetNameList worksheet) and converts it into a well-formatted YAML file. It organizes satellite data into a hierarchical structure:
Programs → Missions → Platforms → Instruments
The converter handles multiple instruments per platform, normalizes values, removes duplicate instruments, and writes YAML output with consistent indentation and quoting.
✨ Data Organization
- Hierarchical structure (Programs → Missions → Platforms → Instruments)
- Automatic deduplication of instruments
- Handles multiple instruments per platform
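
The grouping and deduplication described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `build_hierarchy` and the flat row keys (`program`, `mission`, `platform`, `instrument`) are assumptions made for the example.

```python
def build_hierarchy(rows):
    """Group flat rows into programs -> missions -> platforms -> instruments."""
    programs = {}  # insertion order is guaranteed from Python 3.7
    for row in rows:
        program = programs.setdefault(row["program"], {})
        mission = program.setdefault(row["mission"], {})
        instruments = mission.setdefault(row["platform"], [])
        # Deduplicate: keep each instrument only once per platform.
        if row["instrument"] not in instruments:
            instruments.append(row["instrument"])
    return programs


# Hypothetical flat rows; the third row repeats an instrument on purpose.
rows = [
    {"program": "SPOT", "mission": "SPOT-4", "platform": "SPOT 4", "instrument": "VEGETATION"},
    {"program": "SPOT", "mission": "SPOT-4", "platform": "SPOT 4", "instrument": "DORIS (SPOT)"},
    {"program": "SPOT", "mission": "SPOT-4", "platform": "SPOT 4", "instrument": "VEGETATION"},
    {"program": "PROBA", "mission": "Proba-V", "platform": "Proba V", "instrument": "VEGETATION"},
]
hierarchy = build_hierarchy(rows)
print(hierarchy["SPOT"]["SPOT-4"]["SPOT 4"])
```

Nested dictionaries keep the lookup for "have we seen this program, mission, or platform before" cheap, and the list at the leaf preserves the order in which instruments first appear.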
🎨 Formatting
- Proper 2-space indentation at each nesting level
- Quoted string values for clarity
- Unquoted keys following YAML conventions
- Blank lines between program blocks for readability
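
One way to produce output with these formatting properties is a small hand-written emitter, sketched below using only the standard library. The script itself may rely on PyYAML instead; the helper names `quote`, `emit`, and `render` are hypothetical, and the sketch assumes non-empty dicts and lists.

```python
def quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def emit(node, indent=0):
    """Return YAML lines for nested dicts/lists (assumed non-empty)."""
    pad = " " * indent
    if isinstance(node, dict):
        lines = []
        for key, value in node.items():
            if isinstance(value, (dict, list)):
                lines.append(f"{pad}{key}:")  # keys stay unquoted
                lines.extend(emit(value, indent + 2))
            else:
                lines.append(f"{pad}{key}: {quote(value)}")
        return lines
    if isinstance(node, list):
        lines = []
        for item in node:
            item_lines = emit(item, indent + 2)
            # The first line of each item carries the "- " marker.
            lines.append(f"{pad}- {item_lines[0].lstrip()}")
            lines.extend(item_lines[1:])
        return lines
    return [f"{pad}{quote(node)}"]


def render(programs):
    """Render program dicts under EOasset, separated by blank lines."""
    blocks = ["\n".join(emit([program], 2)) for program in programs]
    return "EOasset:\n" + "\n\n".join(blocks) + "\n"


text = render([{"programShortName": "PROBA"}, {"programShortName": "SPOT"}])
print(text)
```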
🔧 Data Cleaning
- Normalizes missing values to "unspecified"
- Corrects common typos (e.g., "unispecified" → "unspecified")
- Extracts instruments from multiple columns
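
The value normalization above could look roughly like this. It is a sketch under assumptions: the name `normalize`, the `TYPO_FIXES` table, and the whitespace stripping are illustrative and not taken from the script.

```python
import math

# Known misspellings mapped to their corrected form (assumed table).
TYPO_FIXES = {"unispecified": "unspecified"}


def normalize(value):
    """Return a cleaned string, or "unspecified" for missing values."""
    if value is None:
        return "unspecified"
    # pandas represents empty cells as float NaN.
    if isinstance(value, float) and math.isnan(value):
        return "unspecified"
    text = str(value).strip()
    if not text:
        return "unspecified"
    return TYPO_FIXES.get(text.lower(), text)


print(normalize(float("nan")), normalize("unispecified"), normalize(" VEGETATION "))
```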
- Python 3.6+
- pandas
- PyYAML
- Clone the repository:
git clone https://github.com/yourusername/excel-to-yaml-converter.git
cd excel-to-yaml-converter
- Install dependencies:
pip install pandas pyyaml
Run the script with the input file and the output directory as arguments:
python excelConverter.py <path_to_excel_file> <output_directory>
For example:
python excelConverter.py ./data/eodata.xlsx ./output/
This will create EOassetNameList.yaml in the ./output/ directory.
- <path_to_excel_file>: Path to the input Excel file containing the 'EOAssetNameList' worksheet
- <output_directory>: Directory where the YAML file will be saved
The Excel file should have the following structure:
| Column | Description |
|---|---|
| ProgramShortName | Satellite program identifier (e.g., "PROBA", "Copernicus") |
| missionName | Mission name (e.g., "Proba-V", "SENTINEL-1") |
| missionShortName | Mission abbreviation |
| platformName | Platform/satellite name |
| platformShortName | Platform abbreviation |
| platformAcronym | Platform acronym |
| alternativePlatformName | Alternative name for the platform |
| instrumentName | Primary instrument name |
| instrumentShortName | Primary instrument abbreviation |
| Additional instrument columns | Additional instruments (columns 10-20) and their short names (columns 22-32) |
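
Pairing the additional instrument columns with their short-name columns might be done as in the sketch below. It assumes the column numbers are 1-based and that the i-th name column pairs with the i-th short-name column; neither is confirmed here. The helpers `cell` and `extra_instruments` and the sample row are hypothetical.

```python
NAME_COLUMNS = range(10, 21)        # columns 10-20, assumed 1-based
SHORT_NAME_COLUMNS = range(22, 33)  # columns 22-32, assumed 1-based


def cell(row, column):
    """Return the stripped text of a 1-based column, or None if empty."""
    if column > len(row) or row[column - 1] is None:
        return None
    text = str(row[column - 1]).strip()
    return text or None


def extra_instruments(row):
    """Collect the additional instruments found in one row (a list of cells)."""
    found = []
    for name_col, short_col in zip(NAME_COLUMNS, SHORT_NAME_COLUMNS):
        name = cell(row, name_col)
        if name is None:
            continue  # no instrument in this slot
        found.append({
            "instrumentName": name,
            "instrumentShortName": cell(row, short_col) or "unspecified",
        })
    return found


# Made-up row: column 10 has a name without a short name,
# column 11 has a name whose short name sits in column 23.
row = [None] * 32
row[9] = "DORIS (SPOT)"
row[10] = "EXAMPLE INSTRUMENT"
row[22] = "EXAMPLE"
print(extra_instruments(row))
```

When the rows come from pandas, empty cells arrive as NaN rather than `None`, so they would need to be normalized before a check like this one.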
The output YAML file follows this structure:
EOasset:
  - programShortName: "PROBA"
    missions:
      - missionName: "Proba-V"
        missionShortName: "PROBAV"
        platforms:
          - platformName: "Proba V"
            platformShortName: "PROBAV"
            platformAcronym: "PROBAV"
            alternativePlatformName: "unspecified"
            instruments:
              - instrumentName: "VEGETATION"
                instrumentShortName: "VEGETATION"

  - programShortName: "SPOT"
    missions:
      - missionName: "SPOT-4"
        missionShortName: "SPOT4"
        platforms:
          - platformName: "SPOT 4"
            platformShortName: "SPOT4"
            platformAcronym: "SPOT4"
            alternativePlatformName: "unspecified"
            instruments:
              - instrumentName: "VEGETATION"
                instrumentShortName: "VEGETATION"
              - instrumentName: "DORIS (SPOT)"
                instrumentShortName: "unspecified"
The script prints a summary of the converted data:
YAML file created successfully: ./output/EOassetNameList.yaml
Data summary:
- Programs: 13
- Missions: 25
- Platforms: 39
- Instruments: 129
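
Counts like these can be derived by walking the nested structure. The sketch below assumes the data is a list of program dicts shaped like the YAML output, and that instruments are counted per platform entry rather than as unique names; the function name `summarize` is hypothetical.

```python
def summarize(eo_assets):
    """Count programs, missions, platforms, and instruments in the hierarchy."""
    missions = [m for program in eo_assets for m in program["missions"]]
    platforms = [p for mission in missions for p in mission["platforms"]]
    instruments = [i for platform in platforms for i in platform["instruments"]]
    return {
        "Programs": len(eo_assets),
        "Missions": len(missions),
        "Platforms": len(platforms),
        "Instruments": len(instruments),
    }


# Trimmed-down version of the two program blocks shown in the output example.
eo_assets = [
    {"programShortName": "PROBA", "missions": [
        {"platforms": [{"instruments": [{"instrumentName": "VEGETATION"}]}]},
    ]},
    {"programShortName": "SPOT", "missions": [
        {"platforms": [{"instruments": [
            {"instrumentName": "VEGETATION"},
            {"instrumentName": "DORIS (SPOT)"},
        ]}]},
    ]},
]
for label, count in summarize(eo_assets).items():
    print(f"- {label}: {count}")
```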
From the command line:
python excelConverter.py ./data/satellites.xlsx ./output/
As a Python module:
from excelConverter import excel_to_yaml
# Convert Excel to YAML
success = excel_to_yaml('./data/eodata.xlsx', './output/')
if success:
    print("Conversion completed successfully!")
else:
    print("Conversion failed. Check the error messages above.")
The script includes error handling for:
- Missing Excel files
- Invalid sheet names
- Missing or malformed columns
- File I/O errors
All errors are reported with descriptive messages to help identify and fix issues.
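
A pattern that maps these failure cases to a descriptive message and a boolean result is sketched below. The wrapper name `run_safely` is hypothetical, and the choice of `ValueError` for an invalid sheet name is an assumption about what the underlying reader raises.

```python
import sys


def run_safely(convert, *args):
    """Call convert(*args); report known failures and return True or False."""
    try:
        convert(*args)
    except FileNotFoundError as exc:  # must precede OSError, its parent class
        print(f"Error: file not found: {exc}", file=sys.stderr)
    except KeyError as exc:
        print(f"Error: expected column not found: {exc}", file=sys.stderr)
    except ValueError as exc:  # assumed: raised for an invalid sheet name
        print(f"Error: invalid sheet or data: {exc}", file=sys.stderr)
    except OSError as exc:
        print(f"Error: could not read or write file: {exc}", file=sys.stderr)
    else:
        return True
    return False


def failing_conversion():
    """Stand-in for a conversion that hits a missing column."""
    raise KeyError("missionName")


print(run_safely(failing_conversion))
```

Returning a boolean instead of re-raising matches the `success = excel_to_yaml(...)` usage shown earlier, where the caller only checks whether the conversion worked.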
The script has been tested with satellite data including:
- PROBA, SPOT, EPS, Copernicus (Sentinel), MODIS, JPSS, GOES-R, MSG, Himawari, ENVISAT, DMSP, Suomi NPP, and GCOM-W programs
Issue: FileNotFoundError - Excel file not found
- Solution: Check the file path is correct and the file exists
Issue: KeyError - Expected column not found
- Solution: Verify the Excel sheet is named 'EOAssetNameList' and contains the required columns
Issue: Output file is empty
- Solution: Check that the Excel file contains data in the expected format
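
When tracking down a KeyError, it can help to compare the sheet's header row against the expected column names. The sketch below takes the names from the column table in this README; the helper `missing_columns` is hypothetical and not part of the script.

```python
# Column names as listed in the input-format table above.
REQUIRED_COLUMNS = [
    "ProgramShortName",
    "missionName",
    "missionShortName",
    "platformName",
    "platformShortName",
    "platformAcronym",
    "alternativePlatformName",
    "instrumentName",
    "instrumentShortName",
]


def missing_columns(header):
    """Return the required column names that are absent from the header."""
    present = {str(name).strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


print(missing_columns(["ProgramShortName", "missionName"]))
```

With pandas, the header could be obtained from `pd.read_excel(path, sheet_name="EOAssetNameList").columns` and passed straight to this function.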
Contributions are welcome! Please feel free to:
- Report bugs
- Suggest improvements
- Submit pull requests
This project is licensed under the MIT License - see the LICENSE file for details.
Created and maintained as a data conversion utility for satellite mission data.
- Initial release
- Excel to YAML conversion with hierarchical structure
- Proper YAML formatting with quotes and indentation
- Instrument deduplication
- Value normalization