
File with naming convention for missions, platforms and instruments in the EO metadata within the CLMS project.

copernicus-land/EOAssets

Excel to YAML Converter for Satellite Data

A Python script that converts satellite asset data from an Excel file into a structured YAML format with proper hierarchical organization.

Description

This script reads satellite mission data from an Excel spreadsheet (specifically the EOAssetNameList worksheet) and converts it into a well-formatted YAML file. It organizes satellite data into a hierarchical structure:

Programs → Missions → Platforms → Instruments

The converter handles multiple instruments per platform, normalizes missing or misspelled values, removes duplicate instruments, and writes YAML output with consistent 2-space indentation and quoted string values.
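As a rough illustration of the grouping and deduplication described above, the sketch below folds flat rows into the nested structure. It is not the code in excelConverter.py: the function name `build_hierarchy` and the reduced set of row keys (one name per level) are assumptions made to keep the example short.

```python
def build_hierarchy(rows):
    """Group flat row dicts into programs -> missions -> platforms -> instruments."""
    programs = {}  # keyed by program name; dicts keep first-seen order
    for row in rows:
        program = programs.setdefault(
            row["programShortName"],
            {"programShortName": row["programShortName"], "missions": {}},
        )
        mission = program["missions"].setdefault(
            row["missionName"],
            {"missionName": row["missionName"], "platforms": {}},
        )
        platform = mission["platforms"].setdefault(
            row["platformName"],
            {"platformName": row["platformName"], "instruments": []},
        )
        instrument = {"instrumentName": row["instrumentName"]}
        # Deduplicate: only add an instrument the platform does not list yet
        if instrument not in platform["instruments"]:
            platform["instruments"].append(instrument)

    # Turn the lookup dicts into lists, as in the YAML output
    return [
        {
            "programShortName": p["programShortName"],
            "missions": [
                {
                    "missionName": m["missionName"],
                    "platforms": list(m["platforms"].values()),
                }
                for m in p["missions"].values()
            ],
        }
        for p in programs.values()
    ]
```

Using dictionaries keyed by name while grouping means a repeated program, mission or platform row extends the existing entry instead of creating a second one.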

Features

Data Organization

  • Hierarchical structure (Programs → Missions → Platforms → Instruments)
  • Automatic deduplication of instruments
  • Handles multiple instruments per platform

🎨 Formatting

  • Proper 2-space indentation at each nesting level
  • Quoted string values for clarity
  • Unquoted keys following YAML conventions
  • Blank lines between program blocks for readability
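To make these layout rules concrete, here is a small hand-rolled emitter that follows them. It is only an illustration: the script itself depends on PyYAML, and the helpers `quote`, `emit` and `render` are hypothetical names. The sketch assumes the data contains only dictionaries, non-empty lists of dictionaries, and string values.

```python
import json


def quote(value):
    # A JSON string literal is also a valid YAML double-quoted scalar
    return json.dumps(str(value), ensure_ascii=False)


def emit(node, indent=0):
    """Return YAML lines for nested dicts and lists of dicts."""
    pad = " " * indent
    lines = []
    if isinstance(node, list):
        for item in node:
            item_lines = emit(item, indent + 2)
            # The first key of each list item shares its line with the "- "
            lines.append(pad + "- " + item_lines[0].lstrip())
            lines.extend(item_lines[1:])
    else:
        for key, value in node.items():
            if isinstance(value, (dict, list)):
                lines.append(f"{pad}{key}:")  # keys stay unquoted
                lines.extend(emit(value, indent + 2))
            else:
                lines.append(f"{pad}{key}: {quote(value)}")
    return lines


def render(programs):
    """Render program blocks under EOasset, separated by blank lines."""
    blocks = ["\n".join(emit([program], indent=2)) for program in programs]
    return "EOasset:\n" + "\n\n".join(blocks) + "\n"
```

Calling `print(render(programs))` on a list of program dictionaries produces one block per program with a blank line between blocks.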

🔧 Data Cleaning

  • Normalizes missing values to "unspecified"
  • Corrects common typos (e.g., "unispecified" → "unspecified")
  • Extracts instruments from multiple columns
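The cleaning rules above could be expressed along these lines. This is a minimal sketch, not the script's actual code: the name `normalize`, the `TYPO_FIXES` table, and the whitespace stripping are assumptions.

```python
import math

# Known misspellings mapped to their corrected form (assumed to be a small table)
TYPO_FIXES = {"unispecified": "unspecified"}


def normalize(value):
    """Return a cleaned string, or "unspecified" for missing values."""
    if value is None:
        return "unspecified"
    # pandas represents empty Excel cells as float NaN
    if isinstance(value, float) and math.isnan(value):
        return "unspecified"
    text = str(value).strip()
    if not text:
        return "unspecified"
    # Typo lookup is case-insensitive; other values are returned unchanged
    return TYPO_FIXES.get(text.lower(), text)
```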

Requirements

  • Python 3.6+
  • pandas
  • PyYAML

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/excel-to-yaml-converter.git
cd excel-to-yaml-converter
  2. Install dependencies:
pip install pandas pyyaml

Usage

Basic Usage

python excelConverter.py <path_to_excel_file> <output_directory>

Example

python excelConverter.py ./data/eodata.xlsx ./output/

This will create EOassetNameList.yaml in the ./output/ directory.

Command Line Arguments

  • <path_to_excel_file>: Path to the input Excel file containing the 'EOAssetNameList' worksheet
  • <output_directory>: Directory where the YAML file will be saved
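For reference, two positional arguments like these can be parsed with the standard library as shown below. How excelConverter.py actually reads its arguments is not documented here, so treat the `parse_args` helper and the attribute names `excel_file` and `output_directory` as assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments; argv=None reads sys.argv."""
    parser = argparse.ArgumentParser(
        prog="excelConverter.py",
        description="Convert the EOAssetNameList worksheet to YAML.",
    )
    parser.add_argument("excel_file", help="path to the input Excel file")
    parser.add_argument(
        "output_directory", help="directory where the YAML file is saved"
    )
    return parser.parse_args(argv)
```

With argparse, running the script with a missing argument prints a usage message and exits instead of failing later with an index error.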

Input Format

The Excel file should have the following structure:

| Column | Description |
|---|---|
| ProgramShortName | Satellite program identifier (e.g., "PROBA", "Copernicus") |
| missionName | Mission name (e.g., "Proba-V", "SENTINEL-1") |
| missionShortName | Mission abbreviation |
| platformName | Platform/satellite name |
| platformShortName | Platform abbreviation |
| platformAcronym | Platform acronym |
| alternativePlatformName | Alternative name for the platform |
| instrumentName | Primary instrument name |
| instrumentShortName | Primary instrument abbreviation |
| Additional instrument columns | Additional instruments (columns 10-20) and their short names (columns 22-32) |
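The column layout above suggests that each additional instrument name is paired with a short name twelve columns to its right. The sketch below shows one way to collect the pairs from a row. It assumes the column numbers are 1-based and that the pairing is positional (column 10 with 22, 11 with 23, and so on); neither is confirmed by this README, and `cell` and `extract_instruments` are hypothetical helpers.

```python
def cell(row, col):
    """Return the stripped text in 1-based column col, or None if empty."""
    if col > len(row):
        return None
    value = row[col - 1]
    # value != value is only true for NaN, which pandas uses for empty cells
    if value is None or value != value:
        return None
    text = str(value).strip()
    return text or None


def extract_instruments(row):
    """Collect (name, short name) pairs from one spreadsheet row."""
    # Primary instrument in columns 8 and 9, then the additional columns
    pairs = [(8, 9)] + [(10 + i, 22 + i) for i in range(11)]
    instruments = []
    for name_col, short_col in pairs:
        name = cell(row, name_col)
        if name is None:
            continue  # no instrument in this slot
        entry = {
            "instrumentName": name,
            "instrumentShortName": cell(row, short_col) or "unspecified",
        }
        if entry not in instruments:  # skip exact duplicates
            instruments.append(entry)
    return instruments
```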

Output Format

The output YAML file follows this structure:

EOasset:
  - programShortName: "PROBA"
    missions:
      - missionName: "Proba-V"
        missionShortName: "PROBAV"
        platforms:
          - platformName: "Proba V"
            platformShortName: "PROBAV"
            platformAcronym: "PROBAV"
            alternativePlatformName: "unspecified"
            instruments:
              - instrumentName: "VEGETATION"
                instrumentShortName: "VEGETATION"

  - programShortName: "SPOT"
    missions:
      - missionName: "SPOT-4"
        missionShortName: "SPOT4"
        platforms:
          - platformName: "SPOT 4"
            platformShortName: "SPOT4"
            platformAcronym: "SPOT4"
            alternativePlatformName: "unspecified"
            instruments:
              - instrumentName: "VEGETATION"
                instrumentShortName: "VEGETATION"
              - instrumentName: "DORIS (SPOT)"
                instrumentShortName: "unspecified"

Output Statistics

The script prints a summary of the converted data:

YAML file created successfully: ./output/EOassetNameList.yaml

Data summary:
- Programs: 13
- Missions: 25
- Platforms: 39
- Instruments: 129
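Counts like these can be derived by walking the nested structure. The sketch below is an assumption about how such a summary might be computed, with `summarize` as a hypothetical name. It counts instrument entries per platform, so an instrument carried by two platforms is counted twice; whether the script counts that way is not stated.

```python
def summarize(programs):
    """Count the entries at each level of the hierarchy."""
    missions = [m for p in programs for m in p["missions"]]
    platforms = [pl for m in missions for pl in m["platforms"]]
    instruments = [i for pl in platforms for i in pl["instruments"]]
    return {
        "Programs": len(programs),
        "Missions": len(missions),
        "Platforms": len(platforms),
        "Instruments": len(instruments),
    }
```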

Examples

Convert a single file

python excelConverter.py ./data/satellites.xlsx ./output/

Using in a Python script

from excelConverter import excel_to_yaml

# Convert Excel to YAML
success = excel_to_yaml('./data/eodata.xlsx', './output/')

if success:
    print("Conversion completed successfully!")
else:
    print("Conversion failed. Check the error messages above.")

Error Handling

The script includes error handling for:

  • Missing Excel files
  • Invalid sheet names
  • Missing or malformed columns
  • File I/O errors

All errors are reported with descriptive messages to help identify and fix issues.
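One common way to report these failures while returning a boolean is sketched below. It is not taken from excelConverter.py: `safe_convert` is a hypothetical wrapper, `convert` stands in for whatever function does the real work, and the mapping of error cases to exception types (for example `ValueError` for an unknown sheet name) is an assumption.

```python
import os


def safe_convert(convert, excel_path, output_dir):
    """Run convert(excel_path, output_dir); return False on a known failure."""
    try:
        if not os.path.isfile(excel_path):
            raise FileNotFoundError(f"Excel file not found: {excel_path}")
        os.makedirs(output_dir, exist_ok=True)
        convert(excel_path, output_dir)
    except FileNotFoundError as exc:
        # Must come before OSError, which is its parent class
        print(f"Error: {exc}")
        return False
    except KeyError as exc:
        print(f"Error: expected column not found: {exc}")
        return False
    except (OSError, ValueError) as exc:
        # Other file I/O problems, or an invalid sheet name
        print(f"Error: {exc}")
        return False
    return True
```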

Testing

The script has been tested with satellite data including:

  • PROBA, SPOT, EPS, Copernicus (Sentinel), MODIS, JPSS, GOES-R, MSG, Himawari, ENVISAT, DMSP, Suomi NPP, and GCOM-W programs

Troubleshooting

Issue: FileNotFoundError - Excel file not found

  • Solution: Check the file path is correct and the file exists

Issue: KeyError - Expected column not found

  • Solution: Verify the Excel sheet is named 'EOAssetNameList' and contains the required columns

Issue: Output file is empty

  • Solution: Check that the Excel file contains data in the expected format

Contributing

Contributions are welcome! Please feel free to:

  • Report bugs
  • Suggest improvements
  • Submit pull requests

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Created and maintained as a data conversion utility for satellite mission data.

Changelog

Version 1.0.0

  • Initial release
  • Excel to YAML conversion with hierarchical structure
  • Proper YAML formatting with quotes and indentation
  • Instrument deduplication
  • Value normalization
