
Conversation

@fsalmon001

So far, there are two options for writing a mooring file:

  • One file is written sequentially
  • One file is written per processor, so N files are written in parallel.

In the proposed pull request, a single file is written in parallel using the parallel netCDF library (netCDF must be compiled with parallel support). neXtSIM can still be compiled without parallel netCDF, but in that case this option cannot be used.

The netCDF library can only write rectangular blocks efficiently, so the domain is first decomposed into rectangles.

Each processor then writes the part of the grid corresponding to one rectangle, or a set of rectangles.

I tested it only with a regular grid because I have no file for an arbitrary grid. Please check with an arbitrary grid if you use one; there could be issues in that case.
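
As a rough illustration of the scheme (a minimal sketch only, not the code of the pull request: the file name, dimension and variable names, and the rectangle bounds are placeholders, and error checking is omitted), each rank opens the same file in parallel and writes its own rectangle of the output grid as a hyperslab:

#include <mpi.h>
#include <netcdf.h>      // netCDF C API
#include <netcdf_par.h>  // parallel extensions (requires netCDF built with parallel support)
#include <vector>

// Every rank calls this with the global grid size, the bounds of its own
// rectangle (jmin/imin are offsets into the global grid) and the data for it.
void write_block(MPI_Comm comm, size_t nrows, size_t ncols,
                 size_t jmin, size_t imin, size_t block_rows, size_t block_cols,
                 const std::vector<float>& block)
{
    int ncid, dimids[2], varid;

    // One shared file, created collectively by all ranks.
    nc_create_par("moorings.nc", NC_NETCDF4 | NC_MPIIO, comm, MPI_INFO_NULL, &ncid);
    nc_def_dim(ncid, "y", nrows, &dimids[0]);
    nc_def_dim(ncid, "x", ncols, &dimids[1]);
    nc_def_var(ncid, "conc", NC_FLOAT, 2, dimids, &varid);
    nc_enddef(ncid);

    // Collective access: every rank takes part in each write call
    // (the call-count constraint this implies is discussed further down this thread).
    nc_var_par_access(ncid, varid, NC_COLLECTIVE);

    size_t start[2] = {jmin, imin};
    size_t count[2] = {block_rows, block_cols};
    nc_put_vara_float(ncid, varid, start, count, block.data());  // this rank's rectangle

    nc_close(ncid);
}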

@tdcwilliams
Contributor

Thanks @fsalmon001.
I am having trouble compiling the netcdf libraries though.
Which versions of netcdf-c, netcdf-cxx and netcdf-fortran are you using, and where are you downloading them from?

I can compile netcdf-c (latest version, 4.9.3) but netcdf-cxx fails to compile

@tdcwilliams
Contributor

Hi again @fsalmon001,

If you know Apptainer, maybe you could help me make a container. Here is my recipe file:

Bootstrap: docker
From: ubuntu:24.04
Stage: build

%files
    nextsim.sh /etc/profile.d/nextsim.sh

%post
    # install libraries
    apt-get update --fix-missing
    apt-get install -y --no-install-recommends \
        apt-transport-https \
        ca-certificates \
        cmake \
        g++ \
        git \
        gcc \
        gfortran \
        grsync \
        libblas-dev \
        libboost1.74-all-dev \
        liblapack-dev \
        libopenmpi-dev \
        libx11-dev \
        libxml2-dev \
        make \
        rsync \
        openmpi-bin \
        unzip \
        valgrind \
        wget \
        zip
    update-ca-certificates
    apt-get clean
    rm -rf /var/lib/apt/lists/*

    # set some variables
    . /etc/profile.d/nextsim.sh

    # Build HDF5 with parallel support
    mkdir -p /build/hdf5
    cd /build/hdf5
    HDF5_VERSION="1.14.3"
    wget -nc -nv --no-check-certificate \
        https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-${HDF5_VERSION%.*}/hdf5-${HDF5_VERSION}/src/hdf5-${HDF5_VERSION}.tar.gz
    tar -xzf hdf5-${HDF5_VERSION}.tar.gz
    cd hdf5-${HDF5_VERSION}
    CC=mpicc ./configure \
        --prefix=/opt/local/hdf5 \
        --enable-parallel \
        --enable-shared \
        --disable-static \
        --with-zlib=/usr
    make -j8
    make install
    cd /
    rm -rf /build/hdf5

    # Build NetCDF-C with parallel support
    mkdir -p /build/netcdf
    cd /build/netcdf
    #NETCDF_VERSION="4.9.3" #latest, but does not work with NETCDF_CXX4_VERSION=4.3.1
    NETCDF_VERSION="4.7.4"
    wget -nc -nv --no-check-certificate \
        https://downloads.unidata.ucar.edu/netcdf-c/${NETCDF_VERSION}/netcdf-c-${NETCDF_VERSION}.tar.gz
    tar -xzf netcdf-c-${NETCDF_VERSION}.tar.gz
    cd netcdf-c-${NETCDF_VERSION}
    CPPFLAGS=-I/opt/local/hdf5/include \
    LDFLAGS=-L/opt/local/hdf5/lib \
    LD_LIBRARY_PATH=/opt/local/hdf5/lib:$LD_LIBRARY_PATH \
    CC=mpicc ./configure \
        --prefix=/opt/local/netcdf \
        --enable-netcdf4 \
        --enable-shared \
        --disable-static \
        --enable-dap=no
    make -j8
    make install
    cd /
    rm -rf /build/netcdf

    # Build NetCDF-CXX4 (C++ API) with parallel support
    mkdir -p /build/netcdf-cxx4
    cd /build/netcdf-cxx4
    NETCDF_CXX4_VERSION="4.3.1"
    wget -nc -nv --no-check-certificate \
        https://downloads.unidata.ucar.edu/netcdf-cxx/${NETCDF_CXX4_VERSION}/netcdf-cxx4-${NETCDF_CXX4_VERSION}.tar.gz
    tar -xzf netcdf-cxx4-${NETCDF_CXX4_VERSION}.tar.gz
    cd netcdf-cxx4-${NETCDF_CXX4_VERSION}
    CPPFLAGS=-I/opt/local/netcdf/include \
    LDFLAGS=-L/opt/local/netcdf/lib \
    LD_LIBRARY_PATH=/opt/local/netcdf/lib:/opt/local/hdf5/lib:$LD_LIBRARY_PATH \
    ./configure \
        --prefix=/opt/local/netcdf \
        --enable-shared \
        --disable-static
    make -j8
    make install
    cd /
    rm -rf /build/netcdf-cxx4

    # Build NetCDF-Fortran with parallel support (optional, for Fortran code)
    mkdir -p /build/netcdf-fortran
    cd /build/netcdf-fortran
    NETCDF_FORTRAN_VERSION="4.6.2"
    wget -nc -nv --no-check-certificate \
        https://downloads.unidata.ucar.edu/netcdf-fortran/${NETCDF_FORTRAN_VERSION}/netcdf-fortran-${NETCDF_FORTRAN_VERSION}.tar.gz
    tar -xzf netcdf-fortran-${NETCDF_FORTRAN_VERSION}.tar.gz
    cd netcdf-fortran-${NETCDF_FORTRAN_VERSION}
    CPPFLAGS=-I/opt/local/netcdf/include \
    LDFLAGS=-L/opt/local/netcdf/lib \
    LD_LIBRARY_PATH=/opt/local/netcdf/lib:/opt/local/hdf5/lib:$LD_LIBRARY_PATH \
    ./configure \
        --prefix=/opt/local/netcdf \
        --enable-shared \
        --disable-static
    make -j8
    make install
    cd /
    rm -rf /build/netcdf-fortran

    # install gmsh
    mkdir /gmsh
    cd /gmsh
    wget -nc -nv --no-check-certificate \
       https://gitlab.onelab.info/gmsh/gmsh/-/archive/gmsh_3_0_6/gmsh-gmsh_3_0_6.tar.gz
    tar -xzf gmsh-gmsh_3_0_6.tar.gz
    cd /gmsh/gmsh-gmsh_3_0_6
    cmake \
        -DCMAKE_INSTALL_PREFIX=/opt/local/gmsh \
        -DENABLE_BUILD_LIB=ON \
        -DENABLE_BUILD_SHARED=ON \
        -DENABLE_BUILD_DYNAMIC=ON \
        -DCMAKE_BUILD_TYPE=release \
        -DENABLE_MPI=OFF \
        -DENABLE_MUMPS=OFF \
        -DENABLE_PETSC=OFF \
        -DENABLE_OPENMP=OFF \
        -DENABLE_MMG3D=OFF
    make -j8
    make install
    # clean up
    cd /
    rm -r /gmsh*

%environment
    . /etc/profile.d/nextsim.sh
    export LD_LIBRARY_PATH=/opt/local/netcdf/lib:/opt/local/hdf5/lib:$LD_LIBRARY_PATH
    export PATH=/opt/local/netcdf/bin:/opt/local/hdf5/bin:$PATH

%test
    grep -q NAME=\"Ubuntu\" /etc/os-release
    if [ $? -eq 0 ]
    then
        echo "Container base is Ubuntu as expected."
    else
        echo "Container base is not Ubuntu."
        exit 1
    fi
    # Test parallel NetCDF
    if [ -f /opt/local/netcdf/bin/nc-config ]
    then
        echo "NetCDF installation found."
        /opt/local/netcdf/bin/nc-config --has-nc4
        /opt/local/netcdf/bin/nc-config --has-parallel
    else
        echo "NetCDF installation not found."
        exit 1
    fi

%labels
    Author Timothy Williams
    Version v0.1.0

%help
    This is an ubuntu container with required external libraries
    to compile and run nextsim with parallel NetCDF support.
    
    Includes:
    - HDF5 compiled with parallel (MPI) support
    - NetCDF-C and NetCDF-Fortran with parallel I/O capabilities
    - OpenMPI and compilation tools

@tdcwilliams
Contributor

PS this is the environment file nextsim.sh

#! /bin/bash

# runtime environment variables
export CC=mpicc
export CXX=mpicxx
export CFLAGS="-O3 -fPIC"
export CXXFLAGS="-O3 -pthread -fPIC -fopenmp"
export CCFLAGS="$CFLAGS"
export OPENMPI_INCLUDE_DIR=/usr/lib/x86_64-linux-gnu/openmpi/include
export MPI_INC_DIR=/usr/lib/x86_64-linux-gnu/openmpi/include
export OPENMPI_LIB_DIR=/usr/lib/x86_64-linux-gnu/openmpi/lib
export NETCDF_DIR=/usr
export BOOST_DIR=/usr
export BOOST_INCDIR=$BOOST_DIR/include
export BOOST_LIBDIR=$BOOST_DIR/lib/x86_64-linux-gnu
export GMSH_DIR=/opt/local/gmsh
export LANG=C.UTF-8
export LC_ALL=C.UTF-8

# where container will expect to find code to compile/run
export NEXTSIMDIR=/nextsim

# update PATH
export PATH=$GMSH_DIR/bin:$PATH

# input and output data
export NEXTSIM_MESH_DIR=/nextsim_mesh_dir
export NEXTSIM_DATA_DIR=/nextsim_data_dir

@fsalmon001
Author

OK, I had the same problems. The only way I found is to first install an older version of HDF5 with parallel support:

wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.7/src/hdf5-1.10.7.tar.gz
tar -xzf hdf5-1.10.7.tar.gz
cd hdf5-1.10.7
./configure --prefix=/usr/hdf5-1.10 --enable-parallel --enable-shared
make -j8
make install

Then, I used netcdf-c-4.8.1

export CPPFLAGS="-I/usr/hdf5-parallel/include"
export LDFLAGS="-L/usr/hdf5-parallel/lib"
export LD_LIBRARY_PATH="/usr/hdf5-parallel/lib:$LD_LIBRARY_PATH"

cd netcdf-c-4.8.1
./configure --prefix=/usr/netcdf-parallel \
            --enable-parallel-tests \
            --disable-detect-parallel
make -j8
make install

Then, if you had already installed another version of netCDF, you need to point to this new version to have access to the parallel functions. You can check with nc-config --has-parallel; if the answer is no, there is a problem.

Of course, you need to change the paths. Tell me if it works; I did not write down every step, so it may be incomplete.

@tdcwilliams
Contributor

tdcwilliams commented Oct 20, 2025 via email

@fsalmon001
Author

I guess it was here https://github.com/Unidata/netcdf-c/tree/v4.8.1 @tdcwilliams

As for the C++ version, parallel I/O is not implemented in it, so I had to use the C API for the parallel writing. The C++ API is still used for the sequential output in neXtSIM. For a small grid and a small number of processors, the sequential approach is still more efficient than the parallel one, because I had to add a big preliminary step before writing the netCDF file in parallel.
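
To make the split concrete, here is a minimal sketch (illustrative only; it assumes the sequential path keeps using the netCDF-C++ API as the existing output code does, and the file name, flags and error handling are placeholders or omitted):

#include <netcdf>        // netCDF-C++ API (sequential path only)
#include <netcdf.h>      // netCDF C API
#include <netcdf_par.h>  // parallel extensions of the C API
#include <mpi.h>

void create_output(bool parallel, MPI_Comm comm)
{
    if (parallel)
    {
        // Parallel writing must go through the C API: netcdf-cxx4 has no
        // parallel create/open.
        int ncid;
        nc_create_par("moorings.nc", NC_NETCDF4 | NC_MPIIO, comm, MPI_INFO_NULL, &ncid);
        // ... define dimensions and variables with the C API ...
        nc_close(ncid);
    }
    else
    {
        // Sequential writing can keep using the C++ API.
        netCDF::NcFile file("moorings.nc", netCDF::NcFile::replace);
        // ... define dimensions and variables with the C++ API ...
        // the file is closed when 'file' goes out of scope
    }
}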

@tdcwilliams
Contributor

tdcwilliams commented Oct 20, 2025 via email

@fsalmon001
Author

fsalmon001 commented Oct 21, 2025

In gridoutput, no, but there are netCDF functions elsewhere, in other functions. I did not look at those @tdcwilliams

@tdcwilliams
Contributor

tdcwilliams commented Oct 21, 2025 via email

@fsalmon001
Author

I did not need to modify netcdf-cxx, I did not even recompile it after compiling netcdf-c @tdcwilliams

@tdcwilliams
Contributor

Hi @fsalmon001,
I got it to compile and run in my container. For large_arctic_10km.msh and 10 days it was 1 minute faster.

I still need to try an irregular output grid though.

Maybe I'll try large_arctic_5km too, to see the difference there as well.

@tdcwilliams
Contributor

Hi again @fsalmon001
I forgot to add moorings.parallel_output=true, so the run was not using parallel writing. When I set it to true, the model hangs the first time it tries to write to the netCDF file.

This happens both when I use a container and when I compile with intel compilers.
Did this ever happen to you at all?

@fsalmon001
Author

Hi @tdcwilliams,

No, it never happens for me. Maybe this stems from a different MPI configuration, or maybe an MPI barrier is needed somewhere. Do you know in which function the process hangs?

@tdcwilliams
Contributor

Hi @fsalmon001
it is somewhere in writeNetcdfParallel, in this block

    for (int k = 0; k < max_size; k++)
    {
        size_t start[3] = {nc_step, 0, 0};
        size_t count[3] = {1, 0, 0};
        if (k < local_size)
        {
            start[0] = nc_step;
            start[1] = (size_t) jmin[k];
            start[2] = (size_t) imin[k]; 
            count[0] = 1;
            count[1] = (size_t) jmax[k] - jmin[k];
            count[2] = (size_t) imax[k] - imin[k]; 
        }

        int el = -1;
        float miss_val = (float) M_miss_val;

        for (auto* container : { &M_nodal_variables, &M_elemental_variables }) 
        {
            el++;
            int ID = 0;
            for (auto it = container->begin(); it != container->end(); it++)
            {
                if ( it->varID < 0 ) // Skip non-outputting variables
                continue;

                nc_inq_varid(ncid, it->name.c_str(), &data);
                nc_var_par_access(ncid, data, NC_COLLECTIVE);

                if (k < local_size)
                { 
                    // First, write the local data
                    std::vector<float> tmp((jmax[k] - jmin[k])*(imax[k] - imin[k]));
                    for (int i = imin[k]; i < imax[k]; i++)
                    {
                        for (int j = jmin[k]; j < jmax[k]; j++)
                        {
                            int ind_glob = i + j * M_ncols;
                            int ind_loc = i-imin[k] + (j-jmin[k]) * (imax[k]-imin[k]);
                            tmp[ind_loc] = (float) it->data_grid[ind_glob];
                        }
                    }
    
                    // Second, write the data from other processes
                    std::vector<std::vector<std::vector<double>>> value;
                    if (el)
                        value = elemental_recv;
                    else
                        value = nodal_recv;
    
                    for (int i = 0; i < indices[k].size(); i++)
                    {
                        int n = indices[k][i]%M_comm.size();
                        int ii = indices[k][i]/M_comm.size();
                        int x = list_recv[n][ii]%M_ncols;
                        int y = list_recv[n][ii]/M_ncols;
                        int ind_loc = x - imin[k] + (y-jmin[k]) * (imax[k]-imin[k]);

                        // If the point has already been written locally, it should not be erased
                        if (fabs(tmp[ind_loc]-miss_val)<1 || fabs(tmp[ind_loc]) < 1.e-8) tmp[ind_loc] = (float) value[n][ID][ii];
                    }
    
                    nc_put_vara_float(ncid, data, start, count, &tmp[0]);
                }
                else // Dummy
                    nc_put_vara_float(ncid, data, start, count, nullptr);

                ID++;
            }
        }
    }

@fsalmon001
Author

fsalmon001 commented Oct 27, 2025

Thank you @tdcwilliams

I am not sure about the issue, but I had some problems here.

The function nc_put_vara_float must be called by every process the same number of times. This is why I do an MPI reduce to get the max_size of the loop, and the processors with less data send dummy information.

If it blocks here, I think the number of nc_put_vara_float calls is not the same on each process. With my configuration it is, so I assume you may be using an option that I did not, which leads to differences here.

Maybe the problem is that M_nodal_variables or M_elemental_variables are not the same on all processors, but that would be strange.

Maybe you could put some "std::cout << M_comm.rank() << " " << k << std::endl;" calls in a few places in the loop; I suspect some process does not perform every iteration, but I do not understand why.
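
For reference, a minimal sketch of that padding pattern (illustrative only, not the PR code; the Rect struct, buffer contents and names are placeholders, and error checking is omitted): every rank issues the same number of collective calls, padding with zero-count writes once it runs out of rectangles.

#include <mpi.h>
#include <netcdf.h>
#include <netcdf_par.h>
#include <vector>

struct Rect { size_t imin, imax, jmin, jmax; std::vector<float> values; };

void write_padded(MPI_Comm comm, int ncid, int varid, const std::vector<Rect>& rects)
{
    int local_size = static_cast<int>(rects.size());
    int max_size = 0;
    // Every rank needs to know the largest number of rectangles held by any rank.
    MPI_Allreduce(&local_size, &max_size, 1, MPI_INT, MPI_MAX, comm);

    nc_var_par_access(ncid, varid, NC_COLLECTIVE);  // same access mode on every rank

    float dummy = 0.0f;                             // non-null buffer for the zero-count case
    for (int k = 0; k < max_size; ++k)
    {
        size_t start[2] = {0, 0};
        size_t count[2] = {0, 0};                   // zero-count: this rank contributes nothing
        const float* buf = &dummy;
        if (k < local_size)
        {
            start[0] = rects[k].jmin;
            start[1] = rects[k].imin;
            count[0] = rects[k].jmax - rects[k].jmin;
            count[1] = rects[k].imax - rects[k].imin;
            buf = rects[k].values.data();
        }
        nc_put_vara_float(ncid, varid, start, count, buf);  // collective: all ranks call this
    }
}

The zero-count writes contribute no data, but they keep the number of collective calls identical on every rank, which is what the loop over max_size in the block above relies on.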

@tdcwilliams
Contributor

Hi @fsalmon001, it is hanging at the nullptr line when local_size < max_size; I tried making sure count = {0,0,0} but it still hung.

@fsalmon001
Author

Hi @tdcwilliams, is it with the same version of netCDF as mine? I do not understand why it would not run with a null pointer. Is it at the first iteration of the loop or after some iterations? What are your options for the moorings? Maybe I can test here.

@tdcwilliams
Contributor

Hi @fsalmon001, yes it is the same version of netcdf.
I am using large_arctic_10km.msh, and

[moorings]
use_moorings=true
parallel_output=true
spacing=10
output_timestep=1
output_time_step_units=time_steps
file_length=monthly
variables=conc

@tdcwilliams
Contributor

Hi again @fsalmon001
Just confirming I use v4.8.1; also I am running with -np 2, which usually gives local_size 1,3 for ranks 0,1.

@tdcwilliams
Contributor

Actually Claude suggested changing the mode to NC_INDEPENDENT for the dummy write and it no longer hangs.
I can continue testing now, e.g. for irregular grids.

@fsalmon001
Author

Thank you @tdcwilliams

I knew about NC_INDEPENDENT, but I think there was an issue with it. Maybe every process overwrites what the others did. Did you look at the resulting netCDF file? Is it good? Maybe it was only an issue during my development, but if each process now writes only non-overlapping rectangles, this could be OK.

@tdcwilliams
Contributor

Hi @fsalmon001, when I go back to 32 cores it hangs again (also when writing more variables, every 6 hours instead of every time step).

@fsalmon001
Author

OK, I will have a closer look and come back to you @tdcwilliams.

@fsalmon001
Author

I do not understand how we can get such different results.

Here it works with your options, but with NC_INDEPENDENT I get NaN in the netCDF output file @tdcwilliams. Do you really get a good netCDF file with correct values with NC_INDEPENDENT?

@fsalmon001
Author

And what if you add an 'M_comm.barrier()' just after the nc_put_vara_float calls @tdcwilliams? Maybe the problem on your side is that some processes exit the loop and the writeNetCDFParallel function before the others.

I ran cases with 128 processors on a cluster where I have netCDF 4.8.1.
On my personal computer I actually have netCDF 4.9.2, but it behaves similarly.

@fsalmon001
Author

Hi @tdcwilliams,
I reviewed your commit. Actually, you cannot use NC_INDEPENDENT for some processes and NC_COLLECTIVE for others. The first says that all the processors write independently, while the second says the contrary and forces each processor to wait for all the others. So you can use either NC_INDEPENDENT or NC_COLLECTIVE, but not both at the same time.

But when using only NC_INDEPENDENT, you should get NaN in the output file, since each process overwrites the file.

@tdcwilliams
Contributor

Hi @fsalmon001,
I will revert that commit since it only works for 2 CPUs anyway.

Similarly, commenting out the call with nullptr only works for 2 CPUs. (This option makes good nc files, although the domain is truncated more than usual.)

This reverts commit 606ced6.

Only works for 2 cpus. Similarly commenting the nullptr write only works
for 2 cpus.
@fsalmon001
Author

fsalmon001 commented Oct 29, 2025

I also asked some generative AIs. Have you tried, like in your commit, initializing size_t count[3] = {0, 0, 0}; and using a buffer instead of nullptr for the dummy data:

                else // Dummy
                {
                    std::vector<float> tmp(1); // non-empty, so &tmp[0] is a valid pointer even though count is zero
                    nc_put_vara_float(ncid, data, start, count, &tmp[0]);
                }

Sorry I have nothing more @tdcwilliams

@tdcwilliams
Contributor

Hi @fsalmon001,
I am also quite stuck, but the problem is a bit different from what I thought.
It seems to have been a coincidence that the hanging call was a dummy write.

With 3 cpu's my output is:

[0] local_size = 1
[1] local_size = 2
[2] local_size = 2

[0,0] before nc_put_vara_float sic
[1,0] before nc_put_vara_float sic
[2,0] before nc_put_vara_float sic
[1,0] after nc_put_vara_float sic

so only one of the ranks exits nc_put_vara_float. The same thing happens with 4 cpus.

I now have a test script (test_netcdf_parallel.cpp)

#include <iostream>
#include <vector>
#include <mpi.h>
#include <netcdf.h>
#include <netcdf_par.h>

#define ERR(e) {if(e){printf("Error: %s\n", nc_strerror(e)); MPI_Abort(MPI_COMM_WORLD, e);}}

int main(int argc, char** argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    
    std::cout << "Rank " << rank << " of " << size << " starting" << std::endl;
    
    // File and variable IDs
    int ncid, varid;
    int dimids[2];
    
    // Global dimensions
    const size_t NROWS = 100;
    const size_t NCOLS = 100;
    
    // Each rank writes a horizontal slice
    const size_t rows_per_rank = NROWS / size;
    const size_t start_row = rank * rows_per_rank;
    const size_t my_rows = (rank == size - 1) ? (NROWS - start_row) : rows_per_rank;
    
    std::cout << "Rank " << rank << ": writing rows " << start_row 
              << " to " << (start_row + my_rows - 1) << std::endl;
    
    // Create file with parallel access
    int err;
    err = nc_create_par("test_parallel.nc", NC_NETCDF4 | NC_MPIIO, 
                        MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);
    ERR(err);
    
    if (rank == 0) {
        std::cout << "File created successfully" << std::endl;
    }
    
    // Define dimensions
    err = nc_def_dim(ncid, "rows", NROWS, &dimids[0]);
    ERR(err);
    err = nc_def_dim(ncid, "cols", NCOLS, &dimids[1]);
    ERR(err);
    
    // Define variable
    err = nc_def_var(ncid, "data", NC_FLOAT, 2, dimids, &varid);
    ERR(err);
    
    // End define mode
    err = nc_enddef(ncid);
    ERR(err);
    
    if (rank == 0) {
        std::cout << "Dimensions and variable defined" << std::endl;
    }
    
    // Prepare data - each rank fills with its rank number
    std::vector<float> data(my_rows * NCOLS);
    for (size_t i = 0; i < data.size(); i++) {
        data[i] = static_cast<float>(rank) + 0.1f * (i % 10);
    }
    
    std::cout << "Rank " << rank << ": data prepared, first value = " 
              << data[0] << std::endl;
    
    // Set collective access mode
    err = nc_var_par_access(ncid, varid, NC_COLLECTIVE);
    ERR(err);
    
    // Define hyperslab for this rank
    size_t start[2] = {start_row, 0};
    size_t count[2] = {my_rows, NCOLS};
    
    std::cout << "Rank " << rank << ": about to write with NC_COLLECTIVE" << std::endl;
    
    // Write data collectively
    err = nc_put_vara_float(ncid, varid, start, count, data.data());
    ERR(err);
    
    std::cout << "Rank " << rank << ": write completed successfully" << std::endl;
    
    // Synchronize
    MPI_Barrier(MPI_COMM_WORLD);
    
    if (rank == 0) {
        std::cout << "All ranks completed write" << std::endl;
    }
    
    // Close file
    err = nc_close(ncid);
    ERR(err);
    
    if (rank == 0) {
        std::cout << "File closed successfully" << std::endl;
        std::cout << "\nTo verify output, run:" << std::endl;
        std::cout << "  ncdump -v data test_parallel.nc | head -20" << std::endl;
    }
    
    MPI_Finalize();
    return 0;
}

It runs with make test using this Makefile:

executable=test_netcdf_parallel

.PHONY: all test mrproper

default: all

all:
	$(CXX) -o $(executable) $(executable).cpp \
		-I$(NETCDF_DIR)/include -L$(NETCDF_DIR)/lib -lnetcdf

mrproper:
	rm -f $(executable)

test: all
	mpirun -np 4 ./$(executable)

This test actually passes (it doesn't hang and makes a sensible .nc file) in my environment, so I am not sure about the difference between the call in nextsim and this one. I don't know if it would be worth having a test that was more similar to nextsim.

I don't know if you have any experience with containers - would you be able to try to make a container where your code runs? Then we would know we could run it anywhere. I can give you some recipe files to start from if you are willing to try that.

@fsalmon001
Author

Yes, it is strange @tdcwilliams. Could it be a conflict between different versions of netCDF, if several netCDF libraries are loaded? Maybe one processor runs the nc_put_vara_float function from one netCDF and another processor runs the function from a different netCDF, which prevents them from communicating. I don't know if this is possible.

No, I am not familiar with containers, but you can give me the files and I will see whether I can do it.

@tdcwilliams
Contributor

Hi @fsalmon001
thank you.

I've put the files here:
https://github.com/nansencenter/nextsim-env/tree/apptainer-netcdf-parallel/apptainer

On your personal computer, install Apptainer (sudo apt-get install apptainer on Ubuntu) and then do sudo apptainer build nextsim.sif nextsim.def.

The nextsim.def file has the compilation recipes that I used for the netCDF libraries.

Incidentally, on our HPC nextsim runs 2x faster inside Apptainer than with the Intel compilers, so I think it would be worth getting it to work for its own sake, and not just for portability.

Once you've built the image files and compiled the model, you may need some more help to run the model. On our HPC, I source this file: https://github.com/nansencenter/nextsim-env/blob/apptainer-netcdf-parallel/machines/tim/fram/env/pynextsim.apptainer.src

which mounts some directories with forcing data etc. inside the container and sets some variables inside it.

It works with moorings.parallel_output=false, so try that option first while you're working out how to run with the container.

@fsalmon001
Author

fsalmon001 commented Oct 31, 2025

Hi @tdcwilliams,

With your nextsim.def and nextsim.sh, the code already runs inside the container with moorings.parallel_output=true here, with:

[moorings]
#snapshot=true
output_timestep=1
spacing = 10
use_moorings=true
output_time_step_units=time_steps
file_length=monthly
variables =  conc
#variables = thick
#variables = velocity
#variables = ridge_ratio
#variables = damage
parallel_output = true

@tdcwilliams
Contributor

tdcwilliams commented Nov 1, 2025 via email

@tdcwilliams
Contributor

Hi @fsalmon001
I am still getting hangs inside this container with moorings.parallel_output=true - on our new HPC and also on my laptop.
Maybe I am making some mistake with running it - can you give some more details of how you ran it in your container?
Was it on an HPC or a laptop (if so, what sort)?

@fsalmon001
Author

Hi @tdcwilliams,

I am on my laptop, with ubuntu 22.04.3.

I just go into my container with apptainer shell nextsim.sif, then in the shell I compile nextsim (the moorings_parallel branch) and run my script, which includes the various exports of NEXTSIMDIR, NEXTSIM_MESH_DIR, etc., followed by
mpirun -np 16 $NEXTSIMDIR/model/bin/nextsim.exec --config-files=file.cfg

@tdcwilliams
Contributor

Hi @fsalmon001,
I still can't run it, but it sounds like you can on a few machines and in a container, so if you can check a couple more things we can merge it and hopefully we can use it later.

Can I double-check that it runs with the same mesh as I was using (large_arctic_10km.msh), and can you also try outputting on an irregular grid such as this one:
wget ftp://ftp.nersc.no/pub/timill/NEMO_025.nc

Copy it to $NEXTSIM_DATA_DIR and use the moorings options:

moorings.grid_type=from_file
moorings.grid_file=NEMO_025.nc
moorings.grid_latitude=plat
moorings.grid_longitude=plon

@tdcwilliams
Contributor

By the way @fsalmon001,
have you compared the memory use for parallel vs sequential moorings output?

@fsalmon001
Author

Hi @tdcwilliams, could you also send me your mesh file please? I only have the small Arctic mesh.

No, I did not rigorously check the memory, but each process keeps the same global grid, only filled with its local data, so the memory need can only decrease, though not as much as with a fully parallel decomposition. The final file is obviously the same size.

@tdcwilliams
Contributor

tdcwilliams commented Nov 10, 2025

Hi @fsalmon001
sure - you can get it from here:
wget ftp://ftp.nersc.no/pub/timill/large_arctic_10km.msh

@fsalmon001
Author

I did not expect that the irregular grid would be curvilinear. What I have coded is not suitable for this kind of grid. I tried to find a way to make it work, but I did not find a simple approach because my algorithm is based on rectangles. I think we need a completely different approach for this. I will think a bit more about it in case I find a workaround, but I don't think it will be possible. If I can't, I will remove the parts corresponding to irregular grids from my code @tdcwilliams.

@tdcwilliams
Contributor

Thanks @fsalmon001.

@tdcwilliams
Contributor

Hi @fsalmon001
@einola was just wondering about the environment inside your container as that could be different: could you do apptainer shell nextsim.sif and then send the result of env?

@fsalmon001
Author

The result is:

SHELL=/bin/bash
SESSION_MANAGER=local/ptb-09004091.bordeaux.inria.fr:@/tmp/.ICE-unix/4981,unix/ptb-09004091.bordeaux.inria.fr:/tmp/.ICE-unix/4981
QT_ACCESSIBILITY=1
COLORTERM=truecolor
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg
SSH_AGENT_LAUNCHER=gnome-keyring
GMSH_VERSION=3.0.6
XDG_MENU_PREFIX=gnome-
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
GTK_IM_MODULE=ibus
SINGULARITY_NAME=nextsim.sif
OPENMPI_LIB_DIR=/usr/lib/x86_64-linux-gnu/openmpi/lib
GNOME_SHELL_SESSION_MODE=ubuntu
HDF5_DIR=/opt/local/hdf5
SSH_AUTH_SOCK=/run/user/673553/keyring/ssh
SCOTCH_DIR=/home/fsalmon/scotch-master/
SCOTCH_INCDIR=/usr/local/include
SCOTCH_LIBDIR=/usr/local/lib
NEXTSIMDIR=/home/fsalmon/Bureau/nextsim
XMODIFIERS=@im=ibus
DESKTOP_SESSION=ubuntu
SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
MPI_INC_DIR=/usr/lib/x86_64-linux-gnu/openmpi/include
CCFLAGS=-O3 -fPIC
FFLAGS=-O3 -fPIC
GMSH_DIR=/opt/local/gmsh
GTK_MODULES=gail:atk-bridge
BOOST_INCDIR=/usr/include
PWD=/home/fsalmon/test
XDG_SESSION_DESKTOP=ubuntu
LOGNAME=fsalmon
XDG_SESSION_TYPE=x11
BOOST_DIR=/usr
NETCDF_FORTRAN_VERSION=4.5.4
GPG_AGENT_INFO=/run/user/673553/gnupg/S.gpg-agent:0:1
SYSTEMD_EXEC_PID=5587
CXX=mpicxx
CXXFLAGS=-O3 -pthread -fPIC -fopenmp
XAUTHORITY=/run/user/673553/gdm/Xauthority
OPENMPI_INCLUDE_DIR=/usr/lib/x86_64-linux-gnu/openmpi/include
NEXTSIM_MESH_DIR=/home/fsalmon/Bureau/nextsim_mesh_dir
USER_PATH=/opt/netcdf-parallel/bin:/home/fsalmon/mmg:/nextsim-tools/python/pynextsim/scripts/:/nextsimf/scripts:/nextsim/model/bin:/home/fsalmon/.local/bin:/home/fsalmon/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/home/fsalmon/nextsim/deps/gmsh-3.0.6-source/lib:/usr/local/go/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
NEXTSIM_DATA_DIR=/home/fsalmon/Bureau/nextsim_data_dir
WINDOWPATH=2
APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
APPTAINER_APPNAME=
HOME=/home/fsalmon
USERNAME=fsalmon
LANG=C.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
XDG_CURRENT_DESKTOP=ubuntu:GNOME
APPTAINER_COMMAND=shell
BOOST_VERSION=1.74
VTE_VERSION=6800
SINGULARITY_CONTAINER=/tmp/rootfs-2865454840/root
NETCDF_FOR_DIR=/opt/local/netcdf
HDF5_VERSION=1.10.7
GNOME_TERMINAL_SCREEN=/org/gnome/Terminal/screen/156c9f71_2b6e_4bfa_ac97_8275e8ac2a64
PTSCOTCH_INCDIR=/usr/local/include
NETCDF_DIR=/opt/local/netcdf
NETCDF_VERSION=4.8.1
LESSCLOSE=/usr/bin/lesspipe %s %s
XDG_SESSION_CLASS=user
APPTAINER_CONTAINER=/tmp/rootfs-2865454840/root
TERM=xterm-256color
F77=mpifort
LESSOPEN=| /usr/bin/lesspipe %s
USER=fsalmon
GNOME_TERMINAL_SERVICE=:1.87
NETCDF_CXX4_VERSION=4.3.1
DISPLAY=:1
SHLVL=2
NETCDF_CXX_DIR=/opt/local/netcdf
USE_NETCDF_PARALLEL=1
QT_IM_MODULE=ibus
APPTAINER_NAME=nextsim.sif
SINGULARITY_BIND=
PTSCOTCH_DIR=/home/fsalmon/scotch-master/
APPTAINER_BIND=
LD_LIBRARY_PATH=/opt/local/netcdf/lib:/opt/local/netcdf/lib:/opt/local/netcdf/lib:/opt/local/hdf5/lib:/opt/local/gmsh/lib::/.singularity.d/libs
XDG_RUNTIME_DIR=/run/user/673553
PS1=Apptainer> 
PTSCOTCH_LIBDIR=/usr/local/lib
BOOST_LIBDIR=/usr/lib/x86_64-linux-gnu
FC=mpifort
LC_ALL=C.UTF-8
XDG_DATA_DIRS=/usr/share/ubuntu:/usr/share/gnome:/usr/local/share/:/usr/share/:/var/lib/snapd/desktop
PATH=/opt/local/netcdf/bin:/opt/local/hdf5/bin:/opt/local/gmsh/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
GDMSESSION=ubuntu
CC=mpicc
CFLAGS=-O3 -fPIC
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/673553/bus
_=/usr/bin/env

@fsalmon001
Copy link
Author

In addition @tdcwilliams, I think I have found a way to deal with irregular grids. There are more parallel exchanges, so it should be less time-efficient, but in terms of memory it should be similar to the regular grids. It works here.

Moreover, I had an issue with hanging runs. It stemmed from the MPI exchanges, so I replaced them with non-blocking exchanges. Could you then try again with this commit, using both the regular and irregular grids please? Hopefully this solves your problem.
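
For context, here is a generic sketch of that kind of change (illustrative only, not the actual commit; the buffer layout and names are placeholders): blocking point-to-point exchanges, whose ordering can deadlock, are posted as non-blocking requests and completed together.

#include <mpi.h>
#include <vector>

// One send buffer and one (pre-sized) receive buffer per peer rank;
// empty buffers mean "nothing to exchange with that rank".
void exchange_nonblocking(MPI_Comm comm,
                          const std::vector<std::vector<double>>& send,
                          std::vector<std::vector<double>>& recv)
{
    int nprocs;
    MPI_Comm_size(comm, &nprocs);
    std::vector<MPI_Request> requests;
    requests.reserve(2 * static_cast<size_t>(nprocs));

    // Post all receives and sends without waiting on any single peer...
    for (int p = 0; p < nprocs; ++p)
    {
        if (!recv[p].empty())
        {
            requests.emplace_back();
            MPI_Irecv(recv[p].data(), static_cast<int>(recv[p].size()), MPI_DOUBLE,
                      p, 0, comm, &requests.back());
        }
        if (!send[p].empty())
        {
            requests.emplace_back();
            MPI_Isend(send[p].data(), static_cast<int>(send[p].size()), MPI_DOUBLE,
                      p, 0, comm, &requests.back());
        }
    }

    // ...then complete them all at once, whatever order the peers progress in.
    MPI_Waitall(static_cast<int>(requests.size()), requests.data(), MPI_STATUSES_IGNORE);
}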

@tdcwilliams
Contributor

Hi @fsalmon001
thanks - that sounds like good progress.

I tried the latest code but it still hangs.

Could you try one more thing with the container please? apptainer shell --cleanenv --bind path/to/nextsim:/nextsim nextsim.sif and see if it still works?

@fsalmon001
Author

Hi @tdcwilliams, with your command it still works, with both the regular and irregular grids.

@tdcwilliams
Contributor

Thanks @fsalmon001.
