Grid does not compile on Arm with CUDA #450

@edbennett

Describe the issue:

Compiling Grid for NVIDIA on Arm fails with a large number of undefined symbols in arm_neon.h. Following up on @agsunderland's comment on #430, and after some discussion with @RChrHill, the issue is that including Eigen with __CUDACC__ undefined causes Eigen to emit code using NEON vector instructions, which, according to this post on NVIDIA's forums, NVCC does not yet support.
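To check which of the macros mentioned above a given compiler invocation actually defines, a small probe along these lines may help. It is only a diagnostic sketch: the function name `macro_report` is hypothetical, and the assumption that Eigen keys its NEON path on `__ARM_NEON` / `__ARM_NEON__` should be confirmed against the Eigen version in use.

```cpp
#include <string>

// Hypothetical helper (not part of Grid or Eigen): returns a space-separated
// list of the relevant macros that are defined in this translation unit.
inline std::string macro_report()
{
    std::string out;
#if defined(__CUDACC__)
    out += "__CUDACC__ ";      // set when nvcc compiles the file as CUDA source
#endif
#if defined(__NVCC__)
    out += "__NVCC__ ";        // set when nvcc is the driving compiler
#endif
#if defined(__CUDA_ARCH__)
    out += "__CUDA_ARCH__ ";   // set only during the device compilation pass
#endif
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    out += "NEON ";            // assumption: this is what Eigen tests for NEON
#endif
#if defined(__aarch64__)
    out += "aarch64 ";
#endif
    return out.empty() ? std::string("(none)") : out;
}
```

Printing the result of `macro_report()` from `main`, once when built with the host compiler and once with `nvcc -x cu`, shows how the two environments differ.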

A workaround is to define the EIGEN_DONT_VECTORIZE macro, for example by adding -DEIGEN_DONT_VECTORIZE to CXXFLAGS; this stops Eigen from using SIMD at all. I don't know how much performance this costs compared with letting Eigen use NEON for the work it does on the CPU. Upgrading to Eigen 3.4.0 did not fix the problem.
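As an alternative to passing the flag on the command line, the macro can be defined in source before the first Eigen header. The following is a minimal sketch for ordinary host C++, not a tested fix for the nvcc build: `HAVE_EIGEN` and `eigen_simd_disabled` are hypothetical names, and it assumes Eigen defines `EIGEN_VECTORIZE` only when a SIMD path is enabled.

```cpp
// Must appear before the first Eigen header in the translation unit.
#ifndef EIGEN_DONT_VECTORIZE
#define EIGEN_DONT_VECTORIZE
#endif

// Include Eigen only if it is on the include path, so the sketch still
// compiles on a machine without it. HAVE_EIGEN is a hypothetical marker.
#if defined(__has_include)
#  if __has_include(<Eigen/Dense>)
#    include <Eigen/Dense>
#    define HAVE_EIGEN 1
#  endif
#endif

// Fail the build early if Eigen enabled vectorization anyway.
// Assumption: EIGEN_VECTORIZE is defined only when SIMD is in use.
#if defined(HAVE_EIGEN) && defined(EIGEN_VECTORIZE)
#  error "Eigen vectorization is still enabled despite EIGEN_DONT_VECTORIZE"
#endif

// Hypothetical helper: true when no Eigen SIMD path is active here.
inline bool eigen_simd_disabled()
{
#if defined(EIGEN_VECTORIZE)
    return false;
#else
    return true;
#endif
}
```

The `#error` guard turns a silently ignored macro (for instance, one defined after an earlier Eigen include) into a compile-time failure instead of a wall of arm_neon.h errors.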

The minimal example below was compiled with:

nvcc -x cu -I../../Grid -O3 -o eigen_hello eigen_hello.cc

(Replace ../../Grid with the path to a directory where Eigen is available.)

Code example:

#include <iostream>

// Uncomment the five commented-out lines below to be able to compile when EIGEN_DONT_VECTORIZE is defined. They are not needed to trigger the error seen with full Grid.
//#undef __CUDA_ARCH__
//#undef __NVCC__
#undef __CUDACC__
#include <Eigen/Dense>
//#define __CUDA_ARCH__
//#define __NVCC__
//#define __CUDACC__

int main()
{
  return 0;
}

Target platform:

This is the Arm GPU testbed in Leicester; there is no model name in the cpuinfo.

Configure options:

N/A
