Scott Cheng chengscott

78 followers · 102 following

Open to Work
PhD candidate @ Penn State CSE
chengscott.io

Pinned

AITemplate (Public)

Forked from facebookincubator/AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python
C++ multiprocessing Barrier/CondVar/SharedMemory using SysV IPC

#include "ipc.hpp"
#include <cassert>
#include <cstring>
#include <sys/ipc.h>
#include <sys/sem.h>
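
The preview above shows only the includes, so the gist's own classes are not visible here. As a rough illustration of the System V semaphore calls that such primitives are usually built on, here is a minimal sketch. The names `sysv_semaphore` and `semun_arg` are hypothetical and not taken from the gist; only `semget`, `semctl`, and `semop` are real system calls. A barrier or condition variable would layer further logic on top of a primitive like this.

```cpp
#include <stdexcept>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

// Linux expects the caller to define the union passed to semctl().
// The name semun_arg is hypothetical (chosen to avoid clashing with
// platforms that already define `union semun`).
union semun_arg {
  int val;
  struct semid_ds* buf;
  unsigned short* array;
};

// Hypothetical RAII wrapper around a single SysV semaphore.
class sysv_semaphore {
 public:
  explicit sysv_semaphore(int initial)
      : id_(semget(IPC_PRIVATE, 1, IPC_CREAT | 0600)) {
    if (id_ == -1) throw std::runtime_error("semget failed");
    semun_arg arg;
    arg.val = initial;
    if (semctl(id_, 0, SETVAL, arg) == -1) {
      semctl(id_, 0, IPC_RMID);
      throw std::runtime_error("semctl(SETVAL) failed");
    }
  }
  ~sysv_semaphore() { semctl(id_, 0, IPC_RMID); }  // remove the kernel object

  sysv_semaphore(const sysv_semaphore&) = delete;
  sysv_semaphore& operator=(const sysv_semaphore&) = delete;

  void post() { change(1); }   // increment; may wake a waiter
  void wait() { change(-1); }  // decrement; blocks while the value is 0

  // Current counter value, or -1 on error.
  int value() const { return semctl(id_, 0, GETVAL); }

 private:
  void change(short delta) {
    sembuf op{};
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = 0;
    if (semop(id_, &op, 1) == -1) throw std::runtime_error("semop failed");
  }

  int id_;
};
```

Because the semaphore lives in the kernel rather than in process memory, child processes created with `fork()` after construction can operate on the same counter, which is what makes this family of calls suitable for multiprocess synchronization.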
flat_array_t and flat_tuple_t are automatically hashable type wrappers for std::array<T, N> and std::tuple<Ts...> that can be used as keys in map/unordered_map

#include <array>
#include <functional>
#include <iostream>
#include <tuple>
#include <unordered_map>
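
The body of the gist is not shown in this preview. To illustrate the general idea of making a `std::array` usable as an `unordered_map` key, here is a minimal sketch under stated assumptions. `hashable_array`, `point3`, and `lookup_label` are hypothetical names, not the gist's `flat_array_t`, and the hash mixing step is the widely used Boost-style combine, which may differ from what the gist does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical wrapper: a std::array plus the equality operator that
// unordered_map requires of its key type.
template <typename T, std::size_t N>
struct hashable_array {
  std::array<T, N> data;
  bool operator==(const hashable_array& other) const {
    return data == other.data;
  }
};

// Specializing std::hash for a user-defined type is permitted, and it
// lets unordered_map pick up the hash without extra template arguments.
namespace std {
template <typename T, size_t N>
struct hash<hashable_array<T, N>> {
  size_t operator()(const hashable_array<T, N>& key) const noexcept {
    size_t seed = 0;
    for (const T& element : key.data) {
      // Boost-style hash_combine step (an assumption, not the gist's code).
      seed ^= hash<T>{}(element) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
    return seed;
  }
};
}  // namespace std

using point3 = hashable_array<int, 3>;

// Builds a small map keyed by arrays and looks up one key.
inline std::string lookup_label(const point3& query) {
  std::unordered_map<point3, std::string> labels;
  labels[point3{std::array<int, 3>{1, 2, 3}}] = "first";
  labels[point3{std::array<int, 3>{4, 5, 6}}] = "second";
  assert(labels.size() == 2);  // the two keys compare unequal
  auto it = labels.find(query);
  return it == labels.end() ? std::string("missing") : it->second;
}
```

An ordered `std::map` would need `operator<` on the wrapper instead of a hash; forwarding to the comparison that `std::array` already provides would be one way to supply it.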

host_vector (a std::vector with a cu...

#include "host_vector.hpp"
#define DEVICE_CHECK(call)                                                                         \
  if ((call) != cudaSuccess) {                                                                     \
    throw std::runtime_error(#call " API call failed: " + GetLastErrorString() + " at " +          \
                             __FILE__ + ", line" + std::to_string(__LINE__));                      \

pdf_crop (Public)

Python
dlp2019 (Public archive)

Deep Learning and Practice 2019

Python