Overview
Project description
Working on GPU support for OpenAI / Triton — a language and compiler for writing highly efficient custom Deep-Learning primitives. Work with the open-source community to analyze, develop, test, and deploy performance improvements for neural networks implemented with Triton on GPUs with ROCm.
Responsibilities
- Develop new features for, support, and optimize the OpenAI / Triton project for GPUs.
- Communicate with other developers, customers, and project managers.
- Implement tests, write project documentation, and verify the system with unit / component / functional tests.
Skills
Must have
- Strong C / C++ programming skills
- Experience with compiler internals (LLVM, GCC, or any other)
- Basic Python programming skills
- Experience in performance analysis
Nice to have
- Basic understanding of ML technologies
- Experience with GPGPU (general-purpose GPU) computing (HIP, CUDA, OpenCL, etc.)
- Experience with PyTorch
- Experience with LLVM and MLIR compiler infrastructure, analysis or optimizations implementation
- Knowledge of ROCm infrastructure
- Experience in CMake, make / ninja build systems
- GEMM performance fundamentals
- Experience with Docker