We optimize your neural network training and inference pipelines for your target hardware.
We specialize in NVIDIA GPUs and libraries: CUDA, cuBLAS, CUTLASS, cuDNN, CuTe, NCCL, and NVSHMEM. We can also work with other platforms and DSLs on request.
Our team is experienced in quantization, pruning, distillation, distributed systems, parallelization, sharding, compilers, and custom kernels such as FlashAttention variants.
We also train custom models on your proprietary data while maintaining privacy and security. We have completed projects in the following areas:
• Computer Vision & Image Processing
• SLAM & Robotics Systems
• Reinforcement Learning
• Large Language Models & Fine-tuning
• 3D Graphics & Rendering