Naveenraj Kamalakannan
I am a Master's student in Computer Engineering at New York University and a Research Assistant at the NYU Center for Data Science and NYU Langone, where I work on sub-second action detection with Vision Language Models (InternVL3, NVILA, LLaVA-OneVision) for stroke rehabilitation, advised by Prof. Carlos Fernandez-Granda, Victor Li and Prof. Heidi Schambra. Previously, I was an AI & Data Science Associate Intern at J.P. Morgan (Asset & Wealth Management), building agentic applications that mimic the collective thought process of teams.
Research Interests:
- LLM Inference & Training Systems
- Multimodal Reasoning & Human Motion Understanding
- Evaluation & Interpretability
- Distributed Systems for Large-Scale Models
I am passionate about building scalable deep learning infrastructure and I actively contribute to open-source projects such as vLLM, Microsoft’s DeepSpeed, NVIDIA TensorRT-LLM and Snowflake’s ArcticInference.
I earned my Bachelor's in Electronics Engineering from Vellore Institute of Technology, where I collaborated with Prof. Sudhakar MS on medical image processing, developing the Exponential Pixelating Integral Transform for chest X-ray abnormality detection.
Email / GitHub / Google Scholar / LinkedIn / CV
Open Source Contributions
Contributions to major open-source projects, including vLLM, NVIDIA TensorRT-LLM, Microsoft DeepSpeed, and Snowflake ArcticInference.
Tree-of-Thought (ToT) & MCTS Integration - NVIDIA TensorRT-LLM
Open Source Contribution
pull request
Integrated Tree-of-Thought (ToT) and MCTS controllers into the AutoDeploy scaffolding framework in PR #7490 to enable multi-step reasoning flows and experimentation with search-based inference strategies.
Separate MLAAttention from Attention - vLLM
Open Source Contribution
pull request
Refactored Multi-Head Latent Attention (MLA) to decouple prefill/decode paths from a unified custom op, enabling torch.compile fusion and piecewise CUDA Graph capture, reducing Python overhead and making MLA more modular and easier to experiment with.
FlashInfer Backend Support for SwiftKV - Snowflake ArcticInference
Open Source Contribution
pull request
Integrated FlashInfer backend support into SwiftKV in PR #124, optimizing high-throughput KV-cache-aware decoding and enabling more efficient large-scale LLM serving within the ArcticInference stack.
Bug Fix: Gradient Norm Calculation for CPU Offload - Microsoft DeepSpeed
Open Source Contribution
pull request
Fixed a critical ZeRO-3 CPU-offload gradient clipping bug in PR #7302, ensuring global gradient norms correctly reflect clipped gradients during offload scenarios and improving training stability for large models. I am also collaborating with Olatunji Ruwase on PyTorch Core (Issue #158187) to implement Zip serialization support for DeepNVMe.
Research & Publications
I'm interested in distributed training, model optimization, reinforcement learning and AI applications in finance, healthcare and robotics.
The Potential and Limitations of Vision-Language Models for Human Motion Understanding
Victor Li, Naveenraj Kamalakannan, et al.
arXiv preprint, 2025
arXiv
Benchmarked SOTA VLMs (InternVL3, NVILA, LLaVA-OneVision) and engineered a pose-refined prompting pipeline that integrates YOLOv11 pose tracks with VLM context to extract sub-second motion primitives, achieving an Edit Score (ES) of ~67.75 on structured upper-limb rehab tasks.
Exponential Pixelating Integral Transform with Dual Fractal Features for Enhanced Chest X-Ray Abnormality Detection
Kamalakannan N, Macharla S, Kanimozhi M, Sudhakar M S
Computers in Biology and Medicine, Volume 182, 2024
paper
Built a chest X-ray abnormality detection model using the Exponential Pixelating Integral Transform and fractal features. Implemented a Multivariate Adaptive Regression Splines (MARS) ensemble, achieving 99.63% accuracy and F1 scores up to 98.10%.
A Novel Approach for the Early Detection of Parkinson's Disease Using EEG Signal
Kamalakannan, Naveenraj, Shiva Prasaath Sudha Balamurugan, Kalaivani Shanmugam
IJEET 12.5 (2021): 80-95
paper
Led a team to develop a deep learning model that analyzes EEG signals, achieving 93.3% accuracy and an F1 score of 93.48% in detecting early-stage Parkinson’s disease. Presented the findings at the University of Tübingen Symposium.
Notable Projects
Cool projects that actually won things and made people go "wow, that's neat!"
Early Detection of Sepsis - National Hackathon Winner
VIT National Hackathon
First Place - Design Category
2020-03-01
Led a cross-functional team to develop a sepsis onset detection model using critical biomarkers (PCT and MDW), securing first place in the Design Category and winning a $2,000 grant from the VIT Incubator. The project focused on early detection of sepsis, a critical medical condition requiring rapid intervention.
Other Projects
Fun side projects where I get to play with robots, build AI stuff, and generally tinker with cool tech.
Starbots.AI Automated Cafeteria System
Robotics Project - The Construct Bootcamp
2024-05-01
website / video / code
Built an autonomous mobile robot using ROS2, RRT* path planning, and OMPL for navigation. Integrated CNNs for object detection, SLAM for mapping, and robotic manipulation for food handling. Developed with Python, C++, and Gazebo simulation as part of The Construct Robotics Bootcamp.
Adaptive Monte-Carlo Localization Warehouse Robot
Robotics Project - The Construct Bootcamp
2024-03-15
website / code
Built an autonomous warehouse robot using Adaptive Monte-Carlo Localization (AMCL) and Cartographer SLAM for precise positioning. Implemented the Nav2 navigation stack with costmap generation, path planning, and obstacle avoidance. Developed with ROS2, Python, and Gazebo simulation for the RB1 robot platform.