Naveenraj Kamalakannan

I am a Master's student in Computer Engineering at New York University and a Research Assistant at the NYU Center for Data Science and NYU Langone, where I work on sub-second action detection with Vision Language Models (Qwen 2.5, InternVL3) for stroke rehabilitation, advised by Prof. Carlos Fernandez-Granda and Victor Li. Previously, I was an AI & Data Science Associate Intern at J.P. Morgan (Asset & Wealth Management), building agentic applications that mimic the collective thought process of teams.

Research Interests:

  • Vision Language Models - Training and Fine-Tuning
  • Efficient inference systems: KV cache and kernel/attention optimizations
  • Distributed training and systems for large-scale models
  • LLM alignment for agentic planning and tool use

I am passionate about building scalable deep learning infrastructure, and I actively contribute to open-source projects such as vLLM, Microsoft’s DeepSpeed, and Snowflake’s ArcticInference.

I earned my Bachelor's in Electronics Engineering from Vellore Institute of Technology, where I collaborated with Prof. Sudhakar MS on medical image processing, developing the Exponential Pixelating Integral Transform for chest X-ray abnormality detection.

Email  /  GitHub  /  Google Scholar  /  LinkedIn  /  CV


Open Source Contributions

Contributions to major open source projects including vLLM, Microsoft DeepSpeed, and Snowflake ArcticInference.


Separate MLAAttention from Attention - vLLM


Open Source Contribution
pull request

Contributed to vLLM by separating Multi-Head Latent Attention into its own AttentionLayerBase subclass in PR #25103, enabling cleaner kernels and future features (e.g., FLASHMLA_SPARSE registration and modular kernel refactors).


FlashInfer Backend Support for SwiftKV - Snowflake ArcticInference


Open Source Contribution
pull request

Added FlashInfer backend support for SwiftKV in PR #124, with automatic backend detection and improved throughput.


Bug Fix: Gradient Norm Calculation for CPU Offload - Microsoft DeepSpeed


Open Source Contribution
pull request

Fixed a gradient norm calculation bug in PR #7302 that kept gradient clipping from working correctly with CPU offloading in ZeRO-3. Added unit tests covering different precision modes and gradient clipping scenarios.




Research & Publications

I'm interested in distributed training, model optimization, reinforcement learning, and AI applications in finance, healthcare, and robotics.


Exponential Pixelating Integral Transform with Dual Fractal Features for Enhanced Chest X-Ray Abnormality Detection


Kamalakannan N, Macharla S, Kanimozhi M, Sudhakar M S
Computers in Biology and Medicine, Volume 182, 2024
paper

Built a chest X-ray abnormality detection model using the Exponential Pixelating Integral Transform and fractal features. Implemented a Multivariate Adaptive Regression Splines (MARS) ensemble, achieving 99.63% accuracy and F1 scores up to 98.10%.


A Novel Approach for the Early Detection of Parkinson's Disease Using EEG Signal


Kamalakannan, Naveenraj, Shiva Prasaath Sudha Balamurugan, Kalaivani Shanmugam
IJEET 12.5 (2021): 80-95
paper

Led a team that developed a deep learning model to analyze EEG signals, achieving 93.3% accuracy and an F1 score of 93.48% in detecting early-stage Parkinson’s disease. Presented the findings at the University of Tübingen Symposium.




Notable Projects

Cool projects that actually won things and made people go "wow, that's neat!"


Early Detection of Sepsis - National Hackathon Winner


VIT National Hackathon
First Place - Design Category
2020-03-01

Led a cross-functional team to develop a sepsis onset detection model using critical biomarkers (PCT and MDW), securing first place in the Design Category and winning a $2,000 grant from the VIT Incubator. The project focused on early detection of sepsis, a critical medical condition requiring rapid intervention.




Other Projects

Fun side projects where I get to play with robots, build AI stuff, and generally tinker with cool tech.


Starbots.AI Automated Cafeteria System


Robotics Project - The Construct Bootcamp
2024-05-01
website / video / code

Built an autonomous mobile robot using ROS2, RRT* path planning, and OMPL for navigation. Integrated CNNs for object detection, SLAM for mapping, and robotic manipulation for food handling. Developed with Python, C++, and Gazebo simulation as part of The Construct Robotics Bootcamp.


Adaptive Monte-Carlo Localization Warehouse Robot


Robotics Project - The Construct Bootcamp
2024-03-15
website / code

Built an autonomous warehouse robot using Adaptive Monte-Carlo Localization (AMCL) and Cartographer SLAM for precise positioning. Implemented the Nav2 navigation stack with costmap generation, path planning, and obstacle avoidance. Developed with ROS2, Python, and Gazebo simulation for the RB1 robot platform.


Design and source code from Jon Barron's website