AI/ML Engineer | Agentic AI Systems | Multi-Agent RAG | Applied ML Research

Software Development Engineer | Full-Stack Systems | Cloud & Distributed Platforms

Deepak Sai Pendyala

I am an AI/ML engineer and applied researcher focused on building reliable, scalable, and explainable AI systems. I specialize in agentic AI architectures, multi-agent orchestration, Retrieval-Augmented Generation, and end-to-end ML platforms.

I currently conduct applied AI research at North Carolina State University while building enterprise-grade agentic platforms in industry.

I’m a software engineer focused on building scalable, secure, and high-performance systems, with experience across full-stack platforms, microservices, distributed workflows, and cloud-native infrastructure.

My work bridges backend systems, frontend platforms, DevOps, and data-heavy applications, with strong engineering rigor around reliability, performance, and maintainability. I currently engineer full-stack platforms at North Carolina State University and have shipped scalable systems in industry environments.

  • Autonomous multi-agent reasoning systems
  • Hybrid retrieval pipelines combining vector search, web data, and knowledge graphs
  • Automated AI evaluation frameworks for reliability and failure detection
  • Production ML systems with strong MLOps foundations
  • Clean service architectures
  • Optimized databases and APIs
  • CI/CD pipelines and containerized deployments
  • Real-time data processing and monitoring
Multi-agent reasoning | Hybrid retrieval + KG | Automated evaluation | MLOps at scale
Full-stack systems | Microservices + APIs | Cloud-native deployments | Reliability + performance

Now building

Advanced agentic AI platforms for scientific labs, with citation-grounded reasoning and evaluation.

Now shipping

Scalable full-stack platforms with clean architectures, optimized APIs, and cloud-native deployments.

Experience

Applied research and production engineering across AI, analytics, and MLOps.

Full-stack systems engineering across research platforms, cloud infrastructure, and distributed workflows.

Graduate Research Assistant - Applied AI

STEPS Center, North Carolina State University

  • Built a scalable on-prem agentic AI platform for scientific labs.
  • Integrated paper scraping, document parsing, and knowledge graph ingestion.
  • Deployed autonomous analysis agents for extraction, comparison, and visualization.

Deployed local LLMs ranging from 3B to 70B parameters across heterogeneous hardware.

  • Automated AI evaluation with synthetic Q/A, LLM-judge scoring, and three-stage checks.
  • Added failure detection and rerouting for continuous reliability monitoring.

AI Engineer Intern

SproutsAI

  • Built an agentic analytics platform turning natural language into chart-ready insights.
  • Implemented multi-agent query planning, hybrid retrieval with Qdrant + Neo4j, and automated evaluation.
  • Deployed on Kubernetes with CI/CD and monitoring.

Impact: 60% accuracy gain, 40% latency reduction, 99.9% uptime, 2x engagement.

Research Assistant - Machine Learning

CAMAL Lab, North Carolina State University

  • Built real-time ML monitoring for DARPA-funded additive manufacturing systems.
  • Integrated predictive models into production pipelines for quality control.
  • Developed anomaly detection with FastAPI, OpenCV, and GPU-optimized inference.

Impact: 4.3x reduction in inference time, 70% improvement in accuracy and efficiency.

Applied Scientist Intern

Amazon

  • Developed ML and GenAI automation for finance and tax workflows.
  • Built multilingual commodity code classifier using transformer models.
  • Created event-driven MLOps pipelines with automated retraining loops.

Impact: 60% manual review reduction, 90% effort decrease, 80% scalability improvement.

Graduate Research Assistant - Full-Stack Platform Engineering

STEPS Center, North Carolina State University

  • Designed and managed a full-stack Django + React AI platform for research workflows and demos.
  • Implemented RBAC authentication, audit logging, and optimized REST APIs for high-traffic usage.
  • Built responsive UI components for research visualization.
  • Redesigned databases with normalized schemas, indexing, and query tuning.

Impact: ~40% improvement in API latency and frontend responsiveness, plus higher stability.

AI Engineer Intern (Full-Stack & Cloud Systems)

SproutsAI

  • Built a full-stack analytics platform using React, FastAPI, and MongoDB.
  • Containerized microservices, parallelized query execution, and added caching layers.
  • Deployed on AWS and GCP Kubernetes clusters with GitHub Actions CI/CD.
  • Integrated monitoring and evaluation pipelines for reliability.

Impact: 3x faster queries, 40% latency reduction, 99.9% uptime.

Research Assistant - Real-Time Systems Engineering

CAMAL Lab, North Carolina State University

  • Built a FastAPI real-time anomaly detection platform for manufacturing workflows.
  • Implemented live video processing with OpenCV and GPU-optimized inference.
  • Added socket-based alerts, async processing, and monitoring dashboards.

Impact: 4.3x reduction in inference latency and earlier fault detection.

Applied Scientist Intern (Systems + MLOps Engineering)

Amazon

  • Engineered event-driven ML workflows with SageMaker, Lambda, and Step Functions.
  • Built fault-tolerant async inference with retries and clean service boundaries.
  • Integrated microservices with ECS, API Gateway, and CloudWatch.
  • Defined REST API contracts and connected frontend systems to ML backends.

Impact: 90% reduction in manual review, 80% scalability gain, ~9 minutes faster per job.

Core Expertise

Deep focus on agentic AI, RAG, and scalable ML systems.

Full-stack engineering with rigor around reliability, performance, and maintainability.

Agentic AI and Multi-Agent Systems

LangGraph orchestration, modular agent roles, shared state models, and self-healing pipelines.

Retrieval-Augmented Generation

Hybrid retrieval, RAG-Fusion, query expansion, multi-hop retrieval, citation-backed generation.

LLM Engineering

Fine-tuning large language models, multilingual NLP systems, prompt engineering, agent workflows.

Machine Learning

Deep learning with PyTorch and TensorFlow, time-series forecasting, computer vision, optimization.

MLOps and AI Infrastructure

Automated retraining, event-driven workflows, CI/CD for ML systems, scalable inference on Kubernetes.

Data and Knowledge Systems

Vector databases, knowledge graphs, structured and unstructured data ingestion.

Full-Stack Development

Django + React platforms, FastAPI microservices, REST API design, contract testing, responsive UI.

Backend & Distributed Systems

Microservices architecture, async I/O, parallel execution, caching, query optimization, event-driven workflows.

Cloud & DevOps

Dockerized applications, Kubernetes on AWS/GCP, GitHub Actions CI/CD, monitoring and logging.

Data & Infrastructure

PostgreSQL, MySQL, MongoDB, DynamoDB, Redis, schema design, migrations, index tuning.

Reliability & Engineering Practices

Unit and integration testing, error handling and retries, observability dashboards, clean architecture.

Publications

RL-CURATE-KG: Multi-Agent RL for Scalable KG Curation (IEEE Big Data 2025).

Generative Transformers and Text Generation Models (IET Generative AI Unleashed 2025).

Leadership and Recognition

Founder and Lead, Intel IoT Club (2,000+ members, 10+ national AI trainings).

Top 10 Global DeepLearning.AI Ambassador (2022).

Top 0.05% Amazon ML Summer School (converted to Applied Scientist Intern).

Certifications

Intel Edge AI Developer.

AWS Machine Learning Foundations.

Google IT Support Specialization.

Engineering Tech Stack

Languages: Python, C, C++, JavaScript, Bash, MATLAB, Embedded C, Assembly.

Frameworks: React, Node.js, Django, FastAPI, Flask, Streamlit.

Cloud & DevOps: AWS, GCP, Docker, Kubernetes, GitHub Actions CI/CD.

Data Systems: PostgreSQL, MySQL, MongoDB, DynamoDB, Redis.

Leadership and Recognition

Founder and Lead, Intel IoT Club (2,000+ members, 10+ national hackathons and trainings).

Top 10 Global DeepLearning.AI Ambassador (2022).

Top 0.05% Amazon ML Summer School (converted to internship).

Engineering Philosophy

I focus on clean, maintainable architectures, measurable performance improvements, reliability over unnecessary complexity, and strong automation and testing.

I enjoy building systems that scale, stay stable under load, and are easy for teams to extend.

Flagship Projects

Selected AI/ML systems that combine research depth with production impact.

Selected software systems focused on clean architecture, performance, and reliability.

ActiveRAG Next

A production-grade, explainable multi-agent RAG platform.

  • LangGraph-based orchestration with coordinator, retrieval, reasoning, validation, and feedback agents.
  • Hybrid retrieval with RAG-Fusion and web augmentation.
  • Citation-backed responses with confidence-triggered reruns.
  • Real-time Streamlit UI with traceable agent execution.

Stack: Python, FastAPI, LangGraph, Streamlit, Qdrant, Neo4j.
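
The RAG-Fusion step in hybrid retrieval is commonly built on reciprocal rank fusion, which merges several ranked result lists into one. The snippet below is a minimal sketch of that idea only, not the ActiveRAG Next code: the function name, the document IDs, the three example result lists, and the constant k=60 are all illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID so the
    # output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from three retrievers for the same query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
graph_hits = ["doc_b", "doc_d", "doc_a"]
web_hits = ["doc_b", "doc_c", "doc_e"]

fused = reciprocal_rank_fusion([vector_hits, graph_hits, web_hits])
print(fused)
```

Here doc_b comes out first because every list ranks it near the top, while documents seen by only one retriever fall to the end.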

Predictive Power Price Tagging

Hybrid optimization and forecasting system for energy markets.

  • LSTM-based deep learning for trajectory forecasting.
  • Genetic algorithms for resource allocation and pricing strategies.
  • Economic Load Dispatch optimization for market clearing price forecasting.

Full-Stack Research Platform (Django + React)

Production-ready web platform supporting research demos and workflows.

  • Secure authentication with RBAC and audit logging.
  • Optimized REST APIs and responsive UI components.
  • Database schema redesign, migrations, and performance tuning.
  • CI/CD automation and rapid server portability.

ActiveRAG Next - Microservice-Style Multi-Agent Platform

AI-focused system showcasing strong SDE design and orchestration.

  • Modular microservice-like agents with shared state via Pydantic models.
  • Async I/O communication with FastAPI backend and Streamlit frontend.
  • Emphasis on clean interfaces, scalability, and observability.

Enigma - Distributed Discord Music System

Scalable music streaming and recommendation platform.

  • Multi-source streaming, Spotify-integrated search, queue management.
  • Recommendation engine over 24K+ tracks with playlist curation.
  • Cloud deployment on GCP with CI/CD, logging, and monitoring.

Event-Driven ML Orchestration Systems (Industry)

Reusable cloud workflows for automated ML operations.

  • Step Functions orchestration with Lambda triggers and SageMaker jobs.
  • Fault tolerance with retries and async execution.
  • Operational monitoring and clean service boundaries.
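
The retry-plus-async pattern behind this kind of fault tolerance can be sketched in plain Python. This is an illustration under stated assumptions, not the production workflow: `retry_async`, `flaky_inference`, and the delay values are hypothetical, and a managed orchestrator would usually handle retries through its own configuration rather than application code.

```python
import asyncio
import random


async def retry_async(op, *, attempts=3, base_delay=0.1, max_delay=2.0):
    """Await op() up to `attempts` times with jittered exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await op()
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Double the delay on each failure, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter keeps many concurrent callers from retrying in lockstep.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))


calls = {"n": 0}


async def flaky_inference():
    """Hypothetical inference call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return {"status": "ok", "attempts": calls["n"]}


result = asyncio.run(retry_async(flaky_inference, base_delay=0.01))
print(result)
```

The caller sees a single successful result on the third attempt. If every attempt fails, the final exception propagates so an upstream step can reroute or alert.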

Get In Touch

Have a project, research idea, or opportunity? I would love to collaborate.

Location

Raleigh, NC, USA

Availability

Open to research collaborations, AI product work, and internships.

Book a Call