AI Engineer – Model Optimization & Acceleration
Advanced Micro Devices, Inc.
Bengaluru, Karnataka, India
Mid-level
Job Description
We are seeking an AI Engineer to optimize and deploy ML models across heterogeneous hardware platforms.
Responsibilities
- Optimize diverse models: generative (LLMs, diffusion), vision (classification, detection, segmentation), multi-modal, and speech
- Port models across frameworks (e.g., PyTorch → ONNX → runtimes)
- Deploy on hardware accelerators (GPU/NPU) and optimize performance
- Improve inference latency, throughput, and memory usage (batching, caching, parallelism, operator fusion)
- Apply quantization and model compression (FP32 → lower precision)
- Profile and debug system and model performance
Qualifications
- Strong experience with PyTorch (or a similar framework) and ONNX (or an equivalent format)
- Proficient in Python and C++
- Experience with GPU/hardware acceleration (CUDA/ROCm or similar)
- Solid understanding of deep learning models (transformers, CNNs)
- Knowledge of optimization, quantization, and performance tuning
- Experience with edge AI or embedded deployment
- Experience with generative or multi-modal AI systems
- Experience with distributed inference or streaming pipelines