About the internship
Job Title: AI Intern (Computer Vision & Visual Reasoning)
Location: Remote / Hybrid
Internship Type: Paid
About Us:
We are a fast-growing startup building AI-powered solutions for vision and reasoning tasks. Our focus is on advancing computer vision systems that can understand, interpret, and reason about images and videos.
Role Overview:
We are seeking a motivated AI Intern with strong computer vision fundamentals and hands-on experience in deep learning for visual tasks. You will collaborate with our AI/ML engineers to design, prototype, and evaluate models that combine CV with visual reasoning capabilities.
Selected intern's day-to-day responsibilities include:
1. Research and prototype CV models for detection, segmentation, classification, and reasoning.
2. Work on tasks like object detection, OCR, scene understanding, and multi-modal reasoning.
3. Collect, preprocess, and augment image/video datasets.
4. Implement and experiment with deep learning architectures (CNNs, Vision Transformers, multi-modal models).
5. Evaluate model performance with appropriate metrics and improve robustness.
6. Document findings and contribute insights to the team.
Requirements:
1. Proficiency in Python and in PyTorch or TensorFlow.
2. Solid understanding of computer vision techniques and deep learning models.
3. Experience with data preprocessing, image augmentation, and dataset handling.
4. Familiarity with visual reasoning benchmarks or multi-modal AI (e.g., CLIP, BLIP, LLaVA).
Preferred:
1. Prior internship or project experience in computer vision or visual reasoning.
2. Hands-on work with Vision Transformers, detection/segmentation pipelines, or OCR systems.
3. Experience integrating CV models with LLMs for reasoning tasks.
4. Contributions to open-source AI/CV projects.
What You'll Gain:
1. Practical experience building state-of-the-art vision and reasoning models.
2. Exposure to multi-modal AI research and applications.
3. Mentorship from senior AI/ML engineers.
4. Opportunity to transition into a full-time role.
Skill(s) required
Computer Vision
Reinforcement Learning
Self-learning
Transformers
Who can apply
Only those candidates can apply who:
1. are available for a full-time (in-office) internship
2. can start the internship between 15th Sep'25 and 20th Oct'25
3. are available for a duration of 6 months
4. have relevant skills and interests
Perks
Certificate
Letter of recommendation
Informal dress code
Number of openings
1
About Elementals.ai
ElementalsAI is a deep-tech, AI-powered product company that helps entrepreneurs, startups, SMBs, and enterprises build, customize, and scale software products faster and with less effort through a unified platform-plus-services approach.
We act as an extended product and engineering partner, combining:
1. A flexible self-serve platform with AI agentic workflows to let teams build and automate with ease.
2. High-velocity engineering teams for those who want turnkey, full-service product delivery.
3. Product and design expertise, plus internal AI tools, to adapt solutions to real-world workflows.
Activity on Internshala
Hiring since December 2020