
Pepper: Multi-Modal Personal Assistant Robot

Timeline: Dec 2020 – May 2021
Role: Robotics engineer
Team: AI & robotics student cohort
Focus: Autonomous navigation and human interaction
Tags: Python · PyTorch · CNN · SLAM · Object Recognition · DL · Multi-agent system

Pepper integrates SLAM, speech recognition, and deep learning-based object manipulation into a single platform designed for home assistant scenarios that demand reliable navigation and intuitive human-robot interaction.

Technology Stack

  • SLAM via ROS gmapping paired with Microsoft Kinect depth sensing.
  • Deep learning perception with CNN classifiers for 1000 object categories (see the classifier sketch after this list).
  • Voice interaction using MFCC-based speaker identification and recognition pipelines (see the speaker-ID sketch after this list).
  • Custom three-layer chassis designed in AutoCAD and fabricated with acrylic for modularity.
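
The 1000-category figure lines up with the ImageNet label set, so an ImageNet-pretrained backbone is a reasonable stand-in for the perception model. The sketch below assumes torchvision's ResNet-18 and standard ImageNet preprocessing rather than the project's exact classifier.

    # Hypothetical sketch: top-1 classification of one RGB frame with an
    # ImageNet-pretrained CNN (1000 classes). Model and preprocessing are assumptions.
    import torch
    from torchvision import models, transforms
    from PIL import Image

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    model.eval()

    def classify(image_path):
        """Return the top-1 class index and its confidence for one image."""
        image = Image.open(image_path).convert("RGB")
        batch = preprocess(image).unsqueeze(0)          # shape: (1, 3, 224, 224)
        with torch.no_grad():
            probs = torch.softmax(model(batch), dim=1)  # 1000-way distribution
        conf, idx = probs.max(dim=1)
        return idx.item(), conf.item()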

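The voice interface can be approximated by a minimal speaker-identification sketch: enroll each user as the mean MFCC vector of a reference clip, then match a new utterance to the nearest enrolled profile. The library choice (librosa) and the nearest-centroid rule are assumptions, not the project's actual pipeline.

    # Hypothetical sketch of MFCC-based speaker identification.
    import numpy as np
    import librosa

    def mfcc_profile(wav_path, n_mfcc=13):
        """Average MFCC vector for one utterance."""
        signal, sr = librosa.load(wav_path, sr=16000)
        mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
        return mfcc.mean(axis=1)

    def identify(wav_path, enrolled):
        """Return the enrolled speaker whose profile is closest to the query clip."""
        query = mfcc_profile(wav_path)
        return min(enrolled, key=lambda name: np.linalg.norm(enrolled[name] - query))

    # Usage with hypothetical file names:
    # enrolled = {"alice": mfcc_profile("alice_ref.wav"), "bob": mfcc_profile("bob_ref.wav")}
    # print(identify("command.wav", enrolled))
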
Key Features

  • Autonomous indoor navigation that fuses wheel-encoder and IMU data to keep localization drift low (see the fusion sketch after this list).
  • Personalized speech interface capable of identifying speakers and responding in real time.
  • Object recognition and depth-guided grasping integrated with a 3-DoF manipulator.
  • Modular design enabling rapid upgrades to sensors, actuators, and compute payloads.
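
A minimal sketch of the encoder/IMU fusion idea, assuming a differential-drive base and a simple complementary filter on heading; the gain and wheel geometry below are placeholders, not the values used on the robot.

    # Hypothetical complementary filter: integrate the gyro for short-term accuracy and
    # lean on wheel-encoder odometry to bound long-term heading drift.
    import math

    WHEEL_BASE = 0.30   # metres between drive wheels (assumed)
    ALPHA = 0.98        # weight on the gyro-integrated heading

    class HeadingFilter:
        def __init__(self):
            self.theta = 0.0  # fused heading in radians

        def update(self, gyro_z, d_left, d_right, dt):
            """gyro_z: yaw rate (rad/s); d_left, d_right: wheel travel (m) since last call."""
            gyro_theta = self.theta + gyro_z * dt                       # integrate gyro
            enc_theta = self.theta + (d_right - d_left) / WHEEL_BASE    # encoder odometry
            self.theta = ALPHA * gyro_theta + (1 - ALPHA) * enc_theta
            self.theta = math.atan2(math.sin(self.theta), math.cos(self.theta))  # wrap angle
            return self.theta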

Robot Design

The three-tier chassis houses power electronics, motor drivers, a 3-DoF manipulator, and the onboard compute stack. Pepper employs high-torque drive motors with a caster wheel for stability, while a top-mounted Kinect sensor delivers RGB-D data for mapping and manipulation tasks.
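
Depth-guided manipulation typically starts by back-projecting a Kinect depth pixel into a 3D point in the camera frame before handing the target to the manipulator. The sketch below assumes the pinhole model with nominal, uncalibrated Kinect v1 intrinsics rather than the project's calibration.

    # Hypothetical back-projection of a depth pixel to a camera-frame 3D point.
    FX, FY = 594.2, 591.0   # focal lengths in pixels (nominal Kinect v1 values, assumed)
    CX, CY = 339.3, 242.7   # principal point (assumed)

    def depth_pixel_to_point(u, v, depth_m):
        """Pixel (u, v) with depth in metres -> (x, y, z) in the camera frame."""
        x = (u - CX) * depth_m / FX
        y = (v - CY) * depth_m / FY
        return x, y, depth_m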

Autonomous Navigation

Human Interaction

Outcome

The project demonstrates a cohesive assistant robot capable of mapping its environment, understanding verbal commands, and manipulating everyday objects, underscoring how accessible hardware can support rich personal robotics experiences.