Naoki Shitanda

My research lies at the intersection of robotics and machine learning, with a focus on reinforcement learning. I study how to learn policies that work in real-world robotic systems and how to improve data efficiency by leveraging policy ensembles in large-scale training settings. Through hobby projects and part-time work, I also work on research and development for deploying robotic automation and computer vision technologies in real-world applications.

Rethinking Policy Diversity in Ensemble Policy Gradientin Large-Scale Reinforcement Learning

ICLR 2026

Naoki Shitanda, Motoki Omura, Tatsuya Harada, Takayuki Osa

#Reinforcement Learning  #Large-scale RL  #Robot Learning

Summary figure for the ICLR 2026 paper

In large-scale RL with tens of thousands of parallel environments, the learning dynamics differ from traditional RL settings. Recent work has therefore explored learning multiple policies together to collect diverse data. In this work, we investigate how the diversity between these policies should be structured and controlled.

Through both theoretical and empirical analyses, we show that policy diversity is not always beneficial: excessive divergence between policies can harm training stability and sample efficiency. To address this issue, we propose Coupled Policy Optimization, which regulates inter-KL divergence to encourage follower policies to perform efficient exploration around the leader policy. Our method achieves significantly improved final performance and sample efficiency on challenging tasks such as dexterous manipulation, outperforming strong baselines including SAPG, PBT, and PPO.

Katakko: Embodiment of Modular Robotsthrough Automatic Motion Mapping

SIGGRAPH 2026
Emerging Technologies

*Sotaro Yokoi, *Naoki Shitanda, *Kento Matsuo, Takuji Narumi
(Hobby activity supported by IPA MITOH IT Program)

#Modular Robot  #Motion Mapping  #Visual Recognition

Katakko modular robot system

Katakko is a modular robot system for creating and embodying personalized social agents without requiring engineering ability for users. It introduces functional modules and an automatic framework that maps human motion to arbitrarily assembled robot structures. The system detects robot configurations via pictures and generates joint-to-joint mappings based on structural alignment, with support for incremental updates to preserve control consistency.

I worked on automatic motion mapping generation and robot structure recognition.

Katakko assembly, configuration, and operation pipeline

Gaussians in the City: Enhancing 3D Scene Reconstructionunder distractors with Text-guided Segmentation and Inpainting

SIGGRAPH Asia 2024
Poster Track

Naoki Shitanda, Jun Rekimoto
(Work as a research assistant @Sony CSL)

#3D Scene Reconstruction  #3D Gaussian Splatting  #Practical AI

Gaussians in the City method overview

In 3D scene reconstruction using NeRF and 3D Gaussian Splatting, real-world data often contains both dynamic and static distractors, such as pedestrians and vehicles, which degrade reconstruction quality. Existing approaches typically assume clean scenes or rely on imperfect masking strategies, leading to artifacts especially in heavily occluded regions.

In this work, we investigate how to robustly reconstruct scenes under such challenging conditions. We show that simply masking distractors is insufficient, as it introduces defects in regions with missing observations. To address this, we propose a method that combines text-guided segmentation with selective inpainting. Our approach identifies heavily masked regions and applies 3D-consistent inpainting only where necessary, preserving overall consistency while reducing artifacts.

Food Manipulation Competition

Robosoft 2023
Competition
(ICRA2023 ViTac WS)

Saekwang Nam, Longyi Lee, Max Yang, Lihaoya Tan, Naoki Shitanda
(Team FANTASTIC)

#Robot System  #Manipulation  #Food Manipulation

In a robot competition involving pick-and-place tasks for various food items, I worked on instance recognition of the food and grasp pose estimation based on depth information.