    Sensor-Based Reward Learning from Video Labels for Tumble Motion Control in a Household Dryer

    To appear at 2026 IEEE 22nd International Conference on Automation Science and Engineering (CASE), August 2026

    Jinwoo Lee, Chanseok Kang, Guntae Bae

    LG Electronics AI Lab

    TL;DR

    This work transfers human video judgments of laundry motion into a deployable sensor-only reward model, then uses reinforcement learning to improve dryer drum-speed control without requiring video at runtime.

    Learning-based reward model pipeline: visual supervision is collected during development, distilled into a sensor-only reward model, and used for reinforcement learning.

    Overview

    Problem

    Efficient drying depends on maintaining cataracting laundry motion, in which the load is lifted by the drum and then falls through it, but this internal tumble state is not directly observable from production sensors such as motor current and drum-speed signals.

    Approach

    The method collects synchronized sensor trajectories and internal video during development, uses human video labels or preferences to train a sensor-only reward model, and freezes that model during SAC-based policy learning.

    Outcome

    Across five load compositions, the learned controller improved normalized moisture-removal performance by 2.04% on average and 2.86% in the best case compared with an expert-designed baseline.

    Motion Labels

    Bad Motion: Wall-Following

    Laundry mass follows the drum wall instead of tumbling.

    The laundry remains attached to the drum wall, reducing effective surface exposure to airflow.

    Bad Motion: Rolling

    Laundry mass rolls near the bottom of the drum.

    The load rolls near the bottom of the drum rather than entering the desired falling motion.

    Good Motion: Tumbling

    Desirable tumbling motion inside the dryer drum.

    The target motion exposes laundry surfaces more evenly to heated airflow and supports drying efficiency.

    Method

    Stage 1: Collect

    Synchronized onboard sensor sequences and internal drum video are gathered on a real dryer.

    Stage 2: Learn Reward

    Video labels or pairwise preferences supervise an LSTM-based reward model that consumes sensor histories only.

    Stage 3: Train Control

    The frozen reward model supplies rewards to a SAC controller that adjusts drum speed from onboard sensors.

    1. Sweep the drum speed across the motor's operating range while recording synchronized sensor and video data.
    2. Annotate video clips as desirable or undesirable tumble motion, or compare clip pairs by relative motion quality.
    3. Align annotations with sensor windows and train a sensor-only reward model.
    4. Use the learned reward to train an incremental drum-speed controller with Soft Actor-Critic.
    5. Deploy the final controller without video input; only production sensor streams are required.
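
Step 3 in the list above can be pictured as a timestamp join between annotated clips and sliding sensor windows. The following is a minimal sketch, assuming clips and sensor samples share one clock. The `Clip` structure, the `label_windows` helper, the window length, and the rule that a window must lie entirely inside one clip are illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Clip:
    """A labeled video clip (hypothetical structure)."""
    start_s: float  # clip start time in seconds
    end_s: float    # clip end time in seconds
    label: int      # 1 = desirable tumbling, 0 = undesirable motion


def label_windows(
    sample_times: List[float],
    clips: List[Clip],
    window: int,
    stride: int,
) -> List[Tuple[int, int, int]]:
    """Return (start_index, end_index, label) for each sensor window
    that lies entirely inside one labeled clip.

    Windows that straddle a clip boundary or fall outside every clip
    are skipped, so ambiguous segments never receive a label.
    """
    out = []
    for i in range(0, len(sample_times) - window + 1, stride):
        t0 = sample_times[i]
        t1 = sample_times[i + window - 1]
        for clip in clips:
            if clip.start_s <= t0 and t1 <= clip.end_s:
                out.append((i, i + window, clip.label))
                break
    return out


# Example: a 10 Hz sensor log lasting 4 s and two annotated clips.
times = [k / 10 for k in range(40)]
clips = [Clip(0.0, 1.5, 0), Clip(2.0, 4.0, 1)]
windows = label_windows(times, clips, window=10, stride=5)
```

Skipping windows that cross a clip boundary trades some data for cleaner supervision; a real pipeline might instead assign the majority label or a soft label.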

    Motion Supervision

    The key design choice is to use video only as development-time supervision. Human judgments of tumble quality are distilled into a reward model that maps sensor histories to a scalar motion-quality score, so the learned controller remains compatible with production sensing constraints.
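
To make "maps sensor histories to a scalar" concrete, here is a dependency-free sketch of an LSTM forward pass followed by a linear readout. The class name `TinyLSTMReward`, the hidden size, the random untrained weights, the two-feature input, and the sigmoid readout are all assumptions chosen for illustration; the page does not describe the actual architecture beyond "LSTM-based".

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMReward:
    """Single-layer LSTM with a linear readout (illustrative, untrained)."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        n_cat = n_in + n_hidden  # each gate sees [input, previous hidden state]
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n_cat)]
                      for _ in range(n_hidden)] for g in "ifgo"}
        self.b = {g: [0.0] * n_hidden for g in "ifgo"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_out = 0.0

    def _gate(self, g: str, z: List[float]) -> List[float]:
        """Affine pre-activation of gate g for the concatenated vector z."""
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.W[g], self.b[g])]

    def reward(self, history: List[List[float]]) -> float:
        """Map a sensor history (time steps x features) to a score in (0, 1)."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in history:
            z = list(x) + h
            i = [sigmoid(v) for v in self._gate("i", z)]
            f = [sigmoid(v) for v in self._gate("f", z)]
            g = [math.tanh(v) for v in self._gate("g", z)]
            o = [sigmoid(v) for v in self._gate("o", z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        logit = sum(w * v for w, v in zip(self.w_out, h)) + self.b_out
        return sigmoid(logit)


# Two normalized features per step, e.g. motor current and drum speed.
model = TinyLSTMReward(n_in=2, n_hidden=4)
history = [[0.3, 0.5], [0.4, 0.5], [0.6, 0.55]]
score = model.reward(history)
```

Only the final hidden state feeds the readout here, so the score summarizes the whole window; the input never includes video, which is the property the section describes.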

    5 evaluated load compositions

    2.04% average relative gain

    2.86% best-case gain

    Development Setup

    Dryer instrumentation setup for synchronized sensor and video collection.

    Household dryer setup used during reward learning and controller evaluation.

    Real household dryer setup used for collecting synchronized sensor streams, observing internal tumble motion, and evaluating the learned controller.

    Results

    Moisture-Removal Performance

    The proposed controller improved the normalized moisture-removal metric on all five evaluated loads. The mean metric rose from 0.6129 to 0.6261, and the per-load relative improvements averaged 2.04%.
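
The two summary figures are computed at different points: one compares the averaged metrics, the other averages the per-load relative gains. The sketch below shows both calculations. The helper names are hypothetical, and the per-load values in the toy example are made up solely to show that the two summaries can differ; they are not the paper's data.

```python
from typing import Sequence


def relative_gain_of_means(baseline_mean: float, proposed_mean: float) -> float:
    """Relative change between two already-averaged metrics, in percent."""
    return 100.0 * (proposed_mean - baseline_mean) / baseline_mean


def mean_per_load_gain(baseline: Sequence[float], proposed: Sequence[float]) -> float:
    """Average of the per-load relative gains, in percent."""
    gains = [100.0 * (p - b) / b for b, p in zip(baseline, proposed)]
    return sum(gains) / len(gains)


# Reported mean metrics from the text above.
ratio_of_means = relative_gain_of_means(0.6129, 0.6261)  # about 2.15

# Made-up per-load values, NOT from the paper.
toy_baseline = [0.50, 0.80]
toy_proposed = [0.52, 0.80]
toy_mean_of_ratios = mean_per_load_gain(toy_baseline, toy_proposed)  # about 2.0
toy_ratio_of_means = relative_gain_of_means(
    sum(toy_baseline) / len(toy_baseline),
    sum(toy_proposed) / len(toy_proposed),
)  # about 1.54
```

Applied to the reported means, the ratio of means is about 2.15%, slightly different from the 2.04% mean per-load figure; the two need not agree because each load's gain is normalized by its own baseline before averaging.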

    Load Coverage

    The experiments cover 3 kg and 5 kg mixed-fabric loads, different cotton-to-polyester ratios, and a towel-heavy 3 kg load.

    Preference Reward Check

    A preference-based reward model was also validated on the towel-heavy load. Its outputs followed the overall trend of the binary motion labels, and performance on that load improved by 2.10%.
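
The page does not state which loss trains the preference-based reward model. A common choice for pairwise comparisons is the Bradley-Terry model, in which the probability that segment A is preferred over segment B is the sigmoid of the reward difference. The sketch below assumes that formulation; the function names are hypothetical.

```python
import math


def preference_probability(r_a: float, r_b: float) -> float:
    """Bradley-Terry probability that segment A is preferred over segment B."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))


def preference_loss(r_a: float, r_b: float, a_preferred: bool) -> float:
    """Negative log-likelihood of one human comparison."""
    p = preference_probability(r_a, r_b)
    return -math.log(p if a_preferred else 1.0 - p)


# The loss is small when the reward ordering agrees with the annotator
# and large when it disagrees; equal rewards give log(2).
agree = preference_loss(r_a=2.0, r_b=0.0, a_preferred=True)
disagree = preference_loss(r_a=2.0, r_b=0.0, a_preferred=False)
tie = preference_loss(r_a=1.0, r_b=1.0, a_preferred=True)
```

Because only the difference of the two rewards enters the loss, a model trained this way is determined up to an additive constant, which is one reason its outputs are compared with binary labels by trend rather than by absolute value.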

    Deployment Constraint

    Video is used only for annotation during development. During RL training and deployment, both the reward model and controller use onboard sensor sequences only.
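
The sensor-only constraint can be illustrated with a small rollout loop in which both the reward and the action depend on the sensor history alone. This is a sketch under stated assumptions: the speed limits, the per-step change limit, the history length, and all function names are hypothetical, and the policy and reward model are simple stand-ins rather than a trained SAC actor or the learned LSTM.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

RPM_MIN, RPM_MAX = 30.0, 60.0  # hypothetical speed limits
MAX_STEP_RPM = 2.0             # hypothetical per-step change limit

History = Sequence[Sequence[float]]


def apply_increment(current_rpm: float, action: float) -> float:
    """Turn a policy action in [-1, 1] into a bounded speed change."""
    action = max(-1.0, min(1.0, action))
    target = current_rpm + action * MAX_STEP_RPM
    return max(RPM_MIN, min(RPM_MAX, target))


def run_episode(
    policy: Callable[[History], float],
    reward_model: Callable[[History], float],
    read_sensors: Callable[[float], List[float]],
    start_rpm: float,
    steps: int,
    history_len: int = 5,
) -> List[float]:
    """Roll out a controller using sensor histories only; no video is read."""
    history: Deque[List[float]] = deque(maxlen=history_len)
    rpm = start_rpm
    rewards = []
    for _ in range(steps):
        history.append(read_sensors(rpm))           # observe onboard sensors
        rewards.append(reward_model(list(history)))  # frozen reward model
        rpm = apply_increment(rpm, policy(list(history)))
    return rewards


# Stand-ins for the real components, for illustration only.
def fake_sensors(rpm: float) -> List[float]:
    return [rpm / RPM_MAX, 0.1]  # [normalized speed, normalized current]


def fake_reward(hist: History) -> float:
    return 1.0 - abs(hist[-1][0] - 0.75)  # peaks at 45 rpm


def always_faster(hist: History) -> float:
    return 1.0


rewards = run_episode(always_faster, fake_reward, fake_sensors,
                      start_rpm=40.0, steps=3)
```

Expressing the action as an increment rather than an absolute speed keeps each control step small and bounded, which matches the "incremental drum-speed controller" wording in the method list.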

    Videos

    Supplementary demonstration of dryer tumble-motion control.

    Citation

    @misc{lee2026sensorRewardDryer,
      title  = {Sensor-Based Reward Learning from Video Labels for Tumble Motion Control in a Household Dryer},
      author = {Lee, Jinwoo and Kang, Chanseok and Bae, Guntae},
      note   = {To appear at 2026 IEEE 22nd International Conference on Automation Science and Engineering (CASE)},
      year   = {2026}
    }