To appear at 2026 IEEE 22nd International Conference on Automation Science and Engineering (CASE), August 2026
LG Electronics AI Lab
TL;DR
This work transfers human video judgments of laundry motion into a deployable sensor-only reward model, then uses reinforcement learning to improve dryer drum-speed control without requiring video at runtime.
Overview
Problem
Efficient drying depends on maintaining cataracting laundry motion, but the internal tumble state is not directly observable from production sensors such as motor current and drum-speed signals.
Approach
The method collects synchronized sensor trajectories and internal video during development, uses human video labels or preferences to train a sensor-only reward model, and freezes that model during policy learning with Soft Actor-Critic (SAC).
Outcome
Across five load compositions, the learned controller improved normalized moisture-removal performance by 2.04% on average and 2.86% in the best case compared with an expert-designed baseline.
Motion Labels
Bad Motion: Wall-Following
The laundry remains attached to the drum wall, reducing effective surface exposure to airflow.
Bad Motion: Rolling
The load rolls near the bottom of the drum rather than entering the desired falling motion.
Good Motion: Tumbling
The target motion exposes laundry surfaces more evenly to heated airflow and supports drying efficiency.
Method
Synchronized onboard sensor sequences and internal drum video are gathered on a real dryer.
→
Video labels or pairwise preferences supervise an LSTM-based reward model that consumes sensor histories only.
→
The frozen reward model supplies rewards to a SAC controller that adjusts drum speed from onboard sensors.
- Sweep drum speed over the operating motor range while recording synchronized sensor and video data.
- Annotate video clips as desirable or undesirable tumble motion, or compare clip pairs by relative motion quality.
- Align annotations with sensor windows and train a sensor-only reward model.
- Use the learned reward to train an incremental drum-speed controller with Soft Actor-Critic.
- Deploy the final controller without video input; only production sensor streams are required.
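The alignment step in the list above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the integer millisecond timestamps, and the rule of labeling each sensor window by the video clip that covers its final sample are all assumptions.

```python
def label_at(clips, t_ms):
    """Return the label of the annotated clip covering time t_ms, or None."""
    for start_ms, end_ms, label in clips:
        if start_ms <= t_ms < end_ms:
            return label
    return None


def build_dataset(samples, clips, window_len, stride):
    """Slice sensor samples into fixed-length windows and attach video labels.

    samples: list of (t_ms, motor_current, drum_speed), sorted by time.
    clips:   list of (start_ms, end_ms, label), label 1 = desirable motion.
    Windows whose final sample falls outside every labeled clip are dropped.
    """
    dataset = []
    for end in range(window_len, len(samples) + 1, stride):
        window = samples[end - window_len:end]
        label = label_at(clips, window[-1][0])
        if label is not None:
            # Keep sensor channels only; timestamps and video never enter the features.
            features = [(current, speed) for _, current, speed in window]
            dataset.append((features, label))
    return dataset


# Toy data (hypothetical values): 10 s of samples at 10 Hz, two annotated clips.
samples = [(100 * i, 1.0 + 0.01 * i, 50.0) for i in range(100)]
clips = [(0, 4000, 1), (6000, 10000, 0)]
dataset = build_dataset(samples, clips, window_len=20, stride=10)
```

Labeling by the window's final sample keeps each training example causal: the features contain only sensor readings that would already be available when the controller acts.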
Motion Supervision
The key design is to use video only as development-time supervision. Human-readable tumble quality is distilled into a reward model that maps sensor histories to scalar motion quality, so the learned controller remains compatible with production sensing constraints.
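One way to read this distillation for binary labels: the reward model emits a scalar score per sensor history, and a logistic loss pushes the score up for windows labeled desirable and down otherwise. The sketch below assumes that loss. The source does not state the training objective, and `binary_label_loss` and `motion_quality` are hypothetical names.

```python
import math


def binary_label_loss(score, label):
    """Numerically stable binary cross-entropy on a raw score (logit).

    score: scalar output of the reward model for one sensor history.
    label: 1 for desirable tumble motion, 0 for undesirable (from video).
    """
    return max(score, 0.0) - score * label + math.log1p(math.exp(-abs(score)))


def motion_quality(score):
    """Map the raw score to (0, 1) with a sigmoid, avoiding overflow."""
    if score >= 0.0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# A confident, correct score is penalized far less than a confident, wrong one.
loss_correct = binary_label_loss(3.0, 1)
loss_wrong = binary_label_loss(3.0, 0)
```

Only the labels come from video. The score itself is computed from sensor histories, which is what lets the trained model run without a camera.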
5 evaluated load compositions
2.04% average relative gain
2.86% best-case gain
Development Setup
Results
Moisture-Removal Performance
The proposed controller improved the normalized moisture-removal metric on all five evaluated loads. The mean metric rose from 0.6129 to 0.6261, and the relative improvement, averaged per load, was 2.04%.
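The two aggregates here are not the same quantity. The relative gain computed from the two reported means is about 2.15%, while 2.04% is the average of the per-load gains. The short check below reproduces the first figure from the reported means and uses toy numbers (not from the paper) to show why the two can differ. The helper names are hypothetical.

```python
def relative_gain_pct(baseline, proposed):
    """Relative improvement of proposed over baseline, in percent."""
    return (proposed - baseline) / baseline * 100.0


def mean_of_per_load_gains(baselines, proposals):
    """Average the relative gain computed separately for each load."""
    gains = [relative_gain_pct(b, p) for b, p in zip(baselines, proposals)]
    return sum(gains) / len(gains)


def gain_of_means(baselines, proposals):
    """Relative gain between the two across-load means."""
    mean_b = sum(baselines) / len(baselines)
    mean_p = sum(proposals) / len(proposals)
    return relative_gain_pct(mean_b, mean_p)


# From the reported means only: roughly 2.15%.
reported = relative_gain_pct(0.6129, 0.6261)

# Toy values, NOT experimental data: the aggregates disagree whenever
# gains are unevenly spread across loads with different baselines.
toy_baselines = [1.0, 2.0]
toy_proposals = [1.1, 2.0]
toy_per_load = mean_of_per_load_gains(toy_baselines, toy_proposals)
toy_of_means = gain_of_means(toy_baselines, toy_proposals)
```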
Load Coverage
The experiments cover 3 kg and 5 kg mixed-fabric loads, different cotton-to-polyester ratios, and a towel-heavy 3 kg load.
Preference Reward Check
A preference-based reward model was also validated on the towel-heavy load. It followed the overall trend of binary motion labels and improved performance by 2.10%.
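For pairwise comparisons, a common choice is the Bradley-Terry model, in which the probability that clip A is preferred over clip B is the sigmoid of the difference in their predicted rewards. The source does not say which preference objective was used, so the sketch below is an assumption, and `preference_loss` is a hypothetical name.

```python
import math


def preference_loss(reward_preferred, reward_other):
    """Bradley-Terry negative log-likelihood for one labeled pair.

    Both rewards are scalar outputs of the sensor-only reward model for the
    sensor windows aligned with the two compared video clips.
    Computes -log(sigmoid(diff)) in a numerically stable softplus form.
    """
    diff = reward_preferred - reward_other
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# The loss shrinks as the model ranks the preferred clip higher.
loss_agree = preference_loss(2.0, 0.0)
loss_tie = preference_loss(0.0, 0.0)
loss_disagree = preference_loss(0.0, 2.0)
```

Because only reward differences enter the loss, a model trained this way is defined up to an additive offset. That would be consistent with checking that it follows the trend of the binary labels rather than matching their values.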
Deployment Constraint
Video is used only for annotation during development. During RL training and deployment, both the reward model and controller use onboard sensor sequences only.
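The constraint can be made concrete with a sketch of one data-collection loop. Everything here is illustrative: the function names, the speed limits, and the toy policy, reward, and sensor stand-ins are assumptions, and the SAC update itself is omitted. What the sketch shows is that the policy and the frozen reward model both read the same sensor history, and that actions are clipped speed increments.

```python
from collections import deque


def clip(value, low, high):
    return max(low, min(high, value))


def rollout(policy, reward_model, read_sensors, steps, history_len,
            speed, speed_min, speed_max, max_delta):
    """Collect (obs, action, reward, next_obs) transitions from sensors only."""
    history = deque(maxlen=history_len)
    history.append(read_sensors(speed))
    transitions = []
    for _ in range(steps):
        obs = list(history)
        # Incremental action: a bounded change in drum speed.
        delta = clip(policy(obs), -max_delta, max_delta)
        speed = clip(speed + delta, speed_min, speed_max)
        history.append(read_sensors(speed))
        next_obs = list(history)
        # Frozen reward model: evaluated here, never updated in this loop.
        reward = reward_model(next_obs)
        transitions.append((obs, delta, reward, next_obs))
    return speed, transitions


# Toy stand-ins (hypothetical): sensors return (motor_current, drum_speed).
read_sensors = lambda speed: (0.02 * speed, speed)
policy = lambda obs: 1.0 if obs[-1][1] < 50.0 else -1.0
reward_model = lambda obs: -abs(obs[-1][1] - 50.0)

final_speed, transitions = rollout(
    policy, reward_model, read_sensors, steps=10, history_len=5,
    speed=45.0, speed_min=40.0, speed_max=60.0, max_delta=2.0)
```

In an actual setup the collected transitions would feed an off-policy learner such as SAC, and at deployment only the `policy` and `read_sensors` roles remain.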
Videos
Supplementary demonstration of dryer tumble-motion control.
Citation
@misc{lee2026sensorRewardDryer,
title = {Sensor-Based Reward Learning from Video Labels for Tumble Motion Control in a Household Dryer},
author = {Lee, Jinwoo and Kang, Chanseok and Bae, Guntae},
note = {To appear at 2026 IEEE 22nd International Conference on Automation Science and Engineering (CASE)},
year = {2026}
}




