CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation
Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng
In: Proceedings of the ACM International Conference on Multimedia, 2025.
ACMMM 2025. CCF-A.
RecipeGen: A Step-Aligned Multimodal Benchmark for Real-World Recipe Generation
Ruoxuan Zhang, Jidong Gao, Bin Wen, Hongxia Xie, Chenming Zhang, Hong-Han Shuai, Wen-Huang Cheng
In: Proceedings of the ACM International Conference on Multimedia, 2025.
ACMMM 2025. CCF-A.
EmoArt: A Multidimensional Dataset for Emotion-Aware Artistic Generation
Cheng Zhang, Hongxia Xie, Bin Wen, Songhan Zuo, Ruoxuan Zhang, Wen-Huang Cheng
In: Proceedings of the ACM International Conference on Multimedia, 2025.
ACMMM 2025. CCF-A.
MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents
Ruoxuan Zhang, Qiyun Zheng, Zhiyu Zhou, Ziqi Liao, Siyu Wu, Jian-Yu Jiang-Lin, Bin Wen, Hongxia Xie, Jianlong Fu, Wen-Huang Cheng
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026.
CVPR 2026. CCF-A.