| Representative Publications: [1] Shancheng Fang, Zhendong Mao, Hongtao Xie, Yuxin Wang, Chenggang Yan, Yongdong Zhang. ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023. [2] Shancheng Fang, Hongtao Xie, Yuxin Wang, Zhendong Mao, Yongdong Zhang. Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021. [3] Yadong Qu, Shancheng Fang*, Yuxin Wang, Xiaorui Wang, Zhineng Chen, Hongtao Xie, Yongdong Zhang. IGD: Instructional Graphic Design with Multimodal Layer Generation. Proceedings of the IEEE/CVF International Conference on Computer Vision 2025. [4] Tianhao Qi, Jianlong Yuan, Wanquan Feng, Shancheng Fang*, Jiawei Liu, SiYu Zhou, Qian He, Hongtao Xie, Yongdong Zhang. Mask2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025. [5] Yaqi Cai, Shancheng Fang*, Yadong Qu, Xiaorui Wang, Meng Shao, Hongtao Xie. IterMeme: Expert-Guided Multimodal LLM for Interactive Meme Creation with Layout-Aware Generation. International Joint Conference on Artificial Intelligence 2025. [6] Fengyi Fu, Shancheng Fang*, Weidong Chen, Yan Song, Zhendong Mao, Yongdong Zhang. Sentiment-oriented transformer-based variational autoencoder network for live video commenting. ACM Transactions on Multimedia Computing, Communications, and Applications 2023. [7] Jingjing Zhang, Shancheng Fang*, Zhendong Mao, Zhiwei Zhang, Yongdong Zhang. Fine-tuning with Multi-modal Entity Prompts for News Image Captioning. Proceedings of the ACM International Conference on Multimedia 2022. [8] Yuxin Wang, Hongtao Xie, Shancheng Fang*, Jing Wang, Shenggao Zhu, Yongdong Zhang. From two to one: A new scene text recognizer with visual language modeling network. Proceedings of the IEEE/CVF International Conference on Computer Vision 2021. [9] Jianjun Chen, Shancheng Fang*, Hongtao Xie, Zheng-Jun Zha, Yue Hu, Jianlong Tan. End-to-end Boundary Exploration for Weakly-supervised Semantic Segmentation. Proceedings of the ACM International Conference on Multimedia 2021. [10] Shancheng Fang, Hongtao Xie, Jianjun Chen, Jianlong Tan, Yongdong Zhang. Learning to Draw Text in Natural Images with Conditional Adversarial Networks. International Joint Conference on Artificial Intelligence 2019. [11] Shancheng Fang, Hongtao Xie, Zheng-Jun Zha, Nannan Sun, Jianlong Tan, Yongdong Zhang. Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling. Proceedings of the ACM International Conference on Multimedia 2018. [12] Hongtao Xie, Shancheng Fang*, Zheng-Jun Zha, Yating Yang, Yan Li, Yongdong Zhang. Convolutional Attention Networks for Scene Text Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 2019. [13] Shancheng Fang, Hongtao Xie, Zhineng Chen, Yizhi Liu, Yan Li. Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis. Neuroinformatics 2018. [14] Shancheng Fang, Hongtao Xie, Zhineng Chen, Shiai Zhu, Xiaoyan Gu, Xingyu Gao. Detecting Uyghur text in complex background images with convolutional neural network. Multimedia Tools and Applications 2017. |