CS Colloquium - Self-supervised text generation metrics: Teaching computers to judge how good it writes without an English degree


Forrest Sheng Bao


The Transformer architecture, introduced by Google in 2017, has triggered a boom in text generation (natural language generation, NLG), including summarization, simplification, and translation. A consequent problem is how to judge the quality of generated text or of a generator (summarizer, translator, etc.). The conventional approach is to measure the lexical or token-level overlap between the generated text and a parallel reference text prepared by humans. However, this approach does not scale because preparing the references requires non-trivial human work. In this talk, Dr. Bao will present a couple of approaches that his group has taken to tackle this issue without human-written references, by using self-supervised machine learning from self-augmented training data.


Dr. Forrest Sheng Bao is an Assistant Professor in the Department of Computer Science at Iowa State University. His current research focuses on artificial intelligence (AI) and Electronic Design Automation (EDA). In AI, he works on natural language processing (NLP), in particular natural language generation (NLG) metrics. In EDA, he works on using machine learning (ML) to solve circuit routing problems as well as on hardware description languages (HDLs). Previously, he worked on medical imaging (MRI) and medical signal processing (EEG), where his work was reported by MIT Technology Review and Lancet Neurology. His research has been funded by the National Science Foundation (NSF), the Federal Aviation Administration (FAA), the Air Force Research Lab (AFRL), and Microsoft.

Individuals with disabilities are encouraged to attend all University of Iowa–sponsored events. If you are a person with a disability who requires a reasonable accommodation in order to participate in this program, please contact the Computer Science Dept. in advance at (319) 335-0713 or matthieu-biger@uiowa.edu.