Sushant Kafle

get.skafle @ google.com
Software Engineer
Google
Photo of Sushant Kafle.

About Sushant

Sushant is a Software Engineer at Google, where he tackles challenges in the fields of Natural Language Understanding and Information Retrieval. He earned his Ph.D. at the Golisano College of Computing and Information Sciences at the Rochester Institute of Technology, where he specialized in accessibility for people with disabilities, human-computer interaction, and computational linguistics.

Research Interests
In general, I am interested in understanding the processes of acquisition and distillation of natural language (and peripheral linguistic signals), with the goal of enhancing human-to-human and human-to-machine interaction. Prior to joining Google, my research aimed to inform the design and evaluation of automatic speech recognition technology for use in captioning for people who are deaf or hard of hearing. My work tends to be heavily user-centric, involving the design, evaluation, and validation of ML-powered systems through real-world studies and observations with the actual end users of the system.

Publications

I publish my research in top venues in Computer Accessibility, Human-Computer Interaction and Natural Language and Speech Processing.

2019
Artificial Intelligence Fairness in the Context of Accessibility Research on Intelligent Systems for People who are Deaf or Hard of Hearing
Sushant Kafle, Abraham Glasser, Sedeeq Al-khazraji, Larwan Berke, Matthew Seita and Matt Huenerfauth. ASSETS 2019 Workshop on AI Fairness for People with Disabilities (@ ASSETS'19)
Image showing a snippet of a typical online classroom setup with different visual sections labelled, such as the video stream of the lecturer in the top-left corner, slides in the middle, and captions at the bottom of the image.
Evaluating the Benefit of Highlighting Key Words in Captions for People who are Deaf or Hard of Hearing.
Sushant Kafle, Peter Yeung and Matt Huenerfauth. Annual SIGACCESS Conference on Computers and Accessibility (ASSETS'19)
Image showing the varying distributed feature representation of the word "night" under different prosodic contexts.
Fusion Strategy for Prosodic and Lexical Representations of Word Importance.
Sushant Kafle, Cecilia O. Alm and Matt Huenerfauth. Annual Conference of the International Speech Communication Association (INTERSPEECH'19)
Image showing an RNN-based neural architecture for extracting speech-based features for words in a spoken dialogue.
Modeling Acoustic-Prosodic Cues for Word Importance Prediction in Spoken Dialogues.
Sushant Kafle, Cecilia O. Alm and Matt Huenerfauth. Speech and Language Processing for Assistive Technologies (@ NAACL'19)
Image showing a graph of impact scores of recognition errors on the understandability of the text.
Predicting the Understandability of Imperfect English Captions for People who are Deaf or Hard of Hearing.
Sushant Kafle and Matt Huenerfauth. ACM Transactions on Accessible Computing (TACCESS'19)
2018
Image showing three individuals (two hearing and one deaf) using a mobile app with automatic speech recognition for communication.
Behavioral Changes in Speakers who are Automatically Captioned in Meetings with Deaf or Hard-of-Hearing Peers.
Matthew Seita, Khaled Albusays, Sushant Kafle, Michael Stinson and Matt Huenerfauth. Annual SIGACCESS Conference on Computers and Accessibility (ASSETS'18)
Image showing ASL sentences with variable-length pauses inserted in between words.
Modeling the Speed and Timing of American Sign Language to Generate Realistic Animations.
Sedeeq Al-khazraji, Larwan Berke, Sushant Kafle, Peter Yeung and Matt Huenerfauth. Annual SIGACCESS Conference on Computers and Accessibility (ASSETS'18)
Best Paper Award
Image showing the sentence "what do you think is the biggest problem" with an importance annotation for each word. Words like "biggest" and "problem" received high importance.
A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts.
Sushant Kafle and Matt Huenerfauth. International Conference on Language Resources and Evaluation (LREC'18)
Image showing a speaker speaking in front of the camera with captions being displayed underneath.
Methods for Evaluation of Imperfect Captioning Tools by Deaf or Hard-of-Hearing Users at Different Reading Literacy Levels.
Larwan Berke, Sushant Kafle and Matt Huenerfauth. ACM Conference on Human Factors in Computing Systems (CHI'18)
Best Paper Honorable Mention
Image showing a sign-language avatar delivering an ASL sentence.
Modeling and Predicting the Location of Pauses for the Generation of Animations of American Sign Language.
Sedeeq Al-khazraji, Sushant Kafle and Matt Huenerfauth. International Conference on Language Resources and Evaluation (@ LREC'18)
2017
Evaluating the Usability of Automatically Generated Captions for People who are Deaf or Hard of Hearing.
Sushant Kafle and Matt Huenerfauth. Annual SIGACCESS Conference on Computers and Accessibility (ASSETS'17)
Best Paper Award
2016
A graph showing the influence of various semantic properties of text on the impact of ASR errors. Properties like word length are shown to have the highest impact on error quality.
Effect of Speech Recognition Errors on Text Understandability for People who are Deaf or Hard of Hearing.
Sushant Kafle and Matt Huenerfauth. Speech and Language Processing for Assistive Technologies (@ INTERSPEECH'16)

Projects

Image showing the sentence "what do you think is the biggest problem" with an importance annotation for each word. Words like "biggest" and "problem" received high importance.
Word Importance Labeler
Developing a tool with a suite of metrics for evaluating the quality of automatically generated transcripts of classroom lectures based on word-importance information. The tool was developed as part of a research project at the National Technical Institute for the Deaf (NTID), which investigated the usability of automatic captioning for classrooms.
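As a rough illustration of how word-importance information can feed into such a metric, here is a minimal Python sketch of a word-importance-weighted error rate; the function, weighting scheme, and example scores are illustrative assumptions, not the tool's actual implementation.

```python
# Minimal sketch (not the actual tool): a word-importance-weighted error rate.
# Each reference word carries an importance score in [0, 1]; errors on
# important words are penalized more heavily than errors on filler words.

def weighted_error_rate(ref_words, hyp_words, importance):
    """ref_words/hyp_words: lists of tokens; importance: score per reference word."""
    # Standard edit-distance DP, but substitutions/deletions cost the
    # importance of the reference word instead of a flat 1.0.
    n, m = len(ref_words), len(hyp_words)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + importance[i - 1]        # deletion
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + 1.0                      # insertion
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if ref_words[i - 1] == hyp_words[j - 1] else importance[i - 1]
            dp[i][j] = min(dp[i - 1][j] + importance[i - 1],   # deletion
                           dp[i][j - 1] + 1.0,                 # insertion
                           dp[i - 1][j - 1] + sub)             # substitution or match
    return dp[n][m] / max(sum(importance), 1e-9)

# Example: an error on "problem" (high importance) hurts more than one on "you".
ref = ["what", "do", "you", "think", "is", "the", "biggest", "problem"]
imp = [0.3, 0.1, 0.1, 0.5, 0.1, 0.1, 0.9, 0.9]
hyp = ["what", "do", "you", "think", "is", "the", "biggest", "prom"]
print(weighted_error_rate(ref, hyp, imp))
```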
Image showing an RNN-based neural architecture for extracting speech-based features for words in a spoken dialogue.
Speech Analysis for Word Importance Modeling
Investigated various acoustic-prosodic features from human speech to see whether they provide clues about the importance of the word being spoken, with importance defined in terms of the word's contribution to understanding the meaning of a spoken utterance.
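For a sense of what this involves, below is a minimal sketch of per-word prosodic feature extraction, assuming frame-level pitch and energy tracks and word time alignments are already available; the feature set and names are illustrative, not the features used in the published models.

```python
import numpy as np

FRAME_RATE = 100  # assumed: 100 frames per second (10 ms hop)

def word_prosody_features(word, start_s, end_s, pitch_hz, energy):
    """Summarize frame-level pitch/energy tracks over one word's time span."""
    lo, hi = int(start_s * FRAME_RATE), int(end_s * FRAME_RATE)
    p = pitch_hz[lo:hi]
    e = energy[lo:hi]
    voiced = p[p > 0]  # treat 0 Hz frames as unvoiced
    return {
        "word": word,
        "duration_s": end_s - start_s,
        "mean_pitch": float(voiced.mean()) if voiced.size else 0.0,
        "pitch_range": float(voiced.max() - voiced.min()) if voiced.size else 0.0,
        "mean_energy": float(e.mean()) if e.size else 0.0,
        "max_energy": float(e.max()) if e.size else 0.0,
    }

# Toy example with synthetic tracks; real tracks would come from a pitch tracker.
pitch = np.abs(np.random.randn(500)) * 120 + 80
energy = np.abs(np.random.randn(500))
print(word_prosody_features("biggest", 1.20, 1.55, pitch, energy))
```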
Image showing various types of ASR errors with color bars highlighting the percentage of their occurrence.
Speech Recognition Error Analysis
Categorized and analyzed the different types of errors produced by the Sphinx4 speech recognition system on 100 hours of speech recordings from the LibriSpeech corpus. Implemented novel output-alignment modules to account for fuzzy time-stamp matching and for one-to-many and many-to-one substitution errors. Created a local compute cluster to speed up speech recognition.
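A simplified sketch of time-overlap-based alignment is shown below; the overlap rule, labels, and toy data are illustrative assumptions rather than the project's actual alignment module.

```python
# Illustrative sketch (not the project's actual module): align hypothesis words
# to reference words by time overlap, so slightly shifted time stamps still
# match, then bucket each reference word's outcome.

def overlap_s(a, b):
    """Overlap in seconds between two (word, start_s, end_s) entries."""
    return min(a[2], b[2]) - max(a[1], b[1])

def align_by_time(ref, hyp, min_overlap=0.05):
    """ref/hyp: lists of (word, start_s, end_s). Labels each reference word."""
    labels = []
    for r in ref:
        matches = [h for h in hyp if overlap_s(r, h) > min_overlap]
        if not matches:
            labels.append((r[0], "deletion"))
        elif any(h[0] == r[0] for h in matches):
            labels.append((r[0], "correct"))
        elif len(matches) > 1:
            labels.append((r[0], "one-to-many substitution"))
        else:
            labels.append((r[0], "substitution"))
    return labels

# Toy example: "recognition" is split across two hypothesis words.
ref = [("speech", 0.0, 0.4), ("recognition", 0.4, 1.1), ("errors", 1.1, 1.5)]
hyp = [("speech", 0.0, 0.4), ("wreck", 0.45, 0.7), ("ignition", 0.7, 1.1), ("errors", 1.15, 1.5)]
print(align_by_time(ref, hyp))
```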

Timeline and Events

September, 2019

Presented our work on "Fusion Strategy for Prosodic and Lexical Representations of Word Importance" at the INTERSPEECH 2019 conference. [slides]

July, 2019

Our position paper on "Artificial Intelligence Fairness in the Context of Accessibility Research on Intelligent Systems for People who are Deaf or Hard of Hearing" has been accepted at the ASSETS 2019 Workshop on AI Fairness for People with Disabilities.

June, 2019

Our work on "Evaluating the Benefit of Highlighting Key Words in Captions for People who are Deaf or Hard of Hearing" has been accepted at the ASSETS 2019 conference!

June, 2019

Attended the HCIC 2019 Workshop on the Future of Work held in Pajaro Dunes, Watsonville, CA. Also got the chance to present my Ph.D. thesis work as part of the poster session at the workshop.

June, 2019

Happy to announce that our paper "Fusion Strategy for Prosodic and Lexical Representations of Word Importance" was accepted at the INTERSPEECH 2019 conference.

June, 2019

Attended NAACL 2019 conference and participated in the SLPAT workshop with our paper "Modeling Acoustic-Prosodic Cues for Word Importance Prediction in Spoken Dialogues."

April, 2019

Our journal paper "Predicting the Understandability of Imperfect English Captions for People who are Deaf or Hard of Hearing" was accepted at ACM Transactions on Accessible Computing (TACCESS).

October, 2018

Co-presented our paper "Modeling the Speed and Timing of American Sign Language to Generate Realistic Animations" at the ASSETS 2018 conference, where it was recognized with the Best Paper Award.

June, 2018

Participating in the summer internship program at Google's Seattle office until September 2018.

June, 2018

Two of our papers from the lab, which I am pleased to have contributed to, have been accepted at the ASSETS 2018 conference.

June, 2018

Successfully defended my Ph.D. thesis proposal. Officially a Ph.D. candidate now (yay!).

January, 2018

Our workshop paper "Modeling and Predicting the Location of Pauses for the Generation of Animations of American Sign Language" was accepted at the Sign Language Workshop at LREC 2018.

December, 2017

Our paper "Methods for Evaluation of Imperfect Captioning Tools by Deaf or Hard-of-Hearing Users at Different Reading Literacy Levels" was accepted at the CHI 2018 conference and was nominated for a Best Paper Honarable Mention award (ranked among the top 5% of all submissions to the SIGCHI 2018 conference).

December, 2017

Our paper "A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts." was accepted at the LREC 2018 conference.

November, 2017

Our ASSETS 2017 paper won the "Best Paper Award".

September, 2017

Announced the creation of the Corpus of Word Importance Annotations; more details here.

July, 2017

Our paper "Evaluating the Usability of Automatically Generated Captions for People who are Deaf or Hard of Hearing" was accepted at the ASSETS 2017 conference and was nominated for a Best Paper Award.

May, 2017

Helped facilitate the Research Experiences for Undergraduates (REU) program at the CAIR lab.

October, 2016

Participated and presented at the ASSETS 2016 Doctoral Consortium.

July, 2016

Our paper "Effect of Speech Recognition Errors on Text Understandability for People who are Deaf or Hard of Hearing" was accepted at SLPAT 2016 workshop.

May, 2016

Successfully defended the Ph.D. Research Potential Assessment.

August, 2015

Joined RIT for doctoral studies in the Golisano College of Computing and Information Sciences. Started working as a research assistant at the CAIR Lab.