Snippet generation using textbook corpus - An NLP approach based on BERT
Published in IOP Publishing Ltd, 2021
Volume: 1716
Issue: 1
Abstract
In today’s technology-driven world, most millennials are tech-savvy. They have neither the time nor the inclination to read textbooks, newspapers, or journals in full; they want instant answers and clarifications for their doubts and questions. On many occasions, we cannot find the exact word or meaning we are searching for. A clear, concise summary of a piece of literature, understandable at a glance, would therefore save a great deal of time. This paper discusses the use of Natural Language Processing (NLP) to summarize a given text, textbook, or paper. The state of the art in this field is represented by Google’s Bidirectional Encoder Representations from Transformers (BERT), one of the latest developments in NLP. BERT is believed to understand English better than earlier models because of its underlying bidirectional architecture. The present proposal uses BERT as a sentence-similarity extractor: by applying the TextRank algorithm to those similarities, the sentences holding the most important information are extracted. This falls under the domain of extractive summarization. Abstractive summarization is much discussed, but since BERT is not built for generating text, we utilize it in a different way to achieve the requirement. This paper intends to demonstrate the use of BERT for the next generation of learners, saving them time and encouraging researchers to continue developing new tools in the future. © 2021 Institute of Physics Publishing. All rights reserved.
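The pipeline the abstract outlines — score each sentence by how central it is to the others, then extract the top-ranked ones — can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a pairwise sentence-similarity matrix has already been computed (e.g. cosine similarities between BERT sentence embeddings) and runs TextRank, i.e. PageRank-style power iteration, over the resulting sentence graph. The function names `textrank` and `extract` and all parameter values are illustrative choices.

```python
import numpy as np

def textrank(sim, d=0.85, tol=1e-6, max_iter=100):
    """Score sentences by power iteration over a similarity matrix.

    `sim` is an (n, n) matrix of pairwise sentence similarities,
    such as cosine similarities between BERT sentence embeddings.
    Returns a score per sentence; scores sum to 1.
    """
    n = sim.shape[0]
    M = np.asarray(sim, dtype=float).copy()
    np.fill_diagonal(M, 0.0)           # a sentence does not vote for itself
    col = M.sum(axis=0)
    col[col == 0] = 1.0                # guard against isolated sentences
    M = M / col                        # column-stochastic transition matrix
    r = np.full(n, 1.0 / n)            # uniform initial scores
    for _ in range(max_iter):
        r_new = (1 - d) / n + d * (M @ r)   # damped PageRank update
        if np.abs(r_new - r).sum() < tol:
            break
        r = r_new
    return r

def extract(sentences, sim, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(np.asarray(sim))
    top = sorted(np.argsort(scores)[::-1][:k])
    return [sentences[i] for i in top]
```

In an end-to-end system, the similarity matrix would come from encoding each sentence with BERT and taking pairwise cosine similarities; here any symmetric non-negative matrix works, which keeps the ranking step separate from the choice of encoder.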
About the journal
Journal: Journal of Physics: Conference Series
Publisher: IOP Publishing Ltd
ISSN: 1742-6588