This task can seem impossible even for people, and the model is not expected to solve it perfectly. However, such a challenging task encourages the model to learn about language and general facts about the world, as well as how to distill information from throughout the document in order to generate output that closely resembles a summary. The advantage of this self-supervision is that you can create as many examples as there are documents, without any human annotation – a factor that is often the bottleneck in purely supervised systems.
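The idea of turning raw documents into training pairs with no labels can be sketched as follows. This is a simplified illustration, not the actual PEGASUS pipeline: the mask token name, the sentence splitting, and the helper function are all hypothetical.

```python
# Hypothetical sketch: building a self-supervised (input, target) pair
# by removing whole sentences from a document. The model's job during
# pre-training is to regenerate the removed sentences from the rest.
MASK_TOKEN = "<mask>"  # placeholder; the real vocabulary differs

def make_example(sentences: list[str], mask_ids: list[int]) -> tuple[str, str]:
    """Replace the chosen sentences with a mask token; the target is
    the concatenation of the removed sentences (a pseudo-summary)."""
    inp = " ".join(MASK_TOKEN if i in mask_ids else s
                   for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in mask_ids)
    return inp, target

doc = ["A storm hit the coast on Monday.",
       "Thousands of homes lost power.",
       "Repairs are expected to take a week."]
inp, target = make_example(doc, mask_ids=[1])
# target == "Thousands of homes lost power."
```

Because every document yields a pair like this, the amount of training data scales with the size of the crawl rather than with annotation budget.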
An example of the self-supervised objective used by PEGASUS during pre-training: as with BERT, the model is trained to produce all of the masked content, except that here entire sentences are masked rather than individual tokens. It turned out to work better to choose "important" sentences to mask, making the target of each self-supervised example even more similar to a summary. Important sentences were identified automatically by finding those most similar to the rest of the document, using a metric called ROUGE.
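The sentence-selection step above can be sketched with a crude unigram-overlap stand-in for ROUGE. The scoring function and the example document below are illustrative assumptions, not PEGASUS's actual implementation, which uses proper ROUGE with normalisation.

```python
from collections import Counter

def unigram_f1(a: str, b: str) -> float:
    """Unigram-overlap F1 between two texts (a crude stand-in for ROUGE-1)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    overlap = sum((ca & cb).values())
    if overlap == 0:
        return 0.0
    p, r = overlap / sum(ca.values()), overlap / sum(cb.values())
    return 2 * p * r / (p + r)

def select_gap_sentences(sentences: list[str], n_mask: int) -> list[int]:
    """Score each sentence against the rest of the document and return
    the indices of the n_mask most 'important' (most similar) ones."""
    scores = []
    for i, s in enumerate(sentences):
        rest = " ".join(sentences[:i] + sentences[i + 1:])
        scores.append((unigram_f1(s, rest), i))
    return sorted(i for _, i in sorted(scores, reverse=True)[:n_mask])

doc = [
    "Pegasus is a pre-trained model for abstractive summarization.",
    "The weather was sunny yesterday.",
    "Pre-training masks important sentences and the model must generate them.",
    "Summarization models are evaluated with ROUGE.",
]
masked = select_gap_sentences(doc, n_mask=1)
```

The off-topic weather sentence scores lowest, while the sentence that shares the most vocabulary with the rest of the document is picked for masking.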

ROUGE calculates the similarity of two texts by measuring n-gram overlap, producing a score from 0 to 100 (ROUGE-1, ROUGE-2, and ROUGE-L are three common variants). As with other recent methods such as T5, the model was pre-trained on a very large corpus of web-crawled documents, then fine-tuned on 12 downstream public abstractive summarization datasets, obtaining new state-of-the-art results as measured by automatic metrics, while using only 5% of the number of parameters of T5.
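The ROUGE-1 variant mentioned above can be computed with a simple unigram-overlap F1. This is a minimal sketch assuming whitespace tokenisation and no stemming; production scorers add those normalisation steps.

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 between a candidate and a
    reference text, scaled to 0-100."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped per-token matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)

score = rouge_1("the cat sat on the mat", "the cat lay on the mat")
# 5 of 6 tokens match on each side, so the score is about 83.3
```

ROUGE-2 would count bigram overlaps instead, and ROUGE-L scores the longest common subsequence rather than fixed-size n-grams.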