Researchers have found that the order in which concepts are presented to commonsense generators has a significant impact on the quality of what they produce. This finding could reshape how we understand and use language models (LMs) for generative tasks.
Generative Commonsense Reasoning (GCR) is a specialized task in Natural Language Generation (NLG) that seeks to create sentences that not only sound natural but also adhere to commonsense logic. The challenge lies in ensuring that the generated sentence covers every given input concept. A simple example: given the concepts {ball, batter, pitcher, throw}, a good model generates a sentence like “The pitcher throws the ball, and the batter hits a home run!”, which uses all four concepts in a plausible relationship and order.
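To make the task concrete, here is a minimal sketch of how one might generate a sentence from a concept set with the Hugging Face transformers library. The checkpoint name is a placeholder, not a real model from the study; substitute any seq2seq model fine-tuned for this kind of concept-to-text generation.

```python
# Minimal sketch of the GCR task: join the concept set into one input string
# and ask a seq2seq model to realize it as a natural sentence.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "your-org/bart-large-commongen"  # hypothetical, fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

concepts = ["ball", "batter", "pitcher", "throw"]
inputs = tokenizer(" ".join(concepts), return_tensors="pt")

# Beam search tends to cover more of the input concepts than greedy decoding.
output_ids = model.generate(**inputs, num_beams=5, max_length=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# e.g. "The pitcher throws the ball and the batter hits a home run."
```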
The study focused on the CommonGen dataset, a benchmark built to evaluate a model’s ability to produce such sentences. It comprises 3.5K distinct concept sets paired with tens of thousands of human-written sentences. The quality of the generated sentences was judged with multiple metrics, including the widely used BLEU, ROUGE, and METEOR scores.
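For readers who want to reproduce such scores, the snippet below shows one way to compute these metrics with the Hugging Face evaluate package; the study’s own evaluation scripts may well differ, so treat this as an illustrative sketch.

```python
# Sketch: score a generated sentence against human references with BLEU,
# ROUGE, and METEOR via the Hugging Face `evaluate` package.
import evaluate

predictions = ["The pitcher throws the ball and the batter hits a home run."]
references = [["The pitcher throws the ball, and the batter hits a home run!"]]

bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
meteor = evaluate.load("meteor").compute(predictions=predictions, references=references)

print(bleu["bleu"], rouge["rougeL"], meteor["meteor"])
```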
The findings suggest that the ordering of the input concepts plays an important role in the quality of the generated sentence. When concepts were presented in an order that mirrored human-written sentences, all models, regardless of their underlying architecture, produced higher-quality outputs.
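A simple way to observe this effect yourself is to hold the model fixed, generate from every permutation of a concept set, and compare the outputs. A rough sketch, again assuming a hypothetical CommonGen-fine-tuned checkpoint:

```python
# Sketch: generate from all orderings of the same concept set and compare.
from itertools import permutations
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "your-org/bart-large-commongen"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

concepts = ("ball", "batter", "pitcher", "throw")
for order in permutations(concepts):
    inputs = tokenizer(" ".join(order), return_tensors="pt")
    output_ids = model.generate(**inputs, num_beams=5, max_length=32)
    sentence = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(" ".join(order), "->", sentence)
```

Per the study’s findings, orderings that resemble how a person would naturally mention the concepts (for example, pitcher, throw, ball, batter) should tend to yield the more fluent and complete sentences.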
Among the various models assessed, BART-large emerged as the clear front-runner, consistently outperforming its peers across the evaluation metrics. This was especially noticeable when it was fine-tuned using concept orders found in the CommonGen training data. Yet the study also showed that larger models, such as GPT-3, did not necessarily outdo smaller ones on this specific task, underscoring the importance of fine-tuning.
It’s also worth noting that human annotators often reordered the input concepts when writing sentences by hand. This tendency to reorder offers useful guidance on how best to present concepts to these models.
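One way to approximate that human ordering automatically is to sort each concept set by where its concepts first appear in a reference sentence, then feed the model that order. The sketch below uses naive prefix matching; a real pipeline would use lemmatization to handle inflected forms such as “throws” for “throw”.

```python
# Sketch: reorder a concept set to match the order in which the concepts
# (or their inflected forms) first appear in a human-written reference.
def human_order(concepts, reference):
    tokens = reference.lower().split()

    def first_position(concept):
        # Earliest token that starts with the concept ("throws" matches "throw").
        for i, token in enumerate(tokens):
            if token.strip(".,!?").startswith(concept.lower()):
                return i
        return float("inf")  # concepts not found go to the end

    return sorted(concepts, key=first_position)

reference = "The pitcher throws the ball, and the batter hits a home run!"
print(human_order(["ball", "batter", "pitcher", "throw"], reference))
# ['pitcher', 'throw', 'ball', 'batter']
```

Feeding the concepts in this recovered order, rather than an arbitrary one, is one way to mimic the orderings the annotators themselves settled on.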
In conclusion, understanding how models like BART-large and GPT-3 process and generate information is becoming increasingly important. This study sheds light on the nuanced relationship between concept ordering and sentence generation, setting the stage for further advances in Natural Language Generation.