

EXPLORING THE CAPABILITIES OF LUHN SUMMARIZATION, BART, PEGASUS, AND CLAUDE LLM FOR NEWS SUMMARIZATION

Bachelor thesis, 68 pages, 2024

CONTENTS

GLOSSARY 3
INTRODUCTION 4
1 Design 5
1.1 Requirements 5
1.2 Flowchart 6
1.3 Activity Diagram 7
2 Used Techniques, Technologies and Tools 12
2.1 Jupyter Notebook 12
2.2 PyTorch 13
2.3 Luhn Summarization 14
2.4 BART 15
2.5 Pegasus 18
2.6 Claude LLM 19
2.7 ROUGE 20
2.8 BERT Score 22
3 Implementations 25
3.1 Luhn Summarization 25
3.2 BART 31
3.3 Pegasus 39
3.4 Claude LLM 45
3.5 Model Comparison 49
CONCLUSION 56
REFERENCES 57
APPENDIX A 59


INTRODUCTION

In today's information age, vast amounts of data are readily available. While this abundance holds immense value, extracting key insights and essential information from large volumes of text demands a significant investment of time. News articles in particular contain a wealth of information, yet reading them in their entirety can be cumbersome. Ideally, a tool would condense these texts and extract the most important details efficiently.
Fortunately, numerous text summarization techniques have emerged. These range from basic approaches that rely on simple logic to advanced methods built on complex architectures and pre-trained models. Each technique offers its own advantages and limitations.
This paper explores the capabilities of several prominent summarization techniques. Specifically, we investigate the effectiveness of Luhn summarization, BART, Pegasus, and Claude LLM in summarizing news articles. Using open-source news article resources, this research evaluates the strengths and weaknesses of each method in extracting crucial information from news content. To achieve this, the following tasks must be completed:
• Provide a text summarization model for each technique under analysis (a sketch of such a setup appears after this list).
• Evaluate the performance of each text summarization model using the results of each metric.
• Compare the metric scores obtained from the summarization models against one another.
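To make the first task concrete, the following is a minimal sketch of how the four summarizers could be instantiated in Python. The package choices (sumy for Luhn, Hugging Face transformers for BART and Pegasus, the anthropic SDK for Claude) and the specific model checkpoints are assumptions for illustration; the thesis body describes the actual implementations in Section 3.

# A minimal sketch of how the four summarizers could be wired up.
# Assumptions: sumy for Luhn, Hugging Face transformers for BART and
# Pegasus, the anthropic SDK for Claude; checkpoint and model names
# are illustrative, not the thesis's exact configuration.
import anthropic
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.luhn import LuhnSummarizer
from transformers import pipeline

ARTICLE = "..."  # full text of an open-source news article

# Luhn: extractive, scores sentences by significant-word frequency.
parser = PlaintextParser.from_string(ARTICLE, Tokenizer("english"))
luhn_summary = " ".join(
    str(s) for s in LuhnSummarizer()(parser.document, sentences_count=3)
)

# BART and Pegasus: abstractive sequence-to-sequence models.
bart = pipeline("summarization", model="facebook/bart-large-cnn")
pegasus = pipeline("summarization", model="google/pegasus-cnn_dailymail")
bart_summary = bart(ARTICLE, max_length=128, min_length=30)[0]["summary_text"]
pegasus_summary = pegasus(ARTICLE, max_length=128, min_length=30)[0]["summary_text"]

# Claude: summarization by prompting a large language model via its API.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
claude_summary = client.messages.create(
    model="claude-3-haiku-20240307",  # illustrative model id
    max_tokens=256,
    messages=[{"role": "user",
               "content": f"Summarize this news article in three sentences:\n\n{ARTICLE}"}],
).content[0].text

Note that the first technique is extractive (it selects sentences verbatim), while the other three produce abstractive summaries; this is one reason the thesis evaluates the outputs with both an n-gram overlap metric (ROUGE) and an embedding-based one (BERT Score).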




The objective of this bachelor thesis is to explore the capabilities of Luhn Summarization, BART, Pegasus, and Claude LLM on open-source news articles. The specific objectives are to build a text summarization model for each technique, obtain the evaluation metric scores of each model, and compare the techniques on the basis of those scores.
The techniques, technologies, and tools described in this paper can be applied directly when exploring text summarization models for generating summaries of news articles.
This paper explores the capabilities of Luhn Summarization, BART, Pegasus, and Claude LLM in generating summaries for news articles. It concludes that, of the four, Pegasus is the most suitable model for this task.
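To illustrate how the evaluation behind this conclusion can be reproduced, the sketch below scores a single candidate summary against a reference using the two metrics named in this paper, ROUGE and BERT Score. The package choices (rouge-score and bert-score) and the variable names are assumptions for illustration; the thesis's Section 3.5 performs the full comparison across all four techniques.

# A hedged sketch of the metric step: scoring one candidate summary
# against a reference. Package choices (rouge-score, bert-score) are
# assumptions, not necessarily the thesis's exact tooling.
from rouge_score import rouge_scorer
from bert_score import score as bert_score

reference = "..."  # gold summary from the news dataset
candidate = "..."  # output of one of the four techniques

# ROUGE: n-gram and longest-common-subsequence overlap with the reference.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge = scorer.score(reference, candidate)
print({name: round(s.fmeasure, 4) for name, s in rouge.items()})

# BERTScore: token-level similarity of contextual embeddings, which
# credits paraphrases that ROUGE's surface matching would miss.
P, R, F1 = bert_score([candidate], [reference], lang="en")
print("BERTScore F1:", round(F1.mean().item(), 4))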




