Original Article
Recognition of Fake News with Deep Learning Architecture LSTM

Noor Hasan Fadhil 1*
1 Al-Qasim Green University, Babylon, Iraq
ABSTRACT

Fake news spread through digital channels has become a pressing social issue, making its automatic detection imperative. In this regard, this paper develops a deep learning classifier for fake news. The proposed technique consists of several phases: preprocessing the data to eliminate noise, tokenizing it, and vectorizing it, after which the classifier is trained as an LSTM model. The LSTM model exploits its ability to incorporate sequence and context information within the text data. The empirical findings show that the model achieves 99.7% accuracy and outperforms many traditional machine learning models and contemporary hybrid models. Additional metrics such as precision, recall, and F1-score validate the effectiveness of the model. Despite this excellent performance, further testing of the model on different datasets is recommended.

Keywords: Fake News Detection, Deep Learning, LSTM, Natural Language Processing
INTRODUCTION
Fake news, fabricated information presented as actual news, poses a major risk to healthy dialogue in any society that prides itself on democracy (Sharma et al., 2022). This phenomenon, characterized by the dissemination of fabricated stories, distorted facts, and sensationalized headlines, exploits the rapid and widespread nature of digital media to manipulate public opinion and sow confusion (D’Ulizia et al., 2021). The rapid spread of such misinformation, particularly via social media, calls for robust and automatic identification tools to preserve information credibility (Güler and Gunduz, 2023). Identifying fake news is further complicated by its many forms, which range from blatant falsehoods to subtle distortions, and by the sheer flood of information available on the Internet (Hosseini et al., 2023). It has therefore become essential to have proper detection methodologies in place (Pierri and Ceri, 2019).
In the past, many attempts have been made to detect fake news using linguistic techniques, metadata analysis, and other network-based features (Alghamdi et al., 2023). Later developments have centered on machine learning models, including deep learning models that automate the process by learning complex patterns from large volumes of data (Alghamdi et al., 2023). These include CNNs for feature extraction and LSTM networks for capturing long-range dependencies in text data (Killi et al., 2022). Hybrid models that combine the ability of CNNs to extract spatial features with the ability of LSTM networks to analyze temporal sequences have been proposed to address the challenges of fake news (Ajik et al., 2023). The present research tackles this urgent problem by developing a model for distinguishing real news articles from fake ones, using state-of-the-art deep learning algorithms that analyze textual features to detect linguistic cues of deception (Hosseini et al., 2023). This work contributes to the fight against fake news by developing new ways of improving detection methods (Akinyemi et al., 2020). One of the unique strengths of Long Short-Term Memory networks for this purpose lies in their ability to detect the complex linguistic patterns that serve as reliable cues of deception (Killi et al., 2022), allowing the model to analyze intricate patterns that play a significant role in distinguishing fake from real news articles (Ajik et al., 2023).
LITERATURE REVIEW
Existing research on fake news detection has explored various techniques, including deep learning and natural language processing, yet it often overlooks the complete optimization of model hyperparameters, which can significantly influence classification results (Emmy et al., 2023).
Several studies have shown that deep learning methods (e.g., CNNs, LSTMs, BERT) are useful for capturing the complex linguistic patterns characteristic of deceptive content (Hosseini et al., 2023; Güler and Gunduz, 2023).
However, contextual cues and skip connections are still underused in existing models, which limits the development of detection systems that can exploit more extensive contextual information. To handle data propagation through multiple neural layers, hierarchical stacked models such as FakeStack, a Tri-BERT-CNN-LSTM architecture, have been created (Keya et al., 2023).
Furthermore, the effectiveness of these models can be greatly enhanced by attention-based mechanisms or deep variational models, which allow the system to weigh the importance of different words and thus detect key features suggestive of misinformation (Hosseini et al., 2023).
Other frameworks use BERT-base to encode textual content, CNN and max-pooling layers to reduce features, and BiLSTM layers to process associated metadata and capture contextual dependencies (Akinyemi et al., 2020).
Further advanced models have achieved high performance by combining ensemble learning with different machine learning and deep learning techniques to separate real from fake news (Rezaei et al., 2022).
Moreover, the ambiguity of natural language interpretation has been addressed by leveraging multi-EDU-structure awareness and enhanced text representations (Wang et al., 2022).
Beyond textual analysis, prior research highlights the importance of multimodal approaches that include visual and contextual information to improve detection accuracy (D’Ulizia et al., 2021). These versatile approaches rely on robust data processing pipelines, where techniques such as GloVe word representations support sophisticated deep learning methods like concatenated CNNs and LSTMs (Güler and Gunduz, 2023; Killi et al., 2022).
The combination of LSTM and CNN models has proven effective at extracting both contextual semantics and local features (Güler and Gunduz, 2023). Despite these advances, major challenges remain in generalizing these models to diverse datasets and adapting to evolving misinformation tactics (Sharma et al., 2022).
This calls for continued research into architectures that can detect nuance in discourse, such as conspiracy-theory patterns, and maintain efficacy against increasingly sophisticated forms of deceptive content (Haupt et al., 2023).
Moreover, when the characteristics of the test dataset differ significantly from the development data, classification performance often suffers, since multiple neural network layers are required for effective training (Akinyemi et al., 2020).
Models with GloVe-LSTM architectures outperform traditional baselines in F-score but are limited by training data ratios and preprocessing techniques (Killi et al., 2022).
Such studies often require large labeled datasets, and their reliance on extensive feature engineering makes adapting to new forms of fake news difficult (Essa et al., 2023).
These models are often trained on benchmarks such as the ISOT dataset, 44,898 political and world news articles (2016–2017) from PolitiFact and Reuters (Hosseini et al., 2023; Kholiq et al., 2022), and the WELFake dataset, 72,134 instances with title and text columns in which fake articles tend to show more subjectivity and lower readability (Killi et al., 2022). The generality of these models may be restricted by their dependence on these specific datasets, so performance can vary considerably across diverse sources (Pierri and Ceri, 2019). Such data dependency often results in inconsistent performance and bias: a model trained on one type of narrative may not perform well on other types of misinformation, calling for more robust and generalizable frameworks (Sharma et al., 2022). Research therefore continues into transfer learning for cross-dataset generalization and domain adaptation to boost model applicability across diverse information landscapes (Sharma et al., 2022). Table 1 summarizes the models, datasets, metrics, and limitations found in each paper, illustrating the current state of fake news detection research.
Table 1 Summary of Fake News Detection Research

Study | Approach | Primary Dataset(s) | Accuracy | Limitations
Kholiq et al. (2022) | LightGBM with CountVectorizer | ISOT (44.9K articles) | Nearly perfect | Optimal performance depends strictly on specific dataset sizes (e.g., 20,000–26,000 samples); higher loss on small sets
Ajik et al. (2023) | Optimized CNN and LSTM | Scraped news | 97.15% | Overlooks comprehensive optimization of model hyperparameters, which can significantly impact classification performance
Güler and Gunduz (2023) | CNN, LSTM, and BERT | ISOT, BuzzFeed | Demonstrated high utility | Excludes social media involvement features; high computational complexity for real-time detection
Hosseini et al. (2023) | Topic and Deep Variational | ISOT | 80%–91% | Relies on very specific semantics of fake news, which can cause misclassification in long-form articles
Keya et al. (2023) | FakeStack | Not specified | 99.74% | Hierarchical stacked architecture is highly complex, limiting practical deployment in real-world scenarios
Alghamdi et al. (2023) | BERT-base + CNN + BiLSTM | Metadata-enhanced sets | N/A | Lacks original empirical analysis and in-depth exploration of implementation challenges in actual systems
Rezaei et al. (2022) | Stacking Ensemble Network | Content-feature sets | 96.24% | Accuracy drops (to 94.40%) on multi-class classification and diverse fact-checking sources
Wang et al. (2022) | Multi-EDU Awareness | Kaggle, BuzzFeed | Demonstrated high utility | Assessment limited by a lack of diverse performance metrics such as recall and precision
D’Ulizia et al. (2021) | Multimodal Survey | Benchmarks (2016–2017) | N/A | Dependence on specific 2016–2017 political datasets limits generalizability to more modern misinformation
Killi et al. (2022) | GloVe-LSTM | WELFake (72.1K entries) | 94.4% | Struggled with effective text data processing and inconsistencies during data scraping
Sharma et al. (2022) | Mitigation Survey | Global news sets | N/A | Identifies persistent model bias toward specific narratives and low adaptability to evolving misinformation tactics
Haupt et al. (2023) | Qualitative Coding + ML | Conspiracy discourse | Varies by discourse | Limited scalability for high-volume detection due to the need for qualitative content coding
Akinyemi et al. (2020) | Multi-layer Neural Nets | Scraped social media | High performance utility | Classification performance suffers when the test dataset differs significantly from the development dataset
Essa et al. (2023) | Hybrid BERT + LightGBM | Labeled datasets | 98.91% | Reliance on extensive feature engineering and model sophistication poses challenges for adapting to novel news forms
Pierri and Ceri (2019) | False News Analysis | Propagation sets | Varies by platform | Generalizability restricted by dependence on specific datasets, leading to inconsistent performance across diverse sources
The suggested LSTM model achieves an accuracy of 99.7%, which surpasses other approaches reported in the literature, such as stand-alone algorithms like LSTM (99.18%) and combined deep learning models (maximum 98%). This is attributed to better data preprocessing and model configuration. However, more experiments are needed to validate its performance. Figure 1 shows a comparison of model accuracies.
Figure 1 Comparison of Fake News Detection Models
METHODOLOGY
This section describes the development of a deep learning model that distinguishes true from fake news. The process involves several stages, from data gathering to the generation of predictions. Figure 2 illustrates the methodology process flow.
Figure 2 Methodology Process
Data
The first step uses the Fake and Real News Dataset, which contains news articles labeled as real or fake and thus forms the basis for supervised learning. Figure 3 shows word clouds for the fake and real data, which help identify the most frequently used words, excluding stopwords.
Figure 3 Word Clouds for Fake and Real Data
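The word clouds visualize frequency counts; the underlying computation can be sketched in plain Python (a minimal illustration only — the stopword list and sample texts below are made up, not taken from the paper's data):

```python
from collections import Counter
import re

# Illustrative stopword subset; a real pipeline would use a full list (e.g., NLTK's).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was"}

def top_words(texts, n=5):
    """Count word frequencies across articles, ignoring stopwords."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(n)

# Hypothetical article snippets
fake = ["The president said the election was rigged",
        "Rigged election claims spread online"]
print(top_words(fake, 3))  # → [('election', 2), ('rigged', 2), ('president', 1)]
```

These counts are exactly what a word-cloud library scales into font sizes.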
Data Preprocessing
Before the data are fed into the model, several preprocessing steps are performed for better results:
Outlier detection: removing outliers or irrelevant data that might harm the model.
Noise removal: removing noise from the text, such as unnecessary characters or symbols.
Tokenization: splitting text into smaller units (words or tokens).
Vectorization: converting textual data into numerical representations (e.g., TF-IDF or word embeddings) so it can be processed by machine learning algorithms.
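The cleaning, tokenization, and vectorization steps can be sketched in plain Python (a minimal sketch; the paper does not publish its exact preprocessing code, and the regexes, stopword list, and vocabulary here are illustrative assumptions):

```python
import re

STOPWORDS = {"the", "a", "an", "is", "to", "of"}  # illustrative subset

def clean(text: str) -> str:
    """Lowercase and strip noise: URLs, non-letter characters, extra spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters only
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text: str) -> list[str]:
    """Split cleaned text into tokens, dropping stopwords."""
    return [tok for tok in clean(text).split() if tok not in STOPWORDS]

def vectorize(tokens: list[str], vocab: dict[str, int], seq_len: int = 10) -> list[int]:
    """Map tokens to integer ids and pad/truncate to a fixed length."""
    ids = [vocab.get(tok, 0) for tok in tokens][:seq_len]   # 0 = out-of-vocabulary
    return ids + [0] * (seq_len - len(ids))

# Hypothetical vocabulary and usage
vocab = {"breaking": 1, "news": 2, "president": 3}
print(vectorize(tokenize("BREAKING News!! the president..."), vocab))
# → [1, 2, 3, 0, 0, 0, 0, 0, 0, 0]
```

In practice a framework tokenizer (e.g., Keras's) would replace these hand-rolled helpers, but the transformation is the same.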
Train-Test Split
The dataset is split into two subsets: a training dataset, used only for training the model, and a testing dataset, used to evaluate it.
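The paper does not state its split ratio; assuming a common 80/20 split, a reproducible shuffle-and-split can be sketched as:

```python
import random

def train_test_split(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle indices reproducibly, then cut into train and test portions."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)            # fixed seed -> repeatable split
    cut = int(len(idx) * (1 - test_ratio))
    train_idx, test_idx = idx[:cut], idx[cut:]
    return ([samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx])

# Hypothetical data: 100 articles, alternating labels
texts = [f"article {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]
X_train, y_train, X_test, y_test = train_test_split(texts, labels)
print(len(X_train), len(X_test))  # → 80 20
```

Shuffling before the cut prevents any ordering in the source file (e.g., all fake articles first) from leaking into the split.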
DATA ANALYSIS AND RESULTS
Model Training (LSTM-based fake news detection model)
The LSTM model uses an architecture consisting of an Embedding layer, an LSTM layer with 128 neurons, and a sigmoid activation function for classification. It is trained with the Adam optimizer, and accuracy is used as the metric for measuring the capability of the model.
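A minimal Keras sketch of this architecture follows. Only the 128-unit LSTM, sigmoid output, Adam optimizer, and accuracy metric come from the paper; the vocabulary size and embedding dimension are illustrative assumptions:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 10000   # assumed vocabulary size
EMBED_DIM = 100      # assumed embedding dimension

def build_model():
    model = Sequential([
        Embedding(VOCAB_SIZE, EMBED_DIM),   # token ids -> dense vectors
        LSTM(128),                          # 128 units, as reported in the paper
        Dense(1, activation="sigmoid"),     # probability that the article is fake
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
# Training would then be: model.fit(X_train, y_train, validation_data=(X_test, y_test))
```

The single sigmoid unit with binary cross-entropy is the standard pairing for a two-class (real/fake) problem.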
Model Saving
After training completes, the trained model and its weights are stored to disk, so there is no need to retrain when deploying the model.
Prediction
Finally, the saved model is used for predictions: new input news is preprocessed in the same way as the training data, and the saved model decides whether it is real or fake.
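Saving, reloading, and predicting could look like the following sketch (the file name, tiny stand-in model, and 0.5 decision threshold are illustrative assumptions; the stand-in is untrained, so its output is not meaningful):

```python
import numpy as np
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Stand-in for the trained classifier (untrained here; for illustration only).
model = Sequential([Embedding(100, 8), LSTM(16), Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

# A (preprocessed) article: one padded sequence of token ids.
article_ids = np.zeros((1, 20), dtype="int32")
model.predict(article_ids, verbose=0)        # build the model before saving

model.save("fake_news_lstm.keras")           # persist architecture + weights
restored = load_model("fake_news_lstm.keras")

prob_fake = float(restored.predict(article_ids, verbose=0)[0, 0])
label = "fake" if prob_fake >= 0.5 else "real"
print(label, prob_fake)
```

Persisting the full model means deployment only needs `load_model` plus the same preprocessing pipeline used at training time.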
Model Evaluation
To determine the performance of the recommended fake news detection approach using LSTM, several evaluation metrics are utilized. This ensures that the model is not only highly accurate but also highly reliable in detecting fake news. Accuracy measures the overall correctness of the model, as shown in Equation (1):

Accuracy = (TP + TN) / (TP + TN + FP + FN)        (1)
Precision evaluates the fraction of articles predicted as fake that are actually fake, as shown in Equation (2):

Precision = TP / (TP + FP)        (2)
Recall indicates the ability of the model to find all fake news, as shown in Equation (3):

Recall = TP / (TP + FN)        (3)
The F1-score is the harmonic mean of precision and recall, as shown in Equation (4):

F1 = 2 × (Precision × Recall) / (Precision + Recall)        (4)
where True Positive (TP) = fake news detected as fake, True Negative (TN) = non-fake news detected as non-fake, False Positive (FP) = non-fake news detected as fake, and False Negative (FN) = fake news detected as non-fake. Figure 4 shows the confusion matrix, Figure 5 the model accuracy, and Figure 6 the loss function.
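Equations (1)–(4) can be computed directly from the confusion matrix counts, as in this small sketch (the counts are made up for illustration, not the paper's actual results):

```python
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)        # Eq. (1)
    precision = tp / (tp + fp)                        # Eq. (2)
    recall = tp / (tp + fn)                           # Eq. (3)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (4)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative counts only
print(metrics(tp=95, tn=90, fp=5, fn=10))
```

In practice these come from a library such as scikit-learn, but writing them out makes the equations concrete.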
Figure 4 Confusion Matrix
Figure 5 Model Accuracy
Figure 6 Model Loss
CONCLUSION AND RECOMMENDATIONS
Conclusion
This study presented an effective approach to fake news detection using a Long Short-Term Memory (LSTM) model. The methodology covered data preprocessing, feature extraction, model training, and evaluation with multiple metrics. The proposed approach reached very high scores, with 99.7% accuracy, 99.49% precision, 99.69% recall, and a 99.59% F1-score, indicating excellent potential for distinguishing fake from genuine news. Comparison with other works shows that traditional machine learning algorithms (SVM, Random Forest) usually provide lower accuracy, from 80% to 92%, while deep learning and hybrid methods can raise this to 95%–99%. Even more complex approaches, such as transformers and ensembles, report accuracy up to 99.3%. Despite using simple features, the proposed LSTM-based model thus outperforms most other algorithms. Although the model demonstrates great accuracy, certain limitations cannot be overlooked. First, performance may vary with the quality and structure of the dataset used. Second, overfitting can occur on more complex datasets and with highly sophisticated algorithms. Finally, advanced approaches often exploit metadata and multimodal features that the present model does not use.
Recommendation
For future research in this area, it is suggested to test the effectiveness of this model on other, more varied datasets and to include additional features such as metadata and images. It is also advisable to consider hybrid models that combine transformers with LSTM models.
DATA AVAILABILITY
The study's
dataset is accessible to the general public on the Kaggle platform. It is
available at [https://www.kaggle.com/data-sets/clmentbisaillon/fake-and-real-news-dataset?select=Fake.csv].
The dataset
contains labeled news articles categorized as real
and fake.
All preprocessing steps applied in this study
are described in the methodology section, and the processed data can be made
available upon reasonable request.
ACKNOWLEDGMENTS
None.
REFERENCES
Ajik, E. D., Obunadike, G. N., and Echobu, F. O. (2023). Fake News Detection Using Optimized CNN and LSTM Techniques. Journal of Information Systems and Informatics, 5(3), 1044–1057. https://doi.org/10.51519/journalisi.v5i3.548
Akinyemi, B. O., Adewusi, O., and Oyebade, A. (2020). An Improved Classification Model for Fake News Detection in Social Media. International Journal of Information Technology and Computer Science, 12(1), 34–43. https://doi.org/10.5815/ijitcs.2020.01.05
Alghamdi, J., Luo, S., and Lin, Y. (2023). A Comprehensive Survey on Machine Learning Approaches for Fake News Detection. Multimedia Tools and Applications, 83(17), 51009–51067. https://doi.org/10.1007/s11042-023-17470-8
D’Ulizia, A., Caschera, M. C., Ferri, F., and Grifoni, P. (2021). Fake News Detection: A Survey of Evaluation Datasets. PeerJ Computer Science, 7. https://doi.org/10.7717/peerj-cs.518
Essa, E., Omar, K., and Alqahtani, A. (2023). Fake News Detection Based on a Hybrid BERT and LightGBM Models. Complex and Intelligent Systems, 9(6), 6581–6592. https://doi.org/10.1007/s40747-023-01098-0
Güler, G., and Gunduz, S. (2023). Deep Learning Based Fake News Detection on Social Media. International Journal of Information Security Science, 12(2), 1–21. https://doi.org/10.55859/ijiss.1231423
Haupt, M. R., Chiu, M., Chang, J., Li, Z., Cuomo, R., and Mackey, T. K. (2023). Detecting Nuance in Conspiracy Discourse: Advancing Methods in Infodemiology and Communication Science With Machine Learning and Qualitative Content Coding. PLOS ONE, 18(12). https://doi.org/10.1371/journal.pone.0295414
Hosseini, M., Sabet, A. J., He, S., and Aguiar, D. (2023). Interpretable Fake News Detection With Topic and Deep Variational Models. Online Social Networks and Media, 36, 100249. https://doi.org/10.1016/j.osnem.2023.100249
Keya, A. J., Shajeeb, H. H., Rahman, Md. S., and Mridha, M. F. (2023). FakeStack: Hierarchical Tri-BERT-CNN-LSTM Stacked Model for Effective Fake News Detection. PLOS ONE, 18(12). https://doi.org/10.1371/journal.pone.0294701
Kholiq, M. H. R., Wiranto, W., and Sihwi, S. W. (2022). News Classification Using Light Gradient Boosted Machine Algorithm. Indonesian Journal of Electrical Engineering and Computer Science, 27(1), 206–213. https://doi.org/10.11591/ijeecs.v27.i1.pp206-213
Killi, C. B. R., Balakrishnan, N., and Rao, C. S. (2022). Classification of Fake News Using Deep Learning-Based GloVE-LSTM Model. International Journal of Safety and Security Engineering, 12(5), 631–637. https://doi.org/10.18280/ijsse.120512
Pierri, F., and Ceri, S. (2019). False News on Social Media. ACM SIGMOD Record, 48(2), 18–27. https://doi.org/10.1145/3377330.3377334
Rezaei, S., Kahani, M., Behkamal, B., and Jalayer, R. (2022). Early Multi-Class Ensemble Based Fake News Detection Using Content Features. Research Square. https://doi.org/10.21203/rs.3.rs-2217954/v1
Sharma, K., Qian, F., Jiang, H., Ruchansky, N., Zhang, M., and Liu, Y. (2022). Combating Fake News: A Survey on Identification and Mitigation Techniques. arXiv. https://doi.org/10.48550/arxiv.1901.06437
Wang, Y., Wang, L., Yang, Y., and Zhang, Y. (2022). Detecting Fake News by Enhanced Text Representation With Multi-EDU-Structure Awareness. Expert Systems With Applications, 206, 117781. https://doi.org/10.1016/j.eswa.2022.117781
This work is licensed under a: Creative Commons Attribution 4.0 International License
© DigiSecForensics 2026. All Rights Reserved.