Theoretical and Natural Science
- The Open Access Proceedings Series for Conferences
Vol. 25, 20 December 2023
In an age of abundant information, the relevance of social networks has grown in a manner resembling exponential growth. The content shared on these platforms can reflect the emotional states of the people who post it. This study highlights the importance of emotions and uses a literature review to analyze random sampling bias and its effects on sentiment analysis. The findings indicate that random sampling bias is neither sporadic nor insignificant, and that multiple types of bias are present. Because a distorted sentiment analysis could have repercussions, a carefully focused method is necessary.
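As an illustration of how sampling can distort a sentiment estimate, the following minimal sketch compares the average sentiment across users with the average sentiment across posts, where frequent posters are over-represented. It is not the study's method. The group names, population shares, posting rates, and sentiment scores are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """A hypothetical user segment (all values are illustrative assumptions)."""
    name: str
    population_share: float  # fraction of all users in this group
    posts_per_user: float    # average number of posts per user
    mean_sentiment: float    # average sentiment score in [-1, 1]


def population_mean(groups):
    """Average sentiment when every user counts equally."""
    return sum(g.population_share * g.mean_sentiment for g in groups)


def post_level_mean(groups):
    """Expected average sentiment when posts, not users, are sampled uniformly.

    Each group's weight is proportional to the number of posts it produces,
    so groups that post more often dominate the estimate.
    """
    total_posts = sum(g.population_share * g.posts_per_user for g in groups)
    weighted = sum(
        g.population_share * g.posts_per_user * g.mean_sentiment for g in groups
    )
    return weighted / total_posts


# Hypothetical numbers: a small, very active, negative group and a large,
# quieter, mildly positive group.
GROUPS = [
    Group("frequent posters", population_share=0.2, posts_per_user=10.0, mean_sentiment=-0.4),
    Group("occasional posters", population_share=0.8, posts_per_user=1.0, mean_sentiment=0.3),
]

if __name__ == "__main__":
    print(f"user-level mean sentiment: {population_mean(GROUPS):+.2f}")
    print(f"post-level mean sentiment: {post_level_mean(GROUPS):+.2f}")
```

Under these assumed numbers the user-level mean is positive (0.16) while the post-level mean is negative (-0.20), so a uniform sample of posts would suggest the opposite overall mood from the one held by the user population.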
Social Media Data, Random Sampling Bias, Sentiment Analysis
The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.