Large-Model-Driven Content Moderation Systems: Platform-Level Applications and Governance Practices

Rui Li

doi:10.71222/d58yzk94

Authors

Rui Li, Independent Researcher, China

DOI:

https://doi.org/10.71222/d58yzk94

Keywords:

Content Moderation, Large Models, Platform Governance, AI Ethics, Social Media

Abstract

This research article delves into the application and governance of large-model-driven content moderation systems on social media platforms. As the deployment of these AI systems becomes more widespread, understanding their capabilities and limitations is crucial. The study reviews technological advances in large models for content moderation and explores their effectiveness in identifying inappropriate content. The article also examines governance practices intended to ensure ethical and unbiased application. Using quantitative methods, the research assesses the performance metrics of these models in real-world contexts and discusses their implications for users and policymakers. Key findings highlight the balance between algorithmic efficiency and human oversight and provide insights into future improvements in content moderation systems.

