Setting Up a Free Chat Moderation AI Service on Your Own Server

Introduction

In the digital age, ensuring online interactions remain civil and respectful is paramount. Whether you run a community forum, a chat service, or any platform where users can interact, moderating content becomes essential. While there are numerous external services that offer chat moderation, relying on them may not always be ideal due to cost, privacy concerns, or latency issues. But what if you could set up your own chat moderation AI service directly on your server, giving you full control and flexibility? In this article, we delve into how you can easily set up and run your own chat moderation software for free. From installing the necessary components on an Ubuntu server to creating an API service and even integrating it with Java and JavaScript, we’ll guide you step by step.

user avatar
I have a Linux Ubuntu server where I would like to run a chat moderation AI service. I do not want to use any external API services. The chat moderation AI model should support multilanguage moderation. I want you to provide me with a list of available free models that I can run on my 16GB RAM Ubuntu server. Then, in the next questions, I will ask you for technical details about installing the picked model and creating an API service on top of it.

Certainly! Running a chat moderation AI service on your own server is a great way to maintain control and privacy. Here's a list of free models and frameworks that can be used for chat moderation, especially those that can be run on a server with 16GB of RAM:

  1. spaCy:

    • Description: spaCy is a popular NLP library that can be used to build custom text moderation tools. While it doesn't offer out-of-the-box moderation models, it's powerful enough to build your own based on labeled datasets.
    • Languages Supported: Multiple languages including English, German, Spanish, French, etc.
    • Website: spaCy
  2. TextBlob:

    • Description: TextBlob is a simpler NLP library based on NLTK and Pattern. It's good for basic text processing and sentiment analysis, which can be useful in a moderation context.
    • Languages Supported: Multiple languages, though English is the primary focus.
    • Website: TextBlob
  3. Perspective API (Jigsaw's Conversation AI):

    • Description: Developed by Jigsaw and Google's Counter Abuse Technology team, Perspective scores comments for toxicity. Note that it is a hosted API: the production models are not published for local use, so it does not fit a setup that avoids external services.
    • Languages Supported: Multiple languages.
    • Website: Perspective API
  4. Detoxify:

    • Description: Detoxify is a Python library that uses deep learning to classify toxic comments. The models are based on the Transformers library.
    • Languages Supported: The original and unbiased models target English; the multilingual model was trained on English, French, Spanish, Italian, Portuguese, Turkish, and Russian.
    • Website: Detoxify
  5. NLTK (Natural Language Toolkit):

    • Description: NLTK is a leading platform for building Python programs to work with human language data. It's more of a general-purpose NLP library, but with the right datasets, it can be adapted for chat moderation.
    • Languages Supported: Primarily English, but there are tools and corpora for other languages.
    • Website: NLTK

Given the requirements you mentioned and the hardware constraints (16GB RAM), I'd recommend starting with either Detoxify or spaCy. Detoxify has pre-trained models specifically for toxic comment classification, while spaCy offers flexibility to build your own moderation tools based on custom datasets.

Once you decide on a model or library, I can guide you through the installation process and how to set up an API service on top of it. Which one would you like to proceed with, or would you like more information on any of the options listed?

user avatar
Since I do not have a dataset, resources, or the knowledge to train a custom model, I would like to continue with Detoxify. I have a fresh Ubuntu server with nothing installed on it. Can you please provide me with the step-by-step commands that I need to run to install that model and create an API service on top of it?

Certainly! Let's proceed with setting up Detoxify on your Ubuntu server and then creating an API service on top of it.

Step 1: Install Python and Required Libraries

  1. Update your package lists:
bash
sudo apt update && sudo apt upgrade -y
  2. Install Python3 and pip3:
bash
sudo apt install python3 python3-pip -y
  3. Install virtualenv to create an isolated Python environment:
bash
pip3 install virtualenv

Step 2: Setup Virtual Environment and Install Detoxify

  1. Create a new directory for your project and navigate to it:
bash
mkdir detoxify_project && cd detoxify_project
  2. Create a virtual environment and activate it:
bash
virtualenv venv
source venv/bin/activate
  3. Install Detoxify and its dependencies:
bash
pip install detoxify

Step 3: Create a Simple API Service using Flask

  1. Install Flask and other required libraries:
bash
pip install flask flask_restful
  2. Create a new Python file for your API:
bash
touch app.py
  3. Edit the file using nano (or any text editor you prefer):
bash
nano app.py
  4. Add the following code to app.py:
python
from flask import Flask, request
from flask_restful import Resource, Api
from detoxify import Detoxify

app = Flask(__name__)
api = Api(app)

# Load the model once at startup; loading it on every request is slow.
# 'original' is English-only; 'multilingual' covers more languages.
model = Detoxify('original')


class ModerateText(Resource):
    def post(self):
        text = request.json['text']
        results = model.predict(text)
        # Detoxify returns NumPy floats, which are not JSON serializable.
        return {label: float(score) for label, score in results.items()}


api.add_resource(ModerateText, '/moderate')

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=5000)

This code sets up a simple Flask API with one endpoint, /moderate, that expects a POST request with JSON content containing the text to be moderated.
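The endpoint reads request.json['text'] directly, so a missing or non-string field will typically surface as a server error rather than a clear client error. Below is a minimal validation sketch using only the standard library. The helper name extract_text and the 2,000-character limit are illustrative assumptions, not part of the original service.

```python
from typing import Any, Optional, Tuple

MAX_CHARS = 2000  # assumed limit; tune it for your chat


def extract_text(payload: Any) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "request body must be a JSON object"
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        return None, "'text' must be a non-empty string"
    if len(text) > MAX_CHARS:
        return None, f"'text' must be at most {MAX_CHARS} characters"
    return text, None
```

Inside post(), one option is to call extract_text(request.get_json(silent=True)) and return the error message with status 400 when it is set.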

Step 4: Run the API Service

  1. Start your Flask application:
bash
python app.py

The API will be running on port 5000, and you can send a POST request to http://your_server_ip:5000/moderate with a JSON body containing the text to be moderated.
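For a quick check from Python without installing extra packages, the request can be built with the standard library. This is a sketch, not the only way to call the service: your_server_ip is the same placeholder as above, and the helper names build_request and moderate are illustrative.

```python
import json
import urllib.request

API_URL = "http://your_server_ip:5000/moderate"  # placeholder host


def build_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request with a JSON body of the form {"text": ...}."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def moderate(text: str, url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the text to the service and return the decoded score dictionary."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling moderate("Your test message here") should return a dictionary of category scores once the server is running.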

Step 5: Secure Your API (Optional but Recommended)

For production use, you'll want to use a production-ready server like Gunicorn, secure your API with HTTPS using Nginx and Let's Encrypt, and possibly use a firewall like UFW.

If you'd like detailed steps on these, please let me know, and I'll guide you through it!

Remember that debug=True is not meant for production: it enables the interactive Werkzeug debugger, which exposes stack traces and can allow code execution on the server for anyone who reaches it. Set debug=False or remove the argument entirely before going live.
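One way to avoid leaving the flag on by accident is to read it from the environment, so that it is off unless explicitly requested. A small sketch; the variable name MODERATION_DEBUG is a hypothetical choice for illustration.

```python
import os
from typing import Mapping


def debug_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Debug is on only when MODERATION_DEBUG is set to a truthy value."""
    return env.get("MODERATION_DEBUG", "").strip().lower() in {"1", "true", "yes"}
```

The result can then be passed to the server start, for example app.run(debug=debug_enabled(), host='0.0.0.0', port=5000).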

user avatar
Let's assume I correctly installed everything as you described. Can you write me a simple Java class that tests a chat message by calling this API? I need exact code that works with the exact structure of the request and response.

Certainly! Let's create a simple Java class to interact with the Flask API we set up for Detoxify. We'll use the java.net.HttpURLConnection class to make the HTTP request and the org.json library to handle JSON data.

Prerequisites:

You'll need the org.json library. If you're using Maven, you can include it in your pom.xml:

xml
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20210307</version>
</dependency>

Java Class to Test the Chat Message:

java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import org.json.JSONObject;

public class DetoxifyAPITest {

    private static final String API_URL = "http://your_server_ip:5000/moderate";

    public static void main(String[] args) {
        String testMessage = "Your test message here";
        JSONObject response = testChatMessage(testMessage);
        if (response != null) {
            System.out.println(response.toString(4)); // Pretty print JSON response
        }
    }

    public static JSONObject testChatMessage(String message) {
        try {
            URL url = new URL(API_URL);
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("Content-Type", "application/json; charset=utf-8");
            con.setRequestProperty("Accept", "application/json");
            con.setDoOutput(true);

            // Create JSON request body
            JSONObject jsonRequest = new JSONObject();
            jsonRequest.put("text", message);

            // Send the POST request
            try (OutputStream os = con.getOutputStream()) {
                byte[] input = jsonRequest.toString().getBytes(StandardCharsets.UTF_8);
                os.write(input, 0, input.length);
            }

            // Read and return the response
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
                StringBuilder response = new StringBuilder();
                String responseLine;
                while ((responseLine = br.readLine()) != null) {
                    response.append(responseLine.trim());
                }
                return new JSONObject(response.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
}

Make sure to replace your_server_ip with the actual IP address or domain name of your server. When you run the DetoxifyAPITest class, it will send the test message to the API and print the JSON response.

The response will contain toxicity scores for different categories, which you can interpret and decide if a message should be moderated or allowed.

user avatar
What will the JSON response look like? Can you write me an example of a response that found the text contains sexual harassment? And another example of a response where the API call found nothing harmful?

The Detoxify model provides scores indicating the likelihood that a given text falls into various categories of harmful content. The output from the model will have a score between 0 and 1 for each category. A higher score indicates a higher likelihood of that category being present in the text.

Here's a sample structure of the response:

json
{
  "toxicity": 0.95,
  "severe_toxicity": 0.75,
  "obscene": 0.65,
  "threat": 0.15,
  "insult": 0.85,
  "identity_attack": 0.25
}

Given the above structure (the numbers below are illustrative, not actual model output):

  1. Example response for a text containing sexual harassment:
json
{
  "toxicity": 0.92,
  "severe_toxicity": 0.82,
  "obscene": 0.88,
  "threat": 0.12,
  "insult": 0.86,
  "identity_attack": 0.28
}

In this example, high scores in the toxicity, severe_toxicity, obscene, and insult categories suggest that the text is likely harmful. The original model has no dedicated sexual harassment category, so such content shows up indirectly through these scores.

  2. Example response for a text where there's nothing harmful:
json
{
  "toxicity": 0.05,
  "severe_toxicity": 0.02,
  "obscene": 0.03,
  "threat": 0.01,
  "insult": 0.04,
  "identity_attack": 0.02
}

Here, the scores are low across all categories, suggesting that the text is likely harmless.

When interpreting the results, you might want to set threshold values for each category based on which you'd take action. For example, if the toxicity score is above 0.75, you might decide to flag the text for review or block it outright. The exact thresholds will depend on your specific use case and the level of moderation you want to enforce.
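The per-category thresholds described above can be sketched as a small lookup plus a comparison. The cut-off values and the names flagged_categories and decide are placeholders chosen for illustration; they are not tuned recommendations.

```python
from typing import Dict, List

# Assumed per-category cut-offs; placeholders, not tuned values.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "toxicity": 0.75,
    "severe_toxicity": 0.50,
    "threat": 0.50,
}
FALLBACK_THRESHOLD = 0.80  # used for categories without an explicit entry


def flagged_categories(
    scores: Dict[str, float],
    thresholds: Dict[str, float] = DEFAULT_THRESHOLDS,
    fallback: float = FALLBACK_THRESHOLD,
) -> List[str]:
    """Return the categories whose score exceeds their threshold, sorted by name."""
    return sorted(
        name
        for name, score in scores.items()
        if score > thresholds.get(name, fallback)
    )


def decide(scores: Dict[str, float]) -> str:
    """Map scores to 'review' if any category is flagged, otherwise 'allow'."""
    return "review" if flagged_categories(scores) else "allow"
```

Keeping the thresholds in one dictionary makes it easy to tighten a single category, such as threat, without touching the rest of the logic.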

user avatar
Thank you, now I understand what I can expect and how to work with that. Since I would like to use this API call not only from my Java server but also directly from my webpage, can you provide me with a JavaScript equivalent of that Java class?

Absolutely! Here's a JavaScript equivalent using the Fetch API to make the call to your Detoxify moderation service. You can integrate this code directly into your web page:

javascript
const API_URL = "http://your_server_ip:5000/moderate";

async function testChatMessage(message) {
  try {
    const response = await fetch(API_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text: message })
    });
    const jsonResponse = await response.json();
    console.log(jsonResponse); // Print the response to the console
    return jsonResponse;
  } catch (error) {
    console.error("Error calling the Detoxify API:", error);
    return null;
  }
}

// Test the function
testChatMessage("Your test message here");

Make sure to replace your_server_ip with the actual IP address or domain name of your server.

A Note on Cross-Origin Resource Sharing (CORS): If you're calling this API from a web page that's hosted on a different domain or even a different port than your server, you'll run into CORS issues. To handle this, you'd need to add CORS headers to your Flask server responses. The simplest way to do this is to use the Flask-CORS extension.

  1. Install Flask-CORS:
bash
pip install Flask-CORS
  2. Modify your Flask app (app.py) to use Flask-CORS:
python
from flask_cors import CORS
# ... (rest of your imports)

app = Flask(__name__)
CORS(app)  # This will enable CORS for all routes

This will add the necessary headers to responses from your Flask server, allowing you to make requests to the API from web pages hosted on different origins. Make sure to properly configure CORS in a production environment to only allow specific origins for security reasons.
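To make the idea of allowing only specific origins concrete, here is a standard-library sketch of what an allow-list means at the header level. The origin https://chat.example.com and the function name cors_headers are hypothetical. With Flask-CORS itself, a comparable restriction is usually expressed through the origins argument of CORS(...).

```python
from typing import Dict, Iterable, Optional

# Hypothetical allow-list; replace with the origins that serve your pages.
ALLOWED_ORIGINS = frozenset({"https://chat.example.com"})


def cors_headers(
    origin: Optional[str], allowed: Iterable[str] = ALLOWED_ORIGINS
) -> Dict[str, str]:
    """Return CORS response headers for an allowed origin, or {} otherwise."""
    if origin is None or origin not in allowed:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
        "Vary": "Origin",
    }
```

A browser only exposes the response to page scripts when the Access-Control-Allow-Origin header matches the page's origin, so returning no headers for unknown origins blocks them from reading the result.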

user avatar
Thank you. Can Flask handle multiple parallel calls? My website has many users, and I would like to make sure the Flask API server can handle such load.

Flask's built-in server is a development server. Since Flask 1.0 it handles requests in separate threads by default, but it is not designed for the performance, stability, or security that production traffic demands. If you're expecting a lot of traffic or many concurrent users, it won't be sufficient. However, there are several ways to scale and optimize Flask to handle multiple parallel requests:

  1. Using Gunicorn:

    Gunicorn is a production-ready WSGI server that can spawn multiple worker processes or threads to handle requests. By using Gunicorn, you can effectively handle multiple requests in parallel.

    To use Gunicorn:

    • Install it:
      bash
      pip install gunicorn
    • Run your Flask application with Gunicorn:
      bash
      gunicorn -w 4 -b 0.0.0.0:5000 app:app

    Here, -w 4 specifies that you want to run 4 worker processes. Adjust this based on your server's CPU cores and expected traffic, and keep in mind that by default each worker process loads its own copy of the model into memory.

  2. Using Nginx as a Reverse Proxy:

    You can put Nginx in front of Flask (and Gunicorn) to serve as a reverse proxy. Nginx can handle client connections, serve static content, load balance between multiple Gunicorn instances, and more. This is a common setup for deploying Flask applications in production.

  3. Scaling Horizontally:

    If the traffic becomes too much for a single server, you can deploy your Flask app on multiple servers and use a load balancer (like Nginx or a cloud provider's load balancer) to distribute incoming requests among the servers.

  4. Using Threaded Mode:

    If you still want to use Flask's built-in server for some reason (though it's not recommended for production), its threaded mode lets it handle multiple requests at once. Flask 1.0 and later enable it by default; passing threaded=True makes the setting explicit and is required on older versions:

    python
    if __name__ == '__main__':
        app.run(debug=True, host='0.0.0.0', port=5000, threaded=True)
  5. Rate Limiting:

    To protect against abuse or an unexpected surge in requests, consider implementing rate limiting. Flask has extensions like Flask-Limiter that can help with this.

  6. Database Connections:

    If your Flask application interacts with a database, ensure that the database connection pool is adequately sized to handle multiple concurrent requests.
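For a model-backed service, the worker count from item 1 is limited by memory as well as CPU, since each worker may hold its own copy of the model. A rough sizing sketch follows. The (2 x cores) + 1 starting point comes from the Gunicorn documentation, while the function name and the per-worker memory figure used in the example are assumptions that you should replace with measurements from your own server.

```python
def suggest_workers(cpu_cores: int, ram_budget_gb: float, per_worker_gb: float) -> int:
    """Pick a worker count limited by both CPU and memory, never below 1."""
    cpu_bound = 2 * cpu_cores + 1  # common starting point from the Gunicorn docs
    memory_bound = int(ram_budget_gb // per_worker_gb)  # workers that fit in RAM
    return max(1, min(cpu_bound, memory_bound))


# Example: 4 cores, 12 GB set aside for workers, an assumed 1.5 GB per worker.
print(suggest_workers(cpu_cores=4, ram_budget_gb=12.0, per_worker_gb=1.5))
```

The result would be passed to Gunicorn through the -w option shown earlier.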

Remember, while these optimizations allow Flask to handle more requests concurrently, there's still a limit. Regularly monitoring server metrics (like CPU, memory usage, response times) and adjusting configurations or scaling up/down as necessary is crucial for maintaining a responsive and reliable service.
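The rate limiting mentioned in item 5 can be illustrated with a small sliding-window counter. This is a standard-library sketch of the idea, not how Flask-Limiter is implemented. The class name and parameters are illustrative, and the state lives in one process only, so several Gunicorn workers would each keep separate counts.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(
        self,
        limit: int,
        window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record a request and report whether it is within the limit."""
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

In a request handler, client_id could be the caller's IP address or user ID, and a False result would translate into an HTTP 429 response.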

user avatar
Thank you very much for your help.

You're welcome! If you have any more questions in the future, don't hesitate to reach out. Best of luck with your chat moderation AI service and website! Have a great day!


Summary

Online interactions have the potential to enrich our experiences, but they also come with the risk of toxic or harmful content. Moderation is the key to fostering positive interactions, and while many turn to external services, there's the option to set up your own AI-powered moderation tool. This article presents a comprehensive guide on how to install, run, and integrate a chat moderation AI service on an Ubuntu server using the Detoxify model. By the end of this guide, readers will have insights into:

  • Installing and setting up the Detoxify model on an Ubuntu server.
  • Creating a Flask API to serve the moderation model.
  • Scaling the Flask application to handle multiple requests.
  • Integrating the service with Java and JavaScript applications.

Join us as we explore the steps to achieve a self-hosted, efficient, and free chat moderation system!
