NetuArk Posts Classifier (Ensemble Architecture)

This model is an ensemble classifier that assigns technology-related social media posts to the news source they came from. It is trained to distinguish seven sources: ArsTechnica, FT, GuardianTech, HackerNews, Slashdot, TechCrunch, and TheVerge.

Model Details

  • Architecture: Voting Classifier (Multinomial Naive Bayes + Logistic Regression)
  • Vectorization: TF-IDF (N-grams 1-3)
  • Accuracy: 99.81% on the NetuArk-6000 dataset.
  • Classes: HackerNews, TechCrunch, TheVerge, FT, GuardianTech, Slashdot, ArsTechnica.
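
The architecture listed above can be sketched in scikit-learn roughly as follows. This is a minimal illustration, not the published training code: the card does not state the voting mode, hyperparameters, or whether n-grams are word- or character-level, so `voting="soft"`, `max_iter=1000`, the default word-level analyzer, and the helper name `build_ensemble` are all assumptions. The toy posts are invented for the example and do not come from the dataset.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def build_ensemble(voting="soft"):
    """TF-IDF (1- to 3-grams) feeding a Naive Bayes + Logistic Regression vote."""
    ensemble = VotingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("lr", LogisticRegression(max_iter=1000)),  # assumed setting
        ],
        voting=voting,  # assumption: the card does not say hard or soft voting
    )
    return make_pipeline(TfidfVectorizer(ngram_range=(1, 3)), ensemble)


# Tiny made-up posts, for illustration only (not from the real dataset).
toy_posts = [
    "Show HN: a tiny Rust web server",
    "Ask HN: how do you manage dotfiles?",
    "Startup raises $20M Series A to build AI agents",
    "Fintech startup closes seed round led by investors",
]
toy_labels = ["HackerNews", "HackerNews", "TechCrunch", "TechCrunch"]

sketch_model = build_ensemble()
sketch_model.fit(toy_posts, toy_labels)
print(sketch_model.predict(["Show HN: my weekend project"]))
```

Wrapping the vectorizer and the voting classifier in one pipeline means the fitted object accepts raw strings directly, which matches how the released model is called in the Usage section.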

Training Data

Trained on the Xerv-AI/netuark-posts-6000 dataset.

Usage

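import joblib

# Load the serialized classifier; predict() takes a list of raw post strings.
model = joblib.load('netuark_ensemble_classifier.joblib')
prediction = model.predict(["New AI breakthrough on HackerNews"])
print(prediction)  # one predicted source label per input post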
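
For classifying several posts at once, a small helper along these lines may be convenient. It is a sketch only: `label_posts` and `MODEL_PATH` are hypothetical names introduced here, the sample posts are invented, and the helper assumes nothing about the model beyond a `predict` method that accepts a list of strings.

```python
import os

import joblib

# File name taken from the usage snippet; adjust to wherever the file was saved.
MODEL_PATH = "netuark_ensemble_classifier.joblib"


def label_posts(model, posts):
    """Return (post, predicted_source) pairs for an iterable of post strings."""
    posts = list(posts)
    predictions = model.predict(posts)
    return [(post, str(label)) for post, label in zip(posts, predictions)]


# Only runs when the model file is actually present locally.
if os.path.exists(MODEL_PATH):
    model = joblib.load(MODEL_PATH)
    sample_posts = [
        "Show HN: I built a static site generator",
        "Startup raises new funding round for AI chips",
    ]
    for post, source in label_posts(model, sample_posts):
        print(f"{source}: {post}")
```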
