Oversampling & Undersampling in TF-IDF for Extremely Imbalanced Indonesian Text Classification [Video]

Comparing Oversampling and Undersampling in Different TF-IDF Vectorizers for Extremely Imbalanced Indonesian Language Short Text Classification
Authors: I Nyoman Prayana Trisna, Ni Wayan Emmy Rosiana Dewi, Muhammad Alam Pasirulloh (TELK 26510)

Although TF-IDF is considered a traditional method compared with more recent algorithms, it still produces excellent results in a range of text mining tasks. This study assesses the effectiveness of several TF-IDF variants for short text classification on an extremely imbalanced Indonesian-language dataset. We pair standard, log-scaled, and boolean TF-IDF with both undersampling and oversampling methods, and evaluate each experiment using precision, recall, and F-measure. Combining boolean TF-IDF with oversampling yields the best results. Oversampling outperforms undersampling in every experiment, although in some cases undersampling still yields significant results.
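To make the three TF-IDF variants and the two resampling directions concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say which IDF formula or which specific resampling methods were used, so the smoothed IDF, the random duplication and random discarding, the whitespace tokenizer, the function names `tfidf` and `resample`, and the toy Indonesian sentences are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def tfidf(docs, mode="standard"):
    """Return one {term: weight} dict per document.

    mode picks the term-frequency variant:
      "standard": raw count, "log": 1 + ln(count), "boolean": 1 if present.
    """
    if mode not in ("standard", "log", "boolean"):
        raise ValueError(f"unknown mode: {mode}")
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF: one common convention, assumed here, not taken from the paper.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        weights = {}
        for term, count in Counter(tokens).items():
            if mode == "standard":
                tf = count
            elif mode == "log":
                tf = 1 + math.log(count)
            else:  # boolean: presence only
                tf = 1
            weights[term] = tf * idf[term]
        vectors.append(weights)
    return vectors


def resample(samples, labels, strategy, seed=0):
    """Randomly balance classes.

    "over" duplicates minority samples up to the largest class size;
    "under" discards majority samples down to the smallest class size.
    """
    if strategy not in ("over", "under"):
        raise ValueError(f"unknown strategy: {strategy}")
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if strategy == "over":
            extra = [rng.choice(items) for _ in range(target - len(items))]
            chosen = items + extra
        else:
            chosen = rng.sample(items, target)
        out_samples.extend(chosen)
        out_labels.extend([label] * len(chosen))
    return out_samples, out_labels


# Made-up toy data: four "pos" short texts and one "neg" (hypothetical).
docs = ["bagus bagus sekali", "bagus", "pelayanan bagus",
        "produk bagus cepat", "buruk sekali"]
labels = ["pos", "pos", "pos", "pos", "neg"]

for mode in ("standard", "log", "boolean"):
    print(mode, round(tfidf(docs, mode)[0]["bagus"], 3))

over_docs, over_labels = resample(docs, labels, "over")
under_docs, under_labels = resample(docs, labels, "under")
print(Counter(over_labels), Counter(under_labels))
```

In the first toy document the word "bagus" occurs twice, so the standard weight is twice the boolean weight and the log-scaled weight falls in between. Oversampling grows the minority class to the majority size, while undersampling shrinks the majority class to the minority size, which on extremely imbalanced data discards most of the training examples. In a real experiment, resampling would normally be applied to the training split only, so that the evaluation data keeps its original class distribution.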

TELKOMNIKA Telecommunication, Computing, Electronics and Control
http://telkomnika.uad.ac.id

Supported by Master Program of Electrical and Computer Engineering, Universitas Ahmad Dahlan, https://mee.uad.ac.id #yogyakarta
Admission: https://mee.uad.ac.id/pendaftaran/