Department of Computer and Information Science


Synthetic samples for extremely imbalanced data

Data imbalance is a frequently occurring problem in classification tasks, where the number of samples in one class far exceeds the number in the other classes. Quite often, the minority class is of great importance: it represents the concepts of interest, yet it is difficult to obtain in real datasets. The resulting lack of minority samples leads to poor classification performance when a model is trained on such data. Synthetic data generation techniques such as SMOTE can address this issue, yet such methods suffer from overfitting and substantial noise. This research work aims to create an efficient data generation technique that overcomes the challenges posed by existing state-of-the-art methods.
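
To make the SMOTE idea concrete, the following is a minimal sketch of SMOTE-style oversampling, not the technique this project aims to develop. It assumes numeric feature vectors and uses only the Python standard library. The function name `smote_like`, its parameters, and the toy data are hypothetical and chosen for illustration.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper: a simplified illustration of SMOTE.)"""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and neighbour.
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two features (assumed data, for illustration only).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the generated data stays inside the region the minority class already covers. This is one way to see why such methods can overfit when the minority class is extremely small.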



Ali Shariq Imran
Associate Professor
339 Bygg A