dataproc

This module defines functions for reading and pre-processing datasets.

Functions

conv_str_fl(data) It converts string data to float for computation.
read_data(filename[, header]) It converts a CSV dataset to NumPy arrays for further operations like training the TwinSVM classifier.
read_libsvm(filename) It reads LIBSVM data files for doing classification using the TwinSVM model.
dataproc.conv_str_fl(data)

It converts string data to float for computation.

Parameters:

data : array-like, shape (n_samples, n_features)

Training samples, where n_samples is the number of samples and n_features is the number of features.

Returns:

array-like

A numerical dataset that is suitable for further computation.
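
The conversion described above can be illustrated with a minimal sketch. This is not the module's actual implementation: `conv_str_fl_sketch` is a hypothetical stand-in, and it assumes every cell holds a string that parses as a number.

```python
import numpy as np


def conv_str_fl_sketch(data):
    """Hypothetical stand-in for dataproc.conv_str_fl.

    Assumes every cell is a string that parses as a number;
    NumPy raises ValueError otherwise.
    """
    # Build a string array first, then cast each element to float64.
    return np.asarray(data).astype(np.float64)


# Two samples with two features each, stored as strings (as read from a text file).
rows = [["1.5", "2"], ["3", "4.25"]]
numeric = conv_str_fl_sketch(rows)
print(numeric.dtype, numeric.shape)
```

The result keeps the (n_samples, n_features) layout of the input, so it can be passed on to later steps unchanged.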

dataproc.read_data(filename, header=True)

It converts a CSV dataset to NumPy arrays for further operations like training the TwinSVM classifier.

Parameters:

filename : str

Path to the dataset file.

header : boolean, optional (default=True)

If True, the first row of the dataset, which contains the header names, is ignored.

Returns:

data_train : array-like, shape (n_samples, n_features)

Training samples in NumPy array.

data_labels : array-like, shape (n_samples,)

Class labels of training samples.

file_name : str

Dataset’s filename.
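
To make the three return values concrete, here is a rough sketch of a reader with the same signature. It is not the module's source: `read_data_sketch` is a hypothetical name, and two details are assumptions that this page does not state, namely that class labels sit in the first CSV column and that the returned file name is the base name without its extension.

```python
import csv
import os
import tempfile

import numpy as np


def read_data_sketch(filename, header=True):
    """Hypothetical stand-in for dataproc.read_data."""
    with open(filename, newline="") as f:
        rows = list(csv.reader(f))
    if header:
        rows = rows[1:]  # skip the row of column names
    table = np.array(rows).astype(np.float64)
    data_labels = table[:, 0].astype(int)  # ASSUMPTION: labels in first column
    data_train = table[:, 1:]              # remaining columns are features
    # ASSUMPTION: "filename" means the base name without extension.
    file_name = os.path.splitext(os.path.basename(filename))[0]
    return data_train, data_labels, file_name


# Write a tiny CSV to a temporary directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.csv")
    with open(path, "w", newline="") as f:
        f.write("label,f1,f2\n1,0.5,1.5\n-1,2.0,3.0\n")
    X, y, name = read_data_sketch(path)

print(X.shape, y, name)
```

Check the layout of your own CSV files against the real function before relying on these assumptions.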

dataproc.read_libsvm(filename)

It reads LIBSVM data files for doing classification using the TwinSVM model.

Parameters:

filename : str

Path to the LIBSVM data file.

Returns:

array-like

Training samples.

array-like

Class labels of training samples.

str

Dataset’s filename.
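
For readers unfamiliar with the LIBSVM format, each line holds a label followed by sparse `index:value` pairs with 1-based feature indices. The sketch below parses such a file into a dense array. It is only an illustration: `read_libsvm_sketch` is a hypothetical name, the dense output and the extension-free file name are assumptions, and the real function may well delegate to an existing loader such as scikit-learn's `load_svmlight_file`.

```python
import os
import tempfile

import numpy as np


def read_libsvm_sketch(filename):
    """Hypothetical stand-in for dataproc.read_libsvm (dense output assumed)."""
    labels, rows, n_features = [], [], 0
    with open(filename) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            labels.append(float(parts[0]))
            feats = {}
            for item in parts[1:]:
                idx, val = item.split(":")
                feats[int(idx)] = float(val)
            if feats:
                n_features = max(n_features, max(feats))
            rows.append(feats)
    # Features that are absent from a line are zero by convention.
    X = np.zeros((len(rows), n_features))
    for i, feats in enumerate(rows):
        for idx, val in feats.items():
            X[i, idx - 1] = val  # LIBSVM indices start at 1
    # ASSUMPTION: "filename" means the base name without extension.
    name = os.path.splitext(os.path.basename(filename))[0]
    return X, np.array(labels), name


# Write a tiny LIBSVM file to a temporary directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.libsvm")
    with open(path, "w") as f:
        f.write("+1 1:0.5 3:2.0\n-1 2:1.0\n")
    X_sv, y_sv, name_sv = read_libsvm_sketch(path)

print(X_sv, y_sv, name_sv)
```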