dataproc

This module defines functions for reading and pre-processing datasets.

Functions

`conv_str_fl`(data)	Converts string data to floats for computation.
`read_data`(filename[, header])	Reads a CSV dataset into NumPy arrays for further operations such as training the TwinSVM classifier.
`read_libsvm`(filename)	Reads a LIBSVM data file for classification with the TwinSVM model.

dataproc.conv_str_fl(data)

Converts string data to floats for computation.

Parameters:

data : array-like, shape (n_samples, n_features)

Training samples, where n_samples is the number of samples and n_features is the number of features.

Returns:

array-like

A numerical dataset suitable for further computation.
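The conversion described above can be illustrated with a minimal sketch. The helper name `to_float_array` is hypothetical and is not part of `dataproc`; it only shows the kind of string-to-float conversion that `conv_str_fl` is documented to perform, assuming every cell holds a valid numeric string.

```python
import numpy as np


def to_float_array(rows):
    """Hypothetical stand-in for conv_str_fl (not the library's code).

    Converts a 2-D collection of numeric strings into a float64 array.
    """
    return np.array([[float(cell) for cell in row] for row in rows],
                    dtype=np.float64)


# Two samples with two features each, stored as strings.
raw = [["1.5", "2"], ["-3", "4e-1"]]
numeric = to_float_array(raw)
print(numeric.shape, numeric.dtype)
```

A non-numeric cell would raise `ValueError` in this sketch; how `conv_str_fl` itself handles such input is not stated here.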

dataproc.read_data(filename, header=True)

Reads a CSV dataset into NumPy arrays for further operations such as training the TwinSVM classifier.

Parameters:

filename : str

Path to the dataset file.

header : boolean, optional (default=True)

If True, skips the first row of the dataset, which contains the header names.

Returns:

data_train : array-like, shape (n_samples, n_features)

Training samples as a NumPy array.

data_labels : array-like, shape (n_samples,)

Class labels of training samples.

file_name : str

Dataset’s filename.
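As a rough illustration of the three return values, the sketch below reads a small CSV file with the standard library. It is not the `dataproc` implementation: the function name `read_csv_sketch` is hypothetical, and two details are assumptions made only for this example, namely that the class label sits in the first column and that the returned name is the file's base name without its extension.

```python
import csv
import os
import tempfile

import numpy as np


def read_csv_sketch(filename, header=True):
    """Hypothetical sketch of a read_data-style loader (not the library's code)."""
    with open(filename, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    if header:
        rows = rows[1:]  # skip the row of column names
    values = np.array([[float(cell) for cell in row] for row in rows])
    labels = values[:, 0].astype(int)  # assumption: label is the first column
    samples = values[:, 1:]            # remaining columns are features
    name = os.path.splitext(os.path.basename(filename))[0]  # assumption
    return samples, labels, name


# Write a tiny dataset to a temporary file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.csv")
    with open(path, "w", newline="") as f:
        f.write("label,f1,f2\n1,0.5,1.0\n-1,2.0,3.5\n1,0.0,4.0\n")
    X, y, dataset_name = read_csv_sketch(path, header=True)

print(X.shape, y, dataset_name)
```

Calling `dataproc.read_data` follows the same pattern, with `header=False` for files that have no row of column names.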

dataproc.read_libsvm(filename)

Reads a LIBSVM data file for classification with the TwinSVM model.

Parameters:

filename : str

Path to the LIBSVM data file.

Returns:

array-like

Training samples.

array-like

Class labels of training samples.

str

Dataset’s filename.
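For readers unfamiliar with the LIBSVM format: each line holds a label followed by sparse `index:value` pairs with 1-based feature indices. The sketch below parses such a file into dense arrays. It is an illustration only; the name `read_libsvm_sketch` is hypothetical, and both the dense output and the returned base name are assumptions rather than documented behaviour of `read_libsvm`.

```python
import os
import tempfile

import numpy as np


def read_libsvm_sketch(filename):
    """Hypothetical sketch of a LIBSVM reader (not the library's code)."""
    labels, rows, n_features = [], [], 0
    with open(filename) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            labels.append(float(parts[0]))
            pairs = {}
            for item in parts[1:]:
                index, value = item.split(":")
                pairs[int(index)] = float(value)
            if pairs:
                n_features = max(n_features, max(pairs))
            rows.append(pairs)
    # Features that are absent from a line are implicitly zero.
    samples = np.zeros((len(rows), n_features))
    for i, pairs in enumerate(rows):
        for index, value in pairs.items():
            samples[i, index - 1] = value  # LIBSVM indices start at 1
    name = os.path.splitext(os.path.basename(filename))[0]  # assumption
    return samples, np.array(labels), name


# Write a two-sample file and parse it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.libsvm")
    with open(path, "w") as f:
        f.write("+1 1:0.5 3:2.0\n-1 2:1.5\n")
    X_svm, y_svm, svm_name = read_libsvm_sketch(path)

print(X_svm, y_svm, svm_name)
```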