Package simpleml.dataset
April 22, 2022
Table of Contents
- Classes
- Global functions
Class AttributeTransformer
No description available.
Constructor: Class has no constructor.
Class Dataset
A dataset with its data instances (e.g., rows and columns)
Constructor: Class has no constructor.
addAttribute (Instance Method )
Add a new attribute to the dataset with values according to a transformation function
Parameters:
newAttributeId: String- The ID of the new attribute
transformer: AttributeTransformer- The attribute transformer to be used
newAttributeLabel: String? = null- The name of the new attribute
Results:
dataset: Dataset- The updated dataset
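The idea behind addAttribute can be pictured with a small plain-Python sketch. This is not Simple-ML code: rows are modelled as dictionaries, and `add_attribute` and the lambda passed as transformer are hypothetical stand-ins for `addAttribute` and an `AttributeTransformer`.

```python
from typing import Any, Callable

Row = dict[str, Any]


def add_attribute(rows: list[Row], new_attribute_id: str,
                  transformer: Callable[[Row], Any]) -> list[Row]:
    """Return new rows that carry one extra attribute computed per row."""
    # The input rows are left untouched; each output row is a copy
    # extended by the transformer's result.
    return [{**row, new_attribute_id: transformer(row)} for row in rows]


rows = [{"price": 10, "quantity": 3}, {"price": 4, "quantity": 5}]
with_total = add_attribute(rows, "total", lambda r: r["price"] * r["quantity"])
```

Returning a new list mirrors the "updated dataset" result documented above; whether Simple-ML copies or modifies the dataset internally is not stated here.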
dropAllMissingValues (Instance Method )
Drops instances with missing values
Parameters: None expected.
Results:
dataset: Dataset- The updated dataset
dropAttribute (Instance Method )
Drop a single attribute from a dataset
Parameters:
attribute: String- The attribute to drop from the dataset
Results:
dataset: Dataset- The updated dataset
dropAttributes (Instance Method )
Drop attributes from a dataset
Parameters:
vararg attributes: String- The list of attributes to drop from the dataset
Results:
dataset: Dataset- The updated dataset
dropMissingValues (Instance Method )
Drops instances with missing values in the specified attribute
Parameters:
attribute: String- The attribute to check; instances with a missing value in this attribute are dropped
Results:
dataset: Dataset- The updated dataset
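The difference between dropAllMissingValues and dropMissingValues can be sketched in plain Python. The sketch assumes that a missing value is represented by `None` and that rows are dictionaries; `drop_missing` is a hypothetical helper, not part of Simple-ML.

```python
from typing import Any, Optional

Row = dict[str, Any]


def drop_missing(rows: list[Row], attribute: Optional[str] = None) -> list[Row]:
    """Keep rows without missing values, in one attribute or in all of them."""
    if attribute is None:
        # Counterpart of dropAllMissingValues: every value must be present.
        return [r for r in rows if all(v is not None for v in r.values())]
    # Counterpart of dropMissingValues: only the named attribute is checked.
    return [r for r in rows if r.get(attribute) is not None]


rows = [
    {"city": "Bonn", "population": 330000},
    {"city": "Aachen", "population": None},
    {"city": None, "population": 250000},
]
with_population = drop_missing(rows, "population")
complete = drop_missing(rows)
```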
exportDataAsFile (Instance Method )
Export the dataset to a CSV file
Parameters:
filePath: String- No description available.
Results: None returned.
filterInstances (Instance Method )
Remove instances in a dataset according to a filter function
Parameters:
filterFunc: (instance: Instance) -> shouldKeep: Boolean- The filter function that returns either True (keep) or False (remove) for each instance
Results:
dataset: Dataset- The updated dataset
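As a rough illustration of the filter contract (True keeps an instance, False removes it), here is a plain-Python sketch. `filter_instances` is a hypothetical stand-in that works on dictionaries rather than on Simple-ML `Instance` objects.

```python
from typing import Any, Callable

Row = dict[str, Any]


def filter_instances(rows: list[Row],
                     filter_func: Callable[[Row], bool]) -> list[Row]:
    """Keep exactly those rows for which filter_func returns True."""
    return [row for row in rows if filter_func(row)]


rows = [{"name": "Ada", "age": 36}, {"name": "Tim", "age": 12}]
adults = filter_instances(rows, lambda instance: instance["age"] >= 18)
```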
getRow (Instance Method )
Get a specific row of a dataset
Parameters:
rowNumber: Int- The number of the row to be retrieved
Results:
instance: Instance- The specified row
keepAttribute (Instance Method )
Retain a single attribute of a dataset
Parameters:
attribute: String- The attribute to retain in the dataset
Results:
dataset: Dataset- The updated dataset
keepAttributes (Instance Method )
Retain attributes of a dataset
Parameters:
vararg attributes: String- The list of attributes to retain in the dataset
Results:
dataset: Dataset- The updated dataset
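keepAttributes and dropAttributes are complementary projections. A minimal sketch in plain Python, assuming rows are dictionaries; the helper names are hypothetical, and Python's `*attributes` plays the role of the `vararg` parameter.

```python
from typing import Any

Row = dict[str, Any]


def keep_attributes(rows: list[Row], *attributes: str) -> list[Row]:
    """Retain only the named attributes in every row."""
    return [{k: v for k, v in row.items() if k in attributes} for row in rows]


def drop_attributes(rows: list[Row], *attributes: str) -> list[Row]:
    """Remove the named attributes from every row."""
    return [{k: v for k, v in row.items() if k not in attributes} for row in rows]


rows = [{"id": 1, "name": "Ada", "age": 36}]
kept = keep_attributes(rows, "id", "name")
dropped = drop_attributes(rows, "age")
```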
sample (Instance Method )
Create a sample of a dataset
Parameters:
nInstances: Int- Number of instances in the sample
Results:
sample: Dataset- The sampled dataset
setTargetAttribute (Instance Method )
Set the specified attribute as prediction target
Parameters:
targetAttribute: String- The attribute to be predicted later on
Results:
dataset: Dataset- The updated dataset
splitIntoTrainAndTest (Instance Method )
Split a dataset into a training and a test dataset
Parameters:
trainRatio: Float- The percentage of instances to keep in the training dataset
randomState: Int? = null- A random seed to use for splitting
Results:
train: Dataset- The training dataset
test: Dataset- The test dataset
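The roles of trainRatio and randomState can be illustrated with a short plain-Python sketch. It assumes the rows are shuffled before being cut at the ratio and that trainRatio is a fraction between 0 and 1; the page does not state how Simple-ML performs the split, and `split_into_train_and_test` below is a hypothetical helper.

```python
import random
from typing import Any, Optional

Row = dict[str, Any]


def split_into_train_and_test(rows: list[Row], train_ratio: float,
                              random_state: Optional[int] = None
                              ) -> tuple[list[Row], list[Row]]:
    """Shuffle a copy of the rows and cut it at the given ratio."""
    rng = random.Random(random_state)  # a fixed seed makes the split repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


rows = [{"id": i} for i in range(10)]
train, test = split_into_train_and_test(rows, 0.8, random_state=42)
```

Calling the function again with the same seed yields the same partition, which is the usual reason to pass a random state.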
splitIntoTrainAndTestAndLabels (Instance Method )
Split a dataset into four datasets: train/test and labels/features. Requires that a target attribute has been set before via setTargetAttribute()
Parameters:
trainRatio: Float- The percentage of instances to keep in the training dataset
randomState: Int? = null- A random seed to use for splitting
Results:
xTrain: Dataset- Features of the training dataset
xTest: Dataset- Features of the test dataset
yTrain: Dataset- Labels of the training dataset
yTest: Dataset- Labels of the test dataset
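The four results can be pictured as a train/test split followed by a separation of the target attribute from the remaining attributes. The sketch below is plain Python under that assumption; the `target` argument stands in for the attribute previously chosen via setTargetAttribute, and the function name is hypothetical.

```python
import random
from typing import Any, Optional

Row = dict[str, Any]


def split_with_labels(rows: list[Row], target: str, train_ratio: float,
                      random_state: Optional[int] = None):
    """Return (x_train, x_test, y_train, y_test) as lists of rows."""
    rng = random.Random(random_state)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    train, test = shuffled[:n_train], shuffled[n_train:]

    def features(part: list[Row]) -> list[Row]:
        # Everything except the target attribute.
        return [{k: v for k, v in r.items() if k != target} for r in part]

    def labels(part: list[Row]) -> list[Row]:
        # Only the target attribute.
        return [{target: r[target]} for r in part]

    return features(train), features(test), labels(train), labels(test)


rows = [{"size": s, "price": s * 10} for s in range(4)]
x_train, x_test, y_train, y_test = split_with_labels(rows, "price", 0.75, 1)
```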
transform (Instance Method )
Update existing attribute with values according to a transformation function
Parameters:
attributeId: String- The ID of the attribute to be replaced
transformer: AttributeTransformer- The attribute transformer to be used
Results:
dataset: Dataset- The updated dataset
transformDatatypes (Instance Method )
Convert all column values into numbers
Parameters: None expected.
Results:
dataset: Dataset- The updated dataset
Class DayOfTheYearTransformer
An attribute transformer to convert a date attribute to its day in the year.
Constructor parameters:
attributeId: String- The ID of the date attribute.
Class Instance
A single instance (e.g., row) of a dataset
Constructor: Class has no constructor.
getValue (Instance Method )
Return a specific value of the instance
Parameters:
attribute: String- The attribute whose value is returned
Results:
value: Any- The specified value
Class StandardNormalizer
A normalizer to normalize dataset values
Constructor parameters: None expected.
normalize (Instance Method )
Normalize all numeric values in the dataset
Parameters:
dataset: Dataset- Dataset to be normalized
Results:
normalizedDataset: Dataset- The normalized dataset
Class StandardScaler
A scaler to scale dataset values
Constructor parameters: None expected.
scale (Instance Method )
Scale all numeric values in the dataset
Parameters:
dataset: Dataset- Dataset to be scaled
Results:
scaledDataset: Dataset- The scaled dataset
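This page does not state which formula StandardScaler applies. The name suggests z-score standardization (zero mean, unit variance), as in scikit-learn's scaler of the same name; the sketch below illustrates that technique on a single numeric column under this assumption. `standard_scale` is a hypothetical helper, not the Simple-ML implementation.

```python
from statistics import fmean, pstdev


def standard_scale(values: list[float]) -> list[float]:
    """Z-score standardization: subtract the mean, divide by the std. deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    if std == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


scaled = standard_scale([2.0, 4.0, 6.0])
```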
Class TimestampTransformer
An attribute transformer to convert a date attribute to its timestamp.
Constructor parameters:
attributeId: String- The ID of the date attribute.
Class WeekDayTransformer
An attribute transformer to convert a date attribute to its weekday (as a string).
Constructor parameters:
attributeId: String- The ID of the date attribute.
Class WeekendTransformer
An attribute transformer to convert a date attribute to whether the date is on the weekend or not.
Constructor parameters:
attributeId: String- The ID of the date attribute.
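What the four date transformers (DayOfTheYearTransformer, TimestampTransformer, WeekDayTransformer, WeekendTransformer) compute can be sketched with Python's `datetime` module. The function names are hypothetical, the UTC interpretation of the timestamp and the English weekday names are assumptions, and Saturday and Sunday are taken as the weekend.

```python
from datetime import datetime, timezone

WEEKDAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday")


def day_of_the_year(date: datetime) -> int:
    """1 for January 1, up to 365 or 366 for December 31."""
    return date.timetuple().tm_yday


def timestamp(date: datetime) -> float:
    """Seconds since 1970-01-01, reading the naive date as UTC (assumption)."""
    return date.replace(tzinfo=timezone.utc).timestamp()


def week_day(date: datetime) -> str:
    """Weekday as a string."""
    return WEEKDAY_NAMES[date.weekday()]


def is_weekend(date: datetime) -> bool:
    """True for Saturday and Sunday."""
    return date.weekday() >= 5


date = datetime(2022, 4, 22)
features = (day_of_the_year(date), timestamp(date), week_day(date), is_weekend(date))
```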
Global Functions
Global Function joinTwoDatasets
Join two datasets into one dataset
Parameters:
dataset1: Dataset- The first dataset
dataset2: Dataset- The second dataset
attributeId1: String- The attribute of the first dataset to use for the join
attributeId2: String- The attribute of the second dataset to use for the join
suffix1: String- The suffix to be attached to the attribute names of the first dataset
suffix2: String- The suffix to be attached to the attribute names of the second dataset
Results:
dataset: Dataset- The joined dataset
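How the six parameters interact can be illustrated with a plain-Python sketch. Two points are assumptions, since this page does not specify them: the join is an inner join (only matching rows survive), and the suffixes are attached only to attribute names that occur in both datasets. `join_two_datasets` below is a hypothetical helper working on lists of dictionaries.

```python
from typing import Any

Row = dict[str, Any]


def join_two_datasets(rows1: list[Row], rows2: list[Row],
                      attribute_id1: str, attribute_id2: str,
                      suffix1: str, suffix2: str) -> list[Row]:
    """Inner join on rows1[attribute_id1] == rows2[attribute_id2]."""
    names1 = {name for row in rows1 for name in row}
    names2 = {name for row in rows2 for name in row}
    clashing = names1 & names2  # names that need a suffix to stay distinct

    joined = []
    for r1 in rows1:
        for r2 in rows2:
            if r1[attribute_id1] != r2[attribute_id2]:
                continue
            row = {(k + suffix1 if k in clashing else k): v for k, v in r1.items()}
            row.update({(k + suffix2 if k in clashing else k): v
                        for k, v in r2.items()})
            joined.append(row)
    return joined


cities = [{"id": 1, "name": "Bonn"}, {"id": 2, "name": "Hannover"}]
stations = [{"city_id": 1, "name": "Hbf"}]
joined = join_two_datasets(cities, stations, "id", "city_id", "_city", "_station")
```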
Global Function loadDataset
Load a dataset via its identifier
Parameters:
datasetID: String- Identifier of the dataset
Results:
dataset: Dataset- The loaded dataset
Global Function readDataSetFromCSV
Load a dataset from a CSV file
Parameters:
fileName: String- Path and name of the CSV file
datasetId: String- Identifier of the dataset
separator: String- Separator used in the file
hasHeader: String- True, if the file has a header row
nullValue: String- String that should be parsed as missing value
datasetName: String- Name of the dataset
coordinateSystem: Int = 3857- Coordinate system used in the geometry columns of the dataset
Results:
dataset: Dataset- The loaded dataset
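The effect of separator, hasHeader and nullValue can be sketched with Python's standard `csv` module. The sketch reads from a string instead of a file to stay self-contained, ignores the dataset identifier, name and coordinate system, and represents missing values as `None`; `read_rows_from_csv` and the generated `column0`, `column1`, ... names are hypothetical.

```python
import csv
import io
from typing import Any

Row = dict[str, Any]


def read_rows_from_csv(text: str, separator: str = ",",
                       has_header: bool = True,
                       null_value: str = "") -> list[Row]:
    """Parse CSV text into rows, mapping null_value to None."""
    records = list(csv.reader(io.StringIO(text), delimiter=separator))
    if not records:
        return []
    if has_header:
        header, records = records[0], records[1:]
    else:
        # Without a header row, fall back to generated column names.
        header = [f"column{i}" for i in range(len(records[0]))]
    return [
        {name: (None if value == null_value else value)
         for name, value in zip(header, record)}
        for record in records
    ]


text = "city;population\nBonn;330000\nAachen;NA\n"
rows = read_rows_from_csv(text, separator=";", null_value="NA")
```

All values stay strings in this sketch; converting them to numbers is a separate step, comparable to what transformDatatypes is documented to do.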
This file was created automatically. Do not change it manually!