ROM Generators
Model Generator Base
class rom.generators.model_generator_base.ModelGeneratorBase(analysis_id, random_seed=None, **kwargs)
Bases: object
evaluate(model, model_name, model_moniker, x_data, y_data, downsample, build_time, cv_time, covariates=None, scaler=None)
Generic base function to evaluate the performance of the models.
Parameters:
- model –
- model_name –
- x_data –
- y_data –
- downsample –
- build_time –
Returns: Ordered dict
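The documentation above does not list which metrics the returned ordered dict contains. The following is only a minimal sketch of the general pattern, assuming a few common regression metrics; the function name evaluate_sketch, the dictionary keys, and the choice of RMSE and R² are hypothetical and are not taken from the ROM source.

```python
import math
from collections import OrderedDict


def evaluate_sketch(model_name, y_true, y_pred, build_time, cv_time):
    """Hypothetical stand-in: summarize prediction error in an OrderedDict."""
    n = len(y_true)
    errors = [p - t for p, t in zip(y_pred, y_true)]

    # Residual and total sums of squares, used for RMSE and R^2.
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")

    # Keys are illustrative only; the real method may report different fields.
    return OrderedDict([
        ("name", model_name),
        ("n_samples", n),
        ("rmse", math.sqrt(ss_res / n)),
        ("r2", r2),
        ("build_time", build_time),
        ("cv_time", cv_time),
    ])


result = evaluate_sketch("demo", [1.0, 2.0, 3.0], [1.0, 2.0, 4.0], 0.5, 1.2)
```

An ordered dict keeps the fields in a fixed order, which is convenient when each result is later written out as one row of a table.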
train_test_validate_split(dataset, metamodel, downsample=None, scale=False)
Use the built-in method to generate the train and test data, and add a third set of data for validation. The validation dataset is identified by a unique ID and is pulled out of the dataset before the train/test split method is called.
Parameters (the first four entries are commented out in the source docstring):
- dataset – dataframe, data to process
- covariates – list, dict of covariates and information
- responses – list, of responses to keep in the dataset
- validation_id – str, unique ID of model to extract
- kwargs – downsample: fraction of dataframe to keep (after validation data extraction)
Returns: dataframes, dataframe: 1) dataset with removed validation data, 2) validation data
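The order of operations described above (extract the validation ID first, then downsample, then split into train and test) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the ROM implementation: the real method operates on dataframes, and the function name split_with_validation, the id_key and test_fraction arguments, and the sample data are all hypothetical.

```python
import random


def split_with_validation(rows, validation_id, id_key="id", downsample=None,
                          test_fraction=0.25, seed=0):
    """Hold out one ID for validation, then split the rest into train/test."""
    # 1) Pull the validation records out before anything else touches the data.
    validation = [r for r in rows if r[id_key] == validation_id]
    remaining = [r for r in rows if r[id_key] != validation_id]

    rng = random.Random(seed)

    # 2) Optionally keep only a fraction of the remaining records.
    if downsample is not None:
        keep = max(1, int(len(remaining) * downsample))
        remaining = rng.sample(remaining, keep)

    # 3) Shuffle and split what is left into test and train sets.
    shuffled = remaining[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    return train, test, validation


# Hypothetical data: 40 records spread evenly over 5 IDs.
rows = [{"id": f"bldg_{i % 5}", "x": i, "y": 2 * i} for i in range(40)]
train, test, validation = split_with_validation(rows, "bldg_0", downsample=0.5)
```

Extracting the validation records first means downsampling and shuffling can never leak them into the train or test sets.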
Linear Model
class rom.generators.linear_model.LinearModel(analysis_id, random_seed=None, **kwargs)
Bases: rom.generators.model_generator_base.ModelGeneratorBase
Random Forest Model
class rom.generators.random_forest.RandomForest(analysis_id, random_seed=None, **kwargs)
Bases: rom.generators.model_generator_base.ModelGeneratorBase
evaluate(model, model_name, model_type, x_data, y_data, downsample, build_time, cv_time, covariates=None, scaler=None)
Evaluate the performance of the forest based on known x_data and y_data.
Parameters:
- model –
- model_name –
- model_type –
- x_data –
- y_data –
- downsample –
- build_time –
- cv_time –
- covariates –
Returns:
save_cv_results(cv_results, response, downsample, filename)
Save the cv_results to a CSV file. The CV results are the output of the GridSearch k-fold cross-validation and take the following form:
    {
        'param_kernel': masked_array(data=['poly', 'poly', 'rbf', 'rbf'], mask=[False False False False]...),
        'param_gamma': masked_array(data=[-- -- 0.1 0.2], mask=[True True False False]...),
        'param_degree': masked_array(data=[2.0 3.0 -- --], mask=[False False True True]...),
        'split0_test_score': [0.8, 0.7, 0.8, 0.9],
        'split1_test_score': [0.82, 0.5, 0.7, 0.78],
        'mean_test_score': [0.81, 0.60, 0.75, 0.82],
        'std_test_score': [0.02, 0.01, 0.03, 0.03],
        'rank_test_score': [2, 4, 3, 1],
        'split0_train_score': [0.8, 0.9, 0.7],
        'split1_train_score': [0.82, 0.5, 0.7],
        'mean_train_score': [0.81, 0.7, 0.7],
        'std_train_score': [0.03, 0.03, 0.04],
        'mean_fit_time': [0.73, 0.63, 0.43, 0.49],
        'std_fit_time': [0.01, 0.02, 0.01, 0.01],
        'mean_score_time': [0.007, 0.06, 0.04, 0.04],
        'std_score_time': [0.001, 0.002, 0.003, 0.005],
        'params': [{'kernel': 'poly', 'degree': 2}, ...],
    }
Parameters:
- cv_results –
- filename –
Returns:
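How the method lays out the CSV file is not documented here. As a rough sketch, assuming one row per parameter combination and plain lists in place of masked arrays, the results could be written as shown below. The function name save_cv_results_sketch, the use of response and downsample as leading columns, and the sample values (including the response name "heating") are assumptions for illustration only.

```python
import csv
import os
import tempfile


def save_cv_results_sketch(cv_results, response, downsample, filename):
    """Write one CSV row per parameter combination (hypothetical layout)."""
    # Skip the nested 'params' entry; the 'param_*' columns carry the same info.
    columns = [key for key in cv_results if key != "params"]
    n_rows = len(cv_results[columns[0]])

    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["response", "downsample"] + columns)
        for i in range(n_rows):
            writer.writerow(
                [response, downsample] + [cv_results[c][i] for c in columns]
            )
    return filename


# Hypothetical, shortened cv_results with consistent column lengths.
cv_results = {
    "param_kernel": ["poly", "rbf"],
    "mean_test_score": [0.81, 0.75],
    "rank_test_score": [1, 2],
    "params": [{"kernel": "poly"}, {"kernel": "rbf"}],
}
out_path = os.path.join(tempfile.mkdtemp(), "cv_results.csv")
save_cv_results_sketch(cv_results, "heating", 0.5, out_path)

# Read the file back to inspect what was written.
with open(out_path, newline="") as f:
    saved_rows = list(csv.reader(f))
```

Every column must have the same number of entries for this row-wise layout to work, so results with mismatched list lengths would need to be reconciled first.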
Support Vector Regression
class rom.generators.svr.SVR(analysis_id, random_seed=None, **kwargs)
Bases: rom.generators.model_generator_base.ModelGeneratorBase
evaluate(model, model_name, model_moniker, x_data, y_data, downsample, build_time, cv_time, covariates=None, scaler=None)
Evaluate the performance of the model based on known x_data and y_data.
save_cv_results(cv_results, response, downsample, filename)
Save the cv_results to a CSV file. The data in cv_results looks like the following:
    {
        'param_kernel': masked_array(data=['poly', 'poly', 'rbf', 'rbf'], mask=[False False False False]...),
        'param_gamma': masked_array(data=[-- -- 0.1 0.2], mask=[True True False False]...),
        'param_degree': masked_array(data=[2.0 3.0 -- --], mask=[False False True True]...),
        'split0_test_score': [0.8, 0.7, 0.8, 0.9],
        'split1_test_score': [0.82, 0.5, 0.7, 0.78],
        'mean_test_score': [0.81, 0.60, 0.75, 0.82],
        'std_test_score': [0.02, 0.01, 0.03, 0.03],
        'rank_test_score': [2, 4, 3, 1],
        'split0_train_score': [0.8, 0.9, 0.7],
        'split1_train_score': [0.82, 0.5, 0.7],
        'mean_train_score': [0.81, 0.7, 0.7],
        'std_train_score': [0.03, 0.03, 0.04],
        'mean_fit_time': [0.73, 0.63, 0.43, 0.49],
        'std_fit_time': [0.01, 0.02, 0.01, 0.01],
        'mean_score_time': [0.007, 0.06, 0.04, 0.04],
        'std_score_time': [0.001, 0.002, 0.003, 0.005],
        'params': [{'kernel': 'poly', 'degree': 2}, ...],
    }
Parameters:
- cv_results –
- filename –
Returns: