module varspark.core
This module includes the core variant-spark API.
class varspark.core.FeatureSource(_jvm, _vs_api, _jsql, sql, _jfs)
importance_analysis(**kwargs)
Builds a random forest classifier.
Parameters:
- label_source – The ingested label source.
- n_trees (int) – The number of trees to build in the forest.
- mtry_fraction (float) – The fraction of variables to try at each split.
- oob (bool) – Whether to calculate the out-of-bag (OOB) error.
- seed (int) – The random seed to use.
- batch_size (int) – The number of trees to build in one batch.
- var_ordinal_levels (int) –
Returns: Importance analysis model.
Return type:
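The parameters above are passed by keyword, so a call can be assembled from a plain dictionary. The following is a minimal sketch of that pattern, not the library's own code: `build_analysis_kwargs` and `StubFeatureSource` are hypothetical names introduced here so the example runs without a Spark session, the numeric values are illustrative rather than library defaults, and the range checks (for example, `mtry_fraction` lying in (0, 1]) are assumptions based on the parameter descriptions.

```python
def build_analysis_kwargs(n_trees=500, mtry_fraction=0.1, oob=True,
                          seed=13, batch_size=50):
    """Collect keyword arguments for importance_analysis.

    The default values here are illustrative, not the library's defaults,
    and the range checks are assumptions drawn from the parameter names.
    """
    if n_trees < 1:
        raise ValueError("n_trees must be at least 1")
    if not 0.0 < mtry_fraction <= 1.0:
        raise ValueError("mtry_fraction must be in (0, 1]")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "n_trees": n_trees,
        "mtry_fraction": mtry_fraction,
        "oob": oob,
        "seed": seed,
        "batch_size": batch_size,
    }


class StubFeatureSource:
    """Hypothetical stand-in for varspark.core.FeatureSource."""

    def importance_analysis(self, **kwargs):
        # Echoes the keyword arguments back instead of building a forest.
        return dict(kwargs)


features = StubFeatureSource()
labels = "labels-placeholder"  # a real label source would come from load_label
result = features.importance_analysis(
    label_source=labels, **build_analysis_kwargs(n_trees=100)
)
```

With a real `FeatureSource`, the same dictionary could presumably be unpacked into `importance_analysis` in the same way, which keeps the tuning parameters in one place.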
class varspark.core.ImportanceAnalysis(_jia, sql)
Model for random forest based importance analysis.
varspark.core.VariantsContext
alias of varspark.core.VarsparkContext
class varspark.core.VarsparkContext(ss, silent=False)
The main entry point for VariantSpark functionality.
load_label(**kwargs)
Loads the label source file.
Parameters:
- label_file_path – The file path of the label source file.
- col_name – The name of the column containing the labels.
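To make the role of `col_name` concrete, here is a small sketch using only the Python standard library. It assumes the label file is a CSV with a header row, which this page does not state. The file contents, the column names `sample` and `phenotype`, and the helper `read_label_column` are all made up for illustration and are not part of varspark.

```python
import csv
import os
import tempfile

# Write a tiny, made-up label file: one row per sample, one label column.
rows = [
    {"sample": "s1", "phenotype": "1"},
    {"sample": "s2", "phenotype": "0"},
    {"sample": "s3", "phenotype": "1"},
]
fd, label_file_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["sample", "phenotype"])
    writer.writeheader()
    writer.writerows(rows)


def read_label_column(path, col_name):
    """Hypothetical helper: return the values of one named column."""
    with open(path, newline="") as handle:
        return [row[col_name] for row in csv.DictReader(handle)]


labels = read_label_column(label_file_path, "phenotype")
os.remove(label_file_path)  # clean up the temporary file
```

Under the same assumption, the corresponding call on a `VarsparkContext` would pass the same two values as `label_file_path` and `col_name`.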
-