flexcv.synthesizer
flexcv.synthesizer.generate_regression(m_features, n_samples, n_groups=5, n_slopes=1, random_seed=42, noise_level=0.1, fixed_random_ratio=0.01)
Generate a synthetic dataset for linear regression with grouped (clustered) observations, using NumPy's default random number generator (`numpy.random.default_rng`).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`m_features` | `int` | Number of features, i.e. columns, to be generated. | required |
`n_samples` | `int` | Number of rows to be generated. | required |
`n_groups` | `int` | Number of groups/clusters. | `5` |
`n_slopes` | `int` | Number of columns in the feature matrix that are also used as random slopes. | `1` |
`noise_level` | `float` | The data are generated with added standard normal noise, which is multiplied by `noise_level`. | `0.1` |
`fixed_random_ratio` | `float` | The ratio of the fixed effects to the random effects. | `0.01` |
`random_seed` | `int` | The random seed to be used for reproducibility. | `42` |
Returns:
Type | Description |
---|---|
`tuple` | A tuple containing the following elements: (the feature matrix DataFrame, the target vector Series, the group labels Series, the random slopes DataFrame) |
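
To illustrate how the parameters above could interact, here is a minimal NumPy sketch of a grouped regression generator with the same signature. It is not flexcv's implementation: the function name `sketch_generate_regression`, the choice of distributions, and the reading of `fixed_random_ratio` as a scale factor on the fixed-effect coefficients are all assumptions made for illustration. It also returns plain NumPy arrays, whereas the documented function returns pandas objects.

```python
import numpy as np


def sketch_generate_regression(m_features, n_samples, n_groups=5, n_slopes=1,
                               random_seed=42, noise_level=0.1,
                               fixed_random_ratio=0.01):
    """Hypothetical stand-in for illustration; NOT flexcv's actual code."""
    # A seeded Generator makes every draw below reproducible.
    rng = np.random.default_rng(random_seed)

    # Feature matrix with n_samples rows and m_features columns.
    X = rng.standard_normal((n_samples, m_features))

    # Assign each row to one of n_groups clusters (labels 0 .. n_groups - 1).
    group = rng.integers(0, n_groups, size=n_samples)

    # Fixed effects: one coefficient per feature.
    # Assumption: fixed_random_ratio scales them relative to the random effects.
    beta_fixed = rng.standard_normal(m_features) * fixed_random_ratio

    # Random effects: a per-group intercept, plus per-group slopes
    # for the first n_slopes feature columns.
    intercepts = rng.standard_normal(n_groups)
    slopes = rng.standard_normal((n_groups, n_slopes))
    random_slopes = X[:, :n_slopes]

    # Standard normal noise, multiplied by noise_level.
    noise = noise_level * rng.standard_normal(n_samples)

    y = (X @ beta_fixed
         + intercepts[group]
         + np.sum(random_slopes * slopes[group], axis=1)
         + noise)
    return X, y, group, random_slopes


# Example call: 10 features, 100 rows, 3 groups, 2 random-slope columns.
X, y, group, random_slopes = sketch_generate_regression(
    10, 100, n_groups=3, n_slopes=2
)
print(X.shape, y.shape, group.shape, random_slopes.shape)
```

Calling the sketch twice with the same `random_seed` yields identical arrays, which mirrors the reproducibility role that the `random_seed` parameter plays in the table above.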