Question.56 A company wants to predict the sale prices of houses based on available historical sales data. The target variable in the company’s dataset is the sale price. The features include parameters such as the lot size, living area measurements, non-living area measurements, number of bedrooms, number of bathrooms, year built, and postal code. The company wants to use multi-variable linear regression to predict house sale prices. Which step should a machine learning specialist take to remove features that are irrelevant for the analysis and reduce the model’s complexity?
(A) Plot a histogram of the features and compute their standard deviation. Remove features with high variance.
(B) Plot a histogram of the features and compute their standard deviation. Remove features with low variance.
(C) Build a heatmap showing the correlation of the dataset against itself. Remove features with low mutual correlation scores.
(D) Run a correlation check of all features against the target variable. Remove features with low target variable correlation scores.
Correct Answer: D
Feature selection reduces the number of input variables to those most relevant for predicting the target. For a linear regression problem, a straightforward approach is to run a correlation check of every feature against the target variable and remove features with low correlation scores: a low score means the feature has little or no linear relationship with the sale price and contributes little to the prediction, so dropping it reduces the model’s complexity without hurting performance. The other options fall short: feature variance alone (options A and B) says nothing about a feature’s relationship to the target, and a feature-to-feature correlation heatmap (option C) reveals redundancy among features rather than relevance to the target. References:
* Feature engineering – Machine Learning Lens
* Feature Selection For Machine Learning in Python
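A minimal sketch of this target-correlation check using pandas; the column names, values, and the 0.1 cutoff are illustrative assumptions, not from the question:

```python
import pandas as pd

# Hypothetical housing dataset; columns mirror the features in the question.
df = pd.DataFrame({
    "lot_size": [8000, 9500, 7200, 11000, 6800],
    "living_area": [1800, 2400, 1500, 2900, 1300],
    "bedrooms": [3, 4, 2, 5, 2],
    "year_built": [1995, 2005, 1987, 2015, 1978],
    "sale_price": [310000, 450000, 265000, 560000, 240000],
})

# Correlate every feature with the target variable (Pearson by default).
correlations = df.corr()["sale_price"].drop("sale_price").abs()

# Keep only features whose absolute correlation with the target
# exceeds an illustrative threshold; drop the rest.
threshold = 0.1
selected = correlations[correlations > threshold].index.tolist()
print("Selected features:", selected)
```

Pearson correlation captures only linear relationships, which matches the multi-variable linear regression model the company plans to use.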
Question.57 A manufacturing company needs to identify returned smartphones that have been damaged by moisture. The company has an automated process that produces 2,000 diagnostic values for each phone. The database contains more than five million phone evaluations. The evaluation process is consistent, and there are no missing values in the data. A machine learning (ML) specialist has trained an Amazon SageMaker linear learner ML model to classify phones as moisture damaged or not moisture damaged by using all available features. The model’s F1 score is 0.6. What changes in model training would MOST likely improve the model’s F1 score? (Select TWO.)
(A) Continue to use the SageMaker linear learner algorithm. Reduce the number of features with the SageMaker principal component analysis (PCA) algorithm.
(B) Continue to use the SageMaker linear learner algorithm. Reduce the number of features with the scikit-learn multi-dimensional scaling (MDS) algorithm.
(C) Continue to use the SageMaker linear learner algorithm. Set the predictor type to regressor.
(D) Use the SageMaker k-means algorithm with k of less than 1,000 to train the model.
(E) Use the SageMaker k-nearest neighbors (k-NN) algorithm. Set a dimension reduction target of less than 1,000 to train the model.
Correct Answer: A,E
* Option A is correct because reducing the number of features with the SageMaker PCA algorithm can help remove noise and redundancy from the data and improve the model’s performance. PCA is a dimensionality reduction technique that transforms the original features into a smaller set of linearly uncorrelated features called principal components. SageMaker provides PCA as a built-in algorithm, so it can run as a preprocessing step and its output can then be used to train the linear learner model.
* Option E is correct because using the SageMaker k-NN algorithm with a dimension reduction target of less than 1,000 can help the model learn from the similarity of the data points, and improve the model’s performance. k-NN is a non-parametric algorithm that classifies an input based on the majority vote of its k nearest neighbors in the feature space. The SageMaker k-NN algorithm supports dimension reduction as a built-in feature transformation option.
* Option B is incorrect because using the scikit-learn MDS algorithm to reduce the number of features is not a feasible option, as MDS is a computationally expensive technique that does not scale well to large datasets. MDS is a dimensionality reduction technique that tries to preserve the pairwise distances between the original data points in a lower-dimensional space.
* Option C is incorrect because setting the predictor type to regressor would change the model’s objective from classification to regression, which is not suitable for the given problem. A regressor model would output a continuous value instead of a binary label for each phone.
* Option D is incorrect because using the SageMaker k-means algorithm with k of less than 1,000 would not help the model classify the phones, as k-means is a clustering algorithm that groups the data points into k clusters based on their similarity, without using any labels. A clustering model would not output a binary label for each phone.
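To make both correct options concrete, here is a minimal local sketch using scikit-learn stand-ins: PCA for the SageMaker PCA algorithm, LogisticRegression for the linear learner (option A), and KNeighborsClassifier for the SageMaker k-NN algorithm with a reduced dimension (option E). The synthetic data, the 100-component target, and all names are illustrative assumptions, not the company’s actual setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the phone data: 2,000 diagnostic values per phone,
# binary moisture-damage label driven by a small subset of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2000))
y = (X[:, :10].sum(axis=1) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Option A: reduce to well under 1,000 dimensions with PCA, then fit a
# linear classifier on the principal components.
linear = make_pipeline(PCA(n_components=100), LogisticRegression(max_iter=1000))
linear.fit(X_train, y_train)
print("PCA + linear F1:", f1_score(y_test, linear.predict(X_test)))

# Option E: k-NN on the same reduced representation (SageMaker k-NN performs
# this reduction internally via its dimension_reduction_target hyperparameter).
knn = make_pipeline(PCA(n_components=100), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)
print("PCA + k-NN F1:  ", f1_score(y_test, knn.predict(X_test)))
```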
Amazon SageMaker Linear Learner Algorithm
Amazon SageMaker K-Nearest Neighbors (k-NN) Algorithm
Principal Component Analysis – Scikit-learn
Multidimensional Scaling – Scikit-learn
Question.58 A global financial company is using machine learning to automate its loan approval process. The company has a dataset of customer information. The dataset contains some categorical fields, such as customer location by city and housing status. The dataset also includes financial fields in different units, such as account balances in US dollars and monthly interest in US cents. The company’s data scientists are using a gradient boosting regression model to infer the credit score for each customer. The model has a training accuracy of 99% and a testing accuracy of 75%. The data scientists want to improve the model’s testing accuracy. Which process will improve the testing accuracy the MOST?
(A) Use a one-hot encoder for the categorical fields in the dataset. Perform standardization on the financial fields in the dataset. Apply L1 regularization to the data.
(B) Use tokenization of the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Remove the outliers in the data by using the z-score.
(C) Use a label encoder for the categorical fields in the dataset. Perform L1 regularization on the financial fields in the dataset. Apply L2 regularization to the data.
(D) Use a logarithm transformation on the categorical fields in the dataset. Perform binning on the financial fields in the dataset. Use imputation to populate missing values in the dataset.
Correct Answer: A
The question is about improving the testing accuracy of a gradient boosting regression model. The testing accuracy is much lower than the training accuracy, which indicates that the model is overfitting the training data. To reduce overfitting, the following steps are recommended:
* Use a one-hot encoder for the categorical fields in the dataset. This will create binary features for each category and avoid imposing an ordinal relationship among them. This can help the model learn the patterns better and generalize to unseen data.
* Perform standardization on the financial fields in the dataset. This will scale the features to have zero mean and unit variance, which can improve the convergence and performance of the model. This can also help the model handle features with different units and ranges.
* Apply L1 regularization. This adds a penalty term to the loss function that is proportional to the absolute values of the model’s coefficients. This reduces the model’s complexity and performs implicit feature selection by shrinking the coefficients of less important features to zero.
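A minimal scikit-learn sketch of this preprocessing, assuming hypothetical column names; Lasso stands in here only to show the L1 penalty on a regression objective (gradient-boosting libraries such as XGBoost expose their own L1 term, e.g. reg_alpha):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical loan data; column names and values are illustrative.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Denver"],
    "housing_status": ["own", "rent", "rent", "own"],
    "balance_usd": [12000.0, 3500.0, 870.0, 45000.0],
    "monthly_interest_cents": [950, 120, 40, 3100],
    "credit_score": [720, 640, 580, 790],
})
X, y = df.drop(columns="credit_score"), df["credit_score"]

preprocess = ColumnTransformer([
    # Binary indicator columns for the categorical fields.
    ("categorical", OneHotEncoder(handle_unknown="ignore"),
     ["city", "housing_status"]),
    # Zero mean / unit variance for the financial fields in mixed units.
    ("numeric", StandardScaler(),
     ["balance_usd", "monthly_interest_cents"]),
])

# Lasso applies an L1 penalty, shrinking unhelpful coefficients toward zero.
model = make_pipeline(preprocess, Lasso(alpha=1.0))
model.fit(X, y)
```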
1: AWS Machine Learning Specialty Exam Guide
2: AWS Machine Learning Specialty Course
3: AWS Machine Learning Blog
Question.59 A machine learning (ML) specialist is using Amazon SageMaker hyperparameter optimization (HPO) to improve a model’s accuracy. The learning rate parameter is specified in the following HPO configuration: During the results analysis, the ML specialist determines that most of the training jobs had a learning rate between 0.01 and 0.1. The best result had a learning rate of less than 0.01. Training jobs need to run regularly over a changing dataset. The ML specialist needs to find a tuning mechanism that uses different learning rates more evenly from the provided range between MinValue and MaxValue. Which solution provides the MOST accurate result?
(A) Modify the HPO configuration as follows: Select the most accurate hyperparameter configuration from this HPO job.
(B) Run three different HPO jobs that use different learning rates from the following intervals for MinValue and MaxValue while using the same number of training jobs for each HPO job: [0.01, 0.1], [0.001, 0.01], [0.0001, 0.001]. Select the most accurate hyperparameter configuration from these three HPO jobs.
(C) Modify the HPO configuration as follows: Select the most accurate hyperparameter configuration from this training job.
(D) Run three different HPO jobs that use different learning rates from the following intervals for MinValue and MaxValue. Divide the number of training jobs for each HPO job by three: [0.01, 0.1], [0.001, 0.01], [0.0001, 0.001]. Select the most accurate hyperparameter configuration from these three HPO jobs.
Correct Answer: C
The solution C modifies the HPO configuration to use a logarithmic scale for the learning rate parameter. With logarithmic scaling, learning rate values are sampled log-uniformly, so each order of magnitude in the range is explored with roughly equal probability instead of most samples landing in the upper decade (between 0.01 and 0.1, as observed). This lets the tuner cover the region below 0.01, where the best result was found, more thoroughly and efficiently. The other solutions either keep a linear scale, which undersamples the lower end of the range, or split the range into three separate HPO jobs, which prevents the tuner from sharing results across the full range and, in option D, also cuts the training budget of each job. References:
* How Hyperparameter Tuning Works – Amazon SageMaker
* Tuning Hyperparameters – Amazon SageMaker
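For illustration, the sketch below shows the shape of a logarithmically scaled continuous parameter range in a tuning-job configuration (the hyperparameter name is an assumption) and a small numpy simulation of why the log scale covers the lower decades more evenly:

```python
import numpy as np

# Continuous parameter range with logarithmic scaling, as it would appear
# in a tuning-job configuration ("learning_rate" is an illustrative name).
learning_rate_range = {
    "Name": "learning_rate",
    "MinValue": "0.0001",
    "MaxValue": "0.1",
    "ScalingType": "Logarithmic",
}

# Why it helps: linear sampling concentrates on the upper decade, while
# log-uniform sampling covers each order of magnitude evenly.
rng = np.random.default_rng(0)
lo, hi = 1e-4, 1e-1
linear = rng.uniform(lo, hi, 10_000)
log_uniform = 10 ** rng.uniform(np.log10(lo), np.log10(hi), 10_000)

print("linear sampling, share below 0.01:     ", (linear < 0.01).mean())
print("log-uniform sampling, share below 0.01:", (log_uniform < 0.01).mean())
```

With a linear scale only about 10% of samples fall below 0.01, while the log scale places about two thirds of them there (two of the three decades in the range), matching the specialist’s goal of sampling the range more evenly.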
Question.60 A Machine Learning Specialist is working with a media company to perform classification on popular articles from the company’s website. The company is using random forests to classify how popular an article will be before it is published. A sample of the data being used is below. Given the dataset, the Specialist wants to convert the Day-Of_Week column to binary values. What technique should be used to convert this column to binary values?
(A) Binarization
(B) One-hot encoding
(C) Tokenization
(D) Normalization transformation
Correct Answer: B
One-hot encoding is a technique that can be used to convert a categorical variable, such as the Day-Of_Week column, to binary values. One-hot encoding creates a new binary column for each unique value in the original column and assigns a value of 1 to the column that corresponds to the value in the original column and 0 to the rest. For example, if the original column has values Monday through Sunday, one-hot encoding will create seven new columns, one per day of the week. If the value in the original column is Tuesday, the Tuesday column gets a value of 1 and the other six columns get 0. One-hot encoding can improve the performance of machine learning models because it avoids imposing an artificial ordinal relationship between the values and creates a sparse, more informative representation of the data.
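A minimal pandas sketch of one-hot encoding such a column; the sample values are illustrative assumptions, since the question’s data sample is not shown:

```python
import pandas as pd

# Illustrative sample with the categorical day-of-week column.
df = pd.DataFrame({"Day-Of_Week": ["Monday", "Tuesday", "Tuesday", "Sunday"]})

# One-hot encode: one binary (0/1) column per unique day value.
encoded = pd.get_dummies(df, columns=["Day-Of_Week"], dtype=int)
print(encoded)
# Produces columns such as Day-Of_Week_Monday, Day-Of_Week_Sunday,
# Day-Of_Week_Tuesday, with exactly one 1 per row.
```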
One-Hot Encoding – Amazon SageMaker
One-Hot Encoding: A Simple Guide for Beginners | by Jana Schmidt …
One-Hot Encoding in Machine Learning | by Nishant Malik | Towards …