Kaggle Challenge 09 - Your First Machine Learning Model

Tutorial01
Import

import pandas as pd

melbourne_file_path = '../input/melbourne-housing-snapshot/melb_data.csv'
melbourne_data = pd.read_csv(melbourne_file_path) 
melbourne_data.columns

Index(['Suburb', 'Address', 'Rooms', 'Type', 'Price', 'Method', 'SellerG',
       'Date', 'Distance', 'Postcode', 'Bedroom2', 'Bathroom', 'Car',
       'Landsize', 'BuildingArea', 'YearBuilt', 'CouncilArea', 'Lattitude',
       'Longtitude', 'Regionname', 'Propertycount'],
      dtype='object')

Step 1: Specify Prediction Target
Question
Print the list of columns in the dataset to find the name of the prediction target.

y = ____

# Check your answer
step_1.check()

Solution

y = home_data.SalePrice

Step 2: Create X
Question
Now you will create a DataFrame called X holding the predictive features.
Since you want only some columns from the original data, you'll first create a list with the names of the columns you want in X.
You'll use just the following columns in the list (you can copy and paste the whole list to save some typing, though you'll still need to add quotes): LotArea, YearBuilt, 1stFlrSF, 2ndFlrSF, FullBath, BedroomAbvGr, TotRmsAbvGrd.
After you've created that list of features, use it to create the DataFrame that you'll use to fit the model.

# Create the list of features below
feature_names = ___

# Select data corresponding to features in feature_names
X = ____

Solution

feature_names = ["LotArea", "YearBuilt", "1stFlrSF", "2ndFlrSF",
                      "FullBath", "BedroomAbvGr", "TotRmsAbvGrd"]

X=home_data[feature_names]
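Steps 1 and 2 use two different selection styles, which the following minimal sketch puts side by side. The tiny home_data frame here is a stand-in with invented values; only the column names come from the exercise.

```python
import pandas as pd

# Toy stand-in for home_data; the values are invented for illustration.
home_data = pd.DataFrame({
    "LotArea": [8450, 9600, 11250],
    "YearBuilt": [2003, 1976, 2001],
    "1stFlrSF": [856, 1262, 920],
    "SalePrice": [208500, 181500, 223500],
})

# Dot notation works for the target because SalePrice is a valid Python identifier.
y = home_data.SalePrice

# A list of column names selects several columns at once and returns a DataFrame.
# Bracket notation is required here: "1stFlrSF" starts with a digit, so
# home_data.1stFlrSF would be a syntax error.
feature_names = ["LotArea", "YearBuilt", "1stFlrSF"]
X = home_data[feature_names]

print(type(y).__name__)  # Series
print(X.shape)           # (3, 3)
```

The target comes back as a one-dimensional Series and the features as a two-dimensional DataFrame, which is the shape pairing that fit(X, y) expects.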

Step 3: Specify and Fit Model
Question
Create a DecisionTreeRegressor and save it as iowa_model. Ensure you've done the relevant import from sklearn to run this command.
Then fit the model you just created using the data in X and y that you saved above.

# from _ import _
# Specify the model.
# For model reproducibility, set a numeric value for random_state when specifying the model
iowa_model = ____

# Fit the model

Solution

from sklearn.tree import DecisionTreeRegressor
iowa_model = DecisionTreeRegressor(random_state=1)
iowa_model.fit(X, y)
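To see what fixing random_state buys you, here is a small sketch on synthetic data (the arrays and the names model_a, model_b, and X_new are made up for this illustration and are not part of the exercise). Two trees built with the same random_state on the same data should give identical predictions on unseen rows.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(0, 0.1, size=50)
X_new = rng.uniform(0, 10, size=(20, 3))  # rows the models have not seen

# Same data and same random_state: the two fitted trees are built identically.
model_a = DecisionTreeRegressor(random_state=1).fit(X, y)
model_b = DecisionTreeRegressor(random_state=1).fit(X, y)

same_predictions = np.array_equal(model_a.predict(X_new), model_b.predict(X_new))
print(same_predictions)  # True
```

The particular value (1 here) does not matter for model quality; it only pins down the tie-breaking so that reruns, and the exercise's check functions, see the same tree.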

Step 4: Make Predictions
Question

predictions = ____
print(predictions)

Solution

predictions = iowa_model.predict(X)
print(predictions)

Tutorial02

from sklearn.metrics import mean_absolute_error

predicted_home_prices = melbourne_model.predict(X)
mean_absolute_error(y, predicted_home_prices)

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# split data into training and validation data, for both features and target
# The split is based on a random number generator. Supplying a numeric value to
# the random_state argument guarantees we get the same split every time we
# run this script.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state = 0)
# Define model
melbourne_model = DecisionTreeRegressor()
# Fit model
melbourne_model.fit(train_X, train_y)

# get predicted prices on validation data
val_predictions = melbourne_model.predict(val_X)
print(mean_absolute_error(val_y, val_predictions))
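The reason for holding out validation data can be illustrated with a short sketch on synthetic data (all values and variable names below are invented for this illustration). A fully grown decision tree can memorize its training rows, so the error measured on those rows is close to zero and says little about how the model handles new data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic noisy data, invented for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] + rng.normal(0, 2.0, size=200)

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)
model = DecisionTreeRegressor(random_state=0).fit(train_X, train_y)

# Error on the rows the model was fitted on versus rows it has never seen.
in_sample_mae = mean_absolute_error(train_y, model.predict(train_X))
validation_mae = mean_absolute_error(val_y, model.predict(val_X))

print("in-sample MAE: ", in_sample_mae)
print("validation MAE:", validation_mae)
```

The in-sample figure should come out at or near zero, while the validation figure reflects the noise in the data. The validation number is the one to compare across models.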

Step 1: Split Your Data
Question
Use the train_test_split function to split up your data.
Give it the argument random_state=1 so the check functions know what to expect when verifying your code.
Recall, your features are loaded in the DataFrame X and your target is loaded in y.

# Import the train_test_split function and uncomment
# from _ import _

# fill in and uncomment
# train_X, val_X, train_y, val_y = ____

Solution

from sklearn.model_selection import train_test_split
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
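As a quick check of what this call does, the sketch below runs train_test_split on made-up arrays (not the Iowa data). When neither test_size nor train_size is given, scikit-learn holds out 25% of the rows for validation, and repeating the call with the same random_state reproduces the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 100 rows, 2 feature columns.
X = np.arange(200).reshape(100, 2)
y = np.arange(100)

train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
print(len(train_X), len(val_X))  # 75 25

# Calling again with the same random_state yields the same validation rows.
_, _, _, val_y_again = train_test_split(X, y, random_state=1)
print(np.array_equal(val_y, val_y_again))  # True
```

Passing X and y in the same call keeps the features and targets aligned row by row after shuffling.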

Step 2: Specify and Fit the Model
Question
Create a DecisionTreeRegressor model and fit it to the relevant data. Set random_state to 1 again when creating the model.

# You imported DecisionTreeRegressor in your last exercise
# and that code has been copied to the setup code above. So, no need to
# import it again

# Specify the model
iowa_model = ____

# Fit iowa_model with the training data.

Solution

iowa_model = DecisionTreeRegressor(random_state=1)
iowa_model.fit(train_X, train_y)

Step 3: Make Predictions with Validation data
Question

# Predict with all validation observations
val_predictions = ____

Solution

val_predictions = iowa_model.predict(val_X)

Step 4: Calculate the Mean Absolute Error in Validation Data
Question

from sklearn.metrics import mean_absolute_error
val_mae = ____

# uncomment following line to see the validation_mae
#print(val_mae)

Solution

val_mae = mean_absolute_error(val_y, val_predictions)
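Mean absolute error is the average of the absolute differences between the true and predicted values. The sketch below computes it by hand on three invented prices and compares the result with sklearn's function. The numbers are made up for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Invented true prices and predictions, for illustration only.
val_y = np.array([200000.0, 150000.0, 310000.0])
val_predictions = np.array([210000.0, 140000.0, 300000.0])

# Each prediction is off by 10000, so the average absolute error is 10000.
manual_mae = np.mean(np.abs(val_y - val_predictions))
sklearn_mae = mean_absolute_error(val_y, val_predictions)

print(manual_mae)   # 10000.0
print(sklearn_mae)  # 10000.0

# Swapping the arguments gives the same number because |a - b| == |b - a|.
swapped_mae = mean_absolute_error(val_predictions, val_y)
print(swapped_mae)  # 10000.0
```

Although MAE is symmetric, sklearn's signature is (y_true, y_pred), and other metrics are not symmetric, so passing the true values first is the safer habit.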

Reference

More information about this topic (Kaggle Challenge 09 - Your First Machine Learning Model) can be found at https://velog.io/@ljsk99499/Kaggle09

This text may be freely shared or copied, provided the URL of this document is kept as a reference URL.

Collection and Share based on the CC Protocol
