Learn Python this summer Day 18: Introduction to Machine Learning

Welcome back! Yesterday, we learned about data visualization with matplotlib. Today, we’ll dive into the basics of machine learning and explore some simple algorithms using scikit-learn. By the end of today’s lesson, you’ll have a foundational understanding of machine learning concepts and how to implement them in Python. Let’s get started!

What is Machine Learning?

Machine learning is a subset of artificial intelligence (AI) in which algorithms learn patterns from data and use them to make predictions or decisions, instead of following rules written by hand for each task. Common tasks include classification (predicting a category), regression (predicting a number), and clustering (grouping similar items).

Installing Scikit-learn

If you haven’t installed scikit-learn yet, you can do so using pip:

pip install scikit-learn
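
To check that the installation worked, one quick option is to import the package and print its version. This is only a minimal sketch; the version you see depends on your environment.

```python
import sklearn

# Read the installed scikit-learn version to confirm the import works
version = sklearn.__version__
print(f"scikit-learn version: {version}")
```

If the import fails with a ModuleNotFoundError, the package was probably installed for a different Python interpreter than the one running the script.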

Importing Scikit-learn

To start using scikit-learn, import the necessary modules in your Python script:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

Supervised Learning

Supervised learning involves training a model on labeled data. The two main types are regression (predicting continuous values) and classification (predicting categorical values).
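
To make the difference concrete, here is a small sketch that trains one regression model and one classifier on the same feature. The data (hours studied, exam scores, pass/fail labels) is made up purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# One feature per sample: hours studied (made-up values for illustration)
hours = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])

# Regression: the target is a continuous number (an exam score)
scores = np.array([52.0, 58.0, 61.0, 70.0, 74.0, 81.0])
reg = LinearRegression().fit(hours, scores)
predicted_score = reg.predict([[3.5]])[0]

# Classification: the target is a category (0 = fail, 1 = pass)
passed = np.array([0, 0, 0, 1, 1, 1])
clf = KNeighborsClassifier(n_neighbors=3).fit(hours, passed)
predicted_label = clf.predict([[5.5]])[0]

print(f"Predicted score for 3.5 hours: {predicted_score:.1f}")
print(f"Predicted class for 5.5 hours: {predicted_label}")
```

Both models share the same fit and predict steps; what changes is the kind of target they learn, a number in the first case and a class label in the second.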

Example: Linear Regression

Linear regression is a simple algorithm used to predict a continuous target variable based on one or more predictor variables.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Generating some example data: y = 4 + 3x plus random Gaussian noise
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Creating and training the model
model = LinearRegression()
model.fit(X_train, y_train)

# Making predictions
y_pred = model.predict(X_test)

# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

# Plotting the results
plt.scatter(X, y, color='blue')
plt.plot(X_test, y_pred, color='red')
plt.xlabel("X")
plt.ylabel("y")
plt.title("Linear Regression")
plt.show()
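
Because the example data was generated from y = 4 + 3x plus noise, one way to sanity-check the model is to compare what it learned with those true values. The sketch below regenerates the same data and, to keep it short, fits on all 100 points instead of the training split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same synthetic data as the example: y = 4 + 3x plus Gaussian noise
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

model = LinearRegression()
model.fit(X, y)

# y is 2-D here, so intercept_ has shape (1,) and coef_ has shape (1, 1)
intercept = model.intercept_[0]
slope = model.coef_[0][0]

print(f"Learned intercept: {intercept:.2f} (true value: 4)")
print(f"Learned slope: {slope:.2f} (true value: 3)")
```

The learned values should land close to 4 and 3 but not exactly on them, since the noise added to y blurs the underlying line.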

Classification

Classification involves predicting a categorical target variable. An example algorithm is the k-nearest neighbors (KNN) classifier, which labels a new point with the most common class among its k closest training points.

Example: K-Nearest Neighbors

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

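# Generating some example data: the label is 1 when the two features sum to more than 1
# (purely random labels would leave the classifier nothing to learn)
np.random.seed(0)
X = np.random.rand(100, 2)
y = (X[:, 0] + X[:, 1] > 1).astype(int)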

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Creating and training the model
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Making predictions
y_pred = model.predict(X_test)

# Evaluating the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
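
To see the idea behind k-nearest neighbors, here is a hand-written sketch using only NumPy. The helper knn_predict is a hypothetical name for this illustration, not scikit-learn's implementation, which is more efficient and handles details such as ties differently.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, query, k=3):
    """Label one query point by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # The most common label among those neighbors wins
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Tiny hand-made dataset: class 0 near the origin, class 1 near (1, 1)
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 0.8], [0.8, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label_near_origin = knn_predict(X_train, y_train, np.array([0.1, 0.1]))
label_near_corner = knn_predict(X_train, y_train, np.array([0.9, 0.9]))
print(label_near_origin, label_near_corner)  # prints: 0 1
```

Note that nothing is really "trained" here: the model simply stores the training points and compares each new point against them at prediction time.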

Unsupervised Learning

Unsupervised learning involves training a model on unlabeled data. The main types are clustering (grouping similar points together) and dimensionality reduction (describing the data with fewer features).
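
As a brief illustration of dimensionality reduction, the sketch below uses principal component analysis (PCA) from scikit-learn, chosen here only as one common technique. The data is synthetic: three features, where the third is almost a copy of the first, so two components are enough to keep nearly all of the information.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 100 points with 3 features, the third nearly duplicating the first
np.random.seed(0)
base = np.random.rand(100, 2)
third = base[:, 0] + 0.01 * np.random.randn(100)
X = np.column_stack([base[:, 0], base[:, 1], third])

# Reduce from 3 features to 2
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Fraction of the original variance that the 2 components retain
variance_kept = pca.explained_variance_ratio_.sum()

print(X.shape, "->", X_reduced.shape)
print(f"Variance kept: {variance_kept:.4f}")
```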

Example: K-Means Clustering

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Generating some example data
np.random.seed(0)
X = np.random.rand(100, 2)

# Creating and training the model
model = KMeans(n_clusters=3, n_init=10, random_state=0)  # n_init set explicitly to avoid a default-change warning in some versions
model.fit(X)

# Predicting the clusters
y_pred = model.predict(X)

# Plotting the results
plt.scatter(X[:, 0], X[:, 1], c=y_pred, cmap='viridis')
plt.xlabel("X1")
plt.ylabel("X2")
plt.title("K-Means Clustering")
plt.show()
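
After fitting, the model also exposes what it found. The sketch below, which reuses the same random data, reads the cluster centers and counts how many points fell into each cluster; setting n_init=10 explicitly is a choice made here, not something the example above requires.

```python
import numpy as np
from sklearn.cluster import KMeans

# Same random data as the example above
np.random.seed(0)
X = np.random.rand(100, 2)

model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(X)

# One row per cluster, one column per feature
centers = model.cluster_centers_
# Cluster index assigned to each training point
labels = model.labels_
# Number of points in each of the 3 clusters
sizes = np.bincount(labels, minlength=3)

print("Cluster centers:")
print(centers)
print("Points per cluster:", sizes)
```

Because this data is uniformly random, the clusters are just a partition of the unit square and carry no real meaning; on real data, the centers often summarize a typical member of each group.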

Practice Time!

Let’s put what we’ve learned into practice. Write a Python program that implements a simple machine learning algorithm using scikit-learn.

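Example: Using linear regression to predict housing prices. The code below reuses the synthetic data from the earlier example as a stand-in (think of X as a feature such as house size and y as the price); swap in a real housing dataset to make it your own.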

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Generating some example data
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Creating and training the model
model = LinearRegression()
model.fit(X_train, y_train)

# Making predictions
y_pred = model.predict(X_test)

# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

# Plotting the results
plt.scatter(X, y, color='blue')
plt.plot(X_test, y_pred, color='red')
plt.xlabel("X")
plt.ylabel("y")
plt.title("Linear Regression")
plt.show()

Conclusion

Great job today! You’ve learned the basics of machine learning and implemented simple algorithms using scikit-learn. Tomorrow, we’ll focus on planning a final project that incorporates what you’ve learned so far. Keep practicing and having fun coding!
