2024-12-15

In this example, we first load the MNIST dataset using Keras and then define an LF that uses a digit classification model to classify the digit in each image. We then load the MNIST images into a Snorkel dataset and apply the LF to generate labels for the specified digit. Finally, we visualize the labels using Snorkel’s viewer.

Note that this example assumes you have already trained a digit classification model and saved it as a file named mnist_model.h5. You can replace this with any other model of your choice, as long as you provide the correct path to the model file. The labels generated by the LF will be 1 if the image shows the specified digit and -1 if it does not.
# Import libraries
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import load_model

In this block, TensorFlow is imported, along with specific modules needed to work with the MNIST dataset and pre-trained models.

The MNIST dataset is loaded, and only the test split is kept: x_test contains the images, and y_test contains the corresponding labels. The training split is discarded in this snippet:
(_, _), (x_test, y_test) = mnist.load_data()

A pre-trained model is loaded using the load_model function. Be sure to replace mnist_model.h5 with the correct path to your pre-trained model file:
model = load_model('mnist_model.h5')

The pixel values of the images are normalized to be in the range [0, 1] by converting the data type to float32 and dividing by 255:
x_test = x_test.astype('float32') / 255

The images are reshaped to match the input shape expected by the model, which is (batch_size, height, width, channels):
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
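
To make the two preprocessing steps concrete, here is a minimal sketch that applies them to a small batch of random images instead of the real MNIST data. The preprocess helper and the fake_images array are illustrative names that do not appear in the original example, and the sketch assumes the model expects 28 x 28 single-channel input:

```python
import numpy as np


def preprocess(images):
    """Scale uint8 images to [0, 1] and add a trailing channel axis."""
    scaled = images.astype("float32") / 255
    return scaled.reshape(scaled.shape[0], 28, 28, 1)


# Three random 28x28 grayscale images standing in for MNIST samples (made-up data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

batch = preprocess(fake_images)
print(batch.shape, batch.dtype)
```

The reshape only adds a channel axis of size 1; it does not change any pixel values.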

Predictions are made on the test dataset using the pre-trained model, and the predictions for the first image are printed:
predictions = model.predict(x_test)
print("predictions", predictions[0])
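
For a ten-class classifier like this one, each row of predictions typically holds ten scores, one per digit. The following sketch uses made-up scores and made-up ground-truth labels (not real model output) to show how the highest score in each row maps to a digit, and how the whole batch can be summarized as an accuracy value instead of being printed row by row:

```python
import numpy as np

# Made-up scores for four images; in the walkthrough these would come from model.predict.
top_classes = [7, 2, 1, 0]
fake_predictions = np.full((4, 10), 0.02)
fake_predictions[np.arange(4), top_classes] = 0.82  # each row now sums to 1

# Made-up ground-truth labels; the last one deliberately disagrees with the scores.
fake_labels = np.array([7, 2, 1, 4])

# The index of the largest score in each row is the predicted digit.
predicted_digits = fake_predictions.argmax(axis=1)
accuracy = (predicted_digits == fake_labels).mean()
print(predicted_digits, accuracy)
```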

Class labels for the MNIST digits (0–9) are created as strings and printed:
class_labels = [str(i) for i in range(10)]
print("class_labels:", class_labels)

The script iterates through the test dataset, printing the index of the maximum prediction value, the predicted digit, and the actual digit label for each image:
for i in range(len(x_test)):
    print("maxpredict", predictions[i].argmax())
    predicted_digit = class_labels[predictions[i].argmax()]
    actual_digit = str(y_test[i])
    print(f"Predicted: {predicted_digit}, Actual: {actual_digit}")

Here is the output:

Figure 5.3 – The output of digit classification
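
The script above stops at raw predictions, whereas the introduction describes an LF that returns 1 when an image shows the specified digit and -1 otherwise. One way that step might look is sketched below. The function name lf_is_digit, the constants, and the sample scores are all hypothetical, and the sketch uses plain Python with NumPy rather than Snorkel's own API:

```python
import numpy as np

POSITIVE = 1   # image shows the specified digit
NEGATIVE = -1  # image does not show it


def lf_is_digit(probabilities, target_digit):
    """Return 1 if the classifier's top class equals target_digit, else -1."""
    predicted = int(np.argmax(probabilities))
    return POSITIVE if predicted == target_digit else NEGATIVE


# Made-up classifier outputs for three images (each row sums to 1).
fake_predictions = np.array([
    [0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01],  # top class 7
    [0.05, 0.05, 0.70, 0.05, 0.03, 0.03, 0.03, 0.02, 0.02, 0.02],  # top class 2
    [0.02, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.04, 0.02, 0.02],  # top class 1
])

labels = [lf_is_digit(row, target_digit=7) for row in fake_predictions]
print(labels)  # [1, -1, -1]
```

With a real model, the rows of fake_predictions would be replaced by the rows of predictions computed earlier.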

Let us look at another example of defining rules using a pre-trained classifier for image labeling. In the following example, we will use a pre-trained model, YOLOv3, to detect a person in an image, and then apply an LF to label a large set of image data.
