
A perceptron is a simple algorithm that takes multiple input signals and produces a single output signal.

For example, imagine someone hits you. Depending on where you are hit, you may or may not make a sound. If multiple parts of your body are hit (multiple input signals), your response (e.g., saying "Ouch!") can be considered the output signal (0: no sound, 1: "Ouch!"). The perceptron is designed based on this concept and can be represented mathematically.

Structure of a Perceptron

The structure of a perceptron consists of the following components:

  • x₁, x₂: Input signals
  • w₁, w₂: Weights (determining the importance of each input)
  • θ (theta): Threshold value (the weighted sum must exceed this for the output to be 1)

The perceptron computes the output y using the following equation:

If (w₁ * x₁ + w₂ * x₂) > θ, then y = 1
Else, y = 0

In other words, if the weighted sum of the input signals exceeds a certain threshold θ, the perceptron outputs 1; otherwise, it outputs 0.
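This decision rule is easy to express in code. Below is a minimal Python sketch; the function name perceptron_output is my own choice for illustration:

def perceptron_output(x1, x2, w1, w2, theta):
    # Fire (output 1) only if the weighted sum of the inputs exceeds the threshold
    weighted_sum = w1 * x1 + w2 * x2
    return 1 if weighted_sum > theta else 0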


Implementing Logic Gates with Perceptrons

A perceptron can be used to implement simple logic circuits. Let’s look at an example: AND gate.

What is an AND Gate?

An AND gate outputs 1 only when both input signals x₁ and x₂ are 1. The following truth table illustrates this behavior:

x₁  x₂  y (Output)
0   0   0
0   1   0
1   0   0
1   1   1

Implementing an AND Gate with a Perceptron

Using the perceptron equation, we can set appropriate values for the weights and threshold to achieve the AND gate behavior:

  • w₁ = 0.5
  • w₂ = 0.5
  • θ = 0.6

Now, let's check if the perceptron correctly implements the AND gate:

If (w₁ * x₁ + w₂ * x₂) ≤ θ, then y = 0
If (w₁ * x₁ + w₂ * x₂) > θ, then y = 1

By substituting values from the truth table, we see that the perceptron correctly produces 1 only when both x₁ and x₂ are 1, matching the AND gate.
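To make that substitution concrete, here is a small self-contained Python check; the values for w₁, w₂, and θ are the ones chosen above:

def AND(x1, x2):
    w1, w2, theta = 0.5, 0.5, 0.6
    return 1 if w1 * x1 + w2 * x2 > theta else 0

# Walk through all four rows of the truth table
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, AND(x1, x2))   # prints 0, 0, 0, then 1 only for (1, 1)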

Implementing OR and NAND Gates

Similarly, by adjusting the weights and threshold, we can implement other logic gates such as OR and NAND (see the sketch after this list). For example:

  • OR Gate: Adjust weights so that the perceptron activates when either input is 1.
  • NAND Gate: Use negative weights and a negative threshold to invert the AND gate's output.
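As a sketch, one possible choice of parameters for each gate is shown below; these particular values are my own, and many other choices also work:

def OR(x1, x2):
    w1, w2, theta = 0.5, 0.5, 0.2     # threshold low enough that one active input fires
    return 1 if w1 * x1 + w2 * x2 > theta else 0

def NAND(x1, x2):
    w1, w2, theta = -0.5, -0.5, -0.7  # negated AND weights: only (1, 1) fails the test
    return 1 if w1 * x1 + w2 * x2 > theta else 0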

Conclusion

The perceptron is a simple yet powerful algorithm that can be used to implement logic gates. While it is a basic model, it serves as the foundation for modern deep learning neural networks. Understanding perceptrons helps in grasping the core ideas behind more advanced machine learning models.


When studying deep learning, you often come across terms like epoch, batch, and iteration. These terms are essential for understanding how neural networks learn from data. In this post, let's clarify these concepts in a simple and intuitive way.

What is an Epoch?

An epoch in deep learning refers to one complete pass through the entire dataset during training.

For example, if we set 128 epochs, it means that the neural network will train on the entire dataset 128 times.

One epoch = one forward pass + one backward pass over every sample in the dataset.

However, training for too many epochs can lead to overfitting, while too few epochs can cause underfitting. It’s crucial to find the right number of epochs to achieve an optimal model.

What is Batch Size?

A batch is a subset of the dataset that the model processes together in a single training step.

If a dataset contains 1000 data points, setting a batch size of 100 means the model will train using 100 samples at a time, instead of processing all 1000 at once.

Batch size is essential because loading the entire dataset into memory may be infeasible for large datasets. By breaking it into smaller batches, we make the training more efficient.

What is an Iteration?

An iteration is one update of the model's weights using a batch of data.

Since processing the entire dataset at once is impractical, we split it into batches. During training, each batch is processed separately, and the model updates its weights accordingly.
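The relationship between the three terms is easiest to see in a schematic training loop. This is a minimal sketch that uses a NumPy array as a stand-in dataset; update_weights is a hypothetical helper, not a real library call:

import numpy as np

dataset = np.arange(1000)   # stand-in for 1000 data points, as in the example above
batch_size = 100
epochs = 5

for epoch in range(epochs):                          # one epoch = one full pass
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]    # one batch of 100 samples
        # update_weights(batch)  # one iteration = one weight update (hypothetical)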

Example Breakdown

Let's say we have:

  • Dataset size = 500 samples
  • Epochs = 20
  • Batch size = 50

Since each batch contains 50 samples, it takes 10 iterations to process the entire dataset once (500 / 50 = 10 iterations per epoch).

  • Total number of updates (iterations) = Epochs × Iterations per epoch = 20 × 10 = 200 iterations
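The same arithmetic in a few lines of Python, assuming the dataset size divides evenly by the batch size:

dataset_size, epochs, batch_size = 500, 20, 50
iterations_per_epoch = dataset_size // batch_size    # 500 / 50 = 10
total_iterations = epochs * iterations_per_epoch     # 20 * 10 = 200
print(iterations_per_epoch, total_iterations)        # 10 200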

Summary

  • Epoch: One full pass through the entire dataset.
  • Batch: A subset of the dataset used in training.
  • Iteration: A single batch update of the model’s weights.

Understanding these terms is crucial for tuning your deep learning models effectively. Hope this helps clarify the concepts!



While working, I had the opportunity to join a task force related to machine learning and deep learning. Although I had some interest in the field, I lacked in-depth knowledge. I often came across terms I wasn't familiar with, so I decided to take this opportunity to organize and understand them one by one.

The first terms I will cover are Precision, Recall, and Accuracy—common evaluation metrics used when working with machine learning or deep learning models, especially in pattern recognition and classification tasks.

Understanding Precision, Recall, and Accuracy with an Example

Before diving into the definitions, let's consider a simple example:
Imagine we have an AI model that classifies a coin as either heads or tails. The model’s predictions can be categorized into four cases:

  • True Positive (TP): The actual coin is heads, and the AI correctly predicts heads.
  • False Positive (FP): The actual coin is tails, but the AI incorrectly predicts heads.
  • False Negative (FN): The actual coin is heads, but the AI incorrectly predicts tails.
  • True Negative (TN): The actual coin is tails, and the AI correctly predicts tails.

Key Concepts:

  • If the model's prediction matches the actual value (ground truth), it is classified as True; otherwise, it is False.
  • If the model predicts the positive class (e.g., heads), it is labeled as Positive; if it predicts the negative class (e.g., tails), it is labeled as Negative.

1) Precision (Positive Predictive Value)

So, where does Precision fit in the TP/FP/FN/TN framework?

Precision measures how many of the model's positive predictions are actually correct. In other words, it is the ratio of correctly predicted positives (TP) to all predicted positives (TP + FP):

Precision = TP / (TP + FP)

A high Precision score means that when the model predicts a positive result, it is usually correct. However, a model can achieve high Precision while still missing many actual positives (high FN), so Precision alone is not enough. This is why Recall must also be considered.


2) Recall (Sensitivity, True Positive Rate)

Recall measures how many of the actual positive cases the model correctly identifies. In other words, it is the ratio of correctly predicted positives (TP) to all actual positives (TP + FN):

Recall = TP / (TP + FN)

Both Precision and Recall are commonly used metrics in classification models to evaluate their performance.


3) Accuracy

Accuracy measures how many predictions (both positive and negative) are correct out of all predictions made. It is calculated as follows:

Accuracy = (TP + TN) / (TP + FP + FN + TN)

Unlike Precision and Recall, which focus on the positive class, Accuracy also rewards correctly predicted negatives (TN), providing a more general evaluation of model performance.
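As a minimal sketch, all three metrics can be computed directly from the four counts; the numbers below are made up purely for illustration:

tp, fp, fn, tn = 40, 10, 5, 45   # illustrative counts, not real results

precision = tp / (tp + fp)                    # 40 / 50  = 0.80
recall    = tp / (tp + fn)                    # 40 / 45  ≈ 0.89
accuracy  = (tp + tn) / (tp + fp + fn + tn)   # 85 / 100 = 0.85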


Conclusion

  • Precision is useful when false positives are costly (e.g., spam detection).
  • Recall is important when missing a true positive is critical (e.g., medical diagnoses).
  • Accuracy gives a general idea of overall model correctness but can be misleading if the dataset is imbalanced.

By understanding these metrics, you can better evaluate and improve machine learning models!


Thank you for reading! 🚀


While studying machine learning and deep learning, I frequently came across the term annotation. To better understand its meaning, I decided to organize my thoughts on the topic.


1) What Is Annotation?

By definition, annotation means "adding notes or comments to explain something." In Korean, it is defined as:

"Providing an easy explanation for a word or sentence, or such a written note."

In programming and data contexts, annotation carries a similar meaning. A simple way to understand it is:

  • In programming, annotation is used to add comments to code. For example, in languages like C, comments are written using // to provide explanations for the code. (See Figure 1)
  • In data-related fields, annotation refers to labeling data to describe it. Labeling, as shown in Figure 2, involves adding metadata to objects recognized in an image. For instance, in an object detection task, bounding boxes are labeled with categories like Car, Person, etc.

The Purpose of Annotation

Annotation helps others—whether programmers or machine learning models—understand code or data more easily.


2) Types of Data Annotation

From a data perspective, annotation can be classified by how objects are marked; at its most precise, this amounts to cutting the object out of the image exactly (also known as "object masking").

① Bounding Box

The bounding box is the most common annotation method in object detection. A tight rectangular box is drawn around an object, and a class label is assigned.

  • It is easy to apply, since only two diagonally opposite corner points need to be marked.
  • However, it lacks detail, as it does not perfectly capture the object's shape.
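For illustration, a single bounding-box annotation is typically stored as a small record like the Python dictionary below. The field names here are my own choice; real formats such as COCO or Pascal VOC differ in detail:

annotation = {
    "label": "Car",
    # Two diagonally opposite corners define the box
    "bbox": {"x_min": 34, "y_min": 120, "x_max": 310, "y_max": 255},
}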

② Polygon

A polygon annotation uses multiple points to outline the exact shape of an object.

  • This method provides high accuracy and detail.
  • However, it requires more effort, making it labor-intensive.

③ Point

In point annotation, objects are marked using single points.

  • It is very easy to implement but limited in capturing object features.
  • This method works well when identifying distinctive key features or counting objects in an image.

④ Keypoint

Keypoint annotation is used when the shape of an object needs to be detected.

  • It combines polygon and point annotation to outline the object.
  • Keypoint metadata includes the number of points and their order, ensuring consistency across similar objects.

⑤ Polyline

A polyline is created by connecting multiple points with a line.

  • This method is useful for detecting continuous structures like roads, lane markings, or other boundary lines.
  • It is commonly used in autonomous driving and ADAS (advanced driver-assistance systems) to detect lanes.

⑥ Cuboid

A cuboid annotation extends bounding boxes to 3D.

  • Unlike the 2D bounding box, cuboids provide depth information.
  • This method is widely used in autonomous driving to track vehicles in a 3D space.
  • The richer metadata (position plus depth) can help a neural network learn 3D tasks more effectively.

Conclusion

Understanding these annotation types will help you choose the right labeling method based on the needs of your machine learning model. Selecting the appropriate annotation style is essential for building high-quality datasets.
