Reactive Machines AI: Responds only to the current input; it cannot draw on past experiences to inform decisions or update a memory.
Example: Deep Blue
Limited Memory AI: Uses recent past observations to make decisions. Self-driving cars, for example, constantly detect the movement of surrounding vehicles and add it to their memory.
Theory of Mind AI: Advanced AI that can understand the emotions, beliefs, and intentions of people and other agents in the real world.
Self-Aware AI: AIs that possess human-like consciousness and reactions. Such machines would have the ability to form self-driven actions.
Artificial Narrow Intelligence (ANI): AI designed for a single, well-defined task rather than general-purpose use; virtual assistants like Siri are examples.
Artificial General Intelligence (AGI): Also known as strong AI; a machine with human-level ability across any intellectual task. No true AGI exists yet; often-cited examples such as the Pillo robot, which answers health-related questions, are really narrow AI.
Artificial Superhuman Intelligence (ASI): A hypothetical AI that could do everything a human can do and more. It likewise does not yet exist; humanoid robots such as Alpha 2, sometimes described as ASI, are narrow AI.
One of the easiest ways to handle missing or corrupted data is to drop the affected rows or columns, or to replace the missing values with some other value.
There are two useful approaches in Pandas (sketched below):
isnull() and dropna() help find the columns/rows with missing data and drop them.
fillna() replaces the missing values with a placeholder value, such as a constant or the column mean.
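A minimal sketch of both approaches on a made-up DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing entries
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47],
    "income": [52000, 61000, np.nan, 58000],
})

print(df.isnull().sum())       # count missing values per column
dropped = df.dropna()          # drop every row that contains a missing value
filled = df.fillna(df.mean())  # or fill gaps with each column's mean
```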
Supervised learning involves training a model on labeled data, where the input comes with corresponding output labels. Examples include classification and regression problems. Unsupervised learning works with unlabeled data and aims to find patterns or structure within the data. Examples include clustering (like k-means) and association (like Apriori algorithm).
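The contrast shows up directly in code. A sketch using scikit-learn's built-in iris data (the model choices here are illustrative): the supervised model is fit on inputs together with labels, the unsupervised one on inputs alone.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the model sees inputs X together with labels y
clf = LogisticRegression(max_iter=200).fit(X, y)

# Unsupervised: the model sees only X and must find structure itself
km = KMeans(n_clusters=3, n_init=10).fit(X)
print(clf.predict(X[:3]), km.labels_[:3])
```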
Bias refers to the error introduced by approximating a real-world problem, which might be complex, by a simpler model. High bias can cause underfitting. Variance refers to the model's sensitivity to fluctuations in the training data. High variance can cause overfitting. The tradeoff: As you reduce bias by making the model more complex, variance increases and vice versa. Finding the right balance is key to building robust models.
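For squared-error loss this tradeoff has a standard closed form. Assuming irreducible noise of variance σ², the expected prediction error at a point x decomposes as:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \big(\mathrm{Bias}[\hat{f}(x)]\big)^2
  + \mathrm{Var}\big[\hat{f}(x)\big]
  + \sigma^2
```

The σ² term cannot be reduced by any model, so tuning complexity is a matter of trading the first two terms against each other.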
Precision is the ratio of true positive predictions to the total predicted positives. It measures the accuracy of the positive predictions. Recall is the ratio of true positives to the actual positives. It measures the model's ability to capture all positive instances in the dataset.
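In symbols: precision = TP / (TP + FP) and recall = TP / (TP + FN). A quick check with scikit-learn on made-up labels:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]  # 4 actual positives
y_pred = [1, 0, 0, 1, 1, 1]  # 3 TP, 1 FP, 1 FN

print(precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))     # 3 / (3 + 1) = 0.75
```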
A decision tree splits the data into subsets based on the value of input features. This process continues recursively, forming a tree where each node represents a feature, and each branch represents a decision rule, leading to a leaf node (final decision or classification).
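A small scikit-learn sketch makes the structure visible; the iris data and depth limit here are arbitrary choices:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Each printed rule is one split on a feature; leaves hold the class decision
print(export_text(tree))
```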
The typical workflow (a code sketch of the middle stages follows the list):
Data Collection: Gathering and preparing the dataset.
Data Preprocessing: Cleaning and transforming the data (handling missing values, scaling, encoding, etc.).
Model Selection: Choosing the right algorithm for the task.
Model Training: Fitting the model to the training data.
Evaluation: Measuring the model's performance on validation data.
Hyperparameter Tuning: Adjusting hyperparameters to improve performance.
Deployment: Putting the trained model into real-world use.
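One hedged sketch of how these stages map onto scikit-learn; the dataset, scaler, model, and C grid are all stand-ins for whatever the task needs:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Data collection: a built-in dataset stands in for real gathering
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing and model selection bundled into one pipeline
pipe = Pipeline([("scale", StandardScaler()),
                 ("model", LogisticRegression(max_iter=200))])

# Hyperparameter tuning: grid search over the regularization strength C
search = GridSearchCV(pipe, {"model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)              # training

print(search.score(X_test, y_test))       # evaluation on held-out data
# Deployment would typically serialize search.best_estimator_ (e.g. with joblib)
```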
Overfitting occurs when a model learns the training data too well, including its noise and outliers, leading to poor performance on new data. Prevention methods include cross-validation, simplifying the model (reducing complexity), regularization techniques (L1, L2), enlarging the training dataset, and pruning decision trees; an L2 example is sketched below.
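As one illustration, a sketch that fits an over-flexible degree-15 polynomial to synthetic data and restrains it with an L2 (ridge) penalty; the degree and alpha are arbitrary:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 20)

# A degree-15 polynomial can memorize the noise in 20 points; the L2
# penalty (alpha) shrinks the coefficients and keeps the fit smooth
model = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1e-3))
model.fit(X, y)
```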
The k-means algorithm partitions the data into k clusters, where each data point belongs to the cluster with the nearest mean. It iterates through four steps (sketched in code below):
Initialize k centroids randomly.
Assign each data point to the nearest centroid.
Update each centroid to the mean of its assigned points.
Repeat until convergence (the centroids no longer change).
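Those steps translate almost line for line into NumPy. A minimal sketch (the kmeans helper and seed are illustrative; an empty cluster simply keeps its old centroid):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Initialize k centroids by picking random data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # 2. Assign each point to the nearest centroid
        dists = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update each centroid to the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        # 4. Stop at convergence: centroids no longer change
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

centroids, labels = kmeans(np.random.default_rng(1).normal(size=(100, 2)), k=3)
```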
CNNs are designed for spatial data like images. They use convolutional layers to detect features, pooling layers to reduce dimensions, and fully connected layers for classification. RNNs are designed for sequential data like time series or text. They have loops within the network allowing information to be retained across time steps, making them suitable for tasks like language modeling and speech recognition.
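A minimal PyTorch sketch of both architectures; the layer sizes, 28x28 input, and 50-step sequence are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

# CNN: convolution extracts spatial features, pooling shrinks the map,
# and a fully connected layer produces class scores
cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),
)
print(cnn(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 10])

# RNN: the hidden state carries information across time steps
rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)
out, h = rnn(torch.randn(1, 50, 16))         # a 50-step sequence
print(out.shape)                             # torch.Size([1, 50, 32])
```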
Gradient Descent is an optimization algorithm that minimizes the cost function by iteratively updating the model parameters in the direction opposite to the gradient. Common variants:
Stochastic Gradient Descent (SGD): Updates parameters for each training example.
Mini-batch Gradient Descent: Updates parameters on small batches of data (sketched below).
Momentum-based Gradient Descent: Accelerates convergence using a momentum term.
Adam (Adaptive Moment Estimation): Combines the advantages of the AdaGrad and RMSProp algorithms.
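A mini-batch sketch in plain NumPy on synthetic linear-regression data; the learning rate, batch size, and epoch count are arbitrary:

```python
import numpy as np

# Synthetic data: y = X @ true_w + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(0, 0.1, size=200)

w = np.zeros(3)
lr, batch = 0.1, 32
for epoch in range(100):
    idx = rng.permutation(len(X))          # reshuffle each epoch
    for start in range(0, len(X), batch):
        b = idx[start:start + batch]
        # Gradient of mean squared error on this mini-batch
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad                     # step opposite the gradient
print(w)  # should land close to true_w
```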
Cross-validation is a technique to evaluate the model's performance by dividing the dataset into training and validation sets multiple times. The most common method is k-fold cross-validation. Importance: It provides a better estimate of the model's ability to generalize to new data and helps prevent overfitting.
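In scikit-learn this takes one call; a 5-fold sketch (the model and dataset are placeholders):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold CV: train on 4 folds, validate on the held-out fold, 5 times
scores = cross_val_score(LogisticRegression(max_iter=200), X, y, cv=5)
print(scores.mean(), scores.std())
```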