Notes
- Intro to Calculus
- History
- Newton
- Leibniz
- Differential
- Integral (area under a curve between two points, e.g. the area under the normal curve from one value to another)
- probability theory
- z score (how many standard deviations a value lies from the mean)
- z table (tabulated areas under the standard normal curve; see the sketch after these notes)
- History
- ML Calculus
- ML Algos
- finding maxima and minima, essential for gradient-based optimization (a gradient descent sketch follows the MSE discussion below)
- loss function Sum((ŷ - y)²): measures how accurate or inaccurate our model is (did not fully get the residual part; explained in the Q&A below)
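A minimal sketch tying the z score / z table bullets together, assuming SciPy is available (`norm.cdf` stands in for the z table; the mean, standard deviation, and interval are made-up examples):

```python
from scipy.stats import norm  # standard normal CDF stands in for the z table

mean, std = 100.0, 15.0       # assumed distribution (made-up, e.g. IQ scores)
x_low, x_high = 85.0, 130.0   # arbitrary interval

# z score: how many standard deviations a value lies from the mean
z_low = (x_low - mean) / std    # -1.0
z_high = (x_high - mean) / std  #  2.0

# Integral idea: area under the normal curve between the two points
area = norm.cdf(z_high) - norm.cdf(z_low)
print(f"P({x_low} < X < {x_high}) = {area:.4f}")  # ~0.8186
```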
loss function in calculus
Absolutely! Let's tackle your question step by step.
### 1. Loss Function
In machine learning, a model is trained to predict an output based on some input. The "truth" or real-world output is usually known, and the predicted output is what the model gives us. The difference between the real output and the predicted output is called the "error" or "residual".
A "loss function" is a way to measure this error. It quantifies how well or poorly our model is doing.
### ELI5: Loss Function
Imagine you're trying to shoot arrows at a target. The bullseye is the true value (or the target value). Where your arrow lands is the predicted value. The distance between the bullseye and where your arrow lands is the error. The goal is to have your arrow (predicted value) as close as possible to the bullseye (true value). The loss function is like a score that tells you how far off you were.
### 2. Residual
The residual is just another term for the error. It's the difference between the observed (true) value and the predicted value.
Mathematically, if \( y \) is the true value and \( \hat{y} \) (read as "y hat") is the predicted value, the residual is:
\[
\text{residual} = y - \hat{y}
\]
### 3. The Given Loss Function: \(\sum ( \hat{y} - y )^2\)
Summed over all samples this is the Residual Sum of Squares (RSS, covered below); averaged over the samples it becomes the "Mean Squared Error" (MSE). It squares the residuals. Why square? Squaring ensures:
1. The error is always positive (because a negative error squared becomes positive).
2. Larger errors are penalized more than smaller errors.
### ELI5: Mean Squared Error
Let's go back to the arrow analogy. Instead of just measuring how far each arrow is from the bullseye, you square that distance. So, if one arrow is twice as far off as another, its penalty isn't just double—it's quadrupled. This way, you're super motivated to get each arrow as close as possible!
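A minimal NumPy sketch of the arrow idea (the data is made up): compute residuals, square them, and average to get the MSE:

```python
import numpy as np

# Toy data: true values y and a model's predictions y_hat (made-up numbers).
y     = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5,  0.0, 2.0, 8.0])

residuals = y - y_hat        # error per sample: [ 0.5 -0.5  0.  -1. ]
squared   = residuals ** 2   # always positive; larger errors punished more
print(squared.sum())         # 1.5   -> the summed loss from the notes
print(squared.mean())        # 0.375 -> the Mean Squared Error
```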
### Real-World Applications:
- **General**: MSE is used in regression problems like predicting housing prices, stock prices, weather forecasting, etc.
- **Machine Learning**: MSE is commonly used in training neural networks for regression problems. It helps adjust the weights and biases in the network to reduce the error.
- **Trading**: If you're predicting stock prices or any other financial metric, MSE can help gauge the accuracy of your model. A lower MSE would indicate a more reliable model, which is crucial for trading decisions.
For **algo trading**, imagine you have a model that predicts the closing price of a stock based on various factors. Using MSE during the training phase ensures that the model's predictions are as close as possible to the actual closing prices. This, in turn, helps in making better trading decisions.
For **expiry day trading**, where volatility is high, accurate predictions are crucial. A model with a low MSE can help traders make quick decisions with higher confidence.
Remember, while MSE is a popular loss function, it's not the only one, and depending on the problem and the data distribution, other loss functions might be more appropriate.
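Since the notes mention gradient-based optimization, here is a minimal sketch (made-up data, a single-weight linear model; not any particular library's API) of minimizing MSE by gradient descent:

```python
import numpy as np

# Made-up data roughly following y = 3x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.9, 9.2, 11.8])

w, lr = 0.0, 0.01                        # single weight, learning rate
for _ in range(200):
    y_hat = w * x
    grad = 2 * np.mean((y_hat - y) * x)  # d(MSE)/dw
    w -= lr * grad                       # step downhill on the loss surface
print(round(w, 3))                       # close to 3.0
```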
RSE vs MSE
Both the Residual Sum of Squares (RSS) and the Mean Squared Error (MSE) are metrics used to measure the discrepancy between the observed values and the values predicted by a model. They are especially common in regression analysis. (A note on naming: RSS is sometimes loosely written as "RSE", but RSE properly stands for the Residual Standard Error, \( \sqrt{RSS/(n-2)} \) in simple linear regression, which is a related but distinct quantity.) Here's a breakdown of each:
### 1. Residual Sum of Squares (RSS):
The Residual Sum of Squares measures the total discrepancy between the observed data and the values predicted by the model.
Mathematically, if \( y_i \) is the true value of the \( i^{th} \) sample and \( \hat{y}_i \) is the predicted value, the RSS is given by:
\[
RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
Where \( n \) is the total number of samples.
### 2. Mean Squared Error (MSE):
The Mean Squared Error is the average of the squared discrepancies (or errors) between the observed values and the values predicted by the model.
Mathematically, it's:
\[
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]
Again, \( n \) is the total number of samples.
### Relationship:
You can see that the relationship between RSS and MSE is:
\[
MSE = \frac{RSS}{n}
\]
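A quick NumPy check of this relationship (the numbers are arbitrary):

```python
import numpy as np

# Arbitrary observed values and model predictions.
y     = np.array([10.0, 12.0, 9.0, 14.0, 11.0])
y_hat = np.array([ 9.5, 12.5, 8.0, 15.0, 11.5])

rss = np.sum((y - y_hat) ** 2)   # total squared error: 2.75
mse = np.mean((y - y_hat) ** 2)  # average squared error: 0.55
print(rss, mse)                  # 2.75 0.55 -> MSE = RSS / n
```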
### ELI5:
- **RSS**: Imagine you're shooting arrows at a target. For each arrow, measure how far it is from the bullseye, square that distance, and then add up all these squared distances for all the arrows. That's your RSS. It's like a total score of all your mistakes.
- **MSE**: Now, instead of just looking at the total score (RSS), you want to know the average mistake you made per arrow. So, you divide the RSS by the number of arrows you shot. That's your MSE. It gives an average sense of your accuracy.
### Real-world applications and importance:
In the context of **trading**:
- **RSS**: Can give you a total measure of how off your model's predictions were over a period. For instance, if you're predicting stock prices for a month, the RSS will tell you the total error your model made over that month.
- **MSE**: Offers an average error per prediction. In the same example, if you're predicting stock prices daily for a month, the MSE will tell you, on average, how far off your model was on a daily basis.
For **algo trading** or **expiry day trading**, where multiple predictions might be made in a short time frame, the MSE can provide a more normalized measure of the model's accuracy, helping traders understand the average reliability of the model's predictions. On the other hand, the RSS will give a cumulative sense of the mistakes over a period.
It's also worth noting that both RSS and MSE are valuable metrics, but neither tells the full story on its own. Other metrics, such as the R-squared value or the Mean Absolute Error (MAE), might also be useful depending on the specific scenario.
- Main idea in calculus: finding the maxima and minima
- functions
- double the money function
- every algo is a function
- types of functions
- Wolfram Alpha (great visualizations by entering the equations)
- vertical line test - to check whether a curve is a function: if every vertical line intersects it at most once, it's a fn, otherwise it's not (see the sketch below)
- y = x^2 is a fn, but y^2 = x (i.e. y = ±sqrt(x)) is not, since one x gives two y values; sqrt(x) alone (the principal root) is a fn for x >= 0, and negative x would need imaginary numbers
- not all equations are functions
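A small sketch of the vertical line test in code (the sample points are chosen arbitrarily):

```python
from collections import defaultdict

# "Double the money" fn from the notes: each input maps to exactly one output.
def double(x):
    return 2 * x

# Sample points on the relation y^2 = x, i.e. y = ±sqrt(x).
points = [(y * y, y) for y in [-2, -1, 0, 1, 2]]

ys_per_x = defaultdict(set)
for x, y in points:
    ys_per_x[x].add(y)

# Vertical line test in code: a relation is a function
# iff every x maps to exactly one y.
print(all(len(ys) == 1 for ys in ys_per_x.values()))  # False: x=4 -> {-2, 2}
print(double(100))                                    # 200: one in, one out
```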