Studying equations
Equations are a vital part of a technical study. However, they can be challenging to study.
Intended for: BSc, MSc
There are a few important lessons when it comes to studying equations. Ignore them, and equations will most likely become a nightmare throughout your entire studies (and possibly afterwards), or you will give up on them altogether:
a. Study equations slowly.
Many students do not take enough time to read equations, or skip them altogether.
BAD: You do read the course book, but whenever you see an equation, you skip it, or only glance over it. You think others, who are 'good at equations', simply read these in one go and understand what they mean. Instead, you do not possess this magic skill, and have no clue what these symbols mean. In the beginning you still have some idea of what is going on in the text, but after a while you are totally lost.
GOOD: The solution to this problem is very simple: read equations slowly.
The big mistake students make here is that they believe others read equations at the same speed as they read text. This is not true! You should teach yourself a new reading method for equations. You have to find out what each symbol means, and what it represents (a scalar, a vector, a matrix, etc.). Then, you have to figure out why the equation makes sense, and what it tries to express. Can you make a small numerical example to see what happens? What happens if you vary the input? etc.
This is an iterative process per equation, which is much slower than general reading. Studying a topic also means getting to know the naming conventions (what does a symbol generally mean and represent), which will make it easier and easier to understand equations.
That's why experts after a while seem to understand equations so fast: they made themselves familiar with what certain symbols usually mean, and what certain combinations of them express. This makes it seem like they understand equations effortlessly, but it actually did take a lot of effort initially (which you don't see anymore). Practice really pays off here (as it does everywhere).
b. Remember the principle, not the equation.
BAD: You decided to study the equations of this course carefully. However, now you are trying to exactly reproduce this weird combination of symbols, and all its variants that appear throughout the course. It feels like you have to remember words in a random language, with nothing to hold on to. You also start to wonder for whom this is actually useful.
GOOD: You do not try to remember any equation. Instead, you try to understand the underlying principle that the equation describes. Then, when you are given a piece of paper and a pen, you can just reason which symbols you need to write down, in which relations.
Example: For reinforcement learning, you should never try to remember the exact Q-learning update in symbols. Instead, remember that you need to update a state-action value, which we denote by Q(s,a). Then, we will update this quantity with a new estimate, which you remember is the immediate reward we received plus the maximum Q-value we find at the next timestep: r + max_a' Q(s',a'). Finally, you remember to move the current estimate a little bit in the direction of this new estimate, through a learning rate: η. Combining these ingredients allows you to write down a Q-learning update equation: Q(s,a) ← (1-η) · Q(s,a) + η · [r + max_a' Q(s',a')]. It's a nightmare to remember this as a sequence of symbols, but easy to remember as a combination of intuitive smaller building blocks/concepts.
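The same building blocks can also be written down as code, which is a good check that you understood the principle. The snippet below is only a minimal sketch of a tabular update: the function name q_learning_update, the dictionary-based table Q, and the toy states and actions are made up for illustration. The discount factor gamma does not appear in the equation above; it is added here as an assumption and defaults to 1 so that the code matches the text.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, eta=0.1, gamma=1.0):
    """One tabular Q-learning update, built from the ingredients in the text.

    gamma is an assumed discount factor; gamma=1.0 reproduces the equation above.
    """
    # New estimate: immediate reward plus the best Q-value at the next state.
    new_estimate = r + gamma * max(Q[(s_next, a_next)] for a_next in actions)
    # Move the current estimate a little bit in the direction of the new estimate.
    Q[(s, a)] = (1 - eta) * Q[(s, a)] + eta * new_estimate
    return Q[(s, a)]


# Hypothetical toy problem: two states, two actions, all values start at 0.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("s1", "right")] = 2.0

new_value = q_learning_update(
    Q, s="s0", a="right", r=1.0, s_next="s1", actions=actions, eta=0.5
)
print(new_value)  # 0.5 * 0 + 0.5 * (1 + 2) = 1.5
```

Try varying eta: with eta=1 the old estimate is thrown away completely, with eta=0 nothing is learned at all.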
c. Approach math as a systematic way of thinking.
Another common mistake is to consider math as a nuisance, instead of a systematic way of thinking.
BAD: You consider mathematical notation a collection of silly tricks with symbols, and proper notation merely a matter of being 'neat'. You just want to understand what you need to code, and think others use equations to show off, or just to be annoying.
GOOD: You see that mathematical thinking is a systematic way of thinking. It is not a random language at all, but a very structured one.
Example:
Yes, there is a certain symbol, e.g. x, but it actually represents a vector of numbers, which you visualize in your mind as a point in a higher-dimensional space (visualization of spaces is a very useful concept in mathematical thinking).
This vector actually goes into a function, e.g. f(x). In your mind, you add another dimension to the space, onto which the function maps.
Now your equation actually takes the derivative of f(x) with respect to x. You visualize in your mind that this fits a tangent (hyper)plane to the function at a particular point. When you move around over the function, the orientation of this (hyper)plane changes.
This derivative is actually used in some optimization approach, where we try to find some optimum of f(x). For example, for minimization, we iteratively take (small) steps in the direction of the negative gradient. You visualize different possible functions in your mind, and imagine why the step size ('learning rate' in machine learning) should be neither too big nor too small.
etc.
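The last step of this example is easy to try out yourself with a small numerical experiment. The snippet below is a minimal sketch, not a general optimizer: it assumes the simplest possible case, a scalar x with f(x) = x^2 (so the derivative is 2x), and the names f, grad_f and gradient_descent are chosen for illustration only.

```python
def f(x):
    """Toy function to minimize (an assumption for this sketch): f(x) = x^2."""
    return x ** 2


def grad_f(x):
    """Derivative of f with respect to x."""
    return 2 * x


def gradient_descent(x0, step_size, n_steps=20):
    """Iteratively take steps in the direction of the negative gradient."""
    x = x0
    for _ in range(n_steps):
        x = x - step_size * grad_f(x)
    return x


# Compare a step size that is too small, reasonable, and too big.
for step_size in (0.001, 0.1, 1.1):
    x_final = gradient_descent(x0=1.0, step_size=step_size)
    print(f"step size {step_size}: x = {x_final:.4f}, f(x) = {f(x_final):.4f}")
```

With the smallest step size, x has barely moved away from its starting point after 20 steps. With the largest, every step overshoots the minimum at x = 0 and x moves further and further away. Only the middle value gets close to the optimum, which is exactly what the mental picture predicts.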