Mathematical Notations
Notation  | 
Definition  | 
|---|---|
\(n\) or \(m\)  | 
The number of observations in a dataset. Typically \(n\) is used, but sometimes \(m\) is required to distinguish two datasets, e.g., the training set and the inference set.  | 
\(p\) or \(r\)  | 
The number of features in a dataset. Typically \(p\) is used, but sometimes \(r\) is required to distinguish two datasets.  | 
\(a \times b\)  | 
The dimensions of a matrix (dataset) that has \(a\) rows (observations) and \(b\) columns (features).  | 
\(|A|\)  | 
Depending on the context, this may be interpreted as follows: 
  | 
\(\|x\|\)  | 
The \(L_2\)-norm of a vector \(x \in \mathbb{R}^d\), 
\[\|x\| =  \sqrt{ x_1^2 + x_2^2 + \dots + x_d^2 }.\] 
 | 
\(\mathrm{sgn}(x)\)  | 
Sign function for \(x \in \mathbb{R}\), 
\[\begin{split}\mathrm{sgn}(x)=\begin{cases}
   -1, & x < 0,\\
    0, & x = 0,\\
    1, & x > 0.
\end{cases}\end{split}\] 
 | 
\(x_i\)  | 
In the description of an algorithm, this typically denotes the \(i\)-th feature vector in the training set.  | 
\(x'_i\)  | 
In the description of an algorithm, this typically denotes the \(i\)-th feature vector in the inference set.  | 
\(y_i\)  | 
In the description of an algorithm, this typically denotes the \(i\)-th response in the training set.  | 
\(y'_i\)  | 
In the description of an algorithm, this typically denotes the \(i\)-th response that needs to be predicted by the inference algorithm given the feature vector \(x'_i\) from the inference set.  |
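
The definitions of \(\|x\|\) and \(\mathrm{sgn}(x)\) in the table above can be checked with a minimal Python sketch. The helper names `l2_norm` and `sgn` and the sample data `X_train` are assumptions made for illustration only; they are not part of any library API described here.

```python
import math


def l2_norm(x):
    """Return the L2-norm of a vector given as a sequence of real numbers."""
    return math.sqrt(sum(component * component for component in x))


def sgn(x):
    """Return -1, 0, or 1 according to the sign of the real number x."""
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0


# Hypothetical training set with n = 3 observations and p = 2 features,
# stored as a list of feature vectors x_1, ..., x_n (an n x p dataset).
X_train = [[3.0, 4.0], [0.0, 0.0], [-1.0, 2.0]]
n, p = len(X_train), len(X_train[0])

# L2-norm of every feature vector x_i.
norms = [l2_norm(x_i) for x_i in X_train]

# Sign of the first feature of every feature vector x_i.
signs = [sgn(x_i[0]) for x_i in X_train]
```

Here `X_train` plays the role of an \(n \times p\) dataset with \(n = 3\) and \(p = 2\), and each inner list is one feature vector \(x_i\).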