In probability and statistics, a copula is a function that captures the dependency structure between several random variables, independently of their marginal distributions. For random variables X1, X2, …, Xn, their marginal distributions describe their individual behavior. However, in finance, the relationships between these variables can be complex and non-linear. Copulas allow these relationships to be modeled.
Sklar’s theorem states that for a joint distribution function F(x1, …, xn) of random variables X1, …, Xn, there exists a copula C such that:
F(x1, …, xn) = C(F1(x1), …, Fn(xn))
where F1, …, Fn are the marginal cumulative distribution functions (CDFs) of X1, …, Xn. The copula C thus connects the marginal CDFs to form the joint distribution, thereby capturing the dependency between the variables.
To make this concrete, consider the running times (in minutes) of Alice and Bob over five races:
Race   Alice’s Time (X1)   Bob’s Time (X2)
1      25                  26
2      23                  24
3      22                  22
4      27                  28
5      26                  25
There is an apparent dependency between their times: when one runs faster, the other usually does too.
To use a copula, we first transform each runner’s race times into uniform variables using their empirical cumulative distribution functions (CDFs). The rank-based CDFs for Alice (F1) and Bob (F2) are as follows (a dash means the runner never recorded that time):
Time (xi)   F1(xi) for Alice   F2(xi) for Bob
22          0.2                0.2
23          0.4                -
24          -                  0.4
25          0.6                0.6
26          0.8                0.8
27          1.0                -
28          -                  1.0
These CDFs transform the times into uniform variables, U1 and U2. For example, in Race 3, where both Alice and Bob ran in 22 minutes, their corresponding uniform values are U1 = F1(22) = 0.2 and U2 = F2(22) = 0.2.
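The rank-based transformation can be sketched in a few lines of Python. This is a minimal illustration, not a general-purpose implementation: the helper name `rank_cdf` is hypothetical, and it assumes each runner has no tied times, which holds for the five races here.

```python
# Race times in minutes, in race order (Race 1 to Race 5).
alice = [25, 23, 22, 27, 26]
bob = [26, 24, 22, 28, 25]

def rank_cdf(times):
    """Map each observed time to rank / n (hypothetical helper; assumes no ties)."""
    n = len(times)
    ordered = sorted(times)
    return {t: (ordered.index(t) + 1) / n for t in times}

f1 = rank_cdf(alice)  # empirical CDF values for Alice
f2 = rank_cdf(bob)    # empirical CDF values for Bob

# Uniform values (U1, U2) for each race.
pairs = [(f1[a], f2[b]) for a, b in zip(alice, bob)]
for race, (u1, u2) in enumerate(pairs, start=1):
    print(f"Race {race}: U1 = {u1:.1f}, U2 = {u2:.1f}")
```

For Race 3 this gives U1 = 0.2 and U2 = 0.2, matching the values used in the text.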
The Clayton copula, a type of Archimedean (*) copula, captures asymmetric dependence, particularly in the lower tail (where both variables take extremely low values simultaneously). Its formula is:
C(u1, u2) = (u1^(-θ) + u2^(-θ) - 1)^(-1/θ),
where u1 and u2 are values between 0 and 1, and θ (theta), with θ > 0, controls the strength of dependence: the larger θ, the stronger the dependence.
For our example, let’s assume θ = 2. For Race 3 (U1 = 0.2, U2 = 0.2), the copula value is calculated as follows:
1. u1^(-θ) = 0.2^(-2) = 25,
2. u2^(-θ) = 0.2^(-2) = 25,
3. u1^(-θ) + u2^(-θ) - 1 = 25 + 25 - 1 = 49.
So, C(0.2, 0.2) = 49^(-1/2) = 1/7 ≈ 0.1429.
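The calculation above can be checked with a short Python sketch. The function name `clayton_copula` is an illustrative choice, and the sketch only covers the θ > 0 case used in this example.

```python
def clayton_copula(u1, u2, theta):
    """Clayton copula C(u1, u2) for theta > 0 and u1, u2 in (0, 1]."""
    if theta <= 0:
        raise ValueError("this sketch only handles theta > 0")
    return (u1 ** -theta + u2 ** -theta - 1) ** (-1 / theta)

# Race 3: U1 = U2 = 0.2, with the assumed theta = 2.
c = clayton_copula(0.2, 0.2, theta=2)
print(f"C(0.2, 0.2) = {c:.4f}")
```

The printed value is 0.1429, in line with the hand calculation of 1/7.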
Under this copula, the value is the joint probability that both runners finish at or below their 20th percentile times, that is, that both run among their fastest races. If the two runners were independent, this probability would be 0.2 × 0.2 = 0.04, so 0.1429 indicates strong dependence when they both perform well.
Linear correlation measures linear dependence but does not capture nonlinear relationships or dependencies in distribution tails.
For instance, if Alice and Bob tend to run either very fast or very slow together, a strong dependence in the tails would not be fully captured by linear correlation.
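To illustrate the contrast, the sketch below computes the Pearson correlation of the five race times, which is a single summary number, and then uses the Clayton formula with the assumed θ = 2 to evaluate the conditional probability P(U2 ≤ q | U1 ≤ q) = C(q, q) / q for shrinking thresholds q. The helper names and the chosen thresholds are illustrative assumptions, and five races are far too few to estimate tail behavior from data, so the second part describes the assumed model rather than the sample.

```python
import math

alice = [25, 23, 22, 27, 26]
bob = [26, 24, 22, 28, 25]

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def clayton_copula(u1, u2, theta):
    """Clayton copula for theta > 0 (same formula as in the text)."""
    return (u1 ** -theta + u2 ** -theta - 1) ** (-1 / theta)

r = pearson(alice, bob)
print(f"Pearson correlation: {r:.3f}")

# Probability that Bob is in his lowest q fraction given that Alice is in hers.
# Under independence this conditional probability would simply equal q.
for q in (0.2, 0.05, 0.01):
    cond = clayton_copula(q, q, theta=2) / q
    print(f"q = {q}: Clayton {cond:.3f} vs independence {q:.3f}")
```

Under the Clayton model the conditional probability stays high as q shrinks, while under independence it falls to q itself; the correlation coefficient alone does not expose this difference.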
(*) The term "Archimedean" reflects how these copulas use a simple generator function to capture complex dependencies.