ALGOGENE | How to test market randomness?

Hiroki

How to test market randomness?

Quantitative Model

As the title says, how can I statistically test whether a given price series is random or not?

I would appreciate it if anyone could share some references or programming code.

PS: I don't have a statistics background.

Posted on : 2021-07-17 12:06:28.851000

Jeremy

There is a lot of statistical research on this topic. You can look at the "Wald–Wolfowitz runs test", which can be used to test whether the elements of a sequence are mutually independent.

From Wiki (https://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test),

"A run of a sequence is a maximal non-empty segment of the sequence consisting of adjacent equal elements. For example, the 22-element-long sequence

+ + + + − − − + + + − − + + + + + + − − − −

consists of 6 runs, 3 of which consist of "+" and the others of "−". The run test is based on the null hypothesis that each element in the sequence is independently drawn from the same distribution.

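Under the null hypothesis, the number of runs in a sequence of N elements is a random variable whose conditional distribution given the observation of N₊ positive values and N₋ negative values (N = N₊ + N₋) is approximately normal, with:

$$\mu = \frac{2 N_+ N_-}{N} + 1, \qquad \sigma^2 = \frac{2 N_+ N_- (2 N_+ N_- - N)}{N^2 (N - 1)} = \frac{(\mu - 1)(\mu - 2)}{N - 1}$$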

These parameters do not assume that the positive and negative elements have equal probabilities of occurring, but only assume that the elements are independent and identically distributed. If the number of runs is significantly higher or lower than expected, the hypothesis of statistical independence of the elements may be rejected."
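
As a worked check of the quoted example, here is a small sketch that plugs the 22-element sequence into the usual runs-test mean and variance, assuming a two-sided normal approximation. The sequence has N₊ = 13, N₋ = 9, N = 22 and R = 6 observed runs (R is just a label chosen here for the run count).

```latex
\begin{align*}
\mu &= \frac{2 N_+ N_-}{N} + 1 = \frac{2 \cdot 13 \cdot 9}{22} + 1 \approx 11.64 \\
\sigma^2 &= \frac{(\mu - 1)(\mu - 2)}{N - 1} \approx \frac{10.64 \cdot 9.64}{21} \approx 4.88, \qquad \sigma \approx 2.21 \\
z &= \frac{R - \mu}{\sigma} \approx \frac{6 - 11.64}{2.21} \approx -2.55
\end{align*}
```

A z-score of about -2.55 corresponds to a two-sided p-value of roughly 0.011, so this sequence has fewer runs than independence would predict at the 5% level.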

Python:

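Here is a simple Python function to calculate the p-value.

from scipy.stats import norm

def run_test(arr, isbinary=False):
    """Two-tailed p-value of the runs test (normal approximation); None if undefined."""
    n_pos, n_neg, n, runs = 0, 0, 0, 0
    # ensure at least 2 input observations
    if len(arr)<2:
        return None
    # step 1. convert a price series into moves: +1 (up) or -1 (down or unchanged)
    if isbinary:
        arr_run = arr    # input is already coded as positive/negative values
    else:
        arr_run = []
        for i in range(1,len(arr)):
            v = 1 if arr[i]>arr[i-1] else -1
            arr_run.append(v)
    # step 2. count pos/neg elements and the number of runs
    runs = 1    # the first element always starts a run
    for i in range(0,len(arr_run)):
        if arr_run[i]>0:
            n_pos+=1
        elif arr_run[i]<0:
            n_neg+=1
        # a new run starts whenever an element differs from the previous one
        if i>0 and arr_run[i]!=arr_run[i-1]:
            runs+=1
    n = n_pos+n_neg
    # the test is undefined unless both signs occur
    if n_pos==0 or n_neg==0:
        return None
    # step 3. compute 2-tailed p-value
    mean = 2*n_pos*n_neg/float(n)+1
    sd = ((mean-1)*(mean-2)/float(n-1))**(0.5)
    if sd==0:
        return None
    prob = norm.cdf(runs, loc=mean, scale=sd)
    pval = 2*min(prob,1-prob)
    return pval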

# ------------------------------------
# TEST CASE
arr = [10, 11, 12, 13, 11, 12, 10, 11, 9, 11, 15, 16, 15, 13, 11, 10]
p = run_test(arr)
print("p_value = ",p)

Posted on : 2021-07-23 13:14:39.806000

Hiroki

Thanks for the reference!

Can you explain how to interpret the function's output? Does a higher value mean the series is random?

Posted on : 2021-07-23 16:02:08.285000

Jeremy

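The hypotheses of the runs test are:

Null: the given dataset is random.
Alternative: the dataset is non-random.

My function computes the p-value, which is usually compared to a significance level, say 5%.

If the p-value is less than 5%, we reject the null hypothesis at the 5% significance level, i.e. there is statistically significant evidence that the sequence is non-random. If it is 5% or more, we cannot reject randomness. So a higher value does not prove that the series is random; it only means the test found no evidence against randomness.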
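
To make the decision rule concrete, here is a minimal sketch in plain Python (standard library only). The helper names `to_signs`, `runs_test_pvalue` and `interpret` are illustrative and not part of any library, and the two-sided p-value from the normal approximation is an assumption of this sketch.

```python
from statistics import NormalDist


def to_signs(prices):
    """Code each price change as +1 (up) or -1 (down or unchanged)."""
    return [1 if b > a else -1 for a, b in zip(prices, prices[1:])]


def runs_test_pvalue(signs):
    """Two-sided runs-test p-value for a sequence containing both +1 and -1."""
    n_pos = sum(1 for s in signs if s > 0)
    n_neg = sum(1 for s in signs if s < 0)
    n = n_pos + n_neg
    # number of runs = number of sign changes + 1
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2 * n_pos * n_neg / n + 1
    sd = ((mean - 1) * (mean - 2) / (n - 1)) ** 0.5
    prob = NormalDist(mean, sd).cdf(runs)
    return 2 * min(prob, 1 - prob)


def interpret(p_value, alpha=0.05):
    """Turn a p-value into the decision of the hypothesis test."""
    if p_value < alpha:
        return "reject randomness"
    return "cannot reject randomness"


prices = [10, 11, 12, 13, 11, 12, 10, 11, 9, 11, 15, 16, 15, 13, 11, 10]
p_prices = runs_test_pvalue(to_signs(prices))
p_zigzag = runs_test_pvalue([1, -1] * 10)  # strictly alternating moves

print("prices:", round(p_prices, 3), interpret(p_prices))
print("zigzag:", round(p_zigzag, 6), interpret(p_zigzag))
```

The sample price series gives a large p-value, so the test cannot reject randomness, while the strictly alternating series has far more runs than chance would produce and is rejected.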

Posted on : 2021-07-24 15:21:32.873000

David

The runs test only looks at the direction of each move (increase or decrease) and ignores its magnitude.

Another approach is to test whether the time series is autocorrelated. One hypothesis test of this kind is the Ljung–Box test (https://en.wikipedia.org/wiki/Ljung%E2%80%93Box_test). Instead of testing randomness at each distinct autocorrelation lag, it tests the "overall" randomness based on a number of lags.

The Ljung–Box test is summarized below:

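Hypotheses
- H₀: returns are uncorrelated
- H_A: returns are correlated
The lag-k autocorrelation is denoted as
$$\rho_k = \mathrm{Corr}(r_t, r_{t-k}), \quad k = 1, 2, \ldots$$
The sample autocorrelation coefficient is calculated as
$$\hat{\rho}_k = \frac{\sum_{t=k+1}^{n} (r_t - \bar{r})(r_{t-k} - \bar{r})}{\sum_{t=1}^{n} (r_t - \bar{r})^2}$$
where $\bar{r}$ is the sample mean of the n returns.
Under H₀, $\rho_1 = \rho_2 = \cdots = \rho_M = 0$. The test statistic is
$$Q = n(n+2) \sum_{k=1}^{M} \frac{\hat{\rho}_k^2}{n-k}$$
For a large sample size n, the test statistic Q asymptotically follows a chi-square distribution with M degrees of freedom, $\chi^2_M$.
For significance level α, the critical region for rejecting the hypothesis of randomness is
$$Q > \chi^2_{1-\alpha, M}$$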
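
The summary above can be sketched in plain Python as follows. This is only an illustration, not a reference implementation: `autocorr`, `chi2_sf` and `ljung_box` are hypothetical helper names, and the chi-square tail probability is computed with a closed-form recurrence that assumes an integer number of lags. In practice, statsmodels provides `acorr_ljungbox` in `statsmodels.stats.diagnostic`, which is likely more convenient.

```python
import math


def autocorr(returns, k):
    """Sample lag-k autocorrelation of a list of returns."""
    n = len(returns)
    mean = sum(returns) / n
    num = sum((returns[t] - mean) * (returns[t - k] - mean) for t in range(k, n))
    den = sum((r - mean) ** 2 for r in returns)
    return num / den


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer degrees of freedom."""
    y = x / 2.0
    # start from the known tail for df = 2 (even) or df = 1 (odd)
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # add one term of the incomplete-gamma recurrence per extra 2 degrees of freedom
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q


def ljung_box(returns, max_lag):
    """Return (Q statistic, p-value) using lags 1..max_lag."""
    n = len(returns)
    q_stat = n * (n + 2) * sum(
        autocorr(returns, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )
    return q_stat, chi2_sf(q_stat, max_lag)


# toy series: returns that flip sign every period are strongly autocorrelated
returns = [0.01, -0.01] * 10
q_stat, p_value = ljung_box(returns, max_lag=3)
print("Q =", round(q_stat, 2), "p-value =", p_value)
```

For this alternating toy series Q is about 59.4 on 3 lags, far beyond the 5% critical value of about 7.81, so the hypothesis of uncorrelated returns is rejected.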

Posted on : 2021-07-27 13:20:41.810000

tony lam

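The variance-ratio test of the random walk hypothesis is also applicable. Here's the idea.

Hypothesis H₀: returns are independently and identically distributed (i.i.d.).
Under H₀, we have
$$\mathrm{Var}(r_t + r_{t-1} + \cdots + r_{t-q+1}) = q\,\mathrm{Var}(r_t), \quad q = 1, 2, \ldots$$
or, expressed as a variance ratio,
$$VR(q) = \frac{\mathrm{Var}(r_t + r_{t-1} + \cdots + r_{t-q+1})}{q\,\mathrm{Var}(r_t)} = 1$$
In general, VR(q) can further be expressed in terms of the lag-k autocorrelations $\rho_k$ of the returns as
$$VR(q) = 1 + 2 \sum_{k=1}^{q-1} \left(1 - \frac{k}{q}\right) \rho_k$$
Under H₀, for a large sample size n,
$$\sqrt{n}\,\bigl(\widehat{VR}(q) - 1\bigr) \xrightarrow{d} N\!\left(0, \frac{2(2q-1)(q-1)}{3q}\right)$$
so a standardized value far from zero is evidence against H₀.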
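
Here is a rough sketch of the idea in plain Python. It estimates VR(q) from sample autocorrelations and standardizes it with the asymptotic variance 2(2q-1)(q-1)/(3q), which is the usual result for i.i.d. (homoskedastic) returns and is an assumption of this sketch. The names `autocorr` and `variance_ratio_test` are illustrative only.

```python
import math
from statistics import NormalDist


def autocorr(returns, k):
    """Sample lag-k autocorrelation of a list of returns."""
    n = len(returns)
    mean = sum(returns) / n
    num = sum((returns[t] - mean) * (returns[t - k] - mean) for t in range(k, n))
    den = sum((r - mean) ** 2 for r in returns)
    return num / den


def variance_ratio_test(returns, q):
    """Return (VR(q), z-score, two-sided p-value) under the i.i.d. null."""
    if q < 2:
        raise ValueError("q must be at least 2")
    n = len(returns)
    # VR(q) = 1 + 2 * sum_{k=1}^{q-1} (1 - k/q) * rho_k
    vr = 1.0 + 2.0 * sum((1.0 - k / q) * autocorr(returns, k) for k in range(1, q))
    # asymptotic variance of sqrt(n) * (VR(q) - 1) for i.i.d. returns
    asym_var = 2.0 * (2 * q - 1) * (q - 1) / (3.0 * q)
    z = math.sqrt(n) * (vr - 1.0) / math.sqrt(asym_var)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return vr, z, p_value


# toy series: returns that flip sign every period (strong mean reversion)
returns = [0.01, -0.01] * 10
vr, z, p_value = variance_ratio_test(returns, q=2)
print("VR(2) =", round(vr, 3), "z =", round(z, 2), "p-value =", p_value)
```

A variance ratio well below 1, as in this toy series, points to mean reversion (negative autocorrelation), while a ratio well above 1 points to trending returns.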

Posted on : 2021-07-30 14:43:28.731000

Hiroki

@David, @tony lam, thanks for the reference =)

Is there a Python library for these tests?

Posted on : 2021-07-31 10:45:59.354000
