-
Notifications
You must be signed in to change notification settings - Fork 1
jsliu/OLIC
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
******************************************************************************
About the model:
The date we analysed is segmented by a series of changepoints, the number of
which is unknown. We have considered two slightly different piecewise
polynomial model for it.
(i) The first model is constructed based upon the assumption that the
observations before and after a changepoint are independent of each other, so
the model in each segment is a q-th order linear regression model
Y = H x beta + eps
where H is a s x q matrix with s = t_i+1 - t_i, the length between two
changepoints:
| 1, x_((t_i)+1)-x_(t_i), ..., (x_((t_i)+1)-x(t_i))^(q-1) |
| .. .. ..., .. |
| 1, x_(t_i+1)-x_(t_i), ..., (x_(t_i+1)-x(t_i))^(q-1) |,
and eps_i follows a zero mean normal distribution. Note that Y and eps are a
vector of observations and noises respectively. The model allows for an
uncertainty, which means the order q in each segment is unknow.
(ii) The second model sets an constraint on the first model so that we can
check if the lines are connected or not at each changepoint and then make a
proper the decision. To ease the question, we don't allow for the model choice
in each segment, e.g. q is fixed in this model.
In each segment, we still have a linear regression model
Y = H x beta + eps
which is almost the same as the function in the first model. The only
difference comes from that if the lines are continuous at a certain
change point, the intercept is beta_0 ~ N(mu, sigma2), where mu and sigma can
be estimated according to the information known in the previous segment, i.e.
mu = sum w_i*mu_i
sigma2 = sum w_i*sigma_i
where mu_i = H[lastline,]*beta,sigma_i = M_i(1,1) and w_i is the associated
weight of each particle. The rest of beta_is comes from the same distributions
as those in the first model. These are equivalent to transform the data as
y_i = y_i-mu
in that segment. We therefore estimate regression parameters in a longer
segment between two consective discontinuous change points. The corresponding
basis matrix is
| 1 a a^2 0 0 ... |
| .. .. ..., |
H = | 1 s*a (s*a)^2 b b^2 0 0 |
| .. ..., ... |
| 1 s*a (s*a)^2 t*b (t*b)^2 c c^2 0 0 ... |
| ... ... ... |
where s and t are the lengths between two successive change points (whatever
it is), a an b are of form x_s-x_0 and x_t-x_s.
*****************************************************************************
About the program
This is a combination code of C++ and R. To deal with the matrix, a package
called Template Numerical Toolkit (tnt), developed by National Institute of
Standard and Technology (NIST), was adapted.
BCF.cpp implements the Particle Filtering,
rand.cpp gives the different simulations of random probabilities
tnt_addon.hpp implements the inverse of sparse matrix
To generate the .so file, type
make -f Makefile.Rshare
in the terminal.
******************************************************************************
INSTRUCTIONS
To run the code:
(i) Type
source("simBCF.R")
in the R prompt.
(ii) Then, choose the length of data, e.g nn <- 200
load the fucntion you want to analyse,e.g.
# heavisine function
heavi.sine <- function(t)
{
f <- 10*sin(4*pi*t)-4*sign(t-.3)-4*sign(.72-t)+rnorm(nn)
return(f)
}
t <- c(.3,.72,1)
x <- c(1:nn)/nn
y <- heavi.sine(x)
(iii) Run the function "sim.bcf" in R to obtain the simulation results;
Run the function "y.mix" in R to obtain the fitted line.
For example, type following commands in R prompt
sim.out <- sim.bcf(y,x,continuity=T)
y.mix <- para.sim(y,x,sim.out,continuity=T)
(iv) Run the function meanKSD.R in KSdistance archive to obtain the Kolmogorov
Smirnov distance.
P.S: The arguments of sim.bcf includes threshold, largest model order,prior
parameters. See simBCF.R for more details.
For 200 data, the code only takes 8 seconds for the result, while Denison's
code need 1m8s. This is the advantage of Particle Filters.
For 2048 data, Denison's code has to run 145 min on our computer, while our
algorithm spent a couple of minutes on the first model.
*******************************************************************************
Compiling:
Compiling executable code, type "make" in the prompt
Compile share object, type "make -f Make.Rshare" in the prompt
*******************************************************************************
Update:
Modified the way to calculate incremental weight, P(y_t|cpt,q), which is more
efficient
About
On Line Inference for multiple Changepoints problem in time series model.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published