1. 
 Burdakov, Oleg, 1953, et al.
(författare)

An algorithm for isotonic regression problems
 2004

Ingår i: European Congress on Computational Methods in Applied Sciences and Engineering ECCOMAS.  Jyväskylä : University of Jyväskylä.  9513918688 ; s. 19

Konferensbidrag (refereegranskat)abstract
 We consider the problem of minimizing the distance from a given ndimensional vector to a set defined by constraintsof the form xi <img src="http://www.divaportal.org/cgibin/mimetex.cgi?%5Cleq" /> xj Such constraints induce a partial order of the components xi, which can be illustrated by an acyclic directed graph.This problem is known as the isotonic regression (IR) problem. It has important applications in statistics, operations research and signal processing. The most of the applied IR problems are characterized by a very large value of n. For such largescale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly.The existing optimizationbased algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the PoolAdjacentViolator (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n2) and high accuracy. This allows us to obtain sufficiently accurate solutions to the IR problems with thousands of observations.


2. 
 Burdakov, Oleg, 1953, et al.
(författare)

An O(n2) algorithm for isotonic regression
 2006

Ingår i: LargeScale Nonlinear Optimization.  New York : Springer Science+Business Media B.V..  0387300635 ; s. 2533

Konferensbidrag (övrigt vetenskapligt)abstract
 We consider the problem of minimizing the distance from a given ndimensional vector to a set defined by constraints of the form xi ≤ xj. Such constraints induce a partial order of the components xi, which can be illustrated by an acyclic directed graph. This problem is also known as the isotonic regression (IR) problem. IR has important applications in statistics, operations research and signal processing, with most of them characterized by a very large value of n. For such largescale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly. The existing optimizationbased algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the PoolAdjacentViolator (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n2) and high accuracy. This allows us to obtain sufficiently accurate solutions to IR problems with thousands of observations.


3. 
 Burdakov, Oleg, 1953, et al.
(författare)

Data preordering in generalized PAV algorithm for monotonic regression
 2006

Ingår i: Journal of Computational Mathematics.  02549409. ; 24:6, s. 771790

Tidskriftsartikel (refereegranskat)abstract
 Monotonic regression (MR) is a least distance problem with monotonicity constraints induced by a partially ordered data set of observations. In our recent publication [In Ser. {\sl Nonconvex Optimization and Its Applications}, SpringerVerlag, (2006) {\bf 83}, pp. 2533], the PoolAdjacentViolators algorithm (PAV) was generalized from completely to partially ordered data sets (posets). The new algorithm, called GPAV, is characterized by the very low computational complexity, which is of second order in the number of observations. It treats the observations in a consecutive order, and it can follow any arbitrarily chosen topological order of the poset of observations. The GPAV algorithm produces a sufficiently accurate solution to the MR problem, but the accuracy depends on the chosen topological order. Here we prove that there exists a topological order for which the resulted GPAV solution is optimal. Furthermore, we present results of extensive numerical experiments, from which we draw conclusions about the most and the least preferable topological orders.


4. 


5. 
 Burdakov, Oleg, 1953, et al.
(författare)

Generalized PAV algorithm with block refinement for partially ordered monotonic regression
 2009

Ingår i: Proceedings of the Workshop on Learning Monotone Models from Data. ; s. 2337

Konferensbidrag (refereegranskat)abstract
 In this paper, the monotonic regression problem (MR) is considered. We have recentlygeneralized for MR the wellknown PoolAdjacentVoilators algorithm(PAV) from the case of completely to partially ordered data sets. Thenew algorithm, called GPAV, combines both high accuracy and lowcomputational complexity which grows quadratically with the problemsize. The actual growth observed in practice is typically far lowerthan quadratic. The fitted values of the exact MR solution composeblocks of equal values. Its GPAV approximation has also a blockstructure. We present here a technique for refining blocks produced bythe GPAV algorithm to make the new blocks more close to those in theexact solution. This substantially improves the accuracy of the GPAVsolution and does not deteriorate its computational complexity. Thecomputational time for the new technique is approximately triple thetime of running the GPAV algorithm. Its efficiency is demonstrated byresults of our numerical experiments.


6. 
 Burdakov, Oleg, 1953, et al.
(författare)

Monotonic data fitting and interpolation with application to postprocessing of FE solutions
 2007

Ingår i: CERFACS 20th Anniversary Conference on Highperformance Computing,2007. ; s. 1112

Konferensbidrag (övrigt vetenskapligt)abstract
 In this talk we consider the isotonic regression (IR) problem which can beformulated as follows. Given a vector $\bar{x} \in R^n$, find $x_* \in R^n$ which solves the problem:\begin{equation}\label{ir2}\begin{array}{cl}\mbox{min} & \x\bar{x}\^2 \\\mbox{s.t.} & Mx \ge 0.\end{array}\end{equation}The set of constraints $Mx \ge 0$ represents here the monotonicityrelations of the form $x_i \le x_j$ for a given set of pairs of thecomponents of $x$. The corresponding row of the matrix $M$ is composed mainly of zeros,but its $i$th and $j$th elements, which are equal to $1$ and $+1$,respectively. The most challenging applications of (\ref{ir2}) arecharacterized by very large values of $n$. We introduce new IRalgorithms. Our numerical experiments demonstrate the high efficiencyof our algorithms, especially for very largescale problems, and theirrobustness. They are able to solve some problems which all existing IRalgorithms fail to solve. We outline also our new algorithms formonotonicitypreserving interpolation of scattered multivariate data.In this talk we focus on application of our IR algorithms inpostprocessing of FE solutions.Nonmonotonicity ofthe numerical solution is a typical drawback of the conventionalmethods of approximation, such as finite elements (FE), finite volumes,and mixed finite elements. The problem of monotonicity isparticularly important in cases of highly anisotropic diffusion tensorsor distorted unstructured meshes. For instance, in the nuclear wastetransport simulation, the nonmonotonicity results in the presence ofnegative concentrations which may lead to unacceptable concentrationand chemistry calculations failure. Another drawback of the conventionalmethods is a possible violation of the discrete maximum principle, whichestablishes lower and upper bounds for the solution.We suggest here a leastchange correction to the available FE solution$\bar{x} \in R^n$. This postprocessing procedure is aimed on recoveringthe monotonicity and some other important properties that may not beexhibited by $\bar{x}$.The mathematical formulation of the postprocessing problemis reduced to the following convex quadratic programming problem\begin{equation}\label{ls2}\begin{array}{cl}\mbox{min} & \x\bar{x}\^2 \\\mbox{s.t.} & Mx \ge 0, \quad l \le x \le u, \quad e^Tx = m,\end{array}\end{equation}where$e=(1,1, \ldots ,1)^T \in R^n$. The set of constraints $Mx \ge 0$ represents here the monotonicityrelations between some of the adjacent mesh cells.The constraints $l \le x \le u$ originate fromthe discrete maximum principle. The last constraint formulates the conservativity requirement.The postprocessing based on (\ref{ls2}) is typically a large scaleproblem. We introduce here algorithms for solving this problem. Theyare based on the observation that, in the presence of the monotonicityconstraints only, problem (\ref{ls2}) is the classical monotonicregression problem, which can be solved efficiently by some of theavailable monotonic regression algorithms. This solution is used thenfor producing the optimal solution to problem (\ref{ls2}) in thepresence of all the constraints. We present results of numericalexperiments to illustrate the efficiency of our algorithms.


7. 
 Burdakov, Oleg, 1953, et al.
(författare)

New optimization algorithms for largescale isotonic regression in L2norm
 2007

Ingår i: EUROPTOMS Conference on Optimization,2007.  University of Hradec Kralove, Czech Republic : Guadeamus. ; s. 4444

Konferensbidrag (övrigt vetenskapligt)abstract
 Isotonicregression problem (IR) has numerous important applications instatistics, operations research, biology, image and signalprocessingand other areas. IR in L2norm is a minimization problem in whichtheobjective function is the squared Euclidean distance from a givenpointto a convex set defined by monotonicity constraints of the form:ithcomponent of the decision vector is less or equal to its jthcomponent. Unfortunately, the conventional optimization methods areunable to solve IR problems originating from large data sets.The existing IR algorithms, such as the minimum lower setsalgorithm byBrunk, the minmax algorithm by Lee, the network flow algorithm byMaxwell & Muchstadt and the IBCR algorithm by Block et al. areable tofind exact solution to IR problem for at most a few thousands ofvariables. The IBCR algorithm, which proved to be the mostefficient ofthem, is not robust enough. An alternative approach is related tosolving IR problem approximately. Following this approach, Burdakovetal. developed an algorithm, called GPAV, whose block refinementextension,GPAVR, is able to solve IR problems with a very high accuracy in afarshorter time than the exact algorithms. Apart from this, GPAVR is avery robust algorithm, and it allows us to solve IR problems withoverhundred thousands of variables.In this talk, we introduce new exact IR algorithms, which can beviewedas active set methods. They use the approximate solution producedbythe GPAVR algorithm as a starting point. We present results of ournumerical experiments demonstrating the high efficiency of the newalgorithms, especially for very largescale problems, and theirrobustness. They are able to solve the problems which all existingexact IR algorithms fail to solve.


8. 


9. 
 Oleg, Sysoev, et al.
(författare)

Bootstrap confidence intervals for largescale multivariate monotonic regression problems
 2014

Ingår i: Communications in statistics. Simulation and computation.  03610918.

Tidskriftsartikel (refereegranskat)abstract
 Recently, the methods used to estimate Monotonic Regression (MR) models have been substantially improved, and some algorithms can now produce highaccuracy monotonic fits to multivariate datasets containing over a million observations. Nevertheless, the computational burden can be prohibitively large for resampling techniques in which numerous datasets are processed independently of each other. Here, we present efficient algorithms for estimation of confidence limits in largescale settings that take into account the similarity of the bootstrap or jackknifed datasets to which MR models are fitted. In addition, we introduce modifications that substantially improve the accuracy of MR solutions for binary response variables. The performance of our algorithms is illustrated using data on death in coronary heart disease for a large population. This example also illustrates that MR can be a valuable complement to logistic regression.


10. 
 Sysoev, Oleg, et al.
(författare)

Bootstrap estimation of the variance of the error term in monotonic regression models
 2013

Ingår i: Journal of Statistical Computation and Simulation.  Taylor & Francis Group.  00949655. ; 83:4, s. 625638

Tidskriftsartikel (refereegranskat)abstract
 The variance of the error term in ordinary regression models and linear smoothers is usually estimated by adjusting the average squared residual for the trace of the smoothing matrix (the degrees of freedom of the predicted response). However, other types of variance estimators are needed when using monotonic regression (MR) models, which are particularly suitable for estimating response functions with pronounced thresholds. Here, we propose a simple bootstrap estimator to compensate for the overfitting that occurs when MR models are estimated from empirical data. Furthermore, we show that, in the case of one or two predictors, the performance of this estimator can be enhanced by introducing adjustment factors that take into account the slope of the response function and characteristics of the distribution of the explanatory variables. Extensive simulations show that our estimators perform satisfactorily for a great variety of monotonic functions and error distributions.

