1. 
 Burdakov, Oleg, 1953, et al.
(författare)

A Dual ActiveSet Algorithm for Regularized Monotonic Regression
 2017

Ingår i: Journal of Optimization Theory and Applications.  Springer.  00223239. ; 172:3, s. 929949

Tidskriftsartikel (refereegranskat)abstract
 Monotonic (isotonic) regression is a powerful tool used for solving a wide range of important applied problems. One of its features, which poses a limitation on its use in some areas, is that it produces a piecewise constant fitted response. For smoothing the fitted response, we introduce a regularization term in the monotonic regression, formulated as a least distance problem with monotonicity constraints. The resulting smoothed monotonic regression is a convex quadratic optimization problem. We focus on the case, where the set of observations is completely (linearly) ordered. Our smoothed pooladjacentviolators algorithm is designed for solving the regularized problem. It belongs to the class of dual activeset algorithms. We prove that it converges to the optimal solution in a finite number of iterations that does not exceed the problem size. One of its advantages is that the active set is progressively enlarging by including one or, typically, more constraints per iteration. This resulted in solving largescale test problems in a few iterations, whereas the size of that problems was prohibitively too large for the conventional quadratic optimization solvers. Although the complexity of our algorithm grows quadratically with the problem size, we found its running time to grow almost linearly in our computational experiments.


2. 
 Burdakov, Oleg, 1953, et al.
(författare)

An algorithm for isotonic regression problems
 2004

Ingår i: European Congress on Computational Methods in Applied Sciences and Engineering ECCOMAS.  Jyväskylä : University of Jyväskylä.  9513918688 ; s. 19

Konferensbidrag (refereegranskat)abstract
 We consider the problem of minimizing the distance from a given ndimensional vector to a set defined by constraintsof the form xi <img src="http://www.divaportal.org/cgibin/mimetex.cgi?%5Cleq" /> xj Such constraints induce a partial order of the components xi, which can be illustrated by an acyclic directed graph.This problem is known as the isotonic regression (IR) problem. It has important applications in statistics, operations research and signal processing. The most of the applied IR problems are characterized by a very large value of n. For such largescale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly.The existing optimizationbased algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the PoolAdjacentViolator (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n2) and high accuracy. This allows us to obtain sufficiently accurate solutions to the IR problems with thousands of observations.


3. 
 Burdakov, Oleg, 1953, et al.
(författare)

An O(n2) algorithm for isotonic regression
 2006

Ingår i: LargeScale Nonlinear Optimization.  New York : Springer Science+Business Media B.V..  0387300635 ; s. 2533

Konferensbidrag (övrigt vetenskapligt)abstract
 We consider the problem of minimizing the distance from a given ndimensional vector to a set defined by constraints of the form xi ≤ xj. Such constraints induce a partial order of the components xi, which can be illustrated by an acyclic directed graph. This problem is also known as the isotonic regression (IR) problem. IR has important applications in statistics, operations research and signal processing, with most of them characterized by a very large value of n. For such largescale problems, it is of great practical importance to develop algorithms whose complexity does not rise with n too rapidly. The existing optimizationbased algorithms and statistical IR algorithms have either too high computational complexity or too low accuracy of the approximation to the optimal solution they generate. We introduce a new IR algorithm, which can be viewed as a generalization of the PoolAdjacentViolator (PAV) algorithm from completely to partially ordered data. Our algorithm combines both low computational complexity O(n2) and high accuracy. This allows us to obtain sufficiently accurate solutions to IR problems with thousands of observations.


4. 


5. 
 Burdakov, Oleg, 1953, et al.
(författare)

Data preordering in generalized PAV algorithm for monotonic regression
 2006

Ingår i: Journal of Computational Mathematics.  02549409. ; 24:6, s. 771790

Tidskriftsartikel (refereegranskat)abstract
 Monotonic regression (MR) is a least distance problem with monotonicity constraints induced by a partially ordered data set of observations. In our recent publication [In Ser. {\sl Nonconvex Optimization and Its Applications}, SpringerVerlag, (2006) {\bf 83}, pp. 2533], the PoolAdjacentViolators algorithm (PAV) was generalized from completely to partially ordered data sets (posets). The new algorithm, called GPAV, is characterized by the very low computational complexity, which is of second order in the number of observations. It treats the observations in a consecutive order, and it can follow any arbitrarily chosen topological order of the poset of observations. The GPAV algorithm produces a sufficiently accurate solution to the MR problem, but the accuracy depends on the chosen topological order. Here we prove that there exists a topological order for which the resulted GPAV solution is optimal. Furthermore, we present results of extensive numerical experiments, from which we draw conclusions about the most and the least preferable topological orders.


6. 
 Burdakov, Oleg, 1953, et al.
(författare)

Generalized PAV algorithm with block refinement for partially ordered monotonic regression
 2009

Ingår i: Proceedings of the Workshop on Learning Monotone Models from Data. ; s. 2337

Konferensbidrag (övrigt vetenskapligt)abstract
 In this paper, the monotonic regression problem (MR) is considered. We have recentlygeneralized for MR the wellknown PoolAdjacentVoilators algorithm(PAV) from the case of completely to partially ordered data sets. Thenew algorithm, called GPAV, combines both high accuracy and lowcomputational complexity which grows quadratically with the problemsize. The actual growth observed in practice is typically far lowerthan quadratic. The fitted values of the exact MR solution composeblocks of equal values. Its GPAV approximation has also a blockstructure. We present here a technique for refining blocks produced bythe GPAV algorithm to make the new blocks more close to those in theexact solution. This substantially improves the accuracy of the GPAVsolution and does not deteriorate its computational complexity. Thecomputational time for the new technique is approximately triple thetime of running the GPAV algorithm. Its efficiency is demonstrated byresults of our numerical experiments.


7. 
 Burdakov, Oleg, 1953, et al.
(författare)

Monotonic data fitting and interpolation with application to postprocessing of FE solutions
 2007

Ingår i: CERFACS 20th Anniversary Conference on Highperformance Computing,2007. ; s. 1112

Konferensbidrag (övrigt vetenskapligt)abstract
 In this talk we consider the isotonic regression (IR) problem which can be formulated as follows. Given a vector $\bar{x} \in R^n$, find $x_* \in R^n$ which solves the problem: \begin{equation}\label{ir2} \begin{array}{cl} \mbox{min} & \x\bar{x}\^2 \\ \mbox{s.t.} & Mx \ge 0. \end{array} \end{equation} The set of constraints $Mx \ge 0$ represents here the monotonicity relations of the form $x_i \le x_j$ for a given set of pairs of the components of $x$. The corresponding row of the matrix $M$ is composed mainly of zeros, but its $i$th and $j$th elements, which are equal to $1$ and $+1$, respectively. The most challenging applications of (\ref{ir2}) are characterized by very large values of $n$. We introduce new IR algorithms. Our numerical experiments demonstrate the high efficiency of our algorithms, especially for very largescale problems, and their robustness. They are able to solve some problems which all existing IR algorithms fail to solve. We outline also our new algorithms for monotonicitypreserving interpolation of scattered multivariate data. In this talk we focus on application of our IR algorithms in postprocessing of FE solutions. Nonmonotonicity of the numerical solution is a typical drawback of the conventional methods of approximation, such as finite elements (FE), finite volumes, and mixed finite elements. The problem of monotonicity is particularly important in cases of highly anisotropic diffusion tensors or distorted unstructured meshes. For instance, in the nuclear waste transport simulation, the nonmonotonicity results in the presence of negative concentrations which may lead to unacceptable concentration and chemistry calculations failure. Another drawback of the conventional methods is a possible violation of the discrete maximum principle, which establishes lower and upper bounds for the solution. We suggest here a leastchange correction to the available FE solution $\bar{x} \in R^n$. This postprocessing procedure is aimed on recovering the monotonicity and some other important properties that may not be exhibited by $\bar{x}$. The mathematical formulation of the postprocessing problem is reduced to the following convex quadratic programming problem \begin{equation}\label{ls2} \begin{array}{cl} \mbox{min} & \x\bar{x}\^2 \\ \mbox{s.t.} & Mx \ge 0, \quad l \le x \le u, \quad e^Tx = m, \end{array} \end{equation} where$e=(1,1, \ldots ,1)^T \in R^n$. The set of constraints $Mx \ge 0$ represents here the monotonicity relations between some of the adjacent mesh cells. The constraints $l \le x \le u$ originate from the discrete maximum principle. The last constraint formulates the conservativity requirement. The postprocessing based on (\ref{ls2}) is typically a large scale problem. We introduce here algorithms for solving this problem. They are based on the observation that, in the presence of the monotonicity constraints only, problem (\ref{ls2}) is the classical monotonic regression problem, which can be solved efficiently by some of the available monotonic regression algorithms. This solution is used then for producing the optimal solution to problem (\ref{ls2}) in the presence of all the constraints. We present results of numerical experiments to illustrate the efficiency of our algorithms.


8. 
 Burdakov, Oleg, 1953, et al.
(författare)

New optimization algorithms for largescale isotonic regression in L2norm
 2007

Ingår i: EUROPTOMS Conference on Optimization,2007.  University of Hradec Kralove, Czech Republic : Guadeamus. ; s. 4444

Konferensbidrag (övrigt vetenskapligt)abstract
 Isotonicregression problem (IR) has numerous important applications instatistics, operations research, biology, image and signalprocessingand other areas. IR in L2norm is a minimization problem in whichtheobjective function is the squared Euclidean distance from a givenpointto a convex set defined by monotonicity constraints of the form:ithcomponent of the decision vector is less or equal to its jthcomponent. Unfortunately, the conventional optimization methods areunable to solve IR problems originating from large data sets.The existing IR algorithms, such as the minimum lower setsalgorithm byBrunk, the minmax algorithm by Lee, the network flow algorithm byMaxwell & Muchstadt and the IBCR algorithm by Block et al. areable tofind exact solution to IR problem for at most a few thousands ofvariables. The IBCR algorithm, which proved to be the mostefficient ofthem, is not robust enough. An alternative approach is related tosolving IR problem approximately. Following this approach, Burdakovetal. developed an algorithm, called GPAV, whose block refinementextension,GPAVR, is able to solve IR problems with a very high accuracy in afarshorter time than the exact algorithms. Apart from this, GPAVR is avery robust algorithm, and it allows us to solve IR problems withoverhundred thousands of variables.In this talk, we introduce new exact IR algorithms, which can beviewedas active set methods. They use the approximate solution producedbythe GPAVR algorithm as a starting point. We present results of ournumerical experiments demonstrating the high efficiency of the newalgorithms, especially for very largescale problems, and theirrobustness. They are able to solve the problems which all existingexact IR algorithms fail to solve.


9. 


10. 
 Sysoev, Oleg, et al.
(författare)

Bootstrap confidence intervals for largescale multivariate monotonic regression problems
 2016

Ingår i: Communications in statistics. Simulation and computation.  Taylor & Francis.  03610918. ; 45:3, s. 10251040

Tidskriftsartikel (refereegranskat)abstract
 Recently, the methods used to estimate monotonic regression (MR) models have been substantially improved, and some algorithms can now produce highaccuracy monotonic fits to multivariate datasets containing over a million observations. Nevertheless, the computational burden can be prohibitively large for resampling techniques in which numerous datasets are processed independently of each other. Here, we present efficient algorithms for estimation of confidence limits in largescale settings that take into account the similarity of the bootstrap or jackknifed datasets to which MR models are fitted. In addition, we introduce modifications that substantially improve the accuracy of MR solutions for binary response variables. The performance of our algorithms isillustrated using data on death in coronary heart disease for a large population. This example also illustrates that MR can be a valuable complement to logistic regression.

