STATISTICA








(add-on product)




Design of Experiments. STATISTICA Design of Experiments offers an extremely comprehensive selection of procedures to design and analyze the experimental designs used in industrial (quality) research: 2**(k-p) factorial designs with blocking (for over 100 factors, including unique, highly efficient search algorithms for finding minimum aberration and maximum unconfounding designs, where the user can specify the interaction effects of interest that are to be unconfounded), screening designs (for over 100 factors, including Plackett-Burman designs), 3**(k-p) factorial designs with blocking (including Box-Behnken designs), mixed-level designs, central composite (or response surface) designs (including small central composite designs), Latin square designs, Taguchi robust design experiments via orthogonal arrays, mixture designs and triangular surface designs, vertices and centroids for constrained surfaces and mixtures, and D- and A-optimal designs for factorial designs, surfaces, and mixtures. The specific types of available designs, and methods for generating and analyzing them, are described in the following sections.

STATISTICA Design of Experiments is compatible with Windows 95, Windows 98, Windows NT, Windows 2000, Windows XP, Windows Me.

Analysis of experiments: General features. The options for analyzing all factorial, response surface, and mixture designs are general in nature, can handle unbalanced and incomplete designs, and give the user full control of the choice of models to be fitted to the data. The program will compute the generalized inverse of the X'X matrix (where X stands for the design matrix) to determine the estimable effects, and the effects that are aliases of other effects. The program will then automatically report the table of aliases and compute the parameter estimates for all non-redundant effects. You can also manually "toggle" specific effects in and out of the current model quickly and easily, and observe the effect on the overall fit. All analyses can be performed in terms of recoded factor values or the original factor values, and a large number of output options are provided to review the parameter estimates, analysis of variance table, etc. Numerous additional options are provided for exploring the predicted (fitted) means, surfaces, etc.; these options will be further described in the context of the respective designs below.
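
As a rough illustration of what a table of aliases contains, the sketch below finds aliased effects in a small two-level fraction by comparing contrast columns. This is a simplified stand-in for the generalized-inverse computation described above, and the 2**(3-1) design with generator C = AB is a made-up example, not output from STATISTICA.

```python
from itertools import combinations, product

def half_fraction():
    """Hypothetical 2**(3-1) design: A and B are free, C = A*B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

def effect_columns(runs, names, max_order=2):
    """Map each effect label (e.g. 'AB') to its +/-1 contrast column."""
    cols = {}
    for order in range(1, max_order + 1):
        for idx in combinations(range(len(names)), order):
            label = "".join(names[i] for i in idx)
            col = []
            for run in runs:
                value = 1
                for i in idx:
                    value *= run[i]
                col.append(value)
            cols[label] = tuple(col)
    return cols

def alias_table(cols):
    """Group effects whose columns are identical up to sign."""
    groups = {}
    for label, col in cols.items():
        # Normalize the sign so that a column and its negative share a key.
        key = col if col[0] == 1 else tuple(-v for v in col)
        groups.setdefault(key, []).append(label)
    return [g for g in groups.values() if len(g) > 1]

aliases = alias_table(effect_columns(half_fraction(), "ABC"))
for group in aliases:
    print(" = ".join(group))
```

In this fraction every main effect shares a column with one two-factor interaction, so only one effect from each group can be kept in the model.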

Residual analyses and transformations. A large number of graphs and other output options are provided for further analyses of residuals from a given model. Specifically, the program will compute predicted (fitted) and residual values and their standard errors, user-defined prediction intervals and confidence intervals for the predicted (fitted) values, standardized predicted and residual values, studentized residuals, deleted residuals, studentized deleted residuals, leverage scores, Mahalanobis and Cook distances, and DFFIT and standardized DFFIT values. All of these residual statistics can be saved for further analysis using other STATISTICA modules (e.g., in order to analyze serial correlations of errors via the Time Series module). Also, these residual statistics for each observation can be reviewed in the order of the observation (case) numbers, or displayed in the order sorted by their magnitudes; thus, outliers with respect to any of the residual statistics can quickly be identified. As further aids for evaluating the fit of the respective model, and for identifying outliers, you can review histograms of residual (and deleted residual) and predicted values, scatterplots of (deleted) residual versus predicted values, or normal, half-normal, and de-trended normal probability plots of (deleted) residuals. Also, as a check for serial correlation of residuals, you can plot the (deleted) residual values against the case numbers. In all plots of individual observations (e.g., residual values for cases), the points are identified by their respective case numbers or labels, and therefore, it is very easy to identify outliers in a dataset. Finally, maximum-likelihood lambda values can be computed for the Box-Cox transformation of the response variables; a plot of the residual sums of squares versus lambda, along with the confidence limit of lambda, accompanies the results in the Box-Cox transformation plot.
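
The relation between the residual sum of squares and lambda can be sketched in a few lines. The example below assumes the simplest possible model (a single overall mean) and made-up data; it uses the geometric-mean scaling of the Box-Cox transform, under which the lambda that minimizes the residual sum of squares is the maximum-likelihood estimate. The function names are hypothetical.

```python
import math

def boxcox_normalized(y, lam):
    """Box-Cox transform scaled by the geometric mean, so that residual
    sums of squares are comparable across lambda values."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * gm ** (lam - 1.0)) for v in y]

def residual_ss(z):
    """Residual SS for a mean-only model (an assumption of this sketch)."""
    mean = sum(z) / len(z)
    return sum((v - mean) ** 2 for v in z)

def best_lambda(y, grid):
    """Grid value of lambda with the smallest residual SS."""
    return min(grid, key=lambda lam: residual_ss(boxcox_normalized(y, lam)))

y = [math.exp(x) for x in (-2, -1, 0, 1, 2)]   # made-up, log-symmetric data
grid = [i / 10 for i in range(-20, 21)]        # lambda from -2.0 to 2.0
lam_hat = best_lambda(y, grid)
print(lam_hat)
```

Plotting `residual_ss` against the grid gives the kind of curve described above; for a real design the mean-only model would be replaced by the fitted factorial or surface model.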


Optimization of single or multiple response variables: The response (desirability) profiler. A unique set of options is provided to allow the user to interactively optimize single or multiple response variables, given the current model. First, for second-order response surface models and mixture surface models, the program will compute the factor settings associated with the minimum, maximum, or saddle point value of the respective surface (i.e., determine the critical value of the current surface, along with the respective eigenvalues and eigenvectors, to indicate the curvature and orientation of the quadratic response surface). Note that for mixture designs, the desirability profiler options are not based on a simple reparameterization of the mixture model to an unconstrained surface model (which can lead to erroneous results, such as optimum factor settings that are not valid mixtures); instead all computations will be performed based on the actual (currently fitted constrained) mixture model. Thus, when searching for the optimum factor settings given the desirability function for one or more response variables, it is assured that only the constrained (mixture) experimental region is inspected, and that the resulting factor settings sum to a valid mixture. Second, a comprehensive set of graphical options is provided for visualizing the predicted values of one or more response variables as a function of each factor in the analysis, while holding all other factors constant at particular values. Specifically, for multiple response variables you can specify a desirability function that reflects the most desirable value for each response variable, and the importance of each variable for the overall desirability. Then you can plot the profiles of the desirability function (computed from the predicted values of each response variable) across a user-defined number of levels for each factor. Also, the profiles for each individual response variable, along with confidence intervals, can be displayed in the same graph.
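
For a second-order model in two factors, the critical (stationary) point and the eigenvalues that classify it can be written out directly. The sketch below is a textbook calculation under that two-factor assumption, with made-up coefficients; it is not a description of the program's internal routine.

```python
import math

def stationary_point(b1, b2, b11, b22, b12):
    """Stationary point of
    y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2."""
    c = b12 / 2.0                      # off-diagonal entry of the matrix B
    det = b11 * b22 - c * c
    if det == 0:
        raise ValueError("surface has no unique stationary point")
    # x_s = -0.5 * inverse(B) * b, written out for the 2x2 case
    x1 = -0.5 * (b22 * b1 - c * b2) / det
    x2 = -0.5 * (b11 * b2 - c * b1) / det
    # Eigenvalues of the symmetric matrix B describe the curvature.
    mid = (b11 + b22) / 2.0
    radius = math.hypot((b11 - b22) / 2.0, c)
    eigenvalues = (mid - radius, mid + radius)
    if eigenvalues[1] < 0:
        kind = "maximum"
    elif eigenvalues[0] > 0:
        kind = "minimum"
    else:
        kind = "saddle point"
    return (x1, x2), eigenvalues, kind

# Made-up coefficients: y = 10 + 2*x1 + 4*x2 - x1^2 - 2*x2^2
point, eigenvalues, kind = stationary_point(2.0, 4.0, -1.0, -2.0, 0.0)
print(point, eigenvalues, kind)
```

Both eigenvalues negative means the surface curves downward in every direction (a maximum); mixed signs indicate a saddle point.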

Moreover, the desirability function can be plotted in 3D surface plots or contour plots (desirability contours), and the user can request matrices of such plots for all factors in the analysis. All settings, such as the factor grid or the desirability function, can quickly be modified for interactive analyses (e.g., you can quickly exclude specific response variables from the analysis, and observe the effect on the overall desirability function). Also, the specifications for complex desirability functions for many response variables can be saved to a file, and later quickly retrieved when you want to analyze other experiments using the same response variables. Finally, options are provided for determining the optimum value of the desirability function, either by using a grid search over the experimental region, or by using an efficient general function optimization algorithm (which is particularly useful for optimizing desirability functions for experiments with many factors). Note that desirability profiling options are also provided in STATISTICA General Linear Models (GLM), General Regression Models (GRM), and General Discriminant Analysis Models (GDA) (for categorical responses).
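
A grid search over a desirability function can be sketched as follows. The example assumes piecewise-linear (Derringer-Suich style) desirabilities combined as a weighted geometric mean, which is a common convention rather than a confirmed description of the profiler; the two response models, their limits, and all function names are hypothetical.

```python
def larger_is_better(y, low, high, s=1.0):
    """Desirability rising from 0 at `low` to 1 at `high`."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** s

def smaller_is_better(y, low, high, s=1.0):
    """Desirability falling from 1 at `low` to 0 at `high`."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - low)) ** s

def overall_desirability(ds, weights):
    """Weighted geometric mean; any zero makes the whole setting undesirable."""
    if any(d == 0.0 for d in ds):
        return 0.0
    total = sum(weights)
    result = 1.0
    for d, w in zip(ds, weights):
        result *= d ** (w / total)
    return result

def yield_model(x):   # hypothetical fitted model, x in coded units
    return 50.0 + 10.0 * x

def cost_model(x):    # hypothetical fitted model, x in coded units
    return 20.0 + 5.0 * x

def grid_search(n_steps=20):
    """Evaluate the overall desirability on an even grid over [-1, 1]."""
    best_x, best_d = None, -1.0
    for i in range(n_steps + 1):
        x = -1.0 + 2.0 * i / n_steps
        d = overall_desirability(
            [larger_is_better(yield_model(x), 40.0, 60.0),
             smaller_is_better(cost_model(x), 15.0, 25.0)],
            [1.0, 1.0])
        if d > best_d:
            best_x, best_d = x, d
    return best_x, best_d

best_x, best_d = grid_search()
print(best_x, best_d)
```

Here the two responses pull in opposite directions, so the best compromise sits in the middle of the range. With many factors the grid grows quickly, which is why a general optimization algorithm is the more practical choice in that case.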

Standard two-level 2**(k-p) fractional factorial designs with blocks (Box-Hunter-Hunter minimum aberration designs). STATISTICA Design of Experiments provides the complete catalog of all standard (so-called, minimum aberration) designs (as, for example, reproduced in the widely used textbooks by Box and Draper, 1987; Box, Hunter, and Hunter, 1978; Montgomery, 1991). The user can review designs in a Spreadsheet; the runs may be randomized (overall or within blocks), and blank columns may be added to the Spreadsheet. Options are provided for specifying the factor highs and lows, and the design can be reviewed and saved in terms of the coded factor levels or the original metric of factors. The user can also request replications, add center points to the design, or add a fold-over of the original design. The fractional design generators and block generators of the design, as well as the matrix of aliases of main effects and interactions can also be reviewed. STATISTICA Design of Experiments will automatically perform a complete ANOVA on the design. The user has full control over the effects and interactions to be included in the model, and can review the correlations among the columns of the design matrix (X) as well as the inverse of the X'X matrix (i.e., the covariance and correlation matrices of the parameter estimates). The program will compute the ANOVA parameter estimates and their standard errors and confidence intervals, the coefficients for the recoded (-1, +1) factor values and their standard errors and confidence intervals, and the coefficients (standard errors, confidence intervals) for the untransformed factor values. Based on those estimates, the program can compute predicted values (standard errors, confidence intervals) for user-specified factor levels.

The program will compute the complete ANOVA table, based on the mean-square (ms) residual term, or, when the design is at least partially replicated, based on the estimate of pure error. When a pure error estimate is available, the program will also compute a test for overall lack-of-fit; when the design contains center points, the program will perform an overall curvature check. The user can review the table of means and marginal means, and their confidence intervals. Numerous options are available for reviewing the results in graphs: Pareto charts of effects, normal and half-normal probability plots of effects, square and cube plots, means plots and interaction plots (with confidence intervals for marginal means), response surface plots, and response contour plots. In addition, all general features described above (under the headings Design of experiments, Analysis of experiments: General features, Residual analyses and transformations, and Optimization of single or multiple response variables) are available, for performing detailed analyses of residuals, to evaluate the fit of the model, and for finding the optimum factor settings, given one or more response variables.
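
To make the role of design generators and the fold-over concrete, here is a minimal sketch that builds a 2**(5-2) fraction from two generators and appends its full fold-over. The generators D = AB and E = AC are chosen for illustration only and are not taken from the program's catalog.

```python
from itertools import product

def fractional_factorial(n_base, generators):
    """Build a two-level fraction. Each generator is a tuple of base-column
    indices whose product defines one added factor."""
    runs = []
    for base in product((-1, 1), repeat=n_base):
        extra = []
        for gen in generators:
            value = 1
            for i in gen:
                value *= base[i]
            extra.append(value)
        runs.append(base + tuple(extra))
    return runs

def fold_over(runs):
    """Full fold-over: reverse the sign of every factor in every run."""
    return [tuple(-v for v in run) for run in runs]

# Columns: A, B, C are free; D = A*B and E = A*C (illustrative generators).
base = fractional_factorial(3, [(0, 1), (0, 2)])
combined = base + fold_over(base)

# In the base fraction A is aliased with BD, so sum(A*B*D) equals the run count.
print(sum(r[0] * r[1] * r[3] for r in base))
# After the fold-over that sum is zero: A is no longer aliased with BD.
print(sum(r[0] * r[1] * r[3] for r in combined))
```

Folding over a resolution III fraction in this way separates the main effects from the two-factor interactions, at the cost of doubling the number of runs.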

Minimum aberration and maximum unconfounding 2**(k-p) fractional factorial designs with blocks: General design search. In addition to the standard 2**(k-p) designs, STATISTICA Design of Experiments includes a general design search option for generating minimum aberration (least confounded) fractional factorial designs with or without blocks with over 100 factors and over 2,000 runs. These types of efficient designs have only recently been discovered and they allow you to evaluate a greater number of (specific) factor interactions than the standard Box-Hunter designs; STATISTICA Design of Experiments is the only program that currently offers this functionality. Given a desired resolution, you can either perform a comprehensive search of all (non-isomorphic) sets of generators, or specify particular sets of interactions that you would like to keep unconfounded at the respective resolution. In addition to the common search criterion of "minimum aberration," you can also choose the criterion of "maximum unconfounding" which will lead to the design with the largest possible number of unconfounded effects (unconfounded with all other effects, given the current resolution of the design). These designs can be further enhanced in the same manner as the standard 2**(k-p) designs described in the previous paragraph (by adding replications, center points, foldover, etc.). Also, all analysis options described in the previous paragraph are applicable to these designs (or any arbitrary 2**(k-p) design).
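
The minimum aberration criterion compares designs by their word length pattern: the counts of words of length 3, 4, and so on in the defining relation. The sketch below computes that pattern for two hand-picked 2**(6-2) generator sets and prefers the one with the lexicographically smaller pattern. It compares only these two candidates and is not the search algorithm used by the program.

```python
from itertools import combinations

def defining_words(generator_words):
    """All products of the generator words. Multiplying two words cancels
    repeated letters, which is a symmetric difference of letter sets."""
    gens = [frozenset(w) for w in generator_words]
    words = set()
    for r in range(1, len(gens) + 1):
        for subset in combinations(gens, r):
            word = frozenset()
            for g in subset:
                word = word ^ g
            words.add(word)
    return words

def word_length_pattern(generator_words, n_factors):
    """Tuple (A3, A4, ..., An): number of defining words of each length."""
    counts = [0] * (n_factors + 1)
    for word in defining_words(generator_words):
        counts[len(word)] += 1
    return tuple(counts[3:])

def resolution(pattern):
    """Length of the shortest defining word."""
    return next(i + 3 for i, count in enumerate(pattern) if count)

# Two illustrative candidates for a 2**(6-2) design.
wlp_1 = word_length_pattern(["ABCE", "BCDF"], 6)    # E = ABC, F = BCD
wlp_2 = word_length_pattern(["ABE", "ABCDF"], 6)    # E = AB,  F = ABCD
print(wlp_1, resolution(wlp_1))
print(wlp_2, resolution(wlp_2))
print(min(wlp_1, wlp_2))   # the pattern with less aberration
```

The first candidate has no three-letter words, so no main effect is aliased with a two-factor interaction, which is why it is preferred.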

  See also the white paper entitled Minimum Aberration Designs Are Not Maximally Unconfounded.


Screening (Plackett-Burman) designs. STATISTICA Design of Experiments allows the user to design and analyze screening designs for a large number of factors. The program will generate Plackett-Burman (Hadamard matrix) designs and saturated fractional factorial designs with up to 127 factors. As with 2**(k-p) designs, the user can request replications of the design, manually add points, add center points, and print or save the design. For the analysis of screening designs, the same options are available as those described for the analysis of 2**(k-p) designs (see the previous paragraphs).
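
A 12-run Plackett-Burman design for up to 11 factors can be built by cyclically shifting one generating row and appending a row of minus signs. The sketch below uses the commonly tabulated generating row for N = 12 and then checks that all columns are balanced and mutually orthogonal; the function name is hypothetical.

```python
from itertools import combinations

# Commonly tabulated generating row for the 12-run design.
GENERATOR_12 = (1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1)

def plackett_burman_12():
    """Eleven cyclic shifts of the generating row plus one all-minus run."""
    n = len(GENERATOR_12)
    rows = [tuple(GENERATOR_12[(j - shift) % n] for j in range(n))
            for shift in range(n)]
    rows.append((-1,) * n)
    return rows

design = plackett_burman_12()
columns = list(zip(*design))

balanced = all(sum(col) == 0 for col in columns)
orthogonal = all(sum(a * b for a, b in zip(c1, c2)) == 0
                 for c1, c2 in combinations(columns, 2))
print(len(design), balanced, orthogonal)
```

Because every pair of columns is orthogonal, all 11 main effects can be estimated independently from only 12 runs, provided interactions are negligible.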

Mixed-level factorial designs. The program also supports mixed-level designs (as enumerated for the National Bureau of Standards of the U.S. Department of Commerce). The design and analysis options available for those designs are identical to those described for 3**(k-p) designs (see the next paragraph).

Three-level 3**(k-p) fractional factorial designs with blocks and Box-Behnken designs. STATISTICA Design of Experiments contains a complete implementation of the standard (blocked) 3**(k-p) designs. Also included are the standard Box-Behnken designs. As with all other designs, the user can display and save those designs in standard or randomized order, request replications or add individual runs, review the design and block generators, etc. The program will perform a complete analysis for 3**(k-p) designs. The user has full control over the effects that are to be included in the analysis. The main effects are broken down into linear and quadratic effects, and the interactions are broken down into linear-linear, linear-quadratic, quadratic-linear, and quadratic-quadratic effects. The user can review the correlation matrix of the design matrix (X) as well as the inverse of X'X. The program will compute the standard ANOVA parameter estimates (standard errors, confidence intervals, statistical significance, etc.), coefficients for the recoded (-1, 0, +1) factors, and coefficients for the unrecoded factors. Based on those values, the program provides options for computing predicted values (and standard errors, confidence intervals) based on user-specified values of the factors. The ANOVA table will include tests for the linear and quadratic components of each effect as well as combined multiple-degree-of-freedom tests for the effects. If the design includes replications, then the estimate of pure error can be used for the ANOVA and significance testing; in that case an overall lack-of-fit test will also be performed.

To aid in the interpretation of results, the program will compute the table of means (and confidence intervals) as well as marginal means (and confidence intervals) for interactions. Graphical options include plots of means and marginal means (with confidence intervals), the Pareto chart of effects, normal and half-normal probability plots of effects, and response surface and contour plots. In addition, all general features described above (under the headings Design of experiments, Analysis of experiments: General features, Residual analyses and transformations, and Optimization of single or multiple response variables) are available, for performing detailed analyses of residuals, to evaluate the fit of the model, and for finding the optimum factor settings, given one or more response variables.
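
For a small number of factors, a Box-Behnken design can be written down directly: every pair of factors is run through a 2x2 factorial while the remaining factors are held at their center level, and center runs are added. The sketch below assumes this pairwise construction, which matches the standard designs for 3 to 5 factors; larger designs use other factor groupings and are not covered here.

```python
from itertools import combinations, product

def box_behnken(k, n_center=3):
    """Pairwise Box-Behnken construction (assumed valid for k = 3, 4, 5)."""
    runs = []
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * k
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * k] * n_center)
    return runs

design = box_behnken(3)
for run in design:
    print(run)
```

No run places all factors at an extreme level at once, which is a practical reason these designs are chosen when corner settings are expensive or unsafe.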


Central composite (response surface) designs. The user can choose from a catalog of standard designs, including small central composite designs (based on Plackett-Burman designs). In addition to the standard options available for all designs (adding runs, randomization, replications, factor highs and lows, etc.; refer to the description of 2**(k-p) designs) the user has the choice of star-points that are face-centered, or computed for rotatability, orthogonality, or both. The analysis options are very similar to those described for 3**(k-p) and 2**(k-p) designs above. The user can compute the ANOVA parameters, coefficients for the recoded factor values, and the coefficients for the untransformed factors. Predicted values for user-specified factor values can also be computed. The user has full control over the effects to be included in the model, and can review the correlation matrix for the design matrix (X) as well as the inverse of X'X. If replicates are available, the ANOVA table may include the estimate of pure error, and an overall lack-of-fit test. The standard results graphics options include the Pareto chart of effects, probability plot of effects, and response surface and contour plots (if there are more than two factors, for user-specified values of additional factors). In addition, all general features described above (under the headings Design of experiments, Analysis of experiments: General features, Residual analyses and transformations, and Optimization of single or multiple response variables) are available, for performing detailed analyses of residuals, to evaluate the fit of the model, and for finding the optimum factor settings, given one or more response variables.
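
The choice of star-point distance can be illustrated with a small generator. The sketch below builds a central composite design from a full 2**k factorial core and uses the usual rotatability rule, alpha = (number of factorial runs)**(1/4), unless another alpha is supplied; the orthogonality option is not covered, and the function name is hypothetical.

```python
from itertools import product

def central_composite(k, alpha=None, n_center=1):
    """Full-factorial core, 2*k star points, and n_center center points."""
    n_factorial = 2 ** k
    if alpha is None:
        alpha = n_factorial ** 0.25      # rotatable star-point distance
    factorial = [tuple(float(v) for v in run)
                 for run in product((-1, 1), repeat=k)]
    star = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            star.append(tuple(point))
    center = [(0.0,) * k] * n_center
    return factorial + star + center, alpha

rotatable, alpha = central_composite(2, n_center=3)
face_centered, _ = central_composite(2, alpha=1.0)
print(alpha)
print(len(rotatable), len(face_centered))
```

A face-centered design (alpha = 1) keeps every run inside the original factor range, while the rotatable choice pushes the star points outside it in exchange for equal prediction variance at equal distance from the center.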

Latin squares. The user can choose between different Latin square designs, with up to nine levels. Whenever possible, the program will also make available Greco-Latin squares and Hyper-Greco Latin squares. When there are several alternative Latin squares available, the program will either choose randomly from among them, or the user can select the desired Latin square(s). Designs can be reviewed in a Spreadsheet, the run order can be randomized, and blank columns may be added to create convenient data entry forms. The design can also be saved in a standard STATISTICA data file. After appending the observed data to this file, the experiment can then be easily analyzed. In addition to the full ANOVA table, STATISTICA Design of Experiments will compute the means for all factors. These means can be plotted in a summary plot.
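
One simple way to obtain a randomized Latin square is to start from the cyclic square and shuffle its rows, columns, and symbols. The sketch below does this and verifies the result; it is only an illustration, it does not sample uniformly from all possible Latin squares, and it is not the selection method used by the program.

```python
import random

def random_latin_square(n, seed=None):
    """Cyclic n x n square with rows, columns, and symbols shuffled."""
    rng = random.Random(seed)
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng.shuffle(square)                          # permute rows
    cols = list(range(n))
    rng.shuffle(cols)                            # permute columns
    square = [[row[c] for c in cols] for row in square]
    symbols = list(range(n))
    rng.shuffle(symbols)                         # relabel treatments
    return [[symbols[v] for v in row] for row in square]

def is_latin_square(square):
    """Each symbol appears exactly once in every row and every column."""
    n = len(square)
    target = set(range(n))
    rows_ok = all(set(row) == target for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == target
                  for c in range(n))
    return rows_ok and cols_ok

square = random_latin_square(5, seed=1)
for row in square:
    print(row)
print(is_latin_square(square))
```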

Taguchi robust design experiments. STATISTICA Design of Experiments will generate orthogonal arrays for up to 31 factors; designs with up to 65 factors can be analyzed. As in all other types of designs, the runs of the experiment can be randomized, and the user can add blank columns to the Spreadsheet to generate convenient data entry forms. The user can also examine the aliases of two-way interactions. STATISTICA Design of Experiments will automatically compute the standard signal-to-noise (S/N) ratios for problems of these types: (1) Smaller-the-better, (2) Nominal-the-best, (3) Larger-the-better, (4) Signed target, (5) Fraction defective, and (6) Number defective per interval (accumulation analysis). In addition, untransformed data can also be analyzed; thus, the user can produce any type of customized S/N ratios via STATISTICA Visual Basic and analyze them with this procedure. In addition to comprehensive descriptive statistics, the user can review the computed S/N ratios. The full ANOVA results are displayed in an interactive Spreadsheet in which the user can "toggle" effects into or out of the error term. A similar interactive Spreadsheet allows the user to predict Eta (the S/N ratio) under optimum conditions, that is, settings of levels of factors. Again, the user can "toggle" effects into or out of the model, and specify particular levels for factors. Finally, the means can be summarized in a standard main effect plot of Eta by factor level; if an accumulation analysis on categorical data is performed, the results can be summarized in a stacked bar plot as well as line plots of the cumulative probabilities across categories for the levels of selected factors. Note that different types of response desirability functions for single or multiple variables can also be optimized via the response (desirability) profiler described earlier, available in conjunction with 2**(k-p), 3**(k-p), central composite designs, etc. (or in GLM, GRM, GDA).
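
The three most common S/N ratios can be written compactly. The sketch below uses the usual textbook definitions (in decibels) for the smaller-the-better, nominal-the-best, and larger-the-better cases; the exact variants computed by the program may differ in detail, and the data are made up.

```python
import math

def sn_smaller_the_better(y):
    """-10 * log10 of the mean squared response."""
    return -10.0 * math.log10(sum(v * v for v in y) / len(y))

def sn_larger_the_better(y):
    """-10 * log10 of the mean squared reciprocal response."""
    return -10.0 * math.log10(sum(1.0 / (v * v) for v in y) / len(y))

def sn_nominal_the_best(y):
    """10 * log10 of (mean squared / sample variance)."""
    n = len(y)
    mean = sum(y) / n
    variance = sum((v - mean) ** 2 for v in y) / (n - 1)
    return 10.0 * math.log10(mean * mean / variance)

# Made-up replicate measurements for one run of an orthogonal array.
print(sn_smaller_the_better([1.0, 1.0]))
print(sn_larger_the_better([10.0, 10.0]))
print(sn_nominal_the_best([9.0, 10.0, 11.0]))
```

In each case a larger Eta is better, so the same main effect plot and ANOVA can be used regardless of which type of problem is being analyzed.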

Designs for mixtures and triangular graphs. This procedure includes options for designing the simplex-lattice and simplex-centroid designs for mixture variables. These designs can be enhanced by additional interior points and a centroid. The user can enter lower-bound constraints for each factor, and the program will automatically construct the respective design in the sub-simplex defined by the constraints. Multiple upper and lower constraints can be handled via the general facilities for constructing designs in constrained experimental regions (see below). The user can add individual runs or replications, and display and save the design in standard or randomized order. The program will compute the coefficients for the pseudo-components and the components in their original metric, along with the standard errors, confidence intervals, and tests of statistical significance. (Note that the STATISTICA General Linear Models (GLM) module also includes facilities for analyzing mixture experiments; those options are particularly useful for analyzing designs that combine both mixture and non-mixture variables in complex designs.) The user has full control over the terms that are to be included in the model; standard models include the linear, quadratic, special cubic, and full cubic models. The ANOVA table will include tests for the incremental fit of the different models, and if the design includes replicated runs, a test for lack-of-fit based on the estimate of pure error will also be computed. Results options include the table of means, the correlations for the columns of the design matrix (X), the inverse of the X'X matrix (the variance/covariance matrix for the parameter estimates), the Pareto chart, probability plots of parameter estimates, etc. Also, the user can compute predicted values, based on user-defined values of the factors. Specialized graphs to summarize the results of mixture experiments include response trace plots for user-defined reference blends, and triangular surface and contour plots. If there are more than 3 components in the experiment, then the surface and contour plots can be produced for user-defined values of the additional components. Finally, all general features described above (under the headings Design of experiments, Analysis of experiments: General features, Residual analyses and transformations, and Optimization of single or multiple response variables) are available, for performing detailed analyses of residuals, to evaluate the fit of the model, and for finding the optimum factor settings, given one or more response variables. Note that the response (desirability) profiler options available for mixture designs are not based on a simple reparameterization of the mixture model to an unconstrained surface model; instead all computations will be performed based on the actual (fitted) mixture model. Thus, when searching for the optimum factor settings given the desirability function for one or more response variables, it is assured that only the constrained (mixture) experimental region is inspected, and that the resulting factor settings sum to a valid mixture.
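
A simplex-lattice design and the mapping from pseudo-components back to original proportions can both be sketched briefly. The example below enumerates the {q, m} lattice with exact fractions and applies the usual lower-bound rescaling; the lower bounds shown are made up and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(q, m):
    """All blends of q components whose proportions are multiples of 1/m."""
    points = []
    for counts in product(range(m + 1), repeat=q):
        if sum(counts) == m:
            points.append(tuple(Fraction(c, m) for c in counts))
    return points

def from_pseudo_components(point, lower):
    """Map a pseudo-component blend into the sub-simplex defined by the
    lower bounds: x_i = L_i + (1 - sum(L)) * x'_i."""
    scale = 1 - sum(lower)
    return tuple(lo + scale * x for lo, x in zip(lower, point))

lattice = simplex_lattice(3, 2)                      # {3, 2} lattice
lower = (Fraction(1, 10), Fraction(2, 10), Fraction(0))   # made-up bounds
actual = [from_pseudo_components(p, lower) for p in lattice]
for pseudo, real in zip(lattice, actual):
    print([str(v) for v in pseudo], "->", [str(v) for v in real])
```

Every mapped blend still sums to one and respects the lower bounds, which is what constructing the design in the sub-simplex amounts to.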

Designs for constrained surfaces and mixtures. STATISTICA Design of Experiments contains procedures for computing vertex and centroid points for constrained surfaces and mixtures defined by linear constraints. The user can enter upper and lower limits for the factors, and specify any additional linear constraints (of the form A1*x1 + ... + An*xn + A0 >= 0) on the factor values. The program will then compute the vertex points, and optional centroid points, for the constrained region. The constraints will be processed sequentially, and unnecessary constraints will be identified. There are numerous additional options for reviewing the characteristics of the constrained region. The user can review the vertex and centroid points in 3D and triangular scatterplots (for mixtures). The correlation matrix for the columns of the design matrix X, for various standard types of designs, can also be computed as well as the inverse of the X'X matrix (i.e., the variance/covariance matrix of the parameter estimates). This allows the user to evaluate the characteristics of the design, based on the vertex and centroid points. These points can then be submitted to the optimal design facilities (see below), to construct designs with the minimum number of runs.
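
For the special case of a mixture with only lower and upper bounds on each component, the vertices can be enumerated with the classic extreme-vertices idea: fix all but one component at a bound, solve for the last one, and keep the blend if it is feasible. The sketch below implements that restricted case with made-up bounds; it does not handle the general linear constraints described above and is not the program's algorithm.

```python
from fractions import Fraction
from itertools import product

def extreme_vertices(lower, upper):
    """Vertices of the region sum(x) = 1, lower[i] <= x[i] <= upper[i]."""
    q = len(lower)
    vertices = set()
    for free in range(q):
        others = [i for i in range(q) if i != free]
        for choice in product((0, 1), repeat=q - 1):
            point = [None] * q
            for i, pick in zip(others, choice):
                point[i] = upper[i] if pick else lower[i]
            rest = 1 - sum(point[i] for i in others)
            if lower[free] <= rest <= upper[free]:
                point[free] = rest
                vertices.add(tuple(point))
    return sorted(vertices)

def centroid(points):
    """Overall centroid: the average of the given points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n
                 for i in range(len(points[0])))

# Made-up bounds for a three-component mixture.
lower = [Fraction(s) for s in ("0.2", "0.1", "0.1")]
upper = [Fraction(s) for s in ("0.6", "0.5", "0.6")]
vertices = extreme_vertices(lower, upper)
for v in vertices:
    print([str(x) for x in v])
print([str(x) for x in centroid(vertices)])
```

The resulting vertex and centroid points are the kind of candidate list that would then be passed on to an optimal design search.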

D- and A-optimal designs. The program includes several algorithms for constructing optimal designs. The user can choose between the D (determinant) optimality and the A (or trace) optimality criterion, and specify models for surfaces and mixtures. A list of candidate points for the design can be entered by hand or retrieved from a STATISTICA data file (e.g., a design previously created via the facilities for computing vertex and centroid points for constrained surfaces and mixtures, see above). Points in the candidate list can be marked for forced inclusion in the final design; thus, the user can enhance or "repair" existing experiments. The program includes all common search algorithms developed for constructing D- and A-optimal designs: Dykstra's sequential search procedure, the Wynn-Mitchell simple exchange procedure, the Mitchell DETMAX procedure (exchange with excursions), Fedorov's simultaneous switching procedure, and a modified simultaneous switching procedure. For the final design, the program will compute the determinant of X'X and the D, A, and G efficiencies. The user can also review the correlation matrix for the columns of the final design matrix (X), and the inverse of the X'X matrix (the variance/covariance matrix of parameter estimates). The final design points can be visualized in 3D and triangular scatterplots (for mixtures).
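
The idea behind exchange algorithms can be shown on a toy problem. The sketch below is a bare-bones exchange loop, not any of the named procedures: it repeatedly swaps a design point for a candidate whenever the swap increases the determinant of X'X. The one-factor quadratic model, the candidate list, and the starting design are all made up for illustration.

```python
from fractions import Fraction

def model_row(x):
    """Quadratic model in one factor: intercept, x, x^2."""
    return [Fraction(1), x, x * x]

def information_matrix(points):
    """X'X for the design given by `points`."""
    rows = [model_row(x) for x in points]
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)]
            for i in range(p)]

def determinant(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    m = [row[:] for row in matrix]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            for c in range(col, len(m)):
                m[r][c] -= factor * m[col][c]
    return det

def simple_exchange(candidates, start):
    """Swap design points for candidates while det(X'X) keeps increasing."""
    design = list(start)
    best = determinant(information_matrix(design))
    improved = True
    while improved:
        improved = False
        for i in range(len(design)):
            for cand in candidates:
                trial = design[:i] + [cand] + design[i + 1:]
                value = determinant(information_matrix(trial))
                if value > best:
                    design, best, improved = trial, value, True
    return design, best

candidates = [Fraction(i, 2) for i in range(-2, 3)]   # -1, -1/2, 0, 1/2, 1
design, det_value = simple_exchange(candidates, candidates[:3])
print(sorted(str(x) for x in design), det_value)
```

For this toy model the search ends at the two extremes plus the center, the familiar three-point design for fitting a quadratic. Like any exchange procedure, the loop can stop at a local optimum, which is one reason several algorithms and starting designs are worth trying.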

Alternative procedures for analyzing data collected in experiments. STATISTICA includes an extremely large number of computational methods for analyzing data collected in experiments, and for fitting ANOVA/ANCOVA-like designs to continuous or categorical outcome variables. Specifically, STATISTICA includes complete implementations of:
  • General Linear Models (GLM) and General Regression Models (GRM) (available in STATISTICA Advanced Linear/Non-Linear Models) with sophisticated model-building procedures (stepwise and best-subset selection of predictor effects),
  • Generalized Linear Models (GLZ) (available in STATISTICA Advanced Linear/Non-Linear Models), which also offers stepwise and best-subset selection of predictor effects in ANOVA/ANCOVA-like designs, for various popular alternatives to linear least squares models, such as logit, multinomial logit, and probit models,
  • General Discriminant Analysis Models (GDA) (available in STATISTICA Multivariate Exploratory Techniques), which allows you to use ANOVA/ANCOVA-like experimental designs for classification, and to use stepwise and best-subset selection of predictor effects; GDA also includes desirability profiler and response optimization methods, which can be used to determine the factor combinations, levels, and/or values that maximize the posterior classification probabilities for one or more categories of the dependent (outcome) variable,
  • General Classification and Regression Trees Models and General CHAID models (available in STATISTICA Data Miner), which allow you to evaluate the efficacy of ANOVA/ANCOVA-like experimental designs for building highly non-linear hierarchical classification or regression trees.

Thus, STATISTICA can be applied to quality-improvement research in creative and innovative ways, when the dependent variables of interest are categorical in nature, or when the effect of the predictor variables (effects) is clearly non-linear in nature.

    STATISTICA Design of Experiments is an add-on package that requires a base product such as STATISTICA Base or STATISTICA Quality Control Charts.



    StatSoft, Inc.
    2300 East 14th Street, Tulsa, OK 74104
    Phone: (918) 749-1119; Fax: (918) 749-2217

    e-mail: info@statsoft.com

    ©Copyright StatSoft, Inc., 1984-2004.
    StatSoft, StatSoft logo, STATISTICA, SEWSS, SEDAS, Data Miner, SEPATH and GTrees are trademarks of StatSoft, Inc.