
The popularity of neural network methodology is growing rapidly in a wide variety of areas, from basic research to data mining applications, business forecasting and risk management, engineering, and others (see Example Applications). STATISTICA Neural Networks is the most technologically advanced and best performing neural networks application on the market. It offers numerous unique advantages and will appeal not only to neural network experts (by offering them an extraordinary selection of network types and training algorithms), but also to new users in the field of neural computing (via the unique Intelligent Problem Solver, a tool that guides the user through the procedures necessary for creating neural networks).

STATISTICA Neural Networks is a comprehensive, state-of-the-art, powerful, and extremely fast neural network data analysis package.
STATISTICA Neural Networks - Tackling the Real Issues in Neural Computing
Using neural networks involves more than simply feeding data to a neural network.
STATISTICA Neural Networks has the functionality to help you through the critical design stages, including not only state-of-the-art Neural Network Architectures and Training Algorithms, but also innovative new approaches to Input Selection and Network Design. Moreover, software developers and users who experiment with customized applications can first complete their prototyping experiments in STATISTICA Neural Networks' simple and intuitive user interface and then incorporate neural network analyses in custom applications. This can be done either by using the STATISTICA library of COM functions, which fully expose all functionality of the program, or by using the C (C++, C#) or Visual Basic code generated by the program to aid in the deployment of fully trained networks or network ensembles.
STATISTICA Neural Networks is fully integrated with the STATISTICA system, so a large selection of tools for editing (preparing) data for analyses is available (transformations, case selection conditions, data verification tools, etc.). Like all STATISTICA analyses, the program can be "connected" to remote databases via the tools for in-place-database processing, or it can be linked to active data so that models are retrained or applied (e.g., to compute predicted values or classifications) automatically every time the data change.
Once you have a dataset prepared, you will need to decide which variables to use in your neural network. Larger numbers of input variables require larger neural networks, with consequent increases in storage and training time requirements, and the need for greater numbers of training cases. Lack of data and correlations between variables make the selection of important input variables, and the compression of information into smaller numbers of variables, issues of critical importance in many neural applications.
Input feature selection algorithms.
STATISTICA Neural Networks includes backwards and forwards stepwise selection algorithms. In addition, the Neuro-Genetic Input Selection algorithm uniquely combines the technologies of Genetic Algorithms and PNN/GRNNs (PNN stands for Probabilistic Neural Network, and GRNN stands for Generalized Regression Neural Network) to automatically search for optimal combinations of input variables, even where there are correlations and non-linear interdependencies. The near-instantaneous training time of PNN/GRNN not only allows the Neuro-Genetic Input Selection algorithm to operate - it also allows you, in conjunction with the STATISTICA Neural Networks dataset Editor's simple variable suppression facilities, to conduct your own input sensitivity experiments on a realistic time scale. STATISTICA Neural Networks also includes built-in Principal Components Analysis (PCA), and Autoassociative networks for "non-linear PCA", to extract smaller numbers of dimensions from the raw inputs. Note that a wide variety of statistical tools for data reduction are available in STATISTICA.
Data Scaling and Nominal Value Preparation
In general, data must be specially prepared for input into neural networks, and the network output must be interpreted correctly. STATISTICA Neural Networks includes automatic data scaling (including Minimax and Mean/SD scaling) for both inputs and outputs; there is also automatic recoding of Nominal valued variables (e.g., Sex={Male,Female}), including one-of-N encoding. STATISTICA Neural Networks also has facilities to handle missing data. Normalization functions such as Unit Sum, Winner-takes-all and Unit Vector are also supported. There are special data preparation and interpretation facilities for use with Time Series. A large number of relevant tools are also included in STATISTICA.
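The scaling and recoding steps described above can be illustrated with a short, product-independent Python sketch. The function names (`minimax_scale`, `mean_sd_scale`, `one_of_n`) and the sample data are hypothetical, chosen only for illustration; they are not part of STATISTICA Neural Networks, and the program's own implementation may differ in detail.

```python
from statistics import mean, pstdev


def minimax_scale(values, lo=0.0, hi=1.0):
    """Linearly map values so the smallest becomes lo and the largest hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Constant column: nothing to rescale, so place every value mid-range.
        return [(lo + hi) / 2.0 for _ in values]
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]


def mean_sd_scale(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - m) / sd for v in values]


def one_of_n(values, categories=None):
    """Encode each nominal value as a 0/1 vector containing a single 1."""
    if categories is None:
        categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded


# Hypothetical sample data.
ages = [20, 30, 40, 60]
sex = ["Male", "Female", "Female", "Male"]

scaled_ages = minimax_scale(ages)
standardized_ages = mean_sd_scale(ages)
sex_categories, sex_encoded = one_of_n(sex)
```

A two-valued variable such as Sex could also be coded with a single 0/1 input; the one-of-N form is shown because it extends unchanged to variables with more than two values.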
For classification problems, you can set confidence limits which STATISTICA Neural Networks uses to assign cases to classes. In combination with STATISTICA Neural Networks' specialized Softmax activation function and cross-entropy error functions, this supports a principled, probabilistic approach to classification.
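As a rough illustration of this probabilistic approach, the Python sketch below computes Softmax outputs, the cross-entropy error for a single case, and a class assignment based on confidence limits. The accept/reject rule used here (the winning output must reach `accept` and every other output must stay at or below `reject`) is an assumption made for the example, as are all function names and threshold values; the program's own rule may differ.

```python
import math


def softmax(activations):
    """Convert raw output activations into probabilities that sum to 1."""
    shift = max(activations)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(a - shift) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, target_index):
    """Cross-entropy error for one case whose true class is target_index."""
    return -math.log(probabilities[target_index])


def assign_class(probabilities, labels, accept=0.7, reject=0.3):
    """Return the winning label, or None when the case is left unclassified.

    Assumed rule: the largest probability must be at least `accept`, and all
    remaining probabilities must be at most `reject`.
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    others_low = all(
        p <= reject for i, p in enumerate(probabilities) if i != best
    )
    if probabilities[best] >= accept and others_low:
        return labels[best]
    return None


labels = ["setosa", "versicolor", "virginica"]  # hypothetical class names
confident = assign_class([0.90, 0.05, 0.05], labels)
doubtful = assign_class([0.50, 0.40, 0.10], labels)
```

Leaving doubtful cases unclassified, rather than forcing a decision, is what makes the confidence limits useful when a wrong assignment is costly.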
Selecting a Neural Network Model, Neural Network Ensembles
The range of neural network models and the number of parameters which must be decided upon (including network size and training algorithm control parameters) can seem bewildering (the Intelligent Problem Solver is available to automatically search through numerous network architectures of varying complexities; see below). STATISTICA Neural Networks supports the most important classes of neural networks for real world problem solving.
In addition, STATISTICA Neural Networks supports Ensembles: networks formed from arbitrary (when meaningful) combinations of the supported network types. You may also choose to join networks to operate in sequence, a feature which is particularly useful for preprocessing and for making minimum cost decisions in classification networks.
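The two ways of combining networks mentioned above, ensembles and networks joined in sequence, can be sketched in a few lines of Python. Here each "network" is simply a hypothetical function from an input list to an output; averaging is used as the ensemble rule because it is the simplest choice for regression outputs, which is an assumption of this sketch rather than a description of the program's internals.

```python
def ensemble_average(models, inputs):
    """Average the numeric predictions of several models for one case."""
    predictions = [model(inputs) for model in models]
    return sum(predictions) / len(predictions)


def pipeline(*stages):
    """Join stages so that the output of each one feeds the next."""
    def run(inputs):
        for stage in stages:
            inputs = stage(inputs)
        return inputs
    return run


# Hypothetical stand-ins for trained networks.
def model_a(x):
    return x[0] + x[1]


def model_b(x):
    return 2.0 * x[0]


def preprocess(x):
    return [v / 10.0 for v in x]  # e.g., a preprocessing network


averaged = ensemble_average([model_a, model_b], [1.0, 3.0])
joined = pipeline(preprocess, model_a)
joined_output = joined([10.0, 30.0])
```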
STATISTICA Neural Networks has numerous facilities to aid in selecting an appropriate network architecture. STATISTICA Neural Networks' statistical and graphical feedback includes Bar Charts, Matrices and Graphs of individual and overall case errors, summaries of classification/mis-classification performance, and vital statistics such as Regression Error Ratios - all automatically calculated.
For data visualization, STATISTICA Neural Networks can also display Scatterplots and 3D Response Surfaces, to help the user understand the network's "behavior."
Naturally, you can use information from any of these sources for further analyses with other STATISTICA tools, for inclusion in your reports, or for customization.
STATISTICA Neural Networks automatically retains copies of the best networks found as you experiment on a problem, which can be retrieved at any time. The usefulness and predictive validity of the network can automatically be assessed by including selection and test cases, and by evaluating the size and efficiency of the network as well as the cost of misclassification. STATISTICA Neural Networks' automatic Cross Verification and Weigend Weight Regularization procedures also allow you to quickly assess whether a network is overly or not sufficiently complex for the problem at hand.
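For readers unfamiliar with Weigend regularization, the sketch below shows the weight-elimination penalty in the form commonly attributed to Weigend and colleagues: each weight contributes (w/w0)^2 / (1 + (w/w0)^2), so small weights are penalized roughly quadratically while large weights each cost at most one unit. Whether the program uses exactly this form, and the names `scale` and `strength`, are assumptions of this example.

```python
def weigend_penalty(weights, scale=1.0, strength=0.01):
    """Weight-elimination penalty added to the training error.

    scale    -- the reference weight size w0 (assumed name)
    strength -- multiplier controlling how strongly complexity is penalized
    """
    total = 0.0
    for w in weights:
        ratio = (w / scale) ** 2
        total += ratio / (1.0 + ratio)
    return strength * total


def regularized_error(data_error, weights, scale=1.0, strength=0.01):
    """Training objective: error on the data plus the complexity penalty."""
    return data_error + weigend_penalty(weights, scale, strength)


small_weights = weigend_penalty([0.1, -0.1, 0.05], strength=1.0)
large_weights = weigend_penalty([5.0, -8.0, 12.0], strength=1.0)
```

Because the per-weight cost saturates, the penalty behaves like a soft count of the weights that are clearly non-zero, which is why it encourages the removal of superfluous weights rather than the uniform shrinking of all of them.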
For enhanced performance, STATISTICA Neural Networks supports a number of network customization options. You can specify a Linear output layer for networks used in Regression problems, or Softmax Activation functions for probability-estimation in Classification problems. If your data suffers badly from outliers, you can replace the standard Error function used in training with the less sensitive City-Block Error function. Cross-entropy error functions, based on information-theory models, are also included, and there is a range of specialized activation functions, including Step, Ramp and Sine functions.
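The reason the City-Block Error function is less sensitive to outliers can be seen from a small comparison. In the sketch below, the standard error is taken to be the sum of squared differences (an assumption, since the text does not define it), and the data values are invented for illustration.

```python
def sum_squared_error(targets, outputs):
    """Sum of squared differences between targets and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))


def city_block_error(targets, outputs):
    """Sum of absolute differences between targets and network outputs."""
    return sum(abs(t - o) for t, o in zip(targets, outputs))


def outlier_share(error_terms):
    """Fraction of the total error contributed by the largest single term."""
    return max(error_terms) / sum(error_terms)


# Invented example: the last case is an outlier.
targets = [1.0, 2.0, 3.0, 4.0]
outputs = [1.5, 2.5, 2.5, 14.0]

squared_terms = [(t - o) ** 2 for t, o in zip(targets, outputs)]
absolute_terms = [abs(t - o) for t, o in zip(targets, outputs)]

share_squared = outlier_share(squared_terms)
share_city_block = outlier_share(absolute_terms)
```

With squared differences the single outlier accounts for nearly all of the total error, so training is driven almost entirely by that one case; with absolute differences its share is noticeably smaller.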
The Intelligent Problem Solver (automatic evaluation and selection of multiple network architectures)
Included with STATISTICA Neural Networks is an Intelligent Problem Solver which can automatically evaluate a large number of different neural network architectures of varying complexities, and select the best set of specific architectures for the problem at hand.
The Intelligent Problem Solver can create networks using data whose cases are independent (standard Regression and Classification networks or a mixture of both) as well as networks which predict future observations based on previous observations of the same variable (time series networks).
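For time series networks, "previous observations of the same variable" become the inputs and a later observation becomes the target. A minimal sketch of that windowing step follows; the parameter names `steps` (how many past values are used as inputs) and `lookahead` (how far ahead the target lies) are assumptions of this example.

```python
def make_lagged_cases(series, steps=3, lookahead=1):
    """Turn one series into (inputs, target) training cases.

    Each case uses `steps` consecutive values as inputs and the value
    `lookahead` positions after the last input as the target.
    """
    cases = []
    last_start = len(series) - steps - lookahead + 1
    for start in range(last_start):  # empty when the series is too short
        inputs = series[start:start + steps]
        target = series[start + steps + lookahead - 1]
        cases.append((inputs, target))
    return cases


monthly_values = [1, 2, 3, 4, 5]  # invented series
one_step_cases = make_lagged_cases(monthly_values, steps=3, lookahead=1)
two_step_cases = make_lagged_cases(monthly_values, steps=3, lookahead=2)
```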
A significant amount of time during the design of a neural network is spent on the selection of appropriate variables, and then optimizing the network architecture by heuristic search. STATISTICA Neural Networks takes the pain out of the process by automatically conducting a heuristic search for you. This search includes input dimensionality (feature selection), network types, network sizes and indeed the network output encoding functions.
While the Intelligent Problem Solver is conducting this search, you can specify the amount of real time feedback you would like to receive as training progresses. At its most detailed level, the Intelligent Problem Solver displays the architecture and performance levels for each network tested.
The Intelligent Problem Solver is an extremely effective tool which uses sophisticated techniques to search automatically for optimal network architectures. Why labor over a terminal for hours, when you can let STATISTICA Neural Networks do the work for you?
The Intelligent Problem Solver can also be used in a process of model building when STATISTICA Neural Networks is used in conjunction with some modules of the main STATISTICA system to identify the most relevant variables (e.g., the best predictors to be included and then tested in some Nonlinear Estimation model).
As you experiment with architectures and network types, you rely critically on the quality and speed of the network training algorithms. STATISTICA Neural Networks supports the best known state-of-the-art training algorithms.
For Multilayer Perceptrons, STATISTICA Neural Networks naturally includes Back Propagation - with time-varying learning rate and momentum, case-presentation order shuffling and additive Noise for robust generalization. However, STATISTICA Neural Networks also includes two fast, second-order training algorithms: Conjugate Gradient Descent and Levenberg-Marquardt. Levenberg-Marquardt is a very powerful, modern non-linear optimization algorithm, and it is strongly recommended. However, as Levenberg-Marquardt is limited in application to fairly small networks with a single output variable, STATISTICA Neural Networks also includes Conjugate Gradient Descent for more difficult problems. Both of these algorithms typically converge far more quickly than Back Propagation, and frequently to a far better solution.
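The Back Propagation options listed above (time-varying learning rate, momentum, shuffled case order, additive noise) all act on the same basic weight update. To keep the example short, the sketch below applies that update to a single linear unit with one weight instead of a full Multilayer Perceptron; the linear learning-rate schedule and all parameter names and default values are assumptions made for illustration.

```python
import random


def train_linear_unit(cases, epochs=200, lr_start=0.05, lr_end=0.01,
                      momentum=0.5, noise_sd=0.0, seed=0):
    """Fit output = weight * x by case-by-case gradient descent with momentum."""
    rng = random.Random(seed)
    weight, velocity = 0.0, 0.0
    order = list(cases)
    for epoch in range(epochs):
        # Time-varying learning rate: interpolate from lr_start to lr_end.
        fraction = epoch / max(epochs - 1, 1)
        lr = lr_start + (lr_end - lr_start) * fraction
        # Present the cases in a new random order every epoch.
        rng.shuffle(order)
        for x, target in order:
            # Optional additive noise on the input.
            noisy_x = x + rng.gauss(0.0, noise_sd) if noise_sd > 0 else x
            error = weight * noisy_x - target
            gradient = error * noisy_x  # derivative of 0.5 * error ** 2
            # Momentum: blend the previous step into the current one.
            velocity = momentum * velocity - lr * gradient
            weight += velocity
    return weight


# Invented data following target = 2 * x exactly.
cases = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
fitted_weight = train_linear_unit(cases)
```

In a Multilayer Perceptron the same update is applied to every weight, with the gradient obtained by propagating the output error backwards through the layers.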
STATISTICA Neural Networks' iterative training procedures are complemented by automatic tracking of both the Training error and an independent Selection error, including a real time Graph of these errors as training progresses. Training can be aborted at any point by the press of a button, and you can also specify Stopping Conditions under which training should be stopped early: for example, when a target error level is reached, or when the Selection error deteriorates over a given number of epochs (indicating Over-learning). If over-learning occurs you needn't worry: STATISTICA Neural Networks automatically retains a copy of the Best Network discovered, which can be retrieved at the press of a button. When training has finished, you can finally check performance against an independent Test Set.
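The stopping conditions and best-network retention described above follow a pattern that can be sketched independently of any particular training algorithm. In the Python below, `train_epoch` and `selection_error` are hypothetical callbacks supplied by the caller, `window` is an assumed name for the number of epochs without improvement that is tolerated, and the toy "network" at the end exists only to exercise the loop.

```python
import copy


def train_with_early_stopping(network, train_epoch, selection_error,
                              max_epochs=100, target_error=None, window=5):
    """Train until a stopping condition is met; return the best network seen."""
    best_error = float("inf")
    best_network = copy.deepcopy(network)
    epochs_since_best = 0
    history = []  # (training error, selection error) per epoch, for plotting
    for _ in range(max_epochs):
        train_error = train_epoch(network)    # one pass over the training set
        sel_error = selection_error(network)  # error on the independent set
        history.append((train_error, sel_error))
        if sel_error < best_error:
            # Keep a copy, so later over-learning cannot destroy it.
            best_error = sel_error
            best_network = copy.deepcopy(network)
            epochs_since_best = 0
        else:
            epochs_since_best += 1
        if target_error is not None and train_error <= target_error:
            break  # target error level reached
        if epochs_since_best >= window:
            break  # selection error has not improved for `window` epochs
    return best_network, best_error, history


# Toy stand-in: each "epoch" raises w by 1; selection error is lowest at w = 3.
def toy_train_epoch(net):
    net["w"] += 1.0
    return 1.0 / net["w"]


def toy_selection_error(net):
    return (net["w"] - 3.0) ** 2


toy_network = {"w": 0.0}
best, best_err, errors = train_with_early_stopping(
    toy_network, toy_train_epoch, toy_selection_error)
```

In the toy run the training error keeps falling while the selection error starts to rise after the third epoch, so the copy retained from that epoch is returned even though training continues for several more epochs before stopping.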
STATISTICA Neural Networks also includes a range of training algorithms for other network architectures. Radial Basis Function and Generalized Regression networks can have Radial exemplar units and smoothing factors assigned by a variety of algorithms, including Kohonen training, Sub-Sampling, K-Means, Isotropic and Nearest Neighbor techniques. The Linear output layers of Radial Basis Function networks can be fully optimized using Singular Value Decomposition, as can Linear networks.
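To make the roles of radial exemplar units and smoothing factors concrete, the sketch below implements the textbook form of a Generalized Regression network prediction: every stored exemplar responds to the query through a Gaussian radial function, and the output is the response-weighted average of the exemplars' target values. Using every training case as an exemplar and one shared smoothing factor are simplifying assumptions of this example; the function and variable names are hypothetical.

```python
import math


def grnn_predict(exemplars, targets, query, smoothing=0.5):
    """Kernel-weighted average of targets, weighted by closeness to query."""
    weights = []
    for point in exemplars:
        sq_dist = sum((a - b) ** 2 for a, b in zip(point, query))
        weights.append(math.exp(-sq_dist / (2.0 * smoothing ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every radial unit underflowed to zero: fall back to the plain mean.
        return sum(targets) / len(targets)
    return sum(w * t for w, t in zip(weights, targets)) / total


# Invented one-dimensional data.
exemplars = [[0.0], [2.0]]
targets = [1.0, 3.0]

midway = grnn_predict(exemplars, targets, [1.0])
near_first = grnn_predict(exemplars, targets, [0.0], smoothing=0.1)
```

"Training" here amounts to storing the exemplars, which is why such networks train almost instantly; the smoothing factor is the quantity that needs tuning. A small value makes the prediction follow the nearest exemplar closely, while a large value blends many exemplars together.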
Hybridization of Network Structures. STATISTICA Neural Networks also supports hybridization of network structures: for example, a modified Radial Basis Function network could have a first layer trained by Kohonen's algorithm, and a non-linear second layer trained by Levenberg-Marquardt.
Probing and Testing a Neural Network
Once you have trained a network, you'll want to test its performance and explore its characteristics. STATISTICA Neural Networks uses a range of on-screen statistics and graphical facilities.
You may select multiple models (networks and ensembles), in which case, wherever possible, STATISTICA SNN will display any results generated in a comparative fashion (e.g., by plotting the response curves for several models on a single graph, or presenting the predictions of several models in a single spreadsheet). This feature is particularly useful for comparing various models trained on the same data set.
All statistics are generated independently for the Training, Selection, Test and Ignore Sets. You can view the individual weights and activations in the network in convenient data sheet format; one click of a button can also transfer them into STATISTICA Spreadsheets. The results of running individual cases, or the entire set, can also be reviewed in STATISTICA spreadsheets, and used as input for subsequent graphical or numerical analyses.
Overall statistics calculated include mean network error, the so-called Confusion matrix for Classification problems (which summarizes correct and incorrect classification across all classes), and the Regression Error Ratio for Regression problems - all automatically calculated. Kohonen networks include a Topological Map window, which allows you to visually inspect unit activations, and to relabel cases and units during data analysis. There is also a Win Frequencies option which allows you to locate clusters in the Topological Map. Cluster analysis can also be conducted using conventional networks together with STATISTICA Neural Networks' Cluster Diagram. For example, you can train a Principal Components Analysis network, and plot data through the first two Principal Components.
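The summary statistics named above are simple to compute by hand, which can help when interpreting them. The sketch below builds a Confusion matrix (rows are actual classes, columns are predicted classes) and, for regression, a ratio of the standard deviation of the prediction errors to the standard deviation of the target values. The text does not define the Regression Error Ratio, so treating it as this S.D. ratio is an assumption of the example, as are the function names and data.

```python
from statistics import pstdev


def confusion_matrix(actual, predicted, classes):
    """Count cases by (actual class, predicted class)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Proportion of cases on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def sd_ratio(targets, outputs):
    """Error S.D. divided by target S.D. (assumed form of the error ratio)."""
    errors = [t - o for t, o in zip(targets, outputs)]
    return pstdev(errors) / pstdev(targets)


# Invented classification results.
classes = ["good", "bad"]
actual = ["good", "good", "bad", "bad", "bad"]
predicted = ["good", "bad", "bad", "bad", "good"]
matrix = confusion_matrix(actual, predicted, classes)
overall_accuracy = accuracy(matrix)
```

Under this definition a ratio near 0 indicates predictions much better than simply guessing the mean target, while a ratio near 1 indicates no improvement over that baseline.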
Network Editing, Modification, and Pipe-lining
STATISTICA Neural Networks includes intelligent facilities to prune existing networks, and to join networks together. Entire layers can be deleted, networks with compatible numbers of Inputs and Outputs can be pipe-lined together, and individual neurons can be added or removed. These facilities allow STATISTICA Neural Networks to support Dimensionality Reduction (for pre-processing) by the use of Autoassociative networks; and Loss Matrices (for minimum-cost decision making). Loss matrices are automatically included with Probabilistic Neural Networks.
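Minimum-cost decision making with a Loss Matrix can be sketched as follows: given estimated class probabilities for a case, the expected loss of each possible decision is computed and the cheapest decision is chosen. The layout `loss_matrix[true_class][decision]` and the example costs are assumptions of this sketch, not the program's file format or defaults.

```python
def minimum_cost_decision(probabilities, loss_matrix):
    """Return (index of cheapest decision, expected loss of every decision).

    loss_matrix[t][d] is the cost of deciding d when the true class is t.
    """
    n_decisions = len(loss_matrix[0])
    expected = [
        sum(p * loss_matrix[t][d] for t, p in enumerate(probabilities))
        for d in range(n_decisions)
    ]
    best = min(range(n_decisions), key=lambda d: expected[d])
    return best, expected


# Invented example with classes 0 = "healthy" and 1 = "ill".
probabilities = [0.7, 0.3]

equal_costs = [[0.0, 1.0],
               [1.0, 0.0]]
# Missing an ill patient is assumed to cost ten times a false alarm.
unequal_costs = [[0.0, 1.0],
                 [10.0, 0.0]]

decision_equal, _ = minimum_cost_decision(probabilities, equal_costs)
decision_unequal, losses_unequal = minimum_cost_decision(
    probabilities, unequal_costs)
```

With equal costs the most probable class is chosen; once the costs are unequal, the cheaper decision can differ from the most probable class, which is the purpose of appending a loss matrix to a classification network.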
Embedded Solutions (Custom Applications that Use the STATISTICA Neural Networks Engines)
STATISTICA Neural Networks' simple and efficient user-interface allows you to rapidly prototype neural network solutions to your problems.
In some applications, you may want to embed these solutions in your own systems and, for example, build them into some larger computing environments (such as pre-designed procedures built into enterprise-wide computing systems).
Trained neural networks can be applied to new data (for prediction) in several ways. You can save the trained neural networks or ensembles of networks (e.g., for computing an average prediction based on multiple architectures), and later retrieve them to be applied to new data (for prediction, predicted classification, or forecasting). You can also use the (optional) code generator options to generate C (C++, C#) or Visual Basic code for scoring (predicting) new data in any programming environment for Visual Basic or C++ (C#), i.e., to incorporate fully trained networks in your custom application. Finally, all functionality of STATISTICA, including STATISTICA Neural Networks, can be accessed as COM (Component Object Model) functions from other applications (e.g., from Java, MS Excel, etc.). For example, you could embed automatic analyses via STATISTICA Neural Networks into your MS Excel spreadsheets.
STATISTICA Neural Networks will run even on relatively small or old systems; however, due to the computationally intensive nature of many procedures, a Pentium system with 32 megabytes of RAM is highly recommended.
The networks can be of practically unlimited size (that is, they can be much larger than what would ever be practical or reasonable). SNN supports up to 128 layers of networks (obtained by joining networks) with no limitation on the number of neurons. For all practical purposes, the program is effectively limited only by the hardware of the computer.
STATISTICA Neural Networks includes a well-illustrated manual, with a comprehensive, conceptual introduction to Neural Networks (and tutorials), and extensive context sensitive Help accessible from every dialog.
See also Comments from STATISTICA Neural Networks Users.
Examples of Real-life Applications
Neural networks can be used in virtually any situation where the objective is to determine an unknown variable or attribute from known observations or registered measurements (i.e., various forms of regression, classification, and time series), where there is a sufficient amount of historical data, and where there actually exists a tractable underlying relationship or a set of relationships (networks are relatively noise tolerant). In addition, neural networks can be used for exploratory analysis by looking for data clustering (Kohonen networks).
A comprehensive discussion of theoretical considerations related to the issue of when neural network applications are most likely to be successful can be found in the chapter on neural networks in the StatSoft Electronic Statistics Textbook (available on the StatSoft web site). The following list includes a selection of representative examples that by no means exhaust all areas where neural networks can be used.
New! Optional Source Code Generator Add-on Available
The source code generator is an optional add-on that gives users the flexibility to build custom applications based on solutions found with STATISTICA Neural Networks. This add-on generates a source code version of a neural network (in C, C++, C#), which can then be compiled and integrated into your own programs for royalty-free distribution. The Code Generator can also create SVB (STATISTICA Visual Basic) code. The add-on product is designed for corporate system developers and other users who need to convert the highly optimized solutions generated by STATISTICA Neural Networks procedures into fixed, predefined applications that will solve routine analytic problems. (Important Note: To ensure compliance with licensing restrictions, users must notify StatSoft Inc. before distributing programs that use generated code.)
2300 East 14th Street, Tulsa, OK 74104
Phone: (918) 749-1119; Fax: (918) 749-2217
e-mail: info@statsoft.com
©Copyright StatSoft, Inc., 1984-2004.
StatSoft, StatSoft logo, STATISTICA, SEWSS, SEDAS, Data Miner, SEPATH and GTrees are trademarks of StatSoft, Inc.