Logo ROOT   6.13/01
Reference Guide
TMultiLayerPerceptron.cxx
Go to the documentation of this file.
1 // @(#)root/mlp:$Id$
2 // Author: Christophe.Delaere@cern.ch 20/07/03
3 
4 /*************************************************************************
5  * Copyright (C) 1995-2003, Rene Brun and Fons Rademakers. *
6  * All rights reserved. *
7  * *
8  * For the licensing terms see $ROOTSYS/LICENSE. *
9  * For the list of contributors see $ROOTSYS/README/CREDITS. *
10  *************************************************************************/
11 
12 ///////////////////////////////////////////////////////////////////////////
13 //
14 // TMultiLayerPerceptron
15 //
16 // This class describes a neural network.
17 // There are facilities to train the network and use the output.
18 //
19 // The input layer is made of inactive neurons (returning the
20 // optionaly normalized input) and output neurons are linear.
21 // The type of hidden neurons is free, the default being sigmoids.
22 // (One should still try to pass normalized inputs, e.g. between [0.,1])
23 //
24 // The basic input is a TTree and two (training and test) TEventLists.
25 // Input and output neurons are assigned a value computed for each event
26 // with the same possibilities as for TTree::Draw().
27 // Events may be weighted individualy or via TTree::SetWeight().
28 // 6 learning methods are available: kStochastic, kBatch,
29 // kSteepestDescent, kRibierePolak, kFletcherReeves and kBFGS.
30 //
31 // This implementation, written by C. Delaere, is *inspired* from
32 // the mlpfit package from J.Schwindling et al. with some extensions:
33 // * the algorithms are globally the same
34 // * in TMultilayerPerceptron, there is no limitation on the number of
35 // layers/neurons, while MLPFIT was limited to 2 hidden layers
36 // * TMultilayerPerceptron allows you to save the network in a root file, and
37 // provides more export functionalities
38 // * TMultilayerPerceptron gives more flexibility regarding the normalization of
39 // inputs/outputs
40 // * TMultilayerPerceptron provides, thanks to Andrea Bocci, the possibility to
41 // use cross-entropy errors, which allows to train a network for pattern
42 // classification based on Bayesian posterior probability.
43 //
44 ///////////////////////////////////////////////////////////////////////////
45 //BEGIN_HTML <!--
46 /* -->
47 <UL>
48  <LI><P><A NAME="intro"></A><FONT COLOR="#5c8526">
49  <FONT SIZE=4 STYLE="font-size: 15pt">Introduction</FONT></FONT></P>
50 </UL>
51 <P>Neural Networks are more and more used in various fields for data
52 analysis and classification, both for research and commercial
53 institutions. Some randomly chosen examples are:</P>
54 <UL>
55  <LI><P>image analysis</P>
56  <LI><P>financial movements predictions and analysis</P>
57  <LI><P>sales forecast and product shipping optimisation</P>
58  <LI><P>in particles physics: mainly for classification tasks (signal
59  over background discrimination)</P>
60 </UL>
61 <P>More than 50% of neural networks are multilayer perceptrons. This
62 implementation of multilayer perceptrons is inspired from the
63 <A HREF="http://schwind.home.cern.ch/schwind/MLPfit.html">MLPfit
64 package</A> originaly written by Jerome Schwindling. MLPfit remains
65 one of the fastest tool for neural networks studies, and this ROOT
66 add-on will not try to compete on that. A clear and flexible Object
67 Oriented implementation has been chosen over a faster but more
68 difficult to maintain code. Nevertheless, the time penalty does not
69 exceed a factor 2.</P>
70 <UL>
71  <LI><P><A NAME="mlp"></A><FONT COLOR="#5c8526">
72  <FONT SIZE=4 STYLE="font-size: 15pt">The
73  MLP</FONT></FONT></P>
74 </UL>
75 <P>The multilayer perceptron is a simple feed-forward network with
76 the following structure:</P>
77 <P ALIGN=CENTER><IMG SRC="gif/mlp.png" NAME="MLP"
78 ALIGN=MIDDLE WIDTH=333 HEIGHT=358 BORDER=0>
79 </P>
80 <P>It is made of neurons characterized by a bias and weighted links
81 between them (let's call those links synapses). The input neurons
82 receive the inputs, normalize them and forward them to the first
83 hidden layer.
84 </P>
85 <P>Each neuron in any subsequent layer first computes a linear
86 combination of the outputs of the previous layer. The output of the
87 neuron is then function of that combination with <I>f</I> being
88 linear for output neurons or a sigmoid for hidden layers. This is
89 useful because of two theorems:</P>
90 <OL>
91  <LI><P>A linear combination of sigmoids can approximate any
92  continuous function.</P>
93  <LI><P>Trained with output = 1 for the signal and 0 for the
94  background, the approximated function of inputs X is the probability
95  of signal, knowing X.</P>
96 </OL>
97 <UL>
98  <LI><P><A NAME="lmet"></A><FONT COLOR="#5c8526">
99  <FONT SIZE=4 STYLE="font-size: 15pt">Learning
100  methods</FONT></FONT></P>
101 </UL>
102 <P>The aim of all learning methods is to minimize the total error on
103 a set of weighted examples. The error is defined as the sum in
104 quadrature, devided by two, of the error on each individual output
105 neuron.</P>
106 <P>In all methods implemented, one needs to compute
107 the first derivative of that error with respect to the weights.
108 Exploiting the well-known properties of the derivative, especialy the
109 derivative of compound functions, one can write:</P>
110 <UL>
111  <LI><P>for a neuton: product of the local derivative with the
112  weighted sum on the outputs of the derivatives.</P>
113  <LI><P>for a synapse: product of the input with the local derivative
114  of the output neuron.</P>
115 </UL>
116 <P>This computation is called back-propagation of the errors. A
117 loop over all examples is called an epoch.</P>
118 <P>Six learning methods are implemented.</P>
119 <P><FONT COLOR="#006b6b"><I>Stochastic minimization</I>:</FONT> This
120 is the most trivial learning method. This is the Robbins-Monro
121 stochastic approximation applied to multilayer perceptrons. The
122 weights are updated after each example according to the formula:</P>
123 <P ALIGN=CENTER>$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)$
124 </P>
125 <P ALIGN=CENTER>with
126 </P>
127 <P ALIGN=CENTER>$\Delta w_{ij}(t) = - \eta(\d e_p / \d w_{ij} +
128 \delta) + \epsilon \Deltaw_{ij}(t-1)$</P>
129 <P>The parameters for this method are Eta, EtaDecay, Delta and
130 Epsilon.</P>
131 <P><FONT COLOR="#006b6b"><I>Steepest descent with fixed step size
132 (batch learning)</I>:</FONT> It is the same as the stochastic
133 minimization, but the weights are updated after considering all the
134 examples, with the total derivative dEdw. The parameters for this
135 method are Eta, EtaDecay, Delta and Epsilon.</P>
136 <P><FONT COLOR="#006b6b"><I>Steepest descent algorithm</I>: </FONT>Weights
137 are set to the minimum along the line defined by the gradient. The
138 only parameter for this method is Tau. Lower tau = higher precision =
139 slower search. A value Tau = 3 seems reasonable.</P>
140 <P><FONT COLOR="#006b6b"><I>Conjugate gradients with the
141 Polak-Ribiere updating formula</I>: </FONT>Weights are set to the
142 minimum along the line defined by the conjugate gradient. Parameters
143 are Tau and Reset, which defines the epochs where the direction is
144 reset to the steepes descent.</P>
145 <P><FONT COLOR="#006b6b"><I>Conjugate gradients with the
146 Fletcher-Reeves updating formula</I>: </FONT>Weights are set to the
147 minimum along the line defined by the conjugate gradient. Parameters
148 are Tau and Reset, which defines the epochs where the direction is
149 reset to the steepes descent.</P>
150 <P><FONT COLOR="#006b6b"><I>Broyden, Fletcher, Goldfarb, Shanno
151 (BFGS) method</I>:</FONT> Implies the computation of a NxN matrix
152 computation, but seems more powerful at least for less than 300
153 weights. Parameters are Tau and Reset, which defines the epochs where
154 the direction is reset to the steepes descent.</P>
155 <UL>
156  <LI><P><A NAME="use"></A><FONT COLOR="#5c8526">
157  <FONT SIZE=4 STYLE="font-size: 15pt">How
158  to use it...</FONT></FONT></P></LI>
159 </UL>
160 <P><FONT SIZE=3>TMLP is build from 3 classes: TNeuron, TSynapse and
161 TMultiLayerPerceptron. Only TMultiLayerPerceptron should be used
162 explicitly by the user.</FONT></P>
163 <P><FONT SIZE=3>TMultiLayerPerceptron will take examples from a TTree
164 given in the constructor. The network is described by a simple
165 string: The input/output layers are defined by giving the expression for
166 each neuron, separated by comas. Hidden layers are just described
167 by the number of neurons. The layers are separated by colons.
168 In addition, input/output layer formulas can be preceded by '@' (e.g "@out")
169 if one wants to also normalize the data from the TTree.
170 Input and outputs are taken from the TTree given as second argument.
171 Expressions are evaluated as for TTree::Draw(), arrays are expended in
172 distinct neurons, one for each index.
173 This can only be done for fixed-size arrays.
174 If the formula ends with &quot;!&quot;, softmax functions are used for the output layer.
175 One defines the training and test datasets by TEventLists.</FONT></P>
176 <P STYLE="margin-left: 2cm"><FONT SIZE=3><SPAN STYLE="background: #e6e6e6">
177 <U><FONT COLOR="#ff0000">Example</FONT></U><SPAN STYLE="text-decoration: none">:
178 </SPAN>TMultiLayerPerceptron(&quot;x,y:10:5:f&quot;,inputTree);</SPAN></FONT></P>
179 <P><FONT SIZE=3>Both the TTree and the TEventLists can be defined in
180 the constructor, or later with the suited setter method. The lists
181 used for training and test can be defined either explicitly, or via
182 a string containing the formula to be used to define them, exactly as
183 for a TCut.</FONT></P>
184 <P><FONT SIZE=3>The learning method is defined using the
185 TMultiLayerPerceptron::SetLearningMethod() . Learning methods are :</FONT></P>
186 <P><FONT SIZE=3>TMultiLayerPerceptron::kStochastic, <BR>
187 TMultiLayerPerceptron::kBatch,<BR>
188 TMultiLayerPerceptron::kSteepestDescent,<BR>
189 TMultiLayerPerceptron::kRibierePolak,<BR>
190 TMultiLayerPerceptron::kFletcherReeves,<BR>
191 TMultiLayerPerceptron::kBFGS<BR></FONT></P>
192 <P>A weight can be assigned to events, either in the constructor, either
193 with TMultiLayerPerceptron::SetEventWeight(). In addition, the TTree weight
194 is taken into account.</P>
195 <P><FONT SIZE=3>Finally, one starts the training with
196 TMultiLayerPerceptron::Train(Int_t nepoch, Option_t* options). The
197 first argument is the number of epochs while option is a string that
198 can contain: &quot;text&quot; (simple text output) , &quot;graph&quot;
199 (evoluting graphical training curves), &quot;update=X&quot; (step for
200 the text/graph output update) or &quot;+&quot; (will skip the
201 randomisation and start from the previous values). All combinations
202 are available. </FONT></P>
203 <P STYLE="margin-left: 2cm"><FONT SIZE=3><SPAN STYLE="background: #e6e6e6">
204 <U><FONT COLOR="#ff0000">Example</FONT></U>:
205 net.Train(100,&quot;text, graph, update=10&quot;).</SPAN></FONT></P>
206 <P><FONT SIZE=3>When the neural net is trained, it can be used
207 directly ( TMultiLayerPerceptron::Evaluate() ) or exported to a
208 standalone C++ code ( TMultiLayerPerceptron::Export() ).</FONT></P>
209 <P><FONT SIZE=3>Finaly, note that even if this implementation is inspired from the mlpfit code,
210 the feature lists are not exactly matching:
211 <UL>
212  <LI><P>mlpfit hybrid learning method is not implemented</P></LI>
213  <LI><P>output neurons can be normalized, this is not the case for mlpfit</P></LI>
214  <LI><P>the neural net is exported in C++, FORTRAN or PYTHON</P></LI>
215  <LI><P>the drawResult() method allows a fast check of the learning procedure</P></LI>
216 </UL>
217 In addition, the paw version of mlpfit had additional limitations on the number of neurons, hidden layers and inputs/outputs that does not apply to TMultiLayerPerceptron.
218 <!-- */
219 // -->END_HTML
220 
221 #include "TMultiLayerPerceptron.h"
222 #include "TSynapse.h"
223 #include "TNeuron.h"
224 #include "TClass.h"
225 #include "TTree.h"
226 #include "TEventList.h"
227 #include "TRandom3.h"
228 #include "TTimeStamp.h"
229 #include "TRegexp.h"
230 #include "TCanvas.h"
231 #include "TH2.h"
232 #include "TGraph.h"
233 #include "TLegend.h"
234 #include "TMultiGraph.h"
235 #include "TDirectory.h"
236 #include "TSystem.h"
237 #include "Riostream.h"
238 #include "TMath.h"
239 #include "TTreeFormula.h"
240 #include "TTreeFormulaManager.h"
241 #include "TMarker.h"
242 #include "TLine.h"
243 #include "TText.h"
244 #include "TObjString.h"
245 #include <stdlib.h>
246 
247 ClassImp(TMultiLayerPerceptron);
248 
249 ////////////////////////////////////////////////////////////////////////////////
250 /// Default constructor
251 
253 {
254  if(!TClass::GetClass("TTreePlayer")) gSystem->Load("libTreePlayer");
255  fNetwork.SetOwner(true);
256  fFirstLayer.SetOwner(false);
257  fLastLayer.SetOwner(false);
258  fSynapses.SetOwner(true);
259  fData = 0;
260  fCurrentTree = -1;
261  fCurrentTreeWeight = 1;
262  fStructure = "";
263  fWeight = "1";
264  fTraining = 0;
265  fTrainingOwner = false;
266  fTest = 0;
267  fTestOwner = false;
268  fEventWeight = 0;
269  fManager = 0;
271  fEta = .1;
272  fEtaDecay = 1;
273  fDelta = 0;
274  fEpsilon = 0;
275  fTau = 3;
276  fLastAlpha = 0;
277  fReset = 50;
280  fextF = "";
281  fextD = "";
282 }
283 
284 ////////////////////////////////////////////////////////////////////////////////
285 /// The network is described by a simple string:
286 /// The input/output layers are defined by giving
287 /// the branch names separated by comas.
288 /// Hidden layers are just described by the number of neurons.
289 /// The layers are separated by colons.
290 /// Ex: "x,y:10:5:f"
291 /// The output can be prepended by '@' if the variable has to be
292 /// normalized.
293 /// The output can be followed by '!' to use Softmax neurons for the
294 /// output layer only.
295 /// Ex: "x,y:10:5:c1,c2,c3!"
296 /// Input and outputs are taken from the TTree given as second argument.
297 /// training and test are the two TEventLists defining events
298 /// to be used during the neural net training.
299 /// Both the TTree and the TEventLists can be defined in the constructor,
300 /// or later with the suited setter method.
301 
302 TMultiLayerPerceptron::TMultiLayerPerceptron(const char * layout, TTree * data,
303  TEventList * training,
304  TEventList * test,
306  const char* extF, const char* extD)
307 {
308  if(!TClass::GetClass("TTreePlayer")) gSystem->Load("libTreePlayer");
309  fNetwork.SetOwner(true);
310  fFirstLayer.SetOwner(false);
311  fLastLayer.SetOwner(false);
312  fSynapses.SetOwner(true);
313  fStructure = layout;
314  fData = data;
315  fCurrentTree = -1;
316  fCurrentTreeWeight = 1;
317  fTraining = training;
318  fTrainingOwner = false;
319  fTest = test;
320  fTestOwner = false;
321  fWeight = "1";
322  fType = type;
324  fextF = extF;
325  fextD = extD;
326  fEventWeight = 0;
327  fManager = 0;
328  if (data) {
329  BuildNetwork();
330  AttachData();
331  }
333  fEta = .1;
334  fEpsilon = 0;
335  fDelta = 0;
336  fEtaDecay = 1;
337  fTau = 3;
338  fLastAlpha = 0;
339  fReset = 50;
340 }
341 
342 ////////////////////////////////////////////////////////////////////////////////
343 /// The network is described by a simple string:
344 /// The input/output layers are defined by giving
345 /// the branch names separated by comas.
346 /// Hidden layers are just described by the number of neurons.
347 /// The layers are separated by colons.
348 /// Ex: "x,y:10:5:f"
349 /// The output can be prepended by '@' if the variable has to be
350 /// normalized.
351 /// The output can be followed by '!' to use Softmax neurons for the
352 /// output layer only.
353 /// Ex: "x,y:10:5:c1,c2,c3!"
354 /// Input and outputs are taken from the TTree given as second argument.
355 /// training and test are the two TEventLists defining events
356 /// to be used during the neural net training.
357 /// Both the TTree and the TEventLists can be defined in the constructor,
358 /// or later with the suited setter method.
359 
361  const char * weight, TTree * data,
362  TEventList * training,
363  TEventList * test,
365  const char* extF, const char* extD)
366 {
367  if(!TClass::GetClass("TTreePlayer")) gSystem->Load("libTreePlayer");
368  fNetwork.SetOwner(true);
369  fFirstLayer.SetOwner(false);
370  fLastLayer.SetOwner(false);
371  fSynapses.SetOwner(true);
372  fStructure = layout;
373  fData = data;
374  fCurrentTree = -1;
375  fCurrentTreeWeight = 1;
376  fTraining = training;
377  fTrainingOwner = false;
378  fTest = test;
379  fTestOwner = false;
380  fWeight = weight;
381  fType = type;
383  fextF = extF;
384  fextD = extD;
385  fEventWeight = 0;
386  fManager = 0;
387  if (data) {
388  BuildNetwork();
389  AttachData();
390  }
392  fEta = .1;
393  fEtaDecay = 1;
394  fDelta = 0;
395  fEpsilon = 0;
396  fTau = 3;
397  fLastAlpha = 0;
398  fReset = 50;
399 }
400 
401 ////////////////////////////////////////////////////////////////////////////////
402 /// The network is described by a simple string:
403 /// The input/output layers are defined by giving
404 /// the branch names separated by comas.
405 /// Hidden layers are just described by the number of neurons.
406 /// The layers are separated by colons.
407 /// Ex: "x,y:10:5:f"
408 /// The output can be prepended by '@' if the variable has to be
409 /// normalized.
410 /// The output can be followed by '!' to use Softmax neurons for the
411 /// output layer only.
412 /// Ex: "x,y:10:5:c1,c2,c3!"
413 /// Input and outputs are taken from the TTree given as second argument.
414 /// training and test are two cuts (see TTreeFormula) defining events
415 /// to be used during the neural net training and testing.
416 /// Example: "Entry$%2", "(Entry$+1)%2".
417 /// Both the TTree and the cut can be defined in the constructor,
418 /// or later with the suited setter method.
419 
420 TMultiLayerPerceptron::TMultiLayerPerceptron(const char * layout, TTree * data,
421  const char * training,
422  const char * test,
424  const char* extF, const char* extD)
425 {
426  if(!TClass::GetClass("TTreePlayer")) gSystem->Load("libTreePlayer");
427  fNetwork.SetOwner(true);
428  fFirstLayer.SetOwner(false);
429  fLastLayer.SetOwner(false);
430  fSynapses.SetOwner(true);
431  fStructure = layout;
432  fData = data;
433  fCurrentTree = -1;
434  fCurrentTreeWeight = 1;
435  fTraining = new TEventList(Form("fTrainingList_%lu",(ULong_t)this));
436  fTrainingOwner = true;
437  fTest = new TEventList(Form("fTestList_%lu",(ULong_t)this));
438  fTestOwner = true;
439  fWeight = "1";
440  TString testcut = test;
441  if(testcut=="") testcut = Form("!(%s)",training);
442  fType = type;
444  fextF = extF;
445  fextD = extD;
446  fEventWeight = 0;
447  fManager = 0;
448  if (data) {
449  BuildNetwork();
450  data->Draw(Form(">>fTrainingList_%lu",(ULong_t)this),training,"goff");
451  data->Draw(Form(">>fTestList_%lu",(ULong_t)this),(const char *)testcut,"goff");
452  AttachData();
453  }
454  else {
455  Warning("TMultiLayerPerceptron::TMultiLayerPerceptron","Data not set. Cannot define datasets");
456  }
458  fEta = .1;
459  fEtaDecay = 1;
460  fDelta = 0;
461  fEpsilon = 0;
462  fTau = 3;
463  fLastAlpha = 0;
464  fReset = 50;
465 }
466 
467 ////////////////////////////////////////////////////////////////////////////////
468 /// The network is described by a simple string:
469 /// The input/output layers are defined by giving
470 /// the branch names separated by comas.
471 /// Hidden layers are just described by the number of neurons.
472 /// The layers are separated by colons.
473 /// Ex: "x,y:10:5:f"
474 /// The output can be prepended by '@' if the variable has to be
475 /// normalized.
476 /// The output can be followed by '!' to use Softmax neurons for the
477 /// output layer only.
478 /// Ex: "x,y:10:5:c1,c2,c3!"
479 /// Input and outputs are taken from the TTree given as second argument.
480 /// training and test are two cuts (see TTreeFormula) defining events
481 /// to be used during the neural net training and testing.
482 /// Example: "Entry$%2", "(Entry$+1)%2".
483 /// Both the TTree and the cut can be defined in the constructor,
484 /// or later with the suited setter method.
485 
487  const char * weight, TTree * data,
488  const char * training,
489  const char * test,
491  const char* extF, const char* extD)
492 {
493  if(!TClass::GetClass("TTreePlayer")) gSystem->Load("libTreePlayer");
494  fNetwork.SetOwner(true);
495  fFirstLayer.SetOwner(false);
496  fLastLayer.SetOwner(false);
497  fSynapses.SetOwner(true);
498  fStructure = layout;
499  fData = data;
500  fCurrentTree = -1;
501  fCurrentTreeWeight = 1;
502  fTraining = new TEventList(Form("fTrainingList_%lu",(ULong_t)this));
503  fTrainingOwner = true;
504  fTest = new TEventList(Form("fTestList_%lu",(ULong_t)this));
505  fTestOwner = true;
506  fWeight = weight;
507  TString testcut = test;
508  if(testcut=="") testcut = Form("!(%s)",training);
509  fType = type;
511  fextF = extF;
512  fextD = extD;
513  fEventWeight = 0;
514  fManager = 0;
515  if (data) {
516  BuildNetwork();
517  data->Draw(Form(">>fTrainingList_%lu",(ULong_t)this),training,"goff");
518  data->Draw(Form(">>fTestList_%lu",(ULong_t)this),(const char *)testcut,"goff");
519  AttachData();
520  }
521  else {
522  Warning("TMultiLayerPerceptron::TMultiLayerPerceptron","Data not set. Cannot define datasets");
523  }
525  fEta = .1;
526  fEtaDecay = 1;
527  fDelta = 0;
528  fEpsilon = 0;
529  fTau = 3;
530  fLastAlpha = 0;
531  fReset = 50;
532 }
533 
534 ////////////////////////////////////////////////////////////////////////////////
535 /// Destructor
536 
538 {
539  if(fTraining && fTrainingOwner) delete fTraining;
540  if(fTest && fTestOwner) delete fTest;
541 }
542 
543 ////////////////////////////////////////////////////////////////////////////////
544 /// Set the data source
545 
547 {
548  if (fData) {
549  std::cerr << "Error: data already defined." << std::endl;
550  return;
551  }
552  fData = data;
553  if (data) {
554  BuildNetwork();
555  AttachData();
556  }
557 }
558 
559 ////////////////////////////////////////////////////////////////////////////////
560 /// Set the event weight
561 
562 void TMultiLayerPerceptron::SetEventWeight(const char * branch)
563 {
564  fWeight=branch;
565  if (fData) {
566  if (fEventWeight) {
567  fManager->Remove(fEventWeight);
568  delete fEventWeight;
569  }
570  fManager->Add((fEventWeight = new TTreeFormula("NNweight",fWeight.Data(),fData)));
571  }
572 }
573 
574 ////////////////////////////////////////////////////////////////////////////////
575 /// Sets the Training dataset.
576 /// Those events will be used for the minimization
577 
579 {
580  if(fTraining && fTrainingOwner) delete fTraining;
581  fTraining = train;
582  fTrainingOwner = false;
583 }
584 
585 ////////////////////////////////////////////////////////////////////////////////
586 /// Sets the Test dataset.
587 /// Those events will not be used for the minimization but for control
588 
590 {
591  if(fTest && fTestOwner) delete fTest;
592  fTest = test;
593  fTestOwner = false;
594 }
595 
596 ////////////////////////////////////////////////////////////////////////////////
597 /// Sets the Training dataset.
598 /// Those events will be used for the minimization.
599 /// Note that the tree must be already defined.
600 
602 {
603  if(fTraining && fTrainingOwner) delete fTraining;
604  fTraining = new TEventList(Form("fTrainingList_%lu",(ULong_t)this));
605  fTrainingOwner = true;
606  if (fData) {
607  fData->Draw(Form(">>fTrainingList_%lu",(ULong_t)this),train,"goff");
608  }
609  else {
610  Warning("TMultiLayerPerceptron::TMultiLayerPerceptron","Data not set. Cannot define datasets");
611  }
612 }
613 
614 ////////////////////////////////////////////////////////////////////////////////
615 /// Sets the Test dataset.
616 /// Those events will not be used for the minimization but for control.
617 /// Note that the tree must be already defined.
618 
620 {
621  if(fTest && fTestOwner) {delete fTest; fTest=0;}
622  if(fTest) if(strncmp(fTest->GetName(),Form("fTestList_%lu",(ULong_t)this),10)) delete fTest;
623  fTest = new TEventList(Form("fTestList_%lu",(ULong_t)this));
624  fTestOwner = true;
625  if (fData) {
626  fData->Draw(Form(">>fTestList_%lu",(ULong_t)this),test,"goff");
627  }
628  else {
629  Warning("TMultiLayerPerceptron::TMultiLayerPerceptron","Data not set. Cannot define datasets");
630  }
631 }
632 
633 ////////////////////////////////////////////////////////////////////////////////
634 /// Sets the learning method.
635 /// Available methods are: kStochastic, kBatch,
636 /// kSteepestDescent, kRibierePolak, kFletcherReeves and kBFGS.
637 /// (look at the constructor for the complete description
638 /// of learning methods and parameters)
639 
641 {
642  fLearningMethod = method;
643 }
644 
645 ////////////////////////////////////////////////////////////////////////////////
646 /// Sets Eta - used in stochastic minimisation
647 /// (look at the constructor for the complete description
648 /// of learning methods and parameters)
649 
651 {
652  fEta = eta;
653 }
654 
655 ////////////////////////////////////////////////////////////////////////////////
656 /// Sets Epsilon - used in stochastic minimisation
657 /// (look at the constructor for the complete description
658 /// of learning methods and parameters)
659 
661 {
662  fEpsilon = eps;
663 }
664 
665 ////////////////////////////////////////////////////////////////////////////////
666 /// Sets Delta - used in stochastic minimisation
667 /// (look at the constructor for the complete description
668 /// of learning methods and parameters)
669 
671 {
672  fDelta = delta;
673 }
674 
675 ////////////////////////////////////////////////////////////////////////////////
676 /// Sets EtaDecay - Eta *= EtaDecay at each epoch
677 /// (look at the constructor for the complete description
678 /// of learning methods and parameters)
679 
681 {
682  fEtaDecay = ed;
683 }
684 
685 ////////////////////////////////////////////////////////////////////////////////
686 /// Sets Tau - used in line search
687 /// (look at the constructor for the complete description
688 /// of learning methods and parameters)
689 
691 {
692  fTau = tau;
693 }
694 
695 ////////////////////////////////////////////////////////////////////////////////
696 /// Sets number of epochs between two resets of the
697 /// search direction to the steepest descent.
698 /// (look at the constructor for the complete description
699 /// of learning methods and parameters)
700 
702 {
703  fReset = reset;
704 }
705 
706 ////////////////////////////////////////////////////////////////////////////////
707 /// Load an entry into the network
708 
709 void TMultiLayerPerceptron::GetEntry(Int_t entry) const
710 {
711  if (!fData) return;
712  fData->GetEntry(entry);
713  if (fData->GetTreeNumber() != fCurrentTree) {
714  ((TMultiLayerPerceptron*)this)->fCurrentTree = fData->GetTreeNumber();
715  fManager->Notify();
716  ((TMultiLayerPerceptron*)this)->fCurrentTreeWeight = fData->GetWeight();
717  }
718  Int_t nentries = fNetwork.GetEntriesFast();
719  for (Int_t i=0;i<nentries;i++) {
720  TNeuron *neuron = (TNeuron *)fNetwork.UncheckedAt(i);
721  neuron->SetNewEvent();
722  }
723 }
724 
725 ////////////////////////////////////////////////////////////////////////////////
726 /// Train the network.
727 /// nEpoch is the number of iterations.
728 /// option can contain:
729 /// - "text" (simple text output)
730 /// - "graph" (evoluting graphical training curves)
731 /// - "update=X" (step for the text/graph output update)
732 /// - "+" will skip the randomisation and start from the previous values.
733 /// - "current" (draw in the current canvas)
734 /// - "minErrorTrain" (stop when NN error on the training sample gets below minE
735 /// - "minErrorTest" (stop when NN error on the test sample gets below minE
736 /// All combinations are available.
737 
738 void TMultiLayerPerceptron::Train(Int_t nEpoch, Option_t * option, Double_t minE)
739 {
740  Int_t i;
741  TString opt = option;
742  opt.ToLower();
743  // Decode options and prepare training.
744  Int_t verbosity = 0;
745  Bool_t newCanvas = true;
746  Bool_t minE_Train = false;
747  Bool_t minE_Test = false;
748  if (opt.Contains("text"))
749  verbosity += 1;
750  if (opt.Contains("graph"))
751  verbosity += 2;
752  Int_t displayStepping = 1;
753  if (opt.Contains("update=")) {
754  TRegexp reg("update=[0-9]*");
755  TString out = opt(reg);
756  displayStepping = atoi(out.Data() + 7);
757  }
758  if (opt.Contains("current"))
759  newCanvas = false;
760  if (opt.Contains("minerrortrain"))
761  minE_Train = true;
762  if (opt.Contains("minerrortest"))
763  minE_Test = true;
764  TVirtualPad *canvas = 0;
765  TMultiGraph *residual_plot = 0;
766  TGraph *train_residual_plot = 0;
767  TGraph *test_residual_plot = 0;
768  if ((!fData) || (!fTraining) || (!fTest)) {
769  Error("Train","Training/Test samples still not defined. Cannot train the neural network");
770  return;
771  }
772  Info("Train","Using %d train and %d test entries.",
773  fTraining->GetN(), fTest->GetN());
774  // Text and Graph outputs
775  if (verbosity % 2)
776  std::cout << "Training the Neural Network" << std::endl;
777  if (verbosity / 2) {
778  residual_plot = new TMultiGraph;
779  if(newCanvas)
780  canvas = new TCanvas("NNtraining", "Neural Net training");
781  else {
782  canvas = gPad;
783  if(!canvas) canvas = new TCanvas("NNtraining", "Neural Net training");
784  }
785  train_residual_plot = new TGraph(nEpoch);
786  test_residual_plot = new TGraph(nEpoch);
787  canvas->SetLeftMargin(0.14);
788  train_residual_plot->SetLineColor(4);
789  test_residual_plot->SetLineColor(2);
790  residual_plot->Add(train_residual_plot);
791  residual_plot->Add(test_residual_plot);
792  residual_plot->Draw("LA");
793  if (residual_plot->GetXaxis()) residual_plot->GetXaxis()->SetTitle("Epoch");
794  if (residual_plot->GetYaxis()) residual_plot->GetYaxis()->SetTitle("Error");
795  }
796  // If the option "+" is not set, one has to randomize the weights first
797  if (!opt.Contains("+"))
798  Randomize();
799  // Initialisation
800  fLastAlpha = 0;
801  Int_t els = fNetwork.GetEntriesFast() + fSynapses.GetEntriesFast();
802  Double_t *buffer = new Double_t[els];
803  Double_t *dir = new Double_t[els];
804  for (i = 0; i < els; i++)
805  buffer[i] = 0;
806  Int_t matrix_size = fLearningMethod==TMultiLayerPerceptron::kBFGS ? els : 1;
807  TMatrixD bfgsh(matrix_size, matrix_size);
808  TMatrixD gamma(matrix_size, 1);
809  TMatrixD delta(matrix_size, 1);
810  // Epoch loop. Here is the training itself.
811  Double_t training_E = 1e10;
812  Double_t test_E = 1e10;
813  for (Int_t iepoch = 0; (iepoch < nEpoch) && (!minE_Train || training_E>minE) && (!minE_Test || test_E>minE) ; iepoch++) {
814  switch (fLearningMethod) {
816  {
817  MLP_Stochastic(buffer);
818  break;
819  }
821  {
822  ComputeDEDw();
823  MLP_Batch(buffer);
824  break;
825  }
827  {
828  ComputeDEDw();
829  SteepestDir(dir);
830  if (LineSearch(dir, buffer))
831  MLP_Batch(buffer);
832  break;
833  }
835  {
836  ComputeDEDw();
837  if (!(iepoch % fReset)) {
838  SteepestDir(dir);
839  } else {
840  Double_t norm = 0;
841  Double_t onorm = 0;
842  for (i = 0; i < els; i++)
843  onorm += dir[i] * dir[i];
844  Double_t prod = 0;
845  Int_t idx = 0;
846  TNeuron *neuron = 0;
847  TSynapse *synapse = 0;
848  Int_t nentries = fNetwork.GetEntriesFast();
849  for (i=0;i<nentries;i++) {
850  neuron = (TNeuron *) fNetwork.UncheckedAt(i);
851  prod -= dir[idx++] * neuron->GetDEDw();
852  norm += neuron->GetDEDw() * neuron->GetDEDw();
853  }
854  nentries = fSynapses.GetEntriesFast();
855  for (i=0;i<nentries;i++) {
856  synapse = (TSynapse *) fSynapses.UncheckedAt(i);
857  prod -= dir[idx++] * synapse->GetDEDw();
858  norm += synapse->GetDEDw() * synapse->GetDEDw();
859  }
860  ConjugateGradientsDir(dir, (norm - prod) / onorm);
861  }
862  if (LineSearch(dir, buffer))
863  MLP_Batch(buffer);
864  break;
865  }
867  {
868  ComputeDEDw();
869  if (!(iepoch % fReset)) {
870  SteepestDir(dir);
871  } else {
872  Double_t norm = 0;
873  Double_t onorm = 0;
874  for (i = 0; i < els; i++)
875  onorm += dir[i] * dir[i];
876  TNeuron *neuron = 0;
877  TSynapse *synapse = 0;
878  Int_t nentries = fNetwork.GetEntriesFast();
879  for (i=0;i<nentries;i++) {
880  neuron = (TNeuron *) fNetwork.UncheckedAt(i);
881  norm += neuron->GetDEDw() * neuron->GetDEDw();
882  }
883  nentries = fSynapses.GetEntriesFast();
884  for (i=0;i<nentries;i++) {
885  synapse = (TSynapse *) fSynapses.UncheckedAt(i);
886  norm += synapse->GetDEDw() * synapse->GetDEDw();
887  }
888  ConjugateGradientsDir(dir, norm / onorm);
889  }
890  if (LineSearch(dir, buffer))
891  MLP_Batch(buffer);
892  break;
893  }
895  {
896  SetGammaDelta(gamma, delta, buffer);
897  if (!(iepoch % fReset)) {
898  SteepestDir(dir);
899  bfgsh.UnitMatrix();
900  } else {
901  if (GetBFGSH(bfgsh, gamma, delta)) {
902  SteepestDir(dir);
903  bfgsh.UnitMatrix();
904  } else {
905  BFGSDir(bfgsh, dir);
906  }
907  }
908  if (DerivDir(dir) > 0) {
909  SteepestDir(dir);
910  bfgsh.UnitMatrix();
911  }
912  if (LineSearch(dir, buffer)) {
913  bfgsh.UnitMatrix();
914  SteepestDir(dir);
915  if (LineSearch(dir, buffer)) {
916  Error("TMultiLayerPerceptron::Train()","Line search fail");
917  iepoch = nEpoch;
918  }
919  }
920  break;
921  }
922  }
923  // Security: would the learning lead to non real numbers,
924  // the learning should stop now.
926  Error("TMultiLayerPerceptron::Train()","Stop.");
927  iepoch = nEpoch;
928  }
929  // Process other ROOT events. Time penalty is less than
930  // 1/1000 sec/evt on a mobile AMD Athlon(tm) XP 1500+
931  gSystem->ProcessEvents();
934  // Intermediate graph and text output
935  if ((verbosity % 2) && ((!(iepoch % displayStepping)) || (iepoch == nEpoch - 1))) {
936  std::cout << "Epoch: " << iepoch
937  << " learn=" << training_E
938  << " test=" << test_E
939  << std::endl;
940  }
941  if (verbosity / 2) {
942  train_residual_plot->SetPoint(iepoch, iepoch,training_E);
943  test_residual_plot->SetPoint(iepoch, iepoch,test_E);
944  if (!iepoch) {
945  Double_t trp = train_residual_plot->GetY()[iepoch];
946  Double_t tep = test_residual_plot->GetY()[iepoch];
947  for (i = 1; i < nEpoch; i++) {
948  train_residual_plot->SetPoint(i, i, trp);
949  test_residual_plot->SetPoint(i, i, tep);
950  }
951  }
952  if ((!(iepoch % displayStepping)) || (iepoch == nEpoch - 1)) {
953  if (residual_plot->GetYaxis()) {
954  residual_plot->GetYaxis()->UnZoom();
955  residual_plot->GetYaxis()->SetTitleOffset(1.4);
956  residual_plot->GetYaxis()->SetDecimals();
957  }
958  canvas->Modified();
959  canvas->Update();
960  }
961  }
962  }
963  // Cleaning
964  delete [] buffer;
965  delete [] dir;
966  // Final Text and Graph outputs
967  if (verbosity % 2)
968  std::cout << "Training done." << std::endl;
969  if (verbosity / 2) {
970  TLegend *legend = new TLegend(.75, .80, .95, .95);
971  legend->AddEntry(residual_plot->GetListOfGraphs()->At(0),
972  "Training sample", "L");
973  legend->AddEntry(residual_plot->GetListOfGraphs()->At(1),
974  "Test sample", "L");
975  legend->Draw();
976  }
977 }
978 
979 ////////////////////////////////////////////////////////////////////////////////
980 /// Computes the output for a given event.
981 /// Look at the output neuron designed by index.
982 
983 Double_t TMultiLayerPerceptron::Result(Int_t event, Int_t index) const
984 {
985  GetEntry(event);
986  TNeuron *out = (TNeuron *) (fLastLayer.At(index));
987  if (out)
988  return out->GetValue();
989  else
990  return 0;
991 }
992 
993 ////////////////////////////////////////////////////////////////////////////////
994 /// Error on the output for a given event
995 
996 Double_t TMultiLayerPerceptron::GetError(Int_t event) const
997 {
998  GetEntry(event);
999  Double_t error = 0;
1000  // look at 1st output neruon to determine type and error function
1001  Int_t nEntries = fLastLayer.GetEntriesFast();
1002  if (nEntries == 0) return 0.0;
1003  switch (fOutType) {
1004  case (TNeuron::kSigmoid):
1005  error = GetCrossEntropyBinary();
1006  break;
1007  case (TNeuron::kSoftmax):
1008  error = GetCrossEntropy();
1009  break;
1010  case (TNeuron::kLinear):
1011  error = GetSumSquareError();
1012  break;
1013  default:
1014  // default to sum-of-squares error
1015  error = GetSumSquareError();
1016  }
1017  error *= fEventWeight->EvalInstance();
1018  error *= fCurrentTreeWeight;
1019  return error;
1020 }
1021 
1022 ////////////////////////////////////////////////////////////////////////////////
1023 /// Error on the whole dataset
1024 
1026 {
1027  TEventList *list =
1029  Double_t error = 0;
1030  Int_t i;
1031  if (list) {
1032  Int_t nEvents = list->GetN();
1033  for (i = 0; i < nEvents; i++) {
1034  error += GetError(list->GetEntry(i));
1035  }
1036  } else if (fData) {
1037  Int_t nEvents = (Int_t) fData->GetEntries();
1038  for (i = 0; i < nEvents; i++) {
1039  error += GetError(i);
1040  }
1041  }
1042  return error;
1043 }
1044 
1045 ////////////////////////////////////////////////////////////////////////////////
1046 /// Error on the output for a given event
1047 
1049 {
1050  Double_t error = 0;
1051  for (Int_t i = 0; i < fLastLayer.GetEntriesFast(); i++) {
1052  TNeuron *neuron = (TNeuron *) fLastLayer[i];
1053  error += neuron->GetError() * neuron->GetError();
1054  }
1055  return (error / 2.);
1056 }
1057 
1058 ////////////////////////////////////////////////////////////////////////////////
1059 /// Cross entropy error for sigmoid output neurons, for a given event
1060 
1062 {
1063  Double_t error = 0;
1064  for (Int_t i = 0; i < fLastLayer.GetEntriesFast(); i++) {
1065  TNeuron *neuron = (TNeuron *) fLastLayer[i];
1066  Double_t output = neuron->GetValue(); // sigmoid output and target
1067  Double_t target = neuron->GetTarget(); // values lie in [0,1]
1068  if (target < DBL_EPSILON) {
1069  if (output == 1.0)
1070  error = DBL_MAX;
1071  else
1072  error -= TMath::Log(1 - output);
1073  } else
1074  if ((1 - target) < DBL_EPSILON) {
1075  if (output == 0.0)
1076  error = DBL_MAX;
1077  else
1078  error -= TMath::Log(output);
1079  } else {
1080  if (output == 0.0 || output == 1.0)
1081  error = DBL_MAX;
1082  else
1083  error -= target * TMath::Log(output / target) + (1-target) * TMath::Log((1 - output)/(1 - target));
1084  }
1085  }
1086  return error;
1087 }
1088 
1089 ////////////////////////////////////////////////////////////////////////////////
1090 /// Cross entropy error for a softmax output neuron, for a given event
1091 
1093 {
1094  Double_t error = 0;
1095  for (Int_t i = 0; i < fLastLayer.GetEntriesFast(); i++) {
1096  TNeuron *neuron = (TNeuron *) fLastLayer[i];
1097  Double_t output = neuron->GetValue(); // softmax output and target
1098  Double_t target = neuron->GetTarget(); // values lie in [0,1]
1099  if (target > DBL_EPSILON) { // (target == 0) => dE = 0
1100  if (output == 0.0)
1101  error = DBL_MAX;
1102  else
1103  error -= target * TMath::Log(output / target);
1104  }
1105  }
1106  return error;
1107 }
1108 
1109 ////////////////////////////////////////////////////////////////////////////////
1110 /// Compute the DEDw = sum on all training events of dedw for each weight
1111 /// normalized by the number of events.
1112 
1114 {
1115  Int_t i,j;
1116  Int_t nentries = fSynapses.GetEntriesFast();
1117  TSynapse *synapse;
1118  for (i=0;i<nentries;i++) {
1119  synapse = (TSynapse *) fSynapses.UncheckedAt(i);
1120  synapse->SetDEDw(0.);
1121  }
1122  TNeuron *neuron;
1123  nentries = fNetwork.GetEntriesFast();
1124  for (i=0;i<nentries;i++) {
1125  neuron = (TNeuron *) fNetwork.UncheckedAt(i);
1126  neuron->SetDEDw(0.);
1127  }
1128  Double_t eventWeight = 1.;
1129  if (fTraining) {
1130  Int_t nEvents = fTraining->GetN();
1131  for (i = 0; i < nEvents; i++) {
1132  GetEntry(fTraining->GetEntry(i));
1133  eventWeight = fEventWeight->EvalInstance();
1134  eventWeight *= fCurrentTreeWeight;
1135  nentries = fSynapses.GetEntriesFast();
1136  for (j=0;j<nentries;j++) {
1137  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
1138  synapse->SetDEDw(synapse->GetDEDw() + (synapse->GetDeDw()*eventWeight));
1139  }
1140  nentries = fNetwork.GetEntriesFast();
1141  for (j=0;j<nentries;j++) {
1142  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
1143  neuron->SetDEDw(neuron->GetDEDw() + (neuron->GetDeDw()*eventWeight));
1144  }
1145  }
1146  nentries = fSynapses.GetEntriesFast();
1147  for (j=0;j<nentries;j++) {
1148  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
1149  synapse->SetDEDw(synapse->GetDEDw() / (Double_t) nEvents);
1150  }
1151  nentries = fNetwork.GetEntriesFast();
1152  for (j=0;j<nentries;j++) {
1153  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
1154  neuron->SetDEDw(neuron->GetDEDw() / (Double_t) nEvents);
1155  }
1156  } else if (fData) {
1157  Int_t nEvents = (Int_t) fData->GetEntries();
1158  for (i = 0; i < nEvents; i++) {
1159  GetEntry(i);
1160  eventWeight = fEventWeight->EvalInstance();
1161  eventWeight *= fCurrentTreeWeight;
1162  nentries = fSynapses.GetEntriesFast();
1163  for (j=0;j<nentries;j++) {
1164  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
1165  synapse->SetDEDw(synapse->GetDEDw() + (synapse->GetDeDw()*eventWeight));
1166  }
1167  nentries = fNetwork.GetEntriesFast();
1168  for (j=0;j<nentries;j++) {
1169  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
1170  neuron->SetDEDw(neuron->GetDEDw() + (neuron->GetDeDw()*eventWeight));
1171  }
1172  }
1173  nentries = fSynapses.GetEntriesFast();
1174  for (j=0;j<nentries;j++) {
1175  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
1176  synapse->SetDEDw(synapse->GetDEDw() / (Double_t) nEvents);
1177  }
1178  nentries = fNetwork.GetEntriesFast();
1179  for (j=0;j<nentries;j++) {
1180  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
1181  neuron->SetDEDw(neuron->GetDEDw() / (Double_t) nEvents);
1182  }
1183  }
1184 }
1185 
1186 ////////////////////////////////////////////////////////////////////////////////
1187 /// Randomize the weights
1188 
1190 {
1191  Int_t nentries = fSynapses.GetEntriesFast();
1192  Int_t j;
1193  TSynapse *synapse;
1194  TNeuron *neuron;
1195  TTimeStamp ts;
1196  TRandom3 gen(ts.GetSec());
1197  for (j=0;j<nentries;j++) {
1198  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
1199  synapse->SetWeight(gen.Rndm() - 0.5);
1200  }
1201  nentries = fNetwork.GetEntriesFast();
1202  for (j=0;j<nentries;j++) {
1203  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
1204  neuron->SetWeight(gen.Rndm() - 0.5);
1205  }
1206 }
1207 
1208 ////////////////////////////////////////////////////////////////////////////////
1209 /// Connects the TTree to Neurons in input and output
1210 /// layers. The formulas associated to each neuron are created
1211 /// and reported to the network formula manager.
1212 /// By default, the branch is not normalised since this would degrade
1213 /// performance for classification jobs.
1214 /// Normalisation can be requested by putting '@' in front of the formula.
1215 
1217 {
1218  Int_t j = 0;
1219  TNeuron *neuron = 0;
1220  Bool_t normalize = false;
1221  fManager = new TTreeFormulaManager;
1222 
1223  // Set the size of the internal array of parameters of the formula
1224  Int_t maxop, maxpar, maxconst;
1225  ROOT::v5::TFormula::GetMaxima(maxop, maxpar, maxconst);
1226  ROOT::v5::TFormula::SetMaxima(10, 10, 10);
1227 
1228  //first layer
1229  const TString input = TString(fStructure(0, fStructure.First(':')));
1230  const TObjArray *inpL = input.Tokenize(", ");
1231  Int_t nentries = fFirstLayer.GetEntriesFast();
1232  // make sure nentries == entries in inpL
1233  R__ASSERT(nentries == inpL->GetLast()+1);
1234  for (j=0;j<nentries;j++) {
1235  normalize = false;
1236  const TString brName = ((TObjString *)inpL->At(j))->GetString();
1237  neuron = (TNeuron *) fFirstLayer.UncheckedAt(j);
1238  if (brName[0]=='@')
1239  normalize = true;
1240  fManager->Add(neuron->UseBranch(fData,brName.Data() + (normalize?1:0)));
1241  if(!normalize) neuron->SetNormalisation(0., 1.);
1242  }
1243  delete inpL;
1244 
1245  // last layer
1246  TString output = TString(
1247  fStructure(fStructure.Last(':') + 1,
1248  fStructure.Length() - fStructure.Last(':')));
1249  const TObjArray *outL = output.Tokenize(", ");
1250  nentries = fLastLayer.GetEntriesFast();
1251  // make sure nentries == entries in outL
1252  R__ASSERT(nentries == outL->GetLast()+1);
1253  for (j=0;j<nentries;j++) {
1254  normalize = false;
1255  const TString brName = ((TObjString *)outL->At(j))->GetString();
1256  neuron = (TNeuron *) fLastLayer.UncheckedAt(j);
1257  if (brName[0]=='@')
1258  normalize = true;
1259  fManager->Add(neuron->UseBranch(fData,brName.Data() + (normalize?1:0)));
1260  if(!normalize) neuron->SetNormalisation(0., 1.);
1261  }
1262  delete outL;
1263 
1264  fManager->Add((fEventWeight = new TTreeFormula("NNweight",fWeight.Data(),fData)));
1265  //fManager->Sync();
1266 
1267  // Set the old values
1268  ROOT::v5::TFormula::SetMaxima(maxop, maxpar, maxconst);
1269 }
1270 
1271 ////////////////////////////////////////////////////////////////////////////////
1272 /// Expand the structure of the first layer
1273 
1275 {
1276  TString input = TString(fStructure(0, fStructure.First(':')));
1277  const TObjArray *inpL = input.Tokenize(", ");
1278  Int_t nneurons = inpL->GetLast()+1;
1279 
1280  TString hiddenAndOutput = TString(
1281  fStructure(fStructure.First(':') + 1,
1282  fStructure.Length() - fStructure.First(':')));
1283  TString newInput;
1284  Int_t i = 0;
1285  // loop on input neurons
1286  for (i = 0; i<nneurons; i++) {
1287  const TString name = ((TObjString *)inpL->At(i))->GetString();
1288  TTreeFormula f("sizeTestFormula",name,fData);
1289  // Variable size arrays are unrelialable
1290  if(f.GetMultiplicity()==1 && f.GetNdata()>1) {
1291  Warning("TMultiLayerPerceptron::ExpandStructure()","Variable size arrays cannot be used to build implicitely an input layer. The index 0 will be assumed.");
1292  }
1293  // Check if we are coping with an array... then expand
1294  // The array operator used is {}. It is detected in TNeuron, and
1295  // passed directly as instance index of the TTreeFormula,
1296  // so that complex compounds made of arrays can be used without
1297  // parsing the details.
1298  else if(f.GetNdata()>1) {
1299  for(Int_t j=0; j<f.GetNdata(); j++) {
1300  if(i||j) newInput += ",";
1301  newInput += name;
1302  newInput += "{";
1303  newInput += j;
1304  newInput += "}";
1305  }
1306  continue;
1307  }
1308  if(i) newInput += ",";
1309  newInput += name;
1310  }
1311  delete inpL;
1312 
1313  // Save the result
1314  fStructure = newInput + ":" + hiddenAndOutput;
1315 }
1316 
1317 ////////////////////////////////////////////////////////////////////////////////
1318 /// Instanciates the network from the description
1319 
1321 {
1322  ExpandStructure();
1323  TString input = TString(fStructure(0, fStructure.First(':')));
1324  TString hidden = TString(
1325  fStructure(fStructure.First(':') + 1,
1326  fStructure.Last(':') - fStructure.First(':') - 1));
1327  TString output = TString(
1328  fStructure(fStructure.Last(':') + 1,
1329  fStructure.Length() - fStructure.Last(':')));
1330  Int_t bll = atoi(TString(
1331  hidden(hidden.Last(':') + 1,
1332  hidden.Length() - (hidden.Last(':') + 1))).Data());
1333  if (input.Length() == 0) {
1334  Error("BuildNetwork()","malformed structure. No input layer.");
1335  return;
1336  }
1337  if (output.Length() == 0) {
1338  Error("BuildNetwork()","malformed structure. No output layer.");
1339  return;
1340  }
1341  BuildFirstLayer(input);
1342  BuildHiddenLayers(hidden);
1343  BuildLastLayer(output, bll);
1344 }
1345 
1346 ////////////////////////////////////////////////////////////////////////////////
1347 /// Instanciates the neurons in input
1348 /// Inputs are normalised and the type is set to kOff
1349 /// (simple forward of the formula value)
1350 
1352 {
1353  const TObjArray *inpL = input.Tokenize(", ");
1354  const Int_t nneurons =inpL->GetLast()+1;
1355  TNeuron *neuron = 0;
1356  Int_t i = 0;
1357  for (i = 0; i<nneurons; i++) {
1358  const TString name = ((TObjString *)inpL->At(i))->GetString();
1359  neuron = new TNeuron(TNeuron::kOff, name);
1360  fFirstLayer.AddLast(neuron);
1361  fNetwork.AddLast(neuron);
1362  }
1363  delete inpL;
1364 }
1365 
1366 ////////////////////////////////////////////////////////////////////////////////
1367 /// Builds hidden layers.
1368 
1370 {
1371  Int_t beg = 0;
1372  Int_t end = hidden.Index(":", beg + 1);
1373  Int_t prevStart = 0;
1374  Int_t prevStop = fNetwork.GetEntriesFast();
1375  Int_t layer = 1;
1376  while (end != -1) {
1377  BuildOneHiddenLayer(hidden(beg, end - beg), layer, prevStart, prevStop, false);
1378  beg = end + 1;
1379  end = hidden.Index(":", beg + 1);
1380  }
1381 
1382  BuildOneHiddenLayer(hidden(beg, hidden.Length() - beg), layer, prevStart, prevStop, true);
1383 }
1384 
1385 ////////////////////////////////////////////////////////////////////////////////
1386 /// Builds a hidden layer, updates the number of layers.
1387 
1388 void TMultiLayerPerceptron::BuildOneHiddenLayer(const TString& sNumNodes, Int_t& layer,
1389  Int_t& prevStart, Int_t& prevStop,
1390  Bool_t lastLayer)
1391 {
1392  TNeuron *neuron = 0;
1393  TSynapse *synapse = 0;
1394  TString name;
1395  if (!sNumNodes.IsAlnum() || sNumNodes.IsAlpha()) {
1396  Error("BuildOneHiddenLayer",
1397  "The specification '%s' for hidden layer %d must contain only numbers!",
1398  sNumNodes.Data(), layer - 1);
1399  } else {
1400  Int_t num = atoi(sNumNodes.Data());
1401  for (Int_t i = 0; i < num; i++) {
1402  name.Form("HiddenL%d:N%d",layer,i);
1403  neuron = new TNeuron(fType, name, "", (const char*)fextF, (const char*)fextD);
1404  fNetwork.AddLast(neuron);
1405  for (Int_t j = prevStart; j < prevStop; j++) {
1406  synapse = new TSynapse((TNeuron *) fNetwork[j], neuron);
1407  fSynapses.AddLast(synapse);
1408  }
1409  }
1410 
1411  if (!lastLayer) {
1412  // tell each neuron which ones are in its own layer (for Softmax)
1413  Int_t nEntries = fNetwork.GetEntriesFast();
1414  for (Int_t i = prevStop; i < nEntries; i++) {
1415  neuron = (TNeuron *) fNetwork[i];
1416  for (Int_t j = prevStop; j < nEntries; j++)
1417  neuron->AddInLayer((TNeuron *) fNetwork[j]);
1418  }
1419  }
1420 
1421  prevStart = prevStop;
1422  prevStop = fNetwork.GetEntriesFast();
1423  layer++;
1424  }
1425 }
1426 
1427 ////////////////////////////////////////////////////////////////////////////////
1428 /// Builds the output layer
1429 /// Neurons are linear combinations of input, by defaul.
1430 /// If the structure ends with "!", neurons are set up for classification,
1431 /// ie. with a sigmoid (1 neuron) or softmax (more neurons) activation function.
1432 
1433 void TMultiLayerPerceptron::BuildLastLayer(TString & output, Int_t prev)
1434 {
1435  Int_t nneurons = output.CountChar(',')+1;
1436  if (fStructure.EndsWith("!")) {
1437  fStructure = TString(fStructure(0, fStructure.Length() - 1)); // remove "!"
1438  if (nneurons == 1)
1440  else
1442  }
1443  Int_t prevStop = fNetwork.GetEntriesFast();
1444  Int_t prevStart = prevStop - prev;
1445  Ssiz_t pos = 0;
1446  TNeuron *neuron;
1447  TSynapse *synapse;
1448  TString name;
1449  Int_t i,j;
1450  for (i = 0; i<nneurons; i++) {
1451  Ssiz_t nextpos=output.Index(",",pos);
1452  if (nextpos!=kNPOS)
1453  name=output(pos,nextpos-pos);
1454  else name=output(pos,output.Length());
1455  pos+=nextpos+1;
1456  neuron = new TNeuron(fOutType, name);
1457  for (j = prevStart; j < prevStop; j++) {
1458  synapse = new TSynapse((TNeuron *) fNetwork[j], neuron);
1459  fSynapses.AddLast(synapse);
1460  }
1461  fLastLayer.AddLast(neuron);
1462  fNetwork.AddLast(neuron);
1463  }
1464  // tell each neuron which ones are in its own layer (for Softmax)
1465  Int_t nEntries = fNetwork.GetEntriesFast();
1466  for (i = prevStop; i < nEntries; i++) {
1467  neuron = (TNeuron *) fNetwork[i];
1468  for (j = prevStop; j < nEntries; j++)
1469  neuron->AddInLayer((TNeuron *) fNetwork[j]);
1470  }
1471 
1472 }
1473 
1474 ////////////////////////////////////////////////////////////////////////////////
1475 /// Draws the neural net output
1476 /// It produces an histogram with the output for the two datasets.
1477 /// Index is the number of the desired output neuron.
1478 /// "option" can contain:
1479 /// - test or train to select a dataset
1480 /// - comp to produce a X-Y comparison plot
1481 /// - nocanv to not create a new TCanvas for the plot
1482 
1483 void TMultiLayerPerceptron::DrawResult(Int_t index, Option_t * option) const
1484 {
1485  TString opt = option;
1486  opt.ToLower();
1487  TNeuron *out = (TNeuron *) (fLastLayer.At(index));
1488  if (!out) {
1489  Error("DrawResult()","no such output.");
1490  return;
1491  }
1492  //TCanvas *canvas = new TCanvas("NNresult", "Neural Net output");
1493  if (!opt.Contains("nocanv"))
1494  new TCanvas("NNresult", "Neural Net output");
1495  const Double_t *norm = out->GetNormalisation();
1496  TEventList *events = 0;
1497  TString setname;
1498  Int_t i;
1499  if (opt.Contains("train")) {
1500  events = fTraining;
1501  setname = Form("train%d",index);
1502  } else if (opt.Contains("test")) {
1503  events = fTest;
1504  setname = Form("test%d",index);
1505  }
1506  if ((!fData) || (!events)) {
1507  Error("DrawResult()","no dataset.");
1508  return;
1509  }
1510  if (opt.Contains("comp")) {
1511  //comparison plot
1512  TString title = "Neural Net Output control. ";
1513  title += setname;
1514  setname = "MLP_" + setname + "_comp";
1515  TH2D *hist = ((TH2D *) gDirectory->Get(setname.Data()));
1516  if (!hist)
1517  hist = new TH2D(setname.Data(), title.Data(), 50, -1, 1, 50, -1, 1);
1518  hist->Reset();
1519  Int_t nEvents = events->GetN();
1520  for (i = 0; i < nEvents; i++) {
1521  GetEntry(events->GetEntry(i));
1522  hist->Fill(out->GetValue(), (out->GetBranch() - norm[1]) / norm[0]);
1523  }
1524  hist->Draw();
1525  } else {
1526  //output plot
1527  TString title = "Neural Net Output. ";
1528  title += setname;
1529  setname = "MLP_" + setname;
1530  TH1D *hist = ((TH1D *) gDirectory->Get(setname.Data()));
1531  if (!hist)
1532  hist = new TH1D(setname, title, 50, 1, -1);
1533  hist->Reset();
1534  Int_t nEvents = events->GetN();
1535  for (i = 0; i < nEvents; i++)
1536  hist->Fill(Result(events->GetEntry(i), index));
1537  hist->Draw();
1538  if (opt.Contains("train") && opt.Contains("test")) {
1539  events = fTraining;
1540  setname = "train";
1541  hist = ((TH1D *) gDirectory->Get("MLP_test"));
1542  if (!hist)
1543  hist = new TH1D(setname, title, 50, 1, -1);
1544  hist->Reset();
1545  nEvents = events->GetN();
1546  for (i = 0; i < nEvents; i++)
1547  hist->Fill(Result(events->GetEntry(i), index));
1548  hist->Draw("same");
1549  }
1550  }
1551 }
1552 
1553 ////////////////////////////////////////////////////////////////////////////////
1554 /// Dumps the weights to a text file.
1555 /// Set filename to "-" (default) to dump to the standard output
1556 
1557 Bool_t TMultiLayerPerceptron::DumpWeights(Option_t * filename) const
1558 {
1559  TString filen = filename;
1560  std::ostream * output;
1561  if (filen == "") {
1562  Error("TMultiLayerPerceptron::DumpWeights()","Invalid file name");
1563  return kFALSE;
1564  }
1565  if (filen == "-")
1566  output = &std::cout;
1567  else
1568  output = new std::ofstream(filen.Data());
1569  TNeuron *neuron = 0;
1570  *output << "#input normalization" << std::endl;
1571  Int_t nentries = fFirstLayer.GetEntriesFast();
1572  Int_t j=0;
1573  for (j=0;j<nentries;j++) {
1574  neuron = (TNeuron *) fFirstLayer.UncheckedAt(j);
1575  *output << neuron->GetNormalisation()[0] << " "
1576  << neuron->GetNormalisation()[1] << std::endl;
1577  }
1578  *output << "#output normalization" << std::endl;
1579  nentries = fLastLayer.GetEntriesFast();
1580  for (j=0;j<nentries;j++) {
1581  neuron = (TNeuron *) fLastLayer.UncheckedAt(j);
1582  *output << neuron->GetNormalisation()[0] << " "
1583  << neuron->GetNormalisation()[1] << std::endl;
1584  }
1585  *output << "#neurons weights" << std::endl;
1586  TObjArrayIter *it = (TObjArrayIter *) fNetwork.MakeIterator();
1587  while ((neuron = (TNeuron *) it->Next()))
1588  *output << neuron->GetWeight() << std::endl;
1589  delete it;
1590  it = (TObjArrayIter *) fSynapses.MakeIterator();
1591  TSynapse *synapse = 0;
1592  *output << "#synapses weights" << std::endl;
1593  while ((synapse = (TSynapse *) it->Next()))
1594  *output << synapse->GetWeight() << std::endl;
1595  delete it;
1596  if (filen != "-") {
1597  ((std::ofstream *) output)->close();
1598  delete output;
1599  }
1600  return kTRUE;
1601 }
1602 
1603 ////////////////////////////////////////////////////////////////////////////////
1604 /// Loads the weights from a text file conforming to the format
1605 /// defined by DumpWeights.
1606 
1607 Bool_t TMultiLayerPerceptron::LoadWeights(Option_t * filename)
1608 {
1609  TString filen = filename;
1610  Double_t w;
1611  if (filen == "") {
1612  Error("TMultiLayerPerceptron::LoadWeights()","Invalid file name");
1613  return kFALSE;
1614  }
1615  char *buff = new char[100];
1616  std::ifstream input(filen.Data());
1617  // input normalzation
1618  input.getline(buff, 100);
1619  TObjArrayIter *it = (TObjArrayIter *) fFirstLayer.MakeIterator();
1620  Float_t n1,n2;
1621  TNeuron *neuron = 0;
1622  while ((neuron = (TNeuron *) it->Next())) {
1623  input >> n1 >> n2;
1624  neuron->SetNormalisation(n2,n1);
1625  }
1626  input.getline(buff, 100);
1627  // output normalization
1628  input.getline(buff, 100);
1629  delete it;
1630  it = (TObjArrayIter *) fLastLayer.MakeIterator();
1631  while ((neuron = (TNeuron *) it->Next())) {
1632  input >> n1 >> n2;
1633  neuron->SetNormalisation(n2,n1);
1634  }
1635  input.getline(buff, 100);
1636  // neuron weights
1637  input.getline(buff, 100);
1638  delete it;
1639  it = (TObjArrayIter *) fNetwork.MakeIterator();
1640  while ((neuron = (TNeuron *) it->Next())) {
1641  input >> w;
1642  neuron->SetWeight(w);
1643  }
1644  delete it;
1645  input.getline(buff, 100);
1646  // synapse weights
1647  input.getline(buff, 100);
1648  it = (TObjArrayIter *) fSynapses.MakeIterator();
1649  TSynapse *synapse = 0;
1650  while ((synapse = (TSynapse *) it->Next())) {
1651  input >> w;
1652  synapse->SetWeight(w);
1653  }
1654  delete it;
1655  delete[] buff;
1656  return kTRUE;
1657 }
1658 
1659 ////////////////////////////////////////////////////////////////////////////////
1660 /// Returns the Neural Net for a given set of input parameters
1661 /// #parameters must equal #input neurons
1662 
1663 Double_t TMultiLayerPerceptron::Evaluate(Int_t index, Double_t *params) const
1664 {
1665  TObjArrayIter *it = (TObjArrayIter *) fNetwork.MakeIterator();
1666  TNeuron *neuron;
1667  while ((neuron = (TNeuron *) it->Next()))
1668  neuron->SetNewEvent();
1669  delete it;
1670  it = (TObjArrayIter *) fFirstLayer.MakeIterator();
1671  Int_t i=0;
1672  while ((neuron = (TNeuron *) it->Next()))
1673  neuron->ForceExternalValue(params[i++]);
1674  delete it;
1675  TNeuron *out = (TNeuron *) (fLastLayer.At(index));
1676  if (out)
1677  return out->GetValue();
1678  else
1679  return 0;
1680 }
1681 
1682 ////////////////////////////////////////////////////////////////////////////////
1683 /// Exports the NN as a function for any non-ROOT-dependant code
1684 /// Supported languages are: only C++ , FORTRAN and Python (yet)
1685 /// This feature is also usefull if you want to plot the NN as
1686 /// a function (TF1 or TF2).
1687 
1688 void TMultiLayerPerceptron::Export(Option_t * filename, Option_t * language) const
1689 {
1690  TString lg = language;
1691  lg.ToUpper();
1692  Int_t i;
1693  if(GetType()==TNeuron::kExternal) {
1694  Warning("TMultiLayerPerceptron::Export","Request to export a network using an external function");
1695  }
1696  if (lg == "C++") {
1697  TString basefilename = filename;
1698  Int_t slash = basefilename.Last('/')+1;
1699  if (slash) basefilename = TString(basefilename(slash, basefilename.Length()-slash));
1700 
1701  TString classname = basefilename;
1702  TString header = filename;
1703  header += ".h";
1704  TString source = filename;
1705  source += ".cxx";
1706  std::ofstream headerfile(header);
1707  std::ofstream sourcefile(source);
1708  headerfile << "#ifndef " << basefilename << "_h" << std::endl;
1709  headerfile << "#define " << basefilename << "_h" << std::endl << std::endl;
1710  headerfile << "class " << classname << " { " << std::endl;
1711  headerfile << "public:" << std::endl;
1712  headerfile << " " << classname << "() {}" << std::endl;
1713  headerfile << " ~" << classname << "() {}" << std::endl;
1714  sourcefile << "#include \"" << header << "\"" << std::endl;
1715  sourcefile << "#include <cmath>" << std::endl << std::endl;
1716  headerfile << " double Value(int index";
1717  sourcefile << "double " << classname << "::Value(int index";
1718  for (i = 0; i < fFirstLayer.GetEntriesFast(); i++) {
1719  headerfile << ",double in" << i;
1720  sourcefile << ",double in" << i;
1721  }
1722  headerfile << ");" << std::endl;
1723  sourcefile << ") {" << std::endl;
1724  for (i = 0; i < fFirstLayer.GetEntriesFast(); i++)
1725  sourcefile << " input" << i << " = (in" << i << " - "
1726  << ((TNeuron *) fFirstLayer[i])->GetNormalisation()[1] << ")/"
1727  << ((TNeuron *) fFirstLayer[i])->GetNormalisation()[0] << ";"
1728  << std::endl;
1729  sourcefile << " switch(index) {" << std::endl;
1730  TNeuron *neuron;
1731  TObjArrayIter *it = (TObjArrayIter *) fLastLayer.MakeIterator();
1732  Int_t idx = 0;
1733  while ((neuron = (TNeuron *) it->Next()))
1734  sourcefile << " case " << idx++ << ":" << std::endl
1735  << " return neuron" << neuron << "();" << std::endl;
1736  sourcefile << " default:" << std::endl
1737  << " return 0.;" << std::endl << " }"
1738  << std::endl;
1739  sourcefile << "}" << std::endl << std::endl;
1740  headerfile << " double Value(int index, double* input);" << std::endl;
1741  sourcefile << "double " << classname << "::Value(int index, double* input) {" << std::endl;
1742  for (i = 0; i < fFirstLayer.GetEntriesFast(); i++)
1743  sourcefile << " input" << i << " = (input[" << i << "] - "
1744  << ((TNeuron *) fFirstLayer[i])->GetNormalisation()[1] << ")/"
1745  << ((TNeuron *) fFirstLayer[i])->GetNormalisation()[0] << ";"
1746  << std::endl;
1747  sourcefile << " switch(index) {" << std::endl;
1748  delete it;
1749  it = (TObjArrayIter *) fLastLayer.MakeIterator();
1750  idx = 0;
1751  while ((neuron = (TNeuron *) it->Next()))
1752  sourcefile << " case " << idx++ << ":" << std::endl
1753  << " return neuron" << neuron << "();" << std::endl;
1754  sourcefile << " default:" << std::endl
1755  << " return 0.;" << std::endl << " }"
1756  << std::endl;
1757  sourcefile << "}" << std::endl << std::endl;
1758  headerfile << "private:" << std::endl;
1759  for (i = 0; i < fFirstLayer.GetEntriesFast(); i++)
1760  headerfile << " double input" << i << ";" << std::endl;
1761  delete it;
1762  it = (TObjArrayIter *) fNetwork.MakeIterator();
1763  idx = 0;
1764  while ((neuron = (TNeuron *) it->Next())) {
1765  if (!neuron->GetPre(0)) {
1766  headerfile << " double neuron" << neuron << "();" << std::endl;
1767  sourcefile << "double " << classname << "::neuron" << neuron
1768  << "() {" << std::endl;
1769  sourcefile << " return input" << idx++ << ";" << std::endl;
1770  sourcefile << "}" << std::endl << std::endl;
1771  } else {
1772  headerfile << " double input" << neuron << "();" << std::endl;
1773  sourcefile << "double " << classname << "::input" << neuron
1774  << "() {" << std::endl;
1775  sourcefile << " double input = " << neuron->GetWeight()
1776  << ";" << std::endl;
1777  TSynapse *syn = 0;
1778  Int_t n = 0;
1779  while ((syn = neuron->GetPre(n++))) {
1780  sourcefile << " input += synapse" << syn << "();" << std::endl;
1781  }
1782  sourcefile << " return input;" << std::endl;
1783  sourcefile << "}" << std::endl << std::endl;
1784 
1785  headerfile << " double neuron" << neuron << "();" << std::endl;
1786  sourcefile << "double " << classname << "::neuron" << neuron << "() {" << std::endl;
1787  sourcefile << " double input = input" << neuron << "();" << std::endl;
1788  switch(neuron->GetType()) {
1789  case (TNeuron::kSigmoid):
1790  {
1791  sourcefile << " return ((input < -709. ? 0. : (1/(1+exp(-input)))) * ";
1792  break;
1793  }
1794  case (TNeuron::kLinear):
1795  {
1796  sourcefile << " return (input * ";
1797  break;
1798  }
1799  case (TNeuron::kTanh):
1800  {
1801  sourcefile << " return (tanh(input) * ";
1802  break;
1803  }
1804  case (TNeuron::kGauss):
1805  {
1806  sourcefile << " return (exp(-input*input) * ";
1807  break;
1808  }
1809  case (TNeuron::kSoftmax):
1810  {
1811  sourcefile << " return (exp(input) / (";
1812  Int_t nn = 0;
1813  TNeuron* side = neuron->GetInLayer(nn++);
1814  sourcefile << "exp(input" << side << "())";
1815  while ((side = neuron->GetInLayer(nn++)))
1816  sourcefile << " + exp(input" << side << "())";
1817  sourcefile << ") * ";
1818  break;
1819  }
1820  default:
1821  {
1822  sourcefile << " return (0.0 * ";
1823  }
1824  }
1825  sourcefile << neuron->GetNormalisation()[0] << ")+" ;
1826  sourcefile << neuron->GetNormalisation()[1] << ";" << std::endl;
1827  sourcefile << "}" << std::endl << std::endl;
1828  }
1829  }
1830  delete it;
1831  TSynapse *synapse = 0;
1832  it = (TObjArrayIter *) fSynapses.MakeIterator();
1833  while ((synapse = (TSynapse *) it->Next())) {
1834  headerfile << " double synapse" << synapse << "();" << std::endl;
1835  sourcefile << "double " << classname << "::synapse"
1836  << synapse << "() {" << std::endl;
1837  sourcefile << " return (neuron" << synapse->GetPre()
1838  << "()*" << synapse->GetWeight() << ");" << std::endl;
1839  sourcefile << "}" << std::endl << std::endl;
1840  }
1841  delete it;
1842  headerfile << "};" << std::endl << std::endl;
1843  headerfile << "#endif // " << basefilename << "_h" << std::endl << std::endl;
1844  headerfile.close();
1845  sourcefile.close();
1846  std::cout << header << " and " << source << " created." << std::endl;
1847  }
1848  else if(lg == "FORTRAN") {
1849  TString implicit = " implicit double precision (a-h,n-z)\n";
1850  std::ofstream sigmoid("sigmoid.f");
1851  sigmoid << " double precision FUNCTION SIGMOID(X)" << std::endl
1852  << implicit
1853  << " IF(X.GT.37.) THEN" << std::endl
1854  << " SIGMOID = 1." << std::endl
1855  << " ELSE IF(X.LT.-709.) THEN" << std::endl
1856  << " SIGMOID = 0." << std::endl
1857  << " ELSE" << std::endl
1858  << " SIGMOID = 1./(1.+EXP(-X))" << std::endl
1859  << " ENDIF" << std::endl
1860  << " END" << std::endl;
1861  sigmoid.close();
1862  TString source = filename;
1863  source += ".f";
1864  std::ofstream sourcefile(source);
1865 
1866  // Header
1867  sourcefile << " double precision function " << filename
1868  << "(x, index)" << std::endl;
1869  sourcefile << implicit;
1870  sourcefile << " double precision x(" <<
1871  fFirstLayer.GetEntriesFast() << ")" << std::endl << std::endl;
1872 
1873  // Last layer
1874  sourcefile << "C --- Last Layer" << std::endl;
1875  TNeuron *neuron;
1876  TObjArrayIter *it = (TObjArrayIter *) fLastLayer.MakeIterator();
1877  Int_t idx = 0;
1878  TString ifelseif = " if (index.eq.";
1879  while ((neuron = (TNeuron *) it->Next())) {
1880  sourcefile << ifelseif.Data() << idx++ << ") then" << std::endl
1881  << " " << filename
1882  << "=neuron" << neuron << "(x);" << std::endl;
1883  ifelseif = " else if (index.eq.";
1884  }
1885  sourcefile << " else" << std::endl
1886  << " " << filename << "=0.d0" << std::endl
1887  << " endif" << std::endl;
1888  sourcefile << " end" << std::endl;
1889 
1890  // Network
1891  sourcefile << "C --- First and Hidden layers" << std::endl;
1892  delete it;
1893  it = (TObjArrayIter *) fNetwork.MakeIterator();
1894  idx = 0;
1895  while ((neuron = (TNeuron *) it->Next())) {
1896  sourcefile << " double precision function neuron"
1897  << neuron << "(x)" << std::endl
1898  << implicit;
1899  sourcefile << " double precision x("
1900  << fFirstLayer.GetEntriesFast() << ")" << std::endl << std::endl;
1901  if (!neuron->GetPre(0)) {
1902  sourcefile << " neuron" << neuron
1903  << " = (x(" << idx+1 << ") - "
1904  << ((TNeuron *) fFirstLayer[idx])->GetNormalisation()[1]
1905  << "d0)/"
1906  << ((TNeuron *) fFirstLayer[idx])->GetNormalisation()[0]
1907  << "d0" << std::endl;
1908  idx++;
1909  } else {
1910  sourcefile << " neuron" << neuron
1911  << " = " << neuron->GetWeight() << "d0" << std::endl;
1912  TSynapse *syn;
1913  Int_t n = 0;
1914  while ((syn = neuron->GetPre(n++)))
1915  sourcefile << " neuron" << neuron
1916  << " = neuron" << neuron
1917  << " + synapse" << syn << "(x)" << std::endl;
1918  switch(neuron->GetType()) {
1919  case (TNeuron::kSigmoid):
1920  {
1921  sourcefile << " neuron" << neuron
1922  << "= (sigmoid(neuron" << neuron << ")*";
1923  break;
1924  }
1925  case (TNeuron::kLinear):
1926  {
1927  break;
1928  }
1929  case (TNeuron::kTanh):
1930  {
1931  sourcefile << " neuron" << neuron
1932  << "= (tanh(neuron" << neuron << ")*";
1933  break;
1934  }
1935  case (TNeuron::kGauss):
1936  {
1937  sourcefile << " neuron" << neuron
1938  << "= (exp(-neuron" << neuron << "*neuron"
1939  << neuron << "))*";
1940  break;
1941  }
1942  case (TNeuron::kSoftmax):
1943  {
1944  Int_t nn = 0;
1945  TNeuron* side = neuron->GetInLayer(nn++);
1946  sourcefile << " div = exp(neuron" << side << "())" << std::endl;
1947  while ((side = neuron->GetInLayer(nn++)))
1948  sourcefile << " div = div + exp(neuron" << side << "())" << std::endl;
1949  sourcefile << " neuron" << neuron ;
1950  sourcefile << "= (exp(neuron" << neuron << ") / div * ";
1951  break;
1952  }
1953  default:
1954  {
1955  sourcefile << " neuron " << neuron << "= 0.";
1956  }
1957  }
1958  sourcefile << neuron->GetNormalisation()[0] << "d0)+" ;
1959  sourcefile << neuron->GetNormalisation()[1] << "d0" << std::endl;
1960  }
1961  sourcefile << " end" << std::endl;
1962  }
1963  delete it;
1964 
1965  // Synapses
1966  sourcefile << "C --- Synapses" << std::endl;
1967  TSynapse *synapse = 0;
1968  it = (TObjArrayIter *) fSynapses.MakeIterator();
1969  while ((synapse = (TSynapse *) it->Next())) {
1970  sourcefile << " double precision function " << "synapse"
1971  << synapse << "(x)\n" << implicit;
1972  sourcefile << " double precision x("
1973  << fFirstLayer.GetEntriesFast() << ")" << std::endl << std::endl;
1974  sourcefile << " synapse" << synapse
1975  << "=neuron" << synapse->GetPre()
1976  << "(x)*" << synapse->GetWeight() << "d0" << std::endl;
1977  sourcefile << " end" << std::endl << std::endl;
1978  }
1979  delete it;
1980  sourcefile.close();
1981  std::cout << source << " created." << std::endl;
1982  }
1983  else if(lg == "PYTHON") {
1984  TString classname = filename;
1985  TString pyfile = filename;
1986  pyfile += ".py";
1987  std::ofstream pythonfile(pyfile);
1988  pythonfile << "from math import exp" << std::endl << std::endl;
1989  pythonfile << "from math import tanh" << std::endl << std::endl;
1990  pythonfile << "class " << classname << ":" << std::endl;
1991  pythonfile << "\tdef value(self,index";
1992  for (i = 0; i < fFirstLayer.GetEntriesFast(); i++) {
1993  pythonfile << ",in" << i;
1994  }
1995  pythonfile << "):" << std::endl;
1996  for (i = 0; i < fFirstLayer.GetEntriesFast(); i++)
1997  pythonfile << "\t\tself.input" << i << " = (in" << i << " - "
1998  << ((TNeuron *) fFirstLayer[i])->GetNormalisation()[1] << ")/"
1999  << ((TNeuron *) fFirstLayer[i])->GetNormalisation()[0] << std::endl;
2000  TNeuron *neuron;
2001  TObjArrayIter *it = (TObjArrayIter *) fLastLayer.MakeIterator();
2002  Int_t idx = 0;
2003  while ((neuron = (TNeuron *) it->Next()))
2004  pythonfile << "\t\tif index==" << idx++
2005  << ": return self.neuron" << neuron << "();" << std::endl;
2006  pythonfile << "\t\treturn 0." << std::endl;
2007  delete it;
2008  it = (TObjArrayIter *) fNetwork.MakeIterator();
2009  idx = 0;
2010  while ((neuron = (TNeuron *) it->Next())) {
2011  pythonfile << "\tdef neuron" << neuron << "(self):" << std::endl;
2012  if (!neuron->GetPre(0))
2013  pythonfile << "\t\treturn self.input" << idx++ << std::endl;
2014  else {
2015  pythonfile << "\t\tinput = " << neuron->GetWeight() << std::endl;
2016  TSynapse *syn;
2017  Int_t n = 0;
2018  while ((syn = neuron->GetPre(n++)))
2019  pythonfile << "\t\tinput = input + self.synapse"
2020  << syn << "()" << std::endl;
2021  switch(neuron->GetType()) {
2022  case (TNeuron::kSigmoid):
2023  {
2024  pythonfile << "\t\tif input<-709. : return " << neuron->GetNormalisation()[1] << std::endl;
2025  pythonfile << "\t\treturn ((1/(1+exp(-input)))*";
2026  break;
2027  }
2028  case (TNeuron::kLinear):
2029  {
2030  pythonfile << "\t\treturn (input*";
2031  break;
2032  }
2033  case (TNeuron::kTanh):
2034  {
2035  pythonfile << "\t\treturn (tanh(input)*";
2036  break;
2037  }
2038  case (TNeuron::kGauss):
2039  {
2040  pythonfile << "\t\treturn (exp(-input*input)*";
2041  break;
2042  }
2043  case (TNeuron::kSoftmax):
2044  {
2045  pythonfile << "\t\treturn (exp(input) / (";
2046  Int_t nn = 0;
2047  TNeuron* side = neuron->GetInLayer(nn++);
2048  pythonfile << "exp(self.neuron" << side << "())";
2049  while ((side = neuron->GetInLayer(nn++)))
2050  pythonfile << " + exp(self.neuron" << side << "())";
2051  pythonfile << ") * ";
2052  break;
2053  }
2054  default:
2055  {
2056  pythonfile << "\t\treturn 0.";
2057  }
2058  }
2059  pythonfile << neuron->GetNormalisation()[0] << ")+" ;
2060  pythonfile << neuron->GetNormalisation()[1] << std::endl;
2061  }
2062  }
2063  delete it;
2064  TSynapse *synapse = 0;
2065  it = (TObjArrayIter *) fSynapses.MakeIterator();
2066  while ((synapse = (TSynapse *) it->Next())) {
2067  pythonfile << "\tdef synapse" << synapse << "(self):" << std::endl;
2068  pythonfile << "\t\treturn (self.neuron" << synapse->GetPre()
2069  << "()*" << synapse->GetWeight() << ")" << std::endl;
2070  }
2071  delete it;
2072  pythonfile.close();
2073  std::cout << pyfile << " created." << std::endl;
2074  }
2075 }
2076 
2077 ////////////////////////////////////////////////////////////////////////////////
2078 /// Shuffle the Int_t index[n] in input.
2079 /// Input:
2080 /// index: the array to shuffle
2081 /// n: the size of the array
2082 /// Output:
2083 /// index: the shuffled indexes
2084 /// This method is used for stochastic training
2085 
2086 void TMultiLayerPerceptron::Shuffle(Int_t * index, Int_t n) const
2087 {
2088  TTimeStamp ts;
2089  TRandom3 rnd(ts.GetSec());
2090  Int_t j, k;
2091  Int_t a = n - 1;
2092  for (Int_t i = 0; i < n; i++) {
2093  j = (Int_t) (rnd.Rndm() * a);
2094  k = index[j];
2095  index[j] = index[i];
2096  index[i] = k;
2097  }
2098  return;
2099 }
2100 
2101 ////////////////////////////////////////////////////////////////////////////////
2102 /// One step for the stochastic method
2103 /// buffer should contain the previous dw vector and will be updated
2104 
2106 {
2107  Int_t nEvents = fTraining->GetN();
2108  Int_t *index = new Int_t[nEvents];
2109  Int_t i,j,nentries;
2110  for (i = 0; i < nEvents; i++)
2111  index[i] = i;
2112  fEta *= fEtaDecay;
2113  Shuffle(index, nEvents);
2114  TNeuron *neuron;
2115  TSynapse *synapse;
2116  for (i = 0; i < nEvents; i++) {
2117  GetEntry(fTraining->GetEntry(index[i]));
2118  // First compute DeDw for all neurons: force calculation before
2119  // modifying the weights.
2120  nentries = fFirstLayer.GetEntriesFast();
2121  for (j=0;j<nentries;j++) {
2122  neuron = (TNeuron *) fFirstLayer.UncheckedAt(j);
2123  neuron->GetDeDw();
2124  }
2125  Int_t cnt = 0;
2126  // Step for all neurons
2127  nentries = fNetwork.GetEntriesFast();
2128  for (j=0;j<nentries;j++) {
2129  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2130  buffer[cnt] = (-fEta) * (neuron->GetDeDw() + fDelta)
2131  + fEpsilon * buffer[cnt];
2132  neuron->SetWeight(neuron->GetWeight() + buffer[cnt++]);
2133  }
2134  // Step for all synapses
2135  nentries = fSynapses.GetEntriesFast();
2136  for (j=0;j<nentries;j++) {
2137  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2138  buffer[cnt] = (-fEta) * (synapse->GetDeDw() + fDelta)
2139  + fEpsilon * buffer[cnt];
2140  synapse->SetWeight(synapse->GetWeight() + buffer[cnt++]);
2141  }
2142  }
2143  delete[]index;
2144 }
2145 
2146 ////////////////////////////////////////////////////////////////////////////////
2147 /// One step for the batch (stochastic) method.
2148 /// DEDw should have been updated before calling this.
2149 
2150 void TMultiLayerPerceptron::MLP_Batch(Double_t * buffer)
2151 {
2152  fEta *= fEtaDecay;
2153  Int_t cnt = 0;
2154  TObjArrayIter *it = (TObjArrayIter *) fNetwork.MakeIterator();
2155  TNeuron *neuron = 0;
2156  // Step for all neurons
2157  while ((neuron = (TNeuron *) it->Next())) {
2158  buffer[cnt] = (-fEta) * (neuron->GetDEDw() + fDelta)
2159  + fEpsilon * buffer[cnt];
2160  neuron->SetWeight(neuron->GetWeight() + buffer[cnt++]);
2161  }
2162  delete it;
2163  it = (TObjArrayIter *) fSynapses.MakeIterator();
2164  TSynapse *synapse = 0;
2165  // Step for all synapses
2166  while ((synapse = (TSynapse *) it->Next())) {
2167  buffer[cnt] = (-fEta) * (synapse->GetDEDw() + fDelta)
2168  + fEpsilon * buffer[cnt];
2169  synapse->SetWeight(synapse->GetWeight() + buffer[cnt++]);
2170  }
2171  delete it;
2172 }
2173 
2174 ////////////////////////////////////////////////////////////////////////////////
2175 /// Sets the weights to a point along a line
2176 /// Weights are set to [origin + (dist * dir)].
2177 
2178 void TMultiLayerPerceptron::MLP_Line(Double_t * origin, Double_t * dir, Double_t dist)
2179 {
2180  Int_t idx = 0;
2181  TNeuron *neuron = 0;
2182  TSynapse *synapse = 0;
2183  TObjArrayIter *it = (TObjArrayIter *) fNetwork.MakeIterator();
2184  while ((neuron = (TNeuron *) it->Next())) {
2185  neuron->SetWeight(origin[idx] + (dir[idx] * dist));
2186  idx++;
2187  }
2188  delete it;
2189  it = (TObjArrayIter *) fSynapses.MakeIterator();
2190  while ((synapse = (TSynapse *) it->Next())) {
2191  synapse->SetWeight(origin[idx] + (dir[idx] * dist));
2192  idx++;
2193  }
2194  delete it;
2195 }
2196 
2197 ////////////////////////////////////////////////////////////////////////////////
2198 /// Sets the search direction to steepest descent.
2199 
2201 {
2202  Int_t idx = 0;
2203  TNeuron *neuron = 0;
2204  TSynapse *synapse = 0;
2205  TObjArrayIter *it = (TObjArrayIter *) fNetwork.MakeIterator();
2206  while ((neuron = (TNeuron *) it->Next()))
2207  dir[idx++] = -neuron->GetDEDw();
2208  delete it;
2209  it = (TObjArrayIter *) fSynapses.MakeIterator();
2210  while ((synapse = (TSynapse *) it->Next()))
2211  dir[idx++] = -synapse->GetDEDw();
2212  delete it;
2213 }
2214 
2215 ////////////////////////////////////////////////////////////////////////////////
2216 /// Search along the line defined by direction.
2217 /// buffer is not used but is updated with the new dw
2218 /// so that it can be used by a later stochastic step.
2219 /// It returns true if the line search fails.
2220 
2221 bool TMultiLayerPerceptron::LineSearch(Double_t * direction, Double_t * buffer)
2222 {
2223  Int_t idx = 0;
2224  Int_t j,nentries;
2225  TNeuron *neuron = 0;
2226  TSynapse *synapse = 0;
2227  // store weights before line search
2228  Double_t *origin = new Double_t[fNetwork.GetEntriesFast() +
2229  fSynapses.GetEntriesFast()];
2230  nentries = fNetwork.GetEntriesFast();
2231  for (j=0;j<nentries;j++) {
2232  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2233  origin[idx++] = neuron->GetWeight();
2234  }
2235  nentries = fSynapses.GetEntriesFast();
2236  for (j=0;j<nentries;j++) {
2237  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2238  origin[idx++] = synapse->GetWeight();
2239  }
2240  // try to find a triplet (alpha1, alpha2, alpha3) such that
2241  // Error(alpha1)>Error(alpha2)<Error(alpha3)
2242  Double_t err1 = GetError(kTraining);
2243  Double_t alpha1 = 0.;
2244  Double_t alpha2 = fLastAlpha;
2245  if (alpha2 < 0.01)
2246  alpha2 = 0.01;
2247  if (alpha2 > 2.0)
2248  alpha2 = 2.0;
2249  Double_t alpha3 = alpha2;
2250  MLP_Line(origin, direction, alpha2);
2251  Double_t err2 = GetError(kTraining);
2252  Double_t err3 = err2;
2253  Bool_t bingo = false;
2254  Int_t icount;
2255  if (err1 > err2) {
2256  for (icount = 0; icount < 100; icount++) {
2257  alpha3 *= fTau;
2258  MLP_Line(origin, direction, alpha3);
2259  err3 = GetError(kTraining);
2260  if (err3 > err2) {
2261  bingo = true;
2262  break;
2263  }
2264  alpha1 = alpha2;
2265  err1 = err2;
2266  alpha2 = alpha3;
2267  err2 = err3;
2268  }
2269  if (!bingo) {
2270  MLP_Line(origin, direction, 0.);
2271  delete[]origin;
2272  return true;
2273  }
2274  } else {
2275  for (icount = 0; icount < 100; icount++) {
2276  alpha2 /= fTau;
2277  MLP_Line(origin, direction, alpha2);
2278  err2 = GetError(kTraining);
2279  if (err1 > err2) {
2280  bingo = true;
2281  break;
2282  }
2283  alpha3 = alpha2;
2284  err3 = err2;
2285  }
2286  if (!bingo) {
2287  MLP_Line(origin, direction, 0.);
2288  delete[]origin;
2289  fLastAlpha = 0.05;
2290  return true;
2291  }
2292  }
2293  // Sets the weights to the bottom of parabola
2294  fLastAlpha = 0.5 * (alpha1 + alpha3 -
2295  (err3 - err1) / ((err3 - err2) / (alpha3 - alpha2)
2296  - (err2 - err1) / (alpha2 - alpha1)));
2297  fLastAlpha = fLastAlpha < 10000 ? fLastAlpha : 10000;
2298  MLP_Line(origin, direction, fLastAlpha);
2300  // Stores weight changes (can be used by a later stochastic step)
2301  idx = 0;
2302  nentries = fNetwork.GetEntriesFast();
2303  for (j=0;j<nentries;j++) {
2304  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2305  buffer[idx] = neuron->GetWeight() - origin[idx];
2306  idx++;
2307  }
2308  nentries = fSynapses.GetEntriesFast();
2309  for (j=0;j<nentries;j++) {
2310  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2311  buffer[idx] = synapse->GetWeight() - origin[idx];
2312  idx++;
2313  }
2314  delete[]origin;
2315  return false;
2316 }
2317 
2318 ////////////////////////////////////////////////////////////////////////////////
2319 /// Sets the search direction to conjugate gradient direction
2320 /// beta should be:
2321 /// ||g_{(t+1)}||^2 / ||g_{(t)}||^2 (Fletcher-Reeves)
2322 /// g_{(t+1)} (g_{(t+1)}-g_{(t)}) / ||g_{(t)}||^2 (Ribiere-Polak)
2323 
2325 {
2326  Int_t idx = 0;
2327  Int_t j,nentries;
2328  TNeuron *neuron = 0;
2329  TSynapse *synapse = 0;
2330  nentries = fNetwork.GetEntriesFast();
2331  for (j=0;j<nentries;j++) {
2332  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2333  dir[idx] = -neuron->GetDEDw() + beta * dir[idx];
2334  idx++;
2335  }
2336  nentries = fSynapses.GetEntriesFast();
2337  for (j=0;j<nentries;j++) {
2338  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2339  dir[idx] = -synapse->GetDEDw() + beta * dir[idx];
2340  idx++;
2341  }
2342 }
2343 
2344 ////////////////////////////////////////////////////////////////////////////////
2345 /// Computes the hessian matrix using the BFGS update algorithm.
2346 /// from gamma (g_{(t+1)}-g_{(t)}) and delta (w_{(t+1)}-w_{(t)}).
2347 /// It returns true if such a direction could not be found
2348 /// (if gamma and delta are orthogonal).
2349 
2351 {
2352  TMatrixD gd(gamma, TMatrixD::kTransposeMult, delta);
2353  if ((Double_t) gd[0][0] == 0.)
2354  return true;
2355  TMatrixD aHg(bfgsh, TMatrixD::kMult, gamma);
2356  TMatrixD tmp(gamma, TMatrixD::kTransposeMult, bfgsh);
2357  TMatrixD gHg(gamma, TMatrixD::kTransposeMult, aHg);
2358  Double_t a = 1 / (Double_t) gd[0][0];
2359  Double_t f = 1 + ((Double_t) gHg[0][0] * a);
2360  TMatrixD res( TMatrixD(delta, TMatrixD::kMult,
2361  TMatrixD(TMatrixD::kTransposed, delta)));
2362  res *= f;
2363  res -= (TMatrixD(delta, TMatrixD::kMult, tmp) +
2365  TMatrixD(TMatrixD::kTransposed, delta)));
2366  res *= a;
2367  bfgsh += res;
2368  return false;
2369 }
2370 
2371 ////////////////////////////////////////////////////////////////////////////////
2372 /// Sets the gamma (g_{(t+1)}-g_{(t)}) and delta (w_{(t+1)}-w_{(t)}) vectors
2373 /// Gamma is computed here, so ComputeDEDw cannot have been called before,
2374 /// and delta is a direct translation of buffer into a TMatrixD.
2375 
2377  Double_t * buffer)
2378 {
2379  Int_t els = fNetwork.GetEntriesFast() + fSynapses.GetEntriesFast();
2380  Int_t idx = 0;
2381  Int_t j,nentries;
2382  TNeuron *neuron = 0;
2383  TSynapse *synapse = 0;
2384  nentries = fNetwork.GetEntriesFast();
2385  for (j=0;j<nentries;j++) {
2386  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2387  gamma[idx++][0] = -neuron->GetDEDw();
2388  }
2389  nentries = fSynapses.GetEntriesFast();
2390  for (j=0;j<nentries;j++) {
2391  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2392  gamma[idx++][0] = -synapse->GetDEDw();
2393  }
2394  for (Int_t i = 0; i < els; i++)
2395  delta[i].Assign(buffer[i]);
2396  //delta.SetElements(buffer,"F");
2397  ComputeDEDw();
2398  idx = 0;
2399  nentries = fNetwork.GetEntriesFast();
2400  for (j=0;j<nentries;j++) {
2401  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2402  gamma[idx++][0] += neuron->GetDEDw();
2403  }
2404  nentries = fSynapses.GetEntriesFast();
2405  for (j=0;j<nentries;j++) {
2406  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2407  gamma[idx++][0] += synapse->GetDEDw();
2408  }
2409 }
2410 
2411 ////////////////////////////////////////////////////////////////////////////////
2412 /// scalar product between gradient and direction
2413 /// = derivative along direction
2414 
2415 Double_t TMultiLayerPerceptron::DerivDir(Double_t * dir)
2416 {
2417  Int_t idx = 0;
2418  Int_t j,nentries;
2419  Double_t output = 0;
2420  TNeuron *neuron = 0;
2421  TSynapse *synapse = 0;
2422  nentries = fNetwork.GetEntriesFast();
2423  for (j=0;j<nentries;j++) {
2424  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2425  output += neuron->GetDEDw() * dir[idx++];
2426  }
2427  nentries = fSynapses.GetEntriesFast();
2428  for (j=0;j<nentries;j++) {
2429  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2430  output += synapse->GetDEDw() * dir[idx++];
2431  }
2432  return output;
2433 }
2434 
2435 ////////////////////////////////////////////////////////////////////////////////
2436 /// Computes the direction for the BFGS algorithm as the product
2437 /// between the Hessian estimate (bfgsh) and the dir.
2438 
2439 void TMultiLayerPerceptron::BFGSDir(TMatrixD & bfgsh, Double_t * dir)
2440 {
2441  Int_t els = fNetwork.GetEntriesFast() + fSynapses.GetEntriesFast();
2442  TMatrixD dedw(els, 1);
2443  Int_t idx = 0;
2444  Int_t j,nentries;
2445  TNeuron *neuron = 0;
2446  TSynapse *synapse = 0;
2447  nentries = fNetwork.GetEntriesFast();
2448  for (j=0;j<nentries;j++) {
2449  neuron = (TNeuron *) fNetwork.UncheckedAt(j);
2450  dedw[idx++][0] = neuron->GetDEDw();
2451  }
2452  nentries = fSynapses.GetEntriesFast();
2453  for (j=0;j<nentries;j++) {
2454  synapse = (TSynapse *) fSynapses.UncheckedAt(j);
2455  dedw[idx++][0] = synapse->GetDEDw();
2456  }
2457  TMatrixD direction(bfgsh, TMatrixD::kMult, dedw);
2458  for (Int_t i = 0; i < els; i++)
2459  dir[i] = -direction[i][0];
2460  //direction.GetElements(dir,"F");
2461 }
2462 
2463 ////////////////////////////////////////////////////////////////////////////////
2464 /// Draws the network structure.
2465 /// Neurons are depicted by a blue disk, and synapses by
2466 /// lines connecting neurons.
2467 /// The line width is proportionnal to the weight.
2468 
2469 void TMultiLayerPerceptron::Draw(Option_t * /*option*/)
2470 {
2471 #define NeuronSize 2.5
2472 
2473  Int_t nLayers = fStructure.CountChar(':')+1;
2474  Float_t xStep = 1./(nLayers+1.);
2475  Int_t layer;
2476  for(layer=0; layer< nLayers-1; layer++) {
2477  Float_t nNeurons_this = 0;
2478  if(layer==0) {
2479  TString input = TString(fStructure(0, fStructure.First(':')));
2480  nNeurons_this = input.CountChar(',')+1;
2481  }
2482  else {
2483  Int_t cnt=0;
2484  TString hidden = TString(fStructure(fStructure.First(':') + 1,fStructure.Last(':') - fStructure.First(':') - 1));
2485  Int_t beg = 0;
2486  Int_t end = hidden.Index(":", beg + 1);
2487  while (end != -1) {
2488  Int_t num = atoi(TString(hidden(beg, end - beg)).Data());
2489  cnt++;
2490  beg = end + 1;
2491  end = hidden.Index(":", beg + 1);
2492  if(layer==cnt) nNeurons_this = num;
2493  }
2494  Int_t num = atoi(TString(hidden(beg, hidden.Length() - beg)).Data());
2495  cnt++;
2496  if(layer==cnt) nNeurons_this = num;
2497  }
2498  Float_t nNeurons_next = 0;
2499  if(layer==nLayers-2) {
2500  TString output = TString(fStructure(fStructure.Last(':') + 1,fStructure.Length() - fStructure.Last(':')));
2501  nNeurons_next = output.CountChar(',')+1;
2502  }
2503  else {
2504  Int_t cnt=0;
2505  TString hidden = TString(fStructure(fStructure.First(':') + 1,fStructure.Last(':') - fStructure.First(':') - 1));
2506  Int_t beg = 0;
2507  Int_t end = hidden.Index(":", beg + 1);
2508  while (end != -1) {
2509  Int_t num = atoi(TString(hidden(beg, end - beg)).Data());
2510  cnt++;
2511  beg = end + 1;
2512  end = hidden.Index(":", beg + 1);
2513  if(layer+1==cnt) nNeurons_next = num;
2514  }
2515  Int_t num = atoi(TString(hidden(beg, hidden.Length() - beg)).Data());
2516  cnt++;
2517  if(layer+1==cnt) nNeurons_next = num;
2518  }
2519  Float_t yStep_this = 1./(nNeurons_this+1.);
2520  Float_t yStep_next = 1./(nNeurons_next+1.);
2521  TObjArrayIter* it = (TObjArrayIter *) fSynapses.MakeIterator();
2522  TSynapse *theSynapse = 0;
2523  Float_t maxWeight = 0;
2524  while ((theSynapse = (TSynapse *) it->Next()))
2525  maxWeight = maxWeight < theSynapse->GetWeight() ? theSynapse->GetWeight() : maxWeight;
2526  delete it;
2527  it = (TObjArrayIter *) fSynapses.MakeIterator();
2528  for(Int_t neuron1=0; neuron1<nNeurons_this; neuron1++) {
2529  for(Int_t neuron2=0; neuron2<nNeurons_next; neuron2++) {
2530  TLine* synapse = new TLine(xStep*(layer+1),yStep_this*(neuron1+1),xStep*(layer+2),yStep_next*(neuron2+1));
2531  synapse->Draw();
2532  theSynapse = (TSynapse *) it->Next();
2533  if (!theSynapse) continue;
2534  synapse->SetLineWidth(Int_t((theSynapse->GetWeight()/maxWeight)*10.));
2535  synapse->SetLineStyle(1);
2536  if(((TMath::Abs(theSynapse->GetWeight())/maxWeight)*10.)<0.5) synapse->SetLineStyle(2);
2537  if(((TMath::Abs(theSynapse->GetWeight())/maxWeight)*10.)<0.25) synapse->SetLineStyle(3);
2538  }
2539  }
2540  delete it;
2541  }
2542  for(layer=0; layer< nLayers; layer++) {
2543  Float_t nNeurons = 0;
2544  if(layer==0) {
2545  TString input = TString(fStructure(0, fStructure.First(':')));
2546  nNeurons = input.CountChar(',')+1;
2547  }
2548  else if(layer==nLayers-1) {
2549  TString output = TString(fStructure(fStructure.Last(':') + 1,fStructure.Length() - fStructure.Last(':')));
2550  nNeurons = output.CountChar(',')+1;
2551  }
2552  else {
2553  Int_t cnt=0;
2554  TString hidden = TString(fStructure(fStructure.First(':') + 1,fStructure.Last(':') - fStructure.First(':') - 1));
2555  Int_t beg = 0;
2556  Int_t end = hidden.Index(":", beg + 1);
2557  while (end != -1) {
2558  Int_t num = atoi(TString(hidden(beg, end - beg)).Data());
2559  cnt++;
2560  beg = end + 1;
2561  end = hidden.Index(":", beg + 1);
2562  if(layer==cnt) nNeurons = num;
2563  }
2564  Int_t num = atoi(TString(hidden(beg, hidden.Length() - beg)).Data());
2565  cnt++;
2566  if(layer==cnt) nNeurons = num;
2567  }
2568  Float_t yStep = 1./(nNeurons+1.);
2569  for(Int_t neuron=0; neuron<nNeurons; neuron++) {
2570  TMarker* m = new TMarker(xStep*(layer+1),yStep*(neuron+1),20);
2571  m->SetMarkerColor(4);
2572  m->SetMarkerSize(NeuronSize);
2573  m->Draw();
2574  }
2575  }
2576  const TString input = TString(fStructure(0, fStructure.First(':')));
2577  const TObjArray *inpL = input.Tokenize(" ,");
2578  const Int_t nrItems = inpL->GetLast()+1;
2579  Float_t yStep = 1./(nrItems+1);
2580  for (Int_t item = 0; item < nrItems; item++) {
2581  const TString brName = ((TObjString *)inpL->At(item))->GetString();
2582  TText* label = new TText(0.5*xStep,yStep*(item+1),brName.Data());
2583  label->Draw();
2584  }
2585  delete inpL;
2586 
2587  Int_t numOutNodes=fLastLayer.GetEntriesFast();
2588  yStep=1./(numOutNodes+1);
2589  for (Int_t outnode=0; outnode<numOutNodes; outnode++) {
2590  TNeuron* neuron=(TNeuron*)fLastLayer[outnode];
2591  if (neuron && neuron->GetName()) {
2592  TText* label = new TText(xStep*nLayers,
2593  yStep*(outnode+1),
2594  neuron->GetName());
2595  label->Draw();
2596  }
2597  }
2598 }
Double_t DerivDir(Double_t *)
scalar product between gradient and direction = derivative along direction
virtual TMatrixTBase< Element > & UnitMatrix()
Make a unit matrix (matrix need not be a square one).
ENeuronType GetType() const
Returns the neuron type.
Definition: TNeuron.cxx:867
Bool_t LineSearch(Double_t *, Double_t *)
Search along the line defined by direction.
void SteepestDir(Double_t *)
Sets the search direction to steepest descent.
double dist(Rotation3D const &r1, Rotation3D const &r2)
Definition: 3DDistances.cxx:48
TSynapse * GetPre(Int_t n) const
Definition: TNeuron.h:54
void Randomize() const
Randomize the weights.
void BuildHiddenLayers(TString &)
Builds hidden layers.
void BFGSDir(TMatrixD &, Double_t *)
Computes the direction for the BFGS algorithm as the product between the Hessian estimate (bfgsh) and...
Random number generator class based on M.
Definition: TRandom3.h:27
void ForceExternalValue(Double_t value)
Uses the branch type to force an external value.
Definition: TNeuron.cxx:1125
Double_t fEpsilon
Eta - used in stochastic minimisation - Default=0.1.
TTreeFormulaManager * fManager
formula representing the event weight
Double_t Log(Double_t x)
Definition: TMath.h:759
void SetTestDataSet(TEventList *test)
Sets the Test dataset.
void SetNewEvent() const
Inform the neuron that inputs of the network have changed, so that the buffered values have to be rec...
Definition: TNeuron.cxx:1157
Double_t fCurrentTreeWeight
index of the current tree in a chain
void SetEpsilon(Double_t eps)
Sets Epsilon - used in stochastic minimisation (look at the constructor for the complete description ...
void Export(Option_t *filename="NNfunction", Option_t *language="C++") const
Exports the NN as a function for any non-ROOT-dependant code Supported languages are: only C++ ...
Double_t Evaluate(Int_t index, Double_t *params) const
Returns the Neural Net for a given set of input parameters #parameters must equal #input neurons...
ENeuronType
Definition: TNeuron.h:48
void AttachData()
Connects the TTree to Neurons in input and output layers.
void SetEtaDecay(Double_t ed)
Sets EtaDecay - Eta *= EtaDecay at each epoch (look at the constructor for the complete description o...
void SetEta(Double_t eta)
Sets Eta - used in stochastic minimisation (look at the constructor for the complete description of l...
Bool_t IsNaN(Double_t x)
Definition: TMath.h:891
void SetData(TTree *)
Set the data source.
Double_t GetDeDw() const
Computes the derivative of the error wrt the synapse weight.
Definition: TSynapse.cxx:92
void Shuffle(Int_t *, Int_t) const
Shuffle the Int_t index[n] in input.
Double_t GetSumSquareError() const
Error on the output for a given event.
TObjArray fNetwork
weight of the current tree in a chain
double beta(double x, double y)
Calculates the beta function.
void ConjugateGradientsDir(Double_t *, Double_t)
Sets the search direction to conjugate gradient direction beta should be: ||g_{(t+1)}||^2 / ||g_{(t)}...
Double_t GetBranch() const
Returns the formula value.
Definition: TNeuron.cxx:914
void DrawResult(Int_t index=0, Option_t *option="test") const
Draws the neural net output It produces an histogram with the output for the two datasets.
bool GetBFGSH(TMatrixD &, TMatrixD &, TMatrixD &)
Computes the hessian matrix using the BFGS update algorithm.
Double_t fEta
TTreeFormulaManager for the weight and neurons.
you should not use this method at all Int_t Int_t Double_t Double_t Double_t Int_t Double_t Double_t Double_t tau
Definition: TRolke.cxx:630
Double_t GetValue() const
Computes the output using the appropriate function and all the weighted inputs, or uses the branch as...
Definition: TNeuron.cxx:948
TTreeFormula * UseBranch(TTree *, const char *)
Sets a formula that can be used to make the neuron an input.
Definition: TNeuron.cxx:878
virtual ~TMultiLayerPerceptron()
Destructor.
void SetReset(Int_t reset)
Sets number of epochs between two resets of the search direction to the steepest descent.
TNeuron * GetInLayer(Int_t n) const
Definition: TNeuron.h:56
Bool_t DumpWeights(Option_t *filename="-") const
Dumps the weights to a text file.
Int_t fCurrentTree
pointer to the tree used as datasource
TNeuron::ENeuronType fType
void SetWeight(Double_t w)
Sets the neuron weight to w.
Definition: TNeuron.cxx:1148
you should not use this method at all Int_t Int_t Double_t Double_t Double_t Int_t Double_t Double_t Double_t Double_t Int_t m
Definition: TRolke.cxx:637
TMatrixT< Double_t > TMatrixD
Definition: TMatrixDfwd.h:22
const Double_t * GetNormalisation() const
Definition: TNeuron.h:69
#define NeuronSize
double gamma(double x)
void BuildOneHiddenLayer(const TString &sNumNodes, Int_t &layer, Int_t &prevStart, Int_t &prevStop, Bool_t lastLayer)
Builds a hidden layer, updates the number of layers.
void AddInLayer(TNeuron *)
Tells a neuron which neurons form its layer (including itself).
Definition: TNeuron.cxx:857
void SetNormalisation(Double_t mean, Double_t RMS)
Sets the normalization variables.
Definition: TNeuron.cxx:1137
void Train(Int_t nEpoch, Option_t *option="text", Double_t minE=0)
Train the network.
TNeuron::ENeuronType fOutType
virtual void Draw(Option_t *option="")
Draws the network structure.
Double_t GetError() const
Computes the error for output neurons.
Definition: TNeuron.cxx:1063
void SetDEDw(Double_t in)
Sets the derivative of the total error wrt the neuron weight.
Definition: TNeuron.cxx:1168
Bool_t fTestOwner
internal flag whether one has to delete fTraining or not
Double_t GetError(Int_t event) const
Error on the output for a given event.
Bool_t LoadWeights(Option_t *filename="")
Loads the weights from a text file conforming to the format defined by DumpWeights.
Int_t fReset
internal parameter used in line search
void SetTrainingDataSet(TEventList *train)
Sets the Training dataset.
TNeuron * GetPre() const
Definition: TSynapse.h:37
void SetWeight(Double_t w)
Sets the weight of the synapse.
Definition: TSynapse.cxx:105
Double_t GetWeight() const
Definition: TSynapse.h:40
void SetGammaDelta(TMatrixD &, TMatrixD &, Double_t *)
Sets the gamma (g_{(t+1)}-g_{(t)}) and delta (w_{(t+1)}-w_{(t)}) vectors Gamma is computed here...
void SetDEDw(Double_t in)
Sets the derivative of the total error wrt the synapse weight.
Definition: TSynapse.cxx:113
Double_t GetDeDw() const
Computes the derivative of the error wrt the neuron weight.
Definition: TNeuron.cxx:1084
void ComputeDEDw() const
Compute the DEDw = sum on all training events of dedw for each weight normalized by the number of eve...
Double_t fEtaDecay
Delta - used in stochastic minimisation - Default=0.
Double_t fDelta
Epsilon - used in stochastic minimisation - Default=0.
void SetDelta(Double_t delta)
Sets Delta - used in stochastic minimisation (look at the constructor for the complete description of...
Double_t fTau
EtaDecay - Eta *= EtaDecay at each epoch - Default=1.
Double_t GetTarget() const
Computes the normalized target pattern for output neurons.
Definition: TNeuron.cxx:1074
Double_t GetDEDw() const
Definition: TSynapse.h:44
void MLP_Line(Double_t *, Double_t *, Double_t)
Sets the weights to a point along a line Weights are set to [origin + (dist * dir)].
TEventList * fTest
EventList defining the events in the training dataset.
void MLP_Batch(Double_t *)
One step for the batch (stochastic) method.
Double_t GetWeight() const
Definition: TNeuron.h:67
ELearningMethod fLearningMethod
EventList defining the events in the test dataset.
Double_t GetDEDw() const
Definition: TNeuron.h:72
void MLP_Stochastic(Double_t *)
One step for the stochastic method buffer should contain the previous dw vector and will be updated...
Bool_t fTrainingOwner
number of epochs between two resets of the search direction to the steepest descent - Default=50 ...
Double_t GetCrossEntropyBinary() const
Cross entropy error for sigmoid output neurons, for a given event.
void ExpandStructure()
Expand the structure of the first layer.
TMultiLayerPerceptron()
Default constructor.
void BuildFirstLayer(TString &)
Instanciates the neurons in input Inputs are normalised and the type is set to kOff (simple forward o...
void SetEventWeight(const char *)
Set the event weight.
void GetEntry(Int_t) const
Load an entry into the network.
Double_t Sqrt(Double_t x)
Definition: TMath.h:690
Double_t fLastAlpha
Tau - used in line search - Default=3.
Double_t Result(Int_t event, Int_t index=0) const
Computes the output for a given event.
Double_t GetCrossEntropy() const
Cross entropy error for a softmax output neuron, for a given event.
void BuildLastLayer(TString &, Int_t)
Builds the output layer Neurons are linear combinations of input, by defaul.
void SetLearningMethod(TMultiLayerPerceptron::ELearningMethod method)
Sets the learning method.
void SetTau(Double_t tau)
Sets Tau - used in line search (look at the constructor for the complete description of learning meth...
void BuildNetwork()
Instanciates the network from the description.
TNeuron::ENeuronType GetType() const
TTreeFormula * fEventWeight
The Learning Method.