Application of a Neural Network and a Genetic Algorithm in the Analysis of Multi Particle Final States

Thomas F. Degener, Marcel Kunze
Fakultät für Physik und Astronomie, Lehrstuhl I für Experimentalphysik, Ruhr-Universität Bochum, Universitätsstr. 150, 44801 Bochum, Germany
e-mail: [email protected]

ABSTRACT
The analysis of a particle physics experiment requires the correct selection of the desired multiplicity of charged (pions or kaons) and neutral (photons) particles. Due to fluctuations in hadronic and electromagnetic showers, often too many photons are reconstructed by the analysis software. The combination of a feed forward neural network and a genetic algorithm is used to separate photons and fluctuations.

1. Introduction

The number of charged particles arising from p̄p-annihilation measured by the Crystal Barrel detector [1] at LEAR (CERN) is determined by counting the number of tracks in the wire chambers of the experiment. The curvature of the tracks in a magnetic field of 1.5 T and the dE/dx information provide the corresponding fourvectors. Photons are measured with a segmented CsI(Tl)-calorimeter. To determine the number of photons and their fourvectors, neighbouring crystals with an energy deposit are grouped to form clusters. These clusters are searched for local maxima in order to resolve photons whose electromagnetic showers overlap. Every local maximum is assumed to correspond to a photon if the energy deposit in the whole cluster is above a threshold. Monte Carlo studies show that about 5% of all local maxima are just due to fluctuations in the electromagnetic cascade. A neural network has been developed that reduces these electromagnetic fluctuations to less than 0.1%. A full description of the hierarchic network architecture is given in the proceedings of the III. workshop [2].

The simultaneous measurement of charged and neutral particles makes the analysis more complicated, because charged particles deposit energy in the calorimeter too. To prevent the misinterpretation of the surplus energy (minimum ionizing peak of 180 MeV) in the calorimeter as photons, all deposits in a cone with an opening angle of 5˚ around the track (see Fig. 1) are matched to the charged track and rejected from further analysis. Unfortunately hadronic showers fluctuate much more strongly than electromagnetic showers. In contrast to electromagnetic fluctuations, the hadronic fluctuations give rise to energy deposits far away from the primary deposits. About 40% of all charged tracks produce a hadronic fluctuation.

Fig. 1: Matching of charged tracks to energy deposits in the calorimeter. (Labels in the figure: vertex, track points, calorimeter surface, energy deposit, 5˚ cone.)

2. Statement of the problem

The neural network has an efficiency of 20% to identify hadronic fluctuations, although it was trained with electromagnetic showers only. A training including Monte Carlo simulation of hadronic showers was not investigated, because the input parameters of the network consist of a local 5x5 crystal matrix around a local maximum, which naturally cannot cope with a long range effect (footnote a). Furthermore it is an advantage to distinguish between the two types of fluctuations, because in the electromagnetic shower the energy of a fluctuation, not more than 50 MeV, is part of the original photon, whereas hadronic fluctuations produce additional energy. The measurement of an event can thus be improved twofold:

• Identification of electromagnetic fluctuations, assignment to the primary energy deposit of a photon and recovery of the energy.
• Identification of hadronic fluctuations and dropping their energy.

The first step after the standard reconstruction of an event is to use the neural network in order to identify fluctuations. An assumption has to be made as to which fluctuations are “pure” electromagnetic. If the fluctuation is in a cluster matched to a charged track, or the closest local maximum is matched to a charged track, then the fluctuation is considered hadronic and rejected from further analysis. Otherwise its energy is added to the closest photon, and the fourvector of the photon and its errors are recalculated. Without hadronic fluctuations the event should now fulfil energy and momentum conservation, tested by means of a 4C-kinematic fit.
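The decision rule for a flagged fluctuation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Maximum` record, its field names and the function `assign_fluctuations` are hypothetical, "closest" is taken to mean the smallest opening angle between two local maxima, a fluctuation without any neighbour is dropped, and the recalculation of the photon fourvector and its errors is reduced to adding the energy.

```python
from dataclasses import dataclass
import math


@dataclass
class Maximum:
    """One local maximum in the calorimeter (hypothetical record)."""
    energy: float                   # MeV
    theta: float                    # polar angle, rad
    phi: float                      # azimuthal angle, rad
    cluster_id: int
    is_fluctuation: bool = False    # flag assumed to come from the neural network
    matched_to_track: bool = False  # lies in the cone around a charged track


def opening_angle(a, b):
    """Angle between the directions of two local maxima."""
    cos_alpha = (math.sin(a.theta) * math.sin(b.theta) * math.cos(a.phi - b.phi)
                 + math.cos(a.theta) * math.cos(b.theta))
    return math.acos(max(-1.0, min(1.0, cos_alpha)))


def assign_fluctuations(maxima):
    """Return the photon list after treating all flagged fluctuations."""
    matched_clusters = {m.cluster_id for m in maxima if m.matched_to_track}
    others = [m for m in maxima if not m.is_fluctuation]
    photons = [m for m in others if not m.matched_to_track]
    for f in (m for m in maxima if m.is_fluctuation):
        closest = min(others, key=lambda m: opening_angle(f, m), default=None)
        hadronic = (f.cluster_id in matched_clusters
                    or closest is None          # assumption: no neighbour, drop it
                    or closest.matched_to_track)
        if not hadronic:
            # Electromagnetic fluctuation: give the energy back to the photon.
            closest.energy += f.energy
        # Hadronic fluctuations are dropped without further action.
    return photons
```

In this sketch a fluctuation next to a charged-track deposit never reaches the photon list, while an electromagnetic one only changes the energy of its neighbouring photon.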
On the same basis a method for the identification of hadronic fluctuations was developed for proton-antiproton annihilation at rest, assuming that all fluctuations are hadronic. The method successively drops 0, 1, 2, ... photons of an event and applies a phasespace fit. The fit with the highest confidence level decides on the correct photon multiplicity. For example, consider the analysis of a p̄p-annihilation into the final state π+π-π0ω → π+π-5γ. If there are 2 fluctuations (the standard reconstruction has found 7 photons), at least 29 (footnote b) phasespace fits have to be tried to have a chance to select the correct multiplicity. This is feasible for p̄p-annihilation at rest, because the charged tracks are long and well measured, such that for most events only the correct combination has a fit with a good confidence level.

The situation changes dramatically for the analysis of p̄p-annihilation in flight. At higher beam momenta the particles are boosted in the forward direction. The measurement of the charged particles becomes much worse due to shorter tracks in the wire chambers. This is illustrated in Fig. 2 (pie chart for the standard selection): not even half of the events with a confidence level above 10% for a 4C-fit have the correct photon multiplicity.

The direct decay of the p̄p-system into charged pions and photons only is a very rare process. Most of the photons stem from intermediate mesons, such as π0→2γ, η→2γ and ω→π0γ→3γ. It is the mesonic final state that is of major interest. A higher constraint mass fit is used to reconstruct the mesonic final state, with one constraint for each intermediate meson. A mass fit is very sensitive to surplus photons due to hadronic fluctuations, because the photon energy is very well measured by the CsI(Tl)-calorimeter. A mass fit reduces the possibility of dropping “real” photons, but is very CPU time consuming. Testing all combinations of photons to drop and all combinations of invariant two- and three-photon masses for the remaining photons would take too long for millions of events. The fit is therefore replaced by an evaluation function, which uses the constraints and the errors on the measured quantities to find the correct photon multiplicity. A mass fit converges better the closer the measured quantities are to the constraints within the errors. Finding the correct multiplicity then corresponds to finding the maximum (minimum) of the evaluation function, which is a discrete, non-differentiable function of the photon combinations. A genetic algorithm (GA) needs no derivatives of the function it optimizes and can solve this combinatorial problem much faster than a series of kinematic fits.

a. A network that is able to identify hadronic fluctuations on the level of pattern recognition was presented at this workshop [3].
b. 1 for 7γ, 7 possibilities for 6γ and 21 for 5γ.

Fig. 2

p̄p → π+π-π0ω → π+π-5γ Monte Carlo events with a confidence level above 10% for a 4C-fit. The size of the pie charts is proportional to the number of events (of 20,000 events, 10934 were preselected before the fit). Pie charts: standard reconstruction (2703 events), without repaired photons (1313 events), with repaired photons (4372 events). Pie segments: 8γ, 7γ, 6γ, correct 5γ.
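The count of 29 phasespace fits quoted in the example of Section 2 follows from simple combinatorics: with 7 photon candidates and 5 true photons, one has to try every way of dropping 0, 1 or 2 candidates. A minimal Python sketch (the function names are hypothetical and only serve this illustration):

```python
from itertools import combinations
from math import comb


def n_phasespace_fits(n_candidates, n_true):
    """Number of photon subsets obtained by dropping 0 .. (n_candidates - n_true) candidates."""
    max_drop = n_candidates - n_true
    return sum(comb(n_candidates, k) for k in range(max_drop + 1))


def photon_subsets(n_candidates, n_true):
    """Yield the index tuples of the photons kept in each fit, largest subsets first."""
    for n_keep in range(n_candidates, n_true - 1, -1):
        yield from combinations(range(n_candidates), n_keep)


# 7 candidates, 5 true photons: 1 (7γ) + 7 (6γ) + 21 (5γ) fits.
print(n_phasespace_fits(7, 5))
```

The same enumeration grows quickly with the number of candidates, which is the reason a full series of fits becomes impractical for annihilation in flight.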

3. The genetic algorithm

3.1. The objective function

Measured are the fourvectors $P_i = (E_i, \vec{p}_i)$. The errors on the components of $P_i$ are calculated from the Gaussian distributed errors in $E$, $\Theta$, $\phi$ for photons and $\psi$, $1/p_T$, $\tan\lambda$ for charged pions (kaons). The energy and momentum constraints are:

• The total energy
$$E_{tot} = \sum_i E_i = m_p + \sqrt{m_p^2 + p_{beam}^2} = 3093.3\ \mathrm{MeV}$$

• The total momentum
$$\vec{p}_{tot} = \sum_i \vec{p}_i = (0, 0, p_{beam}) = (0, 0, 1940\ \mathrm{MeV}/c)$$

The mass constraints are:

• The invariant mass of two photons, $m_{\gamma\gamma} = m_{\pi^0} = 134.974\ \mathrm{MeV}/c^2$ or $m_{\gamma\gamma} = m_{\eta} = 547.45\ \mathrm{MeV}/c^2$.
• The invariant mass of three photons, $m_{\gamma\gamma\gamma} = m_{\omega} = 781.95\ \mathrm{MeV}/c^2$ (where one pair of the three is a pion).

The invariant mass squared of two photons in terms of $E$, $\Theta$, $\phi$ is for example

$$m_{\gamma\gamma}^2 = 2 E_1 E_2 \left( 1 - \sin\Theta_1 \sin\Theta_2 \cos(\phi_1 - \phi_2) - \cos\Theta_1 \cos\Theta_2 \right). \qquad (1)$$
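The two kinematic ingredients of this section can be checked numerically. The following Python sketch is an illustration, not the authors' code: the function names are hypothetical, the proton mass is the standard value (it is not quoted in the text), and the two-photon mass is written with the opening angle α between the photon directions, $m_{\gamma\gamma}^2 = 2E_1E_2(1 - \cos\alpha)$.

```python
import math

M_PROTON = 938.272  # MeV/c^2, standard proton mass (assumption: not given in the text)


def total_energy(p_beam):
    """Total energy (MeV) for an antiproton of momentum p_beam (MeV/c) hitting a proton at rest."""
    return M_PROTON + math.sqrt(M_PROTON ** 2 + p_beam ** 2)


def mass_two_photons(e1, theta1, phi1, e2, theta2, phi2):
    """Invariant mass (MeV/c^2) of two photons from energies (MeV) and angles (rad)."""
    cos_alpha = (math.sin(theta1) * math.sin(theta2) * math.cos(phi1 - phi2)
                 + math.cos(theta1) * math.cos(theta2))
    # Guard against tiny negative values caused by rounding for collinear photons.
    return math.sqrt(max(0.0, 2.0 * e1 * e2 * (1.0 - cos_alpha)))


# Energy constraint for the beam momentum quoted in the constraint above.
print(round(total_energy(1940.0), 1))

# Two back-to-back photons of 67.487 MeV each combine to the pi0 mass.
print(mass_two_photons(67.487, math.pi / 2, 0.0, 67.487, math.pi / 2, math.pi))
```

Two photons emitted back to back with half the π0 mass each reproduce the π0 constraint, while two collinear photons give a vanishing invariant mass.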

For every constraint $x_c$ on a measured quantity $x$ with an error $\Delta x$, a Gaussian function of width $\Delta x$ is used:

$$G(x, \Delta x) = \frac{1}{\sqrt{2\pi}\,\Delta x} \exp\left( -\frac{(x - x_c)^2}{2 (\Delta x)^2} \right). \qquad (2)$$

The objective function $F$ is a sum of weighted Gaussians:

$$F = w_E\, G(E, \Delta E) + \sum_{i=1}^{3} w_{p_i}\, G(p_i, \Delta p_i) + \sum_i w_{m_{\gamma\gamma}}\, G(m_{\gamma\gamma,i}, \Delta m_{\gamma\gamma,i}) + \sum_j \left[ w_{m_{\gamma\gamma\gamma}}\, G(m_{\gamma\gamma\gamma,j}, \Delta m_{\gamma\gamma\gamma,j}) + \max_{k \in \gamma\gamma\gamma_j} \left( w'_{m_{\gamma\gamma\gamma}}\, G(m_{\gamma\gamma,k}, \Delta m_{\gamma\gamma,k}, m_{\pi^0}) \right) \right] \qquad (3)$$

$F$ is a function of the number of photons taken into account and of the 2 and 3 photon combinations. The weights had to be introduced to make sure that complete events give the highest value of $F$ and not just the best combination of two or three photons. These constant weights are listed in Tab. 1.

Weight for           Symbol                                                        Value
energy               $w_E$                                                         4.0
z-momentum           $w_{p_z}$                                                     2.0
x-, y-momentum       $w_{p_x}$, $w_{p_y}$                                          1.0
two photon mass      $w_{m_{\gamma\gamma}}$                                        2.0
three photon mass    $w_{m_{\gamma\gamma\gamma}}$, $w'_{m_{\gamma\gamma\gamma}}$   8.0, 2.0

Tab. 1: The constant weights for the evaluation function. They were found by trying all combinatorial cases and depend strongly on the errors of the measured values.
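The weighted sum of Gaussians can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and argument names are hypothetical, the Gaussian is taken to be normalized with $\sqrt{2\pi}\,\Delta x$, the caller decides which constraint (π0 or η) applies to each two-photon combination, and every three-photon combination is tested against the ω mass with its best π0 pair. The weights are those of Tab. 1 and the masses those of the mass constraints.

```python
import math

# Weights from Tab. 1 (dictionary keys are hypothetical labels).
WEIGHTS = {"E": 4.0, "pz": 2.0, "px": 1.0, "py": 1.0,
           "m2": 2.0, "m3": 8.0, "m3_pi0": 2.0}
M_PI0, M_ETA, M_OMEGA = 134.974, 547.45, 781.95  # MeV/c^2, values from the text


def gauss(x, dx, xc):
    """Normalized Gaussian of width dx centred on the constraint xc."""
    return math.exp(-(x - xc) ** 2 / (2.0 * dx ** 2)) / (math.sqrt(2.0 * math.pi) * dx)


def objective(energy, momentum, pairs, triplets, e_tot=3093.3, p_beam=1940.0):
    """Evaluate F for one photon combination.

    energy   : (E, dE) of the whole event
    momentum : {"px": (p, dp), "py": (p, dp), "pz": (p, dp)}
    pairs    : list of (m, dm, m_constraint) for two-photon combinations
    triplets : list of (m, dm, sub_pairs), sub_pairs = [(m_gg, dm_gg), ...]
    """
    target_p = {"px": 0.0, "py": 0.0, "pz": p_beam}
    f = WEIGHTS["E"] * gauss(energy[0], energy[1], e_tot)
    for axis, (p, dp) in momentum.items():
        f += WEIGHTS[axis] * gauss(p, dp, target_p[axis])
    for m, dm, m_c in pairs:
        f += WEIGHTS["m2"] * gauss(m, dm, m_c)
    for m, dm, sub_pairs in triplets:
        f += WEIGHTS["m3"] * gauss(m, dm, M_OMEGA)
        # Only the photon pair closest to the pi0 mass contributes.
        f += max(WEIGHTS["m3_pi0"] * gauss(mk, dmk, M_PI0) for mk, dmk in sub_pairs)
    return f
```

Because each Gaussian carries the factor $1/\Delta x$, a well measured quantity that meets its constraint contributes more than a poorly measured one, which is one way to see why the weights depend on the errors of the measured values.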

3.2. Genotype of an event

How can an event be encoded in a bitstring for a genetic algorithm? The number of charged particles is fixed, but of course the corresponding fourvectors are used to calculate the total energy and momentum for F. If an event consists of a π+, a π- and n photon candidates (local maxima), the genome consists of n + 1 genes of n bits each, as shown in Fig. 3. The first n bits encode the phasespace, where an unset bit means that the corresponding local maximum is a hadronic fluctuation to be excluded from further analysis. The next n genes of n bits each encode the invariant mass combinations. Only if two or three bits are set in one gene is the invariant mass of the corresponding photons calculated and added to the evaluation function. If fewer than two or more than three bits are set in one gene, the corresponding photons are not used for any invariant mass.

Before an individual is added to the population, it is tested for “photon number conservation”: a bit that is not set in the phasespace gene cannot appear in any invariant mass gene, and every bit set in the phasespace gene must appear in one and only one invariant mass gene. If all bits set in the phasespace gene are used in an invariant mass gene, then the rest of the genome is ignored. If a genome violates “photon number conservation”, the relevant bits are set appropriately before the genome is included in the population.

Genome layout: one phasespace gene (n bits) followed by n invariant mass genes (n bits each):
1 1 0 1 1 .. 1 0 0 0 0 .. 0 0 0 1 1 .. 0 1 0 0 0 .. 0 0 0 0 0 ....

Example: π+π- η γ2f ω (where γ2f is a fluctuation), with η → γ1γ3 and ω → γ4γ5γ6.
A possible encoding would be:
101111 101000 000000 000111 ... (3 times 6 bit do not care)
or:
101111 000111 101000 ... (4 times 6 bit do not care)

Fig. 3: Encoding of events into genomes. As shown in the example, different bitstrings can encode the same decay chain.

3.3. Optimization strategy

For fewer than five photon candidates all combinations are tested. For higher multiplicities the GA uses a (50,50) strategy, where up to 30 individuals are initialised with suitable structures and the rest at random. Two point crossover is used at a rate of 0.7, i.e. 70% of the population, chosen randomly, undergoes crossover per iteration. Each position (bit) has a chance of 10% (footnote a) of being selected for mutation per iteration. If a position is selected, it is randomly set to 0 or 1. The strong influence of chance, due to the high mutation rate, is compensated by the test on “photon number conservation”, which forces the genomes to fulfil the combinatorial constraints before the evaluation. The objective function is only calculated for a genome if it has actually changed after these procedures. The GA terminates after a total of 400 evaluations (footnote b) of the objective function F.

a. This is rather large: for other problems the mutation rate is often 100 times smaller.
b. The number of all combinations increases rapidly with the number of photon candidates (212 for 5, about 6000 for 7 candidates).

4. Results

The method has been tested on 20,000 p̄p→π+π-π0ω→π+π-5γ Monte Carlo events at a p̄-beam momentum of 1200 MeV/c. After the reconstruction of the charged tracks, 10934 events remain with two well measured tracks of opposite charge and at least 5 photon candidates (with a mean value of about 7). In order to select the mesonic final state, the events are preselected by a 4C-kinematic fit. Fig. 2 compares the preselection with the neural network and the GA to the standard reconstruction. Without recovering the energy of electromagnetic fluctuations identified by the neural network, the ratio of 5γ-events to all events that have a 4C-fit with a confidence level higher than 10% increases from 48% to 86%, which is a strong improvement in background reduction. The disadvantage, however, is that the number of 5γ-events is decreased by 13%. When recovering the energy of electromagnetic fluctuations the ratio is a bit lower (72%), but the number of 5γ-events is increased by 144%. Then the preselected 5γ-events are submitted to a 7C-fit to the hypothesis π+π-π0ω. As is shown in Fig. 4, the reconstruction efficiency for the mesonic final state is nearly doubled (a 92% increase).

Fig. 4: π+π-π0ω Monte Carlo events with a confidence level above 10% for a 7C-fit to the hypothesis π+π-π0ω. Only events that survived the confidence level cut on the phasespace fit for the correct multiplicity of 5γ were given to the mass fit. (Axis ticks from 0 to 2000; entries: standard reconstruction, without repaired photons, with repaired photons.)
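The genome layout of Section 3.2 and the operators of Section 3.3 can be illustrated with a short Python sketch. It is a simplified reading of the text, not the authors' code: all function names are hypothetical, bit i (counted from the left, starting at 0) stands for photon candidate i + 1, genes with fewer than two or more than three set bits are skipped without further checks, and a genome that violates “photon number conservation” is only detected here, whereas the paper repairs it in a way that is not specified.

```python
import random


def split_genes(genome, n):
    """Split a bitstring of (n + 1) * n bits into the phasespace gene and n mass genes."""
    assert len(genome) == (n + 1) * n
    genes = [genome[i:i + n] for i in range(0, len(genome), n)]
    return genes[0], genes[1:]


def decode(genome, n):
    """Return (kept photons, photon groups), or None if photon number conservation fails."""
    phase, mass_genes = split_genes(genome, n)
    kept = {i for i, bit in enumerate(phase) if bit == "1"}
    used, groups = set(), []
    for gene in mass_genes:
        if used == kept:
            break  # every kept photon is assigned: the rest of the genome is ignored
        members = {i for i, bit in enumerate(gene) if bit == "1"}
        if len(members) not in (2, 3):
            continue  # this gene does not define an invariant mass
        if not members <= kept or members & used:
            return None  # dropped photon used, or photon used twice
        used |= members
        groups.append(tuple(sorted(members)))
    return (sorted(kept), groups) if used == kept else None


def two_point_crossover(a, b, rng):
    """Exchange the segment between two random cut points of two genomes."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(genome, rng, rate=0.1):
    """Select each bit with probability `rate` and set it randomly to 0 or 1."""
    return "".join(rng.choice("01") if rng.random() < rate else bit for bit in genome)


# The two encodings of the example in Fig. 3 (do-not-care genes filled with zeros).
first = "101111" + "101000" + "000000" + "000111" + "000000" * 3
second = "101111" + "000111" + "101000" + "000000" * 4
print(decode(first, 6))
print(decode(second, 6))
```

Both example genomes decode to the same kept photons and the same two- and three-photon groups, only in a different order, which illustrates the remark in Fig. 3 that different bitstrings can encode the same decay chain.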

5. Conclusion and outlook

The complete method has not yet been tested on a large sample of real data, because the charged track reconstruction is still being improved, especially for annihilations at high beam momenta. The performance of the GA and of the mass fits depends on the error estimate for the measured quantities. First test results on small data samples look very promising: the reconstruction efficiency for mesonic final states (footnote a) is increased by 20% on average.

a. Tested are: π+π-π0, π+π-η, π+π-ω, π+π-2π0, π+π-ηπ0, π+π-ωπ0, π+π-3π0 and π+π-η2π0.

References

1. E. Aker et al., The Crystal Barrel Spectrometer at LEAR, Nucl. Inst., A321 (1992) 6.
2. T. F. Degener, A Feed Forward Neural Network for Recognition of Fluctuations in Electromagnetic Showers, Proceedings of the III. International Workshop on Software Engineering, Artificial Intelligence and Expert Systems for High Energy and Nuclear Physics, Oberammergau, October 4.-8., 1993, pp. 359, World Scientific.
3. R. Berlich, Training Feed Forward Neural Network Using Evolutionary Strategies, these proceedings.
4. C. A. Meyer, User Guide for USDROP, a charged Splitoff Suppression Package, Crystal Barrel note 191, (1992), unpublished.