Performance Testing of a Real-Time Software-Based GPS Receiver for x86 Processors Shahin Charkhandeh, M.G. Petovello, G. Lachapelle Position, Location and Navigation (PLAN) Group Department of Geomatics Engineering, University of Calgary Calgary, Alberta, Canada, T2N 1N4
BIOGRAPHY
Shahin Charkhandeh is an MSc student in the Department of Geomatics Engineering at the University of Calgary. He has a BSc in software engineering from the same university. His area of specialization is software receiver design and algorithms.
Dr. Mark Petovello holds a PhD from the Department of Geomatics Engineering at the University of Calgary, where he is a senior research engineer in the Position, Location and Navigation (PLAN) group. He has been involved in various navigation research areas since 1998, including satellite-based navigation, inertial navigation, reliability analysis, and dead-reckoning sensor integration.
Dr. Gerard Lachapelle is a professor in the above department, where he is responsible for teaching and research related to location, positioning and navigation. He has been involved with GPS developments and applications since 1980. More info on http://plan.geomatics.ucalgary.ca
ABSTRACT
Given the demanding computational requirements of software-based GPS receivers, high data processing efficiency is required to obtain real-time performance. There are two basic approaches to accomplish this: reducing the number of computations required, or improving the efficiency with which the computations are carried out. This paper takes the latter approach, primarily by using the MMX technology available on x86-compatible processors. The interface between hardware and software, fast acquisition through FFT methods, Doppler removal and code correlation are the focus of this paper. The real-time performance of the receiver in static mode is also reviewed. It is demonstrated that FFT acquisition can deliver a fast acquisition method suitable for software receivers. Initial
ION GNSS 2006, Fort Worth TX, 26 – 29 September 2006
results indicate reasonable tracking performance while tracking up to eight channels in real-time. The single-point position and velocity accuracies are at the meter and decimeter per second level, respectively, when using a 1 ms coherent integration time.
INTRODUCTION
A typical GPS software receiver performs the entire baseband signal processing in software. This allows developers to modify and test new algorithms without making any changes to the hardware. Expanding a GPS software receiver to support new signal structures is also easier and cheaper than with traditional receivers. This is an appealing characteristic, since many new signals such as GPS L1C and L5 and the entire Galileo system will arrive in the near future. Doppler removal and correlation of GPS signals require millions of complex operations per second, the execution of which is governed by both the processor's speed and the number of GPS data samples. If a software receiver operates slower than the rate required to process incoming data, it can lose lock on the signal. The PLAN (Position, Location And Navigation) Group developed its first software receiver (GNSS_SoftRx™) during the past three years. The post-mission version has already been used by numerous PLAN Group members for many research projects (Ma et al 2004, Skone et al 2005, Zheng & Lachapelle 2005). The second version of the software was developed in 2005. In this version, real-time operation capability was achieved by using optimized algorithms and SIMD/MMX instructions to perform Doppler removal and code correlation (Charkhandeh et al 2006), although that version still operated only in post-mission mode.
The current work focuses on expansion and optimization of this second receiver, interfacing the software with hardware, and real-time acquisition of the signal. The work mainly addresses the issues with the acquisition component and the final performance of the overall system.
FRONT-END AND SOFTWARE INTERFACE
One of the main factors that affect the performance of GPS software receivers is the sampling rate of the signal. The sampling rate contributes to the computational load of the overall system: the higher the sampling rate, the more data must be processed by the receiver and the more computer resources are needed. Choosing the right sampling rate is therefore a necessary initial step in developing a software receiver. The GPS L1 signal is transmitted at 1575.42 MHz and has a null-to-null bandwidth of 2 MHz. This signal is phase modulated with a Pseudo-Random Noise (PRN) code with a 1.023 MHz chipping rate. The sampling rate should not be a multiple of the chipping rate, since this makes it difficult to obtain a fine distance resolution (Tsui 2000). The other factor in choosing the sampling rate is the Nyquist rate. The Nyquist sampling theorem requires that the minimum sampling rate be twice the information bandwidth. Therefore, in the case of the GPS L1 C/A code signal, a minimum sampling rate of 2 MHz is required. However, since there are no ideal filters, it is usually recommended to choose a sampling rate that is 2.5 times the bandwidth to mitigate the effect of filter rolloff. This makes 5 MHz a preferred sampling frequency for the GPS C/A code signal. This being said, it has been shown that the GPS C/A code signal can be sampled at rates lower than 5 MHz. Pany et al (2003) show that it is not necessary to retrieve the whole information (basically the code sequence) of the signal to reconstruct the autocorrelation function.
Therefore, the GPS signal can be tracked with a lower sampling rate to improve processing efficiency at the cost of increased tracking error. This has allowed many research groups to develop software GPS receivers that operate at sampling frequencies below 5 MHz (Akos et al 2001, Heckler & Garrison 2004, Pany et al 2002). However, the GNSS_SoftRx™ software used for this paper has been designed to operate at a 5 MHz (real data) sampling rate. There are two main reasons behind this. First, reaching real-time performance at a higher rate gives the flexibility of moving down in sampling rate without worrying about the speed and the performance of the receiver. Second, a higher sampling rate allows the use of more advanced correlator techniques, which cannot be implemented at lower sampling rates.
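The sampling-rate reasoning above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names and the 2.5 rolloff factor argument are introduced here for illustration, and the 2 MHz bandwidth is the null-to-null figure quoted in the text.

```python
# Illustrative check of the sampling-rate choice discussed in the text.
# All names below are hypothetical helpers, not part of GNSS_SoftRx.

CHIP_RATE_HZ = 1.023e6      # C/A code chipping rate
BANDWIDTH_HZ = 2.0e6        # null-to-null bandwidth quoted in the text


def recommended_rate_hz(bandwidth_hz, rolloff_factor=2.5):
    """Sampling rate suggested by the '2.5 times the bandwidth' rule."""
    return bandwidth_hz * rolloff_factor


def samples_per_chip(sampling_rate_hz):
    """Average number of samples falling inside one code chip."""
    return sampling_rate_hz / CHIP_RATE_HZ


def is_multiple_of_chip_rate(sampling_rate_hz, tol=1e-9):
    """True when the sampling rate is an integer multiple of the chip rate."""
    ratio = sampling_rate_hz / CHIP_RATE_HZ
    return abs(ratio - round(ratio)) < tol


fs = recommended_rate_hz(BANDWIDTH_HZ)
print(fs)                             # 5000000.0
print(samples_per_chip(fs))           # roughly 4.89 samples per chip
print(is_multiple_of_chip_rate(fs))   # False
```

At 5 MHz the number of samples per chip is not an integer, so successive chips are sampled at slowly shifting positions, which is the property the text asks for.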
A National Instruments (NI) Data Acquisition Card (NIDAQ) 5335 has been used to transfer the data from the hardware to the software. NI-DAQ offers a library which supports traditional single-buffer data transfer while also offering the newer technique of double buffering. Double buffering allows uninterrupted and continuous transfer of large blocks of data to the software. This will be discussed further in later sections of this paper.
COMPUTATIONAL BOTTLENECKS
The most computationally expensive task of a GPS receiver is IF signal processing, which includes Doppler removal and correlation with the local code. A receiver typically needs to process GPS data acquired at a sampling rate of 5 MHz or higher for C/A code. Additionally, the receiver must perform other operations such as tracking and solution calculation in parallel with the signal processing operations. Although the latter tasks typically run at a lower rate (20 Hz or lower), their computational requirements must still be considered. Furthermore, the additional processing required for interfacing with an RF front-end in real-time must be considered. Charkhandeh et al (2006) discuss these problems in more detail. Here we briefly look at the solutions that were employed in this receiver to overcome these issues.
DOPPLER REMOVAL AND CORRELATION
To obtain the most efficient processing capability, the two major issues that need to be addressed are 1) generation of the sine and cosine values used for Doppler removal; and 2) mixing the GPS data with the sine and cosine values and performing the code correlation.
Generation of sine and cosine values
Computing a large number of sine and cosine values in real-time is computationally too expensive for software receivers. This is a challenging task in software particularly because software does not have the same level of parallelism as hardware.
As a solution to this problem, these values can be computed in advance and stored in memory. Later, they can be used in real-time to perform the Doppler removal. Two methods have been developed to do this efficiently. First, a table-lookup method reduces the time significantly. In this method, sine and cosine values for all possible phases are calculated and stored in an array. The software uses the estimated phase of the GPS signal to index into this array to obtain the desired sine and cosine values. However, the indexing requires that the data be processed on a sample-by-sample basis, and the memory lookup in each step is time consuming. Although this method enhances the performance, the gain is not enough for the receiver to perform in real-time. The second method is called the table grid method. In this method, sine and cosine values are computed for a set of grid frequencies and saved in a table (Ledvina & Psiaki 2003). With the table grid method, a decrease in signal-to-noise ratio (SNR) is expected from using an inexact frequency. However, with proper selection of the frequency spacing in the grid, the effect on the accuracy of the solution and on tracking sensitivity can be minimized. Ledvina et al (2003) show that, for 1 ms coherent integration and 350 Hz frequency spacing, the worst-case SNR loss is 0.44 dB. For more detail on this method, please refer to Charkhandeh et al (2006) or Ledvina et al (2003).
Mixing the GPS data with sine and cosine values and code correlation
In addition to forming the sine and cosine values, mixing the signals to baseband (i.e., Doppler removal) is a time-consuming task. Herein, MMX instructions are used to perform this operation (Charkhandeh et al 2006). The next step after Doppler removal is the correlation of the GPS data with the local code. This is a computationally expensive task that involves thousands of multiplications and additions to process 1 ms of GPS data for one satellite. Performance tests show that, using standard (integer math) C operators, this task cannot be completed in real-time for more than one satellite. MMX instructions can help the software to speed up these operations. MMX technology is a set of SIMD (Single Instruction, Multiple Data) instructions available on Intel and other x86-compatible processors. SIMD works best when data is stored in blocks, so that several pieces of data can be loaded with one operation. This reduces the memory access time when dealing with large quantities of data.
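The 0.44 dB worst-case figure quoted for the table grid method can be reproduced with a short calculation. This sketch assumes the usual sinc-squared power loss for a residual frequency error over one coherent integration, and a worst-case error of half the grid spacing; both are assumptions made here for illustration rather than details stated in the paper, and the function name is hypothetical.

```python
import math


def worst_case_snr_loss_db(freq_spacing_hz, integration_time_s):
    """Worst-case SNR loss (dB) from using the nearest grid frequency.

    Assumes the largest frequency error is half the grid spacing and
    that the power loss follows sinc^2(error * T).
    """
    max_error_hz = freq_spacing_hz / 2.0
    x = math.pi * max_error_hz * integration_time_s
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    # Power loss is sinc^2, so the loss in dB is -20*log10(|sinc|).
    return -20.0 * math.log10(abs(sinc))


loss_db = worst_case_snr_loss_db(350.0, 1e-3)
print(round(loss_db, 2))
```

Under these assumptions the result agrees with the quoted value to two decimal places, which suggests the 350 Hz spacing was chosen to keep the loss under roughly half a decibel.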
The other advantage of SIMD is that it has a set of operations which work in parallel on a set of data (Charkhandeh et al 2006, Heckler & Garrison 2004). Correlation across the C/A code boundaries may introduce a processing loss due to navigation data bit transitions. A solution to this problem was proposed in Heckler & Garrison (2004). The correlation is achieved in three steps:
1) Correlate the GPS data with the local code up to the code rollover point.
2) Perform the tracking loop update (if necessary).
3) Use the new Doppler to correlate the remaining data.
The above steps repeat for every block of data, with the result from step 3 of the previous block being added to the result from step 1 of the current block.
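The three-step split correlation can be sketched as follows. This is a simplified illustration, not the receiver's implementation: Doppler removal and the tracking loop update are omitted, the samples are already at baseband, and the function names, the toy 8-sample code and the use of a code table holding two back-to-back periods (so that slices never wrap) are all assumptions made for the example.

```python
def correlate(samples, code):
    """Multiply-accumulate two equal-length sequences."""
    return sum(s * c for s, c in zip(samples, code))


def process_block(block, code2, code_phase, carry):
    """Correlate one block that straddles a code rollover.

    block      : one code period of samples, starting code_phase
                 samples into the code period
    code2      : two back-to-back periods of the sampled local code
    carry      : partial sum left over from the previous block (step 3)
    Returns (completed full-period sum, new carry).
    """
    n = len(code2) // 2
    rollover = n - code_phase                  # samples until rollover
    # Step 1: correlate up to the code rollover point.
    head = correlate(block[:rollover], code2[code_phase:n])
    completed = carry + head
    # Step 2: a tracking loop update would use `completed` here.
    # Step 3: correlate the remaining data (start of the next period).
    tail = correlate(block[rollover:], code2[n:n + code_phase])
    return completed, tail


code = [1, -1, 1, 1, -1, -1, 1, -1]            # toy code, 8 samples
code2 = code + code
phase = 3
stream = (code * 4)[phase:phase + 24]          # signal starting mid-period

carry = 0
results = []
for k in range(3):
    block = stream[8 * k:8 * k + 8]
    completed, carry = process_block(block, code2, phase, carry)
    results.append(completed)
print(results)  # [5, 8, 8]
```

The first value is a partial sum because the stream starts mid-period; every later value is a full-period correlation aligned to the code boundary, so no accumulation spans a possible data bit transition.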
Another important technique used in the receiver to enhance its performance is a pre-computed C/A code table. Two periods of the C/A code are stored back to back in memory at the appropriate sampling frequency, so there is no need to wrap around while performing a 1 ms correlation. It is noted that C/A code Doppler was not considered (Charkhandeh et al 2006), although for a coherent integration time of 1 ms (used herein) this should have a minimal effect.
ACQUISITION
The first step in a GPS receiver is to find a rough estimate of the PRN code offset and carrier Doppler. Acquisition is a coarse synchronization process which provides these estimates. This information is used to initialize the tracking loops for signal tracking and navigation data demodulation. GPS signal acquisition is essentially a two-dimensional search process in which a replica code and a replica carrier are aligned with the received signal. The correct alignment is identified by measuring the output power of the correlators. When both the code and the carrier Doppler match the incoming signal sufficiently well, the signal is de-spread and the data can be demodulated. FFT acquisition is a common method used in GPS software receivers. The main advantage of the FFT approach is that it calculates the entire correlation function for a selected Doppler in a single step. Many different optimized FFT algorithms have been developed (Yang 2001). This receiver uses an FFT algorithm based on the mixed-radix method first proposed by Singleton (1969). This FFT algorithm works quickly and efficiently on any series whose length has factors of 2, 3, 4, 5, 8 and 10. To save more time, the conjugate of the FFT of the local PRN code can be calculated and stored in the initialization module of the receiver. This frees computational resources that would otherwise be needed to generate those FFT values on the fly.
Also, in order to save memory, these values can be stored at the baseband frequency. Since the receiver already takes advantage of the fast Doppler removal method using MMX, the Doppler frequency of the incoming signal is removed before taking the FFT. The following steps are performed during the acquisition.
• Perform a Doppler removal for the appropriate Doppler bin using MMX
• Take the FFT of the Doppler-removed signal
• Multiply the Doppler-removed signal's FFT with the stored conjugate of the FFT of the local PRN
• Take the inverse FFT of the product, which gives the correlation result in the time domain for all code phase offsets
• Repeat the above for all the possible PRNs
• Repeat the same procedure for the next Doppler bin
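The acquisition steps listed above can be sketched for a single PRN as follows. This is a toy illustration under several assumptions: a simple radix-2 FFT is swapped in for the mixed-radix algorithm the receiver uses, the input is complex baseband rather than real IF data, Doppler removal is done in plain Python rather than with MMX, and the 64-sample random code, sampling rate and function names are hypothetical.

```python
import cmath
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT via the conjugation identity."""
    n = len(spectrum)
    y = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in y]


def acquire(samples, code, fs_hz, doppler_bins_hz):
    """Return (doppler_hz, code_phase, peak_amplitude) of the best cell."""
    # Conjugate of the code FFT is computed once, as at initialization.
    code_fft_conj = [v.conjugate() for v in fft([complex(c) for c in code])]
    best = (None, None, 0.0)
    for f in doppler_bins_hz:
        # Doppler removal for this bin.
        wiped = [s * cmath.exp(-2j * cmath.pi * f * k / fs_hz)
                 for k, s in enumerate(samples)]
        # Multiply spectra, then inverse FFT gives all code phases at once.
        product = [a * b for a, b in zip(fft(wiped), code_fft_conj)]
        for phase, value in enumerate(ifft(product)):
            if abs(value) > best[2]:
                best = (f, phase, abs(value))
    return best


random.seed(0)
n, fs = 64, 64000.0
code = [random.choice((-1, 1)) for _ in range(n)]
delay, doppler = 10, 1000.0
samples = [code[(k - delay) % n] * cmath.exp(2j * cmath.pi * doppler * k / fs)
           for k in range(n)]
bins = [-2000 + 500 * i for i in range(9)]
best = acquire(samples, code, fs, bins)
print(best[0], best[1])
```

In a full search the outer loop would also run over all candidate PRNs, each with its own stored conjugate spectrum, and the peak would be compared against a detection threshold rather than simply taking the maximum.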
The threshold for detection V_t is based on the following (Krumvieda et al 2001):

$$V_t = \sigma_n \sqrt{-2 \ln P_{fa}}$$

where P_fa is the probability of false alarm and σ_n is the 1-sigma noise amplitude. σ_n is obtained by using a reference PRN which is known to be absent from the incoming signal, such as PRN 37. The next step is to compare the amplitude of the signal in each Doppler and code bin with the threshold. If the amplitude is at or above the threshold, a signal is deemed present.
CODE TRACKING
Inside the GPS receiver, a second order DLL (Delay Lock Loop) is used to track the C/A code of the signals. Although all the discriminators discussed by Ward (1996) can be used in the receiver, the normalized dot product discriminator has been chosen since it uses the early minus late code. To save time and increase the processing performance, the early minus late code can be calculated off line, stored, and used during the correlation process. This reduces the number of correlators per satellite from six to four (since one correlator is required for each of the in-phase, I, and quadrature, Q, channels) and increases the computational performance of the receiver. The normalized dot product discriminator for 1/2 chip correlator spacing produces nearly true error output within ±1/2 chip of input error. The discriminator output is calculated as

$$\text{discrim} = \frac{\sum (I_E - I_L)\, I_P + \sum (Q_E - Q_L)\, Q_P}{2N}$$

where N is the normalization factor, given as

$$N = I_P^2 + Q_P^2$$

This is described by Julien (2005). I_P and Q_P are the prompt in-phase and quadrature correlation values, and the subscripts E and L denote the early and late correlator outputs.
CARRIER TRACKING
Carrier tracking is initially performed by using a 2nd order FLL (Frequency Lock Loop). The decision-directed cross product discriminator has been implemented for this receiver. This discriminator is optimal at high SNR and has a slope proportional to signal amplitude. The calculations required to generate the frequency discriminator output are given below:

$$\text{dot} = I_{P1}\, I_{P2} + Q_{P1}\, Q_{P2}$$

$$\text{cross} = I_{P1}\, Q_{P2} - I_{P2}\, Q_{P1}$$

$$\text{discrim} = \frac{\operatorname{sign}(\text{dot}) \cdot \text{cross}}{t_2 - t_1}$$

An FLL-assisted PLL (Phase Lock Loop) is also implemented in the receiver after initial frequency determination. The ATAN carrier phase discriminator is used for the PLL. The discriminator is optimal at high and low SNR. Below is the discriminator algorithm:

$$\text{discrim} = \operatorname{ATAN}\left(\frac{Q_P}{I_P}\right)$$
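The three discriminators described above can be written compactly as follows. This is a minimal sketch rather than the receiver's code: the function names are hypothetical, the DLL function takes single correlator outputs instead of sums over several epochs, and the cross term is taken in its standard form with a minus sign between the two products.

```python
import math


def dll_dot_product(i_e, q_e, i_p, q_p, i_l, q_l):
    """Normalized dot product code discriminator."""
    norm = i_p * i_p + q_p * q_p
    return ((i_e - i_l) * i_p + (q_e - q_l) * q_p) / (2.0 * norm)


def fll_cross_product(i_p1, q_p1, i_p2, q_p2, t1, t2):
    """Decision-directed cross product frequency discriminator."""
    dot = i_p1 * i_p2 + q_p1 * q_p2
    cross = i_p1 * q_p2 - i_p2 * q_p1
    # sign(dot) removes the effect of a data bit flip between epochs.
    return math.copysign(1.0, dot) * cross / (t2 - t1)


def pll_atan(i_p, q_p):
    """Two-quadrant arctangent phase discriminator (data-bit insensitive)."""
    if i_p == 0.0:
        return math.copysign(math.pi / 2.0, q_p)
    return math.atan(q_p / i_p)


# Phase error of 0.3 rad, with and without a data bit inversion.
print(pll_atan(math.cos(0.3), math.sin(0.3)))
print(pll_atan(-math.cos(0.3), -math.sin(0.3)))

# Phase advancing by 0.1 rad over 1 ms between two prompt samples.
print(fll_cross_product(1.0, 0.0, math.cos(0.1), math.sin(0.1), 0.0, 1e-3))

# Early correlator stronger than late: positive code error estimate.
print(dll_dot_product(0.75, 0.0, 1.0, 0.0, 0.25, 0.0))
```

Because the arctangent uses the ratio Q_P / I_P, inverting both components leaves the output unchanged, which is why this discriminator tolerates navigation data bit transitions.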
C/N0 estimation is based on an algorithm discussed by Van Dierendonck (1996). The estimation result depends on a priori knowledge of the bit transitions.
TEST SETUP
Figure 1 shows the test and data gathering setup used to evaluate receiver performance. GPS RF signals are input to a NovAtel Euro-3M™ card. The Euro-3M™ card has an intermediate frequency (IF) of 70.42 MHz and a front-end bandwidth of 16 MHz. The output from the card is 6-bit samples (3-bit L1 and 3-bit L2) that are synchronized with a 40 MHz sampling clock. Digital down sampling is implemented in the FPGA. A 6-tap bandpass filter is implemented to reduce the aliasing effect, and the data is then down sampled by choosing every 8th sample, thus yielding a 5 MHz output rate. Unfortunately, the low-order bandpass filter introduces a 6 dB degradation in SNR. Better performance could be achieved with higher-order filters, but such a filter could not be implemented herein because of speed limitations of the FPGA. Normally, the front-end would be connected to a GPS antenna to receive real signals. However, since digital down conversion introduces a 6 dB loss, a Spirent 7700 GPS simulator was used for testing. The simulator allowed an increase in the signal power to compensate for the down sampling loss, such that the signal "seen" by the receiver is comparable to what would be seen if a higher-order (i.e., more ideal) filter were available.
Figure 1: Front end and test set up
The Altera UP2 FPGA (Altera University Program 2) takes in 6-bit samples from the NovAtel card, as well as the synchronized 40 MHz sampling clock. Several 1-bit in/8-bit out serial-to-parallel shift registers, D-latches, and one 3-bit counter are implemented in the FPGA. The output of the FPGA board is 3-bit L1 samples. The NI data acquisition card (6534) has DMA (Direct Memory Access) capability that increases the speed of transferring data to the PC. As mentioned earlier, a double buffer scheme is used to transfer blocks of data in a continuous manner. In double-buffered input operations, the data buffer is configured as a circular buffer. The DAQ device fills the circular buffer with data. When the end of the buffer is reached, the device returns to the beginning of the buffer and fills it with data again. This process continues indefinitely until it is interrupted by a hardware error or cleared by a function call (DAQ 2000). During double buffering, the NI-DAQ card internally divides the buffer into two halves. This allows the NI-DAQ to coordinate user access to the data buffer. The coordination scheme is conceptually simple: the NI-DAQ card copies data from the circular buffer in sequential halves to a transfer buffer that the user creates. The user can process this data while the DAQ device transfers the incoming data into the other half of the circular buffer. The key to real-time implementation, therefore, is to process all of the data in the transfer buffer before it is overwritten.
Real-time Performance
The receiver is capable of tracking 8 satellites in real-time on a 3.2 GHz Pentium 4 processor with 1 MB of RAM and hyper-threading enabled. In this configuration, the receiver uses only 55-60% of all the CPU resources available. Hyper-threading has no effect on the MMX performance. The receiver is also single threaded and so is not affected by the hyper-threading feature. However, by simulating a second logical CPU, hyper-threading enables other applications to run more easily while the software receiver program is being executed. The results that follow were all gathered while the receiver was operating in real-time and tracking 8 satellites.
ACQUISITION PERFORMANCE
Table 1 shows the time it takes the receiver to perform the full sky search in a cold start mode. Results presented are for 1 and 2 ms coherent integration. The time is the total time needed to search for all the possible satellites.

Table 1: Acquisition speed
Coherent Integration    Time (s)
1 ms                    2.4
2 ms                    6.1

Figure 2 and Figure 3 show the result of the FFT search for all possible code and Doppler bins for PRN 17 using 1 and 2 ms of coherent integration, respectively.
Figure 2: 1 ms coherent integration
For the following tests, the receiver was configured with the following parameters:
• 1 ms coherent integration
• Second order FLL with 10 Hz bandwidth and decision-directed cross product discriminator
• Third order PLL with 18 Hz bandwidth and ATAN discriminator (aided by FLL)
• 20 Hz pseudo-range measurement output rate
Figure 3: 2 ms coherent integration
As expected, increasing the coherent integration time increases the size of the peak, making detection easier and more robust.
TRACKING PERFORMANCE
This section presents some tracking results. The single-satellite results were obtained from PRN 17 but are indicative of the other satellites. Figure 4 shows the C/N0 calculated by the receiver for all the PRNs being tracked in real-time. As expected, the signals are strong, since they were generated by the simulator and do not contain the effects of antenna gain pattern rolloff.
Figure 4: Estimated C/N0 for all PRNs
Figure 5, showing the Doppler for PRN 17, indicates good carrier tracking by the receiver. The Doppler appears somewhat noisy, which is likely due to the short integration time used (1 ms).
Figure 5: Doppler value for PRN 17
Figure 6 is a smoothed version of the output of the PLL lock detector. As expected, the value converges to one, which shows good phase lock during tracking.
Figure 6: PLL Lock detector output
POSITION ACCURACY
A single-point position and velocity solution was computed using the PLAN group's C3NAVG2™ software. C3NAVG2™ uses an epoch-by-epoch least squares algorithm. The results shown in Figure 7 and Figure 8 were obtained using the software receiver pseudo-range measurements only. Table 2 shows the RMS errors for position (north, east and up) and their corresponding DOP. The errors are within a reasonable range (meter level) for single-point operation. The Table 2 results also show a direct relation between the DOP and the size of the errors.
Figure 7: Scatter plot of North and East errors

Table 2: Position error statistics
Parameter    North     East      Up
DOP          0.9       0.9       2.4
RMS Error    5.59 m    3.70 m    17.04 m

As mentioned above, a static antenna was used in the experiment. Figure 8 shows the velocity errors, which have an RMS value of 0.09 m/s for north, 0.08 m/s for east and 0.15 m/s for the vertical.
Figure 8: Velocity error versus time
CONCLUSION AND FUTURE WORK
In this paper the development and testing of a high-performance GPS software receiver was reviewed. The effects of MMX technology and re-designed algorithms on the speed of the software were also examined. It was shown that the optimized algorithms enable the receiver to operate in real-time while interfacing with the hardware. Results were presented from the operation of the receiver while tracking 8 satellites in real-time. Code and carrier tracking performance were satisfactory despite the short integration time used. The RMS position and velocity errors were at the meter and decimeter per second level, respectively. The SSE2 (Streaming SIMD Extensions 2) extension, which is similar to MMX and is offered by Intel, may further improve the performance of the real-time receiver on general purpose platforms. SSE2 has 128-bit registers, twice the width of the 64-bit MMX registers, which doubles the amount of parallelism available to the software.
ACKNOWLEDGEMENTS
The authors would like to acknowledge the Informatics Circle of Research Excellence (iCORE) for funding part of this research.
REFERENCES
Akos, D.M., P.L. Normark, A. Hansson, A. Rosenlind and P. Enge (2001), Real-Time GPS Software Radio Receiver, Proc. of Institute of Navigation 2001 National Technical Meeting (January 22-24, 2001, Long Beach, CA) 809-816.
DAQ (2000), NI-DAQ User Manual for PC Compatibles, Version 6.7, National Instruments, January 2000 edition.
Heckler, G.W. and J.L. Garrison (2004), Architecture of a Reconfigurable Software Receiver, ION GNSS 17th International Technical Meeting of the Satellite Division (September 21-24, 2004, Long Beach, CA) 947-955.
Julien, O. (2005), Design of Galileo L1F Receiver Tracking Loops. PhD Thesis, published as Report No. 20227, Department of Geomatics Engineering, The University of Calgary.
Kaplan, E.D. (1996), Understanding GPS: Principles and Applications, Boston: Artech House, Inc.
Krumvieda, K., P. Madhani et al (2001), A Complete IF Software GPS Receiver: A Tutorial about the Details, ION GPS 2001, 11-14 September 2001, Salt Lake City, UT.
Ledvina, B.M., M.L. Psiaki, S.P. Powell and P.M. Kintner (2003), A 12-Channel Real-Time GPS L1 Software Receiver, Proc. of Institute of Navigation National Technical Meeting (January 22-24, 2003, Anaheim, CA) 767-783.
Ma, C., G. Lachapelle and M.E. Cannon (2004), Implementation of a Software Receiver, ION GNSS 17th International Technical Meeting of the Satellite Division (September 21-24, 2004, Long Beach, CA) 882-893.
Pany, T., B. Eissfeller and J. Winkel (2003), Tracking of High Bandwidth GPS/Galileo Signals with a Low Sample Rate Software Receiver, Proc. ENC-GNSS 2003, Graz, April 2003.
Pany, T., B. Eissfeller, G. Hein, S.W. Moon and D. Sanroma (2002), IPEXSR: A PC Based Software GNSS Receiver Completely Developed in Europe, Proc. ENC-GNSS 2002.
Singleton, R. (1969), An Algorithm for Computing the Mixed Radix Fast Fourier Transform, IEEE Trans. Audio Electroacoust., v. AU-17, p. 93, June 1969.
Skone, S., G. Lachapelle, D. Yao, W. Yu and R. Watson (2005), Investigating the Impact of Ionospheric Scintillation using a GPS Software Receiver. Proceedings of GNSS 2005 (Session C3, Long Beach, CA, 13-16 September).
Tsui, James B-Y. (2000), Fundamentals of Global Positioning System Receivers: A Software Approach, John Wiley & Sons Inc.
Van Dierendonck, A.J. (1996), Global Positioning System: Theory and Applications, Volume I, Chapter 8: GPS Receivers, AJ Systems, Los Altos, CA 94024.
Yang, C. (2001), FFT Acquisition of Periodic, Aperiodic, Puncture, and Overlaid Code Sequences in GPS, Proc. ION-GPS 2001, Salt Lake City, pp. 137-147.
Zheng, B. and G. Lachapelle (2005), GPS Software Enhancements for Indoor Use. Proceedings of GNSS 2005 (Session C3, Long Beach, CA, 13-16 September).