A project report for EE 230C.
By Laura Balzano
Source location determination is an important problem in the area of sensor networks. The emerging technology of sensor networks offers us the ability to spread out our resources and learn about the world in a way we never have before. At the same time, in order to take advantage of this resource, we have to learn how to process the data collected in new ways so as to extract as much information as possible.
Many sensor network applications involve an array, grid or scattered topology of sensors trying to inform a base station (and the scientist sitting there) about a specific event occurring in the space. Perhaps we would like to know about an earthquake, a forest fire, an intruder, or some animal behavior like birdsong. The sensors that detect and measure these phenomena can hopefully tell us that the event has occurred—an earthquake is happening, a bird is singing. Even more exciting is the prospect that the sensors can tell us where it is happening. Where in the forest is there a fire? Where is the epicenter of the earthquake?
Statistical signal processing is a useful tool for such estimation. For sources that produce waves, such as acoustic and seismic sources, spectral information can be used to estimate the location of the source.
First I want to motivate the geometric model behind source location estimation. The figure shows a simple array of sensors with a far-field source wave impinging on the array. The wave hits each sensor in turn, and in this far-field example there is a constant time delay between adjacent sensors. Here I examine the problem of near-field sources, for which the relative time delay between sensors is not constant.
The work I am explaining in this report is that of Chen and Yao in their paper "Maximum Likelihood Source Localization and Unknown Sensor Location Estimation for Wideband Signals in the Near-Field" [1]. The paper lists other methods of source location estimation, including methods based on relative time-delay estimates, as mentioned in the paragraph above. Those that require information about the speed of wave propagation include spherical interpolation, hyperbolic intersection, and linear intersection; those that do not need the speed of propagation include least squares (LS) and constrained least squares (CLS). These are two-step methods: first the relative time delays are estimated, and then the source location is estimated from them.
Chen and Yao show that the parametric Maximum Likelihood solution to locate a wideband source in the near-field only requires a single step of optimization.
First we need to consider the following model and definitions.
$s_m(n)$ | the source signal from the m^{th} source at time $n$
$a_{mp}$ | the signal gain level of the m^{th} source at the p^{th} sensor
$t_{mp}$ | the time delay in samples, that is, the time it takes the signal from the m^{th} source to reach the p^{th} sensor
$\mathbf{r}_{s_m}$ | the location of the m^{th} source; a three-dimensional vector $(x, y, z)$
$\mathbf{r}_p$ | the location of the p^{th} sensor; a two-dimensional vector $(x, y, 0)$
$v$ | the speed of propagation in length per unit sample
$w_p(n)$ | a zero-mean i.i.d. Gaussian noise sequence with variance $\sigma^2$
From these definitions, we can conclude that the time delay is $t_{mp} = \left\| \mathbf{r}_{s_m} - \mathbf{r}_p \right\| / v$. Also, the relative time delay between sensors $p$ and $q$ is $t^{(m)}_{pq} = t_{mp} - t_{mq}$.
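As a concrete illustration (not part of Chen and Yao's code, which was in MATLAB), here is a small Python sketch of these definitions; the sampling rate, source position, and sensor positions are assumed example values:

```python
import numpy as np

# Assumed values: sound at ~343 m/s sampled at 48 kHz gives a
# propagation speed in meters per sample.
fs = 48000.0            # sampling rate (Hz), an assumed value
v = 343.0 / fs          # speed of propagation in meters per sample

source = np.array([3.0, 4.0, 0.0])          # source location (x, y, z)
sensors = np.array([[0.0, 0.0, 0.0],        # sensor locations (x, y, 0)
                    [0.5, 0.0, 0.0],
                    [1.0, 0.0, 0.0]])

# Time delay in samples from the source to each sensor: ||r_s - r_p|| / v
delays = np.linalg.norm(source - sensors, axis=1) / v

# Relative time delay between sensors 0 and 1: t_0 - t_1
rel_delay_01 = delays[0] - delays[1]
print(delays)
print(rel_delay_01)
```

Because the source sits 5 m from the first sensor, its delay is $5/v$ samples; the sensors closer to the source have smaller delays, and the relative delays are what the two-step methods estimate directly.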
Now we can define the data collected by the p^{th} sensor at time $n$ in a randomly distributed array:

$x_p(n) \;=\; \sum_{m=1}^{M} a_{mp}\, s_m\!\left(n - t_{mp}\right) \;+\; w_p(n)$
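To make the model concrete, here is a minimal Python sketch that generates sensor data according to this equation for a single source. The delays, gains, and noise level are assumed values, and the delays are restricted to whole samples for simplicity (the actual project simulation used Chen's MATLAB code):

```python
import numpy as np

rng = np.random.default_rng(0)

P, n_samples = 3, 256                # sensors, block length (assumed)
s = rng.standard_normal(n_samples)   # stand-in wideband source signal
gains = np.array([1.0, 0.8, 0.6])    # a_p: assumed gain at each sensor
delays = np.array([0, 3, 7])         # t_p: assumed integer delays in samples
sigma = 0.1                          # noise standard deviation

# x_p(n) = a_p * s(n - t_p) + w_p(n), single source, integer delays
x = np.zeros((P, n_samples))
for p in range(P):
    x[p, delays[p]:] = gains[p] * s[:n_samples - delays[p]]
    x[p] += sigma * rng.standard_normal(n_samples)
```

Each row of `x` is the same source waveform, scaled, shifted by its delay, and buried in independent Gaussian noise, exactly as in the equation above.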
If we take the $N$-point DFT of a block of $N$ samples, the DFT will introduce some problems due to the circular shift: the DFT implicitly treats the block as one period of a periodic signal, so a time delay becomes a circular shift of the samples.
When $N$ is finite, and especially if $N$ is very small, we will see "edge effects." When a DFT is taken for a small set of data points, and then again for the next small set, if the frequency content of the two sets is very different we get edge effects, that is, jumps in the data from one block to the next. Instead we want smoother results. Larger blocks of data have less chance of being very different overall from the next block, so a large $N$ is preferred. The problem with a large $N$ is that the data is usually not stationary over time. For instance, the source may be moving, or the noise may be pink and non-stationary.
Stationary random signals are those whose statistics do not change over time. Random processes with power spectral density (PSD) proportional to $1/f^{\alpha}$ are stationary for $\alpha < 1$, that is, $\alpha$ strictly less than 1. Pink noise, for example, has a PSD proportional to $1/f$, and so it is not stationary.
To eliminate problems with small $N$, we can zero-pad our signal so that the DFT length exceeds the data length by at least the maximum relative time delay among the sensors. Zero-padding, though, introduces an extraneous jump at the end of the data, which adds spurious frequency content and eliminates the orthogonality of the noise across the frequencies.
Taking the DFT helps us with our Gaussian noise assumption. As long as the noise is i.i.d., then by the central limit theorem the spectral values asymptotically approach a Gaussian distribution. Therefore, after the DFT, the noise spectrum is zero-mean Gaussian distributed with variance $N\sigma^2$ in each of the $N$ frequency bins.
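This claim is easy to check numerically. The Python sketch below draws i.i.d. noise that is deliberately non-Gaussian (uniform, unit variance) and verifies that an off-DC DFT bin has variance close to $N\sigma^2$; all sizes here are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 64, 4000
sigma = 1.0

# i.i.d. uniform noise on [-sqrt(3), sqrt(3)] has variance sigma^2 = 1.
# After the DFT, each bin is a sum of N independent terms, so by the
# central limit theorem it is approximately Gaussian with variance N*sigma^2.
w = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(trials, N))
W = np.fft.fft(w, axis=1)

# Empirical variance of one off-DC bin (complex variance E|W_k|^2)
bin_var = np.mean(np.abs(W[:, 5]) ** 2)
print(bin_var)   # close to N * sigma^2 = 64
```

The same experiment with any other i.i.d. noise distribution of finite variance gives the same result, which is what lets Chen and Yao treat the noise spectrum as Gaussian.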
From this discussion, Chen and Yao conclude that they have two choices: they can either assume that $N$ is large enough or that the noise is almost uncorrelated across frequency. They then carry out their maximum-likelihood derivation assuming one or the other of these is true, and call the result the "approximate maximum likelihood" (AML) estimator.
After taking the DFT, we will do our estimation in the frequency domain. The estimation setup is as follows.
$S_m(k)$ | the k^{th} spectral component from the m^{th} source
$d_{mp}(k)$ | the steering-vector information of the m^{th} source at the p^{th} sensor
$X_p(k)$ | the k^{th} spectral component as observed at the p^{th} sensor
$\mathbf{r}_{s_m}$ | the location of the m^{th} source; a three-dimensional vector $(x, y, z)$
$\mathbf{r}_p$ | the location of the p^{th} sensor; a two-dimensional vector $(x, y, 0)$
$\eta_p(k)$ | the frequency-domain representation of the zero-mean i.i.d. Gaussian noise, with variance $N\sigma^2$
$\mathbf{X}(k) \;=\; \mathbf{D}(k)\,\mathbf{S}(k) \;+\; \boldsymbol{\eta}(k)$

In addition, since there are $M$ sources and $P$ sensors, each column of the steering matrix $\mathbf{D}(k)$ is a vector. For example, the m^{th} column is equal to

$\mathbf{d}_m(k) \;=\; \left[\, a_{m1}\, e^{-j 2\pi k\, t_{m1}/N}, \;\ldots,\; a_{mP}\, e^{-j 2\pi k\, t_{mP}/N} \,\right]^{T}$
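For illustration, the following Python sketch builds one such steering vector from assumed gains and (possibly fractional) sample delays; note that a delay in time becomes a pure phase in frequency, so the magnitudes of the entries are just the gains:

```python
import numpy as np

N = 512                                # DFT length (assumed)
k = 10                                 # frequency bin
gains = np.array([1.0, 0.9, 0.8])      # a_{mp} for one source, P = 3 sensors
delays = np.array([0.0, 2.5, 5.1])     # t_{mp} in samples (may be fractional)

# Steering vector for source m at bin k:
# d_m(k)[p] = a_{mp} * exp(-j 2 pi k t_{mp} / N)
d = gains * np.exp(-2j * np.pi * k * delays / N)
print(np.abs(d))   # magnitudes equal the gains
```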
We denote the unknown parameters by $\theta$. This vector consists of the locations of the sources as well as the source signals' spectra. The location of each source has three coordinates. The spectrum has components for $k = 0, \ldots, N-1$. The 0^{th} component is just the DC value of the signal. Because the signal is real-valued, the negative-frequency components are simply mirror images (complex conjugates) of the positive-frequency components. So we are only interested in the components $k = 1, \ldots, N/2$. Then our parameter $\theta$ is made up of two vectors:

$\mathbf{r}_s = \left[\mathbf{r}_{s_1}^{T}, \ldots, \mathbf{r}_{s_M}^{T}\right]^{T}, \qquad \mathbf{S} = \left[\mathbf{S}(1)^{T}, \ldots, \mathbf{S}(N/2)^{T}\right]^{T}$

and our parameter then looks as follows:

$\theta = \left[\mathbf{r}_s^{T}, \; \mathbf{S}^{T}\right]^{T}$
Then let $\mathbf{G}(k) = \mathbf{D}(k)\,\mathbf{S}(k)$ and we have

$\mathbf{X}(k) \;=\; \mathbf{G}(k) \;+\; \boldsymbol{\eta}(k)$
Remember from the DFT section above that the noise spectrum vector $\boldsymbol{\eta}(k)$ is zero-mean Gaussian distributed with variance $N\sigma^2$ in each element. Therefore we take the PDF of the data to be

$p(\mathbf{X} \mid \theta) \;\propto\; \exp\!\left( -\frac{1}{N\sigma^{2}} \sum_{k=1}^{N/2} \left\| \mathbf{X}(k) - \mathbf{D}(k)\,\mathbf{S}(k) \right\|^{2} \right)$
And the log-likelihood function is

$\ln p(\mathbf{X} \mid \theta) \;=\; C \;-\; \frac{1}{N\sigma^{2}} \sum_{k=1}^{N/2} \left\| \mathbf{X}(k) - \mathbf{D}(k)\,\mathbf{S}(k) \right\|^{2}$
We want to maximize this likelihood with respect to $\theta$, so the constant terms do not matter. Therefore we simplify:

$\hat{\theta} \;=\; \arg\max_{\theta}\; \left(-\sum_{k=1}^{N/2} \left\| \mathbf{X}(k) - \mathbf{D}(k)\,\mathbf{S}(k) \right\|^{2}\right)$
To maximize our likelihood, we therefore minimize the right-hand side of this equation:

$J(\theta) \;=\; \sum_{k=1}^{N/2} \left\| \mathbf{X}(k) - \mathbf{D}(k)\,\mathbf{S}(k) \right\|^{2}$
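To show the criterion in action, here is a self-contained Python sketch of a grid-search estimator built on this idea. It uses a far-field, single-source simplification (a constant inter-sensor delay parameterized by a DOA angle) rather than the full near-field model, and all array parameters are assumed. Substituting the least-squares solution for $S(k)$ back into $J(\theta)$ turns the minimization into maximizing a sum of matched-filter powers over frequency:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed far-field, single-source setup: uniform linear array,
# inter-sensor spacing expressed in samples of propagation time.
P, N = 4, 256                  # sensors, DFT length
spacing = 4.0                  # spacing in samples
true_doa = 40.0                # degrees from broadside

def delays_for(doa_deg):
    # Far-field: delay at sensor p is p * spacing * sin(theta) samples
    return np.arange(P) * spacing * np.sin(np.radians(doa_deg))

# Generate frequency-domain data X(k) directly from the model
s = rng.standard_normal(N)
S = np.fft.fft(s)
bins = np.arange(1, N // 2)
t = delays_for(true_doa)
X = np.exp(-2j * np.pi * np.outer(t, bins) / N) * S[bins]
X = X + 0.1 * (rng.standard_normal(X.shape) + 1j * rng.standard_normal(X.shape))

def criterion(doa_deg):
    # With the least-squares S(k) substituted back, minimizing J(theta)
    # is equivalent to maximizing sum_k |d(k)^H X(k)|^2 / ||d(k)||^2.
    t = delays_for(doa_deg)
    D = np.exp(-2j * np.pi * np.outer(t, bins) / N)
    return np.sum(np.abs(np.sum(D.conj() * X, axis=0)) ** 2) / P

grid = np.linspace(-90.0, 90.0, 361)   # half-degree grid
est = grid[np.argmax([criterion(g) for g in grid])]
print(est)   # lands near the true 40-degree DOA
```

By the Cauchy-Schwarz inequality, each frequency bin's term is maximized exactly when the candidate delays match the true ones, so the broadband sum peaks at the true DOA while narrowband ambiguities average out.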
So now we understand our problem in terms of the optimal maximum likelihood estimator! This resulting equation is our criterion for optimization. See the appendix for proof and details.
Chen and Yao develop three different lower bounds on the variance in their paper, taking into consideration various unknown parameters. Here I will show the CRB for the case where the signal is known and the speed of propagation is known.
The Fisher information matrix from the signal model defined above is as follows.
$\mathbf{F} \;=\; \frac{2}{N\sigma^{2}} \sum_{k=1}^{N/2} \operatorname{Re}\!\left\{ \mathbf{H}(k)^{H}\, \mathbf{H}(k) \right\}$

where $\mathbf{H}(k) = \partial\left[\mathbf{D}(k)\,\mathbf{S}(k)\right] / \partial \mathbf{r}_s$ and $(\cdot)^{H}$ denotes the conjugate transpose. Recall $N$ is the DFT length and $P$ is the number of sensors. Thus we have

$\operatorname{var}\!\left(\hat{\theta}_i\right) \;\geq\; \left[\mathbf{F}^{-1}\right]_{ii}$
From what we know about the CRB, the variance of our estimator is bounded below by the inverse of the Fisher information matrix. See the appendix for proof and details.
In this project, I used Chen's MATLAB code to simulate direction-of-arrival (DOA) estimation. I have three acoustic recordings of birdsong in .wav format. There are two important MATLAB files: the first generates the signals "heard" at each simulated sensor, with the necessary delays; the second does the ML source estimation based on these signals.
The birds are placed on the ground in the simulation at some angle from the sensor array, and the birdsong file is run through the first MATLAB file to generate the signal heard by each sensor. Currently the code is not written to do 3D location estimation, so the bird cannot be placed in a tree, though this is a future direction of the work. Also, at present only two sources can be identified, so we simulate at most two birds singing at the same time.
The three different bird songs look like this in the time domain.
For the project, I used a truncated section of these files. Here are the time and frequency domain plots for the truncated portion of the bird song.
As we can see, the structure of the different bird songs is very interesting. After a project like this on identifying a bird's location, one might be interested in automatically identifying what kind of bird has been located. This process is called classification; it also fits into estimation theory, but it will not be covered in this report.
The first set of graphs here shows the estimated direction of arrival (DOA) versus the noise variance. I used bird song 1 and bird song 2 in this simulation. All graphs show just one simulation with one instance of the random noise. Across the three graphs, the two sources are getting closer and closer together. We note that the noise variance does not greatly affect the angle estimate. In the first graph, the birds are at _{} and _{} and the estimate is always within 3% of the actual angle.
In these next two graphs, we can see that the error goes up as the bird sources get very close together. In the first graph, the birds are _{} and _{}, and in the second graph they are at _{} and _{}.
Next, I wanted to see if the bird songs I chose were particularly good for the AML algorithm. I ran the algorithm with bird song 2 and bird song 3 as sources, and I got very similar results.
Finally, I wanted to see how the different parameters in the simulation, other than noise variance, affect the outcome of the AML algorithm. The two I explored were the length of the signal used for the algorithm and the number of grid sectors over which the AML algorithm searched.
First, here is a graph showing the DOA estimates for AML when the signal length is about 250,000 samples and the number of grid sectors is 200.
Then I tried shorter signal chunks, all the way down to 512 samples, to see if AML could still estimate the angles correctly. The code worked very well for all values I tried, and here is a graph showing the estimates when the number of samples was 512.
On the other hand, the number of grid sectors searched greatly affects the algorithm: the number of grid sectors is directly equivalent to the resolution of the DOA estimate. Here is a graph showing the estimates when the number of grid sectors is only 15.
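As a small worked example of this equivalence (assuming the grid covers 180 degrees of bearing):

```python
# DOA search resolution for a grid over 180 degrees of bearing:
# each sector is 180 / n_sectors degrees wide, so the estimate can be
# off by up to half a sector even with noiseless data.
def resolution_deg(n_sectors):
    return 180.0 / n_sectors

print(resolution_deg(15))    # 12.0 degrees per sector
print(resolution_deg(200))   # 0.9 degrees per sector
```

With only 15 sectors the best possible answer is quantized to 12-degree steps, which explains the visibly coarse estimates in the graph.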
My final simulation shows the actual source location estimate by finding the intersection of the DOA of two sensor arrays. In my setup I had two sensor arrays, each with three sensors, estimating the DOA of a single bird source (bird song 1). As we can see from this graph, since the DOA estimates are generally very good, our source location estimate is also very good.
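The intersection step can be sketched in a few lines of Python; the array positions and DOA angles below are assumed example values, not the ones from the simulation:

```python
import numpy as np

# Two sensor arrays at known positions, each reporting a DOA for the same
# source; the location estimate is the intersection of the two bearing lines.
array1 = np.array([0.0, 0.0])
array2 = np.array([10.0, 0.0])
doa1, doa2 = 45.0, 135.0       # bearings in degrees from the x-axis (assumed)

d1 = np.array([np.cos(np.radians(doa1)), np.sin(np.radians(doa1))])
d2 = np.array([np.cos(np.radians(doa2)), np.sin(np.radians(doa2))])

# Solve array1 + t1*d1 = array2 + t2*d2 for the range parameters t1, t2
A = np.column_stack([d1, -d2])
t1, t2 = np.linalg.solve(A, array2 - array1)
source_est = array1 + t1 * d1
print(source_est)   # intersection of the two bearings
```

With these example bearings the two lines cross at (5, 5). In practice the two DOA estimates carry noise, so the crossing point inherits an error that grows with range and with how close the two bearings are to parallel.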
The Maximum Likelihood estimator is an optimal estimator,
and it is important to understand what it can do in estimation problems like
this so that we can compare other estimators to this one. The tradeoff between
computation and accuracy is important, and Chen and Yao show in detail the
results from different estimators including MUSIC, least squares, and their own
AML. They also compare these results to the CRB.
For my own future work, I would like to understand maximum likelihood in a variety of contexts, especially those where the signal is not as well-behaved as the examples mentioned here (acoustic, seismic). Hopefully I will get a chance to explore the maximum likelihood estimation of locations of all kinds of phenomena in a space, given a set of sensors and a PDF for a phenomenon’s diffusion throughout that space.
References:
[1] J. C. Chen, R. E. Hudson, and K. Yao,
“Maximum-likelihood source localization and unknown sensor location estimation for
wideband signals in the near field,” IEEE Trans. Signal Processing,
vol. 50, pp. 1843–1854, August 2002.
[2] My notes from Professor Wu’s STAT 200B course, winter quarter 2005, University of California at Los Angeles. His webpage can be found here: http://www.stat.ucla.edu/~ywu
[3] My notes from Professor Lorenzelli’s EE 230A course fall quarter 2004, University of California at Los Angeles.
[4] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall, 1993.