Re: Speed up calculating the pair correlation function for
- To: mathgroup at smc.vnet.net
- Subject: [mg103356] Re: [mg103335] Speed up calculating the pair correlation function for
- From: Leonid Shifrin <lshifr at gmail.com>
- Date: Thu, 17 Sep 2009 06:20:15 -0400 (EDT)
- References: <200909160946.FAA12977@smc.vnet.net>
Hi Szabolcs,

You can gain a two-fold speedup by vectorizing the problem:

pcfOneAltComp =
  Compile[{{points, _Real, 2}, {origin, _Real, 1}, {dr, _Real},
    {rmax, _Real}, {density, _Real}},
   Module[{hist},
    hist = BinCounts[
      Sqrt[Total[(origin - Transpose@points)^2]],
      {0, rmax, dr}];
    Transpose[
     {Range[0, rmax - dr, dr] + dr/2,
      hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density)}]]];

In fact, Compile here helps very little - it gives a marginal
(few percent) improvement. I would also try to use ParallelMap when
you map on origin points.

Regards,
Leonid

2009/9/16 Szabolcs Horvát <szhorvat at gmail.com>

> Hello,
>
> I would like to calculate the pair correlation function normalized to 1
> for some 2D point data. I.e. I need to find the mean density of points
> at distance r from any point, normalized to 1.
>
> I am looking for advice on speeding this up.
>
> This is the current implementation I have:
>
> The pcfOne function calculates the mean density of 'points' at distance
> r from one single point ('origin'), up to 'rmax' in steps of 'dr'.
> 'density' is the average density of all points over the complete region
> (since the shape of the region is unknown to the function, this quantity
> is passed separately):
>
> pcfOne[points_, origin_, dr_, rmax_, density_] :=
>  Module[{hist},
>   hist = BinCounts[
>     With[{v = # - origin}, Sqrt[v.v]] & /@ points,
>     {0, rmax, dr}];
>   Transpose[
>    {Range[0, rmax - dr, dr] + dr/2,
>     hist/(Pi (dr^2 + 2 dr Range[0, rmax - dr, dr]) density)}
>    ]
>   ]
>
> Now we can select a subset of the points, calculate this function for
> all of them and average the results. For randomly distributed points
> the result will be a constant function of value 1 (at least until we get
> too close to the edge of the region):
>
> data = RandomReal[1, {50000, 2}];
>
> ListPlot[
>  Mean[
>   pcfOne[data, #, 0.05, 0.5, Length[data]] & /@
>    Nearest[data, {.5, .5}, 1000]
>   ],
>  PlotRange -> {0, 2}, Axes -> False, Frame -> True
>  ]
>
> This runs in 80 seconds on my machine. I would like to use this
> function on datasets of up to 300,000 points and average over more than
> just 1000 points near the middle, say 10000. That would take 60 times
> as long, ~80 minutes, which is way too much.
>
> Is it possible to speed this up significantly?
>
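The ParallelMap suggestion in the reply could be tried along the following lines. This is only a sketch under some assumptions: pcfOneVec is a hypothetical name (not from the thread) for the vectorized body written as an ordinary function, since Compile was reported to help very little, and the data and parameters are the ones from the quoted example. Any gain will depend on the number of available kernels and on the cost of sending the data to them.

```mathematica
(* pcfOneVec: hypothetical name; same vectorized computation as in the
   reply, written as a plain (uncompiled) function *)
pcfOneVec[points_, origin_, dr_, rmax_, density_] :=
 Module[{r = Range[0, rmax - dr, dr], hist},
  (* distances from origin to all points at once, then histogram them *)
  hist = BinCounts[
    Sqrt[Total[(origin - Transpose[points])^2]], {0, rmax, dr}];
  (* bin centers paired with counts normalized by annulus area and density *)
  Transpose[{r + dr/2, hist/(Pi (dr^2 + 2 dr r) density)}]]

data = RandomReal[1, {50000, 2}];
origins = Nearest[data, {.5, .5}, 1000];

(* make the function and the data known to the parallel kernels *)
DistributeDefinitions[pcfOneVec, data];

(* one independent evaluation per origin point, spread over the kernels *)
result = Mean[
   ParallelMap[pcfOneVec[data, #, 0.05, 0.5, Length[data]] &, origins]];

ListPlot[result, PlotRange -> {0, 2}, Axes -> False, Frame -> True]
```

Each origin point is processed independently, so the map over origins is the natural place to parallelize; the per-origin work stays vectorized inside pcfOneVec.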
- References:
- Speed up calculating the pair correlation function for 2D point data
- From: Szabolcs Horvát <szhorvat@gmail.com>