MathGroup Archive 1999

[Date Index] [Thread Index] [Author Index]

Search the Archive

Re: Element Extraction

  • To: mathgroup at smc.vnet.net
  • Subject: [mg19375] Re: [mg19240] Element Extraction
  • From: "Allan Hayes" <hay at haystack.demon.co.uk>
  • Date: Sat, 21 Aug 1999 00:04:44 -0400
  • References: <199908110606.CAA01017@smc.vnet.net.> <37B47049.33AB607@fermi.phys.washington.edu> <37B54909.973D6F5E@hixnet.co.za> <37B53F8E.B16E04A2@fermi.phys.washington.edu> <001c01bee71f$5b311ba0$9b16989e@machine1> <7pipkh$39t$2@dragonfly.wolfram.com>
  • Sender: owner-wri-mathgroup at wolfram.com

Carl,

Re:
> Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[])
> will be faster than any other attempt to count the number of False entries
...

A slight improvement can be gained by using  DeleteCases instead of
/.True->Sequence[] (timings below) . But it seems to me that some lessons to
be learned from these postings are:

    1) operate on whole structures;
    2) keep functions and tests simple particularly when they have to be
used repeatedly.
    3) look for suitable built in fuctions

For example my original
hamming[1][lis_, tar_, dist_] :=
    Cases[lis, x_ /; Count[MapThread[SameQ, {target, x}], False] <= dist]

failed on 2),  with the complex test Count[MapThread[SameQ, {target, x}],
False] <= dist
--------------------------------

TIMINGS

testlist = Table[Random[Integer, {0, 2}], {10000}, {10}];
target = Table[Random[Integer, {0, 1}], {10}];
dist = 2;

TableForm[
  Table[{DeleteCases[testlist, 0, {2}] // Timing // First,
        testlist /. 0 -> Sequence[] // Timing // First}/Second, {10}],
  TableHeadings -> {None, { "DeleteCases", "True->Sequence[]"}}
  ]

   DeleteCases   True->Sequence[]
   0.77          0.94
   0.77          1.04
   0.71          0.93
   0.76          0.93
   0.71          0.93
   0.76          0.93
   0.77          0.99
   0.71          0.93
   0.77          0.94
   0.77          0.99

hamming[1.2][lis_, tar_, dist_] :=
lis[[Flatten[
Position[Length /@ DeleteCases[Inner[SameQ, lis, tar, List],True, {2}],
Alternatives @@ Range[0, dist]]]]]

hamming[6.1][lis_, tar_, dist_] :=
lis[[Flatten[
Position[Length /@ (Inner[SameQ, lis, tar, List] /. True -> Sequence[]),
Alternatives @@ Range[0, dist]]]]]


TableForm[Table[
    {hamming[1.2][testlist, target, dist] // Timing // First,
        hamming[6.1][testlist, target, dist]; // Timing // First
        }/Second,
    {10}], TableHeadings -> {None, {hamming[1.2], hamming[6.1]}}]

   hamming[1.2]   hamming[6.1]
   2.52           2.75
   2.53           2.74
   2.64           2.75
   2.52           2.8
   2.53           2.75
   2.52           2.8
   2.59           2.74
   2.53           2.8
   2.53           2.8
   2.58           3.95

Allan
---------------------
Allan Hayes
Mathematica Training and Consulting
Leicester UK
www.haystack.demon.co.uk
hay at haystack.demon.co.uk
Voice: +44 (0)116 271 4198
Fax: +44 (0)870 164 0565

Carl K.Woll <carlw at fermi.phys.washington.edu> wrote in message
news:7pipkh$39t$2 at dragonfly.wolfram.com...
> Allan,
>
> Very nice! Using Inner, I rewrote my original hamming[6]:
>
> hamming[6.1][lis_,tar_,dist_]:=
>   lis[[Flatten[
>           Position[Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[]),
>             Alternatives@@Range[0,dist]]]]]
>
> which on my test is still about 30% faster than hamming[1.1]. I think
>
> Length/@(Inner[SameQ,lis,tar,List]/.True->Sequence[])
>
> will be faster than any other attempt to count the number of False entries
using
>
> f/@(Inner[SameQ,lis,tar,List]
>
> (or almost equivalently Inner[SameQ,lis,tar,f]). The hardest thing here is
doing
> something to the False entries (counting) or to the True entries
(replacing).
> Counting False entries of each row must be done as many times as the
number of
> rows, but replacing True can be done once for the whole Table. This is why
the
> first method is approximately twice as fast.
>
> Carl Woll
> Physics Dept
> U of Washington
>
>
> Allan Hayes wrote:
>
> > Carl:
> > Another approach, using Inner:
> >
> > hamming[1.1][lis_, tar_, dist_] :=
> >   lis[[
> >       Flatten[Position[Inner[SameQ, lis, tar
> >             , (Count[{##}, False] <= dist) &], True]]]]
> >
> > This is considerbly faster on your test than my original hamming[1]:
> >
> > hamming[1][lis_, tar_, dist_] :=
> >   Cases[lis, x_ /; Count[MapThread[SameQ, {tar, x}], False] <= dist]
> >
> > testlist = Table[Random[Integer, {0, 2}], {10000}, {10}];
> > target = Table[Random[Integer, {0, 2}], {10}];
> > dist = 2;
> >
> > hamming[1.1][testlist, target, dist]; // Timing // First
> >
> > 3.73 Second
> >
> > hamming[1][testlist, target, dist]; // Timing // First
> >
> > 6.37 Second
> >
> > Allan
> > ---------------------
> > Allan Hayes
> > Mathematica Training and Consulting
> > Leicester UK
> > www.haystack.demon.co.uk
> > hay at haystack.demon.co.uk
> > Voice: +44 (0)116 271 4198
> > Fax: +44 (0)870 164 0565
> >
> > ----- Original Message -----
> > From: Carl K.Woll <carlw at fermi.phys.washington.edu>
To: mathgroup at smc.vnet.net
> > To: <kewjoi at hixnet.co.za>
> > Cc: Allan Hayes <hay at haystack.demon.co.uk>; David Park
<djmp at earthlink.net>;
> > Andrzej Kozlowski <andrzej at tuins.ac.jp>; Bob Hanlon <BobHanlon at aol.com>;
> > <mathgroup at smc.vnet.net>
> > Sent: 14 August 1999 11:06
> > Subject: [mg19375] Re: [mg19240] Element Extraction
> >
> > Below Kew complains about the lack of generality of the fastest solution
to
> > his problem. I should have given some words of
> > explanation about how the fastest solution worked.
> >
> > Basically, for my hamming[5] function, I took Allan Hayes solution, and
> > tried to speed it up. There were two things that I
> > did.
> >
> > 1. He used MapThread to thread SameQ over his two vectors. I temporarily
> > gave SameQ the attribute Listable to achieve the
> > same thing. This gave a 20-30% improvement in speed. I modified my
previous
> > hamming[5] function to do just this single
> > change.
> >
> > 2. Rather than counting the number of false entries and seeing if it was
> > less than some prescribed number, given the
> > shortness of the vector I instead hardcoded the close enough
possibilities
> > (alternatively, I could have hardcoded the far
> > enough away possibilities). This provided another 100% improvement in
speed
> > at the sacrifice of generality.
> >
> > For the hamming[6] function, I tried to attack the problem from a
different
> > angle. Since Position and Part are such
> > extremely quick functions, I massaged the whole testlist function
instead of
> > each element individually. I also used the two
> > things I noted above.
> >
> > If we want a general hamming function, then approach 2 above is
obviously
> > not the way to go. However, for the approach taken
> > in hamming[6], I was able to come up with a similar method, which was
still
> > general enough. In this approach I used the ever
> > useful Sequence[] function. At any rate, here is a revised version of my
> > previous notebook with the above changes using your
> > new syntax hamming[list,target,distance] (I also had to change David
Park's
> > solution a little bit more than the others to
> > make it work, and in the process speeded it up a bit).
> >
> > In[1]:=
> > (* Hayes *)
> > hamming[1][lis_,tar_,dist_]:=
> >   Cases[lis,x_/;Count[MapThread[SameQ,{target,x}],False]<=dist]
> >
> > (* Park revised *)
> >
hamming[2][lis_,tar_,dist_]:=Select[lis,Count[#-tar,0]>Length[tar]-dist-1&]
> >
> > (* Kozlowski *)
> > distance[a_, b_] := If[a === b, 0, 1]
> > SetAttributes[distance, Listable]
> >
> > hamming[3][lis_,tar_,dist_]:=
> >   Select[lis, Apply[Plus, distance[tar, #]] <= dist &]
> >
> > (* Hanlon *)
> > hammingDistance[x_?VectorQ, y_?VectorQ] /; Length[x] == Length[y] :=
> >     Plus @@ MapThread[If[#1 == #2, 0, 1] &, {x, y}];
> >
> > hamming[4][lis_,tar_,dist_]:=Select[lis, hammingDistance[tar, #] <
dist+1&]
> >
> > (* Woll #1 *)hamming[5][lis_,tar_,dist_]:=Module[{ans},
> >   SetAttributes[SameQ,{Listable}];
> >   ans=Cases[lis,x_/;Count[SameQ[x,tar],False]<=dist];
> >   ClearAttributes[SameQ,{Listable}];
> >   ans]
> >
> > (* Woll #2 *)
> > hamming[6][lis_,tar_,dist_]:=Module[{ans,t,len=Length[lis]},
> >   SetAttributes[SameQ,{Listable}];
> >   t=SameQ[testlist,Table[tar,{len}]];
> >   ans=lis[[
> >           Flatten[Position[Length/@(t/.True->Sequence[]),
> >               Alternatives@@Range[0,dist]]]]];
> >   ClearAttributes[SameQ,{Listable}];
> >   ans]
> > In[27]:=
> > target = Table[Random[Integer,{0,2}],{10}];
> > In[28]:=
> > testlist =
> >     Table[Random[Integer, {0, 2}], {10000},{10}];
> > In[29]:=
> > Do[Print[Timing[r[i]=hamming[i][testlist,target,2];]],{i,1,6}]
> >
> > r[1]==r[2]==r[3]==r[4]==r[5]==r[6]
> > {3.734 Second,Null}
> > {4.547 Second,Null}
> > {9.281 Second,Null}
> > {9.109 Second,Null}
> > {3.063 Second,Null}
> > {2.187 Second,Null}
> > Out[30]=
> > True
> >
> > Carl Woll
> > Dept of Physics
> > U of Washington
> >
> > Kew Joinery wrote:
> >
> > > Hello,
> > > I'd like to thank to all replies again.
> > > Only to add few words:
> > > To define universal Humming function it should be of the form :
> > > Hamming[someList , target , distance]
> > > Obviously all solutions from the different authors (excluding the
> > "fastest" solution
> > > of Mr.K.Woll )could be easy adapted to this form (they are at the
moment
> > of the form
> > > hamming[someList ,target] .
> > > What about the fastest solution? At the moment it detects only lists
with
> > elements of
> > > length=3 and distance <=1. Consider example when testing
> > >
> >
someList={{0,1,2,3,4,5,6,7,8,9},{0,0,1,2,3,4,5,6,7,8},{0,0,0,1,2,3,4,5,6,7},
> > {0,0,0,0,1,2,3,4,5,6}.{0,0,0,0,0,0,0,0,0,0}}
> > >
> > > According to Mr.Woll I should define all 10! "success[.]=0 " in
advance to
> > detect
> > > someList in Hamming Distance <=9 for example.I thing this solution is
> > inpractical .
> > > Regards,
> > > Eugene
> > >
> > > Carl K.Woll wrote:
> > >
> > > > Hi all,
> > > >
> > > > I thought I would give a comparison of the various solutions
provided,
> > along with
> > > > a couple of new ones by yours truly.
> > > >
> > > > (* Hayes *)
> > > > hamming[1][lis_,tar_]:=
> > > >   Cases[lis,x_/;Count[MapThread[SameQ,{target,x}],False]<=1]
> > > >
> > > > (* Park *)
> > > > hammtest[p : {_, _, _}, t : {_, _, _}] := Count[Abs[t - p], 0] > 1;
> > > >
> > > > hamming[2][lis_,tar_]:=Select[lis,hammtest[#,tar]&]
> > > >
> > > > (* Kozlowski *)
> > > > dist[a_, b_] := If[a === b, 0, 1]
> > > > SetAttributes[dist, Listable]
> > > >
> > > > hamming[3][lis_,tar_]:=Select[lis, Apply[Plus, dist[tar, #]] <= 1 &]
> > > >
> > > > (* Hanlon *)
> > > > hammingDistance[x_?VectorQ, y_?VectorQ] /; Length[x] == Length[y] :=
> > > >     Plus @@ MapThread[If[#1 == #2, 0, 1] &, {x, y}];
> > > >
> > > > hamming[4][lis_,tar_]:=Select[lis, hammingDistance[tar, #] < 2 &]
> > > >
> > > > (* Woll #1 *)
> > > > Clear[closeEnough];
> > > > closeEnough[{True,True,True}]=True;
> > > > closeEnough[{True,True,False}]=True;
> > > > closeEnough[{True,False,True}]=True;
> > > > closeEnough[{False,True,True}]=True;
> > > >
> > > > hamming[5][lis_,tar_]:=Module[{ans},
> > > >   SetAttributes[SameQ,{Listable}];
> > > >   ans=Select[lis,closeEnough[SameQ[#,tar]]&];
> > > >   ClearAttributes[SameQ,{Listable}];
> > > >   ans]
> > > >
> > > > (* Woll #2 *)
> > > > Clear[success]
> > > > success[True,True,True]=0;
> > > > success[True,True,False]=0;
> > > > success[True,False,True]=0;
> > > > success[False,True,True]=0;
> > > >
> > > > hamming[6][lis_,tar_]:=Module[{ans,t,len=Length[lis]},
> > > >   SetAttributes[SameQ,{Listable}];
> > > >   t=SameQ[testlist,Table[tar,{len}]];
> > > >   ans=lis[[Flatten[Position[Apply[success,t,1],0]]]];
> > > >   ClearAttributes[SameQ,{Listable}];
> > > >   ans]
> > > > In[22]:=
> > > > target = {0, 0, 1};
> > > > In[263]:=
> > > > testlist =
> > > >     Table[{Random[Integer, {0, 2}], Random[Integer, {0, 2}],
> > > >         Random[Integer, {0, 2}]}, {10000}];
> > > > In[381]:=
> > > > Do[Print[Timing[r[i]=hamming[i][testlist,target];]],{i,1,6}]
> > > >
> > > > r[1]==r[2]==r[3]==r[4]==r[5]==r[6]
> > > > {2.703 Second,Null}
> > > > {3.937 Second,Null}
> > > > {3.969 Second,Null}
> > > > {5.234 Second,Null}
> > > > {1.125 Second,Null}
> > > > {0.969 Second,Null}
> > > > Out[382]=
> > > > True
> > > >
> > > > Carl Woll
> > > > Dept of Physics
> > > > U of Washington
> > > >
> > > > Kew Joinery wrote:
> > > >
> > > > > Hello everyone,
> > > > > I have a problem extracting list of list.
> > > > > Given some list, consisting of elements (not all of them distinct)
> > with
> > > > > the same length and some target list of the same length, for
example:
> > > > > In[26]:=
> > > > >
> >
someList={{0,0,0},{0,0,1},{0,0,2},{0,1,0},{0,1,1},{0,1,2},{0,2,0},{0,2,1},{0
> > ,
> > > > >
> > > > >
> > 2,2},{1,0,0},{1,0,1},{1,0,2},{1,1,0},{1,1,1},{1,1,2},{1,2,0},{1,2,1},{1,
> > > > >
> > > > >
> > 2,2},{2,0,0},{2,0,1},{2,0,2},{2,1,0},{2,1,1},{2,1,2},{2,2,0},{2,2,1},{2,
> > > > >
> > > > >       2,2}};
> > > > >
> > > > > In[27]:=
> > > > > target={0,0,1};
> > > > >
> > > > > Hamming distance between two bit strings means the number of bit
> > > > > positions in which they differ. For example consecutive elements
of
> > the
> > > > > Gray code list have Hamming distance = 1.
> > > > > I need to extract these cases(elements of someList) which differ
from
> > > > > target in one coordinate (Hamming distance = 1) or less (the
> > > > > element===target itself//Hamming distance=0), so the answer should
be:
> > > > >
> > > > > In[33]:=
> > > > > answer={{0,0,0},{0,0,1},{0,0,2},{0,1,1},{0,2,1},{1,0,1},{2,0,1}};
> > > > >
> > > > > How could I achieve the extraction efficiently?
> > > > > Thank you in advance for any suggestions.
> > > > > Eugene
>
>
>
>





  • Prev by Date: Re: incompatibilities between releases of Mathematica (was: Mathematica Link for Excel and Excel 2000) Organization: Princeton University - CIT/IS/ASIG
  • Next by Date: multi dimensional mapping
  • Previous by thread: Re: Element Extraction
  • Next by thread: Re: Element Extraction