<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Lucene made my app embarrassingly fast</title>
	<atom:link href="http://madbean.com/2004/mb2004-7/feed/" rel="self" type="application/rss+xml" />
	<link>http://madbean.com/2004/mb2004-7/</link>
	<description>Your zero step program</description>
	<pubDate>Sat, 22 Nov 2008 07:31:08 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
		<item>
		<title>By: Sam Terrell</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-309</link>
		<dc:creator>Sam Terrell</dc:creator>
		<pubDate>Sun, 25 Jul 2004 16:08:41 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-309</guid>
		<description>&lt;p&gt;I learned that Lucene is fast the hard way.  I implemented my own file-based B+Tree indexing, dictionary, and inverted index.  I paid attention to every disk seek.  I thought, &#34;I can get more performance on the indexing if I break it up into memory-sized chunks, flush each chunk to disk, then start a new file, and merge the files once so many have accumulated.&#34;  Yeah, a bright one I was, beginning to catch on to this whole indexing thing.  Then on some site I found a link to a Java-based indexer hosted on Apache.  I was surprised, because I had looked everywhere for one before, and only found C-coded ones that had false positives because they used that weird table method instead of an inverted index.  Lucene is indeed the fastest indexing engine I've found to date.  I've had gigabyte-sized Lucene indexes, and the network is still the bottleneck.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>I learned that Lucene is fast the hard way.  I implemented my own file-based B+Tree indexing, dictionary, and inverted index.  I paid attention to every disk seek.  I thought, &quot;I can get more performance on the indexing if I break it up into memory-sized chunks, flush each chunk to disk, then start a new file, and merge the files once so many have accumulated.&quot;  Yeah, a bright one I was, beginning to catch on to this whole indexing thing.  Then on some site I found a link to a Java-based indexer hosted on Apache.  I was surprised, because I had looked everywhere for one before, and only found C-coded ones that had false positives because they used that weird table method instead of an inverted index.  Lucene is indeed the fastest indexing engine I&#8217;ve found to date.  I&#8217;ve had gigabyte-sized Lucene indexes, and the network is still the bottleneck.</p>
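<p>The chunk-and-merge idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: in-memory dicts stand in for the files on disk, the class name and both parameters are hypothetical, and none of it is taken from Lucene's actual code.</p>

```python
from collections import defaultdict


class ChunkedIndexer:
    """Toy inverted index built in bounded chunks (all names are hypothetical)."""

    def __init__(self, max_postings=4, merge_factor=3):
        self.max_postings = max_postings   # postings buffered before a flush
        self.merge_factor = merge_factor   # segments allowed before a merge
        self.buffer = defaultdict(list)    # term -> doc ids, not yet flushed
        self.buffered = 0
        self.segments = []                 # each dict stands in for one file on disk

    def add(self, doc_id, text):
        """Index one document, flushing when the buffer is full."""
        for term in sorted(set(text.lower().split())):
            self.buffer[term].append(doc_id)
            self.buffered += 1
        if self.buffered >= self.max_postings:
            self.flush()

    def flush(self):
        """Freeze the buffer into a new segment; merge if too many pile up."""
        if not self.buffer:
            return
        self.segments.append(dict(self.buffer))
        self.buffer = defaultdict(list)
        self.buffered = 0
        if len(self.segments) >= self.merge_factor:
            self.merge()

    def merge(self):
        """Combine every segment into one, keeping doc ids sorted and unique."""
        merged = defaultdict(set)
        for segment in self.segments:
            for term, doc_ids in segment.items():
                merged[term].update(doc_ids)
        self.segments = [{term: sorted(ids) for term, ids in merged.items()}]

    def search(self, term):
        """Look the term up in the unflushed buffer and in every segment."""
        term = term.lower()
        hits = set(self.buffer.get(term, []))
        for segment in self.segments:
            hits.update(segment.get(term, []))
        return sorted(hits)
```

<p>An on-disk version would write each segment to its own file and merge by streaming through them; the dicts here only keep the sketch short.</p>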
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Quail</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-308</link>
		<dc:creator>Matt Quail</dc:creator>
		<pubDate>Thu, 22 Apr 2004 00:11:00 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-308</guid>
		<description>&lt;p&gt;Kevin,&lt;/p&gt;

&lt;p&gt;I &lt;em&gt;do&lt;/em&gt; indeed know Python (in fact, this site is generated by a Jython script I hacked together). I can even point out a bug in your code: the constructor should contain the line &#34;self.index={}&#34;, not &#34;index={}&#34; :P&lt;/p&gt;

&lt;p&gt;I do have a preference for doing code examples in Java since that is the main theme of my blog, and I can assume that the reader will be more familiar with Java than something else. Having said that, I'm a big fan of Python/Jython from way back.&lt;/p&gt;

&lt;p&gt;=Matt&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Kevin,</p>
<p>I <em>do</em> indeed know Python (in fact, this site is generated by a Jython script I hacked together). I can even point out a bug in your code: the constructor should contain the line &quot;self.index={}&quot;, not &quot;index={}&quot; :P</p>
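<p>A small sketch of why that one line matters; the class names Broken and Fixed and the helper try_lookup are made up for this example. A bare assignment inside the constructor only binds a local name, so the instance never gets an index attribute.</p>

```python
class Broken:
    def __init__(self):
        index = {}          # local name only; discarded when __init__ returns

    def lookup(self, key):
        return self.index.setdefault(key, [])


class Fixed:
    def __init__(self):
        self.index = {}     # attribute stored on the instance

    def lookup(self, key):
        return self.index.setdefault(key, [])


def try_lookup(obj):
    """Return the lookup result, or the exception name if the lookup fails."""
    try:
        return obj.lookup("first")
    except AttributeError as exc:
        return type(exc).__name__
```

<p>Here try_lookup( Broken() ) gives the string AttributeError, while try_lookup( Fixed() ) gives an empty list.</p>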
<p>I do have a preference for doing code examples in Java since that is the main theme of my blog, and I can assume that the reader will be more familiar with Java than something else. Having said that, I&#8217;m a big fan of Python/Jython from way back.</p>
<p>=Matt</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin J. Butler</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-307</link>
		<dc:creator>Kevin J. Butler</dc:creator>
		<pubDate>Wed, 21 Apr 2004 16:17:18 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-307</guid>
		<description>&lt;p&gt;Matt, nice descriptions - I really need to get a round tuit for Lucene...&lt;/p&gt;

&lt;p&gt;You should get to know Python/Jython, especially if you think in &#38; explain things in code.  It is much clearer and more concise than &#34;Java as pseudocode&#34;.&lt;/p&gt;

&lt;p&gt;I translated the FastCustomerIndex example into Python.&lt;/p&gt;

&lt;p&gt;A few things about Python illustrated below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;__init__&lt;/code&gt; defines a class constructor&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;[a,b,c] defines a list&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;{key: value, key1:value1 } defines a dictionary (like a map)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;you must explicitly pass the class instance into methods (&#34;self&#34;, corresponding to Java's &#34;this&#34;), and must use it explicitly to access fields/methods&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;triple-quoted strings can span lines:
&#34;&#34;&#34;line 1
line 2&#34;&#34;&#34;
A string as the first part of a method body becomes the method's
documentation string - an attribute of the method that is available
at runtime for inspection (Reflection)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I didn't return immutable lists - I found the code clearer without that.  The easiest way in Python
is just to copy the list, but if size is a concern, you could wrap it in an immutable sequence class as well...&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class FastCustomerIndex:
    def __init__( self ):
        &#34;&#34;&#34;
        Construct FastCustomerIndex.
        index is a dict that maps field name (&#34;first&#34; or &#34;last&#34;) to a dict
        that maps values (first names or last names) to a list of Customers
        with that name/value (i.e., that have that first/last name).

        Example:  Given j = Customer( &#34;George Bush&#34; )
          {
            &#34;first&#34; : { &#34;George&#34;: [ Customer( &#34;George Bush&#34; ) ] },
            &#34;last&#34; : { &#34;Bush&#34;: [ Customer( &#34;George Bush&#34; ) ] }
          }
        &#34;&#34;&#34;
        index = {}

    def getFieldList( self, fieldName, fieldValue ):
        &#34;&#34;&#34;Returns list of customers&#34;&#34;&#34;
        fieldMap = self.index.setdefault( fieldName, {} )
        l = fieldMap.setdefault( fieldValue, [] )
        return l

    def addCustomer( self, customer ):
        &#34;&#34;&#34;Adds customer to first and last name indices&#34;&#34;&#34;
        self.getFieldList( &#34;first&#34;, customer.firstName ).append( customer )
        self.getFieldList( &#34;last&#34;, customer.lastName ).append( customer )

    def findByFirstName( self, firstName ):
        &#34;&#34;&#34;Returns the list of customers matching firstName&#34;&#34;&#34;
        return self.getFieldList( &#34;first&#34;, firstName )

    def findByLastName( self, lastName ):
        &#34;&#34;&#34;Returns the list of customers matching lastName&#34;&#34;&#34;
        return self.getFieldList( &#34;last&#34;, lastName )
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
		<content:encoded><![CDATA[<p>Matt, nice descriptions - I really need to get a round tuit for Lucene&#8230;</p>
<p>You should get to know Python/Jython, especially if you think in &amp; explain things in code.  It is much clearer and more concise than &quot;Java as pseudocode&quot;.</p>
<p>I translated the FastCustomerIndex example into Python.</p>
<p>A few things about Python illustrated below:</p>
<ul>
<li>
<p><code>__init__</code> defines a class constructor</p>
</li>
<li>
<p>[a,b,c] defines a list</p>
</li>
<li>
<p>{key: value, key1:value1 } defines a dictionary (like a map)</p>
</li>
<li>
<p>you must explicitly pass the class instance into methods (&quot;self&quot;, corresponding to Java&#8217;s &quot;this&quot;), and must use it explicitly to access fields/methods</p>
</li>
<li>
<p>triple-quoted strings can span lines:<br />
&quot;&quot;&quot;line 1<br />
line 2&quot;&quot;&quot;<br />
A string as the first part of a method body becomes the method&#8217;s<br />
documentation string - an attribute of the method that is available<br />
at runtime for inspection (Reflection)</p>
</li>
<li>
<p>I didn&#8217;t return immutable lists - I found the code clearer without that.  The easiest way in Python
is just to copy the list, but if size is a concern, you could wrap it in an immutable sequence class as well&#8230;</p>
<pre><code>class FastCustomerIndex:
    def __init__( self ):
        &quot;&quot;&quot;
        Construct FastCustomerIndex.
        index is a dict that maps field name (&quot;first&quot; or &quot;last&quot;) to a dict
        that maps values (first names or last names) to a list of Customers
        with that name/value (i.e., that have that first/last name).

        Example:  Given j = Customer( &quot;George Bush&quot; )
          {
            &quot;first&quot; : { &quot;George&quot;: [ Customer( &quot;George Bush&quot; ) ] },
            &quot;last&quot; : { &quot;Bush&quot;: [ Customer( &quot;George Bush&quot; ) ] }
          }
        &quot;&quot;&quot;
        index = {}

    def getFieldList( self, fieldName, fieldValue ):
        &quot;&quot;&quot;Returns list of customers&quot;&quot;&quot;
        fieldMap = self.index.setdefault( fieldName, {} )
        l = fieldMap.setdefault( fieldValue, [] )
        return l

    def addCustomer( self, customer ):
        &quot;&quot;&quot;Adds customer to first and last name indices&quot;&quot;&quot;
        self.getFieldList( &quot;first&quot;, customer.firstName ).append( customer )
        self.getFieldList( &quot;last&quot;, customer.lastName ).append( customer )

    def findByFirstName( self, firstName ):
        &quot;&quot;&quot;Returns the list of customers matching firstName&quot;&quot;&quot;
        return self.getFieldList( &quot;first&quot;, firstName )

    def findByLastName( self, lastName ):
        &quot;&quot;&quot;Returns the list of customers matching lastName&quot;&quot;&quot;
        return self.getFieldList( &quot;last&quot;, lastName )
</code></pre>
</li>
</ul>
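<p>The copy-versus-wrapper choice mentioned in the last bullet might look something like this. ReadOnlyList is a hypothetical helper, not part of the translated class, and plain strings stand in for Customer objects.</p>

```python
from collections.abc import Sequence


class ReadOnlyList(Sequence):
    """Read-only view over a list; no copy is made (hypothetical helper)."""

    def __init__(self, items):
        self._items = items

    def __getitem__(self, i):
        return self._items[i]

    def __len__(self):
        return len(self._items)


customers = ["George Bush"]
snapshot = list(customers)        # copy: later changes to customers are not seen
view = ReadOnlyList(customers)    # wrapper: shares storage, offers no write methods
customers.append("Jane Doe")
```

<p>After the append, snapshot still holds one entry while view reports two. The wrapper has no append method, and assigning to view[0] raises TypeError, so callers cannot alter the index through it.</p>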
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Quail</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-306</link>
		<dc:creator>Matt Quail</dc:creator>
		<pubDate>Wed, 24 Mar 2004 22:50:15 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-306</guid>
		<description>&lt;p&gt;Gary,&lt;/p&gt;

&lt;p&gt;You bet! Most things in FishEye are powered by EyeQL on the inside (which is in turn powered by Lucene; go Lucene!). Some of these pages will have a &#34;see associated EyeQL&#34; kind of link.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Gary,</p>
<p>You bet! Most things in FishEye are powered by EyeQL on the inside (which is in turn powered by Lucene; go Lucene!). Some of these pages will have a &quot;see associated EyeQL&quot; kind of link.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Otis</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-305</link>
		<dc:creator>Otis</dc:creator>
		<pubDate>Wed, 24 Mar 2004 22:41:16 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-305</guid>
		<description>&lt;p&gt;Lucene comments -- :)
FishEye -- :))
Me like: http://jroller.com/page/otis/20040324#fisheye_candy&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Lucene comments &#8212; :)<br />
FishEye &#8212; :))<br />
Me like: <a href="http://jroller.com/page/otis/20040324#fisheye_candy" rel="nofollow">http://jroller.com/page/otis/20040324#fisheye_candy</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gary</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-304</link>
		<dc:creator>Gary</dc:creator>
		<pubDate>Wed, 24 Mar 2004 22:37:58 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-304</guid>
		<description>&lt;p&gt;Will there be any way to look at the EyeQL that gets generated by using a wizard or GUI form to create a query? That would be a nice way to learn the language by example.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Will there be any way to look at the EyeQL that gets generated by using a wizard or GUI form to create a query? That would be a nice way to learn the language by example.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Quail</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-303</link>
		<dc:creator>Matt Quail</dc:creator>
		<pubDate>Wed, 24 Mar 2004 21:27:24 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-303</guid>
		<description>&lt;p&gt;Gary,&lt;/p&gt;

&lt;p&gt;Good point. No, no one has to learn EyeQL to make full use of FishEye. The final version will have all sorts of GUI goodness to help you get the data and reports you need (wizards, canned reports, etc.).&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Gary,</p>
<p>Good point. No, no one has to learn EyeQL to make full use of FishEye. The final version will have all sorts of GUI goodness to help you get the data and reports you need (wizards, canned reports, etc.).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gary</title>
		<link>http://madbean.com/2004/mb2004-7/#comment-302</link>
		<dc:creator>Gary</dc:creator>
		<pubDate>Wed, 24 Mar 2004 18:13:55 +0000</pubDate>
		<guid isPermaLink="false">http://madbean.com/blog/2004-7#comment-302</guid>
		<description>&lt;p&gt;Matt: Your custom query language is &#34;SQL-like&#34;, and from the looks of it would be very easy to learn for someone used to writing SQL. But do you think people will be willing to learn a new language, even one as familiar as this one, just to make full use of your product? It almost seems a bit like overkill to me.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Matt: Your custom query language is &quot;SQL-like&quot;, and from the looks of it would be very easy to learn for someone used to writing SQL. But do you think people will be willing to learn a new language, even one as familiar as this one, just to make full use of your product? It almost seems a bit like overkill to me.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
