<?xml version="1.0" encoding="UTF-8"?>











<rss version="2.0" xmlns:jf="http://www.jivesoftware.com/xmlns/jiveforums/rss">



<channel>
    <title>Support Forums: Message List - Introduction to Text Indexing with Apache Jakarta Lucene</title>
    <link>http://www.theserverside.com</link>
    <description>Most recent forum messages</description>
    <language>en</language>
    
        <generator>Jive Forums Silver 5.5.30 (www.jivesoftware.com)</generator>
    
    <pubDate>Wed, 19 Jun 2013 01:42:19 -0400</pubDate>


    <item>

        <title>usage of lucene for the application with oracle db</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[I would like to know how Lucene is advantageous over the advanced features of Oracle 9i Text. <br>My application manages huge content and assets, up to 2 TB in size now, and the DB size is growing very fast. We plan to use Oracle9i interMedia to store assets (word,...]]></description>
        

        <pubDate>Wed, 24 Aug 2005 14:49:37 -0400</pubDate>

        

        <jf:creationDate>Wed, 24 Aug 2005 14:49:37 -0400</jf:creationDate>
        <jf:modificationDate>Wed, 24 Aug 2005 14:49:37 -0400</jf:modificationDate>
        <jf:date>Aug 24, 2005</jf:date>
        <jf:author>seema bhat</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Indexing Microsoft Word</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[You could use the Stellent Outside In Server, formerly the Inso Filters, which is pretty common (for instance, Yahoo Mail used them to let you view Office documents as HTML last time I checked). I can't comment on cost, however, so maybe there is a free...]]></description>
        

        <pubDate>Thu, 23 Jan 2003 02:52:42 -0500</pubDate>

        

        <jf:creationDate>Thu, 23 Jan 2003 02:52:42 -0500</jf:creationDate>
        <jf:modificationDate>Thu, 23 Jan 2003 02:52:42 -0500</jf:modificationDate>
        <jf:date>Jan 23, 2003</jf:date>
        <jf:author>Chad Williams</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Indexing Microsoft Word</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[There is no reason you couldn't. The hardest part would be parsing the document to extract the content you want to index.
<br>
<br>Ryan]]></description>
        

        <pubDate>Wed, 22 Jan 2003 22:40:55 -0500</pubDate>

        

        <jf:creationDate>Wed, 22 Jan 2003 22:40:55 -0500</jf:creationDate>
        <jf:modificationDate>Wed, 22 Jan 2003 22:40:55 -0500</jf:modificationDate>
        <jf:date>Jan 22, 2003</jf:date>
        <jf:author>Ryan Breidenbach</jf:author>
        <jf:replyCount>1</jf:replyCount>
    </item>


    <item>

        <title>Introduction to Text Indexing with Apache Jakarta Lucene</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[Does anyone know if Lucene provides any way to index Microsoft Office documents?]]></description>
        

        <pubDate>Wed, 22 Jan 2003 09:09:44 -0500</pubDate>

        

        <jf:creationDate>Wed, 22 Jan 2003 09:09:44 -0500</jf:creationDate>
        <jf:modificationDate>Wed, 22 Jan 2003 09:09:44 -0500</jf:modificationDate>
        <jf:date>Jan 22, 2003</jf:date>
        <jf:author>Sriraman Venkataraman</jf:author>
        <jf:replyCount>2</jf:replyCount>
    </item>


    <item>

        <title>LetterTokenizer</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[It is saying that the LetterTokenizer class that comes with Lucene does not tokenize some Asian languages well. Basically, this tokenizer assumes that tokens are represented by adjacent characters. If, however, there are adjacent characters that...]]></description>
        

        <pubDate>Tue, 21 Jan 2003 11:38:42 -0500</pubDate>

        

        <jf:creationDate>Tue, 21 Jan 2003 11:38:42 -0500</jf:creationDate>
        <jf:modificationDate>Tue, 21 Jan 2003 11:38:42 -0500</jf:modificationDate>
        <jf:date>Jan 21, 2003</jf:date>
        <jf:author>Ryan Breidenbach</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Lucene does support multibyte languages, but.....</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[From the documentation, it mentions........]]></description>
        

        <pubDate>Tue, 21 Jan 2003 07:06:59 -0500</pubDate>

        

        <jf:creationDate>Tue, 21 Jan 2003 07:06:59 -0500</jf:creationDate>
        <jf:modificationDate>Tue, 21 Jan 2003 07:06:59 -0500</jf:modificationDate>
        <jf:date>Jan 21, 2003</jf:date>
        <jf:author>Gen Ho</jf:author>
        <jf:replyCount>1</jf:replyCount>
    </item>


    <item>

        <title>Lucene does support multibyte languages</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[Lucene can support multibyte languages, but not &quot;out of the box&quot;. An important part of Lucene is breaking down documents/search terms into tokens that can be searched upon/with. Currently, the Tokenizers that come with Lucene work well with...]]></description>
        

        <pubDate>Mon, 20 Jan 2003 21:34:12 -0500</pubDate>

        

        <jf:creationDate>Mon, 20 Jan 2003 21:34:12 -0500</jf:creationDate>
        <jf:modificationDate>Mon, 20 Jan 2003 21:34:12 -0500</jf:modificationDate>
        <jf:date>Jan 20, 2003</jf:date>
        <jf:author>Ryan Breidenbach</jf:author>
        <jf:replyCount>2</jf:replyCount>
    </item>


    <item>

        <title>Doesn't Lucene support multi-byte languages?</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[Just wondering if Lucene supports indexing and searching of multi-byte languages, like Chinese? I looked at the documentation, but didn't find it mentioned anywhere.]]></description>
        

        <pubDate>Mon, 20 Jan 2003 17:27:37 -0500</pubDate>

        

        <jf:creationDate>Mon, 20 Jan 2003 17:27:37 -0500</jf:creationDate>
        <jf:modificationDate>Mon, 20 Jan 2003 17:27:37 -0500</jf:modificationDate>
        <jf:date>Jan 20, 2003</jf:date>
        <jf:author>Saul Q Yuan</jf:author>
        <jf:replyCount>3</jf:replyCount>
    </item>


    <item>

        <title>One of the few &quot;killer libraries&quot; around!</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[Lucene gets a 100%-approved ultra-cool library of the year award :-)! No kidding, it is exceptional, both from a programmer's perspective and in real-world production operations. I've used it a lot for search spaces of about 500k documents and it was...]]></description>
        

        <pubDate>Sun, 19 Jan 2003 07:39:28 -0500</pubDate>

        

        <jf:creationDate>Sun, 19 Jan 2003 07:39:28 -0500</jf:creationDate>
        <jf:modificationDate>Sun, 19 Jan 2003 07:39:28 -0500</jf:modificationDate>
        <jf:date>Jan 19, 2003</jf:date>
        <jf:author>Henrik Klagges</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Lucene review</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[I had to implement a keyword search feature for my company's software similar to any of the search boxes you see at the major ecommerce websites.  I decided to evaluate Lucene for that purpose and I couldn't be happier with the results.  It took me about...]]></description>
        

        <pubDate>Fri, 17 Jan 2003 16:46:23 -0500</pubDate>

        

        <jf:creationDate>Fri, 17 Jan 2003 16:46:23 -0500</jf:creationDate>
        <jf:modificationDate>Fri, 17 Jan 2003 16:46:23 -0500</jf:modificationDate>
        <jf:date>Jan 17, 2003</jf:date>
        <jf:author>Mike Perham</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Introduction to Text Indexing with Apache Jakarta Lucene</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=17462</link>

        

        
            <description><![CDATA[Lucene, part of Apache Jakarta, is a Java library that adds text indexing and searching capabilities to an application (and is commonly used to create searchable websites). As of November 2002, Lucene version 1.2 has been released, with version 1.3 in the...]]></description>
        

        <pubDate>Fri, 17 Jan 2003 13:09:29 -0500</pubDate>

        

        <jf:creationDate>Fri, 17 Jan 2003 13:09:29 -0500</jf:creationDate>
        <jf:modificationDate>Fri, 17 Jan 2003 13:09:29 -0500</jf:modificationDate>
        <jf:date>Jan 17, 2003</jf:date>
        <jf:author>Floyd Marinescu</jf:author>
        <jf:replyCount>10</jf:replyCount>
    </item>



</channel>
</rss>

