<?xml version="1.0" encoding="UTF-8"?>











<rss version="2.0" xmlns:jf="http://www.jivesoftware.com/xmlns/jiveforums/rss">



<channel>
    <title>Support Forums: Message List - Combining Google Language API and Lucene</title>
    <link>http://www.theserverside.com</link>
    <description>Most recent forum messages</description>
    <language>en</language>
    
        <generator>Jive Forums Silver 5.5.30 (www.jivesoftware.com)</generator>
    
    <pubDate>Sat, 25 May 2013 15:57:58 -0400</pubDate>


    <item>

        <title>Re: libraries for offline use</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Great stuff. One good thing about the Google API is the large number of languages it supports :)]]></description>
        

        <pubDate>Fri, 10 Jul 2009 13:17:27 -0400</pubDate>

        

        <jf:creationDate>Fri, 10 Jul 2009 13:17:27 -0400</jf:creationDate>
        <jf:modificationDate>Fri, 10 Jul 2009 13:17:27 -0400</jf:modificationDate>
        <jf:date>Jul 10, 2009</jf:date>
        <jf:author>Vinicius Carvalho</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Re: libraries for offline use</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Yes, building an n-gram model on a corpus and then comparing it to the n-gram frequency distribution of the sentence to be classified works extremely well for language identification, even on very short sentences....]]></description>
        

        <pubDate>Fri, 10 Jul 2009 04:20:19 -0400</pubDate>

        

        <jf:creationDate>Fri, 10 Jul 2009 04:20:19 -0400</jf:creationDate>
        <jf:modificationDate>Fri, 10 Jul 2009 04:20:19 -0400</jf:modificationDate>
        <jf:date>Jul 10, 2009</jf:date>
        <jf:author>Faizal Abdoelrahman</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>libraries for offline use</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Several libraries for language detection are available that do not require online access, e.g. this one: <a class="jive-link-external" href="http://www.jroller.com/melix/entry/nlp_in_java_a_language"...]]></description>
        

        <pubDate>Fri, 10 Jul 2009 03:12:59 -0400</pubDate>

        

        <jf:creationDate>Fri, 10 Jul 2009 03:12:59 -0400</jf:creationDate>
        <jf:modificationDate>Fri, 10 Jul 2009 03:12:59 -0400</jf:modificationDate>
        <jf:date>Jul 10, 2009</jf:date>
        <jf:author>Ulf Dittmer</jf:author>
        <jf:replyCount>2</jf:replyCount>
    </item>


    <item>

        <title>Re: Combining Google Language API and Lucene</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Cool stuff indeed.  I will definitely follow the progress on this!]]></description>
        

        <pubDate>Thu, 09 Jul 2009 08:03:49 -0400</pubDate>

        

        <jf:creationDate>Thu, 09 Jul 2009 08:03:49 -0400</jf:creationDate>
        <jf:modificationDate>Thu, 09 Jul 2009 08:03:49 -0400</jf:modificationDate>
        <jf:date>Jul 9, 2009</jf:date>
        <jf:author>Amin Mohammed-Coleman</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Re: Combining Google Language API and Lucene</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Yeah, being online is a must. I was thinking of using some sort of classifier for that, naive Bayes for instance. I may still implement it one day. A good thing about Google, though, is the large number of languages supported. I don't think I could find...]]></description>
        

        <pubDate>Wed, 08 Jul 2009 16:53:48 -0400</pubDate>

        

        <jf:creationDate>Wed, 08 Jul 2009 16:53:48 -0400</jf:creationDate>
        <jf:modificationDate>Wed, 08 Jul 2009 16:53:48 -0400</jf:modificationDate>
        <jf:date>Jul 8, 2009</jf:date>
        <jf:author>Vinicius Carvalho</jf:author>
        <jf:replyCount>0</jf:replyCount>
    </item>


    <item>

        <title>Re: Combining Google Language API and Lucene</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Good stuff,

But if you are offline... I developed a similar feature that used a neural network. It processed inputs built from n-gram fragments of any text in any language. Maybe Google works like this?]]></description>
        

        <pubDate>Wed, 08 Jul 2009 16:40:49 -0400</pubDate>

        

        <jf:creationDate>Wed, 08 Jul 2009 16:40:49 -0400</jf:creationDate>
        <jf:modificationDate>Wed, 08 Jul 2009 16:40:49 -0400</jf:modificationDate>
        <jf:date>Jul 8, 2009</jf:date>
        <jf:author>Vachon Ulrich</jf:author>
        <jf:replyCount>1</jf:replyCount>
    </item>


    <item>

        <title>Combining Google Language API and Lucene</title>
        <link>http://www.theserverside.com/discussions/thread.tss?thread_id=55136</link>

        

        
            <description><![CDATA[Lucene is one of the most widely used IR frameworks around. But to work properly, its documents must be indexed and analyzed correctly. Choosing the right Analyzer implementation can be the difference between a good and a bad index....]]></description>
        

        <pubDate>Wed, 08 Jul 2009 10:42:06 -0400</pubDate>

        

        <jf:creationDate>Wed, 08 Jul 2009 10:42:06 -0400</jf:creationDate>
        <jf:modificationDate>Wed, 08 Jul 2009 10:42:06 -0400</jf:modificationDate>
        <jf:date>Jul 8, 2009</jf:date>
        <jf:author>Vinicius Carvalho</jf:author>
        <jf:replyCount>6</jf:replyCount>
    </item>



</channel>
</rss>

