Solr 5.1 DIH

Recently I had to use the Data Import Handler (DIH) to index data from a PostgreSQL database. I ran into a few issues along the way, so I'm blogging the steps I followed and the issues I faced.


Setting up DataImportHandler
  • Edit your solrconfig.xml to add the request handler

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">data-config.xml</str>
      </lst>
    </requestHandler>
  • Create a data-config.xml file as follows and save it in the conf directory

    <dataConfig>
      <dataSource type="JdbcDataSource" 
                  driver="org.postgresql.Driver"
                  url="jdbc:postgresql://host:port/dbname" 
                  user="username" 
                  password="password"/>
      <document>
        <entity name="col_id" 
                query="select * from report_ks">
        </entity>
      </document>
    </dataConfig>
  • Add the table's columns as fields in schema.xml
  • Put the RDBMS JDBC driver (in my case the PostgreSQL driver) in the $SOLR_HOME/dist folder and point to it in solrconfig.xml

    <lib dir="${solr.install.dir:../../../..}/dist/" regex="postgresql.*\.jar" />
  • Add the DataImportHandler jars in solrconfig.xml

    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
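
To make the schema.xml step above more concrete, here is a minimal sketch of what the field definitions might look like. The column names (id, title, created_at) are hypothetical, since the post does not list the columns of report_ks, and the field types are assumed to exist in your schema, as they do in the sample configsets shipped with Solr 5.x. The second fragment shows an optional explicit column-to-field mapping in data-config.xml; the entity name "report" is also an assumption.

```xml
<!-- schema.xml: one field per column returned by the query.
     Column names (id, title, created_at) are hypothetical. -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="created_at" type="tdate" indexed="true" stored="true"/>

<!-- data-config.xml: optional explicit mapping, placed inside <document>.
     Only needed when a column name differs from the Solr field name. -->
<entity name="report" query="select id, title, created_at from report_ks">
  <field column="id" name="id"/>
  <field column="title" name="title"/>
  <field column="created_at" name="created_at"/>
</entity>
```

Once the fields are in place and the core has been reloaded, a full import can be started by requesting the handler with command=full-import, for example http://localhost:8983/solr/<core>/dataimport?command=full-import (host, port and core name here are placeholders).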
