Posts

Showing posts from 2015

Common Issues with Solr Data Import Handler (DIH)

1. Could not load driver: org.postgresql.Driver

org.apache.solr.common.SolrException; Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Could not load driver: org.postgresql.Driver

Solution: Put the RDBMS driver JAR (in my case, the PostgreSQL driver) in the $SOLR_HOME/dist folder and point to it in solrconfig.xml:

<lib dir="${solr.install.dir:../../../..}/dist/" regex="postgresql.*\.jar" />

2. ERROR StreamingSolrClients

org.apache.solr.common.SolrException: Bad Request request: http://host:7574/solr/collection_shard2_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fhost%3A8983%2Fsolr%2Fcollection_shard1_replica2%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurr
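For issue 1 above, a quick way to confirm that the driver JAR is actually where the lib directive expects it is to apply the same pattern to the file names in the dist folder. The following is a minimal sketch, not part of the original post: the function name matching_jars and the example path are hypothetical, and it assumes Solr applies the regex to the bare file name as a full match.

```python
import re
from pathlib import Path

# Pattern copied from the <lib ... regex="..."/> directive in the post.
LIB_REGEX = r"postgresql.*\.jar"


def matching_jars(dist_dir, regex=LIB_REGEX):
    """Return the sorted file names in dist_dir that the regex fully matches.

    Assumption: the regex is matched against the file name only,
    not against the full path.
    """
    pattern = re.compile(regex)
    directory = Path(dist_dir)
    if not directory.is_dir():
        # A missing directory means nothing can be loaded from it.
        return []
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and pattern.fullmatch(entry.name)
    )


if __name__ == "__main__":
    # Hypothetical install location; replace with your own dist folder.
    print(matching_jars("/opt/solr/dist"))
```

An empty list would suggest that the JAR is missing from the folder or that its name does not match the pattern, which is one plausible cause of the "Could not load driver" error.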

solr 5.1 DIH

Recently I had to use the Data Import Handler to index data from a PostgreSQL database. Unfortunately, I ran into a few issues, so I'm blogging the steps and the issues I faced.

Setting up DataImportHandler

Edit your solrconfig.xml to add the request handler:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Create a data-config.xml file as follows and save it to the conf dir:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver" url="jdbc:postgresql://host:port/dbname" user="username" password="password"/>
  <document>
    <entity name="col_id" query="select * from report_ks">
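Once a handler is registered at /dataimport as in the excerpt above, an import is normally started with an HTTP request to that path. The sketch below is an illustration rather than the author's own procedure: the helper names dataimport_url and run_dataimport, the base URL, and the core name are all assumptions. It relies only on the standard DIH command parameter (for example full-import or status).

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dataimport_url(base_url, core, command="full-import", **params):
    """Build the URL for a DIH command on the given core or collection.

    Extra keyword arguments (for example clean="true") are passed
    through as query parameters.
    """
    query = urlencode({"command": command, "wt": "json", **params})
    return f"{base_url.rstrip('/')}/{core}/dataimport?{query}"


def run_dataimport(base_url, core, command="full-import", **params):
    """Send the command to Solr and return the raw response body."""
    url = dataimport_url(base_url, core, command, **params)
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical host and core name; adjust to your installation.
    print(dataimport_url("http://localhost:8983/solr", "collection1"))
```

Calling run_dataimport with command="status" afterwards is a simple way to check whether the import is still running or has failed.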

rWordCloud - An htmlwidget interface for D3 word cloud

With htmlwidgets, it has become easy to bind D3 scripts to R. rWordCloud is one such package.

To install rWordCloud:

require(devtools)
install_github('adymimos/rWordCloud')

The two main functions in rWordCloud are:

d3TextCloud - this function takes strings as input and performs a word count. Before counting, it applies stemming and stop-word removal.

content <- c('R is a programming language and software environment for statistical computing and graphics open source',
             'The R language is widely used among statisticians and data miners for developing statistical software and data analysis',
             'Polls, surveys of data miners,and studies of scholarly literature databases show that R popularity has increased substantially in recent years',
             'languages programming study open source, analysis')
label <- c('a1','a2','a3','a4')
d3TextCloud(content = content, label = label)

d3Cloud - Function accepts word and it