Recently I started playing with OpenStreetMap data in Spark. Here are the steps to load the data into Spark:
1. Convert the PBF data into Parquet format.
https://github.com/adrianulbona/osm-parquetizer
2. Read the data in Spark
spark.sqlContext.setConf("spark.sql.parquet.binaryAsString","true")
This ensures that tags are read as strings instead of binary objects.
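Putting the two steps together, here is a minimal sketch of reading one of the converted files in a standalone Scala program. The object name, the local master setting, and the file path are assumptions for illustration only; point the path at whatever Parquet file the converter produced for you. The setting is passed when the session is built, which should have the same effect as calling setConf afterwards.

```scala
import org.apache.spark.sql.SparkSession

object LoadOsmParquet {
  def main(args: Array[String]): Unit = {
    // Build a local session with binaryAsString enabled up front,
    // so Parquet binary columns are read as strings.
    val spark = SparkSession.builder()
      .appName("load-osm-parquet")
      .master("local[*]") // assumption: running locally for experimentation
      .config("spark.sql.parquet.binaryAsString", "true")
      .getOrCreate()

    // Hypothetical path: replace with the Parquet file produced in step 1.
    val nodesPath = args.headOption.getOrElse("data/example.osm.pbf.node.parquet")

    val nodes = spark.read.parquet(nodesPath)

    // Inspect the schema first rather than assuming column names.
    nodes.printSchema()
    nodes.show(5, truncate = false)

    spark.stop()
  }
}
```

Checking the output of printSchema is a quick way to confirm that the tag fields show up as string rather than binary.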