Java - Spark Hive and DataNucleus
In a Java Spark (& Spring) project, I used a sparkHiveContext and got an initial error, ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory,
when doing:
    // sparkHiveContext = new JavaHiveContext(sparkContext);
    // JavaRDD<MyClass> myRdd = ...
    JavaSchemaRDD schema = sparkHiveContext.applySchema(myRdd, MyClass.class);
    schema.registerTempTable("temptable");
    sparkHiveContext.sql("create table mytable as select * from temptable");
So I added the datanucleus-core, datanucleus-api-jdo, and datanucleus-rdbms Maven dependencies, all at version 3.2.1.
But then I got the error ...NoSuchMethodError: org.datanucleus.flushordered.
The strange thing is that I do find this class in datanucleus-core-3.2.1.jar in the generated WAR's WEB-INF/lib, and in no other jar of the WAR.
Does anyone have an idea how this can happen?
Details:
- It is a Maven project
- Spark 1.1.1 (with provided scope)
- I include $SPARK_HOME/lib/spark-assembly-1.1.1-hadoop2.4.0.jar in the servlet container
- I use the Maven Jetty plugin to run it (i.e. as the servlet container)
- It worked before I started using spark-hive
- I don't have Hive installed; I was told it wasn't necessary
- I use Spark Hive to manage an SQL interface to HDFS files, because Spark SQL (1.1.1) alone is not enough
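When a class is present in a jar but a NoSuchMethodError still appears, one way to investigate is to ask the JVM which jar the class was actually loaded from. The following is only a minimal sketch: the class name `WhichJar` and the helper `locationOf` are made up for illustration, and the default class name is a JDK class so that the example runs without DataNucleus on the classpath.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the JDK's bootstrap loader have no code source.
        if (source == null || source.getLocation() == null) {
            return "<bootstrap or unknown>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the fully qualified name of the suspicious class as the first argument;
        // the default is a JDK class so the sketch runs anywhere.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> cls = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(cls));
    }
}
```

Calling `locationOf` from inside the webapp (for example from a Spring bean) rather than from a standalone `main` would show what the servlet container's class loader resolves, which may differ from what is in WEB-INF/lib.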
Well, stupid me: I used version 3.2.1 for the datanucleus dependencies, while the datanucleus-core provided by Spark is 3.2.2 :-\
Anyway, along the way I made this simple prototype of a Spring webapp using spark-hive, in case anyone is interested.
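A conflict like this, where two versions of the same class are visible at once, can usually be confirmed by listing every classpath location that provides the class file. Below is a rough sketch using the standard `ClassLoader.getResources` call; the class `ClasspathCopies` and the helper `copiesOf` are hypothetical names, and the default class name is a JDK class so the example runs on its own.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClasspathCopies {

    /** Lists every location visible to the loader that provides the .class file for className. */
    static List<URL> copiesOf(String className, ClassLoader loader) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws IOException {
        // Pass the fully qualified class name to check; the default is a JDK class.
        String name = args.length > 0 ? args[0] : "java.lang.String";

        // In a webapp the context class loader normally sees both the container's
        // jars and the WAR's WEB-INF/lib; fall back to the system loader otherwise.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }

        List<URL> copies = copiesOf(name, loader);
        for (URL url : copies) {
            System.out.println(url);
        }
        if (copies.size() > 1) {
            System.out.println("Warning: " + name + " is provided by " + copies.size() + " locations");
        }
    }
}
```

If more than one location is printed for a DataNucleus class, the versions behind those locations are worth comparing before changing any Maven dependency.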