java - Parallelize a collection with Spark


I'm trying to parallelize a collection with Spark, and the example in the documentation doesn't seem to work:

    List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
    JavaRDD<Integer> distData = sc.parallelize(data);

I'm creating a list of LabeledPoints from records, each of which contains data points (double[]) and a label (defaulted: true/false).

    public List<LabeledPoint> createLabeledPoints(List<ESRecord> records) {
        List<LabeledPoint> points = new ArrayList<>();
        for (ESRecord rec : records) {
            points.add(new LabeledPoint(
                    rec.defaulted ? 1.0 : 0.0, Vectors.dense(rec.toDataPoints())));
        }
        return points;
    }

    public void test(List<ESRecord> records) {
        SparkConf conf = new SparkConf().setAppName("SVM Classifier Example");
        SparkContext sc = new SparkContext(conf);
        List<LabeledPoint> points = createLabeledPoints(records);
        JavaRDD<LabeledPoint> data = sc.parallelize(points);
        ...
    }

The signature of parallelize no longer takes a single parameter. Here is how it looks in spark-mllib_2.11 v1.3.0: sc.parallelize(seq, numSlices, evidence$1)

So, any ideas on how to get this working?

In Java, you should use JavaSparkContext.

https://spark.apache.org/docs/0.6.2/api/core/spark/api/java/javasparkcontext.html
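For context, SparkContext is the Scala API: its parallelize takes a Scala Seq, a slice count, and an implicit ClassTag, which is what shows up as evidence$1 when called from Java. JavaSparkContext wraps it and accepts a java.util.List directly. Below is a minimal sketch of the documentation example using JavaSparkContext. The class name ParallelizeExample, the helper roundTrip, and the local[*] master are assumptions made for illustration and are not part of the original question.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used only for this illustration.
class ParallelizeExample {

    // Distributes a local list as an RDD, then collects it back to the driver.
    static List<Integer> roundTrip(List<Integer> data) {
        SparkConf conf = new SparkConf()
                .setAppName("parallelize example")
                .setMaster("local[*]"); // assumption: run locally, no cluster needed
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            // The Java-friendly overload takes a java.util.List directly.
            JavaRDD<Integer> distData = sc.parallelize(data);
            return distData.collect();
        } finally {
            sc.stop(); // release the context even if the job fails
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList(1, 2, 3, 4, 5)));
    }
}
```

The same change should apply to the test method in the question: create a JavaSparkContext instead of a SparkContext, and sc.parallelize(points) then returns a JavaRDD<LabeledPoint>. If a SparkContext already exists, new JavaSparkContext(sparkContext) wraps it rather than creating a second one.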

