Apache Spark with NodeJS? Yes, it's possible with EclairJS!

Post on 20-Mar-2017

DEVFEST NANTES 16

Bruno Bonnin - @_bruno_b_

Apache Spark with NodeJS? Yes, it's possible with EclairJS!

About me

Bruno Bonnin - @_bruno_b_

Architect / Developer

http://webdemo.myscript.com/

Java, Scala, Python, R

JavaScript

EclairJS

[Diagram: EclairJS architecture]

App: NodeJS, EclairJS NodeJS

Spark Driver (JVM): Spark libs, Script Engine, EclairJS Nashorn

Cluster Manager

Spark Workers (JVM): Spark libs, Script Engine, EclairJS Nashorn

EclairJS: component implementation

EclairJS: Spark Core API

Scala:

    val lines = sc.textFile("dream.txt")
    val words = lines
      .flatMap(line => line.split(" "))
      .filter(word => word.trim.length > 0)
    val counts = words
      // Scala RDDs use map here; mapToPair is the name used by the Java API
      .map(word => (word, 1))
      .reduceByKey(_ + _)

JavaScript (EclairJS):

    var lines = sc.textFile("dream.txt");
    var words = lines
      .flatMap(function (line) {
        return line.split(" ");
      })
      .filter(function (word) {
        return word.trim().length > 0;
      });
    var counts = words
      // Tuple2 is passed explicitly as an extra argument to the function
      .mapToPair(function (word, Tuple2) {
        return new Tuple2(word, 1);
      }, [Tuple2])
      .reduceByKey(function (a, b) { return a + b; });
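To make the data flow of the word count above concrete without a Spark cluster, here is a minimal sketch using plain JavaScript arrays. It is not EclairJS code: `wordCount` is a hypothetical helper, and the array methods only imitate, on a single machine, what flatMap, filter, mapToPair and reduceByKey compute on an RDD.

```javascript
// Local, single-process imitation of the Spark word count pipeline.
// wordCount is a hypothetical helper, not part of EclairJS or Spark.
function wordCount(lines) {
  var words = lines
    // flatMap: each line yields several words, flattened into one list
    .reduce(function (acc, line) { return acc.concat(line.split(' ')); }, [])
    // filter: drop empty tokens produced by repeated spaces
    .filter(function (word) { return word.trim().length > 0; });

  // mapToPair: turn each word into a (word, 1) pair
  var pairs = words.map(function (word) { return [word, 1]; });

  // reduceByKey: sum the values of all pairs sharing the same key
  var counts = Object.create(null); // no inherited keys such as "constructor"
  pairs.forEach(function (pair) {
    counts[pair[0]] = (counts[pair[0]] || 0) + pair[1];
  });
  return counts;
}

console.log(wordCount(['I have a dream', 'a dream  today']));
```

On a real RDD the same steps are distributed across the workers, and reduceByKey combines partial sums per partition before merging them.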

EclairJS: Spark SQL API

Scala:

    import org.apache.spark.sql.functions.col

    val df = sqlCtx.read.json("people.json")
    df.printSchema()
    df.select("name", "age")
      .filter(col("age").gt(30))
      .show()
    df.registerTempTable("people")
    val sqlDF = sqlCtx
      .sql("SELECT * FROM people")
    sqlDF.show()

JavaScript (EclairJS):

    var df = sqlCtx.read().json('people.json');
    df.printSchema();
    df.select('name', 'age')
      .filter(col('age').gt(30))
      .show();
    df.registerTempTable('people');
    var sqlDF = sqlCtx
      .sql('SELECT * FROM people');
    sqlDF.show();

EclairJS: code in NodeJS

    app.get('/words', (req, res) => {
      var lines = sc.textFile('dream.txt').cache();
      var something = 0;
      var words = lines
        // ES5 function expression instead of an arrow function (see below)
        .flatMap(function (line) {
          something += 1;
          // console is not known to Nashorn (see below)
          console.log('flatmap:' + line);
          return line.split(' ');
        })
        .collect()
        .then(results => { res.json({result: results}); })
        .catch(err => { res.status(500).send(err); });
    });

The business logic (the function passed to flatMap) is interpreted in Nashorn, not in NodeJS. Nashorn:

● Does not know console
● Is ES5 compliant, so the arrow function `line => { ... }` has to be written as `function (line) { ... }`
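One way to picture why the function passed to flatMap cannot rely on its NodeJS surroundings: assume (this is an assumption about the mechanism, not something the slides spell out) that the function is shipped to the JVM as source text and re-evaluated there. The sketch below imitates that in plain NodeJS with `Function.prototype.toString` and `new Function`. `shipAndRun` and `demo` are hypothetical helpers, not EclairJS APIs. Variables captured from the enclosing scope, such as `something`, do not travel with the source text, which is consistent with the Spark Core example passing `Tuple2` explicitly as an extra argument.

```javascript
// shipAndRun is a hypothetical stand-in for "send the lambda elsewhere":
// it keeps only the function's source text, then rebuilds and calls it.
function shipAndRun(fn, arg, bindArgs) {
  var source = fn.toString(); // only the text survives, not the closure
  var rebuilt = new Function('return (' + source + ');')();
  return rebuilt.apply(null, [arg].concat(bindArgs || []));
}

function demo() {
  var something = 0; // local variable, like the one in the route handler
  var outcome = {};

  try {
    shipAndRun(function (line) {
      something += 1; // not defined where the rebuilt function runs
      return line.split(' ');
    }, 'I have a dream');
  } catch (err) {
    outcome.error = err.name;
  }

  // Values the function needs are passed explicitly as extra arguments.
  outcome.words = shipAndRun(function (line, separator) {
    return line.split(separator);
  }, 'I have a dream', [' ']);

  return outcome;
}

console.log(demo());
```

The first call fails because the rebuilt function no longer sees `something`. The second one works because everything it uses arrives through its parameters.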

EclairJS: deployment

[Diagram: deployment]

Application: EclairJS Node API

Jupyter Notebook, Gateway

Apache Toree (Spark Driver): EclairJS Nashorn API

EclairJS Shell: EclairJS Nashorn API

Cluster Manager

Spark Workers: EclairJS Nashorn API

Demo

[Diagram: demo setup. Labels: Client, Spark, Application, EclairJS Node API, EclairJS Nashorn API, Spark Workers, EclairJS Nashorn API]

Conclusion

Easy to learn (especially for those who already know the Scala/Java API)

Ideal for JS developers who may be put off by Scala :-)

Implementation in progress: the latest version supports Spark 2.0

Further reading:

● https://eclairjs.github.io/
● https://github.com/EclairJS/eclairjs

Thank you! @_bruno_b_

https://github.com/bbonnin/devfestnantes2016
