Day – 3 -> Assignments / Tutorial for ( CVR Engg. College )

December 14, 2016

Start VM

  • Start VirtualBox, choose your VM, and click the Start button.

[Screenshot]

  • Log in as hduser.
    • Click the top-right corner and choose Hadoop User.
    • Enter the password: abcd1234

[Screenshot]

  • Click the Ubuntu button at the top left, search for the terminal, and click it.

[Screenshot]

  • You should see a terminal window similar to the one below.

[Screenshot]

Start all the services

  • cd /home/hduser
  • Set JAVA_HOME
    • export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
  • Start Elasticsearch
    • /home/hduser/elasticsearch-5.0.2/bin/elasticsearch > /home/hduser/elastic123.log 2>&1 &
    • Check: open Firefox and go to http://localhost:9200
  • Start Kibana
    • /home/hduser/kibana-5.0.2-linux-x86_64/bin/kibana > /home/hduser/kibana123.log 2>&1 &
    • Check: open Firefox and go to http://localhost:5601
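
When both output streams of a background service should end up in one log file, the order of the redirections matters. The sketch below illustrates this with a hypothetical stand-in function, noisy, and temporary files; neither is part of the VM setup.

```shell
# Hypothetical stand-in for a service: one line to stdout, one line to stderr.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

log_dir=$(mktemp -d)

# stdout is pointed at the file first, then stderr is duplicated onto it:
# both lines land in the log.
noisy > "$log_dir/both.log" 2>&1

# Reversed order: stderr is duplicated onto the current stdout (the terminal)
# before stdout is moved to the file, so only the stdout line reaches the log.
noisy 2>&1 > "$log_dir/stdout_only.log"

both_lines=$(wc -l < "$log_dir/both.log" | tr -d ' ')
stdout_only_lines=$(wc -l < "$log_dir/stdout_only.log" | tr -d ' ')
echo "both.log: $both_lines lines, stdout_only.log: $stdout_only_lines lines"
```

If a service fails to start and its log file is empty, checking the redirection order is a reasonable first step.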

Reference to Unix Commands –

  • http://www.thegeekstuff.com/2010/11/50-linux-commands/

Elasticsearch

  • Run all the commands given below on the terminal which you have opened in the previous step.

Terminology:

Elasticsearch terminology compared with traditional relational database terminology:

MySQL (RDBMS) terminology    Elasticsearch terminology
Database                     Index
Table                        Type
Row                          Document


RESTful methods used

  • HTTP methods used: GET, POST, PUT, DELETE
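
The terminology maps directly onto the URL paths used in the exercises: an index, then a type, then a document id. As a rough sketch, the hypothetical helper es_url below (it and the ES_HOST variable are illustrative names, not part of Elasticsearch) joins these parts into a REST path.

```shell
# Base address of the local node used throughout this tutorial.
ES_HOST="http://localhost:9200"

# es_url INDEX [TYPE [ID]]: hypothetical helper that joins the parts into a REST path.
es_url() {
  url="$ES_HOST"
  for part in "$@"; do
    url="$url/$part"
  done
  printf '%s\n' "$url"
}

index_url=$(es_url company)
doc_url=$(es_url company employee 2)
echo "$index_url"
echo "$doc_url"
```

The HTTP method then selects the operation on that path. For example, curl -XGET "$(es_url company employee 2)" would read the document with id 2, and the same URL with -XDELETE would remove it.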

 

Exercises and Solutions

 


To start, open a terminal on your Ubuntu box.

  • Check the status of Elasticsearch from the command line.

i/p

curl http://localhost:9200

o/p

Expected output on the screen:

{
  "name" : "MRPWrOy",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "-YtQj9REQjaCbROg0Nc74w",
  "version" : {
    "number" : "5.0.2",
    "build_hash" : "f6b4951",
    "build_date" : "2016-11-24T10:07:18.101Z",
    "build_snapshot" : false,
    "lucene_version" : "6.2.1"
  },
  "tagline" : "You Know, for Search"
}
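
If the status check needs to be scripted, individual fields can be pulled out of the pretty-printed response. The following is only a sketch: json_string_field is a hypothetical helper, and it relies on each key sitting on its own line, so it is a line-oriented shortcut and not a real JSON parser. The sample response is a trimmed copy of the expected output.

```shell
# Sample response (trimmed to the fields used here).
response='{
  "name" : "MRPWrOy",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "5.0.2",
    "lucene_version" : "6.2.1"
  },
  "tagline" : "You Know, for Search"
}'

# json_string_field NAME: print the first string value stored under key NAME.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

es_version=$(printf '%s\n' "$response" | json_string_field number)
cluster=$(printf '%s\n' "$response" | json_string_field cluster_name)
echo "cluster=$cluster version=$es_version"
```

Against a running node, the same helper could be fed from curl, for example: curl -s http://localhost:9200 | json_string_field number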

  • Create an index named “company”

i/p

curl -XPUT http://localhost:9200/company

o/p

{"acknowledged":true,"shards_acknowledged":true}

  • Create another index named “govtcompany”

i/p

curl -XPUT http://localhost:9200/govtcompany

o/p

{"acknowledged":true,"shards_acknowledged":true}

 

  • Get the list of indices created so far

i/p

curl -XGET 'http://localhost:9200/_cat/indices?pretty'

o/p

yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb
yellow open govtcompany    6Wp2J3AxRoa9eCl82jPL2A 5 1      0 0    650b    650b
yellow open company        HB89IjvdSVez_In5nqSzWQ 5 1      0 0    650b    650b
yellow open employee       Xt3nJVgFRiWs3kwKEhY6XQ 5 1      0 0    650b    650b
yellow open .kibana        6Wl9qr8DSLm-g__2oQbX-w 1 1     14 0  34.6kb  34.6kb

 

  • Delete an index and check the indices again

i/p

curl -XDELETE http://localhost:9200/govtcompany

curl -XGET 'http://localhost:9200/_cat/indices?pretty'

o/p (of the second command)

yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb
yellow open company        HB89IjvdSVez_In5nqSzWQ 5 1      0 0    650b    650b
yellow open employee       Xt3nJVgFRiWs3kwKEhY6XQ 5 1      0 0    650b    650b
yellow open .kibana        6Wl9qr8DSLm-g__2oQbX-w 1 1     14 0  34.6kb  34.6kb
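
The listing is plain whitespace-separated text, so it can be filtered with standard tools. A minimal sketch, assuming the default Elasticsearch 5.x column order noted in the comment and using rows copied from the listing as sample input:

```shell
# Sample rows copied from the index listing.
cat_output='yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb
yellow open company        HB89IjvdSVez_In5nqSzWQ 5 1      0 0    650b    650b
yellow open employee       Xt3nJVgFRiWs3kwKEhY6XQ 5 1      0 0    650b    650b
yellow open .kibana        6Wl9qr8DSLm-g__2oQbX-w 1 1     14 0  34.6kb  34.6kb'

# Columns: health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
# Print "index docs.count" for every index that holds at least one document.
non_empty=$(printf '%s\n' "$cat_output" | awk '$7 > 0 { print $3, $7 }')
echo "$non_empty"
```

Adding ?v to the _cat/indices URL makes Elasticsearch print the column header row itself, which helps when reading the raw output.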

 

  • Status of the index "company"

i/p

curl -XGET 'http://localhost:9200/company?pretty'

o/p

{
  "company" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1481891504208",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "WNW9bNGRTdqbXVWFhBcYRg",
        "version" : {
          "created" : "5000299"
        },
        "provided_name" : "company"
      }
    }
  }
}

 

 

  • Delete the index "company" as well

i/p

curl -XDELETE http://localhost:9200/company

o/p

{"acknowledged":true}

 

 

  • Create the index with specified data types

i/p

curl -XPUT http://localhost:9200/company -d '{
  "mappings": {
    "employee": {
      "properties": {
        "age": {
          "type": "long"
        },
        "experience": {
          "type": "long"
        },
        "name": {
          "type": "text",
          "analyzer": "standard"
        }
      }
    }
  }
}'

o/p

{"acknowledged":true,"shards_acknowledged":true}

  • Status of the index "company"

i/p

curl -XGET 'http://localhost:9200/company?pretty'

o/p

{
  "company" : {
    "aliases" : { },
    "mappings" : {
      "employee" : {
        "properties" : {
          "age" : {
            "type" : "long"
          },
          "experience" : {
            "type" : "long"
          },
          "name" : {
            "type" : "text",
            "analyzer" : "standard"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1481744753984",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "Wqsco08iROCqiwRr7eXnwA",
        "version" : {
          "created" : "5000299"
        },
        "provided_name" : "company"
      }
    }
  }
}

  • Data insertion

i/p

curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Amar Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"AVj-48T3Vl1JB8XO-oqO","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

i/p

curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Sriknaht Kandi",
  "age" : 35,
  "experience" : 7
}'

o/p

{"_index":"company","_type":"employee","_id":"AVj-5CrJVl1JB8XO-oqP","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

i/p

curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Abdul Malik",
  "age" : 25,
  "experience" : 3
}'

o/p

{"_index":"company","_type":"employee","_id":"AVj-5HiXVl1JB8XO-oqQ","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
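
The three insertions differ only in their field values, so the request body can be generated. The sketch below uses a hypothetical helper, employee_json, which is not part of Elasticsearch or curl. It does no escaping, so it is only suitable for names without quotes or backslashes.

```shell
# employee_json NAME AGE EXPERIENCE: hypothetical helper that prints a document body
# with the same three fields used in the insertions above.
employee_json() {
  printf '{"name": "%s", "age": %d, "experience": %d}\n' "$1" "$2" "$3"
}

body=$(employee_json "Amar Sharma" 45 10)
echo "$body"
```

With Elasticsearch running, the body could then be posted the same way as before, for example: curl -XPOST http://localhost:9200/company/employee -d "$(employee_json 'Abdul Malik' 25 3)"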

  • Retrieve data

The sample output below contains a fourth document ("Andrew") that is not among the three insertions above; it was presumably indexed the same way beforehand.

i/p

curl -XGET 'http://localhost:9200/company/employee/_search?pretty'

o/p

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}

 

  • Conditional search: the query goes in the request body. A match_all query matches every document, so the hits are the same as above.

i/p

curl -XPOST 'http://localhost:9200/company/employee/_search?pretty' -d '{
  "query": {
    "match_all": {}
  }
}'

o/p

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}

  • Another variant: the same search as a GET request without a body

i/p

curl -XGET 'http://localhost:9200/company/employee/_search?pretty'

o/p

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}

  • Fetch all employees with a particular name

i/p

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query": {
    "match": {
      "name": "Amar Sharma"
    }
  }
}'

o/p

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.51623213,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 0.51623213,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}

  • Employees with age greater than a number

i/p

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query": {
    "range": {
      "age": { "gt": 35 }
    }
  }
}'

o/p

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}

 

  • Fetch data with multiple conditions

i/p

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query": {
    "bool": {
      "must":   { "match": { "name": "Andrew" } },
      "should": { "range": { "age": { "gte": 35 } } }
    }
  }
}'

o/p

{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.287682,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.287682,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
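
In a bool query, a document has to satisfy every must clause to be returned. When a must clause is present, should clauses are optional and only raise the score of documents that also match them. Longer query bodies like this one are easier to handle from a file than inline on the command line. The sketch below saves the body with a heredoc; query_file and run_query are hypothetical names chosen for illustration.

```shell
query_file=$(mktemp)

# The quoted heredoc delimiter ('EOF') keeps the shell from touching the JSON.
cat > "$query_file" <<'EOF'
{
  "query": {
    "bool": {
      "must":   { "match": { "name": "Andrew" } },
      "should": { "range": { "age": { "gte": 35 } } }
    }
  }
}
EOF

# Hypothetical helper: send the saved body to the search endpoint.
# curl reads the request body from a file when -d is given @filename.
run_query() {
  curl -XGET 'http://localhost:9200/_search?pretty' -d @"$query_file"
}

must_count=$(grep -c '"must"' "$query_file")
echo "query saved to $query_file ($must_count must clause)"
```

Calling run_query with Elasticsearch running should return the same hit as the inline command.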

 

  • Create a record in Elasticsearch with a specific id (here it is "2")

i/p

curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar3 Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"2","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

 

  • Update the record you just created (note the version number in the o/p)

i/p

curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar4 Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"2","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}

  • Update the same record again (note the version number in the o/p again)

i/p

curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar5 Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"2","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}
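
Each PUT to the same id replaces the whole document and increases _version by one. To follow this from a script, the version number can be extracted from the compact response. This is a minimal sketch with a hypothetical helper, doc_version, run against trimmed copies of the responses above.

```shell
# Responses copied from the create and first update steps (trimmed).
created='{"_index":"company","_type":"employee","_id":"2","_version":1,"result":"created"}'
updated='{"_index":"company","_type":"employee","_id":"2","_version":2,"result":"updated"}'

# doc_version: read one JSON response on stdin and print the number after "_version".
doc_version() {
  sed -n 's/.*"_version":\([0-9][0-9]*\).*/\1/p'
}

v1=$(printf '%s\n' "$created" | doc_version)
v2=$(printf '%s\n' "$updated" | doc_version)
echo "version went from $v1 to $v2"
```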

 

PDF of the Elasticsearch assignment: http://woir.in/wp-content/uploads/2016/12/AssignmentCVR-ES-watermark.pdf

Logstash

Run the following commands from a terminal, logged in as hduser:

  • Log in as hduser
  • cd /home/hduser
  • Set JAVA_HOME
    • export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
  • Download the data to be inserted into Elasticsearch
    • Sample input file to be used with Logstash: table / larger file: table-3
    • wget -O /home/hduser/Downloads/table-3.csv http://woir.in/wp-content/uploads/2016/12/table-3.csv
  • Download the config file
    • Config file to be used with Logstash: stock-logstash-conf_1
    • wget -O /home/hduser/Downloads/stock-logstash.conf_1.csv http://woir.in/wp-content/uploads/2016/12/stock-logstash.conf_1.csv
  • Point Logstash at the config file and run it; it will insert the data into Elasticsearch
    • /home/hduser/logstash-5.0.2/bin/logstash -f /home/hduser/Downloads/stock-logstash.conf_1.csv
  • Check whether the data was inserted
    • curl -XGET 'http://localhost:9200/stock/_search?pretty'

      • You should see o/p similar to the following:

        {
          "took" : 5,
          "timed_out" : false,
          "_shards" : {
            "total" : 5,
            "successful" : 5,
            "failed" : 0
          },
          "hits" : {
            "total" : 22,
            "max_score" : 1.0,
            "hits" : [
              {
                "_index" : "stock",
                "_type" : "logs",
                "_id" : "AVkBgrZ2Vl1JB8XO-oqh",
                "_score" : 1.0,
                "_source" : {

                  "High" : "High", ………………

        ………………………………..

Kibana

[Screenshot]

  • Configure the index in Kibana (click Management).

[Screenshot]

  • Click Add New, fill in stock, and choose the Date field.

[Screenshot]

  • Mark it as the default index by clicking the star.

[Screenshot]

  • Now it is time to create a visualization: click the tab on the right side.

[Screenshot]

  • Choose stock (if more than one index is available).

[Screenshot]

  • Choose Line Chart for this example.

[Screenshot]

  • Fill in the fields as shown below.

[Screenshot]

  • Fill in the fields as shown below.

[Screenshot]

  • Fill in the fields as shown below.

[Screenshot]

  • Fill in the fields as shown below.

[Screenshot]

  • Click the small caret at the top-right corner.

[Screenshot]

  • Choose the last two years as the time interval.

[Screenshot]

  • Apply your changes again.

[Screenshot]

  • Click the Save button and give the chart a name (Line Chart in this case).

[Screenshot]

  • Prepare another chart yourself (Bar Chart).

[Screenshot] [Screenshot]

  • The visualizations are ready and need to be added to a dashboard: click Dashboard on the left side.

[Screenshot]

  • Click New and then the Add button, then choose the visualizations you have just created.

[Screenshot]

NYC Motor Vehicle Collision

  • Open the ready-made report to play around with the dashboard.
  • [Screenshot]