Day 3: Assignments / Tutorial for CVR Engg. College
Start VM
- Start VirtualBox, choose your VM, and click the Start button.
- Log in as hduser:
- Click on the top right corner and choose Hadoop User.
- Enter the password: abcd1234
- Click on the Ubuntu button at the top left, search for "terminal", and click on it.
- A terminal window should open.
Start all the services:
- cd /home/hduser
- Set JAVA_HOME:
- export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
- Start Elasticsearch:
- /home/hduser/elasticsearch-5.0.2/bin/elasticsearch > /home/hduser/elastic123.log 2>&1 &
- Check: open Firefox and go to http://localhost:9200
- Start Kibana:
- /home/hduser/kibana-5.0.2-linux-x86_64/bin/kibana > /home/hduser/kibana123.log 2>&1 &
- Check: open Firefox and go to http://localhost:5601
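The start commands send the service output to a log file, and the order of the redirections matters. With `2>&1 > file`, error messages still go to the terminal; with `> file 2>&1`, both streams land in the file. A minimal sketch of the difference, using a hypothetical `noisy` function in place of a real service and temporary files in place of the real logs:

```shell
#!/bin/sh
# Hypothetical stand-in for a service that writes to both stdout and stderr.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

log_a=$(mktemp)
log_b=$(mktemp)

# Order A: stderr is pointed at the current stdout first,
# then only stdout is redirected to the file.
noisy 2>&1 > "$log_a"

# Order B: stdout is redirected to the file first, then stderr follows it.
noisy > "$log_b" 2>&1

echo "A captured $(wc -l < "$log_a" | tr -d ' ') line(s)"
echo "B captured $(wc -l < "$log_b" | tr -d ' ') line(s)"
```

Order B is the one to use if the log file should contain startup errors as well.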
Reference for Unix commands:
- http://www.thegeekstuff.com/2010/11/50-linux-commands/
Elasticsearch
- Run all the commands given below in the terminal that you opened in the previous step.
Terminology:
Table comparing Elasticsearch terminology with traditional relational database terminology:
MySQL (RDBMS) terminology | Elasticsearch terminology
Database | Index
Table | Type
Row | Document
RESTful methods used
- HTTP methods used: GET, POST, PUT, DELETE
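Every exercise below combines one of these HTTP methods with a path of the form index/type/id. The following sketch only prints the corresponding curl commands, so it can be tried without a running server. The helper name `es_cmd` is made up for illustration; the host and paths are the ones used in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the curl command for a REST call.
ES_HOST="http://localhost:9200"   # address used throughout this tutorial

es_cmd() {
  # $1 = HTTP method, $2 = path such as index/type/id
  printf "curl -X%s '%s/%s'\n" "$1" "$ES_HOST" "$2"
}

es_cmd PUT    company                    # create an index
es_cmd POST   company/employee           # add a document with a generated id
es_cmd GET    company/employee/_search   # read documents
es_cmd DELETE company                    # delete an index
```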
Exercises and Solutions
To start, open a terminal on your Ubuntu box.
- Check the status of Elasticsearch from the command line.
i/p
curl -XGET http://localhost:9200
o/p
Expected output on the screen:
{
  "name" : "MRPWrOy",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "-YtQj9REQjaCbROg0Nc74w",
  "version" : {
    "number" : "5.0.2",
    "build_hash" : "f6b4951",
    "build_date" : "2016-11-24T10:07:18.101Z",
    "build_snapshot" : false,
    "lucene_version" : "6.2.1"
  },
  "tagline" : "You Know, for Search"
}
- Create an index named "company"
i/p
curl -XPUT http://localhost:9200/company
o/p
{"acknowledged":true,"shards_acknowledged":true}
- Create another index named "govtcompany"
i/p
curl -XPUT http://localhost:9200/govtcompany
o/p
{"acknowledged":true,"shards_acknowledged":true}
- Get the list of indices created so far:
i/p
curl -XGET http://localhost:9200/_cat/indices?pretty
o/p
yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb
yellow open govtcompany 6Wp2J3AxRoa9eCl82jPL2A 5 1 0 0 650b 650b
yellow open company HB89IjvdSVez_In5nqSzWQ 5 1 0 0 650b 650b
yellow open employee Xt3nJVgFRiWs3kwKEhY6XQ 5 1 0 0 650b 650b
yellow open .kibana 6Wl9qr8DSLm-g__2oQbX-w 1 1 14 0 34.6kb 34.6kb
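The listing has no header row. In Elasticsearch 5.x the default columns are health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size and pri.store.size; adding `?v` to the `_cat` URL prints these names. A yellow health value means the primary shards are allocated but at least one replica is not, which is normal on a single-node setup like this VM. As a rough sketch, the rows can be reduced to the interesting columns with awk (the function name `summarize_indices` is made up for illustration):

```shell
#!/bin/sh
# Sample rows copied from the output above; in practice, pipe curl's output in instead.
sample='yellow open govtcompany 6Wp2J3AxRoa9eCl82jPL2A 5 1 0 0 650b 650b
yellow open company HB89IjvdSVez_In5nqSzWQ 5 1 0 0 650b 650b
yellow open .kibana 6Wl9qr8DSLm-g__2oQbX-w 1 1 14 0 34.6kb 34.6kb'

# Columns: health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
summarize_indices() {
  awk '{ printf "%s health=%s docs=%s size=%s\n", $3, $1, $7, $9 }'
}

printf '%s\n' "$sample" | summarize_indices
```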
- Delete an index and check the indices again:
i/p
curl -XDELETE http://localhost:9200/govtcompany
curl -XGET http://localhost:9200/_cat/indices?pretty
o/p
yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb
yellow open company HB89IjvdSVez_In5nqSzWQ 5 1 0 0 650b 650b
yellow open employee Xt3nJVgFRiWs3kwKEhY6XQ 5 1 0 0 650b 650b
yellow open .kibana 6Wl9qr8DSLm-g__2oQbX-w 1 1 14 0 34.6kb 34.6kb
- Status of index 'company'
i/p
curl -XGET http://localhost:9200/company?pretty
o/p
{
  "company" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1481891504208",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "WNW9bNGRTdqbXVWFhBcYRg",
        "version" : {
          "created" : "5000299"
        },
        "provided_name" : "company"
      }
    }
  }
}
- Delete the company index as well
i/p
curl -XDELETE http://localhost:9200/company
o/p
{"acknowledged":true}
- Create an index with specified data types:
i/p
curl -XPUT http://localhost:9200/company -d '{
  "mappings": {
    "employee": {
      "properties": {
        "age": {
          "type": "long"
        },
        "experience": {
          "type": "long"
        },
        "name": {
          "type": "text",
          "analyzer": "standard"
        }
      }
    }
  }
}'
o/p
{"acknowledged":true,"shards_acknowledged":true}
- Status of index 'company'
i/p
curl -XGET http://localhost:9200/company?pretty
o/p
{
  "company" : {
    "aliases" : { },
    "mappings" : {
      "employee" : {
        "properties" : {
          "age" : {
            "type" : "long"
          },
          "experience" : {
            "type" : "long"
          },
          "name" : {
            "type" : "text",
            "analyzer" : "standard"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1481744753984",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "Wqsco08iROCqiwRr7eXnwA",
        "version" : {
          "created" : "5000299"
        },
        "provided_name" : "company"
      }
    }
  }
}
- Data insertion
i/p
curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Amar Sharma",
  "age" : 45,
  "experience" : 10
}'
o/p
{"_index":"company","_type":"employee","_id":"AVj-48T3Vl1JB8XO-oqO","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
i/p
curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Sriknaht Kandi",
  "age" : 35,
  "experience" : 7
}'
o/p
{"_index":"company","_type":"employee","_id":"AVj-5CrJVl1JB8XO-oqP","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
i/p
curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Abdul Malik",
  "age" : 25,
  "experience" : 3
}'
o/p
{"_index":"company","_type":"employee","_id":"AVj-5HiXVl1JB8XO-oqQ","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
- Retrieve data
i/p
curl -XGET http://localhost:9200/company/employee/_search?pretty
o/p
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
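When only one field of each hit is of interest, the pretty-printed response can be filtered on the command line. The sketch below is deliberately simple: it depends on the `?pretty` layout shown above (one field per line) and is not a general JSON parser; a dedicated tool such as jq would be more robust if it is installed. The function name `extract_names` is made up for illustration.

```shell
#!/bin/sh
# A few lines in the shape of the pretty-printed response above.
response='        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }'

# Print only the value of each "name" field.
extract_names() {
  sed -n 's/.*"name" : "\([^"]*\)".*/\1/p'
}

printf '%s\n' "$response" | extract_names
```

Against the running server, the same filter could be applied as `curl -s 'http://localhost:9200/company/employee/_search?pretty' | extract_names`.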
- Conditional search:
i/p
curl -XPOST http://localhost:9200/company/employee/_search?pretty -d '{
  "query": {
    "match_all": {}
  }
}'
o/p
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
- Another variant (the same search sent with GET and no request body)
i/p
curl -XGET http://localhost:9200/company/employee/_search?pretty
o/p
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
- Fetch all employees with a particular name
i/p
curl -XGET http://localhost:9200/_search?pretty -d '{
  "query": {
    "match": {
      "name": "Amar Sharma"
    }
  }
}'
o/p
{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.51623213,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 0.51623213,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
- Employees with age greater than a number:
i/p
curl -XGET http://localhost:9200/_search?pretty -d '
{
  "query": {
    "range": {
      "age": {"gt": 35 }
    }
  }
}'
o/p
{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
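The range query accepts the operators gt, gte, lt and lte. To try different bounds without retyping the JSON each time, the request body can be generated by a small function. This is only a sketch: the helper name `range_query` is made up, and the curl line is left as a comment because it needs the running Elasticsearch from this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: build a range query body for a numeric field.
# $1 = field name, $2 = operator (gt, gte, lt, lte), $3 = numeric bound
range_query() {
  printf '{"query":{"range":{"%s":{"%s":%s}}}}\n' "$1" "$2" "$3"
}

body=$(range_query age gt 35)
echo "$body"

# To send it, scoped to the company index rather than all indices:
# curl -XGET 'http://localhost:9200/company/employee/_search?pretty' -d "$body"
# (Elasticsearch 6.0 and later also require: -H 'Content-Type: application/json')
```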
- Fetch data with multiple conditions
i/p
curl -XGET http://localhost:9200/_search?pretty -d '{
  "query": { "bool": {
    "must": { "match": {"name": "Andrew" }},
    "should": { "range": {"age": { "gte": 35 }}}
  }
}}'
o/p
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.287682,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.287682,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
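In a bool query that has a `must` clause, a `should` clause is optional by default: it raises the score of documents that match it but does not exclude those that do not. If the age condition is meant to be mandatory as well, one option is to put it in a `filter` clause. The sketch below only prints such a request body; the helper name `strict_bool_query` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: require BOTH a name match and a minimum age.
# $1 = name to match, $2 = minimum age (inclusive)
strict_bool_query() {
  cat <<EOF
{
  "query": {
    "bool": {
      "must":   { "match": { "name": "$1" } },
      "filter": { "range": { "age": { "gte": $2 } } }
    }
  }
}
EOF
}

strict_bool_query Andrew 35
```

The printed body can be passed to curl with `-d`, in the same way as the queries above.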
- Create a record in Elasticsearch with a specific id (here it is '2')
i/p
curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar3 Sharma",
  "age" : 45,
  "experience" : 10
}'
o/p
{"_index":"company","_type":"employee","_id":"2","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
- Update the record which you created (note the version number in the o/p):
i/p
curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar4 Sharma",
  "age" : 45,
  "experience" : 10
}'
o/p
{"_index":"company","_type":"employee","_id":"2","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}
- Update the record which you created again (note the version number in the o/p again)
i/p
curl -XPUT http://localhost:9200/company/employee/2 -d '{
  "name": "Amar5 Sharma",
  "age" : 45,
  "experience" : 10
}'
o/p
{"_index":"company","_type":"employee","_id":"2","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}
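Each PUT to the same id replaces the whole document and increases `_version` by one, while `result` changes from created to updated. To make that easier to see, the two fields can be pulled out of the compact response. A minimal sketch using sed on responses copied from above (the function name `version_and_result` is made up, and the pattern assumes the field order shown in this output):

```shell
#!/bin/sh
# Responses copied from the first two PUT calls above.
first='{"_index":"company","_type":"employee","_id":"2","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}'
second='{"_index":"company","_type":"employee","_id":"2","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}'

# Print "<version> <result>" from a compact index response.
version_and_result() {
  printf '%s\n' "$1" | sed 's/.*"_version":\([0-9]*\),"result":"\([a-z]*\)".*/\1 \2/'
}

version_and_result "$first"
version_and_result "$second"
```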
Logstash
Run the following commands from a terminal, logged in as hduser:
- Log in as hduser
- cd /home/hduser
- Set JAVA_HOME:
- export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
- Download the data to be inserted into Elasticsearch
- Download the conf file
- Config file to be used with Logstash: stock-logstash-conf_1
- wget -O /home/hduser/Downloads/stock-logstash.conf_1.csv http://woir.in/wp-content/uploads/2016/12/stock-logstash.conf_1.csv
- Point Logstash at the config file and run it; it will insert the data into Elasticsearch
- /home/hduser/logstash-5.0.2/bin/logstash -f /home/hduser/Downloads/stock-logstash.conf_1.csv
- Check whether the data insertion is done:
curl -XGET http://localhost:9200/stock/_search?pretty
You should see o/p similar to the following:
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 22,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "stock",
        "_type" : "logs",
        "_id" : "AVkBgrZ2Vl1JB8XO-oqh",
        "_score" : 1.0,
        "_source" : {"High" : "High", ………………
………………………………..
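The contents of the downloaded config file are not shown in this tutorial. For orientation only, the sketch below writes out what a CSV-to-Elasticsearch pipeline of this kind might look like. Only the host localhost:9200 and the index name stock come from the steps above; the data file path, the column names and the date format are assumptions and will differ from the real file.

```shell
#!/bin/sh
# Write a hypothetical Logstash pipeline to a temporary file for inspection.
conf=$(mktemp)
cat > "$conf" <<'EOF'
input {
  file {
    path => "/home/hduser/Downloads/stock.csv"   # assumed data file name
    start_position => "beginning"
    sincedb_path => "/dev/null"                  # re-read the file on every run
  }
}
filter {
  csv {
    separator => ","
    columns => ["Date", "Open", "High", "Low", "Close", "Volume"]  # assumed column names
  }
  date {
    match => ["Date", "yyyy-MM-dd"]              # assumed date format
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "stock"
  }
}
EOF
echo "wrote $conf"
```

The three sections mirror the flow of the exercise: read the CSV file, split each line into named fields, and index the result into stock.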
Kibana
- Open Firefox and go to http://localhost:5601
- The Kibana home page should load.
- Configure the index in Kibana (click on Management).
- Click on Add New, fill in stock, and choose the Date field.
- Mark it as the default index by clicking on the star.
- Now create a visualization: click on the Visualize tab in the side navigation.
- Choose stock (if more than one index is available).
- Choose Line Chart for this example.
- Fill in the chart fields (the screenshots showing the values are not reproduced here).
- Click on the small caret icon at the top right corner.
- Set the time interval to the last two years.
- Apply your changes again.
- Click on the Save button and give the chart a name (Line Chart in this case).
- Prepare another chart yourself (Bar Chart).
- The visualizations are ready and now need to be added to a dashboard: click on Dashboard on the left side.
- Click on New and then on the Add button, and choose the visualizations that you just saved.
NYC Motor Vehicle Collision
- Open the ready-made report to experiment with the dashboard.