HBase Commands

./bin/hbase shell

woir> list

woir> status

woir> version

woir> table_help

woir> whoami

woir> create 'woir', 'family data', 'work data'

woir> disable 'woir'

woir> is_disabled 'table name'

woir> is_disabled 'woir'

woir> disable_all 'amar.*'

woir> enable 'woir'

woir> scan 'woir'

woir> is_enabled 'table name'

woir> is_enabled 'woir'

woir> describe 'woir'

woir> alter 'woir', NAME => 'family data', VERSIONS => 5

woir> alter 'woir', READONLY

woir> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'

woir> alter 'table name', 'delete' => 'column family'

woir> scan 'woir'

woir> alter 'woir', 'delete' => 'work data'

woir> scan 'woir'

woir> exists 'woir'

woir> exists 'student'

woir> exists 'student'

woir> drop 'woir'

woir> exists 'woir'

woir> drop_all 't.*'

./bin/stop-hbase.sh

woir> put 'woir','1','family data:name','kaka'

woir> put 'woir','1','work data:salary','50000'

woir> scan 'woir'

woir> put 'woir','row1','family:city','Delhi'

woir> scan 'woir'

woir> get 'woir', '1'

woir> get 'table name', 'rowid', {COLUMN => 'column family:column name'}

woir> get 'woir', 'row1', {COLUMN=>'family:name'}

woir> delete 'woir', '1', 'family data:city', 1417521848375

woir> deleteall 'woir','1'

woir> scan 'woir'

woir> count 'woir'

woir> truncate 'table name'
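The shell commands above all operate on HBase's underlying data model: a table is a sparse map from row key to 'family:qualifier' columns, each column holding a small number of timestamped versions. A minimal Python sketch of that model (purely illustrative – `TinyHBase` and its methods are hypothetical, not a real HBase client):

```python
# Hypothetical sketch of HBase's data model: a table is a sparse map of
# row key -> "family:qualifier" -> list of (timestamp, value), newest first.
# Illustrative only; this does not use any real HBase client library.
import time

class TinyHBase:
    def __init__(self, versions=3):
        self.rows = {}          # row key -> {column -> [(ts, value), ...]}
        self.versions = versions

    def put(self, row, column, value, ts=None):
        ts = ts if ts is not None else time.time()
        cells = self.rows.setdefault(row, {}).setdefault(column, [])
        cells.insert(0, (ts, value))          # newest version first
        del cells[self.versions:]             # keep at most `versions` cells

    def get(self, row, column):
        cells = self.rows.get(row, {}).get(column)
        return cells[0][1] if cells else None # latest version wins

    def scan(self):
        return sorted(self.rows)              # row keys in lexicographic order

t = TinyHBase()
t.put('1', 'family data:name', 'kaka')
t.put('1', 'work data:salary', '50000')
t.put('row1', 'family:city', 'Delhi')
print(t.get('1', 'family data:name'))  # kaka
print(t.scan())                        # ['1', 'row1']
```

Here `put`/`get`/`scan` mirror the shell verbs of the same name, and the `versions` cap corresponds to the `VERSIONS => 5` setting in the `alter` command.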




Command Usage
hbase> scan '.META.', {COLUMNS => 'info:regioninfo'} – displays the region metadata that HBase keeps in .META. for every table
hbase> scan 'woir', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'abc'} – displays the contents of table 'woir' for column families c1 and c2, limited to 10 rows starting at row key 'abc'
hbase> scan 'woir', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]} – displays the contents of 'woir' for column c1, restricted to cells whose timestamps fall in the given time range
hbase> scan 'woir', {RAW => true, VERSIONS => 10} – RAW => true is an advanced option that displays all cell versions present in the table (up to 10 here), including deleted cells
Elasticsearch, Logstash and Kibana

  • Start the Desktop

  • Login as woir
    • Click on the right top corner and choose Hadoop User.
    • Enter password – <your password>

screen-shot-2016-12-16-at-2-14-27-am

  • Click on the Top Left Ubuntu Button and search for the terminal and click on it.

screen-shot-2016-12-16-at-2-18-44-am

  • You should see something similar as below

screen-shot-2016-12-16-at-2-20-24-am

Start all the services –

  • cd /home/woir
  • Start Elasticsearch
    • /home/woir/start_elasticsearch.sh
  • Check – open Firefox and type –
    • http://localhost:9200
  • Start Kibana
    • /home/woir/start_kibana.sh
  • Open Firefox and type –
    • http://localhost:5601

Reference to Unix Commands –

  • http://www.thegeekstuff.com/2010/11/50-linux-commands/

Elasticsearch

  • Run all the commands given below on the terminal which you have opened in the previous step.

Terminology:

Table comparing Elasticsearch terminology with traditional relational database terminology:

MySQL (RDBMS) terminology    Elasticsearch terminology
Database                     Index
Table                        Type
Row                          Document

RESTful methods used

  • HTTP Methods used: GET, POST, PUT, DELETE

Exercises and Solutions


To start, please open a terminal on your Ubuntu Box
  • Check the status of Elasticsearch from the command line.

i/p

curl http://localhost:9200

o/p

Expected output on the screen –

{
 "name" : "MRPWrOy",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "-YtQj9REQjaCbROg0Nc74w",
 "version" : {
 "number" : "5.0.2",
 "build_hash" : "f6b4951",
 "build_date" : "2016-11-24T10:07:18.101Z",
 "build_snapshot" : false,
 "lucene_version" : "6.2.1"
 },
 "tagline" : "You Know, for Search"
}
  • Create an index named “company”

i/p

curl -XPUT http://localhost:9200/company

o/p

{"acknowledged":true,"shards_acknowledged":true}

  • Create another index named “govtcompany”

i/p

curl -XPUT http://localhost:9200/govtcompany

o/p

{"acknowledged":true,"shards_acknowledged":true}
  • Get list of indices created so far –

i/p

curl -XGET http://localhost:9200/_cat/indices?pretty

o/p

yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb

yellow open govtcompany    6Wp2J3AxRoa9eCl82jPL2A 5 1      0 0    650b    650b

yellow open company        HB89IjvdSVez_In5nqSzWQ 5 1      0 0    650b    650b

yellow open employee       Xt3nJVgFRiWs3kwKEhY6XQ 5 1      0 0    650b    650b

yellow open .kibana        6Wl9qr8DSLm-g__2oQbX-w 1 1     14 0  34.6kb  34.6kb
  • Delete an Index and check the indices again –

i/p

curl -XDELETE http://localhost:9200/govtcompany

curl -XGET http://localhost:9200/_cat/indices?pretty

o/p

yellow open nyc_visionzero BlWR26RdQYaHHym9JpYq_w 5 1 424707 0 436.7mb 436.7mb

yellow open company        HB89IjvdSVez_In5nqSzWQ 5 1      0 0    650b    650b

yellow open employee       Xt3nJVgFRiWs3kwKEhY6XQ 5 1      0 0    650b    650b

yellow open .kibana        6Wl9qr8DSLm-g__2oQbX-w 1 1     14 0  34.6kb  34.6kb
  • Status of index ‘company’

i/p

curl -XGET  http://localhost:9200/company?pretty

o/p

{
 "company" : {
 "aliases" : { },
 "mappings" : { },
 "settings" : {
 "index" : {
 "creation_date" : "1481891504208",
 "number_of_shards" : "5",
 "number_of_replicas" : "1",
 "uuid" : "WNW9bNGRTdqbXVWFhBcYRg",
 "version" : {
 "created" : "5000299"
 },
 "provided_name" : "company"
 }
 }
 }
 }
  • Delete company also

i/p

curl -XDELETE http://localhost:9200/company

o/p

{"acknowledged":true }
  • Create index with specified data type –

i/p

curl -XPUT http://localhost:9200/company -d '{
  "mappings": {
    "employee": {
      "properties": {
        "age": { "type": "long" },
        "experience": { "type": "long" },
        "name": { "type": "string", "analyzer": "standard" }
      }
    }
  }
}'

o/p

{"acknowledged":true,"shards_acknowledged":true}
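The mapping created above fixes a type for each field of an "employee" document. A rough sketch of what that declaration means, written as a client-side validator (hypothetical – Elasticsearch enforces types on the server at indexing time, not like this):

```python
# Hypothetical sketch of the "employee" mapping above: each field has a
# fixed type. This illustrative validator is not part of Elasticsearch.
MAPPING = {"age": int, "experience": int, "name": str}

def validate(doc):
    """Return True if every mapped field of doc matches the declared type."""
    return all(isinstance(doc.get(f), t) for f, t in MAPPING.items())

print(validate({"name": "Amar Sharma", "age": 45, "experience": 10}))   # True
print(validate({"name": "Amar Sharma", "age": "45", "experience": 10})) # False
```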
  • Status of index ‘company’

i/p

curl -XGET  http://localhost:9200/company?pretty

o/p

{
  "company" : {
    "aliases" : { },
    "mappings" : {
      "employee" : {
        "properties" : {
          "age" : { "type" : "long" },
          "experience" : { "type" : "long" },
          "name" : { "type" : "text", "analyzer" : "standard" }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1481744753984",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "Wqsco08iROCqiwRr7eXnwA",
        "version" : { "created" : "5000299" },
        "provided_name" : "company"
      }
    }
  }
}
  • Data Insertion

i/p

curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Amar Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"AVj-48T3Vl1JB8XO-oqO","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

i/p

curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Sriknaht Kandi",
  "age" : 35,
  "experience" : 7
}'

o/p

{"_index":"company","_type":"employee","_id":"AVj-5CrJVl1JB8XO-oqP","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

i/p

curl -XPOST http://localhost:9200/company/employee -d '{
  "name": "Abdul Malik",
  "age" : 25,
  "experience" : 3
}'

o/p

{"_index":"company","_type":"employee","_id":"AVj-5HiXVl1JB8XO-oqQ","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
  • Retrieve Data

i/p

curl -XGET http://localhost:9200/company/employee/_search?pretty

o/p

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
  • Conditional Search:

i/p

curl -XPOST http://localhost:9200/company/employee/_search?pretty -d '{
  "query": {
    "match_all": {}
  }
}'

o/p

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
  • Another Variant

i/p

curl -XGET http://localhost:9200/company/employee/_search?pretty

o/p

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5HiXVl1JB8XO-oqQ",
        "_score" : 1.0,
        "_source" : {
          "name" : "Abdul Malik",
          "age" : 25,
          "experience" : 3
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-5CrJVl1JB8XO-oqP",
        "_score" : 1.0,
        "_source" : {
          "name" : "Sriknaht Kandi",
          "age" : 35,
          "experience" : 7
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
  • Fetch all employees with a particular name

i/p

curl -XGET http://localhost:9200/_search?pretty -d '{
  "query": {
    "match": {
      "name": "Amar Sharma"
    }
  }
}'

o/p

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.51623213,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 0.51623213,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
  • Employees with age greater than a number:

i/p

curl -XGET http://localhost:9200/_search?pretty -d '{
  "query": {
    "range": {
      "age": { "gt": 35 }
    }
  }
}'

o/p

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-48T3Vl1JB8XO-oqO",
        "_score" : 1.0,
        "_source" : {
          "name" : "Amar Sharma",
          "age" : 45,
          "experience" : 10
        }
      },
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
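The range query above keeps only documents whose age is greater than 35. The same predicate can be sketched client-side in Python (illustrative only – Elasticsearch evaluates ranges inside the index rather than scanning documents like this):

```python
# Client-side sketch of {"range": {"age": {"gt": 35}}} over the four
# employee documents indexed earlier. Illustrative only.
employees = [
    {"name": "Abdul Malik", "age": 25, "experience": 3},
    {"name": "Sriknaht Kandi", "age": 35, "experience": 7},
    {"name": "Amar Sharma", "age": 45, "experience": 10},
    {"name": "Andrew", "age": 45, "experience": 10},
]

def range_gt(docs, field, threshold):
    """Equivalent of the query {"range": {field: {"gt": threshold}}}."""
    return [d for d in docs if d[field] > threshold]

hits = range_gt(employees, "age", 35)
print([d["name"] for d in hits])  # ['Amar Sharma', 'Andrew']
```

Note that age 35 itself is excluded, which is why "Sriknaht Kandi" does not appear in the output above; `gte` would include it.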
  • Fetch data with multiple conditions

i/p

curl -XGET http://localhost:9200/_search?pretty -d '{
  "query": {
    "bool": {
      "must":   { "match": { "name": "Andrew" }},
      "should": { "range": { "age": { "gte": 35 }}}
    }
  }
}'

o/p

{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 11,
    "successful" : 11,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.287682,
    "hits" : [
      {
        "_index" : "company",
        "_type" : "employee",
        "_id" : "AVj-40-mVl1JB8XO-oqN",
        "_score" : 1.287682,
        "_source" : {
          "name" : "Andrew",
          "age" : 45,
          "experience" : 10
        }
      }
    ]
  }
}
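In the bool query above, the must clause decides whether a document matches at all, while the should clause only raises its relevance score. A rough sketch of that behaviour (illustrative – real Elasticsearch scoring uses TF-IDF/BM25, not the flat +1 bonus used here):

```python
# Illustrative sketch of bool-query semantics: "must" predicates filter,
# "should" predicates only add to the score. Not real Elasticsearch scoring.
employees = [
    {"name": "Andrew", "age": 45},
    {"name": "Abdul Malik", "age": 25},
]

def bool_query(docs, must, should):
    results = []
    for d in docs:
        if all(pred(d) for pred in must):                  # must: hard filter
            score = 1.0 + sum(pred(d) for pred in should)  # should: boost
            results.append((score, d["name"]))
    return sorted(results, reverse=True)

hits = bool_query(
    employees,
    must=[lambda d: d["name"] == "Andrew"],
    should=[lambda d: d["age"] >= 35],
)
print(hits)  # [(2.0, 'Andrew')]
```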
  • Create the records in Elasticsearch with specific id ( here it is ‘2’ )
i/p

curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar3 Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"2","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
  • Update the record which you created ( please note the version number in the o/p ) –

 

i/p

curl -XPUT 'http://localhost:9200/company/employee/2' -d '{
  "name": "Amar4 Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"2","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}
  • Update the record which you created again (note the version number in the o/p again)

i/p

curl -XPUT http://localhost:9200/company/employee/2 -d '{
  "name": "Amar5 Sharma",
  "age" : 45,
  "experience" : 10
}'

o/p

{"_index":"company","_type":"employee","_id":"2","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}
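Each PUT to the same document URL above increments `_version` and reports "created" only the first time. A minimal sketch of that versioning behaviour (the in-memory store below is hypothetical, not Elasticsearch internals):

```python
# Illustrative sketch of document versioning: indexing to an existing id
# bumps the version and reports "updated" instead of "created".
store = {}  # doc_id -> (document, version)

def index_doc(doc_id, doc):
    version = store.get(doc_id, (None, 0))[1] + 1
    store[doc_id] = (doc, version)
    return {"_id": doc_id, "_version": version,
            "result": "created" if version == 1 else "updated"}

print(index_doc("2", {"name": "Amar3 Sharma"}))  # _version 1, created
print(index_doc("2", {"name": "Amar4 Sharma"}))  # _version 2, updated
print(index_doc("2", {"name": "Amar5 Sharma"}))  # _version 3, updated
```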


Logstash –

Please run the following commands from a terminal, logged in as the woir user –

  • Login as woir
  • cd /home/woir
  • Cleanup the environment
curl -XDELETE http://localhost:9200/stock
\rm -rf ~/logstash-5.0.2/data/plugins/inputs/file/.sincedb_*
  • set JAVA_HOME
    export JAVA_HOME=~/JAVA
  • Download the data to be inserted into ES
    • Sample input file to be used with Logstash – table (or the larger file table-3)
wget -O /home/woir/Downloads/table-3.csv http://woir.in/wp-content/uploads/2016/12/table-3.csv

 

 

wget -O /home/woir/Downloads/stock-logstash.conf_1.csv http://woir.in/wp-content/uploads/2016/12/stock-logstash.conf_1.csv
  • Point Logstash at the config file and run it – it will insert the data into Elasticsearch
/home/woir/logstash-5.0.2/bin/logstash -f /home/woir/Downloads/stock-logstash.conf_1.csv
  • Check whether the data insertion succeeded –

 

curl -XGET http://localhost:9200/stock/_search?pretty

 

You should see o/p similar to following –

{
   "took" : 5,
   "timed_out" : false,
   "_shards" : {
     "total" : 5,
     "successful" : 5,
     "failed" : 0
   },
   "hits" : {
     "total" : 22,
     "max_score" : 1.0,
     "hits" : [
       {
         "_index" : "stock",
         "_type" : "logs",
         "_id" : "AVkBgrZ2Vl1JB8XO-oqh",
         "_score" : 1.0,
         "_source" : {
          "High" : "High", ........................................................
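Logstash's csv filter turns each line of the input file into a JSON document before shipping it to Elasticsearch. A rough sketch of that transformation (the column names below are made up for illustration, not taken from the real table-3.csv):

```python
# Illustrative sketch of what Logstash's csv filter does: parse each row
# against the header and emit one JSON document per line. The columns here
# are hypothetical, not the actual columns of table-3.csv.
import csv, io, json

raw = "Date,Open,High\n2016-12-01,100.5,101.2\n"

docs = [row for row in csv.DictReader(io.StringIO(raw))]
print(json.dumps(dict(docs[0])))
# {"Date": "2016-12-01", "Open": "100.5", "High": "101.2"}
```

This also explains the stray `"High" : "High"` document in the output above: if the header row itself gets parsed as data, the column names appear as values.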

Kibana

Start Kibana
export JAVA_HOME=~/JAVA

~/kibana-5.0.2-linux-x86/bin/kibana 2>&1 > ~/kibana123.log &
  • Open Firefox and type –
 http://localhost:5601

It should look something similar to following –

screen-shot-2016-12-16-at-1-26-42-am

  • Configure Index in Kibana ( Click on Management )

screen-shot-2016-12-16-at-1-27-00-am

  • Click on Add New, fill in stock and choose the Date field.

screen-shot-2016-12-16-at-1-28-01-am

  • Mark it as default Index to be used by clicking on the star

screen-shot-2016-12-16-at-1-27-13-am

  • Time to create a visualization – click on the tab on the right side.

screen-shot-2016-12-16-at-1-14-31-am

  • Choose Stock ( if there is more than one index available )

screen-shot-2016-12-16-at-1-14-53-am

  • Choose Line Chart for this example

screen-shot-2016-12-16-at-1-28-45-am

  • Fill in the fields as below

screen-shot-2016-12-16-at-1-29-03-am

  • Fill in the fields as below

screen-shot-2016-12-16-at-1-29-35-am

  • Fill in the fields as below

screen-shot-2016-12-16-at-1-29-52-am

  • Fill in the fields as below

screen-shot-2016-12-16-at-1-31-15-am

  • Click on the little caret sign at the right top corner.

screen-shot-2016-12-16-at-1-31-48-am

  • Choose a time interval of the last two years

screen-shot-2016-12-16-at-1-33-15-am

  • Apply your changes again

screen-shot-2016-12-16-at-1-33-35-am

  • Click on the save button and give it a name ( Line Chart in this case )

screen-shot-2016-12-16-at-1-33-58-am

  • Prepare another chart yourself ( Bar Chart )

screen-shot-2016-12-16-at-1-36-40-am screen-shot-2016-12-16-at-1-37-00-am

  • The visualizations are ready – we need to add them to the Dashboard – click on Dashboard on the left side.

screen-shot-2016-12-16-at-1-37-28-am

  • Click on New and then on the Add button, now choose the visualizations which have just been added.

screen-shot-2016-12-16-at-1-51-48-am

NYC Motor Vehicle Collision

  • Please open a readymade report to play around with the dashboard.
  • screen-shot-2016-12-23-at-12-13-14-am
MongoDB Tutorial

Start MongoDB

  • Enter following commands

1. cd
2. /home/hduser/start_mongodb.sh
   (wait for approximately 30 seconds)
3. type mongo
   you should see the mongodb prompt as below
> db

 

If you have difficulty connecting, try setting the following –

 

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
Check the DBs already existing in your setup:

> show dbs
local  0.078GB
woir   0.078GB

Delete all the DBs existing to start your tutorial from scratch. To delete / drop a database, make sure you have selected the database and then do this:

> use woir
> db.dropDatabase()
{"dropped" : "woir", "ok" : 1 }

To select a database:

> use sports
switched to db sports

To find out the currently selected database:

> db.getName()
sports

To delete / drop a database, make sure you have selected the database and then do this:

> use sports
> db.dropDatabase()
{"dropped" : "sports", "ok" : 1 }

Create and select DB again

> use sports
switched to db sports

To see the collections in a database:

> show collections

or

> db.getCollectionNames()

Let’s create a new database named “sports” – it is lazy creation:

> use sports
switched to db sports
> db.information.insert({name:"Andre Adams",year:1975} )

The above command resulted in the creation of the new database named “sports”. In the process a new collection called “information” was also created in the database.

Create / add data in MongoDB

To create documents in a collection:

> db.information.insert({name:"Brendon McCullum",year:1981})
> db.information.insert({name:"Carl Cachopa",year:1986})

The two commands above added two more documents to the collection named “information” in the “sports” database.

You could have done the same thing using the save command instead of the insert command.

> db.information.save({name:"Chris Cairns",year:1970}) 
> db.information.save({name:"Chris Harris",year:1969})

The difference between insert and save is this:

insert will always add a new document.
save does an insert if there is no _id key in the object it receives, else it does an update.
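The insert/save distinction above can be sketched as follows (illustrative only – the helpers below are hypothetical stand-ins, not the mongo shell's implementation):

```python
# Hypothetical sketch of insert vs save over an in-memory "collection":
# insert always creates a new document; save updates when the object
# carries an _id that already exists, otherwise it inserts.
import itertools

collection = {}                      # _id -> document
_ids = itertools.count(1)

def insert(doc):
    doc = dict(doc, _id=next(_ids))  # always a fresh _id: always a new doc
    collection[doc["_id"]] = doc
    return doc["_id"]

def save(doc):
    if "_id" in doc and doc["_id"] in collection:
        collection[doc["_id"]] = doc   # existing _id: update in place
        return doc["_id"]
    return insert(doc)                 # no _id: behaves like insert

i = insert({"name": "Chris Cairns", "year": 1970})
save({"_id": i, "name": "Chris Cairns", "year": 1971})   # update
save({"name": "Chris Harris", "year": 1969})             # insert
print(len(collection), collection[i]["year"])  # 2 1971
```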

To read data from a collection

> db.information.find()
> db.information.find()
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }
{ "_id" : ObjectId("589417583b9f8fd81c8b9769"), "name" : "Brendon McCullum", "year" : 1981 }
{ "_id" : ObjectId("5894175e3b9f8fd81c8b976a"), "name" : "Carl Cachopa", "year" : 1986 }
{ "_id" : ObjectId("5894176c3b9f8fd81c8b976b"), "name" : "Chris Cairns", "year" : 1970 }
{ "_id" : ObjectId("589417723b9f8fd81c8b976c"), "name" : "Chris Harris", "year" : 1969 }

To limit it to just two:

> db.information.find().limit(2)
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }
{ "_id" : ObjectId("589417583b9f8fd81c8b9769"), "name" : "Brendon McCullum", "year" : 1981 }

findOne() is the equivalent of using find().limit(1):

> db.information.findOne()
{
"_id" : ObjectId("5894174f3b9f8fd81c8b9768"),
"name" : "Andre Adams",
"year" : 1975
}

What if you want to conditionally find documents?

> db.information.find({year:{$lt:1981}})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }
{ "_id" : ObjectId("5894176c3b9f8fd81c8b976b"), "name" : "Chris Cairns", "year" : 1970 }
{ "_id" : ObjectId("589417723b9f8fd81c8b976c"), "name" : "Chris Harris", "year" : 1969 }

$lt is one of the many conditional operators in MongoDB. Here are the rest.

$lt  - '<'
$lte - '<='
$gt  - '>'
$gte - '>='
$ne  - '!='
$in  - 'is in array'
$nin - '! in array'
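These operators translate to ordinary comparisons, so {year:{$lt:1981}} means year < 1981. A sketch of that mapping (illustrative – this is not MongoDB's query engine):

```python
# Illustrative mapping of MongoDB's conditional operators onto plain
# Python comparisons; matches() is a hypothetical mini-matcher.
OPS = {
    '$lt':  lambda a, b: a < b,
    '$lte': lambda a, b: a <= b,
    '$gt':  lambda a, b: a > b,
    '$gte': lambda a, b: a >= b,
    '$ne':  lambda a, b: a != b,
    '$in':  lambda a, b: a in b,
    '$nin': lambda a, b: a not in b,
}

def matches(doc, query):
    for field, cond in query.items():
        if isinstance(cond, dict):             # {'$lt': 1981} style condition
            if not all(OPS[op](doc.get(field), v) for op, v in cond.items()):
                return False
        elif doc.get(field) != cond:           # bare value: equality match
            return False
    return True

players = [{"name": "Andre Adams", "year": 1975},
           {"name": "Chris Cairns", "year": 1970}]
print([p["name"] for p in players if matches(p, {"year": {"$lt": 1975}})])
# ['Chris Cairns']
```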

And how do we do an ‘equal to’ (==) query? Just match the value for the queried key:

> db.information.find({year:1975})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }

We can even use regular expressions in our queries!

> db.information.find({name:{$regex: /Chri|dre/i}})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }
{ "_id" : ObjectId("5894176c3b9f8fd81c8b976b"), "name" : "Chris Cairns", "year" : 1970 }
{ "_id" : ObjectId("589417723b9f8fd81c8b976c"), "name" : "Chris Harris", "year" : 1969 }

Now let’s try a more complicated use of regex in a query.

> var names = ['Andre', 'Chri']
> names = names.join('|');
> var re = new RegExp(names, 'i')
> db.information.find({name:{$regex: re}})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }
{ "_id" : ObjectId("5894176c3b9f8fd81c8b976b"), "name" : "Chris Cairns", "year" : 1970 }
{ "_id" : ObjectId("589417723b9f8fd81c8b976c"), "name" : "Chris Harris", "year" : 1969 }

What if you want to get only some fields in the result?

> db.information.find({year:{'$lt':1994}}, {name:true})
{"_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams" }
{ "_id" : ObjectId("589417583b9f8fd81c8b9769"), "name" : "Brendon McCullum" }
{ "_id" : ObjectId("5894175e3b9f8fd81c8b976a"), "name" : "Carl Cachopa" }
{ "_id" : ObjectId("5894176c3b9f8fd81c8b976b"), "name" : "Chris Cairns" }
{ "_id" : ObjectId("589417723b9f8fd81c8b976c"), "name" : "Chris Harris" }
To include more than one field:

> db.information.find({year:{'$lt':1994}}, {name:true, year:true})

The _id field is always returned by default.

To hide it use following –

> db.information.find({year:{'$lt':1994}}, {_id:false})

Try to hide name from the o/p

> db.information.find({year:{'$lt':1994}}, {name:false})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "year" : 1975 }
{ "_id" : ObjectId("589417583b9f8fd81c8b9769"), "year" : 1981 }
{ "_id" : ObjectId("5894175e3b9f8fd81c8b976a"), "year" : 1986 }
{ "_id" : ObjectId("5894176c3b9f8fd81c8b976b"), "year" : 1970 }
{ "_id" : ObjectId("589417723b9f8fd81c8b976c"), "year" : 1969 }

Note that field inclusion and exclusion cannot be used together in the same query (except for _id).

Dot Notation: let’s create some documents in a collection named ‘articles’:

> db.articles.insert({title:'The Amazingness of MongoDB', meta:{author:'Mike Vallely', date:1321958582668, likes:23, tags:['mongo', 'amazing', 'mongodb']}, comments:[{by:'Steve', text:'Amazing article'}, {by:'Dave', text:'Thanks a ton!'}]})

> db.articles.insert({title:'Mongo Business', meta:{author:'Chad Muska', date:1321958576231, likes:10, tags:['mongodb', 'business', 'mongo']}, comments:[{by:'Taylor', text:'First!'}, {by:'Rob', text:'I like it'}]})

> db.articles.insert({title:'MongoDB in Mongolia', meta:{author:'Ghenghiz Khan', date:1321958598538, likes:75, tags:['mongo', 'mongolia', 'ghenghiz']}, comments:[{by:'Alex', text:'Dude, it rocks'}, {by:'Steve', text:'The best article ever!'}]})

Note the dot notation

> db.articles.find({'meta.author':'Chad Muska'})
> db.articles.find({'meta.likes':{$gt:10}})

Searching array:

> db.articles.find({'meta.tags':'mongolia'})

When the key is an array, the database looks for the object right in the array.

> db.articles.find({'comments.by':'Steve'})

Refer to array indexes:

> db.articles.find({'comments.0.by':'Steve'})
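Dot notation resolves one path segment at a time: each segment descends into a sub-document, and when a segment meets an array the match is tried against every element (or a single element when the segment is a numeric index). A sketch of that resolution (illustrative only – MongoDB implements this inside the server):

```python
# Hypothetical sketch of dot-notation resolution over a nested document,
# covering 'meta.author', 'comments.by' and 'comments.0.by' styles.
def resolve(doc, path):
    """Yield every value reachable by the dotted path."""
    if not path:
        yield doc
        return
    head, _, rest = path.partition('.')
    if isinstance(doc, list):
        if head.isdigit():                     # numeric index: one element
            if int(head) < len(doc):
                yield from resolve(doc[int(head)], rest)
        else:                                  # otherwise: try every element
            for item in doc:
                yield from resolve(item, path)
        return
    if isinstance(doc, dict) and head in doc:
        yield from resolve(doc[head], rest)

article = {"title": "Mongo Business",
           "meta": {"author": "Chad Muska", "likes": 10},
           "comments": [{"by": "Taylor"}, {"by": "Rob"}]}
print(list(resolve(article, "meta.author")))   # ['Chad Muska']
print(list(resolve(article, "comments.by")))   # ['Taylor', 'Rob']
print(list(resolve(article, "comments.0.by"))) # ['Taylor']
```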

Always remember that a quoted number is a string, and is not the same as the actual number. For example:

> db.employee.find({salary:100})

is totally different from

> db.employee.find({salary:'100'})

Try the following two –

> db.information.find('this.year > 1971 && this.name != "Andre Adams"')

 

> db.information.find({year:{$gt:1971}, name:{$ne:'Andre Adams'}})

Both of the above are equivalent.

MongoDB has another operator called $where using which you can perform SQL’s WHERE-like operations.

> db.information.find({$where: 'this.year > 1971'})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }
{ "_id" : ObjectId("589417583b9f8fd81c8b9769"), "name" : "Brendon McCullum", "year" : 1981 }
{ "_id" : ObjectId("5894175e3b9f8fd81c8b976a"), "name" : "Carl Cachopa", "year" : 1986 }

and

> db.information.find({name:'Andre Adams', $where: 'this.year > 1970'})
{ "_id" : ObjectId("5894174f3b9f8fd81c8b9768"), "name" : "Andre Adams", "year" : 1975 }

 

Update data in MongoDB

Document replacement-

> db.information.update({name:"Andre Adams"}, {father:'Stephen Adams'})

 

Appending –

 

> db.information.update({name:"Andre Adams"}, {'$set':{father:'Stephen Adams', players:['Crickets', 'Hockey']}})

Update an array

> db.information.update({name:"Andre Adams"}, {'$push':{players:'Kabbadi'}})
> db.information.update({name:"Andre Adams"}, {'$push':{players:'Tennis'}})

Now suppose we need to remove something from the players array. We do it this way:

db.information.update({name:"Andre Adams"}, {'$pull':{players:'Kabbadi'}})

Delete data in MongoDB

 

> db.information.update({name:'Andre Adams'}, {$unset:{players:1}})

Delete a field from all the document of a collection:

> db.information.update({}, {$unset:{players:1}}, false, true)

The false parameter is for the upsert option, true is for the multi option. We set the multi option to true because we want to remove the field from every document in the collection.

Delete all documents from a collection with name Andre Adams

> db.information.remove({name:'Andre Adams'})

Drop all the documents

> db.information.remove({})

The above command truncates the collection.

Delete / drop a collection-

> db.information.drop()

To delete a database select the database and call the db.dropDatabase() on it:

> use information
> db.dropDatabase()
{"dropped" : "information", "ok" : 1 }

Get the count of Documents / Records

 

> db.information.count({})

This will return the total number of documents in the collection named information with the value of year greater than 1970:

> db.information.count({year:{$gt:1970}})

 

 

Screen Shot 2017-02-03 at 11.32.13 AM

Steps to Verify Hadoop/Elasticsearch/ActiveMQ/Cassandra Installation in One Box Setup VM


Hadoop Installation

  • Start VirtualBox, choose the machine you prepared in the earlier step and click on the “Start” button ( green colour ).

screen-shot-2016-12-09-at-11-08-22-am

  • Please login as the Hadoop user ( user id hduser ); if asked, please enter the password ‘abcd1234’

Screen Shot 2017-01-27 at 12.41.46 AM

  • Click on the Ubuntu button in the top-left corner, search for the terminal and click on it

screen-shot-2016-12-09-at-11-10-51-am

  • Once the terminal is up and running it should look similar to the following –

Screen Shot 2017-01-27 at 12.42.28 AM

  • Go to the home directory and take a look at what is present
    • cd /home/hduser
    • the ‘pwd’ command should show the path as ‘/home/hduser’.
    • execute ‘ls -lart’ to take a look at the files and directories in general.
  • Close already running applications
    • /home/hduser/stop_all.sh
  • Start hadoop
    • /home/hduser/start_hadoop.sh
  • Confirm whether the service is running successfully
    • run ‘jps’ – you should see something similar to the following –

screen-shot-2016-12-09-at-12-44-15-pm

  • Run the wordcount program using the following command –
    • /home/hduser/run_helloword.sh
  • At the end you should see something similar –

screen-shot-2016-12-09-at-11-44-33-am

  • Check if the output files have been generated
  • hadoop dfs -ls /user/hduser/output     – you should see something similar to below screenshot

screen-shot-2016-12-09-at-11-46-35-am

  • Get the contents of the output files ( similar to following ) –
    • hadoop dfs -cat /user/hduser/output/part-r-00000

screen-shot-2016-12-09-at-11-48-22-am

  • Finally, shut down the Hadoop services
    • /home/hduser/stop_hadoop.sh

Elasticsearch Installation

  • Close already running applications
    • /home/hduser/stop_all.sh
  • Start Elasticsearch –
    • /home/hduser/start_elasticsearch.sh
    • tail /home/hduser/elastic123.log
      • You should see some messages ( there should not be any ERROR ); at the end you may see something similar –

screen-shot-2016-12-09-at-12-22-59-pm

  • Verify Elasticsearch instance
    • Open browser ( firefox )
    • goto http://localhost:9200
    • You should see following output

screen-shot-2016-12-09-at-12-26-20-pm

 

  • Start Kibana –
    • /home/hduser/start_kibana.sh
    • tail /home/hduser/kibana123.log
  • You should see some messages ( there should not be any ERROR ); at the end you may see something similar –

 

Screen Shot 2017-01-27 at 1.04.45 AM

  • Verify Kibana instance
    • Open browser ( firefox )
    • goto http://localhost:5601/app/kibana#
    • You should see similar output

screen-shot-2016-12-16-at-1-26-42-am

  • Shutdown Elasticsearch and Kibana
    • /home/hduser/stop_elasticsearch.sh
    • /home/hduser/stop_kibana.sh

ActiveMQ Installation

  • Close already running applications
    • /home/hduser/stop_all.sh
  • Start ActiveMQ
    • /home/hduser/start_activemq.sh
  • Run validation test – send messages
    • cd /home/hduser/activemq-5.14.3
    • /home/hduser/activemq-5.14.3/send_message.sh
  • See following output on the screen

Screen Shot 2017-01-27 at 1.19.50 AM

  • Continue – receive messages
    • cd /home/hduser/activemq-5.14.3
    • /home/hduser/activemq-5.14.3/receive_message.sh
  • See following output on the screen

Screen Shot 2017-01-27 at 1.24.01 AM

  • Stop ActiveMQ
    • /home/hduser/stop_activemq.sh

 

Steps to Install VM for workshop


  1. Install Virtual Box ( follow the steps ) on your computer.
  2. Once VirtualBox is installed, the VM ( please see the license terms and conditions before using it for any commercial/business purpose ) can be downloaded from the following link –
    • https://drive.google.com/open?id=0B2vqFbCIJR_USXBzNVZYZGloOVU
  3. The downloaded image name will be ‘woir.vdi’.
  4. Create a New Virtual Machine in VirtualBox using the uncompressed VDI file as the Hard Drive.
    • Run VirtualBox
    • Click the “New” button
    • Enter the name “Ubuntu-Vasvi”
    • Select “Linux” in the OS Type dropdown
    • Select “Next”
    • On the “Memory” panel choose around 4 GB of memory and click “Next”
    • On the “Virtual Hard Disk” panel select “Existing” – this opens the “VirtualBox Virtual Disk Manager”
    • Select the “Add” button.
    • Select the hard disk which you have downloaded ( in this case it should be woir.vdi ).
    • Click “Select”
    • Click “Next”
    • Click “Finished”
    • Click RUN to Start the VM ( you should see Ubuntu running )
    • Use username as woir and password as abcd1234 whenever required.