Archive for the 'Big Data' Category

Query HBase using Apache Drill

Hi guys,

Today, we will see how to query HBase using the Apache Drill SQL interface. Apache Drill provides an ANSI SQL interface for querying data from many kinds of data sources. We will use it to query HBase.

If you don't have HBase and Apache Drill on your machine, follow the links below before going further.

a) https://ranafaisal.wordpress.com/2015/05/13/hbase-insallation-on-ubuntu-14-04/

b) https://ranafaisal.wordpress.com/2015/05/13/install-apache-drill-on-ubuntu-14-04/

 

Now let's start.

1) Run the Apache Drill shell: "bin/sqlline -u jdbc:drill:zk=local"

2) Open the web UI to enable the HBase storage plugin. The URL of the web UI is "http://localhost:8047"

3) Now click on the Storage link

4) In Storage, find hbase and click Update

5) Add the following configuration in the text box

{
  "type": "hbase",
  "config": {
    "hbase.zookeeper.quorum": "localhost",
    "hbase.zookeeper.property.clientPort": "2181"
  },
  "enabled": true
}

6) Then click the Update button to enable this plugin

7) In the Apache Drill shell, run "show databases;". The output will now include the HBase databases as well.

8) Now let's query the clicks table: "select * from hbase.clicks;". It will display all the click information.

9) HBase stores values as bytes, so you need to cast each HBase column to VARCHAR to see readable values, like

SELECT CAST(clicks.row_key AS VARCHAR(20)),
CAST(clicks.clickinfo.studentid AS VARCHAR(20)),
CAST(clicks.clickinfo.url AS VARCHAR(20)),
CAST(clicks.iteminfo.quantity AS VARCHAR(20)),
CAST(clicks.iteminfo.itemtype AS VARCHAR(20))
FROM hbase.clicks;

10) Now let's join the students and clicks tables

select cast(s.account.name as varchar(20)) as name, cast(c.clickinfo.url as varchar(100)) as url
from hbase.students as s
join hbase.clicks as c
on cast(s.row_key as varchar(20)) = cast(c.clickinfo.studentid as varchar(20));
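
If you would rather not retype these queries every time, here is a rough sketch that saves them to a file and hands the file to sqlline. It assumes Drill lives in /opt/drill/apache-drill-0.9.0 (DRILL_HOME and QUERY_FILE are names made up for this example), that the bundled sqlline accepts the --run= option, and that the students and clicks tables already exist in HBase. Treat it as a starting point, not a tested recipe for every Drill version.

```shell
#!/bin/sh
# Sketch: save the Drill queries to a file and run them in one go.
# DRILL_HOME and QUERY_FILE are illustrative names, not from any Drill tool.
DRILL_HOME="${DRILL_HOME:-/opt/drill/apache-drill-0.9.0}"
QUERY_FILE="$(mktemp -d)/hbase_queries.sql"

# Quoted heredoc delimiter: the SQL is written out exactly as typed.
cat > "$QUERY_FILE" <<'SQL'
SELECT CAST(clicks.row_key AS VARCHAR(20)) AS id,
       CAST(clicks.clickinfo.studentid AS VARCHAR(20)) AS studentid,
       CAST(clicks.clickinfo.url AS VARCHAR(100)) AS url
FROM hbase.clicks;

SELECT CAST(s.account.name AS VARCHAR(20)) AS name,
       CAST(c.clickinfo.url AS VARCHAR(100)) AS url
FROM hbase.students AS s
JOIN hbase.clicks AS c
  ON CAST(s.row_key AS VARCHAR(20)) = CAST(c.clickinfo.studentid AS VARCHAR(20));
SQL

echo "Wrote queries to $QUERY_FILE"

# Run the file only if Drill is actually installed under DRILL_HOME.
if [ -x "$DRILL_HOME/bin/sqlline" ]; then
  "$DRILL_HOME/bin/sqlline" -u jdbc:drill:zk=local --run="$QUERY_FILE"
else
  echo "sqlline not found under $DRILL_HOME; skipping the run"
fi
```

Keeping the queries in a file also makes it easy to widen a VARCHAR length in one place if a value comes back truncated.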

 

Cheers

Query MongoDB using Apache Drill

Hi guys,

Today, we will see how to query MongoDB using the Apache Drill SQL interface. Apache Drill provides an ANSI SQL interface for querying data from many kinds of data sources. We will use it to query MongoDB.

If you don't have MongoDB and Apache Drill on your machine, follow the links below before going further.

a) https://ranafaisal.wordpress.com/2015/05/13/install-mongodb-on-ubuntu-14-04/

b) https://ranafaisal.wordpress.com/2015/05/13/install-apache-drill-on-ubuntu-14-04/

 

Now let's start.

1) Run the Apache Drill shell: "bin/sqlline -u jdbc:drill:zk=local"

2) Open the web UI to enable the MongoDB storage plugin. The URL of the web UI is "http://localhost:8047"

3) Now click on the Storage link

4) In Storage, find mongo and click Update

5) Add the following configuration in the text box

{
  "type": "mongo",
  "connection": "mongodb://localhost:27017/",
  "enabled": true
}

6) Then click the Update button to enable this plugin

7) In the Apache Drill shell, run "show databases;". The output will now include the MongoDB databases as well.

8) Now let's query the zips collection: "select * from mongo.mydb.zips;". It will display all the zip codes saved in the zips collection.
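
If you prefer to keep the plugin settings in a file instead of typing them into the web UI, the sketch below writes the same configuration with plain ASCII quotes. The optional last step pushes it to Drill over HTTP; the /storage/mongo.json endpoint is documented for Drill 1.x, so whether the version installed here exposes it is an assumption. PLUGIN_FILE, DRILL_URL and DRILL_APPLY are names invented for this example.

```shell
#!/bin/sh
# Sketch: keep the MongoDB storage plugin config in a file with ASCII quotes.
# PLUGIN_FILE, DRILL_URL and DRILL_APPLY are illustrative names only.
DRILL_URL="${DRILL_URL:-http://localhost:8047}"
PLUGIN_FILE="$(mktemp -d)/mongo-plugin.json"

# This is exactly the text that goes into the Storage text box.
cat > "$PLUGIN_FILE" <<'JSON'
{
  "type": "mongo",
  "connection": "mongodb://localhost:27017/",
  "enabled": true
}
JSON

echo "Wrote plugin config to $PLUGIN_FILE"

# Optional and off by default: send the config through the REST endpoint.
# Assumption: the endpoint expects the config wrapped as {"name", "config"}.
if [ "${DRILL_APPLY:-0}" = "1" ]; then
  printf '{"name": "mongo", "config": %s}' "$(cat "$PLUGIN_FILE")" |
    curl -s -X POST -H "Content-Type: application/json" \
         --data-binary @- "$DRILL_URL/storage/mongo.json"
fi
```

Run it as is to only generate the file, or with DRILL_APPLY=1 once Drill is up and you have checked that the endpoint exists in your version.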

Cheers

Install Apache Drill on Ubuntu 14.04

Hi guys,

Today, we are going to install Apache Drill. It is a framework that lets you run ad-hoc queries on many kinds of data sources, such as MongoDB, HBase, CSV files, JSON files, etc. Let's start with its installation.

First, install Oracle JDK 1.7 on your machine. For this, follow this link [https://www.digitalocean.com/community/tutorials/how-to-install-java-on-ubuntu-with-apt-get]

 

1) Download it on Ubuntu using "wget http://getdrill.org/drill/download/apache-drill-0.9.0.tar.gz"

2) Create a directory for the installation: "sudo mkdir -p /opt/drill"

3) Extract it into the installation directory: "sudo tar -xvzf apache-drill-0.9.0.tar.gz -C /opt/drill"

4) Go to its directory: "cd /opt/drill/apache-drill-0.9.0"

5) Run it using "bin/sqlline -u jdbc:drill:zk=local"

6) Let's query a JSON file. Download this JSON file [http://media.mongodb.org/zips.json?_ga=1.139282992.2048111731.1429111258] and save it to "/home/yourusername/zips.json"

7) Query this file using "SELECT * from dfs.`/home/yourusername/zips.json`;"
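
The download and extract steps above can also be collected into one small script. This is only a sketch: it runs in dry-run mode by default and just prints the commands, and DRILL_VERSION, INSTALL_DIR, DRY_RUN and the two helper functions are names made up for this example. It also assumes the download URL follows the same pattern for other versions, which I have not checked.

```shell
#!/bin/sh
# Sketch: the Drill install steps as one script, dry-run by default.
# DRILL_VERSION, INSTALL_DIR and DRY_RUN are illustrative names only.
DRILL_VERSION="${DRILL_VERSION:-0.9.0}"
INSTALL_DIR="${INSTALL_DIR:-/opt/drill}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_drill() {
  tarball="apache-drill-$DRILL_VERSION.tar.gz"
  run wget "http://getdrill.org/drill/download/$tarball"
  run sudo mkdir -p "$INSTALL_DIR"
  run sudo tar -xvzf "$tarball" -C "$INSTALL_DIR"
  echo "Drill home: $INSTALL_DIR/apache-drill-$DRILL_VERSION"
}

install_drill
```

Check the printed commands first, then run the script again with DRY_RUN=0 to actually perform the install.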

Cheers

Install MongoDB on Ubuntu 14.04

Hi guys,

Today, we will see how to install MongoDB on your machine. Please follow the steps below.

1) Run this command: "sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10"

2) Now run this command:

echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb.list

3) This adds the MongoDB repository URL to Ubuntu's package sources.

4) Now update the package list using "sudo apt-get update"

5) Install MongoDB using "sudo apt-get install mongodb-org"

6) Start the service: "sudo service mongod start"

7) Congratulations, MongoDB is installed. Now open the shell using "mongo"

8) For shell commands, read the following tutorial [http://docs.mongodb.org/manual/tutorial/getting-started-with-the-mongo-shell/]

9) Load dummy data for your experiments: download this JSON file [http://media.mongodb.org/zips.json?_ga=1.139282992.2048111731.1429111258]

10) Import it using "mongoimport --db mydb --collection zips --file zips.json"
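
The repository line in step 2 is easy to get wrong because of its nested quotes. Here is a small sketch that builds the same line with a helper function, so you can look at it before writing it anywhere. mongo_repo_line and CODENAME are names made up for this example, and the fallback to trusty assumes you are on Ubuntu 14.04.

```shell
#!/bin/sh
# Sketch: build the MongoDB 3.0 apt source line without nested double quotes.
# mongo_repo_line is an illustrative helper, not part of any tool.
mongo_repo_line() {
  # $1 is the Ubuntu codename, e.g. "trusty" for 14.04
  printf 'deb http://repo.mongodb.org/apt/ubuntu %s/mongodb-org/3.0 multiverse\n' "$1"
}

# Use lsb_release when available; otherwise assume trusty (Ubuntu 14.04).
if command -v lsb_release >/dev/null 2>&1; then
  CODENAME="$(lsb_release -sc)"
else
  CODENAME="trusty"
fi

# Show the line that would be written.
mongo_repo_line "$CODENAME"

# To actually register the repository (needs root), pipe the line into tee:
#   mongo_repo_line "$CODENAME" | sudo tee /etc/apt/sources.list.d/mongodb.list
```

Printing the line first is a cheap way to catch a wrong codename before "sudo apt-get update" complains about it.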

Cheers

HBase Installation on Ubuntu 14.04

Hi guys,

Today, I am going to install HBase on my system, in standalone mode on a single machine without Hadoop. Let's start.

1) Download and install Ubuntu 14.04 on your machine or in a virtual machine

2) Install Oracle JDK 1.7 on your machine. For this, follow this link [https://www.digitalocean.com/community/tutorials/how-to-install-java-on-ubuntu-with-apt-get]

3) Download the HBase tar file from this link [http://www.apache.org/dyn/closer.cgi/hbase/]

4) Extract it using "tar -xvf hbase-1.0.1-bin.tar.gz"

5) Create a directory using "sudo mkdir /usr/lib/hbase"

6) Move your HBase folder to this directory using "sudo mv hbase-1.0.1 /usr/lib/hbase/hbase-1.0.1" (sudo is needed because the directory was created as root)

7) In the HBase directory you will find hbase-env.sh inside the conf directory; open it in any text editor

8) Search for "export JAVA_HOME" and change it to "export JAVA_HOME=/usr/lib/jvm/java-7-oracle" (no spaces around the equals sign), then save the file

9) Now add HBase to your environment variables by opening "gedit ~/.bashrc"

10) Add the lines below at the end of the .bashrc file and save it

export HBASE_HOME=/usr/lib/hbase/hbase-1.0.1

export PATH=$PATH:$HBASE_HOME/bin

11) Run the following command to make these changes take effect: ". ~/.bashrc"

12) Now open conf/hbase-site.xml in a text editor and add the text below to it

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/hduser/HBASE/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hduser/HBASE/zookeeper</value>
  </property>
</configuration>

13) Congratulations, you have installed HBase on your system. Now start it using "sudo bin/start-hbase.sh"

14) Open the HBase shell using "sudo bin/hbase shell"

15) To insert sample data into HBase, please follow this link [https://cwiki.apache.org/confluence/display/DRILL/Querying+HBase]

16) To learn how to use the HBase shell, follow this link [http://akbarahmed.com/2012/08/13/hbase-command-line-tutorial/]
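
The hbase-site.xml from step 12 hard-codes the home directory of a user called hduser. If your user name is different, a sketch like the one below can generate the same file from a variable instead. HBASE_DATA_DIR, OUT_DIR and SITE_FILE are names made up for this example, and the file is written to a temporary directory so nothing in your HBase install is overwritten.

```shell
#!/bin/sh
# Sketch: generate hbase-site.xml from variables instead of hard-coded paths.
# HBASE_DATA_DIR, OUT_DIR and SITE_FILE are illustrative names only.
HBASE_DATA_DIR="${HBASE_DATA_DIR:-$HOME/HBASE}"
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
SITE_FILE="$OUT_DIR/hbase-site.xml"

# Unquoted heredoc delimiter so that $HBASE_DATA_DIR is expanded.
# HBASE_DATA_DIR must be an absolute path for the file:// URL to be valid.
cat > "$SITE_FILE" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$HBASE_DATA_DIR/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>$HBASE_DATA_DIR/zookeeper</value>
  </property>
</configuration>
EOF

echo "Wrote $SITE_FILE; review it, then copy it to \$HBASE_HOME/conf/"
```

Since HBase is started with sudo in step 13, make sure the data directory you choose is one that the root user can write to.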

Cheers