Archive for the 'Apache Drill' Category

Query HBase using Apache Drill

Hi guys,

Today we will see how to query HBase through Apache Drill’s SQL interface. Apache Drill provides an ANSI SQL interface for querying data from many kinds of data sources. We will use it to query HBase.

If you don’t have HBase and Apache Drill on your machine, follow the links below before going further.

a) https://ranafaisal.wordpress.com/2015/05/13/hbase-insallation-on-ubuntu-14-04/

b) https://ranafaisal.wordpress.com/2015/05/13/install-apache-drill-on-ubuntu-14-04/

 

Now let’s start.

1) Run the Apache Drill shell: “bin/sqlline -u jdbc:drill:zk=local”

2) Open the web UI to enable the HBase driver. The URL of the web UI is “http://localhost:8047”

3) Now click on the Storage link

4) Under Storage, find hbase and click Update

5) Enter the following configuration in the text box:

{
  "type": "hbase",
  "config": {
    "hbase.zookeeper.quorum": "localhost",
    "hbase.zookeeper.property.clientPort": "2181"
  },
  "enabled": true
}

6) Then click Update button to enable this driver

7) In the Apache Drill shell, run “show databases;”. The output now includes the HBase databases as well.

8) Now let’s query the clicks table: “select * from hbase.clicks;”. This displays all the click information.

9) HBase stores values as raw bytes, so you need to cast each column to varchar to see readable values, for example:

SELECT CAST(clicks.row_key as VarChar(20)), CAST(clicks.clickinfo.studentid as VarChar(20)), CAST (clicks.clickinfo.url as VarChar(20)), CAST (clicks.iteminfo.quantity as VarChar(20)), CAST (clicks.iteminfo.itemtype as VarChar(20)) FROM hbase.clicks;

10) Now let’s join the students and clicks tables:

select cast(s.account.name as varchar(20)) as name, cast(c.clickinfo.url as varchar(100)) as url
from hbase.students as s
join hbase.clicks as c
on cast(s.row_key as varchar(20)) = cast(c.clickinfo.studentid as varchar(20));
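
The casts in steps 9 and 10 all follow one pattern: alias, column family, qualifier, wrapped in CAST ... AS VARCHAR. If you have many columns, you could generate the statement instead of typing it. Below is a minimal sketch in Python; the helper names (cast_expr, build_hbase_select) and the alias "c" are my own and not part of Drill. It only builds the SQL text, which you would then paste into the Drill shell.

```python
def cast_expr(alias, family, qualifier, width=20):
    """Build one CAST(alias.family.qualifier AS VARCHAR(width)) AS qualifier item."""
    return "CAST({a}.{f}.{q} AS VARCHAR({w})) AS {q}".format(
        a=alias, f=family, q=qualifier, w=width)


def build_hbase_select(table, alias, columns, width=20):
    """Build a SELECT that casts the row key and every listed column.

    columns is a list of (column_family, qualifier) pairs.
    """
    parts = ["CAST({a}.row_key AS VARCHAR({w})) AS row_key".format(a=alias, w=width)]
    parts += [cast_expr(alias, family, qualifier, width) for family, qualifier in columns]
    return "SELECT {cols} FROM {t} AS {a};".format(
        cols=", ".join(parts), t=table, a=alias)


# Column families and qualifiers taken from the clicks query in step 9.
clicks_columns = [
    ("clickinfo", "studentid"),
    ("clickinfo", "url"),
    ("iteminfo", "quantity"),
    ("iteminfo", "itemtype"),
]

query = build_hbase_select("hbase.clicks", "c", clicks_columns)
print(query)
```

Pass a larger width (for example 100 for the url column) if your values are longer than 20 characters, otherwise the cast will cut them short.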

 

Cheers

Query MongoDB using Apache Drill

Hi guys,

Today we will see how to query MongoDB through Apache Drill’s SQL interface. Apache Drill provides an ANSI SQL interface for querying data from many kinds of data sources. We will use it to query MongoDB.

If you don’t have MongoDB and Apache Drill on your machine, follow the links below before going further.

a) https://ranafaisal.wordpress.com/2015/05/13/install-mongodb-on-ubuntu-14-04/

b) https://ranafaisal.wordpress.com/2015/05/13/install-apache-drill-on-ubuntu-14-04/

 

Now let’s start.

1) Run the Apache Drill shell: “bin/sqlline -u jdbc:drill:zk=local”

2) Open the web UI to enable the MongoDB driver. The URL of the web UI is “http://localhost:8047”

3) Now click on the Storage link

4) Under Storage, find mongo and click Update

5) Enter the following configuration in the text box:

{
  "type": "mongo",
  "connection": "mongodb://localhost:27017/",
  "enabled": true
}

6) Then click Update button to enable this driver

7) In the Apache Drill shell, run “show databases;”. The output now includes the MongoDB databases as well.

8) Now let’s query the zips collection: “select * from mongo.mydb.zips;”. This displays all the zip codes saved in the zips collection.
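
If you would rather script these steps than click through the web UI, something like the following Python sketch may work. It assumes your Drill version exposes the REST endpoints described in the Drill REST API documentation for later releases (POST /storage/{name}.json and POST /query.json on port 8047); I have not checked this against 0.9.0, so treat it as a starting point. The helper names are my own, and main() is not called automatically because it needs a running Drill and MongoDB.

```python
import json
import urllib.request

DRILL_URL = "http://localhost:8047"  # web UI address from step 2


def mongo_plugin_payload(connection="mongodb://localhost:27017/"):
    """Wrap the step 5 configuration in the body the storage endpoint expects."""
    return {
        "name": "mongo",
        "config": {"type": "mongo", "connection": connection, "enabled": True},
    }


def query_payload(sql):
    """Body for a SQL query; note there is no trailing semicolon over REST."""
    return {"queryType": "SQL", "query": sql}


def build_request(path, payload, base_url=DRILL_URL):
    """Create a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    # Assumed endpoints: /storage/mongo.json and /query.json.
    send(build_request("/storage/mongo.json", mongo_plugin_payload()))
    result = send(build_request(
        "/query.json", query_payload("SELECT * FROM mongo.mydb.zips LIMIT 5")))
    for row in result.get("rows", []):
        print(row)
```

With Drill and MongoDB running, call main() to enable the plugin and print the first five rows of the zips collection.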

Cheers

Install Apache Drill on Ubuntu 14.04

Hi guys,

Today we are going to install Apache Drill. It is a framework that lets you run ad-hoc queries on a variety of data sources, such as MongoDB, HBase, CSV files, and JSON files. Let’s start with the installation.

First, install Oracle JDK 1.7 on your machine by following this link: [https://www.digitalocean.com/community/tutorials/how-to-install-java-on-ubuntu-with-apt-get]

 

1) Download it on Ubuntu using “wget http://getdrill.org/drill/download/apache-drill-0.9.0.tar.gz”

2) Create a directory for the installation: “sudo mkdir -p /opt/drill”

3) Extract the archive into the installation directory: “sudo tar -xvzf apache-drill-0.9.0.tar.gz -C /opt/drill”

4) Change to its directory: “cd /opt/drill/apache-drill-0.9.0”

5) Run it using “bin/sqlline -u jdbc:drill:zk=local”

6) Let’s query a JSON file. Download the JSON file [http://media.mongodb.org/zips.json?_ga=1.139282992.2048111731.1429111258] and save it to “/home/yourusername/zips.json”

7) Query this file using “SELECT * from dfs.`/home/yourusername/zips.json`;”
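
Before running the query in step 7, it can help to peek at the file to see what Drill will read. Here is a small Python sketch that parses the first few records. It assumes zips.json holds one JSON document per line, and the sample records below are made up for illustration (the field names are my assumption about the file, not copied from it). The function names are my own.

```python
import json


def read_json_lines(lines, limit=5):
    """Parse up to `limit` non-empty lines, each holding one JSON document."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        records.append(json.loads(line))
        if len(records) >= limit:
            break
    return records


def preview_file(path, limit=5):
    """Read the first `limit` records from a line-delimited JSON file."""
    with open(path, encoding="utf-8") as handle:
        return read_json_lines(handle, limit)


# Made-up sample records, only to show the expected shape of the input.
sample = [
    '{"_id": "00001", "city": "EXAMPLE", "pop": 100, "state": "XX"}',
    '',
    '{"_id": "00002", "city": "SAMPLE", "pop": 200, "state": "XX"}',
]

records = read_json_lines(sample)
for record in records:
    print(record)
```

To look at the real file, call preview_file("/home/yourusername/zips.json"). The keys you see there are the column names that “SELECT *” returns in Drill.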

Cheers