Introduction to Oracle Big Data Cloud Service – Compute Edition (Part IV) –...
This is my fourth blog post about Oracle Big Data Cloud Service – Compute Edition. In my previous blog posts, I showed how we can create a big data cloud service compute edition on Oracle Cloud, which...
Introduction to Oracle Big Data Cloud Service – Compute Edition (Part V) – Pig
This is the last blog post in my introduction series for Oracle Big Data Cloud Service – Compute Edition. In this blog post, I’ll cover “Apache Pig”. It’s a tool/platform created by “Yahoo!” to...
Introduction to Oracle Big Data Cloud Service – Compute Edition (Part VI) – Hive
I thought I would stop writing about “Oracle Big Data Cloud Service – Compute Edition” after my fifth blog post, but then I noticed that I hadn’t mentioned Apache Hive, another important...
Oracle BDCSCE Upgraded: Zeppelin 0.7 and Spark 2.1
Last week, Oracle Big Data Cloud Service – Compute Edition was upgraded from 17.2.5 to 17.3.1-20. I do not know if the new version is still in the testing phase and available only to trial users, but...
Oracle Big Data Cloud Service CE: Working with Hive, Spark and Zeppelin 0.7
In my previous post, I mentioned that Oracle Big Data Cloud Service – Compute Edition now comes with Zeppelin 0.7, and that version 0.7 does not have a Hive interpreter. This means we won’t be able to...
Python for Data Science – Importing CSV, JSON, Excel Using Pandas
Although I think that R is the language for data scientists, I still prefer Python for working with data. In this blog post, I will show you how easy it is to import data from CSV, JSON and Excel files using...
Python for Data Science – Importing XML to Pandas DataFrame
In my previous post, I showed how easy it is to import data from CSV, JSON and Excel files using the Pandas package. Another popular format for exchanging data is XML. Unfortunately, the Pandas package does not have a...
Python for Data Science – Importing table data from a web page
This is another blog post about using the Pandas package. This time, I’ll show you how to import table data from a web page. To be able to get table data, there should be a table defined with table tags...
Oracle Cloud Day Istanbul
Yesterday, I spoke at the Oracle Cloud Day Istanbul. It was an amazing event. The venue (Swissotel the Bosphorus) was great, the conference rooms were comfortable, the presentations were attractive and...
Using Zeppelin to Access MySQL
If you want to access MySQL Cloud Service using Zeppelin on Oracle Big Data Cloud Service Compute Edition (BDCSCE), you can use Spark DataFrames or Zeppelin interpreters. In this blog post, I’ll show...
Using Spark to join data from CSV and MySQL Table
Yesterday, I explained how we can access a MySQL database from Zeppelin, which comes with Oracle Big Data Cloud Service Compute Edition (BDCSCE). Although we can use Zeppelin to access MySQL, we still...
Introduction to Apache Cassandra
On Friday, I gave a presentation about Apache Cassandra at the Big Talk event organized by the Komtaş Information Management company. Cassandra is a top-level Apache project that was born at Facebook. It is a...
Build a Cassandra Cluster on Docker
In this blog post, I’ll show how we can build a three-node Cassandra cluster on Docker for testing. I’ll use the official Cassandra images instead of creating my own images, so the whole process will take only a...
Using Spark to Process Data From Cassandra for Analytics
After my presentation about Apache Cassandra, most people asked whether they could run analytical queries on Cassandra, and how they could integrate Spark with Cassandra. So I decided to write a blog post to...
Introduction to Apache Spark with Python
Today, I spoke about “Apache Spark with Python” at the Big Talk #2 meet-up in Istanbul Teknokent ARI-3, another event organized by Komtas for the big data community. We had an almost full room. Mine was the last...
PySpark Examples #1: Grouping Data from CSV File (Using RDDs)
During my presentation about “Spark with Python”, I said that I would share example code (with detailed explanations). So this is my first code example. In this code, I read data from a CSV file to...
PySpark Examples #2: Grouping Data from CSV File (Using DataFrames)
I continue to share example code related to my “Spark with Python” presentation. In my last blog post, I showed how we use RDDs (the core data structures of Spark). This time, I will use DataFrames...
PySpark Examples #3-4: Spark SQL Module
In this blog post, I’ll share examples #3 and #4 from my presentation to demonstrate the capabilities of the Spark SQL module. As I already explained in my previous blog posts, the Spark SQL module provides...
PySpark Examples #5: Discretized Streams (DStreams)
This is the fourth blog post in which I share sample scripts from my presentation about “Apache Spark with Python”. Spark supports two different ways of streaming: Discretized Streams (DStreams) and...
An Interesting Problem with ODI: Unable to retrieve user GUID
One of my customers had a problem logging in to Oracle Data Integrator (ODI) Studio. Their ODI implementation is configured to use external authentication (Microsoft Active Directory). The...