Skip navigation.

Surachart Opun

Syndicate content
This page contains my experiences and my thoughts about Oracle and etc... about IT. Perhaps With another way in my life.
Updated: 15 hours 14 min ago

Packt - The $5 eBook Bonanza is here!

Tue, 2014-12-23 00:38
 The $5 eBook Bonanza is here!Spread out news for people who are interested in reading IT books. The $5 eBook Bonanza is here! You will be able to get any Packt eBook or Video for just $5 until January 6th 2015.Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

How Linux Works, 2nd Edition What Every Superuser Should Know by Brian Ward; No Starch Press

Sun, 2014-11-30 08:23
Everyone knows about Linux. It's a popular operating system that is the software on a computer that enables applications and the computer operator to access the devices on the computer to perform desired functions.
You can read more on link what I pointed to it. For me, Linux is a great operating system that I can use it as Desktop and Server. I have used it over ten years. It's very interesting operation system. I have used/worked it with many Open Source Software such as Apache HTTP, Bind, Sendmail, Postfix, Cyrus Imap, Samba and etc. It's operating system that I can play with programming languages as C, PHP, JAVA, Python, Perl and etc. I don't wanna say "too much".
Today, I have a chance to pick up some... a book that was written about Linux - How Linux Works, 2nd Edition What Every Superuser Should Know by Brian Ward. It's a cool book that you can learn about Linux as Starter and Linux Administrator. You could learn some things you have never used, but find in this book. It's fun to learn. However, A book, it's not support every skills in Linux. You will learn
  • How Linux boots, from boot loaders to init implementations (systemd, Upstart, and System V)
  • How the kernel manages devices, device drivers, and processes
  • How networking, interfaces, firewalls, and servers work
  • How development tools work and relate to shared libraries
  • How to write effective shell scripts 
It might not be something too much for learning as you are expecting. However, It 's a good book that you can enjoy to read a book about Linux. There's easy to read and understanding in a book. It's for some people who are starting with Linux and Linux Administrators who are enjoying to learn and want to get something new that can use in their fields.

Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Oracle Database 12C Certified Professional SQL Foundations by Steve Ries; Infinite Skills

Wed, 2014-11-19 01:36
How to become Become an Oracle Certified Associate? That's a good question for some people who want to start working on Oracle Database and get first Oracle Certification for their work life. Step 1 - Pass one SQL Exam: Oracle Database 12c: SQL Fundamentals 1Z0-061 or Oracle Database 11g: SQL Fundamentals I 1Z0-051 or Oracle Database SQL Expert 1Z0-047. Step 2 - Pass Exam Oracle Database 12c: Installation and Administration 1Z0-062.
With Step 1 -  Oracle Database 12c: SQL Fundamentals 1Z0-061 exam that you should have skills SQL SELECT statement, subqueries, data manipulation, and data definition language. I was not encouraging to learn SQL SELECT statement, subqueries, data manipulation, data definition language and things like that, but in the real-world for working with Oracle Database you have to know them. You can read about them in Oracle Documents, Oracle Learning Library, Oracle University and the Internet. If you are looking for some video training that instruct you for Oracle Database SQL Foundations such as  sql-select, dml, ddl and etc. I mention Oracle Database - 12C Certified Professional SQL Foundations by Steve Ries. I watched it on O'reilly,that is very fast for streaming and downloading. The "Oracle Database - 12C Certified Professional SQL Foundations" video training, that helps learning Oracle Foundations looks easier (watch and do by own) and instructor spoke each topic very clear and easy for listening.

FYI, You can watch some free video.

Video Training: Oracle Database - 12C Certified Professional SQL Foundation
Instructor: Steve RiesWritten By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Think Stats, 2nd Edition Exploratory Data Analysis By Allen B. Downey; O'Reilly Media

Mon, 2014-11-17 08:15
Lots of Python with data analysis books. This might be a good one that is able to help readers perform statistical analysis with programs written in Python. Think Stats, 2nd Edition Exploratory Data Analysis by Allen B. Downey(@allendowney).
This second edition of Think Stats includes the chapters from the first edition, many of them substantially revised, and new chapters on regression, time series analysis, survival analysis, and analytic methods. Additional, It uses uses pandas, SciPy, or StatsModels in Python. Author developed this book using Anaconda from Continuum Analytics. Readers should use it, that will easy from them. Anyway, I tested on Ubuntu and installed pandas, NumPy, SciPy, StatsModels, and matplotlib packages. This book has 14 chapters relate with processes that author works with a dataset. It's for intermediate reader. So, Readers should know how to program (In a book uses Python), and skill in mathematical + statistical.
Each chapter includes exercises that readers can practice and get more understood. Free Sampler
  • Develop an understanding of probability and statistics by writing and testing code.
  • Run experiments to test statistical behavior, such as generating samples from several distributions.
  • Use simulations to understand concepts that are hard to grasp mathematically.
  • Import data from most sources with Python, rather than rely on data that’s cleaned and formatted for statistics tools.
  • Use statistical inference to answer questions about real-world data.
surachart@surachart:~/ThinkStats2/code$ pwd
/home/surachart/ThinkStats2/code
surachart@surachart:~/ThinkStats2/code$ ipython notebook  --ip=0.0.0.0 --pylab=inline &
[1] 11324
surachart@surachart:~/ThinkStats2/code$ 2014-11-17 19:39:43.201 [NotebookApp] Using existing profile dir: u'/home/surachart/.config/ipython/profile_default'
2014-11-17 19:39:43.210 [NotebookApp] Using system MathJax
2014-11-17 19:39:43.234 [NotebookApp] Serving notebooks from local directory: /home/surachart/ThinkStats2/code
2014-11-17 19:39:43.235 [NotebookApp] The IPython Notebook is running at: http://0.0.0.0:8888/
2014-11-17 19:39:43.236 [NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
2014-11-17 19:39:43.236 [NotebookApp] WARNING | No web browser found: could not locate runnable browser.
2014-11-17 19:39:56.120 [NotebookApp] Connecting to: tcp://127.0.0.1:38872
2014-11-17 19:39:56.127 [NotebookApp] Kernel started: f24554a8-539f-426e-9010-cb3aa3386613
2014-11-17 19:39:56.506 [NotebookApp] Connecting to: tcp://127.0.0.1:43369
2014-11-17 19:39:56.512 [NotebookApp] Connecting to: tcp://127.0.0.1:33239
2014-11-17 19:39:56.516 [NotebookApp] Connecting to: tcp://127.0.0.1:54395
Book: Think Stats, 2nd Edition Exploratory Data Analysis
Author: Allen B. Downey(@allendowney)Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Getting Started with Impala Interactive SQL for Apache Hadoop by John Russell; O'Reilly Media

Sat, 2014-10-25 11:52
Impala is open source and a query engine that runs on Apache Hadoop. With Impala, you can query data, whether stored in HDFS or Apache HBase – including SELECT, JOIN, and aggregate functions – in real time. If you are looking for a book getting start with it - Getting Started with Impala Interactive SQL for Apache Hadoop by John Russell (@max_webster). Assist readers to write, tune, and port SQL queries and other statements for a Big Data environment, using Impala. The SQL examples in this book start from a simple base for easy comprehension, then build toward best practices that demonstrate high performance and scalability. For readers, you can download QuickStart VMs and install. After that, you can use it with examples in a book.
In a book, it doesn't assist readers to install Impala or how to solve the issue from installation or configuration. It has 5 chapters and not much for the number of pages, but enough to guide how to use Impala (Interactive SQL) and has good examples. With chapter 5 - Tutorials and Deep Dives, that it's highlight in a book and the example in a chapter that is very useful.
Free Sampler.

This book assists readers.
  • Learn how Impala integrates with a wide range of Hadoop components
  • Attain high performance and scalability for huge data sets on production clusters
  • Explore common developer tasks, such as porting code to Impala and optimizing performance
  • Use tutorials for working with billion-row tables, date- and time-based values, and other techniques
  • Learn how to transition from rigid schemas to a flexible model that evolves as needs change
  • Take a deep dive into joins and the roles of statistics
[test01:21000] > select "Surachart Opun" Name,  NOW() ;
Query: select "Surachart Opun" Name,  NOW()
+----------------+-------------------------------+
| name           | now()                         |
+----------------+-------------------------------+
| Surachart Opun | 2014-10-25 23:34:03.217635000 |
+----------------+-------------------------------+
Returned 1 row(s) in 0.14sBook: Getting Started with Impala Interactive SQL for Apache HadoopAuthor: John Russell (@max_webster)Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Learning Spark Lightning-Fast Big Data Analytics by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia; O'Reilly Media

Sat, 2014-10-18 12:45
Apache Spark started as a research project at UC Berkeley in the AMPLab, which focuses on big data analytics. Spark is an open source cluster computing platform designed to be fast and general-purpose for data analytics - It's both fast to run and write. Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop MapReduce. Users can write applications quickly in Java, Scala or Python. In additional, it's easy to run standalone or on EC2 or Mesos. It can read data from HDFS, HBase, Cassandra, and any Hadoop data source.
If you would like a book about Spark - Learning Spark Lightning-Fast Big Data Analytics by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia. It's a great book for who is interested in Spark development and starting with it. Readers will learn how to express MapReduce jobs with just a few simple lines of Spark code and more...
  • Quickly dive into Spark capabilities such as collect, count, reduce, and save
  • Use one programming paradigm instead of mixing and matching tools such as Hive, Hadoop, Mahout, and S4/Storm
  • Learn how to run interactive, iterative, and incremental analyses
  • Integrate with Scala to manipulate distributed datasets like local collections
  • Tackle partitioning issues, data locality, default hash partitioning, user-defined partitioners, and custom serialization
  • Use other languages by means of pipe() to achieve the equivalent of Hadoop streaming
With Early Release - 7 chapters. Explained Apache Spark overview, downloading and commands that should know, programming with RDDS (+ more advance) as well as working with Key-Value Pairs, etc. Easy to read and Good examples in a book. For people who want to learn Apache Spark or use Spark for Data Analytic. It's a book, that should keep in shelf.

Book: Learning Spark Lightning-Fast Big Data Analytics
Authors: Holden KarauAndy KonwinskiPatrick WendellMatei ZahariaWritten By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Using Flume - Flexible, Scalable, and Reliable Data Streaming by Hari Shreedharan; O'Reilly Media

Thu, 2014-10-09 02:37
Hadoop is an open-source software framework for storage and large-scale processing of data-sets on clusters of commodity hardware. How to deliver log to Hadoop HDFS. Apache Flume is open source to integrate with HDFS, HBASE and it's a good choice to implement for log data real-time collection from front end or log data system.
Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.It uses a simple data model. Source => Channel => Sink
It's a good time to introduce a good book about Flume - Using Flume - Flexible, Scalable, and Reliable Data Streaming by Hari Shreedharan (@harisr1234). It was written with 8 Chapters: giving basic about Apache Hadoop and Apache HBase, idea for Streaming Data Using Apache Flume, about Flume Model (Sources, Channels, Sinks), and some moew for Interceptors, Channel Selectors, Sink Groups, and Sink Processors. Additional, Getting Data into Flume* and Planning, Deploying, and Monitoring Flume.

This book was written about how to use Flume. It's very good to guide about Apache Hadoop and Apache HBase before starting about Flume Data flow model. Readers should know about java code, because they will find java code example in a book and easy to understand. It's a good book for some people who want to deploy Apache Flume and custom components.
Author separated each Chapter for Flume Data flow model. So, Readers can choose each chapter to read for part of Data flow model: reader would like to know about Sink, then read Chapter 5 only until get idea. In addition, Flume has a lot of features, Readers will find example for them in a book. Each chapter has references topic, that readers can use it to find out more and very easy + quick to use in Ebook.
With Illustration in a book that is helpful with readers to see Big Picture using Flume and giving idea to develop it more in each System or Project.
So, Readers will be able to learn about operation and how to configure, deploy, and monitor a Flume cluster, and customize examples to develop Flume plugins and custom components for their specific use-cases.
  • Learn how Flume provides a steady rate of flow by acting as a buffer between data producers and consumers
  • Dive into key Flume components, including sources that accept data and sinks that write and deliver it
  • Write custom plugins to customize the way Flume receives, modifies, formats, and writes data
  • Explore APIs for sending data to Flume agents from your own applications
  • Plan and deploy Flume in a scalable and flexible way—and monitor your cluster once it’s running
Book: Using Flume - Flexible, Scalable, and Reliable Data Streaming
Author: Hari ShreedharanWritten By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

rsyslog: Send logs to Flume

Mon, 2014-10-06 04:12
Good day for learning something new. After read Flume book, that something popped up in my head. Wanted to test "rsyslog" => Flume => HDFS. As we know, forwarding log to other systems. We can set rsyslog:
*.* @YOURSERVERADDRESS:YOURSERVERPORT ## for UDP
*.* @@YOURSERVERADDRESS:YOURSERVERPORT ## for TCPFor rsyslog:
[root@centos01 ~]# grep centos /etc/rsyslog.conf
*.* @centos01:7777Came back to Flume, I used Simple Example for reference and changed a bit. Because I wanted it write to HDFS.
[root@centos01 ~]# grep "^FLUME_AGENT_NAME\="  /etc/default/flume-agent
FLUME_AGENT_NAME=a1
[root@centos01 ~]# cat /etc/flume/conf/flume.conf
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
#a1.sources.r1.type = netcat
a1.sources.r1.type = syslogudp
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 7777
# Describe the sink
#a1.sinks.k1.type = logger
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://localhost:8020/user/flume/syslog/%Y/%m/%d/%H/
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.batchSize = 10000
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 10000
a1.sinks.k1.hdfs.filePrefix = syslog
a1.sinks.k1.hdfs.round = true


# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
[root@centos01 ~]# /etc/init.d/flume-agent start
Flume NG agent is not running                              [FAILED]
Starting Flume NG agent daemon (flume-agent):              [  OK  ]Tested to login by ssh.
[root@centos01 ~]#  tail -0f  /var/log/flume/flume.log
06 Oct 2014 16:35:40,601 INFO  [hdfs-k1-call-runner-0] (org.apache.flume.sink.hdfs.BucketWriter.doOpen:208)  - Creating hdfs://localhost:8020/user/flume/syslog/2014/10/06/16//syslog.1412588139067.tmp
06 Oct 2014 16:36:10,957 INFO  [hdfs-k1-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.renameBucket:427)  - Renaming hdfs://localhost:8020/user/flume/syslog/2014/10/06/16/syslog.1412588139067.tmp to hdfs://localhost:8020/user/flume/syslog/2014/10/06/16/syslog.1412588139067
[root@centos01 ~]# hadoop fs -ls hdfs://localhost:8020/user/flume/syslog/2014/10/06/16/syslog.1412588139067
14/10/06 16:37:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 flume supergroup        299 2014-10-06 16:36 hdfs://localhost:8020/user/flume/syslog/2014/10/06/16/syslog.1412588139067
[root@centos01 ~]#
[root@centos01 ~]#
[root@centos01 ~]# hadoop fs -cat hdfs://localhost:8020/user/flume/syslog/2014/10/06/16/syslog.1412588139067
14/10/06 16:37:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
sshd[20235]: Accepted password for surachart from 192.168.111.16 port 65068 ssh2
sshd[20235]: pam_unix(sshd:session): session opened for user surachart by (uid=0)
su: pam_unix(su-l:session): session opened for user root by surachart(uid=500)
su: pam_unix(su-l:session): session closed for user rootLook good... Anyway, It needs to adapt more...



Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs