Here’s a pet peeve of mine: customers who don’t read error messages. The usual symptom is a belief that there is just one error, “Doesn’t work”, and that all forms of “doesn’t work” are the same. So if you tried something, got an error, changed something, and are still getting an error, then nothing changed.
I hope everyone who reads this blog understands why this behavior makes troubleshooting nearly impossible, so I won’t bother to explain why I find it so annoying and so self-defeating. Instead, I’ll explain what we, as developers, can do to improve the situation a bit. (OMG, did I just refer to myself as a developer? I do write code that is then used by customers, so I may as well take responsibility for it.)
Here’s what I see as the main reasons people don’t read error messages:
- The error message is so long that they don’t know where to start reading. Errors with multiple Java stack dumps are especially fun. Stack traces are useful only to people who look at the code, so while it’s important to capture them (for support), in most cases your users don’t need to see all that very specific information.
- Many different errors lead to the same message. The message simply doesn’t indicate what went wrong, because it can be one of many different things. I think Kerberos is the worst offender here: so many of its failures look identical. If this happens often enough, you tune out the error message.
- The error is so technical and cryptic that it gives you no clue about where to start troubleshooting. “Table not Found” is clear. “Call to localhost failed on local exception” is not.
I spend a lot of time explaining to my customers “When <app X> says <this> it means that <misconfiguration> happened and you should <solution>”.
To get users to read error messages, I think error messages should be:
- Short. Single line or less.
- Clear. As much as possible, explain what went wrong in terms your users should understand.
- Actionable. There should be one or two actions that the user should take to either resolve the issue or gather enough information to deduce what happened.
I think Oracle are doing a pretty good job of it. Every one of their errors has an ID number, a short description, an explanation and a proposed solution. See here for example: http://docs.oracle.com/cd/B28359_01/server.111/b28278/e2100.htm#ORA-02140
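As a rough illustration of that structure, here is a minimal Python sketch of an error catalog in which every entry carries an ID, a short description, an explanation and a proposed action. Everything in it is hypothetical: the `APP-0042` ID, the `ErrorEntry` class and the wording are made up for the example and are not taken from Oracle or any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEntry:
    code: str     # stable ID that users can search for
    summary: str  # short, one-line description
    cause: str    # explanation in terms the user understands
    action: str   # what the user should do next


# Hypothetical catalog; the ID and the wording are invented for illustration.
CATALOG = {
    "APP-0042": ErrorEntry(
        code="APP-0042",
        summary="Table not found",
        cause="The table named in the query does not exist in the selected database.",
        action="Check the table name and the database you are connected to.",
    ),
}


def short_message(code):
    """Return the single line that is shown to the user."""
    entry = CATALOG[code]
    return f"{entry.code}: {entry.summary}. {entry.action}"


def full_explanation(code):
    """Return the longer text a user can look up by error ID."""
    entry = CATALOG[code]
    return f"{entry.code}: {entry.summary}\nCause: {entry.cause}\nAction: {entry.action}"


print(short_message("APP-0042"))
```

The user sees only the one-line message, while the stack trace can go to a log file for support and the longer explanation stays one lookup away.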
If we don’t make our errors short, clear and actionable – we shouldn’t be surprised when our users simply ignore them and then complain that our app is impossible to use (or worse – don’t complain, but also don’t use our app).
A question that keeps popping up is “Should we use Kafka or Flume to load data to Hadoop clusters?”
This question implies that Kafka and Flume are interchangeable components. It makes as much sense to me as “Should we use cars or umbrellas?”. Sure, you can hide from the rain in your car and you can use your umbrella when moving from place to place. But in general, these are different tools intended for different use-cases.
Flume’s main use-case is to ingest data into Hadoop. It is tightly integrated with Hadoop’s monitoring system, file system, file formats, and utilities such as Morphlines. A lot of the Flume development effort goes into maintaining compatibility with Hadoop. Sure, Flume’s design of sources, sinks and channels means that it can be used to move data between other systems flexibly, but the important feature is its Hadoop integration.
Kafka’s main use-case is distributed publish-subscribe messaging. Most of the development effort goes into allowing subscribers to read exactly the messages they are interested in, and into making sure the distributed system is scalable and reliable under many different conditions. It was not written to stream data specifically for Hadoop, and using it to read and write data to Hadoop is significantly more challenging than it is in Flume.
Use Flume if you have non-relational data sources, such as log files, that you want to stream into Hadoop.
Use Kafka if you need a highly reliable and scalable enterprise messaging system to connect multiple systems, one of which is Hadoop.
Another Oozie tip blog post.
If you have tried to use the Sqoop action in Oozie, you know you can use the “command” format, with the entire Sqoop configuration in a single line:
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <action name="myfirsthivejob">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <command>import --connect jdbc:hsqldb:file:db.hsqldb --table TT --target-dir hdfs://localhost:8020/user/tucu/foo -m 1</command>
        </sqoop>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...
</workflow-app>
This is convenient, but can be difficult to read and maintain. I prefer using the “arg” syntax, with each argument in its own line:
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <action name="myfirsthivejob">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>jdbc:hsqldb:file:db.hsqldb</arg>
            <arg>--table</arg>
            <arg>TT</arg>
            <arg>--target-dir</arg>
            <arg>hdfs://localhost:8020/user/tucu/foo</arg>
            <arg>-m</arg>
            <arg>1</arg>
        </sqoop>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...
</workflow-app>
As you can see, each argument here is in its own “arg” tag. Even two arguments that belong together, like “--table” and “TT”, go in two separate tags.
If you try to put them together for readability, as I did, Sqoop will throw its entire user manual at you. It took me a while to figure out why this is an issue.
When you call Sqoop from the command line, all the arguments you pass are sent as a String array, and the spaces separate the arguments into array elements. So if you call Sqoop with “--table TT” it will be two elements, “--table” and “TT”.
When using “arg” tags in Oozie, you are basically generating the same array in XML. Oozie will turn the XML argument list into an array and pass it to Sqoop just the way the command line would. Then Sqoop parses it in exactly the same way.
So every item that a space would separate on the command line must go in its own tag in Oozie.
It’s simple and logical once you figure out why :)
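To make the array idea concrete, here is a small Python sketch. It is only an illustration of the principle, not how Oozie or Sqoop are actually implemented (both are Java): `shlex.split` stands in for the shell’s word splitting, and the stripped-down `<sqoop>` snippets and helper names are made up for the example.

```python
import shlex
import xml.etree.ElementTree as ET


def args_from_command_line(command):
    """Split a command line the way a POSIX shell would, giving an argv-style list."""
    return shlex.split(command)


def args_from_xml(action_xml):
    """Collect the text of every <arg> element, in document order."""
    root = ET.fromstring(action_xml)
    return [arg.text for arg in root.iter("arg")]


# Simplified stand-ins for the Oozie action (no namespace, only the arg tags).
separate = "<sqoop><arg>import</arg><arg>--table</arg><arg>TT</arg></sqoop>"
combined = "<sqoop><arg>import</arg><arg>--table TT</arg></sqoop>"

cli_args = args_from_command_line("import --table TT")

print(cli_args)                 # ['import', '--table', 'TT']
print(args_from_xml(separate))  # ['import', '--table', 'TT']
print(args_from_xml(combined))  # ['import', '--table TT']
```

With separate tags the XML produces the same three-element array as the command line. With the combined tag, the parser receives a single element, “--table TT”, which does not match any option it knows.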
If you want to dig a bit more into how Sqoop parses its arguments, it is using Apache Commons CLI with GnuParser. You can read all about it.