Using data from pwcmoneytree.com and easy-to-use dashboard software, we analyze a huge dataset that spans 20 years of venture capital investment, from 1995 onward. Data that reaches this far back should give us enough to extract the necessary analytical juice.
The year 2000 was definitely the peak of VC investment craziness. A whopping 105 billion was pumped into startups to bring them quickly to IPO. Ever since the crash of 2000…
Many of the entities have already paid the money back to the government with interest, so the government has made a profit on them. The government has also lost money on many other organizations/companies that failed to repay.
Check out the full visualization of the profit/loss analysis of the government bailout money
And here is the big list of all companies/organizations that made a profit or loss, along with the percentage
How has the interest in Big Data, Hadoop, Business Intelligence, Analytics and Dashboards changed over the years?
One easy way to gauge the interest is to measure how much news is generated for each related term, and Google Trends lets you do that very easily.
Plugging all of the above terms into Google Trends and analyzing the results further leads to the following visualizations.
Aggregating the results by year
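For readers who want to reproduce the yearly roll-up outside a dashboard tool, here is a minimal Python sketch. It assumes weekly interest scores shaped like a Google Trends export and averages them per term and year. The sample values are made up, and averaging (rather than summing) is our assumption, not necessarily what the charts use.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical weekly interest scores (0-100), shaped like a Google Trends export.
# The values are invented for illustration only.
weekly_interest = [
    (date(2012, 1, 1), "Big Data", 20),
    (date(2012, 7, 1), "Big Data", 30),
    (date(2013, 1, 6), "Big Data", 50),
    (date(2013, 7, 7), "Big Data", 70),
    (date(2012, 1, 1), "Dashboards", 40),
    (date(2013, 1, 6), "Dashboards", 42),
]


def aggregate_by_year(rows):
    """Average the weekly scores for each (term, year) pair."""
    buckets = defaultdict(list)
    for week, term, score in rows:
        buckets[(term, week.year)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


yearly = aggregate_by_year(weekly_interest)
for (term, year), average in sorted(yearly.items()):
    print(term, year, average)
```

Swapping `mean` for `sum` would give yearly totals instead of yearly averages.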
It is striking that the stream representing Dashboards has remained constant throughout the years.
The streams for Analytics and Business Intelligence in general exhibit a similar trend.
The Analytics stream widens as we move forward, helped by terms such as Hadoop, Big Data and Analytics being used almost together.
Now check the line chart below
It looks like the trend for Dashboards defines the lower bound and the trend for Business Intelligence defines the upper bound. The trend for Hadoop started around the first quarter of 2007. The trend for Big Data started around the third quarter of 2008, and both have been increasing rapidly ever since. It remains to be seen whether they will cross “Business Intelligence” in popularity or merge and find a stable position somewhere in the middle.
Before Big Data and Hadoop came into the picture, the term “Analytics” held stable ground close to Dashboards, but now the trend for Analytics seems to be following Big Data and Hadoop.
Let us take a deeper look into each week since 2004
Look at the downward spikes occurring around Christmas time. Nobody wants to hear about Big Data or Dashboards during the holidays.
And finally, here is a quarterly cyclical view
Click here to view the full interactive Visualizations
Quarterly breakdown of units sold by manufacturer
Here is a more detailed breakdown of quarterly expenditure on fruits (figures in 100 million)
Average undergraduate tuition and fees and room and board rates
These figures are inflation-adjusted. Look how much the tuition fees alone have increased compared to the dorm and board rates
Now comparing the rate increase for the 2-year program
So for the 2-year program, the board rates have remained at the same level compared to the dorm rates.
Now check out the interesting graph for the 4-year program below
Comparing the slope of the 2-year board rates to the 4-year board rates, the 4-year rates show a significant increase
If the price of meals is the same for both programs, then the 4-year and 2-year programs should have the same slope. So why is the 4-year slope different from the 2-year slope?
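To put a number on "slope", one option is an ordinary least-squares fit of rate against year. The sketch below is only an illustration: the rates are hypothetical placeholders, not figures from the data table, and the function name is ours.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical inflation-adjusted board rates in dollars (NOT the real figures).
years = [2000, 2005, 2010]
board_2yr = [3000, 3050, 3100]
board_4yr = [3000, 3500, 4000]

slope_2yr = slope(years, board_2yr)  # dollars per year
slope_4yr = slope(years, board_4yr)
print(slope_2yr, slope_4yr)
```

With real data, the gap between the two slopes is the quantity the question above is asking about.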
Now, let us look at the dorm rates
And finally the 4 year vs 2 year Tuition rates
Here is the data table for the above visualization
The stacked bar gives a total view of the killings and how they have grown over the years
Comparing the killings on a line chart, we see that female deer killings show an uptick from 2008 onwards
iCasualties.org maintains a documented list of all fatalities for the Iraq and Afghanistan wars.
Analysing the dataset for Afghanistan, we summarize the results by year
NOTE: This contains only Afghanistan metrics. We will later update the visuals to reflect the Iraq war.
We are approaching the levels of 2002, and we hope that we don’t have to suffer another war.
Here is another view by year and month
InfoCaptor : Analytics & dashboards
The dataset contains the age of each person who died in the war, so we summarize by age
Checking it against the year
Why are there so many deaths between the ages of 20 and 30 in 2014?
Where did most of the deaths occur?
Where were the soldiers from?
Cause of Death
Helicopter crashes are one of the top causes of death in non-hostile situations
GitHub maintains a list of all DMCA takedown notices along with counteractions and retractions if any.
Analysing all the notices since 2011, it seems that takedown notices are on the rise.
Year view: Notice the sharp increase in 2014
Quarterly view: Looking at the quarterly breakdown, the takedowns seem to be cooling off in the later quarters.
So who is issuing these DMCA takedowns?
Here is the complete list of all companies that issued DMCA takedowns
NOTE: The names were extracted from the description text
And here are the counteractions and retractions
So the important question is “Why have the DMCA takedown notices increased?”
One important thing to note is that sites like Stack Overflow encourage replicating the content of the web page from which the original idea, algorithm or source code was copied. To be honest, this is a good thing, because a lot of the time the referenced sites become zombies and you don’t want to lose this knowledge. But could it be that such source code, stripped of any reference to its origin, ends up on GitHub and causes the increase in takedown notices as companies start discovering it?
Let us move on from grass-eating sauropods and talk about who’s who in the analytics space.
Analytics companies are a dime a dozen. Everybody who provides a freaking dashboard is an analytics company. Anybody who merely mentions Google, Facebook, Hadoop etc. in the same sentence is somehow into Big Data. Haven’t you stumbled across company pages that claim expertise in analytics and big data but want you to schedule a call with them? They don’t have any products or solutions to showcase, yet they are Big Data/analytics folks.
So to make things easy, Mattermark released this highly curated list of 100 analytics companies. No offense to Big Data, but small datasets like these are always juicy.
Mattermark ranks each company using its own algorithm and calls the result the “Mattermark Score”. After loading the list, we came up with these visualizations
For each funding stage, it lists the companies by Mattermark score.
Some interesting questions
1. How many companies by funding stage?
2. What is the funding by location and stage?
We thought the above visual would tell us what kind of logic Mattermark used to rank the companies. As suspected, we cannot reverse engineer it without additional information about the companies.
First we asked which program is at the top (duh!!), by how much, who is next on the list, and so on.
Like most data scientists who believe in the power of simple bar graphs, we used our first “chart weapon” of choice, and here is what it rendered.
Y Combinator is freaking huge, like a dinosaur; in fact, the chart very much resembles a grass-eating sauropod. We had to create a chart that was 3000 pixels wide just to accommodate everything.
See the resemblance between the chart and the Sauropod?
To get a better perspective, we rendered it as a treemap, as shown
Looking at the treemap, Y Combinator occupies more than the sum total of all the remaining accelerators. That is super amazing, but the problem was that our charts were not coming out beautiful. YC is clearly the outlier and was making it difficult to understand the rest of the startup ecosystem.
We said, let’s cut off the head to dig deeper.
The moment we filtered out YC from our analysis, all of the regions became colorful and that was certainly a visual treat.
Now we could clearly see which other accelerators/programs are roughly the same size.
TechStars Boulder and AngelPad are roughly the same
TechStars NYC, TechStars Boston and 500Startups are in the same club
Similarly DreamIT, fbFund and Mucker Lab share the same color.
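The effect of "cutting off the head" is easy to reproduce in a few lines of Python. The sketch below computes each accelerator's share of the total with and without the outlier. The startup counts are hypothetical placeholders (only the accelerator names come from our dataset), and the function name is ours.

```python
# Hypothetical startup counts per accelerator; only the names come from the article.
startups_per_accelerator = {
    "Y Combinator": 600,
    "TechStars Boulder": 100,
    "AngelPad": 100,
    "500Startups": 200,
}


def shares(counts, exclude=()):
    """Return each entry's fraction of the total, skipping excluded names."""
    kept = {name: n for name, n in counts.items() if name not in exclude}
    total = sum(kept.values())
    return {name: n / total for name, n in kept.items()}


with_yc = shares(startups_per_accelerator)
without_yc = shares(startups_per_accelerator, exclude={"Y Combinator"})

print(with_yc)
print(without_yc)
```

With the outlier included, every other accelerator is squeezed into a sliver of the chart. Once it is excluded, the remaining shares spread out and their relative sizes become readable.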
Now let us try to see from the location angle
So we re-established that YC is freaking huge and having them on a chart with other accelerators does not create beautiful visualizations.
Digging into the Boston public dataset can reveal interesting and juicy facts.
There is nothing juicy about bed bugs, but the data on Boston’s open bed bug cases is quite interesting and worth looking at.
We uploaded the entire 50 MB data dump, which is around 500K rows, into the Data Visualizer and filtered the category for bed bugs. Splitting the date into its date hierarchy components, we then plotted the month on the Y axis.
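For anyone who prefers to script the same two steps (filter by category, then split the date into its hierarchy), here is a rough standard-library Python sketch. The column names `open_dt` and `category`, the date format, and the sample rows are all assumptions for illustration and may not match the actual Boston export.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical extract; column names, date format and rows are assumptions.
SAMPLE_CSV = """open_dt,category
2012-07-03 10:15:00,Bed Bugs
2012-07-19 08:00:00,Bed Bugs
2012-12-02 14:30:00,Bed Bugs
2012-07-04 09:00:00,Street Lights
"""


def date_hierarchy(ts):
    """Split a timestamp into year, quarter and month components."""
    return {"year": ts.year, "quarter": (ts.month - 1) // 3 + 1, "month": ts.month}


def count_cases_by_month(csv_text, category):
    """Count rows of one category per (year, quarter, month)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["category"] != category:
            continue
        opened = datetime.strptime(row["open_dt"], "%Y-%m-%d %H:%M:%S")
        parts = date_hierarchy(opened)
        counts[(parts["year"], parts["quarter"], parts["month"])] += 1
    return counts


bed_bug_counts = count_cases_by_month(SAMPLE_CSV, "Bed Bugs")
for key, n in sorted(bed_bug_counts.items()):
    print(key, n)
```

For the real 500K-row file, the same loop works if you pass an open file handle to `csv.DictReader` instead of an in-memory string.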
It seems that the City of Boston started collecting this data around 2011 and has only partial data for that year.
Interestingly, the number of bed bug cases seems to rise during the summer months.
Now if we break the lines into Quarters (we just add the quarter hierarchy to the mix)
Recently, here at InfoCaptor, we started a small research project on flags. We wanted to answer questions such as: what are the most frequently used colors across all country flags, what are the different patterns, and so on.
The innovation engine in the field of Business Intelligence and data visualization tools is certainly cranked up. Qlikview, Tableau and Tibco Spotfire introduced a new category of data visualization tools to the field of Business Intelligence.
Now every vendor offers some form of Data Discovery. Oracle is also working on something similar, adding to the confusing mix of its OBIEE stack.
With the launch of the new InfoCaptor, you can perform ad-hoc data visualizations and build dashboards all within the browser. Now that is refreshing. The browser is the key here. Once you deploy on the server, users can simply log in and upload their datasets or point to an existing database connection. Before you know it, users are already slicing and dicing their datasets and swimming in the world of beautiful visualizations. Yes, the visualizations are absolutely stunning, and why shouldn’t they be? They are based on the excellent d3js.org library.
The key here is that the browser is your canvas, and it is pretty huge. For example, the default size for the visuals takes up my entire browser screen. I like big visuals, and if I am producing a Trellis chart, I can simply drag the corners and resize it. The visualization library is very comprehensive and offers around 30 visuals. It provides the bullet graph as well, for KPI tracking.
Here are some screenshots from the website
InfoCaptor is also available in the cloud as a service, and there are a few live analyses to try out without logging in or installing anything.
I would say that with this release, small business owners have truly found their Tableau or Qlikview alternative.
Go check out the new InfoCaptor Data Visualizer