KhataBook vs. Goliath

Last week I saw the following Tweet from the KhataBook co-founder:

It got me thinking about another David vs. Goliath battle: Dropbox vs. Google Drive. Dropbox is a cross-platform file-syncing app that launched in 2007. From the start, people kept saying “this is a feature, not a product” that any large technology company could launch. I used Dropbox when it first came out and it was magical because it just worked – files would get synced.

In 2011, Dropbox had a meeting with Steve Jobs of Apple, who wanted to buy Dropbox and use its technology as the core of the upcoming iCloud platform. Dropbox passed and said it wanted to stay independent. Then in 2012, Google finally launched its long-awaited Google Drive syncing product, which people had been saying for years was going to kill Dropbox. Well, Dropbox didn’t die. In fact, it went public in 2018, and as of early 2020 it has a market cap of just under US$7.4 billion.

DBX price chart

I know that in Silicon Valley a market cap like that means it’s a feature or lifestyle business when compared to Apple at $1.3 trillion or Google near $1 trillion. Part of that is because many file-syncing products offer similar capabilities and most consumers can’t tell the difference. And when Google Drive is free, most users will opt for it, which makes it tough to compete. But when you compare Dropbox’s market cap against the public companies in India, it would rank alongside Grasim and Tata Steel. Dropbox has built a business that most people would be happy to call their own.

So what is KhataBook? It’s an India-based startup that has taken the simple idea of keeping track of a small business’s credits and debits and put it in a simple Android app. The startup has raised over $29 million from investors like Sequoia, DST Global and others. I’m guessing part of their monetization strategy will be based on their ability to give short-term loans to those businesses. Recently the 800-pound gorilla, Paytm, entered with its own version called Business Khata.

For KhataBook, the reality is that they do have an uphill battle: companies like Paytm, Instamojo and many others have taken the wind out of their sails by offering a simple-to-use debit/credit app. For Paytm this is just another product offering, and they will capture the lion’s share of the market because of their brand name, reach and distribution. On the other hand, the founders and employees of KhataBook will do well and end up building a respectable business, just like Dropbox has. From a VC perspective, KhataBook will not be some crazy 50x return with hyper growth going forward. Of course, I really hope I’m wrong.

IoT Analytics – Processing the Data

This is the 3rd of a 4-part series on the Internet of Things (IoT). As I mentioned in my first post, there are 3 parts to the IoT architecture, in what I call “Triple A” IoT architecture:

Let’s continue with the previous blog post and assume you have thousands of air quality sensors that are sending data every minute. With all of those sensors generating data, how do you slice and dice it to make sense of it all? This is where your data science team would take the data and create something useful from it.

Below is a YouTube clip of someone going through the IoT sensor data to try and build an AQI prediction algorithm.

However, let’s first step back and understand all the hype around the buzzwords you might have heard like Artificial Intelligence (AI), Machine Learning (ML) and Natural Language Processing (NLP).

AI has been around since the 1950s and Machine Learning since the 1980s, but advancements in software and hardware have dramatically increased the viability of using machine learning in a day-to-day setting. The basic premise is that the computer can do the thinking that was previously done by humans.

In Machine Learning, if you give the computer enough training data about cats and dogs, it will be able to classify future images based on what it learned from that training data. You can go one step further and teach it to classify the breed; here is a link to see how that happens.
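To make "learning from training data" a bit more concrete, here is a minimal sketch. It is not how a real image classifier works (those learn from pixels); it assumes each animal has already been reduced to two made-up numbers, weight and height, and it uses a simple nearest-centroid rule. The data, the feature choice and the function names `train` and `classify` are all hypothetical, chosen only for illustration.

```python
from math import dist

# Made-up training data: (weight in kg, height in cm) -> label.
TRAINING_DATA = [
    ((4.0, 24.0), "cat"),
    ((5.0, 25.0), "cat"),
    ((3.5, 23.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
    ((25.0, 58.0), "dog"),
]


def train(examples):
    """Compute the average feature vector (centroid) for each label."""
    sums = {}
    counts = {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in total)
        for label, total in sums.items()
    }


def classify(model, features):
    """Return the label whose centroid is closest to the new example."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(TRAINING_DATA)
print(classify(model, (4.5, 26.0)))   # cat
print(classify(model, (28.0, 57.0)))  # dog
```

The "training" step is just averaging the examples for each label, and classification picks the closest average. With more data and richer features, the same train-then-classify pattern carries over to far more capable models.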

With Deep Learning, the computer would “learn” the difference between the breeds and present that data to you. This is very powerful and one of the reasons why so many startups that are involved in AI/ML are so highly valued.

In our example, suppose you deployed 5,000 air quality sensors in Delhi and each one sends data every 60 seconds. That means in 1 hour you will have generated 300,000 “rows” of data and in 24 hours you will have 7.2 million “rows”.
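The arithmetic is easy to check, and a small helper makes it simple to try other deployment sizes. This is only a sketch: the function name `rows_generated` is made up, and it assumes every sensor sends exactly one reading per interval with none lost.

```python
def rows_generated(sensors: int, interval_seconds: int, hours: int) -> int:
    """Rows produced when every sensor sends one reading per interval."""
    readings_per_hour = 3600 // interval_seconds
    return sensors * readings_per_hour * hours


print(rows_generated(5_000, 60, 1))   # 300000
print(rows_generated(5_000, 60, 24))  # 7200000
```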

Using other data points such as humidity and temperature, the data science team can start to build out an air quality index (AQI). The team can start to predict what the AQI levels will be based on weather patterns. Then it can push the envelope and see if there is an AQI difference between buildings in the same area. Or do buildings from the same builder have a consistent AQI number across the city? This is really where a data scientist would come up with scenarios like these, test them against a subset of the data, and then start to run them against all the data available.
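As a rough illustration of "predicting AQI from weather patterns", here is a minimal sketch that fits a straight line through a handful of readings and uses it to predict a new value. The humidity and AQI numbers are invented and deliberately tidy; they are not real measurements and do not claim any real relationship between humidity and air quality. The function names `fit_line` and `predict` are hypothetical, and a real data science team would use many more inputs and a proper modelling library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply the fitted line to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up readings: relative humidity (%) and the AQI seen at that time.
humidity = [30, 40, 50, 60, 70]
aqi = [120, 140, 160, 180, 200]

model = fit_line(humidity, aqi)
print(predict(model, 55))  # predicted AQI at 55% humidity
```

The same idea, testing on a small subset first and then running against everything, applies whether the model is a single line like this or something far more sophisticated.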

Much of what the data science team will do is based on commercial inputs from the business analysts and business development teams. Building a data science team is similar to building a tech team – there is no one size fits all. The data scientist leads the team and understands the business problem the company is trying to solve. Then the data engineers and data analysts help to create and deploy the solution. I know this is an oversimplification, but for this example it will suffice.

The next and last post in this IoT series is about Action – what happens with the information that’s generated from the data science team.