KhataBook vs. Goliath

Last week I saw the following tweet from the KhataBook co-founder:

It got me thinking about another David vs. Goliath battle….Dropbox vs. Google Drive. Dropbox is a cross-platform file-syncing app that was launched in 2007. Ever since it launched, everyone kept saying “this is a feature, not a product” and that any large technology company could launch it. I used Dropbox when it first came out and it was magical because it just worked – files would get synced.

In 2011, Dropbox had a meeting with Steve Jobs of Apple, who wanted to buy Dropbox and use its technology as the core of the upcoming iCloud platform. Dropbox passed and said it wanted to be independent. Then in 2012, Google finally launched its long-awaited Google Drive syncing product, which people had been saying for years was going to kill Dropbox. Well, Dropbox didn’t die. In fact, it went public in 2018 and as of early 2020 it has a market cap just under USD $7.4 billion.

DBX price chart

I know that in Silicon Valley that market cap means it’s a feature or lifestyle business when compared to Apple at $1.3 trillion or Google near $1 trillion. Part of that is because there are many file-syncing products that offer similar capabilities and most consumers can’t tell the difference. And when Google Drive is free, most users will opt for that, and it’s tough to compete against free. But when you compare the market cap of Dropbox against some of the public companies in India, it would rank alongside Grasim and Tata Steel. Dropbox has built a business that most people would be happy to call their own.

So what is KhataBook? It’s an India-based startup that has taken the simple idea of keeping track of a small business’s credits and debits and put it in a simple Android app. The startup has raised over $29 million from investors like Sequoia, DST Global and others. I’m guessing part of their monetization strategy will be based on their ability to give short-term loans to those businesses. Recently the 800-pound gorilla, Paytm, entered with their own version called Business Khata.

For KhataBook the reality is that they do have an uphill battle, and companies like Paytm, Instamojo and many others have taken the wind out of their sails by offering a simple-to-use debit/credit app. For Paytm this is another product offering and they will capture the lion’s share of the market because of their brand name, reach and distribution. On the other hand, the founders and employees of KhataBook will do well and end up building a respectable business just like Dropbox has. From a VC perspective, KhataBook will not be some crazy 50x return with hyper growth going forward. Of course, I really hope I’m wrong.

IoT Analytics – Processing the Data

This is the 3rd of a 4-part series on the Internet of Things (IoT). As I mentioned in my first post, there are 3 parts to the IoT architecture, in what I call “Triple A” IoT architecture:

Let’s continue with the previous blog post and assume you have 1000s of air quality sensors that are sending data every minute. Now that all these sensors are generating data, how do you slice and dice it to make sense of it all? This is where your data science team would take all this data and create something useful from it.

Below is a YouTube clip of someone going through the IoT sensor data to try and build an AQI prediction algorithm.

However, let’s first step back and understand all the hype around the buzzwords you might have heard like Artificial Intelligence (AI), Machine Learning (ML) and Natural Language Processing (NLP).

AI has been around since the 1950s and Machine Learning since the 1980s, but advancements in software and hardware have caused dramatic growth in the viability of using machine learning in a day-to-day setting. The basic premise is that the computer can do the thinking that was previously done by humans.

In Machine Learning, if you give the computer enough training data about cats and dogs, it will be able to classify future images based on what it learned from the training data. You can go one step further and teach it to classify the breed; here is a link to see how that happens.
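To make the "learn from training data, then classify new examples" idea concrete, here is a minimal sketch of a nearest-neighbour classifier. It is a big simplification: instead of images it uses two made-up numeric features (weight and height), and all the numbers are hypothetical. Real image classifiers are far more involved, so treat this only as an illustration of the principle.

```python
import math

# Hypothetical training data: (weight in kg, height in cm) -> label.
# These numbers are invented for illustration only.
TRAINING_DATA = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
]


def classify(sample, examples=TRAINING_DATA):
    """Return the label of the training example closest to the sample."""
    nearest = min(examples, key=lambda example: math.dist(sample, example[0]))
    return nearest[1]


print(classify((4.5, 24.0)))   # a small animal, closest to the cat examples
print(classify((25.0, 58.0)))  # a large animal, closest to the dog examples
```

The more (and the more varied) training examples you give it, the better the guesses get, which is the same reason ML teams care so much about data volume.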

With Deep Learning, the computer would “learn” the difference of the breeds and present that data to you. This is very powerful and one of the reasons why so many startups that are involved in AI/ML are so highly valued.

In our example, suppose you deployed 5,000 air quality sensors in Delhi and each one sends data every 60 seconds. That means in 1 hour you will have generated 300,000 “rows” of data and in 24 hours you will have 7.2 million “rows”.
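The arithmetic is easy to check with a few lines of Python. This is just a sketch; the function name is my own and it assumes every sensor reports exactly once per interval with no dropped readings.

```python
def rows_generated(sensors: int, interval_seconds: int, hours: int) -> int:
    """Rows produced when every sensor sends one reading per interval."""
    readings_per_sensor = (hours * 3600) // interval_seconds
    return sensors * readings_per_sensor


per_hour = rows_generated(sensors=5000, interval_seconds=60, hours=1)
per_day = rows_generated(sensors=5000, interval_seconds=60, hours=24)
print(per_hour, per_day)  # 300000 7200000
```

Change the interval to 10 seconds and the daily volume jumps sixfold, which is why the reporting frequency is one of the first things to settle with the data science team.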

Using other data points such as humidity and temperature, the data science team can start to build out an air quality index (AQI). It can start to predict what the AQI levels will be based on weather patterns. Then it can push the envelope and see if there is an AQI difference between buildings in the same area. Or do buildings from the same builder have a consistent AQI number across the city? This is really where a data scientist would come up with these scenarios, test them against a subset of the data and then start to run them against all the data available.
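As a rough sketch of the "difference between buildings in the same area" scenario, the snippet below groups PM2.5 readings by building and compares the averages. The building names and readings are made up, and a real analysis would run against far more data and control for time of day, weather and sensor placement.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical subset of readings: (building, PM2.5 in micrograms per cubic metre).
READINGS = [
    ("Tower A", 82.0),
    ("Tower A", 90.0),
    ("Tower B", 61.0),
    ("Tower B", 65.0),
]


def average_by_building(rows):
    """Group readings by building and return the mean PM2.5 for each."""
    grouped = defaultdict(list)
    for building, pm25 in rows:
        grouped[building].append(pm25)
    return {building: mean(values) for building, values in grouped.items()}


averages = average_by_building(READINGS)
gap = averages["Tower A"] - averages["Tower B"]
print(averages, gap)
```

Once a scenario like this looks interesting on a subset, the same function can be pointed at the full data set.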

Much of what the data science team will do is based on commercial inputs from the business analysts and business development teams. Building a data science team is similar to building a tech team – there is no one size fits all. The data scientist leads the team and understands the business problem the company is trying to solve. Then the data engineers and data analysts help to create and deploy the solution. I know this is an oversimplification but for this example it will suffice.

The next and last post in this IoT series is about Action – what happens with the information that’s generated from the data science team.

IoT Aggregation – Building an IoT Device

This is the 2nd of a 4-part series on the Internet of Things (IoT). As I mentioned in my first post, there are 3 parts to the IoT architecture, in what I call “Triple A” IoT architecture:

IoT Triple A – Aggregation, Analytics & Action

Aggregation is the process of collecting data from sensors and storing it into a cloud provider like Amazon Web Services (AWS), Google Cloud Platform (GCP) or Microsoft Azure. For data aggregation there are 5 components:

  • Sensors
  • Microcontrollers
  • Wireless Connectivity
  • Data Transport
  • Cloud Providers


The sensors are the most basic part of an IoT device. Without a sensor, how do you measure the air quality, the temperature or energy consumption, to name a few examples? Suppose you are building an air monitoring device: the first thing you would need is a sensor that can monitor PM2.5 levels.

One of the more popular sensors is the Nova Fitness SDS011 from China (of course, where else from!). There are probably 100s of air quality sensor models you can pick from to fit the specific use case.


These sensors need power and also a place to send their data. That’s where a microcontroller comes into play. Not a microprocessor, but a microcontroller, which includes a processor, memory and input/output (I/O) peripherals on a single chip. You would never use a microcontroller to run Windows 10 or macOS since it is not fast enough and does not have enough memory on board. Some of the companies that produce these chips are Cypress, Microchip (which acquired Atmel), NXP, STMicroelectronics and Texas Instruments.

Microchip ATmega32

But the company that has really come onto the microcontroller scene is Espressif with their ESP8266 and ESP32 microcontrollers. Their chips come with built-in WiFi, which has become the de facto standard for many IoT devices connecting to the internet.

However, just because you bought a microcontroller and a sensor doesn’t mean you can do much with them. As you can see from the image above, the ATmega32 microcontroller is just another chip. That’s where development boards play a big part in the maker community, letting people create cheap IoT devices quickly, almost like rapid prototyping.

Espressif ESP32 microcontroller (MCU)

You might have heard of Arduino. It’s an open source platform that sells dev boards and software to create IoT devices. The other big company selling dev boards is Particle, which just raised $40 million to further expand their IoT offering. Another company that recently started to play in this space is Raspberry Pi, which launched the Raspberry Pi Zero, a cheap dev board to compete with Arduino and Particle. And there are 100s of other dev board makers. The ability to build prototypes so quickly is one of the reasons the IoT space is growing so rapidly.

Along with the microcontroller you will need some firmware/software to run on it. All the MCUs have their own software but you can load other open source versions such as Arduino, FreeRTOS, MongooseOS, MicroPython or NodeMCU for example. The biggest constraint for these software options is the amount of memory available on the microcontrollers.

Wireless Connectivity

Since we are talking about IoT devices, that implies they will be connected to the internet, which means old-school technology like an RS-485 serial connection is out of the question. Don’t get me wrong, I’m sure someone has created an RS-485 to WiFi module that speaks TCP/IP, but if you are building a device from scratch, RS-485 or Ethernet is probably not the way to go. Although many older IoT devices have Ethernet support, I’m willing to bet that moving forward we will see fewer and fewer of them.

So that leaves us with the wireless options, of which there are many. As always, it depends on your specific use case. For example, an activity tracker could probably get away with Bluetooth connectivity since it just needs to connect to the phone, and then the fitness app on the phone will upload the data to the cloud.

For most other devices it will come down to WiFi, GSM (2G/3G), LTE (4G) and 5G. Since 2G/3G networks no longer exist in many countries, that further reduces the options. Narrowband IoT (NB-IoT), which is a subset of 4G, is another option for IoT devices since it supports low-bandwidth devices. Jio, the telecom giant created by Reliance, is planning to roll out their NB-IoT network in India starting in 2020.

Of course, the real push is for 5G since it can handle large amounts of data from billions of devices. The idea of having millions of high-definition 5G enabled cameras streaming to a cloud provider is the ultimate use case. If the network can handle large media streams then it can easily handle billions of smaller IoT data streams from cars, buildings, lamp posts, industrial machinery, etc…

Data Transport

Taking our IoT example, so far we have picked the air quality sensor and connected it to an ESP32 microcontroller. For the firmware we will use Arduino and since the ESP32 supports WiFi we will connect to an access point to get internet access.

The air quality sensor will be generating data and sending it to the microcontroller. Whichever firmware/software you select there are 3 ways to get the data from the microcontroller to your cloud provider.

  1. MQTT – MQ Telemetry Transport is the preferred method for sending low-bandwidth information. MQTT is a publish/subscribe (pub/sub) network protocol that transports messages between devices through a broker. It was created at IBM in 1999 and is an open standard.
  2. HTTP – RESTful APIs via HTTP/HTTPS to a server
  3. Websockets – this creates an open connection between the device and an endpoint to allow streaming of data
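To give a feel for the pub/sub idea behind MQTT, here is a toy in-memory sketch. It is not an MQTT client and it does not talk to a real broker: the `ToyBroker` class, the topic name and the payload fields are all my own inventions for illustration. On a real device you would use an MQTT client library for your firmware and point it at your cloud provider's endpoint.

```python
import json


class ToyBroker:
    """In-memory stand-in for a broker; matches topics exactly, no wildcards."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        """Register a callback to run for every message on this topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        """Deliver the payload to every subscriber of this topic."""
        for callback in self._subscribers.get(topic, []):
            callback(topic, payload)


received = []
broker = ToyBroker()

# Hypothetical topic layout: sensors/<city>/<device id>/<measurement>
topic = "sensors/delhi/device-001/pm25"
broker.subscribe(topic, lambda t, p: received.append(json.loads(p)))

# The device side: serialize one reading as JSON and publish it.
reading = {"device_id": "device-001", "pm25": 87.5, "unit": "ug/m3"}
broker.publish(topic, json.dumps(reading))

print(received)
```

The point of the pattern is that the sensor never needs to know who is consuming its data. It just publishes to a topic, and anything subscribed to that topic gets the message.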

Cloud Providers

Now that the data is in transit it needs to go somewhere. Luckily there are many cloud providers that can easily ingest all this data. There are of course the 3 biggest public cloud providers with their IoT offerings:

  • AWS has IoT Core
  • Google has Cloud IoT Core
  • Microsoft has Azure IoT

In addition, there are a host of IoT specific cloud providers that focus on the business of taking in all that data from IoT devices. Some of them include:

There you have it, we have gone through the steps of creating an IoT device from scratch and sending that data to a cloud provider. The next blog post will be about analyzing the data once it has reached the cloud provider.

Introduction to IoT Devices

Initially, this was going to be a simple blog post about building an internet connected device more commonly referred to as Internet of Things (IoT) device. But, as I was gathering information I soon realized there is a lot that goes into building a device. So I decided to break this up into 4 blog posts over the next few months:

What is an IoT Device?

An IoT device is a piece of hardware with a single sensor or multiple sensors that transmits the data from the sensor to the cloud via the internet. Some examples of IoT devices are a fitness tracker, smart door locks, proximity badge readers, etc…

In an industrial setting they are called IIoT (Industrial Internet of Things) devices and are used to monitor the efficiency of machinery, room temperature and humidity, predictive maintenance, etc… Industrial settings have had sensors and systems for decades, but that is all old-school technology using RS-485 connections, which means the data is usually stored locally.

There is a big trend called Industry 4.0 which is the “fourth industrial revolution” and part of that is to have these sensors connect via TCP/IP (the protocol of the internet) and send the data to the cloud where it can be aggregated and analyzed.

It’s easy to find reports on how large the IoT market is, and all the big consulting companies will roll out their 2020 and 2025 future trends reports. But I like to look at what the programmers have to say and what they think of the technology and the space. Below is a graph from the HackerRank 2019 Developer Skills Report, and you can see IoT is at the top of the list of emerging technologies that people feel are the most real.

2019 Developer Skills Report – HackerRank

Even in a consumer setting the growth of IoT devices is explosive. Using a Nest Thermostat you can control the temperature of your home from anywhere. Using an August door lock you can literally open the door to your home from anywhere in the world. The Canary camera is a security camera that also monitors the air quality, temperature, and humidity in your home. All these devices have sensors that send data back to the cloud and then using an app you can control the various settings when you are not physically in your home.

The Market Potential of IoT Devices

Why is the market getting bigger and do we really need all these devices? I will not bore you with potential market size numbers but instead give you some examples of how the space is growing and why.

Take the example I gave above of the 3 IoT devices for a home. Imagine you have listed your house on Airbnb: those 3 devices make it so much easier to manage your property and understand how it is being used. Without them, putting your house on Airbnb would seem risky.

As I mentioned, the industrial complex has always had sensors to monitor equipment, but the data was always stored locally. With an IoT device the company can now aggregate all their data, and a centralized team of machine learning experts can make better predictions and spot trends much more easily.

Lastly, IoT is replacing repetitive tasks that previously were handled by humans, such as reading energy meters. In India, the electricity bill you get every month is based on energy usage collected by a person who visits every meter and takes the reading. Today it’s a manual process. In the future it will be replaced by an IoT sensor that can detect the energy consumption and send that data to the cloud, and then your bill will get generated.

The next post in this IoT series is about Aggregation, building an IoT device to capture the data and sending it to the cloud.

First “git push” to GitHub

Back in November 2011, I created a GitHub account because all the cool kids were doing it. After letting that account sit for 8 years, I finally got around to pushing some code to my first public repo.

I decided to spend some time looking at DevOps to understand what all the fuss is about containers and streamlined deployments. A container is a simple way of making sure your code runs the same in any type of computing environment, whether it’s staging, dev, production, Mike’s laptop, Sonali’s desktop, etc… A more detailed explanation of containers is available here from Docker, which is the leader in containerization.

The purpose of the exercise was to learn all the components involved in building a simple “Hello World” Node.js app that uses the Express framework, Docker and AWS Fargate. Fargate is a way to use containers without all the other headaches that are involved with managing clusters and servers.

After spending time with the technology I can see why people just want to go serverless. The movement from on-premise to cloud computing made sense. I feel the next step is serverless and not this pain in the ass middle step called containers. I can see why DevOps folks love it because it gives them something to do, but from a business perspective it’s not worth it. I would spend those engineering dollars on re-architecting the codebase to go serverless.

Anyways, enjoy playing around with my first public repo on GitHub!

Open-Sourcing the Telecom Rack

I started my career many years ago in the data center configuring Cisco routers and switches and thinking….damn these “boxes” are expensive as shit. I wasn’t thinking much about the future of the data center as I was more concerned about getting XOT working…yes kids, that’s X.25 over TCP/IP.

I was on the Iridium project for Accenture and we worked beside equipment like the Siemens D900 GSM Switch in a telecom data center.

Software-defined networking (SDN) and the Open Compute Project were established to break the proprietary nature of the hardware used in data centers. Instead of buying a Cisco switch with its software you would buy generic hardware from Taiwan or China and use the open source OpenFlow protocol.

Software-Defined Networking (SDN)

This commoditization explains the growth of cloud providers like Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP). Companies want to ditch their existing data centers and move to cloud providers that have a perceived lower cost since they are using open source hardware and software to bring down the build-out costs.

While all this was happening in the corporate data centers, I figured the telecom data centers were business as usual with expensive proprietary hardware. Well, I was wrong.

I was recently chatting with a friend who is in the telecom field and he told me the same commoditization is happening in the telecom data center. The initiative is called Network Functions Virtualization (NFV). Reading the NFV white paper is a great way to get over your insomnia, but it also packs in a lot of information.

The idea was that the core components of the data center – routers and switches – were getting commoditized but the applications and servers were still proprietary in nature. NFV is changing that. It brings down costs for telco providers and, more importantly, allows them to use the latest cloud technologies such as containers…that’s for another blog post.

Not surprisingly, the NFV initiative was driven by telcos like AT&T, British Telecom, China Mobile, Telecom Italia, Verizon and several more providers. I guess when you have these heavyweights behind a project it gains traction. With the upcoming 5G deployments around the world the telco providers figured they needed to cut costs somewhere and NFV was a good start. The one place I don’t think we will see open source equipment is the radios in the cell towers…that’s where all the money is going to be made by the 5G vendors like Samsung, Huawei, Ericsson and Nokia.

WeWork Collateral Damage in India

The last couple of weeks have been a real shit show for WeWork. It all began when WeWork started its roadshow for its upcoming IPO, which was being led by JP Morgan and Goldman Sachs. The IPO was targeting a USD $3.5 billion offering. WeWork’s last valuation was USD $47 billion, based on their Series H raise from SoftBank back in January 2019.

The roadshow highlighted many of the issues that people have been complaining about, mainly their business model. They take on 10-15 year leases on buildings and then turn around and sell seats on a monthly basis. What really kicked up the negativity was a blog post by NYU Professor Scott Galloway titled WeWTF, and WTF was not We Truly Fine! The blog post highlighted many of the red flags about the upcoming WeWork IPO. It’s fair to say the professor is probably not welcome at any WeWork facility around the world.

So how does the WeWork IPO train wreck lead to collateral damage in India? Well, there are 2 entities that will get affected – Embassy Group and OYO.

Let’s talk about Embassy Group first, they are a property developer based in Bangalore (Bengaluru) and the local JV partner in India for WeWork.

The JV entity is called WeWork India Management Pvt. Ltd.; 30% is held by WeWork and the remaining 70% is owned by the Embassy Group. In June 2019, they were hammering out valuations and it was reported the JV was worth USD $2.75 billion, which meant that if WeWork wanted to buy out the Embassy Group it had to fork over USD $1.9 billion. At the time of the negotiations WeWork was valued at $47 billion, but after all the drama some are now estimating the company to be worth under USD $10 billion, which means there is NO WAY the India JV is worth USD $2.75 billion. If WeWork really has taken an 80% haircut then the JV’s new enterprise valuation is probably in the neighborhood of USD $600 million.
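Here is the back-of-the-envelope math as a quick Python sketch. The big assumption, and it is only an assumption, is that the JV’s value shrinks in the same proportion as WeWork’s own valuation. I also use USD $10 billion as the new WeWork number, which is the upper end of the "under $10 billion" estimates.

```python
def stake_value(total_valuation: float, stake: float) -> float:
    """Value of an ownership stake, in the same units as the valuation."""
    return total_valuation * stake


def haircut(before: float, after: float) -> float:
    """Fraction of value lost going from the old valuation to the new one."""
    return 1 - after / before


# All figures in USD billions.
jv_valuation = 2.75
embassy_buyout = stake_value(jv_valuation, 0.70)

wework_haircut = haircut(before=47.0, after=10.0)

# Assumption: the JV loses value in proportion to the parent company.
jv_after_haircut = jv_valuation * (1 - wework_haircut)

print(f"Embassy buyout: {embassy_buyout:.1f}B")
print(f"WeWork haircut: {wework_haircut:.0%}")
print(f"JV after haircut: {jv_after_haircut:.2f}B")
```

That works out to roughly a 79% haircut and a JV value a little under USD $0.6 billion, which is where the "neighborhood of USD $600 million" figure comes from.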

The other startup that is going to face valuation drama is OYO. OYO is backed by SoftBank, which also happens to have funded WeWork and Uber. SoftBank seems to have a track record of going big on these bets and pushing for public market valuations even though these startups don’t make a dime in profit. Uber had the same drama with its founder, who was ultimately fired, before its IPO. Uber is currently trading at its all-time lows and has yet to figure out its path to profitability.

I’m pretty sure in the coming months we are going to hear about OYO along the same lines of these other SoftBank portfolio companies. In fact the NY Times recently had an article about SoftBank founder Masayoshi Son and how these bets may not turn out as expected. But don’t feel bad for Masa, some of his other bets have done fantastic. In 2000, SoftBank made its most successful investment ever – USD $20 million to a then fledgling Chinese Internet venture Alibaba. This investment turned into $60 billion when Alibaba went public in September 2014.