Neural networks and how they help our data structure

Given that we have more than 160,000 startups in our database and going through them manually takes time we don’t have, we had to find other ways to organize our data. Neural networks, while not the simplest solution, produced some of the most promising results.

The neural networks we’re referring to were originally inspired by the networks within the human brain, where information is processed through billions of neurons to produce an outcome on which we base decisions or actions. In computing, neural networks help to crack problems that aren’t easily solved by simple intuition or formulas, and they are used in fields such as image recognition, classification (which is what we do at Funderbeam), robotics and regression analysis.

Before we go into detail on how we use neural networks at Funderbeam, a brief introduction to the concept seems appropriate. Say you want to invest in a startup on Funderbeam Markets, but first you’d like to perform some due diligence and combine your know-how with data to determine whether it will be a worthwhile investment. You choose a number of data inputs that you believe will help predict the outcome of your investment.

There can potentially be thousands of these inputs to choose from, but for the sake of simplicity, let’s stick with three for now:
– Whether the syndicate was completed in under 60 days
– Whether the startup’s CEO has previous leadership experience
– Whether the startup can cover its costs with total revenue

Let’s assume you also believe that the CEO’s previous leadership experience is more important than the startup’s ability to cover its costs with total revenue, so you give the former input a heavier weight and, with it, a bigger impact on the decision.
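To make the idea of weighting concrete, here is a minimal sketch of the three inputs above combined into a single score. The input names and the weight values are assumptions chosen purely for illustration; they are not figures Funderbeam uses.

```python
# Hypothetical inputs for one startup: 1 means "yes", 0 means "no".
inputs = {
    "syndicate_completed_in_under_60_days": 1,
    "ceo_has_leadership_experience": 1,
    "revenue_covers_costs": 0,
}

# Illustrative weights (assumed values): leadership experience is given
# more weight than the ability to cover costs.
weights = {
    "syndicate_completed_in_under_60_days": 0.2,
    "ceo_has_leadership_experience": 0.5,
    "revenue_covers_costs": 0.3,
}


def weighted_score(inputs, weights):
    """Multiply each input by its weight and add the results together."""
    return sum(inputs[name] * weights[name] for name in inputs)


score = weighted_score(inputs, weights)
print(score)
```

Because the leadership input carries the largest weight, flipping it from 1 to 0 moves the score more than flipping either of the other two.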

The catch is that a neural network has more layers than a simple regression. After the initial layer of inputs we laid out, there is at least one ‘hidden’ layer that takes the outputs of the first layer as its own inputs. This lets neural networks model scenarios that are too complicated for regular formulas and ultimately produce an output that helps you make a final decision. In the simplest version, that output depends on a threshold value you assign in advance: if the weighted combination of input values exceeds the threshold, you would, in theory, accept the investment; otherwise you would reject it.
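The layered, threshold-based decision described above can be sketched in a few lines. This is a toy model under stated assumptions: the two hidden neurons, their labels, and every weight and threshold are invented for illustration. In practice the weights are learned from data, and smooth activation functions usually replace the hard threshold.

```python
def step(value, threshold):
    """Return 1 if the value exceeds the threshold, otherwise 0."""
    return 1 if value > threshold else 0


def neuron(inputs, weights, threshold):
    """Weighted sum of the inputs, passed through a step threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return step(total, threshold)


def decide(inputs):
    """Return 1 to accept the investment, 0 to reject it."""
    # Hidden layer: two neurons that each look at all three inputs.
    # All weights and thresholds are made-up illustrative values.
    hidden = [
        neuron(inputs, [0.2, 0.6, 0.2], threshold=0.5),  # hypothetical "team" signal
        neuron(inputs, [0.5, 0.0, 0.5], threshold=0.4),  # hypothetical "traction" signal
    ]
    # Output layer: its inputs are the outputs of the hidden layer.
    return neuron(hidden, [0.6, 0.4], threshold=0.5)


# Input order: [completed in under 60 days, CEO has leadership experience,
#               revenue covers costs]
print(decide([1, 1, 0]))
print(decide([0, 0, 1]))
```

The output neuron never sees the raw inputs, only what the hidden layer made of them, which is what separates this from a single weighted formula.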

So, how does Funderbeam use neural networks? A lot of the startup data provided by different sources is incomplete, so we have to fill the gaps and somehow apply tags to startups that lack them. To do that, we trained our algorithm on the existing set of data (we’ll go into more detail in our next post) so that it can automatically assign tags when necessary, a process called artificial neural network modelling.
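This post does not describe the actual model, so the following is only a rough sketch of the general idea: learn from startups that already have a tag, then suggest that tag for startups that lack it. It uses a single-neuron (logistic) model, about the smallest thing that can be called a neural network. The vocabulary, the training examples and the "fintech" tag are all invented for illustration.

```python
import math

# Made-up training data: keywords from a startup's description, and whether
# the startup already carries the (hypothetical) tag "fintech".
VOCAB = ["payments", "bank", "music", "audio", "lending"]
TRAINING = [
    ({"payments", "bank"}, 1),
    ({"lending", "bank"}, 1),
    ({"payments", "lending"}, 1),
    ({"music", "audio"}, 0),
    ({"audio"}, 0),
    ({"music"}, 0),
]


def featurize(keywords):
    """Turn a set of keywords into a 0/1 vector over the vocabulary."""
    return [1.0 if word in keywords else 0.0 for word in VOCAB]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability that the tag applies, according to the current weights."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(examples, epochs=500, learning_rate=0.5):
    """Fit the weights with plain stochastic gradient descent."""
    weights = [0.0] * len(VOCAB)
    bias = 0.0
    for _ in range(epochs):
        for keywords, label in examples:
            features = featurize(keywords)
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING)


def suggest_tag(keywords, cutoff=0.5):
    """Suggest the tag when the predicted probability exceeds the cutoff."""
    return predict(weights, bias, featurize(keywords)) > cutoff


print(suggest_tag({"bank", "lending"}))
print(suggest_tag({"music", "audio"}))
```

A production system would need many tags, far more data and a held-out set for checking accuracy; the sketch only shows the train-then-fill-gaps loop.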

Of course, algorithmic modelling produces plenty of errors, some of which can be spotted in the coexistence matrix depicted below. For example, the tags music and agriculture seem to have a strong correlation. One of the startups tagged under both is Kloset Karma, which is actually a fashion company. After a little scrutiny, we discovered that the startup had the tag sustainability and, as a result, was automatically marked under agriculture as well, a bug we have since fixed. Another example is Landr, which specializes in audio production but was, for whatever reason, assigned the tag environment by our algorithm. Landr was then automatically labelled as an agricultural startup too, because the tag environment belongs to the sub-category of agriculture.

That said, most of the strong correlations are logical, such as financial services & internet technology or electronics & news and media. While our artificial neural network model has improved our understanding of the data, some of the tag assignments should still be taken with a grain of salt.
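The way one wrong tag cascades into a wrong category can be illustrated with a small sketch. The mapping and the function below are hypothetical and only mirror the two cases described above; they are not Funderbeam’s actual tag hierarchy or code.

```python
# Hypothetical tag-to-category mapping, mirroring the two cases above.
PARENT_CATEGORY = {
    "environment": "agriculture",
    "sustainability": "agriculture",  # the kind of mapping that caused the bug
}


def expand_tags(tags):
    """Return the tags plus the parent category of every tag that has one."""
    expanded = set(tags)
    for tag in tags:
        parent = PARENT_CATEGORY.get(tag)
        if parent is not None:
            expanded.add(parent)
    return expanded


# One mistaken tag ("environment") is enough to pull in "agriculture" too.
print(sorted(expand_tags({"audio production", "environment"})))
```

Automatic expansion of this kind saves manual work, but it also means every error in the first tag is copied into the category above it.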

[Figure: coexistence matrix of startup tags, from artificial neural network modelling]

Each crossing shows how frequently the two tags appear together across various sources; darker boxes indicate stronger correlation.
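A matrix like this can be built by counting how often each pair of tags lands on the same startup. The sketch below assumes that simple counting is the measure; how the frequencies in the figure were actually computed is not specified in this post. The startup names and tag sets are invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tag sets; the startup names and tags are invented.
startup_tags = {
    "startup_a": {"financial services", "internet technology"},
    "startup_b": {"financial services", "internet technology", "news and media"},
    "startup_c": {"music", "agriculture"},
}


def coexistence_counts(tag_sets):
    """Count how many startups carry each pair of tags together."""
    counts = Counter()
    for tags in tag_sets:
        # Sorting makes (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts


counts = coexistence_counts(startup_tags.values())
for (tag_a, tag_b), n in counts.most_common():
    print(f"{tag_a} + {tag_b}: {n}")
```

Pairs with unexpectedly high counts, like music and agriculture in the figure, are a good place to start looking for tagging mistakes.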
In our next post, we will cover how we train and tune the model for the algorithm, summarize the outcome of these processes, and look at how we can improve the neural network even further.
Please leave a comment if you have any input, and make sure to follow so you get notified on our next post!

The other articles in this series:
– How do you organize +160,000 startups into meaningful clusters
– Training and tuning your algorithm for optimal performance

Have data needs?

Funderbeam has updated data on more than 160k startups and 20k investors. We are working together with a range of startups, accelerators, VCs and more to provide data services. If you would like to learn more, get in touch with Nick, our head of data via: Nicholas.Vandrey@Funderbeam.com or connect on Twitter: @nsvandrey.