With the rise of languages that ship with package management tools, developers and software engineers are spending more time integrating than coding. The internet hosts many great open source projects, and many of them are free to use. These projects vary greatly in what they provide. Some offer advanced functionality such as machine learning frameworks, user authentication, data modeling, and time-series analysis. Others offer something as simple as padding text to the left or right on a webpage. Open source packages can save developers hundreds of hours by providing functionality that the developer does not have to write herself. The developer can take the open source project and its code, modify it if needed, and integrate the new functionality into her application.
A convenient way to include open source projects in a software project is through a package manager. In most cases, including an open source project is as easy as declaring the project name and version in a text file. The developer then runs an install command to pull in the open source code, and that code is now part of the developer’s software. Many developers then start using the features they just installed and continue with the project.
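To make the "name and version in a text file" step concrete, here is a minimal sketch in Python. It assumes a pip-style manifest in which each line reads `name==version`; other package managers use different formats. The function name `parse_pinned` and the package names in the sample manifest are hypothetical and exist only for illustration.

```python
def parse_pinned(manifest_text):
    """Return {name: version} for manifest lines of the form name==version.

    Assumes a pip-style format; this is an illustration, not a full parser.
    """
    pins = {}
    for raw in manifest_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            # A dependency without an exact version is not pinned.
            raise ValueError(f"unpinned dependency: {line!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


# Hypothetical manifest: these package names are made up for the example.
MANIFEST = """\
# dependencies for my application
example-auth-lib==2.4.1
example-left-pad==1.3.0
"""

if __name__ == "__main__":
    for name, version in parse_pinned(MANIFEST).items():
        print(f"{name} pinned at {version}")
```

Two short lines in a file like this are all it takes for someone else's code to become part of the project, which is why the manifest is a natural first place to look when asking what a project depends on.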
The key point is that the open source project’s code is now part of the developer’s codebase. Not only is it part of the codebase, but in most cases it will also be pushed to production. This should raise a series of questions within the organization. What dependencies did the open source project bring in? Do those dependencies have known vulnerabilities? What other features or functions does the open source project provide? Do those features or functions enlarge the attack surface by giving attackers additional attack vectors? Or worse, does the project contain backdoors or other avenues through which malicious authors or contributors could gain access?
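The first of those questions, what dependencies a project brought in, can be explored with a short script. The sketch below assumes a Python environment and uses the standard library's `importlib.metadata` to read the requirements that installed packages declare. The helper names `installed_requires` and `transitive_dependencies` are hypothetical, and the name handling is deliberately simple: it lowercases names but does not fully normalize hyphens and underscores, and it skips requirements that apply only to optional extras.

```python
import re
from importlib import metadata

# Matches the distribution name at the start of a requirement string.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def installed_requires(name):
    """Names of the direct requirements declared by an installed package."""
    try:
        specs = metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return []  # not installed here, so nothing to report
    names = []
    for spec in specs:
        req, _, marker = spec.partition(";")
        if "extra" in marker:
            continue  # optional extras are not installed by default
        match = _NAME.match(req)
        if match:
            names.append(match.group(1).lower())
    return names


def transitive_dependencies(root, requires=installed_requires):
    """Every package reachable from root through declared requirements."""
    root = root.lower()
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for dep in requires(current):
            if dep != root and dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return sorted(seen)


if __name__ == "__main__":
    # Replace "pip" with any package installed in your environment.
    for dep in transitive_dependencies("pip"):
        print(dep)
```

The `requires` parameter is there so the walk can be run against a hand-written dependency graph as well as a live environment. Each name in the result is code that was pulled in alongside the package the developer actually asked for, and each is a candidate for the vulnerability questions above.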
These are all risks associated with using open source projects. What data points could help an organization reduce those risks when researching projects? Are projects with large communities more secure? Are projects with a certain number of downloads more secure? Does the number of collaborators make a difference? How about the number of commits? The number of pull requests? The total lines of code? What metrics can be used to differentiate one project as potentially more secure than another? I’ll attempt to answer these questions in later parts of this series.