Should I Use This Open Source? M.Sc CS Thesis🎓
We, the users of open source, need better tooling to quickly assess the security, quality, maintainability and legality of open source repositories. I am a Computer Science M.Sc. student researching open source and building a system to close the current gaps. In this article I am “open sourcing” the kickoff of my work.
Open source has become a huge community of developers. On GitHub alone, for instance, as of Oct 19, there are almost 33 million users. That may not sound like a lot compared to social networks with 1B+ users, so to put the 33M number in perspective: it’s 7M more than the entire population of Australia! 🇦🇺 Those users are working on more than 29 million public repositories.
Our story begins in 1983
In 1983, Richard Stallman launched the GNU Project to write a complete operating system free from constraints on the use of its source code. This was the kickoff for the GNU open source license, which has been one of the most popular licenses for open source projects (right behind Apache and MIT as of 2018) ever since Linus Torvalds released Linux to the world back in 1991. People have been using open source for more than 35 years now. Compared to a human lifespan, open source should now be in its prime 👨. Is it?
README.md is not enough
Meet Joe. Joe is an average developer who needs to build a web service in Python. He doesn’t know which framework he should use. No problem. A bit of Google and GitHub scanning turns up a nice article comparing the most popular frameworks.
Joe thinks to himself: “OK, there are several here that can match my specific need. I’ll simply go for the most popular one.”
Wait, there is more
How would you define “popular”? GitHub stars? ⭐️ Forks? Contributors? PR count? Latest commit date? Insights? How do people talk about it on Stack Overflow? Twitter? And what about security? Is it safe to use? Is the license 👨‍⚖️ suitable for me? How clean is its code base if I need to understand it more deeply? How well is it documented? You see where I am going with this. It can take Joe hours or even days to examine open source repos.
Do you remember that commercial above? To educate against pirating movies or music, they used a car. Let’s go back to the vehicle industry one more time to justify what I am about to say:
You wouldn’t buy a used car without getting your car mechanic to vet it
What makes Joe think he is qualified enough to vet an open source repository? There are so many things to consider, as I mentioned above. Where is the mechanic that can easily vet it all for Joe and give him a detailed report and summary on the repository in question?
Innovative inspiration: sourcecred.io
You can’t really get a picture of the maintainers’ activity or the community’s pulse by checking out the “committers count”. It is frustrating and misleading. Then came sourcecred.io (let me help you read it: “source-cred”, as in “credit”). The idea behind it is that you wouldn’t say the USA All-Stars basketball team is equal to your high school’s team just because both of them have 5 team members and play once a week, right? sourcecred is doing awesome work in that domain, using an open plugin architecture and community endorsement to close the gap in correctly estimating the credit each contributor deserves for the repository’s life.
This is what I need
We have talked about the background and the problems with using open source. Let’s get down to specifying the requirements for a solution:
A system that will examine an open source repository and grade it for its quality, community, security, license, …. The list is too long, and every open source user cares about a couple of other things as well. So we’ll need a pluggable system that is easily configurable and extensible by the community. Each plugin will know how to perform one specific evaluation, and the core system will merge the results into a cohesive report and a final grade.
Meet “vett”. The solution.
vett will know how to answer the following question:
Should I use this open source?
vett will use a pluggable design pattern to make it easy for the community to extend it with all the tools open source users need to vet open source repositories on all of the different aspects we have discussed. Each vett user will be able to assemble their own suite of plugins to look into what they care about the most, or use the community/default standard to go through all the common practices.
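To make the idea a bit more concrete, here is a minimal Python sketch of what such a pluggable design could look like. Nothing here is actual vett code, since none has been written yet: the `RepoInfo` fields, the two example plugins, their scoring rules, and the `vet` function with its weighted average are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


# Hypothetical snapshot of the facts a plugin may inspect (not a real vett API).
@dataclass
class RepoInfo:
    name: str
    license: str
    days_since_last_commit: int


# In this sketch a plugin is any function mapping a repository to a score in [0, 1].
Plugin = Callable[[RepoInfo], float]


def license_plugin(repo: RepoInfo) -> float:
    """Illustrative rule: 1.0 for a few permissive licenses, 0.0 otherwise."""
    return 1.0 if repo.license in {"MIT", "Apache-2.0", "BSD-3-Clause"} else 0.0


def activity_plugin(repo: RepoInfo) -> float:
    """Illustrative rule: score drops linearly to 0 over a year without commits."""
    return max(0.0, 1.0 - repo.days_since_last_commit / 365)


def vet(repo: RepoInfo,
        plugins: Dict[str, Plugin],
        weights: Optional[Dict[str, float]] = None) -> dict:
    """Run every plugin and merge the scores into one report with a final grade."""
    if not plugins:
        raise ValueError("at least one plugin is required")
    weights = weights or {}
    scores = {name: plugin(repo) for name, plugin in plugins.items()}
    # Plugins without an explicit weight count with weight 1.0.
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    if total_weight <= 0:
        raise ValueError("plugin weights must sum to a positive number")
    grade = sum(s * weights.get(name, 1.0) for name, s in scores.items()) / total_weight
    return {"repo": repo.name, "scores": scores, "grade": grade}


if __name__ == "__main__":
    # Each user assembles their own suite of plugins and decides how much each matters.
    suite = {"license": license_plugin, "activity": activity_plugin}
    repo = RepoInfo(name="example/web-framework", license="MIT", days_since_last_commit=30)
    print(vet(repo, suite, weights={"license": 2.0, "activity": 1.0}))
```

Keeping a plugin down to a plain function is just one possible choice; it keeps the cost of contributing a new check low, and the weights let each user express what they care about the most.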
Don’t get excited yet. Just tell me what you think
Nice to meet you. Thank you for reading thus far :)
My name is Regev Golan and I am a Computer Science Master’s student who is starting his thesis project. I am researching open source and building this “vett” system. I will be happy to hear your thoughts. I believe in feedback prior to starting to code 💻
I am also a believer in experiencing first hand whatever you are researching, so if you are researching open source, you should also open source the research itself. Right? :) Please see the following directory of articles I have gathered to understand open source better.
The main question I am asking, and on which I welcome specific feedback, is: what are the state-of-the-art practices for evaluating open source? How do you do it in your company? What do you care about when asking “should I/we use this open source”?
I’d appreciate your claps so I know you like the direction I am taking 👏
Comment, connect with me, or join the Gitter channel. Let’s talk! 📣
Stay tuned. Hopefully I’ll be able to share a quarterly article about my progress and open source my findings, materials and of course the codebase (star it ⭐️ and watch 👁 to stay updated).
- “MTA, The Academic College of Tel Aviv–Yaffo”: the college where I have the privilege of studying for my M.Sc., with special thanks to Dr. Uri Globus for his amazing mentorship.
- HiredScore: the revolutionary HR-Tech start-up I have been a part of for the past 3 years as a software engineer. Thanks, team, for your support in allowing me to expand my horizons with this academic degree.